
R statistics help

  • 15-02-2009 10:55PM
    #1
    Closed Accounts Posts: 39


    Ok so I need to analyse data for my undergrad project. I have experience of working with R, but what's really confusing me is the format my data should be in before I convert it to a tab-delimited file to import into R.

    I have two genotypes, and I am comparing % infection across 6 different treatments. I think I should be using a two-way ANOVA with interaction for this. I have books from the library, but they all deal with R directly, not with the data in Excel.

    If anyone could help me I would be so grateful. It's just the basics, I know, but I'm really stuck!

    Thanks


Comments

  • Registered Users, Registered Users 2 Posts: 1,845 ✭✭✭2Scoops


    I have two genotypes, and I am comparing % infection across 6 different treatments. I think I should be using a two-way ANOVA with interaction for this. I have books from the library, but they all deal with R directly, not with the data in Excel.

    It's pretty standard to set up 3 columns, with one row per observation. Column 1 is a grouping variable for genotype. So, if you have 10 values for genotype #1 and 10 for genotype #2, label 10 cells each in the column with g1 or g2. Column 2 is a grouping variable for treatment, so put in a label for each treatment and match it up to the number of times it occurs with each genotype. Column 3 holds your actual % infection values.
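
    As a rough sketch of that layout in R: the data below are made up (random numbers, 3 replicates per cell), and the column names `type`, `treat` and `infection` are just assumed labels. It writes a tab-delimited file like the one Excel would export, reads it back in, and fits the two-way ANOVA with interaction.

```r
# Hypothetical data: 2 genotypes x 6 treatments x 3 replicates (random values).
set.seed(1)
dat <- expand.grid(
  rep   = 1:3,
  treat = paste0("t", 1:6),   # treatment labels (assumed names)
  type  = c("g1", "g2")       # genotype labels (assumed names)
)
dat$infection <- round(runif(nrow(dat), 0, 100), 1)

# Keep only the three columns described above: genotype, treatment, value.
dat <- dat[, c("type", "treat", "infection")]

# Write a tab-delimited file (standing in for the Excel export) and read it back.
tmp <- tempfile(fileext = ".txt")
write.table(dat, tmp, sep = "\t", row.names = FALSE, quote = FALSE)
dat2 <- read.delim(tmp, stringsAsFactors = TRUE)

# Check that both grouping columns came in as factors.
str(dat2)

# Two-way ANOVA with interaction: type, treat and type:treat.
model <- aov(infection ~ type * treat, data = dat2)
summary(model)
```

    Using text labels such as t1 to t6 for the treatments (rather than bare numbers) makes it more likely that R reads the column as a grouping factor.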


  • Closed Accounts Posts: 39 nearly_there


    Thanks 2Scoops,

    You see, that's exactly what I have, but when I import it into R I can't check the normality of each treatment for each genotype. Should I just check for this in Excel first, or is it possible to use R?

    I'm trying to use the tapply function.
    I've entered this:

    > tapply ( infection[treat = "1"], type, mean)

    where 1 is the first treatment and type means genotype, and I've gotten

    Error in tapply(infection[treat = "1"], type, mean) :
    arguments must have same length

    Is there something glaringly obvious I'm doing wrong?


  • Registered Users, Registered Users 2 Posts: 1,845 ✭✭✭2Scoops


    You see, that's exactly what I have, but when I import it into R I can't check the normality of each treatment for each genotype. Should I just check for this in Excel first, or is it possible to use R?

    Not sure, tbh. I tend to use GUIs with R so I'm not 100% up to speed with the code. If you have a low N, it may not calculate it. As this is a side-measure, I don't see a problem with doing it quickly in Excel and leaving the heavy lifting to R. :pac:
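
    For what it's worth, the tapply error above probably comes from the subsetting: `treat = "1"` inside the brackets is not a comparison (that would be `==`), so the subset ends up a different length from `type`. A minimal sketch, assuming a data frame with columns `type`, `treat` and `infection` (made-up random data here, 5 replicates per cell):

```r
# Hypothetical data: genotypes G and M, 6 treatments, 5 replicates per cell.
set.seed(2)
dat <- data.frame(
  type      = factor(rep(c("G", "M"), each = 30)),
  treat     = factor(rep(rep(1:6, each = 5), times = 2)),
  infection = runif(60, 0, 100)
)

# Mean infection for every genotype x treatment cell in one call.
cell_means <- with(dat, tapply(infection, list(type, treat), mean))

# One treatment only: subset BOTH vectors with the same logical index,
# so the values and the grouping factor stay the same length.
first <- dat$treat == "1"
means_t1 <- tapply(dat$infection[first], dat$type[first], mean)

# Shapiro-Wilk normality p-value for each cell (needs at least 3 values per cell).
sw_p <- with(dat, tapply(infection, list(type, treat),
                         function(x) shapiro.test(x)$p.value))
```

    With only a handful of values per cell a normality test has little power, so it is often more useful to look at the residuals of the fitted model instead.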


  • Closed Accounts Posts: 39 nearly_there


    I don't suppose there's any chance anyone could translate this for me? I think I'll have to start doing them out longhand... :(



               Df Sum Sq Mean Sq  F value Pr(>F)
    type        1     73      73   0.1785 0.6731
    treat       1 118360  118360 291.0505 <2e-16 ***
    Residuals 197  80113     407
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Tukey multiple comparisons of means
    95% family-wise confidence level

    Fit: aov(formula = nec ~ type + treat)

    $type
         diff       lwr      upr     p adj
    M-G 1.205 -4.419154 6.829154 0.6731014

    Warning messages:
    1: In replications(paste("~", xx), data = mf) : non-factors ignored: treat
    2: In TukeyHSD.aov(model) :
    'which' specified some non-factors which will be dropped


  • Registered Users, Registered Users 2 Posts: 1,845 ✭✭✭2Scoops


               Df Sum Sq Mean Sq  F value Pr(>F)
    type        1     73      73   0.1785 0.6731
    treat       1 118360  118360 291.0505 <2e-16 ***
    Residuals 197  80113     407

    Reading across, it shows the degrees of freedom, sum of squares, mean square, F-value and p-value. Looks like treatment is significant (p < 2e-16) and genotype is not (p = 0.67).
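
    If it helps to see how the columns relate, the numbers can be checked by hand from the table itself. This is only arithmetic on the printed (rounded) values, so expect small rounding differences:

```r
# Values copied from the ANOVA table above.
ss_resid <- 80113
df_resid <- 197

# Mean square = sum of squares / degrees of freedom.
ms_resid <- ss_resid / df_resid          # about 407, as printed

# F value = mean square of the term / residual mean square.
f_treat <- 118360 / ms_resid             # about 291.05, as printed

# p-value = upper tail of the F distribution with (1, 197) degrees of freedom.
p_type <- pf(0.1785, df1 = 1, df2 = 197, lower.tail = FALSE)
p_type                                   # should land close to the 0.6731 in the table
```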
    Tukey multiple comparisons of means
    95% family-wise confidence level

    Fit: aov(formula = nec ~ type + treat)

    $type
         diff       lwr      upr     p adj
    M-G 1.205 -4.419154 6.829154 0.6731014

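    These are the results from a Tukey post-hoc test. It shows the mean difference between genotypes M and G, the 95% CI (lwr to upr) and the adjusted p-value. The CI spans zero, so there is no evidence of a genotype difference.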
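
    One thing the output hints at: `treat` has only 1 Df and the warnings say it is a non-factor, so R has probably read the treatment codes as numbers and fitted a straight line through them rather than 6 separate groups. A hedged sketch of the likely fix, using made-up data with the same column names as the output (`nec`, `type`, `treat`):

```r
# Hypothetical data: treatment stored as numeric codes 1..6, as it might be on import.
set.seed(3)
dat <- data.frame(
  type  = factor(rep(c("G", "M"), each = 30)),
  treat = rep(rep(1:6, each = 5), times = 2),
  nec   = runif(60, 0, 100)
)

# With a numeric treat, the term gets a single degree of freedom (a linear trend).
fit_numeric <- aov(nec ~ type + treat, data = dat)
summary(fit_numeric)

# Convert to a factor so each treatment is its own group (6 levels, 5 Df).
dat$treat <- factor(dat$treat)

# type * treat adds the type:treat interaction term to the model.
fit_factor <- aov(nec ~ type * treat, data = dat)
summary(fit_factor)

# Tukey comparisons between treatments now work, since treat is a factor.
tuk <- TukeyHSD(fit_factor, which = "treat")
tuk
```

    After the conversion, the Tukey output should list all pairwise treatment comparisons instead of dropping `treat`.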


  • Closed Accounts Posts: 39 nearly_there


    Thanks!!

