MixedEffects.Rmd

---
title: "An introduction to mixed-effects models"
author: "Richard Nichols & Hannes Becher"
date: "11th June 2015"
output: "html_document"
---

## Grow your own apples
We will introduce some ideas about mixed effects models using the example of apples growing on different trees within an orchard.  First lets take a sample of apples from a single tree.  We will use this R code

```{R}
# Set your sample size to 20
NumApples=20	 

# Choose a mean apple size for the tree. The code allocates a value around 20g 	
# plus a small random deviation so the value is unique to you				
TreeMean1=20+rnorm(1,sd=3)		

# Chose a standard deviation of around 5g in the same way
TreeSD1=5+rnorm(1,sd=1)			

# Generate a sample of apples from tree 1
apples1<-rnorm(NumApples,
				 mean=TreeMean1,
				 sd=TreeSD1)

# Draw a histogram of your sample
hist(apples1,
	main='Weights of a sample of Apples from Tree 1',
	breaks=seq(0,150,5))		
````							

Because you have a relatively small sample from the tree, the mean of your sample
and the stadard deviation of your sample would be expected to deviate from the
true values for all the apples on the tree.  Lets use the functions mean() and
sd() to show that this is the case.  We can just print them out

```{R}
mean(apples1); TreeMean1; sd(apples1); TreeSD1
```
Here they are set out in a table

```{R Display Table, results='asis', echo=FALSE}
library(stargazer)
sampleMean1<-mean(apples1)
sampleSD<-sd(apples1)
stargazer(matrix(	c(TreeMean1,TreeSD1,sampleMean1,sampleSD),
				nrow=2,ncol=2,byrow=F,
				dimnames<-list(c('Mean','SD'),c('True Values ','.       Sample Values.'))
				),
			type='html',
			title='Sample statistics vs true values'
			)	
```
&nbsp;

  
  
* Point to remember:  Its obvious, but easy to forget that the mean weight and standard deviation of _all_ the apples on the tree are different from the mean and standard deviation of your sample.

Now lets use the function lm, which fits a 'linear model' to summarise this sample.

```{R}
mod1<-lm(apples1~1)
summary(mod1)
```

This statement created a linear model called 'mod1' (you could have called it anything you like). The model is described by
<BR>&nbsp;&nbsp; `apples1~1`
<BR>
One way of interpreting the code is to read the '~' symbol as 'is explained by'.  The value '1' indicates that all the apples' weights are distributed around the same mean (they are not expected to have identical weights, but to vary around this mean).  You will see from the printout generated by the summary() function that lm() found the same estimated mean as the mean(apples1).  The Standard Error that it reports, however is __not__ the standard deviation of the sample.

* __Very important point to remember:__ the 'Std. Error' is a value that indicates the reliability of your estimate of the mean.  Recall the estimated mean (sample mean) differs from the true mean of all the apples on the tree - so you have an intrest in evaluating how

You can calculate the Standard Error yourself: divide the sample standard deviation by the square root of the sample size.

```{R}
sd(apples1)/sqrt(NumApples)
```