Individual Exercise
- Take a grouped dataset (observations within states, individuals, regions, years, etc.) that is of interest to you and use reshape2 to obtain group means and standard deviations, and plyr to conduct a no-pooling analysis of a response variable. Report the estimated coefficients and standard errors in data frames.
## example w/ baseball data
library(plyr)
library(reshape2)
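## drop the non-numeric columns, then melt to long format (one row per id/variable pair)
bb <- baseball
bb$team <- NULL
bb$lg <- NULL
bb_molt <- melt(bb, 'id')
## cast back to one row per player with the group mean and sd of each variable
bb_mean <- dcast(bb_molt, id ~ variable, mean)
bb_sd <- dcast(bb_molt, id ~ variable, sd)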
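The example above covers the group means and standard deviations but not the no-pooling step. A minimal sketch of that step follows, assuming (purely for illustration) that the response is runs batted in (rbi) and the single predictor is at-bats (ab); the model, the five-season cutoff, and the names fit_one and no_pool are not from the exercise. Each player gets a separate lm() fit, and ddply() stacks the per-player coefficient tables into one data frame.

```r
library(plyr)

## keep complete cases for the two (illustrative) model variables
bb_np <- subset(baseball, !is.na(rbi) & !is.na(ab))

## keep players with at least 5 remaining seasons so each fit has
## residual degrees of freedom (the cutoff is an arbitrary assumption)
season_counts <- table(bb_np$id)
bb_np <- bb_np[bb_np$id %in% names(season_counts)[season_counts >= 5], ]

## fit one regression to one player's rows and return a tidy table
fit_one <- function(d) {
  fit <- lm(rbi ~ ab, data = d)
  est <- coef(summary(fit))  # matrix with Estimate, Std. Error, ...
  data.frame(term = rownames(est),
             estimate = est[, 'Estimate'],
             std_error = est[, 'Std. Error'],
             row.names = NULL)
}

## no pooling: a separate fit per player, stacked into one data frame
no_pool <- ddply(bb_np, 'id', fit_one)
head(no_pool)
```

The result has one row per player and term, with columns id, term, estimate, and std_error, so it can be merged with bb_mean or bb_sd on id if needed.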
- Using the Broken Function.R script on Sakai, work with browser() and the debugger to find all the mistakes in the function. Fix them so that the last two lines of the script return a vector of zeroes. Include the corrected function in your notebook and demonstrate that it works.
## corrected function
index.means <- function(x, rows = TRUE) {
  
  ## class(x) can have length > 1 (a matrix is c('matrix', 'array') in R >= 4.0), so test with is.*()
  if (!is.data.frame(x) && !is.matrix(x)) stop('needs two dimensional input')
  simple.mean <- function(x) sum(x) / length(x)
  
  if (rows) {
    
    output <- numeric(nrow(x))
    for (i in seq_len(nrow(x))) output[i] <- simple.mean(x[i, ])
    
  } else {
    
    output <- numeric(ncol(x))
    for (i in seq_len(ncol(x))) output[i] <- simple.mean(x[, i])
    
  }
  
  output
  
}
## generate fake data
mat <- matrix(rgamma(200, 2, 3), 20, 10)
## these both yield vectors of zeroes (+/- floating point errors)
index.means(mat) - apply(mat, 1, mean)
##  [1]  0.0e+00  1.1e-16 -5.6e-17  0.0e+00  0.0e+00 -1.1e-16  5.6e-17
##  [8]  0.0e+00  0.0e+00 -1.1e-16  0.0e+00  2.2e-16  0.0e+00  0.0e+00
## [15]  1.1e-16  0.0e+00  0.0e+00 -5.6e-17  0.0e+00  0.0e+00
index.means(mat, rows = FALSE) - apply(mat, 2, mean)
##  [1] -5.6e-17  0.0e+00 -1.1e-16  0.0e+00  1.1e-16  0.0e+00  0.0e+00
##  [8]  0.0e+00  0.0e+00  0.0e+00
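Because the differences above are zero only up to floating point error, a tolerance-based comparison may be a cleaner way to demonstrate that the function works. The sketch below is self-contained, so it recomputes the manual means with sum(x) / length(x) rather than calling index.means; the seed and the names manual_row_means and manual_col_means are illustrative assumptions.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
mat <- matrix(rgamma(200, 2, 3), 20, 10)

## manual means, computed the same way as simple.mean() above
manual_row_means <- apply(mat, 1, function(r) sum(r) / length(r))
manual_col_means <- apply(mat, 2, function(cl) sum(cl) / length(cl))

## all.equal() compares within a numerical tolerance instead of exactly
rows_ok <- isTRUE(all.equal(manual_row_means, rowMeans(mat)))
cols_ok <- isTRUE(all.equal(manual_col_means, colMeans(mat)))
c(rows = rows_ok, cols = cols_ok)
```

The same check should apply to the corrected function by replacing the manual means with index.means(mat) and index.means(mat, rows = FALSE).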