To analyse national student data collected for the International Civic and Citizenship Education Study (ICCS 2016), one should take into account its complex survey sampling design, a “two-stage sampling procedure whereby a random sample of schools is selected at first stage, and one or two intact target grade classes (…) are sampled at the second stage”, as well as the multiple imputation of civic knowledge proficiency test scores. Details can be found in the ICCS 2016 User Guide for the International Database and the ICCS 2016 Technical Report. This post shows how to prepare the data for analysis using the survey and mitools R packages.
You will need to register at the IEA Data Repository to get the data files. This example uses datasets in SPSS format, which are imported using the read.spss() function from the foreign package. There are four relevant national datasets for Croatia; they are read and merged into an object called 'data'.
# path = 'path/to/data/dir'
filenames <- list.files(path, pattern = "\\.sav$", full.names = TRUE)
data.list <- lapply(filenames, foreign::read.spss, use.value.labels = FALSE, to.data.frame = TRUE,
use.missings = TRUE)
data <- Reduce(function(...) merge(..., all = TRUE), data.list)
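The Reduce() and merge() idiom above can be tried on toy data first. The sketch below uses three made-up data frames with a hypothetical key column IDSTUD; the values are invented for illustration and are not taken from the ICCS files. Note that merge() joins on every column name the two inputs share, so with the real files it is worth checking which columns they have in common.

```r
# Toy data frames standing in for the imported .sav files (invented values).
a  <- data.frame(IDSTUD = c(1, 2, 3), x = c(10, 20, 30))
b  <- data.frame(IDSTUD = c(2, 3, 4), y = c("b", "c", "d"))
c3 <- data.frame(IDSTUD = c(1, 4), z = c(TRUE, FALSE))

# merge() joins on all columns shared by both inputs (here only IDSTUD);
# all = TRUE keeps rows that are absent from one side (full outer join).
merged <- Reduce(function(...) merge(..., all = TRUE), list(a, b, c3))

print(merged)
dim(merged)  # one row per distinct IDSTUD, one column per distinct variable
```

Rows that appear in only some of the inputs get NA in the columns they lack, which is why the later analysis uses na.rm = TRUE where needed.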
To analyse test scores, note that each student took only part of the complete test, and the final population estimate was derived using the plausible values methodology, which is based on multiple imputation. There are five sets of plausible values in the dataset (columns PV1CIV to PV5CIV), and estimation should always use all of them, as recommended in the technical documentation. To deal with multiple imputation in a complex survey design context, the first step is to create five versions of the same dataset, each with a new column holding a different set of plausible values. In the second step, the survey design object 'des' is created with the svrepdesign() function from the survey package, using the provided replicate weights and the list of datasets wrapped in the imputationList() function from the mitools package.
# Build five copies of the data, each with one set of plausible values in 'pvciv'
data.imp <- lapply(1:5, function(x, d = data) {
  d$pvciv <- d[[paste0("PV", x, "CIV")]]
  return(d)
})
library("mitools")
library("survey")
des <- svrepdesign(repweights = "^SRWGT", type = "JK1", scale = 1, combined.weights = TRUE,
weights = ~TOTWGTS, data = imputationList(data.imp))
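To give a feel for what replicate weights do, here is a minimal base R sketch of replicate-weight variance estimation with a scale of 1. It is not the survey package's code and does not use ICCS data: the scores, weights and pairing of units are all invented, and it assumes squared deviations are taken from the full-sample estimate, which is one common convention (the survey package lets you control the centring through its mse argument).

```r
# Invented scores and full-sample weights for eight units.
y <- c(480, 510, 530, 555, 500, 620, 470, 540)
w <- rep(10, length(y))

# Invented replicate weights: units are paired, and in replicate r one unit
# of pair r gets weight zero while its partner's weight is doubled.
pairs <- matrix(seq_along(y), ncol = 2, byrow = TRUE)
repw <- sapply(seq_len(nrow(pairs)), function(r) {
  wr <- w
  wr[pairs[r, 1]] <- 0
  wr[pairs[r, 2]] <- 2 * w[pairs[r, 2]]
  wr
})

# Estimate with the full-sample weights, then once per replicate.
theta_full <- weighted.mean(y, w)
theta_rep  <- apply(repw, 2, function(wr) weighted.mean(y, wr))

# With scale = 1 the variance is the plain sum of squared deviations
# (assumption: deviations are taken from the full-sample estimate).
var_rep <- sum((theta_rep - theta_full)^2)
se_rep  <- sqrt(var_rep)
c(estimate = theta_full, se = se_rep)
```

Each column of repw plays the role of one SRWGT column in the real data, and w plays the role of TOTWGTS.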
To combine the plausible values of test scores, wrap functions from the survey package in the with() and MIcombine() functions from the mitools package.
MIcombine(with(des, svymean(~pvciv)))
## Multiple imputation results:
## with(des, svymean(~pvciv))
## MIcombine.default(with(des, svymean(~pvciv)))
##        results       se
## pvciv 531.2111 2.470435
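MIcombine() pools the five results using Rubin's rules. The arithmetic can be sketched by hand in base R as below; the five estimates and standard errors are made up for illustration and are not the per-imputation results behind the output above.

```r
# Made-up point estimates and standard errors, one per plausible value.
est <- c(530.8, 531.5, 531.0, 531.6, 531.2)
se  <- c(2.40, 2.45, 2.38, 2.42, 2.41)
m   <- length(est)

q_bar <- mean(est)    # pooled point estimate
w_bar <- mean(se^2)   # average within-imputation (sampling) variance
b     <- var(est)     # between-imputation variance

# Rubin's rules: total variance = within + (1 + 1/m) * between
total_var <- w_bar + (1 + 1 / m) * b
c(estimate = q_bar, se = sqrt(total_var))
```

The pooled standard error is always at least as large as the average sampling error, because the spread between plausible values adds the uncertainty that comes from the imputation itself. This is also why using a single plausible value understates the standard error.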
Other variables should be analysed in the same way. For example, to get proper proportions and standard errors as shown on page 86 of the report Becoming Citizens in a Changing World, you could do something like this:
des <- update(des,
  is3g16b = car::Recode(IS3G16B, "1:2='Yes';3='No'"),
  is3g16c = car::Recode(IS3G16C, "1:2='Yes';3='No'"),
  is3g16e = car::Recode(IS3G16E, "1:2='Yes';3='No'"))
x <- MIcombine(with(des, svymean(~is3g16b + is3g16c + is3g16e, na.rm = TRUE)))
# Rows 2, 4 and 6 hold the 'Yes' category of each variable
cbind(percentage = round(coef(x) * 100, 1), SE = round(SE(x) * 100, 2))[c(2, 4, 6), ]
##            percentage   SE
## is3g16bYes       91.0 0.55
## is3g16cYes       20.2 1.02
## is3g16eYes       57.6 1.12
Please do read the linked technical documentation! If you have any feedback, get in touch.