NEWS | R Documentation |

Added support for the 2009 High School Longitudinal Study (see `downloadHSLS` and `readHSLS`). These do not support Restricted Use Data (RUD).

Added support for the 2002 Education Longitudinal Study (see `downloadELS` and `readELS`).

Added the ECLS Kindergarten Class of 1998-1999 Study. These datasets can be downloaded with `downloadECLS_K` and read in with `readECLS_K1998`. This was added in 2.3.0 but first added to the NEWS for 2.4.0.

Added support for ePIRLS (see `read_ePIRLS` and `download_ePIRLS`). This was added in 2.3.0 but first added to the NEWS for 2.4.0.

Added support for 2018 for the existing `readICILS` function. Thanks to Jeppe Bundsgaard of Aarhus University, Danish School of Education, for contributing the code for this.

OECD moved the PIAAC data to a new location, and the `downloadPIAAC` function now uses the new URL.

The PISA 2015 data cache could have been formed incorrectly; that is fixed. When PISA 2015 datasets are first used with 2.4 it will take time to re-cache the data. The process now also uses far less memory.

The PISA data had incorrect PSU and stratum variables for most years. They are all fixed except for 2000, which we do not believe has a PSU variable on the file. Several strata have only one PSU, so the design still needs to be edited by the user to get correct Taylor series sampling variance estimates.
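
One way to find the strata that still need attention is to count the distinct PSUs in each stratum. The following is a minimal base R sketch on a made-up design table; the column names `stratum` and `psu` are hypothetical stand-ins for the stratum and PSU variables on the file.

```r
# Made-up design table: one row per sampled unit (hypothetical column names)
design <- data.frame(
  stratum = c("A", "A", "A", "B", "B", "C", "C", "C"),
  psu     = c(1, 1, 2, 1, 1, 1, 2, 3)
)

# Count the distinct PSUs within each stratum
psuPerStratum <- tapply(design$psu, design$stratum,
                        function(z) length(unique(z)))

# Strata with a single PSU cannot contribute a within-stratum variance
# and are the ones that need to be collapsed or otherwise edited
singletons <- names(psuPerStratum)[psuPerStratum == 1]
singletons
```

How the flagged strata are then handled (for example, collapsing them with a neighboring stratum) is a design decision left to the analyst.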

The `mixed.sdf` function now correctly aggregates results and has its methodology documented. It no longer supports binomial models and has several arguments deprecated as a result.

The `gap` function argument `varMethod` is deprecated. The function uses only jackknife variance estimation.

The `gap` function now accounts for linking error between NAEP paper and digitally based assessments.

The `subset` function used to fail when a global variable shared a name with a column on the data; it now works.

The `percentile` function has been updated to a formula that generates survey percentiles that are robust to transformation. For example, if the values are multiplied by a constant, then the percentiles move by that same constant.

The `lm.sdf` function header now prints the number of plausible values used (when they are used) as well as the number of plausible values used in the sampling variance (`jrrIMax`).

The `levelsSDF` function now makes more informative warnings when passed a `light.edsurvey.data.frame`.

The `print` function for `cor.sdf` now prints a final new line.

When `searchSDF` was called with more than one search string and the `levels` argument was set to `TRUE`, a useless warning was issued. The warning was removed.

More `EdSurvey` functions wrap to the width of the console.
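
The transformation property described for `percentile` above can be checked on any percentile estimator. The sketch below uses a hypothetical weighted percentile helper in base R (the smallest value whose cumulative weight share reaches the requested percentile). It is an illustration of the property only, not the formula EdSurvey uses, and the scores and weights are made up.

```r
# Hypothetical helper, not the EdSurvey implementation: the weighted
# percentile is the smallest value whose cumulative weight share reaches p.
weightedPercentile <- function(x, w, p) {
  ord <- order(x)
  x <- x[ord]
  w <- w[ord]
  cumShare <- cumsum(w) / sum(w)
  sapply(p / 100, function(q) x[which(cumShare >= q)[1]])
}

# Made-up scores and survey weights
scores  <- c(210, 250, 265, 280, 300, 320)
weights <- c(1.5, 2.0, 1.0, 2.5, 1.0, 2.0)

p1 <- weightedPercentile(scores, weights, c(25, 50, 75))
p2 <- weightedPercentile(10 * scores, weights, c(25, 50, 75))

# Multiplying the values by 10 multiplies each percentile by 10
all.equal(p2, 10 * p1)
```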

The `rq.sdf` function adds quantile regression to the package. See the `rq.sdf` documentation for more information.

The functions `getStratumVar` and `getPSUVar` were added to give users stratum and PSU variables for surveys and Taylor series analyses.

The `summary2` function now accepts a vector of variables.

The `searchSDF` function now accepts a vector `string` input so that results can be filtered.

The formula for degrees of freedom when Taylor series variance estimation is used has been updated. The new formula is derived in the statistics vignette. See https://www.air.org/sites/default/files/EdSurvey-Statistics.pdf.

PISA 2015 is now supported.

The `waldTest` function allows the user to test composite hypotheses (hypotheses with multiple coefficients involved), even when the data include plausible values. Because there is no likelihood test for plausible values nor residuals, the Wald test fills the role of the likelihood ratio test, ANOVA, and F-test.

The `mvrlm` function adds multivariate regression (a regression with multiple outcomes) to the package. See the `mvrlm` documentation for more information.

Survey weighted mixed models can now be fit with the `mixed.sdf` function. Both linear and logistic models can be fit. These models are limited to 2 levels (one level with random effects).

Regressions can now output standardized regression coefficients using `summary(myLm, src=TRUE)`. When the `lm.sdf` call includes `standardizeWithSamplingVar=TRUE`, the standard error of the standardized regression coefficient accounts for the sampling error and measurement error (when applicable). Otherwise, the standard deviations are assumed to be measured without error.

Added `summary2` function to produce unweighted and weighted descriptive statistics of a variable in an `edsurvey.data.frame` or `light.edsurvey.data.frame`.

Added `$` variable access to `edsurvey.data.frame`, e.g., `sdf$dsex`.
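
For readers unfamiliar with the Wald test mentioned for `waldTest` above, the idea can be sketched in base R. This is an illustration on an ordinary least squares fit to the built-in `mtcars` data; it ignores survey weights and plausible values and is not the `waldTest` implementation.

```r
# Base R sketch of a Wald test for a composite hypothesis; an illustration
# on ordinary least squares, not the EdSurvey waldTest code.
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Composite hypothesis: the coefficients on hp and qsec are both zero
tested <- c("hp", "qsec")
b <- coef(fit)[tested]
V <- vcov(fit)[tested, tested]

# Wald statistic b' V^{-1} b, compared to a chi-square with q degrees of freedom
W <- drop(t(b) %*% solve(V) %*% b)
q <- length(tested)
pChisq <- pchisq(W, df = q, lower.tail = FALSE)

# F version of the same statistic, using the residual degrees of freedom
pF <- pf(W / q, df1 = q, df2 = df.residual(fit), lower.tail = FALSE)

c(W = W, pChisq = pChisq, pF = pF)
```

In this unweighted setting, `W / q` matches the F statistic from comparing the nested models with `anova`, which is why the Wald test can stand in for the F-test.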

Updated covariance matrix estimation in the `vcov` function for `lm.sdf` and `glm.sdf` to work when `varEstInputs` was not returned.

Added covariance matrix estimation for `lm.sdf` when Taylor series variance estimation was used.

Added `rebindAttributes` function to make dplyr interaction smoother. See the `rebindAttributes` documentation for an example.

When printing an `edsurvey.data.frame`, it now says the survey, year, subject, and country at the top. The dimensions are moved down.

In `gap`, achievement levels can be specified with partial matches. This helps when achievement levels have long names.

Added the ECLS Kindergarten Class of 2010-2011 Study. Longitudinal datasets can be downloaded with `downloadECLS_K` and read in with `readECLS_K2011`.

PIRLS 2016 is now supported.

Added `returnNumberOfPSU` in `achievementLevels`, `percentile`, `lm.sdf`, and `gap` to report the number of primary sampling units (PSUs) used to calculate a statistic.

Added `oddsRatio` helper function for logit results to show odds ratios.

When running a Pearson correlation on a discrete variable, the `cor.sdf` function by default condenses the occupied response codes to be consecutive integers. This can now be turned off by setting the `condenseLevels` argument to `FALSE` so that the code book levels are used instead.
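
The effect of condensing can be seen without survey data. The base R sketch below uses made-up response codes (1, 2, and 5) and a made-up outcome; using `match` against the sorted unique codes is one assumed way to produce consecutive integers and is not necessarily what `cor.sdf` does internally.

```r
# Made-up discrete response codes with a gap (no 3 or 4) and an outcome
codes <- c(1, 2, 5, 5, 2, 1, 5, 2)
y     <- c(10, 12, 20, 19, 13, 9, 21, 11)

# Condense the occupied codes to consecutive integers: 1, 2, 5 -> 1, 2, 3
condensed <- match(codes, sort(unique(codes)))

# Pearson correlation depends on the spacing of the codes, so the two differ
corCodebook  <- cor(codes, y)
corCondensed <- cor(condensed, y)
c(codebook = corCodebook, condensed = corCondensed)
```

Because the recoding is not a linear transformation of the original codes, the two correlations generally differ, which is why the choice is exposed as an argument.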

The `glm.sdf` function now uses the `glm2` package to fit models. This package converges on a broader class of models.

The `EdSurvey` package no longer sets the number of threads used by the required `data.table` package to one when EdSurvey is being attached. The issue is now fixed in `data.table`.

Across the download functions, the warning text shown when a file is missing was homogenized. Additionally, all download functions now support a `verbose` argument that can be used to make downloads silent.

The `lm.sdf` and `glm.sdf` functions now accept formulas that use the `I()` function or other unevaluated expressions to the left of the tilde. Previously only a single variable could be named.

`recode.sdf` now checks that each recode has only a `to` and `from` in it.

The `edsurveyTable` function now works without an RHS variable, allowing the formula `y ~ 1` to return overall means.

The `percentile` function used to produce output even if asked to produce a percentile outside of the valid range (0 to 100). Now it prints a message if at least one requested percentile is outside of the valid range, and stops if all percentiles are invalid.

The `getData` function now removes rows with omitted levels after being recoded.

All SPSS (.sav) file reads using the `haven` package set the `user_na = TRUE` flag to ensure no defined missing/omitted values are automatically converted to `NA` values prematurely.

For consistency with other download functions, `downloadPISA` now uses a `years` argument instead of a `year` argument.

Running an `edsurveyTable` on an `edsurvey.data.frame.list` used to have the potential to create invalid (unprintable) output if the factor levels did not agree on every element of the `edsurvey.data.frame.list`. It now returns printable output.

An `edsurveyTable` could produce a standard error when there was data from only one stratum. It now produces an `NA` standard error.

Works with the Trends in International Mathematics and Science Study (TIMSS), TIMSS Advanced, the Progress in International Reading Literacy Study (PIRLS), the International Computer and Information Literacy Study (ICILS), the International Civic and Citizenship Education Study (ICCS), the Civic Education Study (CivEd), the Program for International Student Assessment (PISA), the Program for the International Assessment of Adult Competencies (PIAAC), and the Teaching and Learning International Survey (TALIS).

International datasets can be downloaded with `downloadTIMSS`, `downloadTIMSSAdv`, `downloadPIRLS`, `downloadICILS`, `downloadICCS`, `downloadCivEDICCS`, `downloadPISA`, `downloadPIAAC`, and `downloadTALIS`.

International datasets can be loaded with `readTIMSS`, `readTIMSSAdv`, `readPIRLS`, `readICILS`, `readICCS`, `readCivEDICCS`, `readPISA`, `readPIAAC`, and `readTALIS`.

Added `logit.sdf` and `probit.sdf` functions with support for survey item responses.

Added `gap`, which compares the average, percentile, achievement level, or percentage of survey responses between two groups that potentially share members.

Added `percentile`, which calculates the percentiles of a numeric variable.

Added `showCodebook`, which retrieves variable names, variable labels, and value labels for an `edsurvey.data.frame`, `light.edsurvey.data.frame`, or `edsurvey.data.frame.list`.

Redesigned the `achievementLevels` and `edsurveyTable` functions for significantly faster computation with a much smaller memory footprint. We also made error messages and outputs more informative.

More informative error messages and output for `cor.sdf`, `levelsSDF`, `getPlausibleValue`, `print.edsurvey.data.frame`, `searchSDF`, `showPlausibleValues`, `showWeights`, and `getData`.

`lm.sdf` and `glm.sdf` are now S3 methods extended from `stats::lm`, so users can call the functions using `lm` and `glm`.

Added the `contourPlot` function for regression diagnostics.

Added the `recode.sdf` function for recoding levels within variables.

Added the `rename.sdf` function for modifying variable names.

Added the `append.edsurvey.data.frame.list` function to return a list of sdfs from either an `edsurvey.data.frame.list` or a single `edsurvey.data.frame`.

Manual documentation was refreshed.

Moved vignettes to the AIR website at https://www.air.org/project/nces-data-r-project-edsurvey or see links in the vignette included in this package.

Added a new vignette, “Exploratory Data Analysis on NCES Data,” that provides examples of conducting exploratory data analysis on NAEP data.

Added new vignette on “Calculating Adjusted p-Values From EdSurvey Results” to the AIR website describing the basics of adjusting p-values to account for multiple comparisons.

Added new vignette on “Using EdSurvey to Analyze TIMSS Data” to the AIR website providing an introduction to the methods used in the analysis of large-scale educational assessment programs, such as the Trends in International Mathematics and Science Study (TIMSS), using the EdSurvey package. The vignette covers topics such as preparing the R environment for processing, creating summary tables, running linear regression models, and correlating variables.

Added new vignette on “Using EdSurvey for Trend Analysis” to the AIR website describing the methods used in the EdSurvey package to conduct analyses of statistics that change over time in large-scale educational studies.

Added new vignette on “Producing LaTeX Tables From edsurveyTable Results With edsurveyTable2pdf” to the AIR website detailing the creation of pdf summary tables from summary results using the edsurveyTable2pdf function.

Added new methodology documentation on “Methods Used for Gap Analysis in EdSurvey” to the AIR website covering the methods and comparing the gap analysis results of the EdSurvey package to the NAEP Data Explorer.

Added new methodology documentation on “Methods Used for Estimating Percentiles in EdSurvey” to the AIR website describing the methods used to estimate percentiles.

Added new methodology documentation on “Weighted and Unweighted Correlation Methods for Large-Scale Educational Assessment: wCorr Formulas” to the AIR website detailing the methodology used by the wCorr R package for computing the Pearson, Spearman, polyserial, and polychoric correlations, with and without weights applied. See https://www.air.org/resource/weighted-and-unweighted-correlation-methods-large-scale-educational-assessment-wcorr.

Fixed connection issue associated with closing connections to an `LaF`.

The `readNAEP` function now works on a case-sensitive file system.

Vignettes now should appear in numerical order on CRAN.

Examples now name most arguments.

Vignettes now name most arguments.

Some print functions had the number of plausible values added.