University of Washington College of Education Professor Chun Wang has received a $764,000 federal grant to develop and disseminate more efficient statistical models and software for analyzing complex educational assessments.

Wang, director of the Psychometrics and Measurement Lab at the UW, will lead the project together with Professor Gongjun Xu from the University of Michigan over the next three years with funding from the National Center for Education Research. As part of the project, Wang’s team will share their software and theoretical work with other education researchers and assessment practitioners to improve the field’s ability to create effective assessments of student learning.

Wang answered questions about the project and its impact in a recent interview.

What suite of statistical learning methodologies are you proposing in the project?

Let me begin with a little background first. Developing, refining and validating educational assessments that are directly or indirectly related to measures of student academic outcomes is one core goal, known as the measurement goal, of the Institute of Education Sciences. Psychometric methods and tools have always been an integral part of achieving this goal. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and convenient statistical tool for item analysis, calibration and scoring.

However, the increasing scale and complexity of survey designs, especially in large-scale assessment (LSA), require MIRT models with many latent factors. Advances in computational and statistical techniques have helped promote the use of MIRT models. Yet even with state-of-the-art algorithms, the computation can still be time-consuming, especially when the number of factors is large.

In this project, we propose a family of innovative Gaussian variational expectation maximization (GV-EM) methods for high-dimensional MIRT models that reflect the assessment and data collection designs of LSA better than current methods do. Variational approximation methods are a mainstream methodology in computer science and have been applied to diverse areas including speech recognition, genetic linkage analysis and document retrieval. Recently, interest has emerged in developing and applying variational methods in statistics. However, variational methods remain far less explored in psychometrics and educational measurement.

How will the new methodologies advance educational researchers’ capacity for handling data?

We aim to contribute to the practitioner’s toolkit for assessment design and analysis by providing innovative statistical learning methods, software and guidelines that are suitable for high-dimensional assessment data, large sample sizes, large item banks and intricate designs.

In addition to item calibration, items for large-scale standardized testing are routinely scrutinized for differential item functioning (DIF) to ensure equitable comparison of assessment outcomes among different student groups. We intend to provide a suite of DIF detection methods within the MIRT framework and provide benchmarks of the new effect size measure for categorizing DIF items. The new sets of methods will be implemented in a user-friendly software package available as an R Shiny app, which researchers can use to conduct MIRT item calibration, DIF detection and student scoring.

Overall, we expect that the proposed computation algorithms and the new DIF detection methods will greatly popularize MIRT applications, so that practitioners can truly benefit from MIRT advances: the analysis results provide not only validity evidence but also insight that can be fed back into the assessment development process.

What is the outreach plan to ensure the proposed methodologies reach a broader audience?

We believe methods from this study will be of interest to methodologists, psychometricians, national and state assessment practitioners, as well as to the education research community. First, we will share our findings with academic audiences through national conference presentations and publications for organizations such as, but not limited to, the American Educational Research Association, National Council on Measurement in Education, Psychometric Society and Joint Statistical Meetings. Second, we will broadly disseminate the Shiny app, along with the manual that documents detailed use cases and recommended guidelines, to students, researchers and practitioners nationwide via online modules and in-person classes and workshops. The vetted source code will be shared on GitHub in a timely manner, and project updates will be posted on the UW Psychometrics and Measurement Lab website. Finally, we will collaborate with the Washington State Office of Superintendent of Public Instruction to implement the MIRT approach for producing student growth measures in one of their assessment products.


Chun Wang, Assistant Professor of Education

Dustin Wunderlich, Director of Marketing and Communications