On the Cover
2016; Wiley; Volume: 35; Issue: 3; Language: English
DOI: 10.1111/emip.12121
ISSN: 1745-3992
The Remastered Item Person Map, created by Richard Feinberg and Daniel Jurich of the National Board of Medical Examiners, is featured on the cover. This visualization was one of the winning submissions in the 2016 EM:IP Cover Graphic/Data Visualization competition held earlier this year. Feinberg and Jurich developed a visual tool for informing the standard-setting process, particularly for evaluating the alignment of proposed cut scores with the test items and presenting the information in the context of the intended purpose(s) of a test.

In an effort to optimize decisions based on test scores, classification methods have been developed to maximize the test's information around meaningful points on the score scale, such as the "passing" or "proficient" score on a licensure or certification test. Similar standard-setting challenges arise in the National Assessment of Educational Progress (NAEP) and in many state accountability testing programs that report percentages of students scoring within different proficiency categories, such as Basic, Proficient, and Advanced. Depending on the standard-setting method used, the distribution of the examinees' proficiency estimates and the distribution of the test items' difficulty estimates may not be optimally aligned given the intended purpose of the test. Graphically presenting the degree of alignment would benefit both test developers and those charged with setting "proficiency" standards.

This value is illustrated well by the Remastered Item Person Map, created in R (R Core Team, 2016) using the ggplot2 package (Wickham, 2009). The Remastered Item Person Map is more aesthetically appealing than graphics typically produced by psychometric software, and it is also customizable. For instance, the display highlights a common situation in classification testing, where the primary use of the test is to identify individuals who are minimally competent. Overlaying the plot with the passing score, represented by the vertical dashed line in the graphic, shows that the majority of the test items in this example are targeted around the most important score point, maximizing decision accuracy at the possible cost of increased measurement error for other, secondary uses of the test, such as distinguishing among high performers at the upper end of the proficiency distribution. Additionally, to aid interpretation of the relative locations of the item difficulty distributions, the reporting scale can be labeled on the x-axis, as we see in this graphic with the scale scores in parentheses. Thus, this item-person map can be designed to incorporate additional information that helps facilitate classification decisions and communicate valid inferences to a range of stakeholders.

Feinberg and Jurich took this year's Cover Graphic/Data Visualization competition as an opportunity to provide practitioners with an easy-to-create, eye-catching visual tool that can assist in standard setting and in the subsequent evaluation of the alignment of the estimated person ability and item difficulty distributions, given the location of actual (or potential) cut scores and the intended purpose of the test. As shown in the R code in the Online Supplemental Material, simply changing the "cut-score" value produces multiple versions of the plot on demand, illustrating the effects of possible cut scores on the proportion of "passing" students in light of the difficulty of the items comprising the test.
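To make the basic construction concrete, the following is a minimal sketch of this kind of item-person map in R with ggplot2. The simulated data, variable names, cut-score value, and the theta-to-reporting-scale transformation are all illustrative assumptions on our part; the authors' actual script is available in the Online Supplemental Material.

library(ggplot2)

# Simulated values standing in for estimated person abilities and
# item difficulties on a common (e.g., Rasch) theta scale.
set.seed(2016)
persons <- data.frame(value = rnorm(5000, mean = 0.2, sd = 1.0), group = "Examinees")
items   <- data.frame(value = rnorm(200, mean = -0.5, sd = 0.8), group = "Items")
cut_score <- -0.4  # hypothetical passing score; change to explore alternatives

ggplot(rbind(persons, items), aes(x = value, fill = group)) +
  geom_density(alpha = 0.4) +
  geom_vline(xintercept = cut_score, linetype = "dashed") +
  # Reporting-scale scores shown in parentheses; the linear mapping
  # 500 + 10 * theta is a made-up example, not an actual reporting scale.
  scale_x_continuous(breaks = -3:3,
                     labels = function(b) paste0(b, " (", 500 + 10 * b, ")")) +
  labs(x = "Theta (reporting scale score)", y = "Density", fill = NULL) +
  theme_minimal()

Changing cut_score and rerunning the script reproduces the on-demand behavior described above, showing how alternative cut scores fall relative to the two distributions.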
As noted by one of the judges in the competition, this graphic may not be entirely distinct from tools already available for standard setting. However, using it does not require proprietary software, and the R code can easily be modified to add (or remove) graphical and informational features, such as printing the actual proportion of students falling in each performance level of the person density distribution.

Moreover, this graph could be extended to include item and person distributions across several consecutive grade levels for vertically scaled assessments, allowing direct comparisons of cut score locations across grade levels to verify whether they align with increasing mastery along a common continuum. Such an extended graphic would be particularly useful to states and school districts that use a transition matrix or value table model when reporting student progress across grade levels. Yet another useful extension would be overlaying examinee distributions for different subgroups of interest, such as by socioeconomic status or English language learner status, to determine whether cut score placement affects pass/fail rates differently across subgroups, which could arise from different variances in their distributions (a rough sketch of such a subgroup overlay follows this discussion).

The usefulness of this graphic also depends in part on features of the assessment, such as how it is scaled. If the items are calibrated using the Rasch model, then the comparison of person ability and item difficulty estimates is more meaningful, because the rank ordering of items by difficulty will be the same for all examinees, regardless of ability estimate. We also note that relatively large numbers of items and examinees are needed to produce the clean, smoothed distributions shown in the cover graphic. If the number of items is small or moderate, the item distribution may have several peaks and valleys that may (1) distract the viewer and (2) make it difficult to evaluate the extent to which the placement of the cut scores maximizes information around meaningful score points.

Even with these considerations, the Remastered Item Person Map offers a great visual tool to inform the common practice of standard setting. In submitting their graphic and sharing their R code, Feinberg and Jurich show a commitment to advancing data visualization tools that inform psychometric analyses and facilitate communication with stakeholders, a value directly in line with the intent of the EM:IP Cover Graphic/Data Visualization competition.
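As a rough sketch of the subgroup extension mentioned above, one might map the density fill to a subgroup indicator instead of the person/item distinction. The data, subgroup labels, and cut score below are simulated assumptions, not the authors' code; the example deliberately gives the two groups equal means but different variances, so the same cut score implies different pass rates.

library(ggplot2)

set.seed(7)
grp <- rbind(
  data.frame(theta = rnorm(3000, mean = 0, sd = 0.8), subgroup = "Group A"),
  data.frame(theta = rnorm(3000, mean = 0, sd = 1.3), subgroup = "Group B")
)
cut_score <- -0.4  # hypothetical passing score

ggplot(grp, aes(x = theta, fill = subgroup)) +
  geom_density(alpha = 0.4) +
  geom_vline(xintercept = cut_score, linetype = "dashed") +
  labs(x = "Theta", y = "Density", fill = NULL) +
  theme_minimal()

# Implied pass rates by subgroup at this cut score:
tapply(grp$theta >= cut_score, grp$subgroup, mean)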
Let us know what you think by emailing Katherine Furgol Castellano (KEcastellano@ets.org) or Howard Everson (howard.everson@sri.com).

Disclaimer: Supplementary materials have been peer-reviewed but not copyedited. Additional Supporting Information may be found in the online version of this article at the publisher's website: R Code. Please note: The publisher is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.

Erratum: In the Fall 2015 issue of EM:IP, we printed the R code for creating the Histogram of Weighted Effect Sizes for Meta-Analysis by Frederick Oswald, Rice University, and Seydahmet Ercan, Bulent Ecevit University. The histogram should show the effect sizes weighted by the inverse of the effect size variances, but the printed code weights them by the effect size variances themselves. An additional line of R code, shown below, is needed to correctly weight the effect sizes.

effect.var.inverse <- 1/effect.var
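For context, here is a minimal sketch of how such inverse-variance weights might enter a weighted histogram. The variable names effect.size and effect.var follow the erratum, but the simulated data and the plotting call are our own illustrative assumptions, not Oswald and Ercan's published code.

set.seed(42)
effect.size <- rnorm(100, mean = 0.3, sd = 0.2)    # simulated effect sizes
effect.var  <- runif(100, min = 0.01, max = 0.10)  # simulated sampling variances

# The corrected weights from the erratum: inverse of the variances,
# so more precisely estimated effects count more.
effect.var.inverse <- 1/effect.var

library(ggplot2)
ggplot(data.frame(es = effect.size, w = effect.var.inverse),
       aes(x = es, weight = w)) +
  geom_histogram(bins = 20) +
  labs(x = "Effect size", y = "Inverse-variance weighted count")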
References

R Core Team. (2016). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

Wickham, H. (2009). ggplot2: Elegant graphics for data analysis. New York, NY: Springer.