Abstract
Tests, surveys, and questionnaires are used everywhere every day. Governments and industries use tests to select personnel and assess their performance. In marketing, (cross-country) surveys are administered to understand people’s attitudes or shopping habits and to target potential customers. In educational settings, large-scale assessment programs have been implemented over the past few decades to monitor achievement trends globally, to establish and revise curriculum and achievement standards, and to stimulate educational reforms. To ensure measurement fairness and equivalence across groups or countries before comparisons are made, differential item functioning (DIF) analysis has been widely used. DIF occurs when examinees from different groups show differing probabilities of success on, or endorsement of, an item after being matched on the latent ability the item is designed to measure; the item then favors one group and disadvantages another. Test scores are thus not comparable, and misinformed decisions will be made.

In a DIF assessment, participants are matched on levels of the latent ability using a matching variable. The purpose of the matching variable is to create a common metric that puts groups of participants on the same scale so that responses to the studied item can be compared. When groups are mismatched, DIF-free items may be flagged as DIF items and removed from the test, wasting time and money in test development; conversely, true DIF items may be retained in a scale, and the test scores misused in comparisons. Usually, the matching variable is the total score on a scale, but under some conditions, such as multidimensional data or designs involving two or more grouping factors, it must be modified to create a more accurate common metric.

Previous studies have indicated that participants tend to respond to test items in systematic ways unrelated to what the instrument is intended to measure, a tendency referred to as response styles. Response styles are consistent across attitude and personality assessments, and their patterns and degrees vary across groups and cultures. Psychometric studies have shown that response styles are detrimental to the validity and reliability of an assessment tool and that they bias item parameter estimates. Total scores are therefore contaminated and cannot be used as a matching variable for DIF assessment when response styles are involved. Conventional DIF approaches, which do not take the impact of response styles into account, fail to establish an appropriate matching variable and yield problematic DIF detections; differences in test performance cannot then be interpreted as group differences.

Response styles and DIF have received much attention in academia. Searches in PsycINFO using keywords such as “response styles” and “differential item functioning” found more than 10,000 publications addressing response styles and nearly 1,700 articles investigating DIF. Although response styles have been recognized as an important factor that distorts the psychometric properties of scales, no studies have examined their impact on the performance of standard DIF methods, apart from my two pilot studies. Findings from those pilot studies suggested that, without taking the influence of response styles into account, conventional DIF assessments (e.g., logistic discriminant function analysis [LDFA] and ordinal logistic regression) yielded serious inflation of false positive error rates (Type I error rates).
A proposed procedure that accounts for the impact of response styles improved the performance of LDFA, with Type I error rates well controlled. Thus, the aims of this project are (a) to resolve the limitations of conventional DIF assessments, (b) to develop a new procedure for assessing DIF, (c) to evaluate the performance of the new procedure under various conditions, and (d) to develop free software for practitioners and researchers to use. Computer simulation studies and publicly accessible datasets will be analyzed to evaluate applications of the new framework.
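To make the matching-variable logic concrete, the sketch below illustrates a conventional logistic-regression DIF test of the kind the abstract critiques: examinees are matched on the observed total score, and nested models are compared to test for uniform and nonuniform DIF. This is a minimal sketch, not the proposed procedure; the simulated data, the variable names (`item`, `total`, `group`), and the use of `statsmodels` are illustrative assumptions only.

```python
# Minimal sketch of a conventional logistic-regression DIF test for a binary
# studied item, matching on the total score (illustrative assumptions only).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

rng = np.random.default_rng(0)
n = 1000
group = rng.integers(0, 2, n)                  # 0 = reference group, 1 = focal group
theta = rng.normal(0, 1, n)                    # latent ability
# Simulate a 10-item test via a simple Rasch-like model; the studied item is item 0.
difficulty = rng.normal(0, 1, 10)
prob = 1 / (1 + np.exp(-(theta[:, None] - difficulty)))
items = (rng.uniform(size=(n, 10)) < prob).astype(int)

df = pd.DataFrame({
    "item": items[:, 0],
    "total": items.sum(axis=1),                # total score as the matching variable
    "group": group,
})

# Nested models: matching only, + group (uniform DIF), + interaction (nonuniform DIF)
m0 = smf.logit("item ~ total", df).fit(disp=0)
m1 = smf.logit("item ~ total + group", df).fit(disp=0)
m2 = smf.logit("item ~ total + group + total:group", df).fit(disp=0)

# Likelihood-ratio tests for uniform and nonuniform DIF
lr_uniform = 2 * (m1.llf - m0.llf)
lr_nonuniform = 2 * (m2.llf - m1.llf)
print("uniform DIF p-value:", chi2.sf(lr_uniform, df=1))
print("nonuniform DIF p-value:", chi2.sf(lr_nonuniform, df=1))
```

The abstract’s argument is that when response styles distort item responses, the total score used as the matching variable in models of this kind is contaminated, which is what inflates the Type I error rates observed in the pilot studies.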
| Original language | English |
|---|---|
| Publication status | Published - 31 Oct 2014 |
| Event | annual meeting of the Taiwan Education Research Association - Kaohsiung, Taiwan, China; duration: 31 Oct 2014 → 1 Nov 2014 |
Conference
| Conference | annual meeting of the Taiwan Education Research Association |
|---|---|
| Place | Taiwan, China |
| City | Kaohsiung |
| Period | 31/10/14 → 1/11/14 |