Abstract
Cognitive diagnosis, a fundamental task in educational assessment, aims to quantify students' proficiency levels from historical test logs. However, the interactions between students and exercises are incomplete and often sparse: only a few exercise scores are observed for any given student. A key finding is that the pattern of this missingness is non-random, which can bias the estimated proficiency values. To this end, we formulate cognitive diagnosis as a sample selection problem in which observations are sampled with non-random probabilities that correlate with both the student's response correctness and the features of the student and exercise. We propose a simple but effective method called HeckmanCD, which adapts the Heckman two-stage approach to mitigate this endogeneity issue. We first employ an interaction model to predict the occurrence probability of a specific student-exercise pair. A selection variable derived from this interaction model is then incorporated as an additional control variable in the cognitive diagnosis framework. Our analysis reveals that the vanilla estimates of the item response theory model are inherently biased in the presence of confounders, and that our method can correct this bias by capturing the covariance. HeckmanCD can be applied to most existing cognitive diagnosis models, including deep models, and the empirical evaluation demonstrates its effectiveness without requiring auxiliary information such as textual descriptions of exercises. © 2024 ACM.
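The two-stage idea in the abstract can be illustrated with the classic Heckman correction on synthetic data. The sketch below is not the HeckmanCD implementation: it assumes a probit selection model in place of the paper's interaction model and a linear outcome in place of a cognitive diagnosis model, and every name in it (`fit_probit`, `x`, `w`, `rho`, the chosen coefficients) is hypothetical. Stage 1 estimates the probability that a pair is observed; stage 2 adds the inverse Mills ratio derived from stage 1 as an extra regressor.

```python
import math
import numpy as np

# Standard normal pdf and cdf using only the standard library and NumPy.
_erf = np.vectorize(math.erf)

def norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def fit_probit(X, s, n_iter=25):
    """Stage 1 (assumed form): probit for P(observed | X), fitted by Fisher scoring."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = X @ beta
        p = np.clip(norm_cdf(z), 1e-9, 1.0 - 1e-9)
        pdf = norm_pdf(z)
        grad = X.T @ (pdf * (s - p) / (p * (1.0 - p)))   # score vector
        weights = pdf ** 2 / (p * (1.0 - p))             # Fisher information weights
        info = X.T @ (X * weights[:, None])
        beta = beta + np.linalg.solve(info, grad)
    return beta

# Synthetic data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)      # feature that drives both outcome and selection
w = rng.normal(size=n)      # feature that drives selection only
rho = 0.8                   # correlation between selection and outcome errors
u = rng.normal(size=n)      # selection error
e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.normal(size=n)  # outcome error
y = 1.0 + 2.0 * x + e                        # true intercept 1.0, true slope 2.0
observed = (0.5 * x + 1.0 * w + u) > 0       # non-random missingness

# Naive estimate: least squares on the observed rows only.
X_out = np.column_stack([np.ones(n), x])
naive = np.linalg.lstsq(X_out[observed], y[observed], rcond=None)[0]

# Stage 1: model the probability that a row is observed.
X_sel = np.column_stack([np.ones(n), x, w])
gamma = fit_probit(X_sel, observed.astype(float))

# Selection variable: inverse Mills ratio evaluated on the observed rows.
z_obs = X_sel[observed] @ gamma
imr = norm_pdf(z_obs) / np.clip(norm_cdf(z_obs), 1e-12, None)

# Stage 2: refit the outcome model with the selection variable as a control.
X_corr = np.column_stack([X_out[observed], imr])
corrected = np.linalg.lstsq(X_corr, y[observed], rcond=None)[0]

print("naive     [intercept, slope]:", naive)
print("corrected [intercept, slope, imr coef]:", corrected)
```

In this toy setup the coefficient on the inverse Mills ratio absorbs the correlation between the selection and outcome errors, which is the role the abstract assigns to the selection variable. How HeckmanCD derives that variable from its interaction model, and how it enters a binary-response diagnosis model, is specified in the paper and is not reproduced here.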
| Original language | English |
|---|---|
| Title of host publication | CIKM '24 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management |
| Place of Publication | New York, NY |
| Publisher | Association for Computing Machinery |
| Pages | 768-777 |
| ISBN (Print) | 9798400704369 |
| Publication status | Published - Oct 2024 |
| Event | 33rd ACM International Conference on Information and Knowledge Management (CIKM 2024), Boise Centre, Boise, United States. Duration: 21 Oct 2024 → 25 Oct 2024. https://cikm2024.org/ |
Publication series
| Name | International Conference on Information and Knowledge Management, Proceedings |
|---|---|
| ISSN (Print) | 2155-0751 |
Conference
| Conference | 33rd ACM International Conference on Information and Knowledge Management (CIKM 2024) |
|---|---|
| Abbreviated title | CIKM '24 |
| Place | United States |
| City | Boise |
| Period | 21/10/24 → 25/10/24 |
Bibliographical note
Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).
Research Keywords
- cognitive diagnosis
- Heckman model
- sample selection bias
Fingerprint
Dive into the research topics of 'HeckmanCD: Exploiting Selection Bias in Cognitive Diagnosis'. Together they form a unique fingerprint.