Data-mining textual responses to uncover misconception patterns

Joshua J. Michalenko, Andrew S. Lan, Andrew E. Waters, Phillip J. Grimaldi, Richard G. Baraniuk

Research output: Contribution to conference › Paper › peer-review

3 Scopus citations


An important yet largely unstudied problem in student data analysis is to detect misconceptions from students’ responses to open-response questions. Misconception detection enables instructors to deliver more targeted feedback on the misconceptions exhibited by many students in their class, thus improving the quality of instruction. In this paper, we propose a new natural language processing-based framework to detect the common misconceptions among students’ textual responses to short-answer questions. We propose a probabilistic model for students’ textual responses involving misconceptions and experimentally validate it on a real-world student-response dataset. Experimental results show that our proposed framework excels at classifying whether a response exhibits one or more misconceptions. More importantly, it can also automatically detect the common misconceptions exhibited across responses from multiple students to multiple questions; this property is especially important at large scale, since instructors will no longer need to manually specify all possible misconceptions that students might exhibit.

Original language: English (US)
Number of pages: 6
State: Published - Jan 1 2017
Event: 10th International Conference on Educational Data Mining, EDM 2017 - Wuhan, China
Duration: Jun 25 2017 → Jun 28 2017




Keywords

  • Learning analytics
  • Markov chain Monte Carlo
  • Misconception detection
  • Natural language processing

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems

