Contextual multi-armed bandit algorithms for personalized learning action selection

Indu Manickam, Andrew S. Lan, Richard G. Baraniuk

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Scopus citations

Abstract

Optimizing the selection of learning resources and practice questions to address each individual student's needs has the potential to improve students' learning efficiency. In this paper, we study the problem of selecting a personalized learning action for each student (e.g., watching a lecture video or working on a practice question), based on their prior performance, in order to maximize their learning outcome. We formulate this problem using the contextual multi-armed bandits framework, where students' prior concept knowledge states (estimated from their responses to questions in previous assessments) correspond to contexts, the personalized learning actions correspond to arms, and students' performance on future assessments corresponds to rewards. We propose three new Bayesian policies for selecting personalized learning actions, each of which exhibits advantages over prior work, and experimentally validate them using real-world datasets.
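
The context/arm/reward mapping described in the abstract can be illustrated with a minimal sketch. It is not a reproduction of the three policies proposed in the paper, which this record does not specify. It assumes a generic linear-Gaussian Thompson sampling policy, and every name in it (LinearThompsonArm, select_action, simulate), the three-dimensional knowledge-state vector, and the simulated student data are hypothetical and chosen only for illustration.

```python
import numpy as np


class LinearThompsonArm:
    """Bayesian linear model for one learning action (arm).

    Assumption for this sketch: reward ~ Normal(context @ w, noise_var),
    with a zero-mean Gaussian prior on w.
    """

    def __init__(self, dim, prior_var=1.0, noise_var=0.25):
        self.noise_var = noise_var
        # Posterior over w stored as a precision matrix and a
        # precision-weighted mean vector.
        self.precision = np.eye(dim) / prior_var
        self.b = np.zeros(dim)

    def posterior(self):
        cov = np.linalg.inv(self.precision)
        cov = (cov + cov.T) / 2.0  # remove tiny numerical asymmetry
        return cov @ self.b, cov

    def sample_reward(self, context, rng):
        # Draw one plausible weight vector and score this student with it.
        mean, cov = self.posterior()
        w = rng.multivariate_normal(mean, cov)
        return float(context @ w)

    def update(self, context, reward):
        # Conjugate Gaussian update after observing one (context, reward) pair.
        self.precision += np.outer(context, context) / self.noise_var
        self.b += context * reward / self.noise_var


def select_action(arms, context, rng):
    """Thompson sampling: sample a reward per arm, pick the largest."""
    draws = [arm.sample_reward(context, rng) for arm in arms]
    return int(np.argmax(draws))


def simulate(num_students=2000, dim=3, seed=0):
    """Run the policy on simulated students (hypothetical ground truth)."""
    rng = np.random.default_rng(seed)
    # Hypothetical: each action's benefit depends on a different concept.
    true_w = np.eye(dim)
    arms = [LinearThompsonArm(dim) for _ in range(dim)]
    picked_best = []
    for _ in range(num_students):
        context = rng.uniform(0.0, 1.0, size=dim)  # estimated knowledge state
        action = select_action(arms, context, rng)
        # Simulated future-assessment score with noise (std 0.5).
        reward = float(true_w[action] @ context + rng.normal(0.0, 0.5))
        arms[action].update(context, reward)
        picked_best.append(action == int(np.argmax(true_w @ context)))
    # Fraction of the last 500 students who received the best action.
    return arms, float(np.mean(picked_best[-500:]))
```

Here each student's context is the knowledge-state vector, each arm holds its own posterior over how that vector predicts the future-assessment score, and only the chosen arm is updated, since the outcome of the other actions is never observed for that student.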

Original language: English (US)
Title of host publication: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6344-6348
Number of pages: 5
ISBN (Electronic): 9781509041176
State: Published - Jun 16 2017
Event: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - New Orleans, United States
Duration: Mar 5 2017 to Mar 9 2017

Other

Other: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017
Country: United States
City: New Orleans
Period: 3/5/17 to 3/9/17

Keywords

  • contextual bandits
  • personalized learning

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
