Training LLM-Based Tutors to Improve Student Learning Outcomes in Dialogues

Alexander Scarlatos, Naiming Liu, Jaewook Lee, Richard Baraniuk, Andrew Lan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Generative artificial intelligence (AI) has the potential to scale up personalized tutoring through large language models (LLMs), with recent work focusing on training or prompting LLMs to follow effective pedagogical principles. However, these models are not trained to maximize student learning over the course of a dialogue, so they may engage with students in a suboptimal way. We address this limitation by introducing an approach to train LLMs to generate tutor utterances that maximize the likelihood of student correctness, while still encouraging the model to follow good pedagogical practice. Specifically, we generate a set of candidate tutor utterances and score them using (1) an LLM-based student model to predict the chance of correct student responses and (2) a pedagogical rubric evaluated by GPT-4o. We then use the resulting data to train an open-source LLM, Llama 3.1 8B, using direct preference optimization (DPO). We show that tutor utterances generated by our model lead to significantly higher chances of correct student responses while maintaining the pedagogical quality of GPT-4o. We also conduct qualitative analyses and a human evaluation to demonstrate that our model generates high-quality tutor utterances. (This work is partially supported by Renaissance Philanthropy via the learning engineering virtual institute (LEVI) and NSF grants 2118706, 2237676, and 2341948.) (Our code is available at https://github.com/umass-ml4ed/tutorbot-dpo.)
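
The scoring-then-DPO pipeline described in the abstract can be illustrated with a minimal sketch of how scored candidate utterances might be turned into preference pairs. This is not the authors' implementation (see their repository for that). The `Candidate` class, the equal-weight `combined_score`, the `margin` threshold, and the assumption that both scores lie in [0, 1] are all hypothetical choices made for illustration; the abstract does not say how the two scores are combined.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Candidate:
    """One candidate tutor utterance with its two scores (hypothetical structure)."""
    text: str
    p_correct: float  # student-model estimate that the next student response is correct
    rubric: float     # pedagogical rubric score, assumed here to be rescaled to [0, 1]


def combined_score(c: Candidate, weight: float = 0.5) -> float:
    # Assumption: a simple weighted average of the two scores.
    return weight * c.p_correct + (1.0 - weight) * c.rubric


def preference_pairs(prompt: str, candidates: list[Candidate],
                     weight: float = 0.5, margin: float = 0.1) -> list[dict]:
    """Build (chosen, rejected) pairs for preference training.

    Pairs whose combined scores differ by less than `margin` are skipped,
    since they carry little preference signal.
    """
    pairs = []
    for a, b in combinations(candidates, 2):
        score_a = combined_score(a, weight)
        score_b = combined_score(b, weight)
        if abs(score_a - score_b) < margin:
            continue
        chosen, rejected = (a, b) if score_a > score_b else (b, a)
        pairs.append({"prompt": prompt,
                      "chosen": chosen.text,
                      "rejected": rejected.text})
    return pairs
```

Each resulting record holds a dialogue context plus a preferred and a dispreferred tutor utterance, which is the kind of pairwise data that DPO trains on.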

Original language: English (US)
Title of host publication: Artificial Intelligence in Education - 26th International Conference, AIED 2025, Proceedings
Editors: Alexandra I. Cristea, Erin Walker, Yu Lu, Olga C. Santos, Seiji Isotani
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 251-266
Number of pages: 16
ISBN (Print): 9783031984136
State: Published - 2025
Event: 26th International Conference on Artificial Intelligence in Education, AIED 2025 - Palermo, Italy
Duration: Jul 22 2025 to Jul 26 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15877 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th International Conference on Artificial Intelligence in Education, AIED 2025
Country/Territory: Italy
City: Palermo
Period: 7/22/25 to 7/26/25

Keywords

  • Large Language Models
  • Math Education
  • Reinforcement Learning
  • Tutor-Student Dialogues

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
