A dynamic theory of acquisition and extinction in operant learning

Research output: Contribution to journal, Article, peer-review

14 Scopus citations

Abstract

This article offers a new neural network framework for understanding both the transients and the asymptotes of operant (instrumental) learning. The theory shows that interplay between simple short- and long-term memory mechanisms is sufficient to explain a large number of operant phenomena. It describes short- and long-term effects of reinforcement and how these effects modulate the operant response, how novel events are detected and processed, and how their consequences also modulate the operant response. The critical features of the present theory are: (1) reinforcement expectancy is defined as the aggregate prediction of response-reinforcement and stimulus-reinforcement associations; (2) reinforcement expectancy controls the rate of increase of the operant response; (3) the response is controlled by a behavioral inhibition unit which integrates the mismatch between expected (long-term) and experienced (short-term) events. The model predicts the general features of several operant phenomena such as response selection, development of preference under different manipulations of reinforcement probabilities, negative contrast, the partial reinforcement extinction effect, and spontaneous recovery. Implications of the present theory for other operant conditioning phenomena, classical conditioning, and avoidance behavior are suggested.
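The three features listed in the abstract can be illustrated with a toy simulation. The sketch below is not the article's model: the update rules, the parameter names (`fast`, `slow`, `gain`, `leak`), their values, and the assumption that reinforcement is delivered on every acquisition trial and never during extinction are all illustrative choices made here. It only shows how an expectancy built from two long-term associations, a fast short-term trace, and an inhibition unit that integrates their mismatch can produce acquisition followed by extinction.

```python
def simulate(n_acquisition=200, n_extinction=200,
             fast=0.3, slow=0.05, gain=0.1, leak=0.1):
    """Toy acquisition/extinction run; returns response strength per trial.

    All equations and parameter values are assumptions for illustration,
    not the equations of the article.
    """
    v_response = 0.0  # long-term response-reinforcement association
    v_stimulus = 0.0  # long-term stimulus-reinforcement association
    stm = 0.0         # short-term trace of recently experienced reinforcement
    inhibition = 0.0  # behavioral inhibition unit
    response = 0.0    # operant response strength, kept in [0, 1]
    history = []

    for trial in range(n_acquisition + n_extinction):
        # Assumed schedule: reinforcement on every acquisition trial, none after.
        reward = 1.0 if trial < n_acquisition else 0.0

        # (1) Expectancy is the aggregate of both long-term associations.
        expectancy = v_response + v_stimulus

        # Slow long-term learning: a shared delta rule (assumed here).
        error = reward - expectancy
        v_response += 0.5 * slow * error
        v_stimulus += 0.5 * slow * error

        # Fast short-term trace of what was actually experienced.
        stm += fast * (reward - stm)

        # (3) Inhibition leaks, and integrates expected-minus-experienced mismatch.
        mismatch = max(0.0, expectancy - stm)
        inhibition = (1.0 - leak) * inhibition + mismatch

        # (2) Expectancy sets the rate of increase; inhibition suppresses.
        response += gain * expectancy * (1.0 - response)
        response -= gain * inhibition * response
        response = min(1.0, max(0.0, response))
        history.append(response)

    return history


history = simulate()
print(f"end of acquisition: {history[199]:.3f}")
print(f"end of extinction:  {history[-1]:.3f}")
```

In this sketch the response climbs while the short-term trace matches or exceeds the expectancy, and falls once reinforcement stops, because the short-term trace decays faster than the long-term expectancy and the resulting mismatch accumulates in the inhibition unit.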

Original language: English (US)
Pages (from-to): 201-229
Number of pages: 29
Journal: Neural Networks
Volume: 10
Issue number: 2
State: Published - Mar 1997

Keywords

  • assignment of credit
  • contingency
  • expectancy
  • long-term memory
  • operant conditioning
  • recurrent choice
  • reinforcement learning
  • short-term memory

ASJC Scopus subject areas

  • Cognitive Neuroscience
  • Artificial Intelligence
