Inferring distributions from observed mRNA and protein copy counts in genetic circuits

Komlan Atitey, Pavel Loskot, Paul Rees

Research output: Contribution to journal › Article › peer-review

1 Scopus citations


Identifying the distributions of molecule counts produced in the cell can elucidate the stochastic dynamics of the underlying biological circuits. For genetic circuits, only a few distributions of messenger RNA (mRNA) and protein counts have been reported in the literature, so the task is to decide which of these candidate distributions best fits the observed data. In this paper, we present a statistical method to infer distributions of mRNA and protein counts from observed data. The main advantage of this method is that it does not require any prior assumptions or knowledge about the underlying chemical reactions. In particular, a given distribution is fitted to the observed copy counts using a histogram with optimized bin sizes in order to reduce the fitting error. The goodness of fit is evaluated by Kolmogorov-Smirnov and chi-square statistical tests to accept or reject the hypothesis that the observed molecule counts were generated from the given distribution. The distribution fitting also yields the values of the distribution parameters; alternatively, they can be estimated using Bayes' theorem. These parameters themselves appear to be random processes. The presented statistical framework for analyzing observed mRNA and protein copy counts is illustrated for a simulated model of the lac genetic circuit in Escherichia coli. For the reaction rates assumed in the model, results in the literature predict that mRNA and protein counts at steady state are gamma distributed. Our analysis shows that both mRNA and protein counts in the lac circuit model can be considered gamma distributed at least 70% of the time from the initial state until steady state. The shape and scale parameters of the observed gamma distributions are also gamma distributed, giving rise to doubly stochastic processes. More importantly, as shown previously, the distribution parameters are functions of the transcription and translation rates, so the presented statistical framework can be used to estimate or optimize reaction rates in biochemical systems.
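The fit-then-test workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes NumPy and SciPy are available, uses a hypothetical helper name `fit_gamma_and_test`, fits the gamma distribution by maximum likelihood with the location fixed at zero, and replaces the paper's optimized histogram bin sizes with simple equal-probability bins. The data are synthetic, and the counts are assumed to be strictly positive. Because the parameters are estimated from the same sample, the Kolmogorov-Smirnov p-value is likely optimistic and should be read as indicative only.

```python
import numpy as np
from scipy import stats


def fit_gamma_and_test(counts, n_bins=10):
    """Fit a gamma distribution to positive counts and run two goodness-of-fit tests.

    Hypothetical helper for illustration; equal-probability bins stand in
    for the optimized bin sizes mentioned in the abstract.
    """
    counts = np.asarray(counts, dtype=float)

    # Maximum-likelihood fit of shape and scale, location fixed at 0.
    shape, _, scale = stats.gamma.fit(counts, floc=0)

    # Kolmogorov-Smirnov test against the fitted gamma distribution.
    ks = stats.kstest(counts, "gamma", args=(shape, 0, scale))

    # Chi-square test: map the data through the fitted CDF, so that
    # equal-width bins on [0, 1] are equal-probability bins for the data.
    u = stats.gamma.cdf(counts, shape, loc=0, scale=scale)
    observed, _ = np.histogram(u, bins=np.linspace(0.0, 1.0, n_bins + 1))
    # ddof=2 accounts for the two parameters estimated from the data.
    chi2 = stats.chisquare(observed, ddof=2)

    return {
        "shape": shape,
        "scale": scale,
        "ks_stat": ks.statistic,
        "ks_p": ks.pvalue,
        "chi2_stat": chi2.statistic,
        "chi2_p": chi2.pvalue,
    }


# Synthetic stand-in for observed copy counts (not data from the paper).
rng = np.random.default_rng(0)
synthetic_counts = rng.gamma(shape=2.0, scale=3.0, size=2000)
result = fit_gamma_and_test(synthetic_counts)
print(result)
```

In this sketch, the gamma hypothesis would be rejected at a chosen significance level when the corresponding p-value falls below that level. Repeating the fit at successive time points of a simulation would give a time series of shape and scale estimates, which is one way to examine how the parameters vary.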

Original language: English (US)
Article number: 015022
Journal: Biomedical Physics and Engineering Express
Issue number: 1
State: Published - Jan 2019


Keywords

  • gamma distribution
  • genetic circuit
  • goodness of fit

ASJC Scopus subject areas

  • Biophysics
  • Bioengineering
  • Biomaterials
  • Physiology
  • Biomedical Engineering
  • Radiology, Nuclear Medicine and Imaging
  • Computer Science Applications
  • Health Informatics

