From genotype to phenotype: instruction or selection?
Jean-Jacques Kupiec and Pierre Sonigo
Génétique des Virus
ICGM-CNRS UPR 415
22 rue Méchain, 75014, Paris
France
Posted on Heraclitean Biology Group
web site March 7, 1999
Copyright 1999 by Kupiec and Sonigo
Abstract
For reductionist biology, macroscopic structures are the result of the integration
of molecular interactions. Because of the specific and univocal character of
each molecular interaction, to a set of molecules only one structure corresponds.
In this essay, an alternative thesis is suggested. Molecular interactions are
not specific. To a set of molecules correspond several structures, each of
them having a certain probability of being achieved. In this framework the making
of a phenotype, notably during cellular differentiation, is the result of an
epigenetic selection process among all the possible structures coded by a genotype.
Genetics without specificity?
Specificity is a central concept in biology. It is closely linked to another
widely used concept called information (also referred to as instruction), which
is indispensable to the definition of the "genetic programme". Some
examples of specificity that are frequently evoked include the specific information
that orients the differentiation of a cell, that which targets a molecule to
a subcellular compartment, or that which, contained in a sequence of amino acids,
permits the functional folding of a protein. The field of application of this
notion is therefore extremely broad and can be used to explain a variety of
phenomena. It is, in fact, at the very heart of molecular biology. The specific
information corresponds, physically, to an exclusive recognition and interaction
between two molecules. As it is usually understood, this mechanism is strictly
determinist. It leaves no room for variability because of the univocal character
of the recognition between two molecules (for a more detailed explanation see
the definition of stereospecificity in Monod, 1971).
These concepts underlie the paradigm of molecular biology in its reductionist
version. Macroscopic structures are the results of the integration of all molecular
interactions (Fig. 1A). In this framework, to an ensemble of molecules corresponds
only one structure, as a consequence of the specific character of each interaction
(Fig. 2). Order at the cellular level thus directly and mechanically reflects
order at the molecular level. The consequences of this theory in terms of research
strategy are straightforward and far-reaching: whatever the phenomenon studied,
the search for an underlying gene or protein and the analysis of its interactions
with other proteins will give the explanation.
Figure 1: Darwinian biology as an alternative to determinist reductionism. In the reductionist model (A), the making of a phenotype results from the summation of all the molecular interactions, whereas in the Darwinian model (B) there is a selection among all the potential structures.
However, this view of biological systems based on specificity is contradicted by experimental data. Indeed, in spite of many efforts to isolate the specific molecules that are thought to underlie biological regulation, only non-specific molecules (i.e. molecules having a wide spectrum of targets, with multiple uses) have been identified. This has been established both for signal transduction (kinases or phosphatases) and for the regulation of gene expression (ubiquitous transcriptional regulators; for reviews on the non-specificity of these regulators see Gehring et al., 1994; Duboule and Wilkins, 1998). To resolve the contradiction between the theoretical basis of "determinist reductionism", which predicts the existence of specific regulators, and the experimental discovery of non-specific ones, specificity could be abandoned and the way molecular interactions are viewed could be modified. As an alternative, the notion of diversity and the probabilist concept of chance-selection inherent to the Darwinian theory of evolution could be put forth. These concepts, which are well established at the level of populations of organisms, have been ignored at the level of populations of molecules.
Figure 2: To an ensemble of molecules, only one macroscopic structure corresponds (specificity).
We support here an alternative thesis whereby molecular interactions do not present a univocal or specific relationship but, on the contrary, are non-specific in character. Randomness, rather than "determinist instruction", guides these interactions, so that each genome carries the potential to produce several macroscopic structures corresponding to different combinations of molecular interactions (Fig. 3). Of this great number, only one or a few are "functional". We suggest that the unique phenotype, or the individual, that is finally produced is the result of a process of "selection" that is exerted on the totality of these potential structures (Fig. 1B). Here, the term "selection" must be taken in a broad sense to signify that there is a sorting process subsequent to random molecular interactions, but one that does not necessarily imply a pure and simple elimination of the source of useless variations or interactions.
Figure 3: To an ensemble of molecules, several macroscopic structures may correspond (no specificity). One of them must be selected.
Instruction, selection and the central dogma of molecular biology
In biology, important theoretical debates have already opposed "instruction"
and "selection". The selective models have been convincing in the
case of the evolution of species, or in the genesis of antibodies (Lederberg,
1988). Nevertheless, the central dogma of molecular biology, which defines the
road leading from genotype to phenotype, remains resolutely instructionist.
Important differences exist between an instructive or a selective conception
of biology. The instructionist models rest on a postulate of spontaneous stability:
any change necessitates the intervention of external signals. These signals
must carry a specific instruction that will orient the system towards its final
state. This instruction must be adequately produced by the emitting structure
and precisely deciphered by the receiving one. Therefore, the final state must
be virtually preexistent in the initial state, before the instruction occurs,
for example in the form of specific receptors able to recognize it. Instruction
is preformationist and problematic with regard to its evolutionary origin. In
the selective models, variability and instability are postulated as spontaneous
properties of living beings allowing their variation or differentiation without
an outside intervention. These variations are random and not exclusively oriented
towards the final state. Outside interactions are stabilizing as compared to
pre-existing variations and are neither specific nor directly tied to the observed
changes. In contrast to instructionist systems, the final order emerges from
a preexisting random disorder without being virtually present in the initial
and final states.
Molecular biologists often refer to the theory of evolution and to random variations.
However, the neo-Darwinian synthesis, which rests on the polymorphism of populations,
leaves very little room for variability in the field of molecular biology. According
to this theory, only the structure of genes is subject to random variation by
mutations or recombinations whereas the function of the genes is conceived according
to the instructive mode: the biological system is spontaneously stable while
waiting for specific instructions from the genes that will set it in motion
and trigger biosynthetic processes. These instructions correspond to a flux
of information going from DNA to proteins: DNA carries the instructions (information)
that permit the transcription of RNAs, which in turn carry the instructions
necessary for the synthesis and functional folding of proteins, which in turn
determine the state of cellular differentiation, which in turn instructs the
final structure of the organism.
The intrinsic stability of biological systems, implicit in the central dogma
of molecular biology, has not been tested deliberately. However, the behavior
of cultured cells may give some information on this question. In fact, it contradicts
the postulate of the stability of instructionist models. It is well known that,
when maintained in a constant medium, cells can transform or differentiate spontaneously
(for example, Chow et al., 1991; Bohme et al., 1995) and that it is most often
necessary to reclone a cell line regularly to avoid spontaneous variations.
This source of variation is usually neglected and considered as a drawback of
experimental systems. We believe that it reflects an essential tendency to variation
of cells which underlies eukaryotic differentiation.
A populational approach for molecular processes
Numerous exceptions and variants as compared to the initially defined rules
of genetic expression have been described: illegitimate transcriptions, alternative
splicing, modification-editing of RNA, multiple translation initiations, ribosomal
frameshifts, transcriptional and translational errors, post-translational modifications,
multiple protein conformations, combinatorial multimerization. In fact, all
these processes increase the epigenetic variability of gene expression. However,
according to "determinist reductionism", the organism should
be decipherable through its genome. Everyone realizes that this is not possible
in practice, even in the case of a very simple organism such as a virus. This
failure is attributed to the complexity of the genetic language and to our incapacity
to decipher it. The detailed descriptive work that so preoccupies molecular
biologists is, above all, directed at better understanding this "language",
the existence of which is taken for granted. Is it reasonable to continue to
believe that this language is accessible?
Would it not be more pertinent to take into account the variability that occurs
not only in structure, but also in the mechanisms of gene expression? Would
it not be more reasonable to consider that the "genetic language"
is degenerate and that this property does not constitute an inconvenience but
a central property underlying molecular and cellular mechanisms? The consequence
of this hypothesis as applied to research strategies is that a detailed description
of the diversity of molecular interactions without understanding the sorting
mechanisms that allow the phenotype to emerge may be dangerously misleading.
Following this line of reasoning, the field of possibilities open to biological
systems broadens considerably. In the matter of evolution, the potential for
innovation is determined by the diversity and the size of populations. One may
apply these principles to molecular populations. One billionth of a gram of
an average-sized protein may contain 50 billion molecules. The size of cellular
populations within an organism is of the same order. The evolutionary potential
of such large populations is extraordinary, if one accepts the existence of
heterogeneity, including within what is called a molecular or cellular "type".
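The order of magnitude quoted above can be checked with a short calculation. The sketch below is only an illustration: the two molar masses (12 kDa and 50 kDa) are assumptions chosen for the example and are not values given in the text. It shows that 50 billion molecules per nanogram corresponds to a protein of roughly 12 kDa, while a 50 kDa protein gives about 12 billion, the same order of magnitude.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_in_sample(mass_grams, molar_mass_g_per_mol):
    """Number of molecules in a sample of the given mass."""
    return mass_grams / molar_mass_g_per_mol * AVOGADRO


ONE_NANOGRAM = 1e-9  # "one billionth of a gram"

# Hypothetical protein sizes, in kilodaltons, chosen for illustration only.
for kda in (12, 50):
    n = molecules_in_sample(ONE_NANOGRAM, kda * 1000)
    print(f"{kda} kDa protein: about {n:.1e} molecules per nanogram")
```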
A Darwinian model for cellular differentiation
Cellular differentiation is usually explained by instructionist models. However, a Darwinian theory can also be proposed (Kupiec, 1983, 1996, 1997). In the context of instructionist models, cells receive a specific message (or instruction) that provokes a well-defined differentiation (Fig. 4A). This information is assumed to be communicated by membrane interactions or by diffusible differentiation factors. In these models, which are based on the concept of specificity, this information is absolutely required to alter the differentiation state of a cell. However, if we abandon the univocal and specific relationship between molecules, it is possible to explain differentiation without the intervention of a specific inducer molecule. If many structures may correspond to a group of molecular interactions as a result of the non-specific (degenerate) character of these relationships, then in a population of cells, different structures (cell types) will be achieved with frequencies that depend on the probability of attaining each possible structure (Fig. 4B). In support of this model, a probabilist component has been demonstrated in the differentiation of numerous cell lines (Till et al., 1964; for reviews: Kupiec, 1996; Levenson and Housman, 1981). A noteworthy example is the case of the anchor cells of C. elegans, a model that has nevertheless long been considered the typical example of a rigorously deterministic differentiation (Greenwald and Rubin, 1992).
Figure 4: Differentiation models. (A) In the determinist model, cells differentiate according to the instructions they receive. There is no variability in this response to signals. (B) In the probabilist model, cells may differentiate according to the various possibilities produced by molecular interactions. In this example, depending on whether event a or event b occurs, the cell differentiates into an A or a B type, respectively. In a cell population, the proportion of A and B cells will depend on the probability of events a and b.
A simple theoretical example may be given for gene transcription (Fig. 5). In a cell, there is one molecule of a transcriptional regulator that may interact with and activate either gene a or gene b. The choice of the gene (a or b) to be activated in a given cell is random. In a population of cells, a fraction of cells will express a and another fraction b. The frequency of the corresponding phenotypes A and B will depend on the probability of activation of these two genes. More generally, if there are fewer regulatory molecules than regulatable genes, a set of possible distributions of regulatory molecules is generated; each distribution corresponds to a group of activated genes, and thus to a potential cell type. MeCP2 furnishes an example of a transcriptional regulator that could participate in this type of mechanism. The target of MeCP2 is the dinucleotide methyl-CG. There are 4 x 10^7 copies of this target for only 10^6 molecules of MeCP2 in the nucleus of a mammalian cell. It is thus unlikely that these molecules would be distributed in an identical manner in all cells (Nan et al., 1997). Such a mechanism can generate a great diversity of cell types without calling upon specific regulators, and could therefore explain the weak specificity and the ubiquity of transcription regulation factors, including those coded by the homeogenes (Gehring et al., 1994; Duboule and Wilkins, 1998). Instead of recognizing the usefulness of this diversity in molecular interactions, the "determinist reductionist" postulate leads us to search for additional co-factors that would explain, in fine, the specificity of gene regulation (for example, see the discussion in Nan et al., 1997).
Figure 5: Stochastic model of gene expression. Because of diffusion, the regulator moves stochastically; a and b are two genes (or sets of genes) that can be activated by the same regulator.
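The single-regulator example can be made concrete with a small simulation. This is a minimal sketch under stated assumptions, not a model taken from the cited papers: the activation probability `p_a`, the population size, and the rule that each regulator occupies a distinct gene chosen uniformly at random are all hypothetical choices made for illustration. The last lines simply restate the MeCP2 figures quoted in the text as an upper bound on site occupancy.

```python
import random
from collections import Counter


def simulate_population(n_cells, p_a, seed=0):
    """Each cell holds one regulator that activates gene a with probability
    p_a, otherwise gene b. Returns the counts of phenotypes A and B."""
    rng = random.Random(seed)
    counts = Counter({"A": 0, "B": 0})
    for _ in range(n_cells):
        counts["A" if rng.random() < p_a else "B"] += 1
    return counts


def random_gene_set(n_genes, n_regulators, rng):
    """General case with fewer regulators than genes. Assumes, for
    illustration, that each regulator occupies a distinct gene chosen
    uniformly at random; the result is the set of activated genes."""
    return frozenset(rng.sample(range(n_genes), n_regulators))


# Phenotype frequencies in the population track the activation probability.
N_CELLS = 10_000
counts = simulate_population(N_CELLS, p_a=0.3)
print("fraction A:", counts["A"] / N_CELLS, "fraction B:", counts["B"] / N_CELLS)

# Two cells with the same molecules can end up with different active genes.
rng = random.Random(1)
print(sorted(random_gene_set(20, 3, rng)), sorted(random_gene_set(20, 3, rng)))

# MeCP2 figures from the text: 10^6 molecules for 4 x 10^7 target sites,
# so at most 2.5% of the sites can be occupied at any one time.
mecp2_max_occupancy = 1e6 / 4e7
print("maximum MeCP2 site occupancy:", mecp2_max_occupancy)
```

Changing `p_a` shifts the proportions of A and B cells without any change in the molecules present, which is the point of the probabilist model.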
In the prevalent instructionist model, one must explain how a cell changes its developmental state and differentiates. In the chance-selection (Darwinian) model, by contrast, one must explain how cells stabilize favorable phenotypes. Referring to the example in Figure 5, each time the regulator dissociates from its site of interaction (a or b), it can reassociate in a random fashion at another site, changing the cellular phenotype. In embryos of transgenic mice, alleles a and b of human globin are expressed alternately. During the development of these embryos, there is a progressive stabilization of the expression of allele b (Wijgerde et al., 1995). The phosphorylation of regulators could permit such a stabilization by modifying the stability of regulator-DNA complexes (Kupiec, 1996, 1997). Kinases and phosphatases do not require specific targets or inducers in this model. They act globally and retrospectively to stabilize the system when the "right combination" of cellular phenotypes is expressed.
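The stabilization step can be sketched as a two-state random process. This is a toy illustration only: the dissociation probabilities, the equal rebinding probability and the number of steps are hypothetical values, not measurements from Wijgerde et al. (1995). Lowering the dissociation probability at site b stands in for any stabilizing modification of the regulator-DNA complex, such as the phosphorylation suggested in the text.

```python
import random


def simulate_cell(n_steps, p_off_a, p_off_b, rng, p_bind_a=0.5):
    """Follow one regulator that is bound to site a or site b. At each step
    it dissociates with a site-dependent probability and then rebinds at
    random (site a with probability p_bind_a). Returns the final site."""
    site = "a" if rng.random() < p_bind_a else "b"
    for _ in range(n_steps):
        p_off = p_off_a if site == "a" else p_off_b
        if rng.random() < p_off:
            site = "a" if rng.random() < p_bind_a else "b"
    return site


def fraction_on_b(n_cells, n_steps, p_off_a, p_off_b, seed=0):
    """Fraction of cells whose regulator ends up on site b."""
    rng = random.Random(seed)
    hits = sum(
        simulate_cell(n_steps, p_off_a, p_off_b, rng) == "b"
        for _ in range(n_cells)
    )
    return hits / n_cells


# Hypothetical parameters: equal stability at both sites, then a stabilized
# complex at site b (twenty-fold lower dissociation probability).
unstabilized = fraction_on_b(2000, 200, p_off_a=0.2, p_off_b=0.2)
stabilized = fraction_on_b(2000, 200, p_off_a=0.2, p_off_b=0.01)
print("fraction expressing b without stabilization:", unstabilized)
print("fraction expressing b with stabilization:", stabilized)
```

With equal stabilities the population stays split roughly evenly between a and b; once the complex at b is stabilized, most cells settle on b although binding itself remains random and no specific inducer is involved.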
Darwinism as the antidogma of molecular biology
Although the death of Darwinism is regularly announced, chance-selection theories have already replaced "Lamarckian" determinist views in evolution, immunology and neuroscience. The ability of selective systems to generate self-organization has been shown by computer simulation (Atamas, 1996). If chance-selection also governs the relation between genotype and phenotype, Darwin discovered not only a law governing the evolution of species but a more general law of the functioning of living organisms at the molecular and cellular level. In this perspective, the making of a phenotype does not result from a flux of specific information going from DNA to proteins and macromolecular structures, as imposed by the central dogma of molecular biology, but instead from an epigenetic selection process among the huge repertoire of random events coded by a single genome.
Bibliography
Atamas, S. 1996. Self-organization in computer simulated selective systems. Biosystems 39(2): 143-151.
Bohme, K., Winterhalter, K.H. and Bruckner, P. 1995. Terminal differentiation of chondrocytes in culture is a spontaneous process and is arrested by transforming growth factor-beta 2 and basic fibroblast growth factor in synergy. Exp Cell Res. 216: 191-198.
Chow, M., Yao, A. and Rubin, H. 1991. Cellular epigenetics: topochronology of progressive "spontaneous" transformation of cells under growth constraint. Proc Natl Acad Sci 1994: 599-603.
Duboule, D. and Wilkins, A.S. 1998. The evolution of "bricolage". Trends in Genetics 14 (2): 54-59.
Gehring, W.J., Qian, Y.Q., Billeter, M., Furukubo-Tokunaga, K., Schier, A.F., Resendez-Perez, D., Affolter, M., Otting, G. and Wüthrich, K. 1994. Homeodomain-DNA recognition. Cell 78: 211-223.
Greenwald, I. and Rubin, G.M. 1992. Making a difference: the role of cell-cell interactions in establishing separate identities for equivalent cells. Cell 68: 271-281.
Kupiec, J.J. 1983. A probabilist theory for cell differentiation, embryonic mortality and DNA C-value paradox. Specul Sci Technol. 6: 471-478.
Kupiec, J.J. 1996. A chance-selection model for cell differentiation. Cell Death and Differentiation. 3: 385-390.
Kupiec, J.J. 1997. A darwinian theory for the origin of cellular differentiation. Mol. Gen. Genetics 255: 201-208.
Lederberg, J. 1988. Ontogeny of the clonal selection theory of antibody formation. Annals of the New York Academy of Sciences 546: 175-187.
Levenson, R. and Housman, D. 1981. Commitment: how do cells make the decision to differentiate? Cell 25: 5-6.
Monod, J. 1971. Le hasard et la nécessité, pp. 80 and 118. Editions du Seuil, collection "point science".
Nan, X., Campoy, F.J. and Bird, A. 1997. MeCP2 is a transcriptional repressor with abundant binding sites in genomic chromatin. Cell 88: 471-481.
Till, J.E., McCulloch, E.A. and Siminovitch, L.A. 1964. A stochastic model of stem cell proliferation based on the growth of spleen-colony-forming cells. Proc. Nat. Acad. Sci. USA 61: 29-36.
Wijgerde, M., Grosveld, F. and Fraser, P. 1995. Transcription complex stability and chromatin dynamics in vivo. Nature 377: 209-213.