From genotype to phenotype: instruction or selection?


Jean-Jacques Kupiec and Pierre Sonigo
Génétique des Virus
ICGM-CNRS UPR 415
22 rue Méchain, 75014, Paris
France

Posted on Heraclitean Biology Group web site March 7, 1999
Copyright 1999 by Kupiec and Sonigo

Abstract

For reductionist biology, macroscopic structures are the result of the integration of molecular interactions. Because of the specific and univocal character of each molecular interaction, to a set of molecules only one structure corresponds. In this essay, an alternative thesis is suggested. Molecular interactions are not specific. To a set of molecules corresponds several structures, each of them having a certain probability of being achieved. In this framework the making of a phenotype, notably during cellular differentiation, is the result of an epigenetic selection process among all the possible structures coded by a genotype.

Genetics without specificity?

Specificity is a central concept in biology. It is closely linked to another widely used concept, information (also referred to as instruction), which is indispensable to the definition of the "genetic programme". Frequently evoked examples of specificity include the specific information that orients the differentiation of a cell, that which targets a molecule to a subcellular compartment, or that which, contained in a sequence of amino acids, permits the functional folding of a protein. This notion therefore has an extremely broad field of application and can be used to explain a variety of phenomena. It is, in fact, at the very heart of molecular biology. Specific information corresponds, physically, to an exclusive recognition and interaction between two molecules. As it is usually understood, this mechanism is strictly determinist. It leaves no room for variability because of the univocal character of the recognition between two molecules (for a more detailed explanation see the definition of stereospecificity in Monod, 1971).
These concepts underlie the paradigm of molecular biology in its reductionist version. Macroscopic structures are the result of the integration of all molecular interactions (Fig. 1A). In this framework, only one structure corresponds to an ensemble of molecules, as a consequence of the specific character of each interaction (Fig. 2). Order at the cellular level thus directly and mechanically reflects order at the molecular level. The consequences of this theory in terms of research strategy are straightforward and far-reaching: whatever the phenomenon studied, the search for an underlying gene or protein and the analysis of its interactions with other proteins will give the explanation.

Figure 1: Darwinian biology as an alternative to determinist reductionism. In the reductionist model, the making of a phenotype results from the summation of all the molecular interactions, whereas in the Darwinian model there is a selection among all the potential structures.

However, this view of biological systems based on specificity is contradicted by experimental data. Indeed, in spite of many efforts to isolate the specific molecules that are thought to underlie biological regulations, only non-specific molecules (i.e. having a wide spectrum of targets, with multiple uses) have been identified. This has been established both for signal transduction (kinases or phosphatases) and for the regulation of gene expression (ubiquitous transcriptional regulators; for reviews on the non-specificity of these regulators see Gehring et al., 1994; Duboule and Wilkins, 1998). To resolve the contradiction between the theoretical basis of "determinist reductionism", which predicts the existence of specific regulators, and the experimental discovery of non-specific ones, specificity could be abandoned and the way molecular interactions are viewed could be modified. As an alternative, the notion of diversity and the probabilist concept of chance-selection inherent to the Darwinian theory of evolution could be put forth. These concepts, which are well established at the level of populations of organisms, have been ignored at the level of populations of molecules.

Figure 2: To an ensemble of molecules, only one macroscopic structure corresponds (specificity).

We support here an alternative thesis whereby molecular interactions do not present a univocal or specific relationship but, on the contrary, are of non-specific character. Randomness, rather than "determinist instruction", guides these interactions, so that each genome carries the potential to produce several macroscopic structures corresponding to different combinations of molecular interactions (Fig. 3). Of this great number, only one or a few are "functional". We suggest that the unique phenotype, or the individual, that is finally produced is the result of a process of "selection" that is exerted on the totality of these potential structures (Fig. 1B). Here, the term "selection" must be taken in a broad sense: it signifies that a sorting process follows the random molecular interactions, but not necessarily that the source of useless variations or interactions is purely and simply eliminated.

Figure 3: To an ensemble of molecules, several macroscopic structures may correspond (no specificity). One of them must be selected.

Instruction, selection and the central dogma of molecular biology

In biology, important theoretical debates have already opposed "instruction" and "selection". The selective models have been convincing in the case of the evolution of species, or in the genesis of antibodies (Lederberg, 1988). Nevertheless, the central dogma of molecular biology, which defines the road leading from genotype to phenotype, remains resolutely instructionist. Important differences exist between an instructive and a selective conception of biology. The instructionist models rest on a postulate of spontaneous stability: any change necessitates the intervention of external signals. These signals must carry a specific instruction that will orient the system towards its final state. This instruction must be adequately produced by the emitting structure and precisely deciphered by the receiving one. Therefore, the final state must be virtually pre-existent in the initial state, before the instruction occurs, for example in the form of specific receptors able to recognize it. Instruction is preformationist, and problematic with regard to its evolutionary origin. In the selective models, variability and instability are postulated as spontaneous properties of living beings, allowing their variation or differentiation without an outside intervention. These variations are random and not exclusively oriented towards the final state. Outside interactions stabilize pre-existing variations and are neither specific nor directly tied to the observed changes. In contrast to instructionist systems, the final order emerges from a pre-existing random disorder without being virtually present in the initial state.
Molecular biologists often refer to the theory of evolution and to random variations. However, the neo-Darwinian synthesis, which rests on the polymorphism of populations, leaves very little room for variability in the field of molecular biology. According to this theory, only the structure of genes is subject to random variation by mutation or recombination, whereas the function of the genes is conceived according to the instructive mode: the biological system is spontaneously stable while waiting for specific instructions from the genes that will set it in motion and trigger biosynthetic processes. These instructions correspond to a flux of information going from DNA to proteins: DNA carries the instructions (information) that permit the transcription of RNAs, which in turn carry the instructions necessary for the synthesis and functional folding of proteins, which in turn determine the state of cellular differentiation, which in turn instructs the final structure of the organism.
The intrinsic stability of biological systems, implicit in the central dogma of molecular biology, has not been tested deliberately. However, the behavior of cultured cells may give some information on this question. In fact, it contradicts the postulate of stability of the instructionist models. It is well known that, when maintained in a constant medium, cells can transform or differentiate spontaneously (for example, Chow et al., 1991; Bohme et al., 1995) and that it is most often necessary to reclone a cell line regularly to avoid spontaneous variations. This source of variation is usually neglected and considered a drawback of experimental systems. We believe that it reflects an essential tendency of cells to vary, which underlies eukaryotic differentiation.

A populational approach for molecular processes

Numerous exceptions to, and variants of, the initially defined rules of genetic expression have been described: illegitimate transcriptions, alternative splicing, modification-editing of RNA, multiple translation initiations, ribosomal frameshifts, transcriptional and translational errors, post-translational modifications, multiple protein conformations, combinatorial multimerization. In fact, all these processes increase the epigenetic variability of gene expression. However, according to "determinist reductionism", the organism should be decipherable through its genome. Everyone realizes that this is not possible in practice, even in the case of a very simple organism such as a virus. This failure is attributed to the complexity of the genetic language and to our incapacity to decipher it. The detailed descriptive work that so preoccupies molecular biologists is, above all, directed at better understanding this "language", the existence of which is taken for granted. Is it reasonable to continue to believe that this language is accessible?
Would it not be more pertinent to take into account the variability that occurs not only in structure, but also in the mechanisms of gene expression? Would it not be more reasonable to consider that the "genetic language" is degenerate and that this property does not constitute an inconvenience but a central property underlying molecular and cellular mechanisms? The consequence of this hypothesis for research strategies is that a detailed description of the diversity of molecular interactions, without an understanding of the sorting mechanisms that allow the phenotype to emerge, may be dangerously misleading.
Following this line of reasoning, the field of possibilities open to biological systems broadens considerably. In the matter of evolution, the potential for innovation is determined by the diversity and the size of populations. One may apply these principles to molecular populations. One billionth of a gram of an average-sized protein may contain 50 billion molecules. The size of cellular populations within an organism is of the same order. The evolutionary potential of such large populations is extraordinary, if one accepts the existence of heterogeneity, including within what is called a molecular or cellular "type".
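
As a rough check of the order of magnitude quoted above, the number of molecules in a given mass follows from Avogadro's number. The sketch below is illustrative only: the two molar masses (12 kDa and 50 kDa) are assumptions, not values given in the text, and the function name is hypothetical. Under these assumptions, one nanogram holds roughly 50 billion molecules of a 12 kDa protein and roughly 12 billion molecules of a 50 kDa protein, so the order of magnitude holds in both cases.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_in_mass(mass_grams, molar_mass_g_per_mol):
    """Number of molecules contained in a given mass of a pure substance."""
    return mass_grams / molar_mass_g_per_mol * AVOGADRO


if __name__ == "__main__":
    one_nanogram = 1e-9  # "one billionth of a gram"
    # Assumed molar masses, in kilodaltons (hypothetical values for illustration).
    for kda in (12, 50):
        n = molecules_in_mass(one_nanogram, kda * 1000)
        print(f"{kda} kDa protein: about {n:.2e} molecules per nanogram")
```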

A Darwinian model for cellular differentiation

Cellular differentiation is usually explained by instructionist models. However, a Darwinian theory can also be proposed (Kupiec, 1983, 1996, 1997). In the context of instructionist models, cells receive a specific message (or instruction) that provokes a well-defined differentiation (Fig. 4A). This information is assumed to be communicated by membrane interactions, or by diffusible differentiation factors. In these models, based on the concept of specificity, this information is absolutely required to alter the differentiation state of a cell. However, if we abandon the univocal and specific relationship between molecules, it is possible to explain differentiation without the intervention of a specific inducer molecule. If many structures may correspond to a group of molecular interactions as a result of the non-specific (degenerate) character of these relationships, then in a population of cells, different structures (cell types) will be achieved with frequencies dependent upon the probability of attaining each possible structure (Fig. 4B). In support of this model, a probabilist component has been demonstrated in the differentiation of numerous cell lines (Till et al., 1964; for reviews: Kupiec, 1996; Levenson and Housman, 1981). A noteworthy example is the case of the anchor cells of C. elegans, a model that has nevertheless long been considered the typical example of a rigorously deterministic differentiation (Greenwald and Rubin, 1992).

Figure 4: Differentiation models. A: in the determinist model, cells differentiate according to the instructions they receive. There is no variability in this response to signals. B: in the probabilist model, cells may differentiate according to the various possibilities produced by molecular interactions. In this example, depending on which of events a or b occurs, the cell differentiates into an A or a B type, respectively. In a cell population, the proportion of A and B cells will depend on the probability of events a and b.

A simple theoretical example may be given for gene transcription (Fig. 5). In a cell, there is one molecule of a transcriptional regulator that may interact with and activate either gene a or gene b. The choice of the gene (a or b) to be activated in a given cell is random. In a population of cells, a fraction of cells will express a and another fraction b. The frequency of the corresponding phenotypes A and B will depend on the probability of activation of these two genes. More generally, if there are fewer regulatory molecules than regulatable genes, a combination of possible distributions of regulatory molecules is generated; each distribution corresponds to a group of activated genes, and thus to a potential cell type. MeCP2 furnishes an example of a transcriptional regulator that could participate in this type of mechanism. MeCP2 has as a target the dinucleotide methyl-CG. There are 4 x 10^7 copies of this target for only 10^6 molecules of MeCP2 in the nucleus of a mammalian cell. It is thus unlikely that these molecules would be distributed in an identical manner in all cells (Nan et al., 1997). Such a mechanism can generate a great diversity of cell types without calling upon specific regulators, and could therefore explain the weak specificity and the ubiquity of transcription regulation factors, including those coded by the homeogenes (Gehring et al., 1994; Duboule and Wilkins, 1998). Instead of recognizing the usefulness of this diversity in molecular interactions, the "determinist reductionist" postulate leads us to search for additional co-factors that would explain, in fine, the specificity of gene regulation (for example, see the discussion in Nan et al., 1997).
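
The two situations described above can be illustrated by a small simulation. This is a minimal sketch under stated assumptions, not the authors' model: the activation probability p_a = 0.3, the population sizes and the function names are all hypothetical, and the MeCP2 case is scaled down to 400 sites and 10 regulator molecules, which keeps the 40:1 ratio of targets to regulators mentioned in the text.

```python
import random


def simulate_population(n_cells, p_a, seed=0):
    """One regulator per cell: it activates gene a with probability p_a, else gene b.

    Returns the number of cells showing phenotype A and phenotype B.
    """
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0}
    for _ in range(n_cells):
        counts["A" if rng.random() < p_a else "B"] += 1
    return counts


def regulator_distribution(n_sites, n_regulators, rng):
    """Place n_regulators at random on distinct sites; return the occupied sites."""
    return frozenset(rng.sample(range(n_sites), n_regulators))


if __name__ == "__main__":
    # Case 1: phenotype frequencies follow the activation probabilities.
    print(simulate_population(n_cells=10_000, p_a=0.3))

    # Case 2: fewer regulators than sites, so cells rarely share a distribution.
    rng = random.Random(1)
    cells = [regulator_distribution(400, 10, rng) for _ in range(100)]
    print(len(set(cells)), "distinct distributions among", len(cells), "cells")
```

In the first case the proportion of A cells approaches p_a as the population grows. In the second, the number of possible distributions is so large that two cells are very unlikely to carry the same one, which is the point made in the text about MeCP2.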

Figure 5: Stochastic model of gene expression. Because of diffusion the regulator moves stochastically; a and b are two genes (or sets of genes) that can be activated by the same regulator.

In the prevalent instructionist model, one must explain how a cell changes its developmental state and differentiates. By contrast, in the chance-selection (Darwinian) model one must explain how cells stabilize favorable phenotypes. Referring to the example in Figure 5, each time the regulator dissociates from its site of interaction (a or b), it can reassociate in a random fashion at another site, changing the cellular phenotype. In embryos of transgenic mice, alleles a and b of human globin are expressed alternately. During the development of these embryos, there is a progressive stabilization of the expression of allele b (Wijgerde et al., 1995). The phosphorylation of regulators could permit such a stabilization by modifying the stability of regulator-DNA complexes (Kupiec, 1996, 1997). Kinases and phosphatases do not require specific targets or inducers in this model. They act globally and retrospectively to stabilize the system when the "right combination" of cellular phenotypes is expressed.
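
The progressive stabilization described above can be sketched as a simple random process. The model below is an illustrative assumption, not the mechanism proposed in the cited papers: stabilization is represented only as a lower dissociation probability for the regulator bound at site b than at site a, and the parameter values (k_off_a, k_off_b, number of cells and steps) and the function name are hypothetical.

```python
import random


def simulate_stabilization(n_cells, n_steps, k_off_a, k_off_b, seed=0):
    """Follow the fraction of cells whose regulator is bound at site b over time.

    At each step, a bound regulator dissociates with a probability that depends
    on its current site, then rebinds at site a or b with equal probability.
    """
    rng = random.Random(seed)
    sites = [rng.choice("ab") for _ in range(n_cells)]  # random initial binding
    history = [sites.count("b") / n_cells]
    for _ in range(n_steps):
        for i, site in enumerate(sites):
            k_off = k_off_a if site == "a" else k_off_b
            if rng.random() < k_off:
                sites[i] = rng.choice("ab")  # random reassociation
        history.append(sites.count("b") / n_cells)
    return history


if __name__ == "__main__":
    history = simulate_stabilization(
        n_cells=2000, n_steps=200, k_off_a=0.5, k_off_b=0.05
    )
    print(f"fraction expressing b at start: {history[0]:.2f}")
    print(f"fraction expressing b at end:   {history[-1]:.2f}")
```

With these assumed values the population starts near an even split and drifts towards b, settling near k_off_a / (k_off_a + k_off_b), without any signal that specifically instructs cells to choose b.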

Darwinism as the antidogma of molecular biology

Although the death of Darwinism is regularly announced, chance-selection theories have already replaced "Lamarckian" determinist views in evolution, immunology and neuroscience. The ability of selective systems to generate self-organization has been shown by computer simulation (Atamas, 1996). If chance-selection also governs the relation between genotype and phenotype, Darwin discovered not only a law governing the evolution of species but a more general law of the functioning of living organisms at the molecular and cellular level. In this perspective, the making of a phenotype does not result from a flux of specific information going from DNA to proteins and macromolecular structures, as imposed by the central dogma of molecular biology, but instead from an epigenetic selection process among the huge repertoire of random events coded by a single genome.


Bibliography

Atamas, S. 1996. Self-organization in computer simulated selective systems. Biosystems 39(2): 143-151.

Bohme, K., Winterhalter, K.H. and Bruckner, P. 1995. Terminal differentiation of chondrocytes in culture is a spontaneous process and is arrested by transforming growth factor-beta 2 and basic fibroblast growth factor in synergy. Exp Cell Res. 216: 191-198.

Chow, M., Yao, A. and Rubin, H. 1991. Cellular epigenetics: topochronology of progressive "spontaneous" transformation of cells under growth constraint. Proc Natl Acad Sci 1994: 599-603.

Duboule, D. and Wilkins, A.S. 1998. The evolution of "bricolage". Trends in Genetics 14 (2): 54-59.

Gehring, W.J., Qian, Y.Q., Billeter, M., Furukubo-Tokunaga, K., Schier, A.F., Resendez-Perez, D., Affolter, M., Otting, G. and Wüthrich, K., 1994. Homeodomain-DNA recognition. Cell 78: 211-223.

Greenwald, I. and Rubin, G.M. 1992. Making a difference: the role of cell-cell interactions in establishing separate identities for equivalent cells. Cell 68: 271-281.

Kupiec, J.J. 1983. A probabilist theory for cell differentiation, embryonic mortality and DNA C-value paradox. Specul Sci Technol. 6: 471-478.

Kupiec, J.J. 1996. A chance-selection model for cell differentiation. Cell Death and Differentiation. 3: 385-390.

Kupiec, J.J. 1997. A darwinian theory for the origin of cellular differentiation. Mol. Gen. Genetics 255: 201-208.

Lederberg, J. 1988. Ontogeny of the clonal selection theory of antibody formation. Annals of the New York Academy of Sciences 546: 175-187.

Levenson, R. and Housman, D. 1981. Commitment: how do cells make the decision to differentiate? Cell 25: 5-6.

Monod, J. 1971. Le hasard et la nécessité, pp. 80 and 118. Editions du Seuil, collection "point science".

Nan, X., Campoy, F.J. and Bird, A. 1997. MeCP2 is a transcriptional repressor with abundant binding sites in genomic chromatin. Cell 88: 471-481.

Till, J.E., McCulloch, E.A. and Siminovitch, L.A. 1964. A stochastic model of stem cell proliferation based on the growth of spleen-colony-forming cells. Proc. Nat. Acad. Sci. USA 61: 29-36.

Wijgerde, M., Grosveld, F. and Fraser, P. 1995. Transcription complex stability and chromatin dynamics in vivo. Nature 377: 209-213.