Origins of life: corroboration of common ancestry via statistics

Just as I wrote the last post, in fact a few days earlier, an article came out in Nature entitled “A formal test of the theory of universal common ancestry”, by DL Theobald. It uses a novel approach to assess the probability that all life shares a “universal common ancestry” (UCA). The author compares sequences of ubiquitous and conserved proteins across the three domains of life: Bacteria, Archaea and Eukarya. The paper arrives at the conclusion that the probability of UCA is quantitatively overwhelming:

UCA is at least 10^2,860 times more probable than the closest competing hypothesis

The author notes that multiple origins of life may still be possible, but that all extant life as we know it (LAWKI) is likely of common origin. What’s new in this approach is the creative use of data to put a likelihood number on the various qualitative points made since Darwin’s assertion of UCA (and reiterated in the previous post).
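To get a feel for the size of that number, here is a back-of-the-envelope sketch in Python. It assumes the quoted figure can be read as a simple ratio of probabilities for two hypotheses, which is my simplification and not a description of the paper's actual procedure. The function names are made up for illustration. Everything is kept in log space because 10^2860 is far too large for an ordinary floating-point number.

```python
import math

LOG10_RATIO = 2860  # exponent quoted above: UCA favored by a factor of 10^2860


def ratio_in_nats(log10_ratio: float) -> float:
    """Convert a base-10 log ratio into natural-log units."""
    return log10_ratio * math.log(10)


def posterior_log10_odds(prior_log10_odds: float, log10_ratio: float) -> float:
    """Odds form of Bayes' rule in log10 space: posterior = prior + evidence."""
    return prior_log10_odds + log10_ratio


if __name__ == "__main__":
    # The same ratio expressed as a log-likelihood difference (about 6585).
    print(round(ratio_in_nats(LOG10_RATIO)))
    # A skeptic starting at odds of 10^1000 to 1 AGAINST common ancestry
    # (an arbitrary illustrative prior) still ends up at 10^1860 in favor.
    print(posterior_log10_odds(-1000, LOG10_RATIO))
```

The point of the second call is only that no remotely reasonable prior skepticism changes the outcome once the evidence is this lopsided.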


Origins of life

In relation to the question of how and why genes are so very much conserved in apparently all of life on Earth, in our last debate Jason Woodard, Eric Clemons and I went off on a tangent about the probability of other possible origins of life. Just now an article was published in the New York Times (HT Jason) that echoes many aspects of this discussion, which goes to show that the questions are out there and well understood, but the answers are not forthcoming, yet.

On Earth, the unity of life, or rather, its sole origin, is corroborated by a handful of strong, fundamental, shared features, and a host of less fundamental, but also ubiquitous features. Had life as we know it (LAWKI) had several origins, we would face an escalating implausibility: first, that life emerged at all; second, that it emerged several times; third, that all resulting organisms would share many essential features while coming from different sources.

LAWKI’s fundamental features start with the observation that it is individual-based – at its most basic it forms cells with cell membranes. How fundamental this is, is easily overlooked, and I can think of two examples of thinking of life otherwise right away – one, Stanislaw Lem’s “Solaris”, where life is an unstructured planetary ocean with an equally incomprehensible kind of intelligence, and two, of course, the Gaia hypothesis, though really here life would have to be considered more of a society, still made of individuals. Next, and we are not even at chemistry yet, LAWKI encodes a recipe for how to repair itself and how to make new LAWKI similar to itself. The only thing that is really replicated when LAWKI grows new individuals is this information, and with that, LAWKI takes what it needs from the chemical environment to make more of itself. Incidentally, LAWKI conserves information: it prefers to vary little from individual to individual, and individuals are stable over time. None of these should be self-evident or a necessary condition for some, any, life. Again, if you go by Solaris or Gaia, one can think of forms of life that do not do this kind of thing.

There is a nagging suspicion among some people that the infamous prion proteins could be thought of as a form of life – they are just molecules, but they have defined information content – they always have a specific kind of configuration – and make more of themselves. They do not actually “replicate”; rather, they incite other similar proteins to take up their own unusual kind of protein folding. It’s not LAWKI, but then again, maybe we don’t see life if it doesn’t look like LAWKI. Current thinking about life is completely taken in by the computer code metaphor of life encoding its own processes in a kind of data string. Not by coincidence, cybernetics and AI/AL were all “invented” in the same era as the naming of biological information as the “genetic code”, ca. 1940-1960. The possibility of a fundamentally deeper embodiment of some other life’s information in a physical structure is hard to imagine in the age of digital code strings (incidentally, computer code of course also implies a form of embodiment – “hardware”: computer code may not be encoded in a structurally embodied way, but strictly speaking it only turns into “code” or, heaven forbid, “information” once it meets suitable hardware; otherwise it is just garbled digits).

The rest of the argument for LAWKI’s unity is essentially all about specific chemistry. All of LAWKI uses the same genetic code encoded into very similar molecules, DNA and RNA, using very similar cell chemistry. In building its physical units, LAWKI does not just use carbon and water. LAWKI uses a tiny subset of all possible organic molecules: mostly a few sugars and their polymers (cellulose, chitin and starch), plus just 20-odd amino acids and their polymers, proteins, in a very small subset of all possible protein foldings. Moreover, both sugars and amino acids have the property of chirality: chemically identical molecules can exist as mirror-image, non-superposable isomers (distinguished by the direction in which they rotate the plane of polarized light). LAWKI always uses only one of the two possible isomers, and all of LAWKI uses the same one, rather than each phylum, or each species, picking its own preference at random. This is significant because for purely chemical reactions in a non-chiral environment, it does not matter which isomer you use – both have identical reaction or activation energies etc. But LAWKI cares, because LAWKI controls chemical reactions via enzymes, and enzymes are highly sensitive to chirality (because they are 3D structures that physically bring molecules together in a lock and key fashion to make them react: a non-superposable mirror-image yet chemically “identical” isomer is strictly useless to them). Note that (1) LAWKI would not be usable as food for Martians or other non-LAWKI, and vice versa; (2) this ipso facto also means that, likely, all enzymes that make and process the fundamental building blocks shared by LAWKI also arose only once. Of course, here it is possible that some enzymes may have been re-invented after the building block had already become a staple of LAWKI.
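The chirality argument can be put into rough numbers. The sketch below assumes, purely for illustration, that each independent origin of life would pick its handedness uniformly at random and independently of the others. That is my simplifying assumption, not an established fact, and the function name is hypothetical.

```python
from fractions import Fraction


def chance_all_match(n_origins: int, n_options: int = 2) -> Fraction:
    """Probability that n independent origins all land on the same option,
    assuming each one picks uniformly at random among n_options."""
    if n_origins < 1 or n_options < 1:
        raise ValueError("need at least one origin and one option")
    # The first origin is free to choose; every later one must match it.
    return Fraction(1, n_options) ** (n_origins - 1)


if __name__ == "__main__":
    # Ten hypothetical origins agreeing on amino acid handedness by luck.
    amino = chance_all_match(10)
    # The same ten also agreeing on sugar handedness, treated as independent.
    both = amino * chance_all_match(10)
    print(amino, both)
```

Passing a larger `n_options` extends the same reasoning to features with many alternatives, such as the choice of genetic code, where the chance of accidental agreement shrinks much faster.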

So LAWKI really looks like it had only one origin on Earth. Now the question becomes, is it easy to make life (many origins of life in the Universe likely), or is it hard? Current thinking seems to go in the direction that it is not too implausible – SETI and various conferences on the probabilities of alien life forms seem to consider it very likely. In addition to this, LAWKI seems to have arisen fairly quickly after the planet had cooled down enough. And very sophisticated models, such as Stuart Kauffman’s autocatalytic sets, also aim to convince the readership that proto-life is plausible as a form of organization, hence the title of Kauffman’s book, “At home in the universe”. The idea is that life is highly probable, and not a lucky accident.

One of the older ideas about possible origins of life is that life would have to be carbon and water based and that therefore only a tiny set of cosmic conditions would allow life. While this does sound quite anthropocentric already, the larger issue is that even if we accept this, it would be very hard to argue that the specific molecules of any and all life should be made from the same handful of components, coded in the same genetic code, and use organic monomers of the same chirality. So yes, LAWKI has one origin. Now if it really were easy to make life, at least under Earthly conditions, and if it does not have to look exactly like LAWKI, then where is it? On Earth, namely, since we assume that Earth had pristine conditions?

A number of possibilities come to mind: (1) other life exists in parallel but we don’t call it life (prions? Gaia? the economy as a system? the internet?); (2) other life sprang up repeatedly but was eaten by LAWKI; (3) other life sprang up repeatedly but was out-competed by the far more efficient (first mover advantage!) LAWKI. Both (2) and (3) would be a form of pre-emption (Eric Clemons’s words). (4) Other life sprang up on Earth but we don’t see it because it lives under the most extreme circumstances on Earth.

Ad (2) is unlikely – to me – because LAWKI is so idiosyncratic in its use of organic molecules. Only LAWKI can feed on LAWKI. The enzymes just won’t work on anything else, unless they adapt, but then, no more first mover advantage.

Ad (3) is unlikely as well in my eyes because of ecological niche theory. The entire theory of evolution is based on the idea of competition over scarce resources leading to natural selection among variants of “the same” organism (note cognitive dissonance if you wish so). Competition is highest among the most similar organisms (highest niche overlap). Recent case in point, “Individuals and the variation needed for high species diversity in forest trees” (Clark et al 2010). This paper concluded that homogeneous environments can produce large numbers of different species precisely because intra-specific competition is even fiercer than inter-specific competition: the individuals are more similar to each other. As a result it is “easier” to “break out” into a new species, than to compete against your true peers (in a sloppy way of putting it).

For some radically different life form this means, its “otherness” would actually have protected it both from predation and from competition, and made it easy for it to co-exist with LAWKI, had it occurred at all.
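The niche overlap argument can be illustrated with the textbook two-species Lotka-Volterra competition model. This model is my choice of illustration and is not taken from the Clark paper. In it, `a12` measures how strongly an individual of species 2 suppresses species 1 (relative to a member of species 1), `a21` is the reverse, and `k1`, `k2` are carrying capacities. Stable coexistence requires `a12 < k1/k2` and `a21 < k2/k1`. All names and numbers below are made up.

```python
def can_coexist(k1: float, k2: float, a12: float, a21: float) -> bool:
    """Stable coexistence condition of the two-species Lotka-Volterra model."""
    return a12 < k1 / k2 and a21 < k2 / k1


def equilibrium(k1: float, k2: float, a12: float, a21: float) -> tuple[float, float]:
    """Population sizes at the coexistence equilibrium."""
    if not can_coexist(k1, k2, a12, a21):
        raise ValueError("no stable coexistence for these parameters")
    denom = 1 - a12 * a21
    return (k1 - a12 * k2) / denom, (k2 - a21 * k1) / denom


if __name__ == "__main__":
    # Near-identical competitors (full niche overlap): one excludes the other.
    print(can_coexist(100.0, 90.0, 1.0, 1.0))
    # A radically different life form (no niche overlap): both persist,
    # each at its own carrying capacity.
    print(equilibrium(100.0, 90.0, 0.0, 0.0))
```

With zero overlap the two populations simply ignore each other, which is the formal version of “otherness” acting as protection from competition.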

Ad (4) is of course possible, but it invalidates a little the idea that Earth would have been a favorable environment for LAWKI and by extension, for all life, to begin with. If extreme environments are the last chance we can imagine to find other life, then we might as well postulate that the more extreme the environment the more likely we’ll find it, but this just doesn’t sound right.

And finally, ad (1). This is a real possibility. But here one would have to go beyond chemistry into what theoretical biologists in the tradition of Robert Rosen would say, namely that one should not define “life” as a set of molecules and their corresponding metabolism or encoding, but as a form of organization (Robert Rosen, “Life itself”). In this way of thinking, diffused networked intelligence, or old-fashioned structures such as “the economy”, like it or not, may well have features of “life”.

Edit: Mysteries of consciousness – I misspelled my acronym, LAWKI, multiple times. I did not see it when I proofread the post. The next day I woke up and knew that I had misspelled it. So when I wrote my post I saw, perceived, and memorized my mistake, but it took me 8 h and some sleep before I knew it. Corrected now, along with some other spelling issues.

The conserved genome

Something has to make a start, and a paper we recently discussed at SMU shall be it.

Along the lines of “it’s not about what (genes) you have but how you use them”, I came across this recent article: Systematic discovery of nonobvious human disease models through orthologous phenotypes.

The article itself seems, well, nonobvious at first. The subject is in the area of cross species genome comparisons. In essence, the more becomes known about genomes, the more people find similar genes in very dissimilar organisms. More, many genes appear in functional clusters, and these clusters are also to a large extent preserved in their unity across species and even phyla. Their genes do produce proteins with similar biochemical functions, but those same chemical functions in similar functional clusters produce very different outcomes (phenotypes) depending on the organism in which they occur – they perform totally unrelated organismal functions. So, structure is preserved, function is not, and the more is known about genes the more it appears that true novelty is rare and that most innovation in terms of species, depends on recycling and differently regulating existing genes.

There are many ways of looking at this. One is the surprising flexibility of genes and gene products themselves to be reused for different functions, so that, say, cilia-related genes of unicellular organisms have functions in producing neuronal networks in higher organisms. Another is the prospect of finding new disease models: starting from known disease-related genes in humans, one looks up the homologous genetic module in some different organism, and the other, as yet unknown genes found together with them there become new candidates.
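One common way to judge whether two gene sets overlap more than chance would allow is a hypergeometric tail probability. The sketch below shows that generic calculation. I am not claiming it reproduces the paper's exact scoring, and the function name and all counts are hypothetical.

```python
from math import comb


def overlap_p_value(n_orthologs: int, n_a: int, n_b: int, n_shared: int) -> float:
    """Chance of seeing at least n_shared genes in common when n_a genes
    (phenotype A, species 1) and n_b genes (phenotype B, species 2) are
    drawn at random from n_orthologs genes present in both species."""
    total = comb(n_orthologs, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_orthologs - n_a, n_b - k)
        for k in range(n_shared, min(n_a, n_b) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Made-up counts: 2000 shared orthologs, gene sets of 30 and 25,
    # of which 5 are orthologs of each other.
    print(overlap_p_value(2000, 30, 25, 5))
```

A very small value suggests the two phenotypes, however unrelated they look, draw on the same conserved gene module, which is what makes the remaining genes in that module interesting candidates.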

To me this is yet another piece of evidence for one of my pet theories: that genes seem to be widely conserved, and that patterns of use and gene regulation are much more important for the expressed features of organisms than the nominally encoded genetic information. More and more, genes seem to be a rather generic thing, and the differences between organisms seem to come about in how these genes are being used. On a higher level, this ties in with other puzzling facts of life, such as, why are plant drugs effective in humans at all? Why are very different looking and acting species so widely genetically similar?

The mix and match way of construction and innovation that life uses over and over – to simply re-use “old” genes that have been “invented” for an entirely different purpose – also offers a nice parallel to the way technology produces innovation by recombination of existing parts. One could have called this article “Life as ‘bricolage’: Innovation by recycling parts”.

Here is some lighter reading on this paper, with background information.