On the Origins of Life and Viruses

Updated 8 Jun 2022 [badly unfinished]

The puzzle

The origins of life on Earth remain obscure. The genes of our cells are built from nucleic acids, proteins from amino acids, membranes and walls from lipids. Which came first? Each seems necessary to the existence of the others; chicken-and-egg arguments abound. It has recently been suggested that all must have appeared alongside but independently of each other, until all the ingredients suddenly clicked into place.[1] Was there an RNA world in which the simpler genes arrived first, or did the famous DNA double-helix appear alongside it from day one?[2] Biologists are at least agreed that we can trace all known life back to a notional Last Universal Common Ancestor, LUCA. It had a somewhat hazily-understood metabolism, and may have been a small ecology rather than a single genome, but all the universal basics of DNA genome, RNA messaging, protein-based ion transport, lipid membrane and all the necessary manufacturing, maintenance and reproductive mechanisms, were by now in place. No wonder we have trouble going further back when this is the simplest metabolism we have direct evidence of.

Origins of cellular life

I would suggest that extreme mutation, of the kind seen in many modern viruses, must also have been a feature of the original proto-life as it emerged from the primal soup. Any lipid membrane must have been highly permeable to allow nutrients through and would also have allowed destructive chemicals in. Metabolism and reproduction would have been haphazard, ill-defined affairs until controlled transport mechanisms through the membrane evolved and stabilised. During that interim period, a strategy of high replication would have been necessary to establish any kind of evolutionary dominance. It would have been particularly effective for those early simple, short genomes and proteins, as a short chain offers fewer places for a copying error to strike than a longer one.
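The point about short chains can be put in rough numbers. The following is a minimal sketch in Python, not a model of any real chemistry: it assumes that every base is miscopied independently with the same probability, and the function name and the error rate of one in a thousand are invented purely for illustration.

```python
def error_free_fraction(length: int, per_base_error: float) -> float:
    """Chance that a copy of `length` bases contains no copying error.

    Assumption (not from the essay): each base is miscopied independently
    with the same probability `per_base_error`.
    """
    if length < 0:
        raise ValueError("length must not be negative")
    if not 0.0 <= per_base_error <= 1.0:
        raise ValueError("per_base_error must lie between 0 and 1")
    return (1.0 - per_base_error) ** length


if __name__ == "__main__":
    # Hypothetical error rate, chosen only to show the trend with length.
    for length in (50, 500, 5000):
        print(length, error_free_fraction(length, 0.001))
```

Under these assumptions the share of faithful copies falls away steeply as the chain lengthens, which is why sloppy, high-volume copying is far more forgiving for a short genome than for a long one.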


There is an elephant in the room that never seems to get a mention in this context. The sheer number and diversity of viruses in the world's oceans are thought to greatly exceed those of all living organisms across the whole planet. They are much simpler objects, comprising little more than a bundle of genes in a protective coat of protein or lipids. Some larger ones have vestigial metabolic abilities, and all have weaponry capable of hijacking living machinery. Nevertheless, even in the days of LUCA the task of taking over a living cell would have been a vastly complex one. How did these viruses evolve, when and where did they appear?

Viruses represent an intermediate stage on the road to fully self-sustaining metabolism. Might they provide us with useful insights, perhaps even be part of the story itself? Firstly, we don't need lipid membranes; proteins will do fine and are in fact more robust. Secondly, we don't need a DNA nucleus in its own little membrane; free-floating RNA will do fine. Organelles also show us that if some modestly complex feedstocks are in the general soup mix, we don't need to synthesise those either.

Given a virus-free world descended from LUCA, how could the first virus have possibly evolved? As we think happened with organelles (such as mitochondria and chloroplasts), some organisms might have become symbionts living inside host cells and slowly shed the metabolism - and genes - they no longer needed. In the case of viruses the process would continue until only the nucleus and a protective coat were left. Then the virus went rogue, like a cancer. That might have happened, but it does not explain how the rogues could develop sophisticated invasion techniques. And there are difficulties in building a convincing picture for all viruses.

Firstly, LUCA and its successors are all DNA based. Genetic material takes two forms, as DNA - the famous double-helix - and also as RNA, a single-strand form. RNA plays a vital role in the metabolic machinery of replicating DNA and proteins. Some kinds of virus - not least the Covid-19 currently sweeping through us as I write - are wholly RNA based. No DNA life form has ever gone backwards and become RNA based, so why should viruses be any different? Did RNA and DNA viruses evolve independently? It seems more likely that they have been that way all along.

Secondly, a similar argument applies to cell and nuclear membranes. All cells and their DNA nuclei are surrounded by protective lipid membranes. All viruses also have a membrane protecting their genes. However, in some viruses the membrane is made of protein. How did that happen? If it evolved from some lipid-coated virus, where are the intermediate forms?

Many viruses, especially RNA ones, are vastly prolific at replication within the host cell but extremely poor at keeping those copies faithful. Mutation rates are extreme, with most copies being non-viable. The benefit of this strategy is that viable copies are frequently mutants, with some able to circumvent the host organism's defensive responses against the original strain. For example, surface proteins may change so that antibodies fail to recognise them.
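The trade-off described here can be played with in a toy simulation. The sketch below is Python and every detail of it is an assumption made for illustration: the genome is an arbitrary string, the rule that an intact "core" region means viability and a changed "surface" region means escape from recognition is invented, and the names copy_with_errors, classify and burst are hypothetical.

```python
import random

BASES = "ACGU"  # RNA alphabet


def copy_with_errors(genome, error_rate, rng):
    """Copy an RNA string; each base is miscopied with probability error_rate."""
    copied = []
    for base in genome:
        if rng.random() < error_rate:
            # A miscopy always yields one of the three other bases.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def classify(parent, child, core_len):
    """Toy rule (an assumption): the first core_len bases must be intact for
    viability; any change in the remaining 'surface' region counts as escape."""
    if child[:core_len] != parent[:core_len]:
        return "non-viable"
    if child[core_len:] != parent[core_len:]:
        return "escape mutant"
    return "faithful copy"


def burst(parent, copies, error_rate, core_len, seed=0):
    """Make many error-prone copies and tally how each one is classified."""
    rng = random.Random(seed)
    tally = {"non-viable": 0, "escape mutant": 0, "faithful copy": 0}
    for _ in range(copies):
        child = copy_with_errors(parent, error_rate, rng)
        tally[classify(parent, child, core_len)] += 1
    return tally


if __name__ == "__main__":
    parent = "AUGGCUACGUUAGCCGAUAGCUAGGCUUAACG"  # arbitrary made-up sequence
    print(burst(parent, copies=1000, error_rate=0.05, core_len=24))
```

Varying error_rate and copies shows the shape of the gamble: a sloppier copier wastes more of its output on non-viable copies, but a larger burst still leaves some viable mutants with an altered surface region.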

Here's another thought. We know that cancer is deeply implicated in the fundamental metabolism of monocellular life and has been a bugbear of multicellular life from Day One. Indeed, one can see cancer as a return under stress to the tried and trusted earlier state (Davies 2019). Why should viruses not offer a similar picture? Why should a cell which has been taken over by a virus not represent a reversion to an even more primitive mechanism, in which recognisable cells had not yet emerged from the primeval soup?

[Similarities and] conclusions

One can surmise that virus-like composite clumps of RNA and protein would have evolved in competition with each other. Robbing each other would be a good survival move. Might these have therefore evolved some sort of protective layer against robbers and other rogue metabolic attacks? The proteins used in reproduction offer one possible source, lipids another.

Some of the most basic metabolic processes might also be brought within the protective shell, perhaps even have originated there. The early protective layers would have been leaky, offering only modest protection. As they evolved to be ever more protective, more like tough membranes, the need to actively maintain paths through them for feedstocks and wastes arose, both to create and open those paths and to close them back up. So too did the need to manage the reproduction process when the shell grew so large it split in two. A diversifying range of proto-life emerged, varying in complexity from the simple RNA-protein clump to a self-maintaining cell carrying out all the necessary metabolic activity within. There was as yet no real analogue to the idea of species, just a slowly separating soup of biomolecular clumps and metabolic activity. As yet, only the RNA strands showed much sign of reliable reproduction.

As larger lipid membranes evolved, the intermediate forms lost out to the two extremes, their protection being inadequate and their reproducibility too ephemeral. The diversity of proto-life forms polarised down the two extreme paths. One specialised in the protective membrane, huddling inside it with a larder of metabolic soup; a living cell. The other instead strengthened its hole-creation armoury, relying on the presence of reproductive metabolism in its environment; a virus. None of this was particularly stable and the cells often gained useful genetic material from unsuccessful viral attacks. Thus, cells and viruses should be seen not as life and parasite but as ecological siblings, an uneasy symbiosis.

A genome supporting the immense complexity of a metabolic cell must value stability over fecundity, so the cells soon settled down to a more leisurely programme of reproduction and evolution. Meanwhile the viruses were free of such encumbrances and had no reason to slow down and stabilise.

Given these issues, the idea that viruses evolved after LUCA must be cast into serious doubt. The possibility that they evolved earlier must be taken seriously. Many biologists believe that life originated as an "RNA world", or that DNA was at best also present. Indeed the arrival of DNA, perhaps even in the guise of LUCA, might have wiped out RNA world in fairly short order. But, just as we used to think that the dinosaurs had died out until we realised that birds are basically dinosaurs with wings, might some fragment of RNA world be surviving right under our noses? If viruses were around back then, might they be the birds of RNA world's dinosaurs, its sole survivors?

The replication strategy of modern viruses may have simply remained unchanged since the origin of life.

Given their archaic and oddball features, I go so far here as to suggest that they are sufficiently complex, diverse and tied to the fundamental minutiae of living metabolism, that they must be admitted into the chicken-and-egg arguments over the very origins of life. This proposition has enormous implications and is the pivotal thesis of this essay.

Proteins, genes and lipids are so interdependent that it has been suggested they must have somehow all evolved together in some primeval soup. Recent work supports the idea that this soup might have gained a surprising complexity of ingredients with nucleic acids forming short strands of RNA and DNA, amino acids forming similar protein strands, and lipids forming little bubbles. However it would have had to go an extremely long way down this road before some particular random mix turned out to be self-organising and stable enough to kick off cellular life. The apparent unlikelihood of such complex self-organisation spontaneously appearing still puts most biologists off. The problem then is to build a ladder with stable rungs at each evolutionary step up in complexity.

A picture emerges of a rich soup teeming with short and diverse strands of RNA, proteins and lipids, and a wide variety of other organic molecules. Within it, RNA and protein strands begin to interact to create increasingly stable sites around which similar complexes form before dispersing. Complexes of RNA surrounded by protein machinery turn out to provide the best combination of stability and productivity.

Diversity at this higher level of complexity becomes significant. Some complexes evolve more robust protein shells which protect the RNA within, at the expense of slower activity. Others become more reliable replicators, relying on high levels of activity to overcome their short lives. Relatively stable forms begin to appear.

As more complexes are produced the feedstock molecules in the soup begin to thin out and competition for them comes into play. Some of the longer-lived complexes develop the ability to steal them from the less stable ones, even to detach small pieces of the complex itself.

Others begin to replicate not by direct self-assembly but by twisting or mutating those around them. The protein-only variety have survived and come to be known as prions, while the RNA types form the basis for viral infection. Combined with a protein coat for longevity during passive periods, these are effectively the first proto-viruses.


Which came first, RNA or DNA? Theories differ, one of the latest being that they both evolved together. They do not even use the same genetic alphabets; RNA uses adenine (A), cytosine (C), guanine (G) and uracil (U), while DNA swaps the uracil for thymine (T).
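The difference between the two alphabets is small enough to write down directly. This is a minimal Python sketch with hypothetical names; it only rewrites the same strand in the other alphabet, letter for letter, and does not attempt base-pairing or any real transcription machinery.

```python
DNA_ALPHABET = frozenset("ACGT")
RNA_ALPHABET = frozenset("ACGU")


def dna_to_rna(sequence: str) -> str:
    """Rewrite a DNA-letter string in RNA letters by swapping T for U."""
    sequence = sequence.upper()
    unknown = set(sequence) - DNA_ALPHABET
    if unknown:
        raise ValueError(f"not DNA letters: {sorted(unknown)}")
    return sequence.replace("T", "U")


def rna_to_dna(sequence: str) -> str:
    """Rewrite an RNA-letter string in DNA letters by swapping U for T."""
    sequence = sequence.upper()
    unknown = set(sequence) - RNA_ALPHABET
    if unknown:
        raise ValueError(f"not RNA letters: {sorted(unknown)}")
    return sequence.replace("U", "T")


if __name__ == "__main__":
    print(dna_to_rna("GATTACA"))  # prints GAUUACA
```

Three of the four letters are shared; only the thymine and uracil positions differ between the two spellings of the same message.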

But there are some useful pointers. RNA has many more functions and is more adaptable to variations in form. It is intimately threaded through the cellular machinery; while RNA can function as genetic code, most forms carry out other metabolic functions such as protein synthesis. Indeed, it is so tightly bound into the mechanisms of protein synthesis, and vice versa, that the symbiosis of protein and RNA appears to be one of, if not the, fundamental signatures of life itself.

By comparison, DNA sits relatively aloof as the genetic data store and plays no other fundamental role. But it cannot look after itself; RNA is as deeply involved in nursemaiding it as it is with proteins. The basic metabolism of RNA works fine without DNA, whereas that of DNA would be utterly nonexistent without RNA.

Also, the mechanisms of DNA replication and maintenance are far more complex and sophisticated than for RNA. Again, it is hard to argue that life evolved all that complexity before discovering RNA, simplifying it down and threading it throughout that complexity with no trace left of any DNA-only metabolism. For example, RNA replication requires only a basic copy-paste mechanism; it does not need the enveloping complexities of breaking and reassembling the genetic strand. Losing complexities in a kind of reverse evolution can happen if there is a particular benefit to survival, but the usual trend is to add layers of more sophisticated functional metabolism on top, and to leave the tried and trusted base metabolism well alone.

RNA has recently been found to form spontaneously in the presence of volcanic glasses formed from basalt rock. The evidence is not definitive; a brew of nucleoside triphosphate has to be supplied and there are open questions about branching chains, but if protein chemistry is anything to go by the ongoing research will soon fill such details in. I am aware of no similar evidence that DNA can be synthesised via such geological processes.

All this suggests an RNA baseline, upon which DNA was later layered. The burden of proof for any other model must surely lie with those who would propose it. It is quite possible that DNA in some form evolved very early on and hung around as a curiosity, or as a minor player in other metabolic roles, rather than evolving as some kind of mutated genetic medium, but its presence as the genetic data bank does appear to have been an afterthought.

Life arrives: RNA world

Lipids are a significant byproduct of the basic protein-nucleic acid metabolic reactions. As their concentration builds up, they begin to form rather wobbly and impure bubbles around the reacting complexes. These bubbles help protect the short-lived, high-activity complexes from the aggressions of the protein-shielded ones, but at the expense of also blocking out the feedstocks. Fortunately the lipid membranes are fairly haphazard and impure. Some proteins attach to lipids and get included in the shell as it forms; imperfections such as these provide leaky pathways for fresh soup.

The complexes which any given membrane surrounds are at first pretty arbitrary. However some complexes form symbiotic relationships, their waste products providing feedstock for another kind. These complexes-of-complexes form yet a higher level of complexity. Evolution now kicks off in earnest and life has arrived.

Along with RNA life, RNA viruses evolve side by side in a typical competitive spiral of predator vs. prey. Everything is still pretty wobbly and unstable at this stage; some viruses end up coated in protein, others in lipids. Some do not kill the host they invade but reproduce in small numbers, eventually ending up doing something useful, losing their coat and along with it their identity as a virus.

DNA world

In due course some mutant triggered a bizarre reproduction not of its own RNA but of a double-stranded DNA. Whether this was cellular or viral DNA is moot, or perhaps some intermediate form of half-life now extinct, but it eventually ended up in a viable virus. This could explain how, some time after LUCA, eukaryotic life gained its protective membrane around the nucleus: that virus remained in the cell, replicating in small numbers. Eventually it outperformed the cell's original genome and took ownership. Since it was a lipid-coated virus, the nuclear membrane is also lipid based. Of course, it is possible that the membrane may have arisen some other way; for example, the close physical proximity of the nucleus to lipid production may have simply led to a convenient dumping of rogue lipids around the nucleus.

Of course, the whole development pictured here is full of gaps and question marks. I am no biologist and I could be way off beam on many details. Nevertheless, I hope that I have adequately demonstrated the need to incorporate viruses into the heart of any theory of life's origins, and that the challenge of credibility is placed not on those biologists who would treat the origins of the living cell and virus holistically, but on those who would seek to separate them.


  1. 1. parallel evolution 2. RNA/DNA parallel NS 22 Oct 2020 p.24 NS Nov 2020
  2. Paul Davies; The Demon in the Machine, Allen Lane, 2019.
  3. Craig A. Jerome et al.; "Catalytic Synthesis of Polyribonucleic Acid on Prebiotic Rock Glasses", Astrobiology. Ahead of print, 19 May 2022.
  4. Robert F. Service; "Did volcanic ‘glasses’ help spark early life?", News, AAAS, 3 June 2022.