Updated 29 Sept 2020
The origins of life on Earth remain obscure. The genes of our cells are built from nucleic acids, proteins from amino acids, membranes and walls from lipids. Which came first? Each seems necessary to the existence of the others. Chicken-and-egg arguments abound. Biologists are at least agreed that we can trace all known life back to a notional Last Universal Common Ancestor, LUCA. It had a somewhat hazily understood metabolism, and may have been a small ecology rather than a single genome, but all the basics of a DNA genome, RNA messaging, ion transport across a lipid membrane, and all the necessary manufacturing, maintenance and reproductive mechanisms must have been in place. Wow! No wonder we have trouble going further back when this is the simplest metabolism we have direct evidence of.
But there is an elephant in the room that never seems to get a mention in this context. By sheer number and variety, the viruses in the world's oceans are thought to exceed greatly all the cellular organisms living there. They are much simpler objects, comprising little more than a core of genes and a protective coat of protein or lipids. Nevertheless, the task of crossing a resistant cell wall and hijacking living machinery to one's own purpose is a highly complex one. How did these viruses evolve, and when and where did they appear?
Given a virus-free world descended from LUCA, how could the first virus possibly have evolved? As we think happened with organelles (such as mitochondria and chloroplasts), some organisms might have become symbionts living inside host cells and slowly shed the metabolism - and genes - they no longer needed. In the case of viruses the process would continue until only the genes and a protective coat were left. That might have happened, but there are difficulties in building a convincing picture for all viruses.
Firstly, why are there no intermediate organisms around? Like organelles, these would have some metabolic function but would require a host in order to survive and replicate. Yet, like viruses, they would be able to break out and invade new host cells. Might they have once existed but died out? There seems no plausible reason why that should have been so; it is perhaps more likely that they never existed in the first place.
Secondly, LUCA and its successors are all DNA based. Genetic material takes two forms: DNA, the famous double helix, and RNA, which is usually single-stranded. RNA plays a vital role in the machinery that reads DNA and builds proteins. Some kinds of virus - not least the coronavirus currently sweeping through us as I write - are wholly RNA based. No cellular life form is known to have gone backwards and become RNA based, so why should viruses be any different? Maybe they have been that way all along.
Thirdly, a similar argument applies to cell and nuclear membranes. All cells, and the nuclei of those that have them, are surrounded by protective lipid membranes. All viruses also have a coat protecting their genes. However, in some viruses the coat is made of protein rather than lipid. How did that happen? If it evolved from some lipid-coated virus, again where are the intermediate forms?
Given these issues, the idea that viruses evolved from LUCA is cast into serious doubt, and the possibility that they evolved earlier must be taken seriously. Many biologists believe that life originated as an "RNA world" and DNA only came along later. Indeed, the arrival of DNA might have wiped out the RNA world in fairly short order, perhaps even before LUCA moved on. But, just as we used to think that the dinosaurs had died out until we realised that birds are basically dinosaurs with wings, might some fragment of the RNA world be surviving right under our noses? If viruses were around back then, might RNA viruses be the birds of the RNA world, its sole survivors?
Given their archaic and oddball features, I go so far here as to suggest that viruses are so complex, so diverse and so tied to the fundamental minutiae of living metabolism that their origin must lie even further back; they must be admitted into the chicken-and-egg arguments over the very origins of life. This proposition has enormous implications and is the pivotal thesis of this essay.
Proteins, genes and lipids are so interdependent that it has been suggested they must have somehow all evolved together in some primeval soup. Recent work supports the idea that this soup might have gained a surprising complexity of ingredients, with nucleic acids forming short strands of RNA and DNA, amino acids forming similar protein strands, and lipids forming little bubbles. However, it would have had to go an extremely long way down this road before some particular random mix turned out to be self-organising and stable enough to kick off cellular life. The apparent unlikelihood of such complex self-organisation spontaneously appearing still puts most biologists off. The problem then is to build a ladder with stable rungs at each evolutionary step up in complexity.
Viruses represent such an intermediate stage on the road to fully self-sustaining metabolism. Might they provide us with useful insights, perhaps even be part of the story itself? Firstly, we don't need lipid membranes; proteins will do fine and are in fact more robust. Secondly, we don't need a DNA nucleus in its own little membrane; free-floating RNA will do fine. Organelles also show us that if some modestly complex feedstocks are in the general soup mix, we don't need to synthesise those either.
A picture emerges of a rich soup teeming with short and diverse strands of RNA, proteins and lipids, and a wide variety of other organic molecules. Within it, RNA and protein strands begin to interact to create increasingly stable sites around which similar complexes form before dispersing. Complexes of RNA surrounded by protein machinery turn out to provide the best combination of stability and productivity.
Diversity at this higher level of complexity becomes significant. Some complexes evolve more robust protein shells which protect the RNA within, at the expense of slower activity. Others become more reliable replicators, relying on high levels of activity to overcome their short lives. Relatively stable forms begin to appear.
As more complexes are produced, the feedstock molecules in the soup begin to thin out and competition for them comes into play. Some of the longer-lived complexes develop the ability to steal feedstock from the less stable ones, even to detach small pieces of the victim complex itself.
Others begin to replicate not by direct self-assembly but by twisting or mutating those around them. The protein-only variety have survived and come to be known as prions, while the RNA types form the basis for viral infection. Combined with a protein coat for longevity during passive periods, these are effectively the first proto-viruses.
Lipids are a significant byproduct of these reactions and as their concentration builds up, they begin to form rather wobbly and impure bubbles around the reacting complexes. These bubbles help protect the short-lived, high-activity complexes from the aggressions of the protein-shielded ones, but at the expense of also blocking out the feedstocks. Fortunately the lipid membranes are fairly haphazard and impure. Some proteins attach to lipids and get included in the shell as it forms; imperfections such as these provide leaky pathways for fresh soup.
The complexes which any given membrane surrounds are at first pretty arbitrary. However, some complexes form symbiotic relationships, the waste products of one kind providing feedstock for another. These complexes-of-complexes form yet a higher level of complexity. Evolution now kicks off in earnest and life has arrived.
RNA viruses evolve side by side with RNA life in a typical competitive spiral of predator vs. prey. Everything is still pretty wobbly and unstable at this stage; some viruses end up coated in protein, others in lipids. Some do not kill the host they invade but reproduce in small numbers, eventually ending up doing something useful, losing their coat and along with it their identity as a virus.
In due course some mutant triggered a bizarre reproduction not of its own RNA but of a double-stranded DNA. Whether this was cellular or viral DNA is moot, but it eventually ended up in a viable virus. This could explain how, some time after LUCA, some prokaryotic cells gained a protective membrane around their genes and so acquired a nucleus: that virus remained in the cell, replicating in small numbers. Eventually it outperformed the cell's original genome and took ownership. Since it was a lipid-coated virus, the nuclear membrane is also lipid based. Of course, it is possible that the membrane arose some other way; for example, the close physical proximity of the nucleus to lipid production may simply have led to a convenient dumping of rogue lipids around the nucleus.
Also of course, the whole development pictured here is full of gaps and question marks. I am no biologist and I could be way off beam on many details. Nevertheless, I believe that I have adequately demonstrated the need to incorporate viruses into the heart of any theory of life's origins, and that the burden of proof lies not with those biologists who would treat the origins of the living cell and virus holistically, but with those who would seek to separate them.