Monday 17 July 2017

Endogenous Retroviruses and the Evidence for Common Descent

Overview

Common descent has not been an issue in the mainstream scientific community for over one hundred years. The case for common descent, which was considered solid in the early 20th century, has now become overwhelming based in no small part on the evidence from molecular genetics. The evidence is overwhelming when it comes to endogenous retroviral inclusions. ERVs are remnants of prior retroviral infections that have become integrated into the DNA of organisms. If they pass into the germline of an organism, its descendants can inherit them. Their presence is clear evidence of past infection in an ancestral organism.
Common descent would predict that the descendants of a species infected by a retrovirus, which subsequently became integrated into the germline, would inherit that retroviral inclusion at the same point in the DNA of the descendant organisms. We find evidence of this in organisms ranging from primates to sheep to crocodiles. It is difficult to imagine a more comprehensive demonstration of common descent, particularly when closely related species such as humans and chimpanzees share significant numbers of ERVs at the same point in their genome.
One analogy that should drive home this point is that of a mathematics teacher who receives ten exam papers that not only get the same questions wrong, but show the same errors in the working of the problem, and even share the same spelling errors at the same questions. Copying is the only reasonable conclusion, as it stretches credibility to assume that the ten students independently got the same questions wrong, made exactly the same mistakes in derivation and made the same spelling error. This is similar to what we see with closely related species with shared ERVs at exactly the same point in their genomes.
The following article should serve not only as an introduction to retrovirology for the layperson, but point out in detail the evidence for common descent from ERVs, examples where evolution has co-opted ERV components to perform new functions, and refutations of common evolution denialist arguments against the evidence for common descent from ERVs.
This review is long, but no apology is made for this. Shared ERV elements in related species are overwhelming evidence for common descent, which is why creationists, who are unable to refute this evidence, raise objections which, while superficially appealing to the layperson, do not pass the critical scrutiny of the scientists whose professional lives involve working with ERVs. Dismissing the molecular evidence as “circumstantial evidence” is merely an attempt to evade the burden of proof expected of anyone who opposes a consensus view. If one challenges a long-established position, one needs to understand intimately what one is challenging, and produce hard evidence for critical review by the scientific community. If that is not done, no one will take such attempted rebuttals seriously - and rightly so. No creationist refutation of the evidence from shared ERV elements exists - the consensus view that they support common descent remains unchallenged.
 

Before we begin, let’s define some terms that will be used frequently:

·      Germ line: sex cells such as sperm or ova
·      Transcription: the process where a DNA segment is copied into RNA as the first part of gene expression
·      Reverse transcription: the process by which a DNA strand is generated from an RNA template.
·      Virus: a pathogen that can only replicate intracellularly - they consist of a protein coat that surrounds the viral genetic material, which is either DNA or RNA. In order to replicate themselves, they invade a host cell, and hijack its replication mechanisms.
·      Retrovirus: an RNA virus that produces DNA from its RNA genetic material. The DNA copy is then inserted into the host DNA, and is copied along with it each time the host cell divides.
·      Provirus: viral genetic material that has become integrated into the host cell’s DNA.

An endogenous retrovirus is a retrovirus that has become integrated into the host genome. Once integrated, the endogenous retrovirus can be potentially infectious for a short time. However, the proviral sequence usually acquires point mutations, deletions and insertions which render it non-functional and unable to express the retrovirus. The reason for this is that the proviral sequence is selectively neutral – that is, it is not important for the organism's survival, so mutations in it are not weeded out by selection. ERVs abound in vertebrate genomes, numbering in the thousands. In humans, up to 8% of our genome <4> is composed of ERVs or ERV-related elements. Most are non-functional.
How do we know that our genome is littered with thousands of ERVs? The answer is that we know what a retroviral genome looks like. This is the representative genomic structure of a retrovirus:

LTR – gag – pol – env – LTR

LTR: long terminal repeat. These are DNA sequences that are involved in the insertion of the retroviral genome into host DNA.
gag: group specific antigen. This codes for retroviral structural proteins.
pol: polymerase. This codes for reverse transcriptase, protease and integrase.
env: envelope. This codes for the retroviral coat proteins.
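
The layout above is what makes a provirus recognisable in a genome. As a rough illustration only, the following Python sketch checks whether a set of annotated features follows that order. The `(name, start, end)` annotation format, the function name and the coordinates are all hypothetical, invented for this example; real ERV detection works on sequence similarity and copes with deletions and rearrangements.

```python
# Toy check: do annotated features follow the canonical provirus layout?
# The (name, start, end) tuple format and the coordinates are hypothetical.
CANONICAL_ORDER = ["LTR", "gag", "pol", "env", "LTR"]

def looks_like_full_provirus(features):
    """Return True if the features, sorted by start position, read LTR, gag, pol, env, LTR."""
    ordered = sorted(features, key=lambda feature: feature[1])
    names = [name for name, _start, _end in ordered]
    return names == CANONICAL_ORDER

# Made-up coordinates, listed out of order on purpose.
example_locus = [
    ("gag", 1100, 3100),
    ("LTR", 0, 968),
    ("env", 6500, 8500),
    ("pol", 3200, 6400),
    ("LTR", 8600, 9568),
]

print(looks_like_full_provirus(example_locus))
```

A locus that has lost one of the genes or an LTR fails this check, which is the usual state of old, decayed insertions.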



Currently, the retroviruses fall into seven genera <5,6>

1. Alpharetrovirus
So far, these have only been identified in jungle fowl (genus Gallus). This group includes both the first retrovirus identified – avian leucosis virus (ALV) – and the first oncovirus – Rous sarcoma virus (RSV), which causes sarcoma (cancer) in chickens.

2. Betaretrovirus
Betaretroviruses to date have been isolated only from mammals. Examples of these include mouse mammary tumour virus (MMTV) and the simian retroviruses (SRV-1 to SRV-6). SRV-1 and SRV-2 can induce simian AIDS in some macaques. MMTV can be transmitted to the young via infected lymphocytes in the mother’s milk, and induces the formation of benign mammary tumours. MMTV also possesses an extra gene (sag) that encodes a superantigen.

3. Gammaretrovirus
Unlike the other retroviruses, the gammaretrovirus group contains retroviruses found in more than one vertebrate class. They were first identified in mice and were associated with murine leukaemia and murine sarcoma. Three subgroups to date have been classified:

·      Moloney murine leukaemia virus (MoMLV), gibbon ape leukaemia virus (GaLV), koala retrovirus (KoRV) and feline leukaemia virus (FeLV) are examples of gammaretroviruses found in mammals. FeLV infection is associated with immunosuppression in cats which can be fatal. Xenotropic murine leukaemia virus-related virus (XMRV) <7> was reported as the first known gammaretrovirus to infect humans, although later work attributed those findings to laboratory contamination. Evidence for <8> and against <9> an association with prostate cancer has been published.
·      Reticuloendotheliosis viruses (REVs) are found in birds, and take their name from the virus of the same name. Another example is spleen necrosis virus (SNV) <10> which can kill ducklings and suppress the immune system of older ducks.
·      Reptilian gammaretroviruses include viper retrovirus.

4. Deltaretrovirus
These are restricted to mammals and to date have been found only in primates and cattle. Deltaretrovirus infection is characterised by a long incubation period prior to the onset of disease, and the virus tends to remain in the host indefinitely. Human T-lymphotropic virus 1 (HTLV-1) and human T-lymphotropic virus 2 (HTLV-2) are two of the better-known deltaretroviruses. HTLV-1 is oncogenic, causing T-cell lymphoma and T-cell leukaemia. Bovine leukaemia virus (BLV) is closely related to HTLV-1 and only occasionally causes a B-cell leukaemia; generally, it induces only a benign disease.

5. Epsilonretrovirus
The epsilonretrovirus genus infects fish. The walleye dermal sarcoma virus (WDSV) infects walleyes and, as the name suggests, is associated with walleye dermal sarcoma. Other epsilonretroviruses include walleye epidermal hyperplasia virus 1 and 2 (WEHV-1 and WEHV-2).

6. Lentivirus
Lentiviruses, like deltaretroviruses, occur only in mammals and are also characterised by a long incubation period and slow pattern of disease. Unlike deltaretroviruses, lentiviruses have been isolated from a wide range of mammals including primates, cats, and domestic ungulates including sheep, goats, horses and cattle. The best-known lentiviruses are the human immunodeficiency viruses HIV-1 and HIV-2, infection with which often leads to AIDS. Feline immunodeficiency virus (FIV) and simian immunodeficiency virus (SIV) are the best-known non-human lentiviruses.

7. Spumavirus
Spumaviruses are found widely across mammalian species, but to date have not been definitely linked with disease. Examples include feline spumavirus (FeSV), bovine spumavirus (BSV) and chimpanzee foamy virus (CFV).



Most retroviral infections occur in the somatic cells (that is, the cells of the body apart from the germ cells). However, if the germ cells are infected, then the retroviral proviral sequence will become part of the germline DNA, and can then be passed down from generation to generation. These are referred to as endogenous retroviruses <11>.
As these proviral sequences are alien to the host, one would reasonably expect some consequences. One author has pointed out that:
Proviral inheritance might have numerous consequences for the host. Some stem from the insertion of multiple copies of DNA sequences containing signals capable of modifying transcription or RNA processing. Thus proviruses might act to cause chromosomal rearrangement by homologous recombination, as a source of novel control sequences for cellular genes or as insertional mutagens. Alternatively, there might be consequences from viral gene expression, with either pathogenic or possibly beneficial effects. In the extreme case, transcription may lead to virus activation and the formation of virally induced tumours, as has been well documented with certain endogenous murine leukemia viruses and mouse mammary tumour viruses. <12>
In other words, if proviral expression causes the host organism to develop a debilitating disease, then the host organism will be less likely to survive and the ERV will likewise soon be removed from the gene pool. Conversely, if the presence of the proviral sequence or some of its components confers a selective advantage <13>, then these components will tend to remain in the host genome.
If the ERV does not unduly harm the host, then it may be passed on to subsequent generations. While the proviral sequence retains the ability to replicate, its numbers in the host genome will tend to increase either by re-infection or by retrotransposition. In this case, since the ERV does not confer a selective advantage to the host, then it will accumulate mutations over time. This will first render the proviral sequence incapable of re-infecting the host, and then over time cause it to decay away as it accumulates mutations.
Endogenous proviral sequences have been found in every vertebrate class as well as most invertebrate species studied so far. There were around 80000 proviral sequences or their remnants identified in the human genome as of late 2003, which is around 8% of the genome. This is around twice the amount of space that actual coding DNA takes up in our DNA. Coffin notes that “in other words, there are more proviruses in us than there is us in us.” <14>
Endogenous retroviral classification is still in some flux: one classification method is to cluster them with the retrovirus genera from which the retrovirus that originally infected the ancestral genome came. Gifford and Tristem state that:
Some ERVs clearly represent endogenised variants of exogenous viruses and are grouped within the seven recognised genera…However, infectious counterparts have not been identified for most ERVs and, at present, there is no consensus as to how these endogenous retroviruses (many of which are only fragmentary sequences) might be incorporated into the existing retroviral taxonomy. The situation is further complicated by the historical development of distinct and sometimes inappropriate classification schemes for ERVs in particular hosts. <15>
This approach – of classifying ERVs according to their similarity to retrovirus genera – has led to one <16> classification scheme:
Class I: ERVs clustering with gamma and epsilonretroviruses
Class II: ERVs clustering with lentiviruses, alpha / beta / deltaretroviruses
Class III: ERVs clustering with spumaviruses.
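
The three-class scheme amounts to a lookup from the retroviral genus an ERV clusters with to its class. A minimal sketch of that mapping follows; the dictionary and function names are illustrative, not taken from any published tool.

```python
# Lookup table for the three-class scheme described above.
# Names are illustrative; only the genus-to-class mapping comes from the text.
ERV_CLASS_BY_GENUS = {
    "gammaretrovirus": "Class I",
    "epsilonretrovirus": "Class I",
    "lentivirus": "Class II",
    "alpharetrovirus": "Class II",
    "betaretrovirus": "Class II",
    "deltaretrovirus": "Class II",
    "spumavirus": "Class III",
}

def erv_class(closest_genus):
    """Return the ERV class for the genus an element clusters with."""
    return ERV_CLASS_BY_GENUS[closest_genus.lower()]

print(erv_class("Betaretrovirus"))
```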
Most ERVs are derived from alpharetroviruses, betaretroviruses, and gammaretroviruses. To date, only a limited number of lentivirus-related ERVs have been discovered <17,18>, while no deltaretrovirus-related ERVs have so far been found. ERVs distantly related to spumaviruses and epsilonretroviruses have been discovered. A parallel classification scheme for human ERVs (HERVs) exists, which, as Gifford and Tristem point out, has historically complicated attempts to formulate a classification scheme for ERVs.
An alternative way in which ERVs can be classified <19> is to group them into recent and ancient. Recent ERVs were integrated into the host genome after speciation, while ancient ERV integration occurred before speciation. Unlike recent ERVs, which can still give rise to infectious retroviruses, ancient ERVs have accumulated inactivating mutations that generally prevent them yielding retroviruses.



Arguably the most powerful demonstration of common descent is the presence of ERV inclusions in the same position in the genome of related species. Coffin - an acknowledged expert in virology - notes that:
Because the site of integration in the genome, which comprises some three billion base pairs in humans, is essentially random, the presence of an ancient provirus at exactly the same position in different, but related, species cannot occur by chance, but must be a consequence of integration into the DNA of a common ancestor of all the species that contain it. It follows, therefore, that we can infer what viruses were present millions of years ago by examining the distribution of endogenous proviruses in modern species. <20>
In a frequently cited paper, Coffin and Johnson pointed out <21> how retroviral inclusions could be employed to reconstruct primate phylogenies, or evolutionary family trees, using the principle that retroviruses, once fixed in the genome of a species, will be inherited by its descendants.
Coffin and Johnson used human endogenous retroviruses - most of the HERV families are found in apes and Old World monkeys. The HERVs used in their study are:
the result of integration events that took place between 5 and 50 million years ago, as indicated by the distribution of specific proviruses at the same integration sites (or loci) among related species. The evolution of primates has been the subject of intense study for well over a century, providing a well established phylogenetic consensus with which to compare and evaluate the performance of ERVs as phylogenetic markers. <22>
What Coffin and Johnson are pointing out towards the end is that we have a fairly reliable evolutionary family tree based on physical characteristics which they used to see how well family trees constructed using ERV data compared with the consensus tree.
Although the technical aspects of constructing such molecular trees are sophisticated, the basic principle behind it is fairly straightforward. An ERV proviral sequence that is selectively neutral will eventually accumulate mutations. If two species share a recent common ancestor, then the ERV inclusions they share will differ by only a small number of mutations, while species that share a remote ancestor will have their common ERV inclusion differing by significantly more mutations. Using sophisticated statistical tools, a family tree can be constructed from the molecular data.
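
To make the principle concrete, here is a minimal Python sketch that builds a tree from pairwise differences using average-linkage clustering (UPGMA). This is a deliberately simple stand-in and not the method used in the studies cited here. The four 20-base sequences are invented for illustration and are not real ERV data.

```python
from itertools import combinations

def p_distance(a, b):
    """Fraction of aligned sites that differ between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def upgma(seqs):
    """Cluster sequences by average pairwise distance; return a nested-tuple tree."""
    clusters = {name: [name] for name in seqs}  # tree node -> leaf names under it
    leaf_dist = {
        frozenset(pair): p_distance(seqs[pair[0]], seqs[pair[1]])
        for pair in combinations(seqs, 2)
    }

    def cluster_dist(c1, c2):
        pairs = [(a, b) for a in clusters[c1] for b in clusters[c2]]
        return sum(leaf_dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: cluster_dist(*p))
        members = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = members
    return next(iter(clusters))

# Invented toy alignment of one shared locus (not real data).
toy_locus = {
    "human":   "ACGTACGTACGTACGTACGT",
    "chimp":   "ACGTACGTACGTACGTACGA",
    "gorilla": "ACGTACGAACGTACGTACCT",
    "macaque": "TCGTTCGAACGAACGTTCCT",
}

print(upgma(toy_locus))
```

In this toy data the human and chimp copies differ at one site, so they are joined first; the macaque copy differs most and joins last. UPGMA assumes a roughly constant mutation rate, which is one reason real analyses use more careful methods.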
ERV inclusions as Coffin and Johnson point out <23> have three sources of information that can be used to construct phylogenetic trees:
·      The distribution of ERV inclusions among related species
·      Accumulated mutations in proviral sequences, which allow an estimate of genetic distance
·      Sequence divergence between the LTRs at each end of the ERV inclusion, which is a source of information unique to endogenous retroviruses.
With respect to the first point, given both the huge size of the vertebrate genome and the random nature of retroviral integration, it is highly unlikely that independent integration events will produce ERV inclusions at exactly the same location in different species. As the authors point out:
Therefore, an ERV locus shared by two or more species is descended from a single integration event and is proof that the species share a common ancestor into whose germ line the original integration took place. Furthermore, integrated proviruses are extremely stable: there is no mechanism for removing proviruses precisely from the genome, without leaving behind a solo LTR or deleting chromosomal DNA. The distribution of an ERV among related species also reflects the age of the provirus: older loci are found among widely divergent species, whereas younger proviruses are limited to more closely related species. <24>
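
The scale of the improbability can be sketched with a back-of-envelope calculation. The Python below assumes, as an idealisation, that every base position is an equally likely integration site and that integrations are independent; the genome size is the "three billion base pairs" quoted earlier. The function names are illustrative.

```python
# Back-of-envelope model: uniform, independent integration (an idealisation).
GENOME_SIZE_BP = 3_000_000_000  # "some three billion base pairs" in humans

def p_hits_any_site(n_existing_sites, genome_size=GENOME_SIZE_BP):
    """Chance that one new integration lands exactly on any of n existing sites."""
    return n_existing_sites / genome_size

def p_all_coincide(k_loci, genome_size=GENOME_SIZE_BP):
    """Chance that k independent integrations each hit one pre-specified site."""
    return (1 / genome_size) ** k_loci

print(p_hits_any_site(1))   # one integration matching one given site
print(p_all_coincide(6))    # six loci all matching independently
```

Under these assumptions a single coincidence is already a one-in-three-billion event, and the probability shrinks geometrically with every additional shared locus. Real integration is not perfectly uniform, so the figures are only indicative.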
The second point is fairly straightforward, as it is similar to the principle underlying other sequence-based phylogenetic analytical methods. As proviral sequences are selectively neutral, they will tend to accumulate mutations at the rate of their occurrence. Two species that have only diverged recently will only differ by a small number of mutations at their common proviral sequence, while those that diverged in the remote past will differ by a larger number of mutations.
The final point - unique to ERVs - is the sequence divergence between the LTRs at either end of the proviral sequence. Of importance here is the fact that as a consequence of reverse transcription, both LTRs will be identical at the time of integration. Johnson and Coffin again:
Furthermore, both clusters are predicted to have similar branching patterns as determined by the phylogenetic history of the host species, with similar branch lengths. Thus, each tree displays two estimates of host phylogeny, both of which are derived from the evolution of an initially identical sequence. As we shall see, deviation of actual trees from this prediction provides a powerful means of testing the assumptions and detecting events other than neutral accumulation of mutations in the evolutionary history of a species. <25>
What did Johnson and Coffin find? Let's look at the distribution of the ERVs analysed. In their words:
Three of the loci, HERV-KC4, HERV-KHML6.17, and RTVL-Ia, were detectable in the genomes of OWMs and hominoids, but not New World monkeys, and therefore integrated into the germ line of a common ancestor of the Old World lineages. HERV-K18, RTVL-Ha, and RTVL-Hb were found exclusively in humans, gorillas, chimpanzees, and bonobos, and thus are consistent with a gorilla/chimpanzee/human clade. None of the loci was detected in New World monkeys. <26>
This data is perfectly explained by common descent. To reiterate, an “ERV locus shared by two or more species is descended from a single integration event and is proof that the species share a common ancestor into whose germ line the original integration took place.” Johnson and Coffin found many loci shared by these primate species: some shared only by humans, chimps, bonobos and gorillas, and some shared only by Old World monkeys and hominoids (humans and apes). This data is consistent with an evolutionary origin of these species, but impossible to explain by special creation without invoking a credulity-stretching concatenation of coincidence.
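
The inference from "which species carry the locus" to "where on the tree the integration happened" can be sketched in a few lines of Python. The nested-tuple tree below encodes the consensus primate branching order; the function names are illustrative, and the sketch ignores complications such as later deletion of a provirus in one lineage.

```python
# Consensus branching order as nested pairs (labels chosen for this example).
PRIMATE_TREE = (
    "new_world_monkeys",
    ("old_world_monkeys",
     ("orangutan",
      ("gorilla",
       ("human", ("chimpanzee", "bonobo"))))),
)

def leaves(node):
    """Return the set of leaf names under a node."""
    if isinstance(node, str):
        return {node}
    left, right = node
    return leaves(left) | leaves(right)

def smallest_clade_containing(node, carriers):
    """Return the leaf set of the smallest subtree that holds every carrier."""
    if not isinstance(node, str):
        for child in node:
            if carriers <= leaves(child):
                return smallest_clade_containing(child, carriers)
    return leaves(node)

def single_integration_explains(carriers, tree=PRIMATE_TREE):
    """True when the carriers are exactly one whole clade, with no losses needed."""
    carriers = set(carriers)
    return smallest_clade_containing(tree, carriers) == carriers

# Distribution reported for HERV-K18, RTVL-Ha and RTVL-Hb in the quote above.
print(single_integration_explains({"human", "gorilla", "chimpanzee", "bonobo"}))
```

A distribution such as "human and gorilla only" would return False, because it would require either two independent integrations at the same site or a loss in the chimpanzee and bonobo lineage.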
Estimates of integration time from data obtained from the 5′ LTR and 3′ LTR sequences were also consistent with predictions from common descent:
To estimate the age of each provirus the human/chimpanzee distances from each tree were used to calibrate the rate of molecular evolution at each locus…The most recent common ancestor of humans and chimpanzees lived approximately 4.5 million years ago…so dividing the distance between the human and chimpanzee sequences (substitutions per site) by this number gives rates ranging from 2.3 to 5.0 × 10⁻⁹ substitutions per site per year. These numbers are similar to the estimated rates of evolution for pseudogenes and noncoding regions of mammalian genes…Applying each rate to the divergence between the 5′ and 3′ LTRs of the same locus gives integration times consistent with estimates based on species distribution. <27>
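
The arithmetic in this passage can be followed with a short Python sketch. It takes the quoted description literally: a rate is obtained by dividing the human/chimpanzee distance by 4.5 million years, and that rate is then applied to the divergence between the two LTRs. The input distances below are hypothetical values chosen to fall inside the quoted rate range; they are not figures from the paper, and how the distances are measured on the tree is not modelled here.

```python
HUMAN_CHIMP_SPLIT_YEARS = 4.5e6  # figure used in the quoted passage

def calibrated_rate(human_chimp_distance, split_years=HUMAN_CHIMP_SPLIT_YEARS):
    """Substitutions per site per year, following the quoted description."""
    return human_chimp_distance / split_years

def integration_age_years(ltr_divergence, rate):
    """Apply a calibrated rate to the 5'/3' LTR divergence of the same locus."""
    return ltr_divergence / rate

# Hypothetical inputs for illustration only.
rate = calibrated_rate(0.0135)            # about 3.0e-9, inside the quoted range
age = integration_age_years(0.06, rate)   # about 20 million years
print(rate, age)
```

Because the two LTRs are identical at the moment of integration, their divergence acts as a clock that starts at integration, which is why this estimate can be checked against the species distribution of the same locus.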
Most of the ERVs analysed produced phylogenetic trees consistent with expectation. Their conclusions:
The HERVs analyzed above include six unlinked loci, representing five unrelated HERV sequence families. Except where noted, these sequences gave trees that were consistent with the well established phylogeny of the old world primates, including OWMs, apes, and humans… Phylogenetic analysis using HERV LTR sequences gives rise to trees with a predictable topology, on which is superimposed the phylogeny of the host taxa, and allows ready detection of conversion events. <28>
In short - we share ERVs with primates at identical places in our genomes which are readily explained by common descent. Furthermore, this is a powerful tool which allows generation of evolutionary family trees. Hardly circumstantial evidence.
The idea that ERVs are ancient, and are proof of common descent, is entirely uncontroversial in the world of molecular biology. Jha <29> states that:
Most endogenous retroviruses (ERV) are millions of years old, and insertions are shared by multiple animal species (Mariani-Costantini et al. 1989; Lander et al. 2001). ERV-K, a transcriptionally active family of endogenous retroviruses, are at least 28 million years old and can be found in the genomes of humans, apes, and Old World monkeys (Costas 2001; Reus et al. 2001).
Herniou et al <30> examined the distribution of ERVs in vertebrates. As early as 1998, the date of publication of their paper, ERVs were identified across six classes. In opening, they observe that:
Vertebrate genomes contain numerous parasitic genetic elements, many of which undergo vertical germ line transmission and are capable of remaining in the same locus for millions of years. <31>
Polavarapu et al., in a letter outlining the discovery of newly identified HERV families, note:
Using a primate pseudogene nucleotide substitution rate of 0.16% divergence/million years, the relative integration time or age of any full-length HERV can be estimated from the level of sequence divergence existing between the element’s 5′ and 3′ LTRs. Using this method, the estimated ages of the new families of HERVs described here range from 18.0 to 49.5 million years, indicating that members of these families have not been transpositionally active in the primate lineage since well before chimpanzees and humans diverged from a common ancestor (6 million years ago) <32>
Barbulescu et al. showed over ten years ago that many proviruses of the HERV-K family (a family present in humans, apes and Old World monkeys) are unique to humans:
Two proviruses, HERV-K105 and HERV-K110/HERV-K18 were detected in both humans and apes (Figures 2 and 3). HERV-K110 was present in humans, chimpanzees, bonobos and gorillas but not in the orangutan (Table 1). Thus, this provirus formed after orangutans diverged from the lineage leading to gorillas, chimpanzees, bonobos and humans, but before the latter species separated from each other. HERV-K105 was detected in humans, chimpanzees and bonobos, but not in gorillas or the orang-utan. The preintegration site, however, could not be detected in gorillas or orang-utans using several different primers based on the human sequences that flank this provirus. It is therefore unclear from this analysis whether this provirus formed after gorillas diverged from the human–chimpanzee–bonobo lineage, or if it formed earlier but was subsequently deleted in one or more lineages leading to modern apes. It is clear that at least one full-length HERV-K provirus in the human genome today has persisted since before humans, chimpanzees, bonobos and gorillas separated during evolution, while at least eight formed after humans diverged from the extant apes. <33>
Belshaw et al, looking at the long-term reinfection of the human genome by ERVs note that:
Within humans, the most recently active ERVs are members of the HERV-K (HML2) family. This family first integrated into the genome of the common ancestor of humans and Old World monkeys at least 30 million years ago, and it contains >12 elements that have integrated since the divergence of humans and chimpanzees, as well as at least two that are polymorphic among humans…This recent activity makes this family ideal for distinguishing between the alternative mechanisms of proliferation. <34>
This may appear a little repetitive, but the point is that the use of ERVs as evidence of common descent is entirely uncontroversial in the scientific world. Far from being “circumstantial evidence” – a term which strongly hints at a sub-par grasp of the issues – they are overwhelming proof of common descent. Again, the burden of proof is on the creationist to show – from the scientific literature – that this is not the case (and one would need to show that the majority of scientists accept that interpretation of the evidence).
Let's get to a fundamental point. An ERV that infected the common ancestor of mammals should be present in the same loci in all mammals, while ERVs that infected a more recent common ancestor (that of the primates) should only be found in primates. Do we see this pattern? We do.
The evolutionary biologist Catherine Dunn, whose published literature includes work on human ERVs <35>, elegantly points this out:
Let's imagine how ERVs would behave within a model of evolution by common descent. An ancient creature, let's call it the common ancestor of all modern mammals, is infected by a retrovirus that becomes endogenous. All of the animal's descendants (i.e. all mammals) would be expected to carry the same ERV insertion (ERV1) in the same chromosomal location.
Fast forward in evolutionary time. Different lineages have evolved and diverged from the original common ancestor and there are now many different types of mammal in existence, all carrying ERV1. A small rodent, let's call it the common ancestor of mice and rats, is again infected by a species-specific retrovirus that becomes endogenous. This is ERV2. In a parallel event in a different lineage, the common ancestor of all great apes acquires a third insertion, ERV3.
Moving forward again, a fourth ERV appears in some of these new-fangled human thingies that are running around in Africa, but not in their hairier relatives who will eventually evolve into modern chimpanzees. The early humans spread out, and a fifth and (don't worry) final ERV arises in a population that is isolated in a discrete geographical location. The infection does not spread to other human populations.
So what would we expect? Humans, chimps, mice and rats should all possess ERV1. The mouse and rat genomes will also contain ERV2, the virus that infected their common ancestor, but not the primate-specific ERV3, 4 or 5 insertions. All great apes will share an identical ERV3 insertion; all humans will also possess an ERV4 insertion that is not found in chimps or other apes. In addition, some, but not all, humans will carry an insertion of ERV5. The rodent-specific ERV2 insertion will not be found in any primate species. <36>
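
Dunn's thought experiment can be run as a tiny simulation. In the Python sketch below, each node of a tree records the ERV (if any) acquired in that ancestor, and every descendant inherits whatever its ancestors carried. The node labels and tuple layout are invented for this example; only the ERV1 to ERV5 scenario comes from the passage above.

```python
# Each node: (label, ERV acquired in this lineage or None, list of child nodes).
SCENARIO = ("mammal_ancestor", "ERV1", [
    ("rodent_ancestor", "ERV2", [
        ("mouse", None, []),
        ("rat", None, []),
    ]),
    ("great_ape_ancestor", "ERV3", [
        ("chimpanzee", None, []),
        ("early_humans", "ERV4", [
            ("isolated_human_population", "ERV5", []),
            ("other_human_populations", None, []),
        ]),
    ]),
])

def inherit(node, inherited=frozenset()):
    """Return {leaf label: set of ERVs carried} by passing insertions down the tree."""
    label, acquired, children = node
    carried = inherited | ({acquired} if acquired else set())
    if not children:
        return {label: set(carried)}
    genomes = {}
    for child in children:
        genomes.update(inherit(child, carried))
    return genomes

genomes = inherit(SCENARIO)
for species, ervs in genomes.items():
    print(species, sorted(ervs))
```

The output reproduces the nested pattern described in the quote: every lineage carries ERV1, only the rodents carry ERV2, and ERV5 appears in just one human population.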
This is precisely what we see – further confirmation of what common descent would predict. There exist ancient retroviral inclusions that are found in orthologous loci in mice and men which were detected when the mouse genome was sequenced. <37>
Mager and Freeman investigated the age and origin of the HERV-H ERV family:
We have isolated a 1.6-kb genomic DNA segment from the New World monkey marmoset that had been PCR amplified using human HERV-H primers. DNA and protein comparisons and database searches indicate that this marmoset clone is more closely related to human HERV-H elements than to any other sequence, indicating that HERV-H-related sequences do exist in New World monkeys. In contrast to the high copy numbers of deleted elements in Old World primates, Southern blot analysis shows that such elements are present in less than 50 copies in two different species of New World monkey. To estimate evolutionary ages of the common deleted form of the element, a selected DNA segment from the pol region was compared from multiple human HERV-H elements. This comparison suggests that many HERV-H elements of the abundant deleted subfamily integrated approximately 30–35 million years ago. Very similar percentage divergence values between 5′ and 3′ long terminal repeats of individual elements of the deleted subfamily also suggest that these elements are close in age. These results indicate that HERV-H elements first appeared in the germline prior to the New World/Old World divergence over 40 million years ago. Interestingly, they remained in low numbers in the New World branch while a subfamily underwent a major amplification in Old World primates before the time of divergence of hominoids from Old World monkeys. <38>
In other words, we have evidence of ERVs found only in primates – what one would expect if the common ancestor of primates was infected by a retrovirus that became fixed in the genome, and was passed down to all descendants.
There exist ERVs that are found in higher primates only, while others are found only in chimps and humans. Another ERV is found only in humans and not in other primates:
For example, the AF001550 LTR of cluster 3 is not present in Old World monkeys but is present in gibbon and all higher primates. In contrast, the AC003023 cluster 8 LTR is found only in chimpanzee and human, indicating a more recent integration (Fig.2). Initial results with primers flanking three of the integrated LTRs of cluster 9 resulted in the expected amplification products in human DNA but not in any of the other primate DNAs (Fig.2). To demonstrate that sequences of cluster 9 were unique to human DNA, primers flanking the other six identified LTRs of this cluster, including the full-length HERV-K10 element, were used in the amplification of primate DNA. Indeed, all were detected only in human DNA (Table 1), indicating that sequences derived from this cluster integrated after the divergence of the human lineage from the great apes.<39>

Approximate integration times of HERV-K elements. Arrows indicate the lineage in which a particular LTR was first detected, and numbers refer to the cluster as identified in Fig. 1. Time estimates for divergence of the different primate lineages were taken from Bailey et al. (From Ref. 39)
The odds of these ERVs integrating in the same place in the genomes of primates purely by chance are negligible. The most parsimonious explanation is infection of a common ancestor, with inheritance of the ERVs by the descendant species.
ERVs have also been found in some but not all human beings, evidence of even more recent retroviral infection and endogenisation. <40> This is precisely what one would expect with common descent. There is no credible creationist explanation for this evidence.
Since ERVs have been found in every vertebrate whose genome has been examined, it is not unreasonable to look for both evidence of active ERV infection and examples of ERV inclusions in other related species. This is in fact what we see. Take the subject of active retroviral infection. The koala genome is currently being colonised by the koala retrovirus (KoRV). Research <41> has shown that:
They show that KoRV is present, at variable copy number, in the germline of all koalas found in Queensland, but that animals from some areas of southern Australia lack the provirus. Most notably, KoRV appears completely absent from koalas on Kangaroo Island off the coast of South Australia. This island was stocked with koalas in the early part of the twentieth century and has remained essentially isolated since then; it appears most likely that the small founding population was entirely free of KoRV. Tarlinton et al suggest that an ongoing process of infection and endogenization is now occurring, spreading from a focus in northern Australia that quite possibly initiated within the last 100 to 200 years. <42>
This, by the way, is of no particular immediate benefit to the koala:
KoRV appears to be associated with the fatal lymphomas that kill many captive animals. It may also be immunosuppressive, thereby contributing to the chlamydial infections that afflict many koalas. <43>
Until recently, endogenous lentiviruses were unknown – this hindered attempts to investigate the origin of the lentiviral group. The discovery of RELIK (rabbit endogenous lentivirus type K) has not only aided this effort, but shown that the RELIK ancestral lentivirus integrated into the germline of the common ancestor of the rabbit lineage more than 7 million years ago:
This study was designed to test the primary prediction of the hypothesis of a 10-My or longer endogenous history of RELIK by answering the question of whether or not RELIK was present in lagomorph species other than Oryctolagus c. cuniculus….
The failure of amplification of RELIK-gag from Ochotona is in accordance with divergence times estimated by Katzourakis et al, which imply that the RELIK insertion into the leporid ancestor must be largely posterior to the Ochotona-Leporidae split (35 My or 40 to 50 My ago, according to molecular or fossil data, respectively). The presumed absence of RELIK-related sequences in pikas was furthermore supported by the fact that intensive screening of the WGS trace archives representing a twofold coverage of the genome of Ochotona princeps (project 19235) did not reveal a single sequence remotely similar to RELIK. Equally negative results were obtained by BLAST searching the WGS archives for horses, cats, or pikas with the entire RELIK sequence (8.5 kb) rather than with GagC (0.7 kb).
In conclusion, the present results provide factual evidence that, as predicted by the phylogenetic inference methods of Katzourakis et al., RELIK was already present in a common ancestor of the Lepus, Sylvilagus, and Oryctolagus and Bunolagus lineages. It opens the door to more in-depth phylogenetic studies of the ancient history of this important viral group. <44>
This is not just a primate phenomenon.



As the literature comprehensively demonstrates, there is overwhelming evidence for common descent from shared ERV inclusions at homologous loci. This does not mean that ERV elements cannot be co-opted by evolution.
As long ago as 1996, virologists were openly speculating about whether ERV elements had acquired any biological significance. Lower et al postulated that:
During evolution, resistance to superinfection by the pathogenic exogenous counterparts may have imparted a survival advantage to the progeny of those individuals in which integration into the germ cell lineage occurred. Such integration would have indirectly helped survival of retroviruses, which by virtue of their endogenous nature are no longer subject to the selective pressure previously exerted on their exogenous strains. Resistance to superinfection in the long term may contribute to the eradication of the exogenous counterparts. <45>
Once HERVs have been integrated, they may have also contributed to the evolution of their hosts. Genomes are not static entities. In phylogeny, genomic changes are a precondition for selection and adaptation. While mutations are slow and therefore unsatisfactory tools for genomic modification, plasticity is more efficiently achieved by rearrangements driven by recombination and transposition.
Reverse transcription may be instrumental in inducing variations, as approximately 10% of the human genome consists of reverse transcribed and transposed sequences. HERVs, together with retroposons and retrotransposons, may be the main source of RT activity. <46>
One thing needs to be pointed out right now. The fact that evolution has co-opted some elements of ERVs does not disprove common descent. Certainly, the scientists who report on the possible function of ERVs do not think this invalidates their use as phylogenetic markers:
HERVs may be regarded as sequences that were accidentally integrated into the genome of Old World progenitors of subhuman primates. They seem to be irrelevant to their hosts, as indicated by their rapid mutation and deletion. As HERVs are fossils and their exogenous counterparts probably have long vanished (or still remain to be detected), it is nearly impossible to trace back their putative former biological functions. <47>
This is a common blunder made by creationists, and demonstrates a fundamental lack of understanding of molecular biology and virology. As mentioned before, the presence of an inactivated, functionless ERV sequence at the same position in the genomes of related species indicates that the common ancestor of these species was infected by a retrovirus that became integrated in the germline, inactivated and then passed on to the descendant species. This does not preclude elements of the ERV from then being co-opted by evolution for other functions.
Evidence for this is not hard to see. Bekpen et al show that the IRGM gene, part of the immunity-related GTPase family and of importance in targeting intracellular pathogens, was inactivated around 40 million years ago after an Alu segment (a short piece of mobile DNA) disrupted the gene in the common ancestor of New World monkeys (NWM), Old World monkeys (OWM), humans and apes, rendering it non-functional. Around 20 million years later, it became functional again in the common ancestor of humans and apes after an ERV element integrated into the genome adjacent to this non-functional segment, in effect acting as its promoter sequence. (A promoter is a section of DNA that allows a gene to be transcribed.) As the authors say:
…the IRGM gene became nonfunctional ~40 million years ago (leading to pseudogene copies in Old World and New World monkeys) but was resurrected, 20 million years ago in the common ancestor of humans and apes… In addition to the genetic and functional data, several lines of evidence support this seemingly unusual scenario. First, we find evidence of a restored ORF in humans and African great apes. Second, this change coincided with the integration of the ERV9 element that serves as the functional promoter for the human IRGM gene. <48>


The structures of the IRGM loci are shown in the context of a generally-accepted primate phylogenetic tree. ORF, ERV9, intronic sequence, Alu sequence, and 5′ untranslated region (UTR) depicted in green, black, white, yellow and blue colors respectively. A red color denotes pseudogenes based on the accumulation of deleterious mutations in the ORF. Shaded orange color indicates an atypical GTPase because of mutations leading to the loss of a canonical GTPase binding motif (see Figure S1). The first ATG codon (green arrow) after the Alu repeat sequence is used as putative start codon for the open reading frame of IRGM. The transcription start site is marked with green flag. FS indicates frameshift mutation. TGA and TAA denote the position of stop codons (arrows). The shaded white, blue and green colors indicate predicted intron, UTR or exon, respectively. The genomic loci are not drawn to scale with the exception of the full-length sequence of IRGM ORF. (From Ref. 48)

Here is a perfect example of how ERVs are overwhelming evidence for common descent, while showing how ERV elements can be co-opted by evolution:

·      In OWM, NWM, ape and human genomes, an Alu insertion rendered the IRGM family non-functional. The IRGM paralogs in mice and other mammals are, however, intact – proof that the Alu insertion happened in the common ancestor of the primates.
·      The NWM and OWM IRGM gene family remains broken – pseudogenised.
·      Apes and humans have a restored IRGM family with an ERV element acting as the necessary spare part for the gene to become active again – consistent with infection in the common ancestor of apes and humans, but after the NWM and OWM lineage had diverged.
·      There is no credible creationist explanation for this, other than “God did it that way” which immediately raises the following questions.
·      Why are apes and humans saddled with an IRGM gene which has a retroviral sequence acting as a promoter, when other species have the equivalent gene with a normal promoter and without the Alu sequence in the middle?
·      Why do Old World and New World monkeys not even have an ERV promoter sequence to resurrect their IRGM gene sequence which in them is a functionless pseudogene?
·      Why are the Alu and ERV9 sequences found in exactly the distribution that would be expected if they invaded the genomes of the ancestor of the primates and ancestor of apes / humans, respectively?
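The distribution described in the list above can be checked mechanically. For insertions that arise once and are then inherited, the sets of species carrying any two insertions should be either nested or disjoint; a partial overlap would conflict with a single branching tree. The sketch below applies that test to the Alu and ERV9 patterns. The taxon labels and the "conflicting" pattern are assumptions invented for the example.

```python
def compatible(carriers_a, carriers_b):
    """Two insertion patterns fit one tree if the carrier sets are
    nested or share no species at all."""
    return (carriers_a <= carriers_b
            or carriers_b <= carriers_a
            or not (carriers_a & carriers_b))


# Patterns paraphrased from the list above (labels are illustrative).
alu = {"New World monkeys", "Old World monkeys", "apes", "humans"}
erv9 = {"apes", "humans"}

# A hypothetical pattern that would NOT fit: shared by mice and humans only.
conflicting = {"mice", "humans"}

print(compatible(alu, erv9))         # ERV9 carriers sit inside the Alu carriers
print(compatible(alu, conflicting))  # partial overlap, so no single tree fits
```

The observed patterns pass the test because the ERV9 carriers are a subset of the Alu carriers, which is what a later insertion on a branch inside the primate tree would produce.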

Once again, the most parsimonious explanation (to be perfectly honest, the only credible explanation) is common descent.
Creationist misunderstanding of ERVs, to the point where research papers that are supportive of evolution are used as proof of creation, is not rare. One example comes from the old-earth creationist (OEC) organisation Reasons to Believe, who misused work <35> by Catherine Dunn to support creationism:
My paper concerns the regulation of a human gene by DNA derived from an endogenous retrovirus (ERV). An ERV is a viral sequence that has become part of the infected animal's genome. Upon entering a cell, a retrovirus copies its RNA genome into DNA, and inserts the DNA copy into one of the host cell's chromosomes. Different retroviruses target different species and types of host cells; the retrovirus only becomes endogenous if it inserts into a cell whose chromosomes will be inherited by the next generation, i.e. an ovum or sperm cell. The offspring of the infected individual will have a copy of the ERV in the same place in the same chromosome in every single one of their cells.
This happens more often than you might think; 8% of the modern human genome is derived from ERVs. Repeated sequences of this kind were formerly considered to be non-functional, or “junk” DNA. However, we're gradually finding more and more examples of viral sequences that appear to have some kind of function in human cells. For example, many ERV sequences play a role in human gene regulation. ERVs contain viral genes, and also sequences - known as promoters - that dictate when those genes should be switched on. When an ERV inserts into the host's chromosome, its promoter can start to interfere with the regulation of any nearby human genes. In the example that I researched, the ERV promoter has become responsible for most of the expression of a particular human gene in the large intestine.
My particular favourite ERV is found in various primate species, and therefore must be at least 25 - 30 million years old. I compared the sequences and activities of the same ERV promoter in the human, chimp, gorilla, and baboon genomes. Despite some minor “single-letter” point mutations caused by DNA copying errors, the promoter had essentially the same function in all four species. I struggle to understand why any kind of designer would decide to use different codes to perform the same function in different species, but there it is. I hypothesised that the ERV was only allowed to persist (that is, its meddling in gene regulation didn't kill the first organism in which it inserted, which was therefore able to pass the insertion on to its offspring) because the incoming ERV promoter behaved in a very similar way to the original host cell's gene promoter. I wasn't able to do the experiments I wanted in order to investigate this point, but another group subsequently did, and their findings supported my hypothesis. That's what happens when you make and test falsifiable predictions. <36>
Dunn objected to the misuse of her work <49> by RTB, who to their credit later removed <50> any reference to her paper on their website. It shows, sadly, that a creationist organisation can so utterly fail to understand a paper that it believes the paper supports creationism when the author (who should be expected to know what the paper states!) clearly states that it in fact supports evolution. The following extract makes this clear:
The mechanism of LTR promoter regulation is of particular interest in the colon, where the majority (74%) of β3Gal-T5 transcripts are driven by the LTR. This is highly unusual compared with other reported LTR-promoted genes such as apolipoprotein C-I, EDNRB, and Mid1, where the LTR contributes a maximum of 15%, 30%, and 38% of total transcripts, respectively, in selected tissues. As an increasingly large number of LTR gene promoters are being identified, it seems clear that LTR elements are not always deleterious to the organism and, in fact, may enhance the range of transcriptional regulatory signals available to a nearby gene. In the case of β3Gal-T5, the ERV-L LTR is deeply fixed in the primate genome as it is found in higher apes and Old World monkeys (unpublished observations). Thus, this element has been retained over millions of years and now plays a major role in expression of β3Gal-T5, particularly in the large intestine. This is an intriguing example of adoption of an ancient retro-element for usage by the host. <51>
That a creationist organisation could not see that the passage quoted above strongly supports evolution does not inspire confidence in their competence to handle the literature.
ERV promoters have been co-opted many times – this is hardly a rarity and once again shows that evolution quite often works as a ‘tinkerer’, adapting anything at hand irrespective of how elegant that solution is.
Recently, Conley et al looked at retroviral promoters in the human genome:
One way that ERVs have affected the function and evolution of the human genome is by donating regulatory sequences that control the expression of nearby genes. The gene regulatory effects of ERVs were first uncovered in a number of anecdotal studies on specific genes (reviewed in Bannert et al., 2004; Medstrand et al., 2005). For instance, the long terminal repeat (LTR) of a human ERV (HERV-E) was shown to serve as an enhancer element that confers parotid-specific expression on the amylase gene (Samuelson et al., 1990).
Later, more systematic computational analyses of the human genome sequence revealed that many human genes contained ERV-derived regulatory regions, suggesting an even greater contribution of retroviruses to human gene regulation (Jordan et al., 2003; van de Lagemaat et al., 2003). Continued efforts to characterize ERV-derived promoters have turned up several new cases in recent years (Dunn et al., 2003, 2006; Romanish et al., 2007). Nevertheless, the full extent of the contribution of ERV sequences to the initiation of transcription in the human genome has yet to be appreciated.
Initiation of transcription by ERV promoters often results in the production of alternative transcripts that are both tissue-specific and lineage-specific. For instance, testis-specific expression of the human gene encoding the neuronal apoptosis inhibitory protein (NAIP) is driven by an LTR promoter sequence, whereas a distinct LTR promoter in rodents confers constitutive expression of the orthologous gene (Romanish et al., 2007). An ERV LTR sequence also serves as an alternative promoter that drives expression of the beta1,3-galactosyltransferase five gene specifically in colorectal tissue (Dunn et al., 2003). <52>
So, we have examples where ERV elements act as promoters (DNA segments that facilitate gene transcription) – evidence that ERV elements can be co-opted by evolution. Conley et al examined the human genome – they found that:
Our analysis revealed that retroviral sequences in the human genome encode tens-of-thousands of active promoters; transcribed ERV sequences correspond to 1.16% of the human genome sequence and PET tags that capture transcripts initiated from ERVs cover 22.4% of the genome. These data suggest that ERVs may regulate human transcription on a large scale. However, it is a formal possibility that many of the ERV derived promoters identified here represent leaky transcription, i.e. noise, which is not functionally significant. Definitive proof of biological activity for individual ERV-TSS may have to await experimental confirmation via knock-out data or promoter swapping…
Our analysis uncovered more than 100 cases of novel ERV-derived promoters that initiate chimeric ERV-human gene transcripts and several thousand more that are likely to do so. ERV-derived promoters are characterized by their ability to promote alternative transcripts that are expressed in a way that is tissue-specific, lineage-specific and distinct from related paralogous genes. These data underscore the extent to which retrovirus activity has shaped the human transcriptome. <53>
In short, we have promoter sequences that clearly arose from ERV elements that initiate ERV-human gene transcripts. As the authors point out in the conclusion, this shows how much retroviral activity has shaped our evolution. It poses not a few difficulties for creationists. These elements are clearly viral-derived (remember gag, pol, env and LTR?) and not human. They are evidence of retroviral integration into the germline. Yet, they have been co-opted. Why, given that other genes function happily with normal promoters, do these genes have ERV elements acting as promoters? Again, evolution provides the most parsimonious answer.
Evidence exists that some ERV elements have been co-opted to protect against retroviral infection. Arnaud et al note:
The function of endogenous retroviruses is not completely clear, but some ERVs can block the replication cycle of horizontally transmitted “exogenous” pathogenic retroviruses. These observations lead to the hypothesis that ERVs have protected the host during evolution against incoming pathogenic retroviruses. Here, by characterizing the evolutionary history and molecular virology of a particular group of endogenous betaretroviruses of sheep (enJSRVs) we show a fascinating series of events unveiling the endless struggle between host and retroviruses. In particular, we discovered that: (i) two enJSRV loci that entered the host genome before speciation within the genus Ovis (~ 3 million y ago) acquired, after their integration, a mutated defective viral protein capable of blocking exogenous related retroviruses; (ii) both these transdominant enJSRV loci became fixed in the host genome before or around sheep domestication (~ 10,000 y ago); (iii) the invasion of the sheep genome by ERVs of the JSRV/enJSRVs group is still in progress; and (iv) new viruses have recently emerged (less than 200 y ago) that can escape the transdominant enJSRV loci. This study strongly suggests that endogenization and selection of ERVs acting as restriction factors is a mechanism used by the host to fight retroviral infections. <54>
The summary speaks for itself, but any creationist who argues that this is evidence of design needs to read the entire paper and note that the authors show that the enJSRV loci entered the sheep genome before speciation around 3 million years ago, and only afterwards acquired a mutation which allowed them to block related retroviral infection. The paper itself strongly supports evolution, so if one cites this paper as evidence that ERVs are divinely inserted to protect against viral infection, one is obliged to accept that these ERVs are proof of common descent, which completely destroys the creationist argument.
There is also the significant problem of God inserting an ERV into many sheep species in order to protect them against infection from retroviruses that He would have created. One wonders why such a mechanism was not created to protect all life against all retroviral infection.



There is no doubt that viruses – including retroviruses – are linked with disease. Up to 20% of all cancers are causally linked to viruses. For example, adult T cell leukaemia and hairy cell leukaemia are linked to the retroviruses HTLV-I and HTLV-II. HIV-1 and HIV-2 are associated with various lymphomas as well as Kaposi's sarcoma. <55>
Human ERVs have also been linked with other cancers such as small cell lung cancer, seminomas, testicular teratocarcinomas and various leukaemias. There is also a link with autoimmune diseases such as systemic lupus erythematosus, primary biliary cirrhosis, systemic sclerosis and Sjogren's syndrome. While it is likely that there may be a causal link, one needs to bear in mind the possibility that the ERV elements found in the tissues of people with these diseases may be released as a result of the malignant or inflammatory state. <56>
Nelson et al in their review article <57> on human ERVs likewise looked at the possible benefits and pathogenic effects of these elements. The possible methods by which carcinogenesis may be initiated or aided by HERVs are:
...by virtue of the expression of HERV mRNA, functional proteins, or retroviral-like particles. They may also be associated with the generation of new promoters or the activation of proto-oncogenes. The expression of HERV-R mRNA is increased in some cases of small cell lung carcinoma. In addition, a teratocarcinoma cell line has been shown to possess a HERV-K sequence and to secrete retroviral-like particles. Testicular germ cell tumours (TGCTs) have been shown to contain proteins of the HERV-K family and patients with TGCT often exhibit a specific immune response to gag and env proteins. It has been suggested that HERV-K may be important in the progression of TGCT through inhibition of an effective immune response, and the HERV env genes have been shown to encode immunosuppressive proteins. It is clear that overexpressed HERV proteins can elicit high titre IgG responses in some settings (for example, HERV-K10 in patients with renal cancer), as detected by the SEREX method (serological identification of expressed genes), suggesting that HERV proteins may in the future provide targets for antitumour immunotherapy. <58>
One of the leading causes of cancer death in women is breast cancer. There is evidence – as I mentioned earlier – that HERVs may be implicated. The authors note that:
HERV-K might be important in the pathogenesis of human breast cancer. It has been shown that the T47D human mammary carcinoma cell line produces retroviral particles with reverse transcriptase activity. Both the HERV-K10 related sequences of T47D cells and the reverse transcriptase activity are increased by steroid hormone treatment, which is thought to be the result of transcriptional activation via binding of the progesterone receptor to regions on the HERV-K genome that correspond to progesterone and glucocorticoid response elements. <59>
Human choriocarcinoma is an aggressive malignancy mainly of the placenta (and occasionally the testicle). HERVs have also been suspected in its aetiology:
In choriocarcinoma, it has been shown that a HERV type C is inserted into the human growth factor gene, pleiotrophin (PTN). This results in the generation of a novel tissue specific promoter, which results in the expression of HERV–PTN fusion transcripts, leading to the production of biologically active PTN protein. Expression of the PTN protein (which is normally expressed only at very low amounts in a few normal adult tissues) appears to be responsible for the aggressive and invasive growth of human choriocarcinoma. <60>
If these elements were deliberately inserted in the genome by a designer, then given the evidence suggesting a link between disease and ERV elements, one would be entitled to ask whether that design was entirely rational.
An article in Retrovirology <61> shows what the current opinion is on the role of ERVs in human health and disease. Thierry Heidmann, a virologist at the Université Paris-Sud and the Institut Gustave Roussy in Villejuif, Paris, and the 2009 Retrovirology prize winner, puts this clearly in an interview:
Your question could even be re-formulated in a more general way as follows: are mobile elements negative or positive? And the answer is they are both! Being insertional mutagens, mobile elements (and these include prokaryotic elements, retrotransposons, ERVs, etc.) are positive at the level of evolution, by generating diversity. In this respect it is remarkable that mobile elements are in general strongly repressed in the somatic cells, but repression is released to some extent in the germline, and this is true from Drosophila to mammals. And the germline is the right place for mutations to occur and generate variant offspring, which then will be subjected to Darwinian selection. But clearly mutations by insertion can be deleterious, and the best example is related to the insertional mutagenesis produced at the somatic cell level by simple oncogenic retroviruses, which can trigger tumors just in this way. And in Drosophila, the I retrotransposon can even induce embryonic lethality by excess retrotransposition. But of course mutations by insertion are not the sole possible effects of RVs and ERVs that indeed encode viral proteins which per se can have biological effects. These can be positive - the syncytin case - and they may be negative – the tumor case, via inhibition of immune surveillance. <62>
Any creationist who seizes on evidence for ERV element function (co-option in placentogenesis, alternative promoters for genes or a possible role in blocking related exogenous retroviral infection) will need to explain why this design feature is also linked with cancer and autoimmune disease. Certainly, any drug marketed that caused cancer as a side-effect would be swiftly removed from the market. If ERVs are specially inserted for any positive effect, they would appear to be of limited benefit when their potential cancer-causing properties are factored in. Competent design is not the phrase I would use to describe this.
Furthermore, some of these benefits are directly related to evolution - mobile genetic elements including ERVs generate diversity which drives evolution. As well as that:
It became rapidly clear that these retroviral envelopes have been co-opted by their host - more than 25 My ago - for a physiological function in relation with the formation of the syncytiotrophoblast layer at the materno-fetal interface. Our laboratory has been involved in the characterization of syncytin-2, the oldest syncytin found in all simians, with the identification of its cognate receptor and evidence for its possible involvement in the «in-fusion» of the mononucleated cytotrophoblasts into the syncytiotrophoblast. <63>
An ERV element was captured over 25 million years ago, and tamed by the body - it is now of critical importance for placental development. If one uses this as proof that ERVs were divinely inserted, one is in effect endorsing a form of evolution. One cannot have it both ways - claiming the benefit without the evolutionary reasoning behind it.



It is not hyperbole to say that if the fossil record did not exist, common descent would still be convincingly demonstrated by the evidence of ERV inclusions and other retro-elements at orthologous loci in various species, since the only plausible explanation is the infection of an ancestral species with a retrovirus that became fixed in the genome and was later inherited by descendant species.
Adding further weight to this is the fact that closely related species have ERV inclusions that differ by only a few mutations, while more distantly related species differ by more mutations. A family tree constructed using this data will almost always agree with the family tree derived from morphology – this consonance is cogently explained only by common descent. Shared errors and evidence of past infection are not proof of design.
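The idea of counting mutational differences can be illustrated with a minimal sketch. The aligned fragments below are invented purely for the example (they are not real ERV sequences), and the simple mismatch count stands in for the more sophisticated distance measures used in actual phylogenetic work.

```python
from itertools import combinations

# Hypothetical aligned ERV fragments, invented for illustration only.
SEQS = {
    "human":      "ACGTTGCAAGCTTAGC",
    "chimpanzee": "ACGTTGCAAGCTTAGT",
    "gorilla":    "ACGTTGCTAGCTTAGT",
    "baboon":     "ACCTTGCTAGCATGGT",
}


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# Print every pairwise distance, then rank the other species by
# their distance from the human copy.
for s1, s2 in combinations(SEQS, 2):
    print(f"{s1:>10} vs {s2:<10} {hamming(SEQS[s1], SEQS[s2])}")

ranking = sorted((sp for sp in SEQS if sp != "human"),
                 key=lambda sp: hamming(SEQS["human"], SEQS[sp]))
print(ranking)
```

With these toy data the species rank by distance from the human copy in the same order as the accepted primate tree, which is the kind of agreement the paragraph above describes.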
Given this powerful evidence in favour of common descent, creationists have attempted to rebut it with a number of arguments, none of which have even remotely interested mainstream science. The common arguments and their refutations are listed below.



The reasoning employed here runs as follows: common descent predicts that all the species descended from an ancestor that acquired an ERV should share that ERV at the same point. If one of these species is missing that ERV, then common descent is invalidated.
One example is that of a HERV-K ERV <64> that is found in chimpanzees, bonobos and gorillas but not humans, which is not entirely consistent with predictions of common descent. One should not overstate the case since this ERV is found at the orthologous location in the other great apes, consistent with them having a common ancestor. However, irrespective of creationist abuse of this paper, scientific integrity alone demands that we find an answer.
One possibility is that the relevant area of the human genome may have suffered a large deletion, taking out the ERV. A good analogy is finding that a book in which we expect a spelling error is missing the entire page on which the error should appear. We know that is not the case since “humans contain an intact preintegration site at this locus.” <65> To continue the analogy, the page is present, but no spelling error exists.
Another possibility is that the proviral segment itself was deleted, but it “is highly unlikely that the provirus was deleted in humans, as the retroviral integration process is irreversible.” <66> What other possibilities exist? Gene conversion or an unequal crossover event could readily produce what we see. The authors regard this as a definite possibility:
Another possibility was that the provirus was replaced in the human lineage by a gene conversion or unequal crossover event. In particular, the preintegration site may have been duplicated either in tandem or at another position within the genome of the common ancestor of Homo, Pan, and Gorilla. A recombination event in the genomes of gorillas, bonobos, and common chimpanzees involving the duplicated locus could then have replaced the 9.5 kb provirus in humans with a sequence similar to the preintegration site. In this regard, analysis of the human sequence flanking the HERV-K-GC1 integration site in Pan and Gorilla indicated that the ape provirus lies within an older L1 retrotransposon and that several L1 elements and an Alu element lie within a 5 kb stretch flanking the insertion site of the provirus. This particularly raised the possibility that gene conversion from an L1 element at a nonorthologous position might have replaced the provirus in the human lineage. <67>
A brief word on gene conversion for those unfamiliar with basic genetics will be useful at this point. Gene conversion occurs when genetic information is copied from one DNA sequence to a homologous one, overwriting what was there. In this case, if one allele has a proviral sequence while the opposite allele is normal, gene conversion could ensure that the normal allele is used to ‘write over’ the proviral allele.
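As a toy illustration of this 'write over' idea, the sketch below treats each allele as a string and copies the tract lying between two shared flanking sequences from the donor allele into the recipient. The sequences, the flank strings and the function name are all invented for the example, and real gene conversion is a far messier biochemical process than a string replacement.

```python
def gene_conversion(recipient, donor, left_flank, right_flank):
    """Overwrite whatever lies between the flanks in the recipient with the
    corresponding tract from the donor (a one-way, non-reciprocal copy)."""
    def tract_bounds(seq):
        start = seq.index(left_flank) + len(left_flank)
        end = seq.index(right_flank, start)
        return start, end

    r_start, r_end = tract_bounds(recipient)
    d_start, d_end = tract_bounds(donor)
    return recipient[:r_start] + donor[d_start:d_end] + recipient[r_end:]


# Hypothetical alleles: lowercase letters are host flanking DNA,
# the uppercase block stands in for an integrated provirus.
LEFT, RIGHT = "aacctg", "ttgacc"
normal_allele = LEFT + RIGHT
proviral_allele = LEFT + "LTR-gag-pol-env-LTR" + RIGHT

converted = gene_conversion(proviral_allele, normal_allele, LEFT, RIGHT)
print(converted == normal_allele)  # the provirus has been written over
```

Note that the donor allele is left unchanged; only the recipient is rewritten, which is what distinguishes gene conversion from a reciprocal exchange.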


Careful analysis of the relevant genomes, however, ruled out this possibility as an explanation for the missing HERV-K proviral sequence in humans. As the authors put it:
The data are consistent with the conclusion that these genera lack an appropriate locus for a putative gene conversion event that could have eliminated the provirus within the human lineage. <68>
That however does not exclude the possibility of a gene conversion event for technical reasons:
We also considered the possibility that a putative recombination event involved a duplication of a sequence flanking the provirus insertion site that was too short to be detected with the PCR primers used. <69>
The authors outline a theoretical means by which gene conversion could have removed the HERV provirus:

·      The pre-integration locus underwent a duplication event in the common ancestor of humans, chimps and gorillas
·      The proviral sequence was then formed by retroviral infection of one of the two copies of the locus in this common ancestor
·      The gorilla ancestor diverged from the lineage of the human / chimp common ancestor
·      The human and chimp lineages diverged
·      Two independent recombination events occurred in the gorilla and chimp lineages to eliminate the proviral-free locus.
·      In the human lineage, recombination reversed the original locus duplication, giving rise to the current situation where the human locus does not carry the proviral sequence <70>

This is theoretically possible, and would explain what we currently see, but the authors point out that there is a far simpler way to explain the absence of this HERV provirus in the human genome, using an allelic segregation model.

·      The provirus was inserted just prior to the separation of the gorilla lineage
·      The provirus allele was fixed in the gorilla lineage
·      Both proviral and pre-integration site alleles remained in the common ancestor of chimps and humans until the lineages diverged
·      The proviral allele was fixed in the chimp lineage, while the pre-integration allele was fixed in the human lineage <71>

As the authors point out, this involves three fewer recombination events – and the principle of parsimony would favour such an explanation.
…the presence of HERV-K-GC1 in gorillas and chimpanzees, but not humans, is best explained by the maintenance of the preintegration site in the human lineage since before the time when the provirus formed in the common ancestor of chimpanzees and gorillas. This leads to the conclusion that, for some fraction of the genome, the gorilla and chimpanzee genomes are more closely related to each other than either is to humans. <72>
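The parsimony comparison can be made explicit by tallying the events each model requires. The following is a minimal sketch that simply encodes the two bullet lists above as data; the category labels ("duplication", "recombination" and so on) are my own shorthand rather than the authors' terminology, and only the events named in the lists are counted.

```python
# Events in each model, transcribed from the two bullet lists above.
# The category labels are illustrative shorthand, not the authors' terms.
GENE_CONVERSION_MODEL = [
    ("duplication", "pre-integration locus duplicated in the common ancestor"),
    ("insertion", "provirus integrates into one of the two copies"),
    ("recombination", "provirus-free copy eliminated in the gorilla lineage"),
    ("recombination", "provirus-free copy eliminated in the chimp lineage"),
    ("recombination", "duplication reversed in the human lineage"),
]
ALLELIC_SEGREGATION_MODEL = [
    ("insertion", "provirus integrates just before the gorilla lineage separates"),
    ("fixation", "proviral allele fixed in the gorilla lineage"),
    ("fixation", "proviral allele fixed in the chimp lineage"),
    ("fixation", "pre-integration allele fixed in the human lineage"),
]


def count_events(model, kind):
    """Number of events of the given kind that a model requires."""
    return sum(1 for event_kind, _ in model if event_kind == kind)


extra_recombinations = (
    count_events(GENE_CONVERSION_MODEL, "recombination")
    - count_events(ALLELIC_SEGREGATION_MODEL, "recombination")
)
print(extra_recombinations)  # 3
```

The allelic segregation model needs no duplication and no recombination at all; it relies only on one insertion followed by the ordinary sorting of alleles in each lineage.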
Far from being an insoluble problem that invalidates the use of ERVs at orthologous loci to show common descent, the absence of a proviral sequence at the human locus orthologous to that of gorillas and chimps can, as the authors show, be easily explained. In fact, the authors not only provide a ready explanation for this event, but point out that it shows the value of HERV-K for studying human evolution:
The significance of the work presented here is the demonstration of the utility of HERV-K as a marker for studying human evolution, the conclusion that HERV-K was active at about the time that the three lineages were evolutionarily separating, and the very strong experimental evidence that, in some fraction of the genome, chimpanzees, bonobos, and gorillas are more closely related to each other than any of them is to humans. HERV-K and other retrotransposable elements should contribute to determining what that fraction is. <73>
Cell biologist, cancer researcher, and devout Christian Graeme Finlay comments on this same event, noting that the anomalous trees are readily explained by incomplete lineage sorting:
But if speciation occurs rapidly relative to the time required by an ERV to become fixed, then a parental species may diverge into two (or more) new species at a time when copies of the ERV-containing chromosome constitute only a fraction of the total number of copies of that chromosome. If speciation occurs when an ERV is unfixed (such that the chromosomes with and without the insertion co-exist), then the ERV can be randomly lost or fixed in each diverging lineage. This is known as incomplete lineage sorting, and may produce anomalous trees.
The finding with ERV-K-GC1 indicates that this particular insertion event occurred near the time when the human, chimp and gorilla lineages were branching from the ancestral population…In this situation, both ERV-integrated and pre-integrated alleles were present as the ancestral population diverged. The integrated allele was lost from the human lineage, but independently fixed in the chimp and gorilla lineages. These data suggest that the gorilla, chimp and human lineages diverged closely in time. This conclusion is confirmed by incomplete lineage sorting of other genetic markers in the African great ape genomes (see later). And the availability of the gorilla genome sequence in 2012 established the reality of incomplete lineage sorting in the African great apes on a genome-wide basis. Thus the ERV-K-GC1 insert breaks the expected pattern in a way that provides further insights to our evolutionary history. Incomplete lineage sorting is not seen at most branching points of our primate history, indicating that the gorilla–chimp–human branching point was an unusually close near-trifurcation. <74>
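Incomplete lineage sorting is easy to reproduce in a toy simulation. The sketch below is my own illustration, not anything taken from Finlay or Barbulescu et al: it uses a simple Wright-Fisher model of random drift, and the population size, starting allele frequency and number of generations before the chimp-human split are all arbitrary assumptions chosen to keep the run short.

```python
import random
from collections import Counter


def next_generation(freq, copies, rng):
    """One Wright-Fisher generation: resample `copies` chromosomes at random."""
    carriers = sum(1 for _ in range(copies) if rng.random() < freq)
    return carriers / copies


def drift_until_sorted(freq, copies, rng):
    """Drift until the proviral allele is fixed (True) or lost (False)."""
    while 0.0 < freq < 1.0:
        freq = next_generation(freq, copies, rng)
    return freq == 1.0


def simulate(start_freq=0.5, copies=40, gens_before_split=5,
             replicates=300, seed=1):
    """Count (gorilla, chimp, human) outcomes; True means provirus fixed."""
    rng = random.Random(seed)
    outcomes = Counter()
    for _ in range(replicates):
        # The gorilla lineage splits off while the provirus is still unfixed.
        gorilla = drift_until_sorted(start_freq, copies, rng)
        # The chimp-human ancestor drifts briefly before it too splits.
        shared = start_freq
        for _ in range(gens_before_split):
            shared = next_generation(shared, copies, rng)
        chimp = drift_until_sorted(shared, copies, rng)
        human = drift_until_sorted(shared, copies, rng)
        outcomes[(gorilla, chimp, human)] += 1
    return outcomes


patterns = simulate()
# Replicates matching the HERV-K-GC1 pattern: present in gorilla and chimp,
# absent in human.
print(patterns[(True, True, False)], "of", sum(patterns.values()))
```

Under these assumptions a fraction of the replicates end with the provirus fixed in the gorilla and chimp lineages and lost in the human lineage, with no recombination or deletion required.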


There are two problems with this reasoning. The first is that creationists confuse ERV elements such as gag or env with an intact and functioning ERV that is capable of reinfecting the genome it has parasitised. Retroviruses include examples such as HTLV-1, which can cause T-cell lymphoma and T-cell leukaemia, while HIV-1 and HIV-2 can cause AIDS. If a creationist cannot even tell the difference between an ERV element such as env that has been co-opted by the genome for another function and an active ERV, then their interpretation of the paper can be safely dismissed, since they clearly do not understand the subject. (This is why anyone who opposes the consensus view on a subject needs to know that subject intimately, at the level of actively participating in the science by publishing and reviewing papers. Otherwise, no informed critic will take that person seriously.)
The second problem – again – is that the presence of a functioning ERV element does not preclude the use of that element as a phylogenetic marker. For example, if an ERV integrates near a gene, it is entirely possible that the gene will adopt a portion of that ERV as its promoter. If the organism in which this event occurs later undergoes speciation, then the species that share that organism as a common ancestor will carry a functional ERV element at the same location. This happens more often than one would imagine, and in no way invalidates common descent. Creationists fail both to tell the difference between ERV elements and functional ERVs and to recognise that ERV elements co-opted by the genome for another function can still act as phylogenetic markers, which again shows that their claims on ERVs should always be regarded with the highest suspicion.
A recent example is the creationist misuse of the paper by Conley et al that I cited earlier to show how the genome can co-opt elements of an ERV to function as promoters. I would refer you to the earlier discussion for details on how ERV elements have been co-opted as promoter elements. However, there are a few further general questions which need to be asked of any creationist claiming that a paper supports ID / creationism.
The first thing that any creationist citing a paper as proof for their view is obliged to show is: do the authors believe that their paper makes those claims? If the authors do not, then the creationist is either uninformed on the issue and has no credibility, or is being dishonest and can no longer be trusted.
Again, the example of Catherine Dunn’s work should be noted – creationists often cite papers as proof against evolution when the authors state that they in fact do the opposite. The opening paragraph of Conley et al is informative:
Approximately 5% of the human genome sequence is derived from retroviruses (Lander et al., 2001). Retroviral genomic sequences are remnants of past infections that resulted in the integration of provirus genomes into the DNA of germline cells (Bock et al., 2000; Bromham, 2002). The abundance of these so-called endogenous retrovirus sequences (ERVs) testifies to the extent that human evolution has been shaped by successive waves of viral invasion (Sverdlov, 2000). <75>
Note the closing line – human evolution has been shaped by successive waves of viral invasion. The authors clearly have not abandoned evolution as a result of their paper. Rather, they have pointed out what is common knowledge in virology, that our evolution owes much to viruses – after all, there is more ERV-related material in our genome than coding DNA. Conley et al continue:
One way that ERVs have affected the function and evolution of the human genome is by donating regulatory sequences that control the expression of nearby genes. The gene regulatory effects of ERVs were first uncovered in a number of anecdotal studies on specific genes (reviewed in Bannert et al., 2004; Medstrand et al., 2005). For instance, the long terminal repeat (LTR) of a human ERV (HERV-E) was shown to serve as an enhancer element that confers parotid-specific expression on the amylase gene (Samuelson et al., 1990). <76>
Samuelson’s paper is worth examining just to show that we have known ERV elements have been co-opted by evolution for another function for some time, and that creationists are not a little late in trumpeting this fact:
The human genome contains several thousand endogenous retroviruses. The three retroviruses in the amylase gene cluster appear to be members of the 4-1 family, which contains approximately 50 members and is related to the baboon endogenous virus and the Moloney murine leukemia virus. Retroviral insertion into the amylase-associated gamma-actin pseudogene occurred approximately 40 million years ago. Transcription of the AMY1 genes is initiated within the gamma-actin pseudogene at a position only 250 bp downstream of the retroviral LTR…We have not detected any transcripts originating from this position of the AMY2B gene, which lacks the retroviral insert. It thus appears that insertion of the retrovirus resulted in activation of a cryptic promoter within the gamma-actin pseudogene. It is interesting that three independent insertions of retroviral elements into mouse gamma-actin pseudogenes have also been reported. <77>
Samuelson’s conclusion once again echoes a repeating theme of how evolution can be driven by DNA insertion, deletion and duplication:
Analysis of the 5'-flanking regions of the human amylase genes has revealed a series of molecular events during the evolution of this gene cluster. The results demonstrate the contributions of DNA insertions, deletions, and duplications to rapid molecular change in mammalian evolution. <78>
Conley again:
Later, more systematic computational analyses of the human genome sequence revealed that many human genes contained ERV-derived regulatory regions, suggesting an even greater contribution of retroviruses to human gene regulation (Jordan et al., 2003; van de Lagemaat et al., 2003). Continued efforts to characterize ERV-derived promoters have turned up several new cases in recent years (Dunn et al., 2003, 2006; Romanish et al., 2007). Nevertheless, the full extent of the contribution of ERV sequences to the initiation of transcription in the human genome has yet to be appreciated. <79>
As the authors point out, we have known that ERV elements can act as promoters for 20 years. None of this is remotely new to anyone working in virology or molecular biology. ERV elements can serve as phylogenetic markers as well as act as promoter sequences. Of course, if one wants to further investigate the scope of ERV element usage in the genome, the revolution in genomics over the last 20 years now allows researchers to do this with more speed and efficiency:
The application of novel high-throughput techniques for the analysis of gene expression has revolutionized the study of the human transcriptome and revealed far more regulatory complexity than previously imagined… We used human CAGE and PET data to more thoroughly evaluate the contribution of ERVs to the initiation of transcription in the human genome. <80>
To recap, this is what their research has shown:
Our analysis uncovered more than 100 cases of novel ERV-derived promoters that initiate chimeric ERV-human gene transcripts and several thousand more that are likely to do so. ERV-derived promoters are characterized by their ability to promote alternative transcripts that are expressed in a way that is tissue-specific, lineage-specific and distinct from related paralogous genes. These data underscore the extent to which retrovirus activity has shaped the human transcriptome. <81>
Nothing in there that remotely threatens evolution. Once again, one needs to ask the authors whether their paper provides support for creationism or refutes common descent. This paper does neither:
The lineage-specific regulatory effects of ERV promoters can be attributed to the fact that ERV sequences result from past germline infections, many of which occurred relatively recently along specific evolutionary lineages. In fact, most of the ERV sequences in the human genome are primate-specific (Sverdlov, 2000), while most human genes are far more ancient and share orthologs with distantly related species (Lander et al., 2001). This means that regulatory effects exerted by ERV promoters will often lead to expression differences between primate and non-primate orthologs or between deeper evolutionary lineages for more ancient ERVs. In other words, ERV promoters are likely to drive evolutionary changes in gene expression, long thought to be an important determinant of species divergence (King et al., 1975). <82> emphasis mine
For a creationist to use this paper as support for his position shows that he simply has not read or understood the paper. The authors clearly point out how ERV evidence not only supports common descent, but is an important driver of speciation. Once again, we have an example of creationist misuse of work which not only fails to support creationism but is further evidence in the formidable battery of evidence supporting evolution.


The evidence from ERV inclusions at orthologous loci in related species is, as I have pointed out, in its own right overwhelming evidence in favour of common descent. The logic behind this is irrefutable:

·      These are clearly alien to the body – genetic analysis shows that these sequences are retroviral in origin
·      They are found in exactly the same place in related species
·      Analysis of the mutations accumulated by these retroviral inclusions shows that closely related species differ by only a small number of mutations. More distantly related species differ by a larger number of mutations.
·      A family tree constructed from this distance data agrees with the family tree constructed from morphology. There is no reason to expect this from special creation. Common descent predicts this.
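To make the last two points concrete, here is a minimal sketch of how a tree can be built from pairwise mutation counts using average-linkage clustering (UPGMA). The distance values are invented purely for illustration and are not taken from any of the papers cited; real analyses use proper substitution models and far more data.

```python
from itertools import combinations

# Hypothetical pairwise mutation counts at one shared ERV locus.
# These numbers are made up for illustration only.
DISTANCES = {
    frozenset({"human", "chimp"}): 2,
    frozenset({"human", "gorilla"}): 3,
    frozenset({"chimp", "gorilla"}): 3,
    frozenset({"human", "orangutan"}): 6,
    frozenset({"chimp", "orangutan"}): 6,
    frozenset({"gorilla", "orangutan"}): 6,
}


def cluster_distance(members_a, members_b, dist):
    """Average distance over all leaf pairs between two clusters."""
    pairs = [(a, b) for a in members_a for b in members_b]
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)


def upgma(dist):
    """Return a nested-tuple tree by repeatedly joining the closest clusters."""
    leaves = sorted({name for pair in dist for name in pair})
    clusters = [((leaf,), leaf) for leaf in leaves]  # (members, subtree)
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(
                clusters[ij[0]][0], clusters[ij[1]][0], dist
            ),
        )
        (members_i, tree_i), (members_j, tree_j) = clusters[i], clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((members_i + members_j, (tree_i, tree_j)))
    return clusters[0][1]


tree = upgma(DISTANCES)
print(tree)  # ('orangutan', ('gorilla', ('chimp', 'human')))
```

With these made-up counts the clustering recovers the same branching order as the tree built from morphology, which is the agreement that common descent predicts.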

Knowing the power of this argument, creationists have tried to prove that retroviral insertion is not random. In other words, there are ‘hot spots’ where viruses will integrate, meaning that retroviruses are going to integrate preferentially in only a fixed number of locations, making it quite likely that they will integrate in related species in such a way as to simulate the predictions of common descent. This too is based on a rudimentary understanding (at best!) of molecular biology. Let’s look at some papers creationists cite in order to bolster this assertion.
The first <83> is by Mitchell et al: “Retroviral DNA Integration: ASLV, HIV, and MLV Show Distinct Target Site Preferences”. Again, this is another example of the creationist tactic of combing the literature for papers with titles that appear to support their assertions.
Even a brief reading of the paper shows once again that the creationists have failed to understand the paper, which shows that the retroviruses mentioned have particular preferences for insertion at genes – some will insert near promoters for example. What it does not show is a preference for integration at only a subset of the total number of genes, which is what the creationists mistakenly claim. The specific insertion sites are random. As Mitchell et al show:
We report that ASLV, MLV, and HIV have quite different preferences for integration sites in the human chromosomes. HIV strongly favors active genes in primary cells as well as in transformed cell lines. MLV favors integration near transcription start regions and favors active genes only weakly. ASLV shows the weakest bias toward integration in active genes and no favoring of integration near transcription start sites. We expect that these same patterns will be seen for MLV and ASLV integration in different human cell types, because all four HIV datasets yielded similar results, though more data on additional cell and tissue types will be helpful to further evaluate the generality. <84>
In other words, HIV likes to insert in active genes, MLV will integrate near the transcription start regions while ASLV has only a weak preference for integrating near active genes. However, which genes these retroviruses will integrate near is entirely random – this negates the creationist assertion that these retroviruses insert preferentially at certain genes. There is a difference between preferring to insert at a particular site of a gene, and preferring to insert in particular genes. Which gene will be active at the time of retroviral infection is of course entirely random. Furthermore, the study looked at three retroviruses only – there are considerably more than three retroviruses.
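The difference between a preference for a class of target and a preference for one exact site can be put in numbers. The sketch below is a toy calculation of my own, not a model from Mitchell et al: the genome size, the share of sites lying in active genes and the ten-fold bias are all assumed values. It computes the chance that two independent integrations land on exactly the same site.

```python
def same_site_probability(site_classes):
    """Chance that two independent integrations hit the identical site.

    site_classes is a list of (relative_weight, number_of_sites) pairs;
    each site is chosen with probability proportional to its weight.
    """
    total_weight = sum(weight * count for weight, count in site_classes)
    return sum(
        count * (weight / total_weight) ** 2 for weight, count in site_classes
    )


# Toy genome of one million sites (an assumption for illustration).
uniform = same_site_probability([(1, 1_000_000)])

# Assume 10% of sites lie in active genes and are favoured ten-fold.
biased = same_site_probability([(10, 100_000), (1, 900_000)])

print(uniform)
print(biased)
```

Even a strong bias towards active genes raises the chance of an exact coincidence only about three-fold in this toy case, leaving it of the same tiny order as for fully uniform integration.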
Another paper abused by creationists is Cantrell et al <85>, “An ancient retrovirus-like element contains hot spots for SINE insertion”. SINE, by the way, stands for Short INterspersed Element. SINEs are a class of retrotransposon. Retrotransposons are mobile genetic elements that copy and paste themselves throughout the genome; they differ from DNA transposons, which cut and paste, in that they are transcribed to RNA and then copied back into the genome via reverse transcription.
SINEs are short, being less than 500 bases in length, and as they do not encode reverse transcriptase they require other genetic elements to provide this function. Around 13% of the human genome consists of these repetitive, largely functionless elements. (Almost half the human genome consists of retrotransposed elements.) A small number of SINEs may have been co-opted by the body but no function has been found for most of them, and in fact they have been linked with disease.
As with ERVs, the presence of SINEs in orthologous loci in related species has been used to construct phylogenetic trees. Of course, if SINEs show a preference for insertion, then it is possible that independent insertion events may be confused with the pattern seen with a SINE insertion in an ancestral species.
Cantrell et al found that:
In this study we find that a mys insertion, mys-9, originally found in P. leucopus (Wichman et al. 1985), appears, on the basis of both a 20.3% uncorrected sequence difference between its LTRs and its presence in multiple species of Peromyscus, to be ancient. Phylogenetic analysis of 13 orthologous copies of this element is consistent with the accepted species phylogeny. We see a surprising range of mys-9 allele sizes at this locus caused by a large number of SINE insertions. Within this locus we find two incidents of independent, multiple SINE insertion events at identical sites. These results have major repercussions for phylogenetic analyses based on SINE insertions, indicating the need for caution before interpreting shared SINE insertions as incontrovertible evidence of common ancestry. <86>
Now, note what they say and don’t say. The authors point out that they have evidence of independent SINE insertions at identical sites, showing that hotspots exist for SINE insertions. However, they do not say that phylogenetic analyses based on molecular data are invalid – they point out that in certain cases, shared SINE insertions may be due to independent insertion events rather than common ancestry. In short, one needs to be careful with one’s analysis. What the paper does not show is that the entire concept of constructing phylogenies from molecular data is invalid. That is a creationist misreading which the authors do not support, as the following shows:
Although same-site insertions are probably rare, these results suggest that SINEs exhibit a greater specificity for insertion at specific sites than previously recognized, to the extent that multiple identical insertions can indeed occur at single sites. The presence of a retrotransposon at a single locus in multiple taxa remains an extremely powerful phylogenetic marker, but caution is required before concluding that the existence of a particular SINE at a particular locus in multiple individuals is indicative of common ancestry (Hillis 1999). Such caution is particularly warranted in cases where a single insertion event is the sole support for a specific phylogenetic hypothesis. <87> emphasis mine.
Note the following:

·      Same-site insertions are rare, but still need to be considered
·      The use of retrotransposons at a single locus in multiple species remains an extremely powerful phylogenetic marker
·      The use of a single insertion event as the sole support for a hypothesis is not wise. (However, multiple insertion events providing the same phylogenetic tree would be extremely powerful support.)
·      The conclusion drawn by creationists simply is not present, and represents yet another example of incompetent misreading of a paper in order to support the creationist thesis.

This is analogous to the YEC abuse of papers that show how using a particular radiometric dating method on young material gives implausibly old results. The message there is that while radiometric dating is reliable, it needs to be used intelligently. Creationists have abused those papers by blithely asserting that all radiometric dating methods are wrong. This paper, in short, is a cautionary note to other scientists to watch out for possible confounding factors when they do their phylogenetic analyses, rather than a blanket statement dismissing them.



Mura et al <88> show that in sheep, an endogenous retrovirus helps protect against infection by its related exogenous retrovirus:
In this paper, we have described an endogenous retrovirus of sheep (enJS56A1) with a dominant negative Gag protein that interferes with its exogenous counterpart (JSRV) at a postassembly level. This blockade represents a previously uncharacterized mechanism of retroviral interference. The other known ERV-mediated blocks are all at early stages of the retroviral replication cycle such as entry. <89>
Once again, we need to look at the facts creationists ignore:

·      The presence of a function for ERV elements does not preclude their use as phylogenetic markers.
·      The authors looked at a specific ERV that has a closely related exogenous counterpart.
·      If God intended to insert ERVs as a “vaccine” against retroviruses, then the task was not done very well as humans still fall prey to retroviruses. HIV-1 and HIV-2 are a potent refutation of this thesis. It again shows why positing God as a designer is a theologically dangerous stratagem since it leaves us open to the claim that our God is a poor designer.

The most important reason against this creationist argument being viable has to do with an important aspect of the immune system <90>. The human immune system is composed of many different cells - some of these recognise invading microorganisms and trigger an immune response to ward off the attack. These cells (classes of T cells and B cells) achieve this by having a special receptor uniquely shaped to recognise a particular molecular configuration.
Of course, the human body has no way of knowing in advance that it will be infected by a particular pathogen - millions of T and B cells are made with each having a randomly different recognition molecule. By chance, one of them will be a custom fit for an invading pathogen. This also means that by chance, B cells and T cells will be created that recognise parts of the body. This is not desirable, since these cells would consider human tissue an invading entity, and trigger an auto-immune response. Fortunately, the body has a mechanism whereby such B and T cells which react to the body are picked off and deleted.
Now, if a protein coded by an ERV is released while the immune system is developing, this protein would be regarded as part of the human body, and any T or B cell that recognised this would be deleted. This would completely negate the whole point of using ERVs as an inbuilt vaccine, since the T and B cells that would recognise the related retrovirus would not exist. The retrovirus would be able to invade with greater ease in the absence of these cells. Hardly a brilliant design strategy since it would ensure that large chunks of the immune system that would defend against retroviral invasion would go missing.
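The argument can be illustrated with a deliberately crude toy model. In the sketch below, which is my own simplification and not a model from any immunology text, each receptor and each molecular shape is just a number, a receptor recognises the shape with the same number, and all the sizes are arbitrary assumptions.

```python
import random


def mature_repertoire(raw_receptors, self_epitopes):
    """Clonal deletion: drop every receptor that recognises a self epitope."""
    return {receptor for receptor in raw_receptors
            if receptor not in self_epitopes}


rng = random.Random(0)
EPITOPE_SPACE = 10_000  # hypothetical number of distinct molecular shapes

# Receptors are generated at random, with no knowledge of future pathogens.
raw = set(rng.sample(range(EPITOPE_SPACE), 2_000))

# Pick a viral shape that one of the random receptors happens to fit.
viral_epitope = rng.choice(sorted(raw))

# Ordinary host molecules, none of which resembles the virus.
host_self = set(rng.sample(range(EPITOPE_SPACE), 500)) - {viral_epitope}

without_erv = mature_repertoire(raw, host_self)
# If an ERV expresses the same shape during development, it counts as self.
with_erv = mature_repertoire(raw, host_self | {viral_epitope})

print(viral_epitope in without_erv)  # True
print(viral_epitope in with_erv)     # False
```

In this toy picture, expressing the viral shape as part of the host removes exactly the receptor that would have recognised the incoming virus.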



Shared identical endogenous retroviral elements at the same location in the genomes of related species are powerful evidence for common descent for the same reason that shared identical pseudogenes and retrotransposons are strong evidence for common descent; they are evidence that the modern species shared a common ancestor in which the pseudogenisation / retrotransposon or retroviral insertion first took place, and was then subsequently inherited. Endogenous retroviral elements add an extra element to this in that they are clearly alien to the host genome given their viral origin.
The odds of just a single ERV integrating purely by chance in the same place in the genomes of two species are billions to one against. When we multiply this by the vast number of ERVs shared between humans and apes (to cite the example of most interest in the evolution-creation debate), the odds become so astronomical that they rule out of contention any hypothesis that the evidence for common descent arose purely from a long series of lucky random integrations that just happened to simulate a pattern of common descent. Just from the ERV data alone, the evidence for common descent is overwhelming.
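A back-of-the-envelope calculation shows how quickly these odds compound. The sketch below assumes, for simplicity, that integration is uniformly random over a genome of roughly three billion base pairs and that insertions are independent; the number of shared ERVs is left as a parameter, and the values passed in are illustrative only.

```python
import math

GENOME_SITES = 3_000_000_000  # approximate base pairs in a haploid human genome


def log10_chance_of_coincidence(shared_ervs, sites=GENOME_SITES):
    """log10 of the probability that `shared_ervs` independent insertions in
    one species each land on the same site as the matching insertion in
    another species, assuming uniform random integration."""
    return -shared_ervs * math.log10(sites)


for n in (1, 10, 100):  # illustrative counts, not measured values
    print(n, round(log10_chance_of_coincidence(n), 1))
```

A single coincidence is already on the order of one in a few billion, and every additional shared ERV multiplies that by another factor of a few billion.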

References

1. Coffin JM “Evolution of Retroviruses: Fossils in our DNA” Proceedings of the American Philosophical Society (2004) 148:3, 264-280
2. Richard Cordaux, Mark A Batzer, Evolutionary Emergence of Genes Through Retrotransposition. Encyclopedia of Life Sciences p 4-5 doi: 10.1002/9780470015902.a0020783
3. Contreras-Galindo R et al Human Endogenous Retrovirus K (HML-2) Elements in the Plasma of People with Lymphoma and Breast Cancer Journal of Virology, Oct. 2008, p. 9329–9336
4. Medstrand P, van de Lagemaat LN, Mager DL “Retroelement Distributions in the Human Genome: Variations Associated With Age and Proximity to Genes” Genome Res. 2002 October; 12(10): 1483–1495
5. Weiss RA “The discovery of endogenous retroviruses” Retrovirology 2006, 3:67
6. Gifford R, Tristem M “The Evolution, Distribution and Diversity of Endogenous Retroviruses” Virus Genes (2003) 26:3, 291-315
7. Urisman A et al “Identification of a novel Gammaretrovirus in prostate tumors of patients homozygous for R462Q RNASEL variant.” PLoS Pathog. 2006 Mar;2(3):e25. Epub 2006 Mar 31
8. Schlaberg R et al “XMRV is present in malignant prostatic epithelium and is associated with prostate cancer, especially high-grade tumors.” Proc Natl Acad Sci. 2009 September 22; 106(38): 16351–16356.
9. Hohn O et al “Lack of evidence for xenotrophic murine leukemia virus-related virus (XMRV) in sporadic prostate cancer” Retrovirology 2009, 6:9
10. Kewalramani VN et al “Spleen Necrosis Virus, an Avian Immunosuppressive Retrovirus, Shares a Receptor with the Type D Simian Retroviruses” Journal of Virology (1992) May p 3026-3031
11. Gifford op cit p 291-292
12. Stoye JP “Endogenous retroviruses: Still active after all these years?” Current Biology 2001, 11:R914-916
13. Mi S et al “Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis” Nature (2000) 403, 785-789
14. Coffin JM “Evolution of Retroviruses: Fossils in our DNA” Proceedings of the American Philosophical Society (2004) 148:3, 264-280
15. Gifford op cit p 297
16. Gifford et al “Evolution and Distribution of Class II-Related Endogenous Retroviruses” J Virol. 2005 May; 79(10): 6478–6486.
17. Katzourakis A et al “Discovery and analysis of the first endogenous lentivirus” Proc. Natl. Acad. Sci. (2007) 104,15:6261-6265
18. Gifford R et al “A transitional endogenous lentivirus from the genome of a basal primate and implications for lentivirus evolution” Proc. Natl. Acad. Sci. (2008) 105,51:20362-20367
19. Coffin JM op cit p 268
20. Coffin JM op cit p 268-269
21. Johnson WE Coffin JM Constructing primate phylogenies from ancient retrovirus sequences Proc. Natl. Acad. Sci. (1999) 96:10254-10260
22. ibid p 10254
23. ibid p 10255
24. loc cit
25. Johnson WE Coffin JM op cit p 10255-10256
26. ibid p 10256
27. ibid p 10259
28. loc cit
29. Jha AR “Cross-Sectional Dating of Novel Haplotypes of HERV-K 113 and HERV-K 115 Indicate These Proviruses Originated in Africa before Homo sapiens” Mol. Biol. Evol. 26(11):2617–2626. 2009
30. Herniou E et al “Retroviral Diversity and Distribution in Vertebrates” Journal of Virology (1998) 72:7;5955-5966
31. ibid p 5955
32. Polavarapu N et al “Newly Identified Families of Human Endogenous Retroviruses” Journal of Virology (2006) 80:9;4640-4642
33. Barbulescu M et al “Many human endogenous retrovirus K (HERV-K) proviruses are unique to humans” Current Biology (1999) 9:861-868
34. Belshaw R et al “Long-term reinfection of the human genome by endogenous retroviruses” Proc. Natl. Acad. Sci. (2004) 101:14;4894-4899
35. Dunn CA et al “An endogenous retroviral long terminal repeat is the dominant promoter for human β1,3-galactosyltransferase 5 in the colon” Proc. Natl. Acad. Sci. October 28, 2003 100:22;12841-12846
36. Endogenous retroviruses and the evidence for evolution VWXYNot? June 21 2007 http://vwxynot.blogspot.com/2007/06/endoge...d-evidence.html
37. Mouse Genome Sequencing Consortium Initial sequencing and comparative analysis of the mouse genome Nature 420, 520-562 (5 December 2002)
38. Mager DL, Freeman JD HERV-H Endogenous Retroviruses: Presence in the New World Branch but Amplification in the Old World Primate Lineage Virology (1995) 213:2;395-404
39. Medstrand P Mager DL Human-specific integrations of the HERV-K endogenous retrovirus family J Virol. 1998 Dec;72(12):9782-7.
40. Belshaw R et al Genomewide Screening Reveals High Levels of Insertional Polymorphism in the Human Endogenous Retrovirus Family HERV-K(HML2): Implications for Present-Day Activity Journal of Virology, October 2005, p. 12507-12514, Vol. 79, No. 19
41. Tarlinton RE, Meers J, Young PR: Retroviral invasion of the koala genome. Nature 2006, 442:79-81.
42. Stoye JP “Koala retrovirus: a genome invasion in real time” Genome Biology 2006, 7:241-3
43. ibid, p 243
44. van der Loo W et al Sharing of Endogenous Lentiviral Gene Fragments among Leporid Lineages Separated for More than 12 Million Years Journal of Virology (2009) 83:5;2387-2388
45. Lower R, Lower J, Kurth R The viruses in all of us: Characteristics and biological significance of human endogenous retrovirus sequences Proc. Natl. Acad. Sci. (1996) 93:5177-5184
46. ibid, p 5183
47. ibid, p 5182
48. Bekpen C, Marques-Bonet T, Alkan C, Antonacci F, Leogrande MB, et al. (2009) Death and Resurrection of the Human IRGM Gene. PLoS Genet 5(3):e1000403. doi:10.1371/journal.pgen.1000403
51. Dunn op cit
52. Conley AB et al “Retroviral promoters in the human genome” Bioinformatics (2008) 24:14;1563-1567
53. ibid, p 1566
54. Arnaud F, Caporale M, Varela M, Biek R, Chessa B, et al. (2007) A Paradigm for Virus–Host Coevolution: Sequential Counter-Adaptations between Endogenous and Exogenous Retroviruses. PLoS Pathog 3(11): e170. doi:10.1371/journal.ppat.0030170
55. Ryan FP. Human endogenous retroviruses in health and disease: a symbiotic perspective J R Soc Med. 2004 December; 97(12): 560–565.
56. ibid p 561
57. Nelson PN et al Demystified…Human endogenous retroviruses J Clin Pathol: Mol Pathol 2003;56:11–18
58. ibid p 13-14
59. ibid, p 14
60. loc cit
61. Saib A, Benkirane M Endogenous Retroviruses: Thierry Heidmann wins the 2009 Retrovirology prize Retrovirology 2009, 6:108 doi:10.1186/1742-4690-6-108
62. ibid, p 113
63. ibid, p 112
64. Barbulescu M et al A HERV-K provirus in chimpanzees, bonobos and gorillas, but not humans Current Biology 2001, 11:779–783
65. ibid, p 779
66. ibid, p 780
67. loc cit
68. Barbulescu, p 781
69. ibid, p 781
70. ibid p 782
71. loc cit
72. loc cit
73. ibid, p 783
74. Finlay G Human Evolution – Genes, Genealogies, and Phylogenies (2013, Cambridge University Press) 60-61
75. Conley et al, p 1563
76. loc cit
77. Samuelson,L.C. et al. (1990) Retroviral and pseudogene insertion sites reveal the lineage of human salivary and pancreatic amylase genes from a single gene during primate evolution. Mol. Cell. Biol., 10, 2513–2520.
78. ibid p 2519
79. Conley et al, p 1563
80. ibid, p 1563-4
81. ibid, p 1566
82. ibid, p 1563
83. Mitchell RS, Beitzel BF, Schroder ARW, Shinn P, Chen H, et al. 2004 Retroviral DNA Integration: ASLV, HIV, and MLV Show Distinct Target Site Preferences. PLoS Biol 2(8): e234.
84. ibid p 1133
85. Cantrell MA et al An ancient retrovirus-like element contains hot spots for SINE insertion. Genetics. 2001 Jun;158(2):769-77.
86. ibid, p 770
87. ibid, p 776
88. Mura M et al Late viral interference induced by transdominant Gag of an endogenous retrovirus Proc. Natl. Acad. Sci. (2004) 101:30;11117-11122
89. ibid, p 11121
90. Any undergraduate immunology textbook will be able to explain this. See for example Janeway et al Immunobiology (2001 5th Edition Garland Publishing) Chapter 7 “The Development and Survival of Lymphocytes”. Available online at: http://www.ncbi.nlm.nih.gov/bookshelf/br.f...m&part=A797