What is ERV?
Analysis of the human genome revealed that some 45% of it consists of various kinds of transposable elements. Around 8% of human DNA is derived from retrovirus-like elements. They originate from ancient retroviral infections or are relics of retroviral transposomal activity in the germ-line cells. Human endogenous retroviruses (HERVs) comprise a part of these elements. They have undergone substantial changes such as mutations of all kinds, deletions and insertions of other transposons, recombinations and mini- and micro-satellite expansion. This is why it is often difficult to identify individual retroviral genes and other retroviral DNA regions.
HERVs are classified according to several criteria that are, however, mostly artificial, especially in view of rearrangements and mutations changing the original retroviral DNA sequences.
HERVs have become extremely useful tools for studying evolution and the plasticity of primate genomes. Some of them have acquired important functions. For example, HERV long terminal repeats (LTRs) serve as transcription regulators, alternative promoters and polyadenylation signals for several cellular genes. Another function proposed for HERVs is determining resistance to viral infections. The best example of a proposed HERV function is human syncytin, the product of the HERV-W envelope gene on chromosome 7, which is a protein involved in placenta formation. HERVs are likely to be cofactors in several diseases, such as multiple sclerosis, schizophrenia and cancer.
It is important to analyze the structure and distribution of HERVs and compare them with situations in other genomes, e.g. those of primates and the mouse. These comparisons may help to find genomic elements that contribute to developmental phenotypical differences between humans and other organisms. Precise localization and characterization of insertion sites may be useful for designing retroviral vectors for gene therapy.
|Name||Name of the element (MLT1K)|
|Type||SINE, LINE, ERV, DNA, ...|
Used only by ERV which have format:
Subtype can be:
LTR ... sequence on the edge
INT ... collection of gag-pol-env
|Size||Number of bases|
A part of an element, or a full element, found in the genome. Values are almost the same as in RepeatMasker output. Accessible on Repeats.
|Name||The same name as element|
|SW||Smith-Waterman score calculated by RepeatMasker|
|Div percent||% substitutions in matching region compared to the consensus|
|Del percent||% of bases opposite a gap in the query sequence (deleted bp)|
|Ins percent||% of bases opposite a gap in the repeat consensus (inserted bp)|
|Type||The same type as element.|
|Subtype||The same as element. See Elements section for more details.|
|Sequence||Name of the chromosome.|
|Notes||Notes or errors found while parsing RepeatMasker output.
For example, the length does not match the element.
SEQUENCE: S_BEGIN - S_END (ORIENTATION)
R_BEGIN - R_END (left: R_LEFT)
E_BEGIN - E_END (left: E_LEFT)
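Because the repeat fields mirror RepeatMasker output, it may help to see how one line of a RepeatMasker `.out` file maps onto them. The following is only a minimal sketch, not the parser used by the database: the `Repeat` class, its field names and `parse_repeat` are hypothetical, the sample line is modelled on the example in the RepeatMasker documentation, and `position()` assumes that S_BEGIN and S_END are the match coordinates on the sequence.

```python
from dataclasses import dataclass


@dataclass
class Repeat:
    """One RepeatMasker hit (hypothetical container, names are illustrative)."""
    sw: int                # Smith-Waterman score
    div_percent: float     # % substitutions
    del_percent: float     # % deleted bp
    ins_percent: float     # % inserted bp
    sequence: str          # query sequence, e.g. a chromosome name
    seq_begin: int
    seq_end: int
    seq_left: int          # bases left in the query past the match
    orientation: str       # "+" or "C" (complement)
    name: str              # matching repeat name
    repeat_class: str      # class/family, e.g. "LTR/ERV1"
    cons_begin: int
    cons_end: int
    cons_left: int         # bases of the consensus not covered

    def position(self) -> str:
        # Assumed rendering of "SEQUENCE: S_BEGIN - S_END (ORIENTATION)".
        return f"{self.sequence}: {self.seq_begin} - {self.seq_end} ({self.orientation})"


def _unparen(text: str) -> int:
    """Convert a parenthesised field such as "(22462)" to 22462."""
    return int(text.strip("()"))


def parse_repeat(line: str) -> Repeat:
    f = line.split()
    if f[8] == "+":
        # Forward strand: begin, end, (left)
        cons_begin, cons_end, cons_left = int(f[11]), int(f[12]), _unparen(f[13])
    else:
        # Complement: (left) comes first; the two coordinates are stored
        # here so that cons_begin <= cons_end.
        cons_left, cons_end, cons_begin = _unparen(f[11]), int(f[12]), int(f[13])
    return Repeat(
        sw=int(f[0]), div_percent=float(f[1]),
        del_percent=float(f[2]), ins_percent=float(f[3]),
        sequence=f[4], seq_begin=int(f[5]), seq_end=int(f[6]),
        seq_left=_unparen(f[7]), orientation=f[8],
        name=f[9], repeat_class=f[10],
        cons_begin=cons_begin, cons_end=cons_end, cons_left=cons_left,
    )


example = "1320 15.6 6.2 0.0 HSU08988 6563 6781 (22462) C MER7A DNA/MER2_type (0) 336 103"
repeat = parse_repeat(example)
print(repeat.position())  # HSU08988: 6563 - 6781 (C)
```

Any trailing columns (such as the RepeatMasker ID) are ignored by this sketch.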
|Average SW||Total SW average of all repeats|
|Consensus size||How much of the original entity is present in the genome (how much remains).|
|Type||Type of elements|
Only for ERV. It represents which parts of the entity remain.
LIL = LTR-gag-pol-env-LTR
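Reading the LIL example as one letter per remaining part (L for an LTR, I for an INT, which is an interpretation of the example rather than a documented rule), the code could be derived from the ordered subtypes of the parts. A minimal sketch with a hypothetical `structure_code` helper:

```python
# Assumed letter mapping, inferred from "LIL = LTR-gag-pol-env-LTR".
LETTERS = {"LTR": "L", "INT": "I"}


def structure_code(subtypes):
    """Return a code such as "LIL" for the ordered subtypes of an entity."""
    try:
        return "".join(LETTERS[subtype] for subtype in subtypes)
    except KeyError as error:
        raise ValueError(f"unknown subtype: {error.args[0]}") from None


print(structure_code(["LTR", "INT", "LTR"]))  # LIL
print(structure_code(["INT", "LTR"]))         # IL
```

Under this reading, an entity that has lost its internal region and one LTR would simply be `L`.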
This section describes how entities are created. The description of the process is greatly simplified.
2. Creating raw entities
First, all pieces of an element are joined to create a "raw entity".
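As a rough illustration of this joining step, the sketch below merges neighbouring pieces into raw entities. The actual joining criteria are not described here, so everything in it is an assumption: pieces are merged when they lie on the same sequence, carry the same element name and orientation, and are separated by at most `max_gap` bases (a hypothetical threshold). `Piece`, `RawEntity` and `build_raw_entities` are illustrative names.

```python
from dataclasses import dataclass, field


@dataclass
class Piece:
    sequence: str      # chromosome name
    begin: int
    end: int
    orientation: str   # "+" or "C"
    name: str          # element name, e.g. "MLT1K"


@dataclass
class RawEntity:
    sequence: str
    orientation: str
    name: str
    pieces: list = field(default_factory=list)

    @property
    def begin(self) -> int:
        return min(p.begin for p in self.pieces)

    @property
    def end(self) -> int:
        return max(p.end for p in self.pieces)


def build_raw_entities(pieces, max_gap=500):
    """Join nearby pieces of the same element (assumed criteria, see above)."""
    entities = []
    for piece in sorted(pieces, key=lambda p: (p.sequence, p.begin)):
        last = entities[-1] if entities else None
        if (
            last is not None
            and last.sequence == piece.sequence
            and last.name == piece.name
            and last.orientation == piece.orientation
            and piece.begin - last.end <= max_gap
        ):
            last.pieces.append(piece)   # extend the current raw entity
        else:
            entities.append(
                RawEntity(piece.sequence, piece.orientation, piece.name, [piece])
            )
    return entities


pieces = [
    Piece("chr1", 100, 200, "+", "MLT1K"),
    Piece("chr1", 250, 400, "+", "MLT1K"),
    Piece("chr1", 5000, 5100, "+", "MLT1K"),
]
for entity in build_raw_entities(pieces):
    print(entity.name, entity.begin, entity.end, len(entity.pieces))
```

With the sample data, the first two pieces end up in one raw entity and the distant third piece forms its own.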
3. Creating entities
Next, LTR and INT raw entities are joined based on how frequently they occur as neighbors.
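One way to picture joining by neighbor frequency is sketched below. It is a simplified guess at the idea, not the database's algorithm: raw entities are given as `(name, subtype)` tuples in genomic order, adjacent LTR/INT name pairs are counted over the whole list, and neighbors are joined only when their pair was seen at least `min_count` times. The function names, the threshold and the sample element names are all illustrative, and chromosome boundaries and distances are ignored.

```python
from collections import Counter


def _pair_key(a, b):
    """Return (LTR name, INT name) if a and b are an LTR/INT pair, else None."""
    if {a[1], b[1]} != {"LTR", "INT"}:
        return None
    ltr, internal = (a, b) if a[1] == "LTR" else (b, a)
    return (ltr[0], internal[0])


def neighbor_counts(raws):
    """Count how often each (LTR name, INT name) pair occurs side by side."""
    counts = Counter()
    for a, b in zip(raws, raws[1:]):
        key = _pair_key(a, b)
        if key is not None:
            counts[key] += 1
    return counts


def join_entities(raws, min_count=2):
    """Group adjacent LTR/INT raw entities whose pairing is frequent enough."""
    counts = neighbor_counts(raws)
    entities = []
    for raw in raws:
        if entities:
            key = _pair_key(entities[-1][-1], raw)
            if key is not None and counts[key] >= min_count:
                entities[-1].append(raw)   # frequent neighbors: same entity
                continue
        entities.append([raw])             # otherwise start a new entity
    return entities


raws = [
    ("LTR5", "LTR"), ("HERVK-int", "INT"), ("LTR5", "LTR"),
    ("MLT1K", "LTR"), ("HERVK-int", "INT"), ("LTR5", "LTR"),
]
for entity in join_entities(raws):
    print("-".join(name for name, _ in entity))
```

In the sample, `LTR5` and `HERVK-int` are neighbors three times and are joined, while the single `MLT1K`/`HERVK-int` adjacency falls below the threshold and stays separate.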
- Endogenous retrovirus in Wikipedia from https://en.wikipedia.org/wiki/Endogenous_retrovirus
- Pačes, J., Pavlíček, A., & Pačes, V. (2002). HERVd: database of human endogenous retroviruses. Nucleic Acids Research, 30(1), 205–206.
- RepeatMasker Documentation from http://www.repeatmasker.org/webrepeatmaskerhelp.html