What is ERV?
Analysis of the human genome revealed that some 45% of it consists of various kinds of transposable elements. Around 8% of the human DNA is derived from retrovirus-like elements. They originate from ancient retroviral infections or are relics of retroviral transposomal activity in the germ-line cells. Human endogenous retroviruses (HERVs) comprise a part of these elements. They have undergone substantial changes such as mutations of all kinds, deletions and insertions of other transposons, recombinations and mini- and micro-satellite expansion. This is why it is often difficult to identify individual retroviral genes and other retroviral DNA regions.
HERVs are classified according to several criteria that are, however, mostly artificial, especially in view of rearrangements and mutations changing the original retroviral DNA sequences.
HERVs have become extremely useful tools for studying evolution and the plasticity of primate genomes. Some of them have acquired important functions. For example, HERV long terminal repeats (LTRs) serve as transcription regulators, alternative promoters and polyadenylation signals for several cellular genes. Another function proposed for HERVs is determining resistance to viral infections. The best example of a proposed HERV function is human syncytin, the product of the HERV-W envelope gene on chromosome 7, a protein involved in placenta formation. HERVs are likely to be cofactors in several diseases, such as multiple sclerosis, schizophrenia and cancer.
It is important to analyze the structure and distribution of HERVs and compare them with situations in other genomes, e.g. those of primates and the mouse. These comparisons may help to find genomic elements that contribute to developmental phenotypical differences between humans and other organisms. Precise localization and characterization of insertion sites may be useful for designing retroviral vectors for gene therapy.
API
Repeats, Elements and Entities are accessible via the API in JSON format. API end-point: https://herv.img.cas.cz/api
Resource | Path or URL |
---|---|
Elements | /api/elements |
Repeats | /api/repeats |
Entities | /api/entities |
Swagger definition | https://herv.img.cas.cz/api/swagger |
Apiary | https://ervd.docs.apiary.io |
Get elements
- limit
  Type: Integer. Default: 25. Values: 1..100
- name
  Type: String
- type
  Type: String. Values: DNA, ERV, LINE, Low_complexity, RC, Retroposon, RNA, Satellite, Simple_repeat, SINE, Unknown
GET https://herv.img.cas.cz/api/elements
curl --include \
--header "Accept: application/json" \
"https://herv.img.cas.cz/api/elements"
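The same request can be issued from a script. The following is a minimal Python sketch, assuming the parameters listed above are passed as URL query parameters (this page does not state that explicitly). `elements_url` and `fetch_json` are hypothetical helper names, and nothing is assumed about the shape of the returned JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://herv.img.cas.cz/api"

# Values documented for the "type" parameter.
ELEMENT_TYPES = {
    "DNA", "ERV", "LINE", "Low_complexity", "RC", "Retroposon",
    "RNA", "Satellite", "Simple_repeat", "SINE", "Unknown",
}


def elements_url(limit=25, name=None, type_=None):
    """Build a GET URL for /api/elements (hypothetical helper).

    Assumption: filters are sent as query-string parameters.
    """
    if not 1 <= limit <= 100:
        raise ValueError("limit must be in 1..100")
    if type_ is not None and type_ not in ELEMENT_TYPES:
        raise ValueError(f"unsupported type: {type_!r}")
    params = {"limit": limit}
    if name is not None:
        params["name"] = name
    if type_ is not None:
        params["type"] = type_
    return f"{BASE_URL}/elements?{urlencode(params)}"


def fetch_json(url):
    """Send the request with the same Accept header as the curl example."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_json(elements_url(limit=5, type_="ERV"))` would then perform the request; building the URL separately makes it possible to inspect it without touching the network.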
Get repeats
- limit
  Type: Integer. Default: 25. Values: 1..100
- name
  Type: String
- type
  Type: String. Values: DNA, ERV, LINE, Low_complexity, RC, Retroposon, RNA, Satellite, Simple_repeat, SINE, Unknown
- sequence
  Type: String. Values: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY
- sequence_begin
  Type: Integer. Values: 0..4294967295
- sequence_end
  Type: Integer. Values: 0..4294967295
GET https://herv.img.cas.cz/api/repeats
curl --include \
--header "Accept: application/json" \
"https://herv.img.cas.cz/api/repeats"
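Repeats can additionally be filtered by chromosome and position. The sketch below only builds the request URL and checks the documented value ranges. It assumes query-string parameters, and `repeats_url` is a hypothetical helper. Whether `sequence_begin` and `sequence_end` act as exact matches or as range bounds is not stated on this page, so the code makes no claim either way.

```python
from urllib.parse import urlencode

BASE_URL = "https://herv.img.cas.cz/api"

# Documented values for "sequence": chr1..chr22, chrX, chrY.
CHROMOSOMES = {f"chr{n}" for n in range(1, 23)} | {"chrX", "chrY"}
MAX_POSITION = 4294967295  # documented upper bound for positions


def repeats_url(sequence, sequence_begin, sequence_end, limit=25):
    """Build a GET URL for /api/repeats (hypothetical helper).

    Assumption: filters are sent as query-string parameters.
    """
    if sequence not in CHROMOSOMES:
        raise ValueError(f"unknown sequence: {sequence!r}")
    for label, value in (("sequence_begin", sequence_begin),
                         ("sequence_end", sequence_end)):
        if not 0 <= value <= MAX_POSITION:
            raise ValueError(f"{label} must be in 0..{MAX_POSITION}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be in 1..100")
    params = {
        "sequence": sequence,
        "sequence_begin": sequence_begin,
        "sequence_end": sequence_end,
        "limit": limit,
    }
    return f"{BASE_URL}/repeats?{urlencode(params)}"


if __name__ == "__main__":
    # Coordinates chosen purely for illustration.
    print(repeats_url("chr7", 1000, 2000))
```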
Get entities
- limit
  Type: Integer. Default: 25. Values: 1..100
- name
  Type: String
- type
  Type: String. Values: DNA, ERV, LINE, Low_complexity, RC, Retroposon, RNA, Satellite, Simple_repeat, SINE, Unknown
- subtype
  Type: String. Values: INT, LTR
- by_element
  Type: String
- sequence
  Type: String. Values: chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY
- sequence_begin
  Type: Integer. Values: 0..4294967295
- sequence_end
  Type: Integer. Values: 0..4294967295
GET https://herv.img.cas.cz/api/entities
curl --include \
--header "Accept: application/json" \
"https://herv.img.cas.cz/api/entities"
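Because the entities end-point accepts the largest set of filters, a script may want to reject misspelled parameter names before sending a request. The sketch below shows one way to do that with a whitelist of the documented names. `entities_url` is a hypothetical helper, and passing the filters as query-string parameters is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://herv.img.cas.cz/api"

# Parameter names documented for GET /api/entities, in listed order.
ENTITY_PARAMS = (
    "limit", "name", "type", "subtype", "by_element",
    "sequence", "sequence_begin", "sequence_end",
)
SUBTYPES = {"INT", "LTR"}  # documented values for "subtype"


def entities_url(**filters):
    """Build a GET URL for /api/entities (hypothetical helper).

    Assumption: filters are sent as query-string parameters.
    """
    unknown = set(filters) - set(ENTITY_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    subtype = filters.get("subtype")
    if subtype is not None and subtype not in SUBTYPES:
        raise ValueError("subtype must be INT or LTR")
    # Emit parameters in the documented order for reproducible URLs.
    query = [(key, filters[key]) for key in ENTITY_PARAMS if key in filters]
    if not query:
        return f"{BASE_URL}/entities"
    return f"{BASE_URL}/entities?{urlencode(query)}"


if __name__ == "__main__":
    print(entities_url(type="ERV", subtype="LTR", limit=10))
```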
Swagger compatible API description
GET https://herv.img.cas.cz/api/swagger
curl --include \
--header "Accept: application/json" \
"https://herv.img.cas.cz/api/swagger"
Swagger compatible API description for specific API
- name
  Type: String
- locale
  Type: Symbol
GET https://herv.img.cas.cz/api/swagger/:name
curl --include \
--header "Accept: application/json" \
"https://herv.img.cas.cz/api/swagger/:name"
Database description
Elements
A concrete original element such as MLT1K, THE1B or AluSx. Accessible under Elements.
Field | Description |
---|---|
Name | Name of the element (MLT1K) |
Type | SINE, LINE, ERV, DNA, ... |
Subtype | Used only by ERV elements, which have the format LTR-gag-pol(-env)-LTR. Subtype can be LTR (the sequence on the edge) or INT (the collection of gag-pol-env). |
Size | Number of bases |
Repeats
A part of an element, or a full element, found in the genome. Values are almost the same as in RepeatMasker output. Accessible under Repeats.
Field | Description |
---|---|
Name | The same name as the element. |
SW | Smith-Waterman score calculated by RepeatMasker |
Div percent | % substitutions in matching region compared to the consensus |
Del percent | % of bases opposite a gap in the query sequence (deleted bp) |
Ins percent | % of bases opposite a gap in the repeat consensus (inserted bp) |
Type | The same type as the element. |
Subtype | The same as the element. See the Elements section for more details. |
Sequence | Name of the chromosome. |
Notes | Notes or errors found while parsing RepeatMasker output, for example a length that does not match the element. |
Sequence position | SEQUENCE: S_BEGIN - S_END (ORIENTATION) |
Repeat position | R_BEGIN - R_END (left: R_LEFT) |
Entity position | E_BEGIN - E_END (left: E_LEFT) |
Entities
An entity is a unit built by combining the repeats of an element. For the ERV type it represents the original retrovirus. More details are in the section Creating entities. Accessible under Entities.
Field | Description |
---|---|
Name | Format: ELEMENTS-TYPE__ID |
Average SW | Average SW score over all repeats |
Consensus size | How much of the original entity is present in the genome (how much remained). |
Type | Type of the elements |
Subtype | Only for ERV. It represents which parts of the entity remain. LIL = LTR-gag-pol-env-LTR |
Creating entities
1. Data
This section describes how entities are created. The description of the process is greatly simplified.
2. Creating raw entities
First, all pieces of an element are joined to create a "raw entity". [Figure: raw entities]
3. Create entities
Next, LTR and INT pieces are joined based on how frequently they occur as neighbors.
4. Result
[Figure: resulting entities]
References
- Endogenous retrovirus. Wikipedia. https://en.wikipedia.org/wiki/Endogenous_retrovirus
- Pačes, J., Pavlíček, A., & Pačes, V. (2002). HERVd: database of human endogenous retroviruses. Nucleic Acids Research, 30(1), 205–206.
- RepeatMasker Documentation. http://www.repeatmasker.org/webrepeatmaskerhelp.html