Chromosome 22 comments, example 1



Example 1

Example 1 shows an element from the HERVH family together with its LTR sequences. Compared with the HERVH family consensus sequence, this element has five deletions. HERVH LTRs are listed as LTR7 in Repbase. The originally published data suggested three HERVH internal sequences flanked by two LTR7s (Fig. 1). Our analysis with the latest RepeatMasker showed six HERVH internal sequences with two LTR7s (Fig. 2).

We suggest that all of these sequences are parts of a single element that carries five deletions relative to the consensus sequence. When an element contains such deletions (or insertions), the Smith-Waterman similarity search used by RepeatMasker reports several independent hits rather than one.

The fragments found by RepeatMasker and shown in this example should therefore be treated not as independent HERV insertions but as one element that has diverged from the consensus. Counting elements this way lowers the number of elements found in the genome. The element shown here probably represents one complete retrovirus rather than three internal sequences with two LTRs (or six internal sequences with the new version of RepeatMasker).
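The idea of counting fragments as one element can be sketched in a few lines of Python. This is only an illustration, not the procedure used for the analysis on this page: the `Hit` type, the `chain_hits` function and the `max_query_gap` threshold of 100 bp are assumptions introduced here, and only plus-strand hits are handled. The coordinates are the six HERVH internal hits from the RepeatMasker output on this page.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    q_begin: int   # begin position in the query (chromosome) sequence
    q_end: int     # end position in the query sequence
    repeat: str    # name of the matching repeat
    c_begin: int   # begin position in the repeat consensus
    c_end: int     # end position in the repeat consensus


def chain_hits(hits, max_query_gap=100):
    """Group plus-strand hits into putative elements.

    A hit joins the current chain when it matches the same repeat, starts
    no more than max_query_gap bases after the previous hit ends (small
    overlaps are allowed), and continues further along the consensus.
    The threshold is an illustrative assumption.
    """
    chains = []
    for hit in sorted(hits, key=lambda h: h.q_begin):
        last = chains[-1][-1] if chains else None
        if (last is not None
                and hit.repeat == last.repeat
                and hit.q_begin - last.q_end <= max_query_gap
                and hit.c_begin > last.c_end):
            chains[-1].append(hit)
        else:
            chains.append([hit])
    return chains


# HERVH internal hits copied from the RepeatMasker output on this page.
hits = [
    Hit(12305257, 12308243, "HERVH", 1, 2987),
    Hit(12308238, 12308385, "HERVH", 3140, 3290),
    Hit(12308376, 12308878, "HERVH", 3492, 3993),
    Hit(12308881, 12309542, "HERVH", 4488, 5149),
    Hit(12309544, 12309762, "HERVH", 5600, 5819),
    Hit(12309761, 12310138, "HERVH", 7322, 7713),
]

chains = chain_hits(hits)
print(len(hits), "fragments ->", len(chains), "element(s)")
# prints: 6 fragments -> 1 element(s)
```

With these settings the six internal fragments collapse into a single chain, which is the interpretation proposed above. A real implementation would also need to handle the complementary strand and attach the flanking LTRs.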

See example 2 for an extreme overestimation of copy number.

Part of unprocessed RepeatMasker output:

   SW  perc perc perc  query        position in query             matching repeat         position in  repeat
score  div. del. ins.  sequence      begin      end     (left)    repeat   class/family    begin  end (left)

 2760   9.0  5.8  5.8  NT_001454  12304823 12305254 (10897837) +  LTR7     LTR/Retroviral      1   448    (2)  
18883   9.8  2.5  2.5  NT_001454  12305257 12308243 (10894848) +  HERVH    LTR/Retroviral      1  2987 (4726)  
  866  14.2  2.0  2.0  NT_001454  12308238 12308385 (10894706) +  HERVH    LTR/Retroviral   3140  3290 (4423) *
 3291  10.7  0.4  0.4  NT_001454  12308376 12308878 (10894213) +  HERVH    LTR/Retroviral   3492  3993 (3720)  
 4161  14.8  0.1  0.1  NT_001454  12308881 12309542 (10893549) +  HERVH    LTR/Retroviral   4488  5149 (2564)  
 1356  11.9  1.8  1.8  NT_001454  12309544 12309762 (10893329) +  HERVH    LTR/Retroviral   5600  5819 (1894) *
 2397   8.5  5.0  5.0  NT_001454  12309761 12310138 (10892953) +  HERVH    LTR/Retroviral   7322  7713    (0)  
 2603  10.4  6.0  6.0  NT_001454  12310139 12310569 (10892522) +  LTR7     LTR/Retroviral      1   448    (2)  
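The consensus coordinates in the rows above can be used to estimate where the element is interrupted. The following Python sketch parses the plus-strand rows and measures the stretch of consensus skipped between consecutive HERVH internal hits. The function name `parse_plus_strand_row` and the column indices are assumptions based on the layout shown here; rows on the complementary strand order the repeat columns differently and are not handled.

```python
ROWS = """\
 2760   9.0  5.8  5.8  NT_001454  12304823 12305254 (10897837) +  LTR7     LTR/Retroviral      1   448    (2)
18883   9.8  2.5  2.5  NT_001454  12305257 12308243 (10894848) +  HERVH    LTR/Retroviral      1  2987 (4726)
  866  14.2  2.0  2.0  NT_001454  12308238 12308385 (10894706) +  HERVH    LTR/Retroviral   3140  3290 (4423) *
 3291  10.7  0.4  0.4  NT_001454  12308376 12308878 (10894213) +  HERVH    LTR/Retroviral   3492  3993 (3720)
 4161  14.8  0.1  0.1  NT_001454  12308881 12309542 (10893549) +  HERVH    LTR/Retroviral   4488  5149 (2564)
 1356  11.9  1.8  1.8  NT_001454  12309544 12309762 (10893329) +  HERVH    LTR/Retroviral   5600  5819 (1894) *
 2397   8.5  5.0  5.0  NT_001454  12309761 12310138 (10892953) +  HERVH    LTR/Retroviral   7322  7713    (0)
 2603  10.4  6.0  6.0  NT_001454  12310139 12310569 (10892522) +  LTR7     LTR/Retroviral      1   448    (2)
"""


def parse_plus_strand_row(line):
    """Split one whitespace-separated row; valid for '+' strand rows only."""
    f = line.split()
    return {
        "query_begin": int(f[5]),
        "query_end": int(f[6]),
        "repeat": f[9],
        "cons_begin": int(f[11]),
        "cons_end": int(f[12]),
    }


rows = [parse_plus_strand_row(line) for line in ROWS.splitlines() if line.strip()]

# Keep the internal (HERVH) hits in chromosome order; LTR7 hits are skipped.
internal = sorted(
    (r for r in rows if r["repeat"] == "HERVH"),
    key=lambda r: r["query_begin"],
)

# Consensus bases missing between each pair of consecutive internal hits.
gaps = [b["cons_begin"] - a["cons_end"] - 1 for a, b in zip(internal, internal[1:])]
print(gaps)
# prints: [152, 201, 494, 450, 1502]
```

The six internal hits are separated by five gaps in consensus coordinates, which is consistent with the five deletions described for this element. The gap sizes are rough estimates, since the alignment ends reported by RepeatMasker need not coincide exactly with the deletion boundaries.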

Graphical representation of the RepeatMasker output:

Fig. 1: Original data from the Sanger Center.

Fig. 2: New RepeatMasker data compared to consensus.

Corresponding parts of the chromosome sequence and the family consensus sequence are connected. The height of the boxes on the chromosome shows the Smith-Waterman (SW) score. Blue and orange are used for internal and LTR sequences, respectively.




Last modified: $Date: 2001/10/05 11:21:07 $ site: herv.img.cas.cz