Vega genes in Ensembl
Genes in Vega can be on their own assembly, so for incorporation into Ensembl resources they must first be projected onto the relevant reference assembly. The projection succeeds only if (i) all of the exons in a transcript map to the reference assembly and (ii) the sequence does not change. Not all of the transcripts at a locus need to map for the projection of a gene to succeed, i.e. a locus can have different numbers of transcripts in the two databases. However, the annotation of each transcript must be identical in the two. In practice, only genes on non-reference Vega chromosomes (for example, human LRC and the different mouse strains) fail to transfer.
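The two projection criteria can be illustrated with a minimal Python sketch. This is not the actual Vega or Ensembl projection code: the `Exon` and `Transcript` classes, the field names and the IDs 'T1' and 'T2' are hypothetical and exist only for illustration. The sketch also assumes that a gene counts as projected when at least one of its transcripts passes both criteria.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Exon:
    # Hypothetical model: the exon sequence on the Vega assembly, and the
    # sequence found at its mapped position on the reference assembly
    # (None if the exon could not be mapped at all).
    vega_seq: str
    ref_seq: Optional[str]


@dataclass
class Transcript:
    stable_id: str
    exons: List[Exon]


def transcript_projects(transcript: Transcript) -> bool:
    """Criterion (i): every exon maps; criterion (ii): no sequence change."""
    return bool(transcript.exons) and all(
        exon.ref_seq is not None and exon.ref_seq == exon.vega_seq
        for exon in transcript.exons
    )


def project_gene(transcripts: List[Transcript]) -> List[str]:
    """Return the stable IDs of the transcripts that project.

    Assumption: an empty result means the gene fails to transfer.
    """
    return [t.stable_id for t in transcripts if transcript_projects(t)]


# Illustrative data: T1 maps cleanly, T2 has one exon that does not map.
gene = [
    Transcript("T1", [Exon("ACGT", "ACGT"), Exon("GGCC", "GGCC")]),
    Transcript("T2", [Exon("ACGT", "ACGT"), Exon("TTAA", None)]),
]
projected = project_gene(gene)
print(projected)
```

In this example the gene is transferred with a single transcript, so the locus would have two transcripts in Vega but only one in Ensembl.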
In addition to the above, there is other Vega annotation missing from Ensembl:
- Transcripts with a biotype of 'artifact'.
- Human LOF genes.
Manual annotation from Vega can be accessed in Ensembl as part of the merged Ensembl genebuild.
In versions of Ensembl prior to release 89 the data could also be accessed by:
- Using the 'Vega Havana' gene track (note that this is not on by default in Ensembl, so it must be switched on via the Configuration Panel: 'Configure this page' -> 'Genes and Transcripts'). These genes were not included in the Ensembl comparative analysis, and were not annotated for variations. The relevant gene / transcript can be found by either (i) following the 'Other Browsers' -> 'Ensembl' link in the left-hand menu, or (ii) following the 'Jump to this stable ID in Ensembl' link on Vega Gene and Transcript Summary pages. Vega genes / transcripts that are not present in Ensembl do not have these links active.
- Using Ensembl BioMart (choose 'Vega' from the 'Choose Database' dropdown menu).
- By downloading MySQL dumps of the data from the Ensembl FTP site.