Repeat feature annotation

If repeat features are present in the INSDC record when a genome is loaded, they are imported into Ensembl Genomes. For bacterial genomes, this is currently the only source of repeat data. For other divisions, a computational pipeline is also run to annotate three types of repeat:

  • Low-complexity regions (Dust [1])
  • Tandem repeats (TRF [2])
  • Complex repeats (RepeatMasker [3])

Annotating repeats with RepeatMasker requires a repeat library. In most cases, a species-specific library is not available, so the Repbase [4] database of eukaryotic repetitive elements is used. Repeat libraries from the following sources are used and combined where possible:

Viewing and accessing repeat features

By default, repeat features are not displayed in the genome browser; to display them, use the "Configure this page" option. You can view all repeats, or a subset of repeats based on type.

The repeat annotations can be accessed programmatically using the Ensembl API. See the RepeatFeature and RepeatFeatureAdaptor documentation for further details.
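
The RepeatFeature and RepeatFeatureAdaptor classes belong to the Ensembl Perl API. As an alternative route, the sketch below uses Python and assumes that the public Ensembl REST server (https://rest.ensembl.org) exposes repeats for the species of interest through its overlap/region endpoint with feature=repeat. The server address, the helper names repeat_overlap_url and fetch_repeats, and the example species and region are illustrative assumptions, not part of the documentation above.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Ensembl REST server holds the species of interest.
REST_SERVER = "https://rest.ensembl.org"


def repeat_overlap_url(species, region, server=REST_SERVER):
    """Build the overlap/region URL that requests repeat features.

    species: REST species name, e.g. "arabidopsis_thaliana" (example only).
    region:  "seq_region:start-end", e.g. "1:1000-2000" (example only).
    """
    path = "/overlap/region/{}/{}".format(
        urllib.parse.quote(species),
        urllib.parse.quote(region, safe=":-"),
    )
    query = urllib.parse.urlencode({"feature": "repeat"})
    return "{}{}?{}".format(server, path, query)


def fetch_repeats(species, region, server=REST_SERVER):
    """Request the repeats over the network and return the decoded JSON."""
    request = urllib.request.Request(
        repeat_overlap_url(species, region, server),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling fetch_repeats("arabidopsis_thaliana", "1:1000-2000") would issue one request for the repeats overlapping that region. The fields of each returned record are not described here, so inspect the response before relying on any of them.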

For Ensembl Plants species only, tandem repeats annotated by the TRF program are not used to soft- and hardmask the genome sequences.
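
To make the masking terms concrete: soft-masking writes repeat bases in lower case, while hard-masking replaces them with N. The following is a minimal sketch of that idea, not the pipeline Ensembl Genomes runs. The function name mask_sequence, the (start, end, source) tuple layout with 1-based inclusive coordinates, and the source labels "repeatmasker" and "trf" are all hypothetical.

```python
def mask_sequence(sequence, repeats, hard=False, skip_sources=()):
    """Return a masked copy of `sequence`.

    repeats:      iterable of (start, end, source) tuples, with 1-based
                  inclusive coordinates (hypothetical layout).
    hard:         False lower-cases repeat bases (soft-mask);
                  True replaces them with "N" (hard-mask).
    skip_sources: sources whose repeats are left unmasked, e.g. ("trf",)
                  to mimic the Ensembl Plants exception described above.
    """
    masked = list(sequence.upper())
    for start, end, source in repeats:
        if source in skip_sources:
            continue
        # Convert to 0-based indices and clamp to the sequence bounds.
        for i in range(max(start, 1) - 1, min(end, len(masked))):
            masked[i] = "N" if hard else masked[i].lower()
    return "".join(masked)


# Hypothetical repeat annotations on a ten-base sequence.
example_repeats = [(3, 5, "repeatmasker"), (8, 9, "trf")]

soft_all = mask_sequence("ACGTACGTAC", example_repeats)
soft_no_trf = mask_sequence("ACGTACGTAC", example_repeats, skip_sources=("trf",))
hard_no_trf = mask_sequence("ACGTACGTAC", example_repeats, hard=True,
                            skip_sources=("trf",))
```

Here soft_all is "ACgtaCGtaC", soft_no_trf is "ACgtaCGTAC", and hard_no_trf is "ACNNNCGTAC": once the tandem repeat is skipped, only the complex repeat at positions 3-5 is masked.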

References

  1. Morgulis A et al. (2006) A fast and symmetric DUST implementation to mask low-complexity DNA sequences. J Comput Biol. 13:1028-1040
  2. Benson G (1999) Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27:573-580
  3. Smit AFA, Hubley R, Green P (1996-2010) RepeatMasker Open-3.0 http://www.repeatmasker.org
  4. Jurka J et al. (2005) Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 110:462-467