Data sources

Internal annotations

Internal annotations are used by the pipeline for prioritisation, variant classification, quality control, and preliminary filtering. These are not visible to the user.

| Data source | AION version hg19 | AION version hg38 | Citations |
|---|---|---|---|
| Repeat definition | RepeatMasker from UCSC 2020-02-20 | RepeatMasker from UCSC 2022-10-18 (hg38) | RepeatMasker; Jurka, J et al. 2005 |
| Hotspot definition | v9 (based on ClinVar 20/11/2024) | v9 (based on ClinVar 20/11/2024) | Richards, Sue et al. 2015 |
| NMD | v9 (based on ensGene.txt.gz 06-04-2014) | v9 | Torene, Rebecca I et al. 2024 |
| UniProt functional domains | UniProt Release 06/2020 | UniProt Release 05/2023 | https://www.uniprot.org/ |
| Global frequent artifact blacklist | v2 | v1 native hg38 | - |
| Whitelist of variants to keep with high gnomAD frequency | v1 | v1 liftover | - |
| PP2 gene list | v9 (based on ClinVar 20/11/2024) | v9 (based on ClinVar 20/11/2024) | Richards, Sue et al. 2015 |
| BP1 gene list | v9 (based on ClinVar 20/11/2024) | v9 (based on ClinVar 20/11/2024) | Richards, Sue et al. 2015 |
| PVS1 gene list | v9 (based on ClinVar 20/11/2024) | v9 (based on ClinVar 20/11/2024) | Richards, Sue et al. 2015 |
| Coding region BED file | Based on hg19.ensGene.gtf.gz 10-01-2020 + ClinVar 20/11/2024 | Based on MANE v1.3 + ClinVar 20/11/2024 | Relevant for MANE: Morales, Joannella et al. 2022 |
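The blacklist and whitelist rows above suggest a simple preliminary filter: drop known artifacts, and drop common variants unless they are explicitly whitelisted. The following is a minimal sketch of that idea only, not AION's actual implementation. The `Variant` class, the `passes_preliminary_filter` function, and the 1% frequency cut-off are all assumptions made for illustration; the source does not state the threshold or the data layout the pipeline uses.

```python
from dataclasses import dataclass

# Hypothetical cut-off: the source does not state the frequency threshold used.
MAX_POPULATION_AF = 0.01


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record used only for this sketch."""
    chrom: str
    pos: int
    ref: str
    alt: str
    gnomad_af: float  # gnomAD allele frequency; 0.0 when the variant is absent

    @property
    def key(self):
        # Variants are matched against the lists by (chrom, pos, ref, alt).
        return (self.chrom, self.pos, self.ref, self.alt)


def passes_preliminary_filter(variant, blacklist, whitelist,
                              max_af=MAX_POPULATION_AF):
    """Return True when the variant should be kept.

    blacklist: set of keys for frequent artifacts (always dropped).
    whitelist: set of keys to keep even when the gnomAD frequency is high.
    """
    if variant.key in blacklist:
        return False
    if variant.gnomad_af > max_af and variant.key not in whitelist:
        return False
    return True
```

In this sketch the blacklist takes precedence over the whitelist; whether the real pipeline resolves a conflict the same way is not stated in the source.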

Small variants annotations

Table with annotation sources.

| Data source | Version hg19 | Version hg38 | Citations / link |
|---|---|---|---|
| VEP: effect, ENSP, HGVS, SIFT, PolyPhen | 111 | 111 | McLaren W et al. 2016 |
| ClinVar | 20/11/2024 | 20/11/2024 | Landrum MJ et al. 2018 |
| gnomAD | 2.1.1; 4.1.0 (liftover) | 3.1.2; 4.1.0 | gnomAD; Chen, Siwei et al. 2024 |
| dbscSNV | v1.1 | v1.1 | Jian X et al. 2014 |
| PhyloP | UCSC phyloP46 | UCSC phyloP100 | Pollard KS et al. 2009 |
| PhastCons | UCSC phastCons46 | UCSC phastCons100 | Siepel A et al. 2005 |
| AION Classification (circe predictions) | v1 | lifted over v1 | - |
| RefSeq | Ensembl 111 | - | RefSeq: NCBI Reference Sequence Database |
| Grantham scores | From paper | From paper | R. Grantham 1974 |
| BLOSUM62 | BLOSUM62 | BLOSUM62 | Henikoff, S, and J G Henikoff. 1992 |
| Canonical transcript definition | v9 (based on gnomAD v2.1.1 + APPRIS 2022_02.v47) | v9 (MANE Select v1.3 + MANE Clinical v1.3) | Rodríguez, José Manuel, et al. 2013; relevant for MANE: Morales, Joannella et al. 2022 |
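Because several sources differ between the two genome builds, it can help to keep the versions in a small per-build lookup when recording which annotation release a result was produced with. The sketch below is illustrative only: the `SMALL_VARIANT_SOURCES` dictionary and the `source_version` helper are hypothetical names, and only a few rows of the table are transcribed.

```python
# Versions transcribed from the small variants table; structure is hypothetical.
SMALL_VARIANT_SOURCES = {
    "VEP": {"hg19": "111", "hg38": "111"},
    "ClinVar": {"hg19": "20/11/2024", "hg38": "20/11/2024"},
    "dbscSNV": {"hg19": "v1.1", "hg38": "v1.1"},
    "PhyloP": {"hg19": "UCSC phyloP46", "hg38": "UCSC phyloP100"},
    "PhastCons": {"hg19": "UCSC phastCons46", "hg38": "UCSC phastCons100"},
}


def source_version(source, build):
    """Look up the version of an annotation source for a genome build.

    Raises ValueError with a readable message for an unknown source or build.
    """
    try:
        return SMALL_VARIANT_SOURCES[source][build]
    except KeyError:
        raise ValueError(
            f"no version recorded for source={source!r}, build={build!r}"
        ) from None


if __name__ == "__main__":
    # Print the hg38 versions of every source held in the lookup.
    for name in SMALL_VARIANT_SOURCES:
        print(name, source_version(name, "hg38"))
```

Keeping the build as the inner key makes it easy to spot sources, such as the conservation scores, whose version changes between hg19 and hg38.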

CNV annotation sources

Table with annotation sources.

| Data source | Version hg19 | Version hg38 | Citations |
|---|---|---|---|
| AnnotSV | 3.3.6 | 3.3.6 | Geoffroy, Véronique et al. 2018 |
| Exomiser data | 2202 | 2202 | Smedley, Damian et al. 2015 |
| gnomAD-SV | v2.1 (20/08/2020) | v2.1 (20/08/2020) liftover | Collins, Ryan L et al. 2020 |
| ExAC | v0.3.1 | v0.3.1 | Lek, Monkol et al. 2016 |
| ClinVar | 20/11/2024 | 20/11/2024 | Landrum MJ, Lee JM, Benson M, et al. 2018 |
| ClinGen | 01/12/2024 | 01/12/2024 | Rehm, Heidi L et al. 2015 |
| dbVar | 30/10/2023 | 30/10/2023 | Lappalainen, Ilkka et al. 2013 |
| DDD | 09/2015 (v9.2) | 09/2015 (v9.2) | Firth, Helen V et al. 2009 |
| DGV | 25/02/2020 | 25/02/2020 | MacDonald, Jeffrey R et al. 2014 |
| 1000 Genomes | 21/05/2017 | 21/05/2017 | 1000 Genomes Project Consortium et al. 2015 |
| Ira M. Hall’s lab | 31/12/2018 Static paper https://www.biorxiv.org/content/10.1101/508515v1.supplementary-material (liftdown) | 31/12/2018 Static paper https://www.biorxiv.org/content/10.1101/508515v1.supplementary-material | Abel, Haley J et al. 2020 |
| Children’s Mercy Research Institute | Liftdown 2021-10-27 Static GitHub VCF file | 2021-10-27 Static GitHub VCF file (in GRCh38) | Kirsche, Melanie et al. 2023 |
| RefSeq | 17/08/2020 | 17/08/2028 | https://www.ncbi.nlm.nih.gov/refseq/ |
| Ensembl | 07/02/2014 | 04/06/2022 | https://www.ensembl.org/index.html |
| Breakpoints | 20/03/2009 | 23/01/2014 | Perez, Gerardo et al. 2024 |
| Repeats | 20/02/2020 | 18/10/2022 | http://genome.ucsc.edu/cgi-bin/hgTables |
| Segmental duplications | 26/09/2011 | 14/10/2014 | |
| ENCODE blacklist | v2 (2018) wgEncodeDacMapabilityConsensusExcludable.bed.gz | v2 (2018) hg38.blacklist.bed.gz | https://github.com/Boyle-Lab/Blacklist/ |
| GAP | 15/10/2021 | 15/10/2021 | http://genome.ucsc.edu/cgi-bin/hgTables |
| Cytoband | 14/06/2009 (GRCh37) | 28/10/2022 (GRCh38) | http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/cytoBand.txt.gz |
| TAD boundaries | 15/04/2024 | 21/11/2017 | https://www.encodeproject.org/ |
| ACMG | Static paper (https://pubmed.ncbi.nlm.nih.gov/35802134/) v3.1 78 genes | Static paper (https://pubmed.ncbi.nlm.nih.gov/35802134/) v3.1 78 genes | |

Disease & symptom annotation sources

Table with annotation sources.

| Data source | Version | Citations / link |
|---|---|---|
| HPO | Downloaded 02/05/2024 | https://hpo.jax.org/ |
| Mondo | Downloaded 02/05/2024 | http://purl.obolibrary.org/obo/mondo.json; https://mondo.monarchinitiative.org/ |
| GenCC | Downloaded 02/05/2024 | https://search.thegencc.org/ |
| HGNC | Downloaded 02/05/2024 | https://www.genenames.org/ |
| DDG2P | Downloaded 02/05/2024 | https://www.deciphergenomics.org/ddd/ddgenes |
| PanelApp | Downloaded 02/05/2024 | https://panelapp.genomicsengland.co.uk/ |