AION Database - variant statistics

Variant statistics allow you to automatically generate high-quality insights from your cases.

As part of the AION Database (AION DB), AION automatically generates statistical insights from all the cases you submit. These statistics are then annotated onto new cases and can be used as an additional source of insight when analysing a case.

The AION DB is a private, per-customer database used to store data and generate insights. No stored information is shared with other accounts. The AION DB contains two types of data:

  • Population frequency and other statistics of variants in submitted VCF files.
  • Previously classified variants (similar to how ClinVar works), see here.

For both types of data, only small variants (SNPs and indels) are currently supported. CNVs and other variant types will be supported in a future iteration of the functionality.

This article describes the statistical data that is generated. The AION DB also stores variant classification data; please refer to that article for further information.

Access to the statistical insights of the AION Database is activated by our support team. If it is not active for your account, please get in touch with our support team at support@nostos-genomics.com.

Variant statistics

You can use AION DB variant statistics and previous classifications to filter variants in the manual filtering views, through either column filters or advanced filters.

AION currently computes the following statistical data:

  • Allele frequency
  • Allele count
  • Number of homozygous, heterozygous and hemizygous occurrences
  • Total number of samples
  • Number of samples with the variant of interest

Each of these quantities is computed for the total population as well as by affectedness status (affected / not affected). This is intended to accommodate frequency data coming from control populations or from unaffected parents in duo, trio and similar cases.

Importantly, cases with different reference genomes are kept in separate databases.
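As an illustration of how these quantities relate to each other, the following Python sketch computes them for a single variant, both for the total population and split by affectedness status. It is not AION's implementation: the `Observation` record, the zygosity labels, the allele weights (homozygous = 2, heterozygous = 1, hemizygous = 1) and the diploid denominator of 2 × total samples are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """One sample's result for a single variant (hypothetical record layout)."""
    sample_id: str
    zygosity: Optional[str]  # "hom", "het", "hemi", or None if the variant is absent
    affected: bool


# Assumption: alternate alleles contributed by each zygosity.
ALLELES_PER_ZYGOSITY = {"hom": 2, "het": 1, "hemi": 1}


def summarise(observations):
    """Compute the listed statistics for one group of samples."""
    carriers = [o for o in observations if o.zygosity is not None]
    zygosity_counts = Counter(o.zygosity for o in carriers)
    allele_count = sum(ALLELES_PER_ZYGOSITY[o.zygosity] for o in carriers)
    total = len(observations)
    return {
        "total_samples": total,
        "samples_with_variant": len(carriers),
        "allele_count": allele_count,
        "homozygous": zygosity_counts["hom"],
        "heterozygous": zygosity_counts["het"],
        "hemizygous": zygosity_counts["hemi"],
        # Assumption: every sample contributes two alleles to the denominator.
        "allele_frequency": allele_count / (2 * total) if total else 0.0,
    }


def summarise_by_affectedness(observations):
    """Repeat the summary for the total population and per affectedness status."""
    return {
        "total": summarise(observations),
        "affected": summarise([o for o in observations if o.affected]),
        "not_affected": summarise([o for o in observations if not o.affected]),
    }
```

For example, a trio where the affected child is homozygous and both unaffected parents are heterozygous would yield an allele count of 4 in the total group, 2 in the affected group and 2 in the not affected group.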

Where to find the AION DB data

AION displays AION DB data in the AION ranking. Variants that are present in the AION DB are highlighted in the ranking, which shows the variant's allele count in the AION DB, its population frequency and the past interpretations available.

If the variant is found in the internal variant database, a tab will be available containing the past interpretations along with how many cases contain this variant, the zygosity of the variant in those cases and the affectedness status.

Additionally, you can filter by this data in the manual filtering views.

Important considerations

Case Activation

Cases that were submitted more than 7 days ago and have not been accessed during the last 24 hours become inactive. To check the status of a case, open the case; its status is displayed there. To activate the case, press the "Activate" button.

Case activation takes a few minutes (2-10 minutes) and enables filtering by AION DB statistics in the manual filtering - small variants tab.

Once the case is activated, you can refresh the statistics as described in the Timing section below.
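The inactivity rule described above combines two conditions, which the following Python sketch makes explicit. The function name, its parameters and the treatment of the exact boundaries (strictly more than 7 days, strictly more than 24 hours) are assumptions for illustration, not AION's actual logic.

```python
from datetime import datetime, timedelta

INACTIVITY_AGE = timedelta(days=7)     # case must be older than this
ACCESS_WINDOW = timedelta(hours=24)    # and not accessed within this window


def is_inactive(submitted_at: datetime, last_accessed_at: datetime, now: datetime) -> bool:
    """Return True when both inactivity conditions hold (hypothetical helper)."""
    older_than_limit = now - submitted_at > INACTIVITY_AGE
    not_recently_accessed = now - last_accessed_at > ACCESS_WINDOW
    return older_than_limit and not_recently_accessed
```

Under this reading, a case submitted three weeks ago stays active as long as someone opened it within the last 24 hours, and a case submitted four days ago is never inactive regardless of access.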

Timing

Variant statistics are annotated from the annotation-ready data in the AION DB as it is at submission time. The annotation data is refreshed every 30 minutes during business hours (6am-10pm CET), so cases submitted during these hours are annotated with data that is at most 30 minutes old. The date and time of the statistics are always shown for each case.

The AION DB statistics can be updated by pressing the "Refresh" button in the case header information. AION currently supports updates for data analysed using the GRCh37/hg19 reference genome.
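To see what this schedule means for a given submission, the Python sketch below estimates the timestamp of the annotation data a case would receive. It rests on several assumptions that the article does not state: refreshes are aligned to the hour and half hour, they run every day from 06:00 to 22:00 inclusive, and the timestamps passed in are naive datetimes already expressed in CET.

```python
from datetime import datetime, time, timedelta

REFRESH_START = time(6, 0)       # first refresh of the day (assumed)
REFRESH_END = time(22, 0)        # last refresh of the day (assumed)
REFRESH_INTERVAL_MINUTES = 30


def latest_refresh_at_or_before(submitted_at: datetime) -> datetime:
    """Estimate which refresh a submission would be annotated from (hypothetical)."""
    day = submitted_at.date()
    start = datetime.combine(day, REFRESH_START)
    end = datetime.combine(day, REFRESH_END)
    if submitted_at < start:
        # Before the first refresh of the day: use the previous day's last refresh.
        return end - timedelta(days=1)
    if submitted_at >= end:
        # After hours: the data stays as it was at the last refresh of the day.
        return end
    minutes_since_start = int((submitted_at - start).total_seconds() // 60)
    completed = minutes_since_start // REFRESH_INTERVAL_MINUTES
    return start + timedelta(minutes=completed * REFRESH_INTERVAL_MINUTES)
```

A case submitted at 14:47 would, under these assumptions, be annotated with data from 14:30, while a case submitted at 04:00 would receive data from 22:00 the previous evening. The date and time shown on the case remain the authoritative value.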

Ranking

The ranking is not affected by the data in the AION DB. However, the data is clearly displayed for the user's consideration.

Frequency definition

The population frequency is effectively a minimum population frequency, because the VCF files used as input carry no information about which regions were sequenced. A VCF file may lack a specific variant either because the variant is not present in the sequenced individual or because the region where the variant is located was not sequenced. This cannot be resolved using VCF data alone, so we display the calculated population frequency with this caveat.
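A small worked example shows why the displayed value is a lower bound. The numbers are invented for illustration, and the function is a hypothetical helper, not part of AION: it divides the number of samples carrying the variant by a chosen denominator.

```python
def population_frequency(samples_with_variant: int, denominator: int) -> float:
    """Fraction of samples carrying the variant (hypothetical helper)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return samples_with_variant / denominator


samples_with_variant = 3
total_samples = 100            # every VCF file in the database
samples_covering_region = 60   # invented; a VCF file does not record this

# What can be computed from VCF files alone: all samples are in the denominator.
reported = population_frequency(samples_with_variant, total_samples)

# What the value would be if coverage were known: only covered samples count.
coverage_aware = population_frequency(samples_with_variant, samples_covering_region)
```

Because the number of samples covering the region can never exceed the total number of samples, the reported value (3 / 100) can only be equal to or lower than the coverage-aware value (3 / 60).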

Influence of the current case in the statistics

The current case receives no special treatment in the AION DB and counts towards the statistics. However, given the timing of when new data is annotated onto cases, the current case will not count towards the statistics at first. It will only do so if the statistics are manually refreshed (feature in progress) or if the case is relaunched after the annotation-ready data in the AION DB has been refreshed.

Duplicates

We check for duplicates in the AION DB, so if you submit the same VCF file twice, it will not be added again. However, this check does not recognise a re-sequencing of the same person submitted as a new VCF file.
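The article does not say how duplicates are detected. The Python sketch below assumes, purely for illustration, a check based on a SHA-256 hash of the file contents; `SubmissionRegistry` and `vcf_fingerprint` are hypothetical names. It shows why an identical file is rejected while a re-sequenced sample, which produces different file contents, is not.

```python
import hashlib


def vcf_fingerprint(vcf_bytes: bytes) -> str:
    """Content hash of a VCF file (assumed duplicate key, not AION's actual one)."""
    return hashlib.sha256(vcf_bytes).hexdigest()


class SubmissionRegistry:
    """Tracks fingerprints of files that have already been added."""

    def __init__(self) -> None:
        self._seen = set()

    def add(self, vcf_bytes: bytes) -> bool:
        """Return True if the file was added, False if it is an exact duplicate."""
        fingerprint = vcf_fingerprint(vcf_bytes)
        if fingerprint in self._seen:
            return False
        self._seen.add(fingerprint)
        return True
```

With a content-based key like this, any difference between two files from the same person, such as an extra variant call or a changed header, results in a new fingerprint, so both files would count towards the statistics.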

ℹ️ Find further visual support in the following clickable flow: AION DB - Statistics

Submitting data during onboarding

See AION Database - onboarding