General recommendations
In addition to the VCF format checklist in VCF format, consider the following points to guarantee the best results:
- Does your VCF file contain variants with enough quality?
  - All variants in the VCF file that do not fulfil the following quality criteria will be filtered out by AION to avoid evaluating false positive variants. This means that even if your file has the required format and fields, it may contain low-quality data, in which case wrong results or no results may be produced.
  - Small variants:
    - Depth (DP) > 10x
    - Variant allele frequency (VAF) > 20%
  - CNV / SV variants:
    - FILTER = PASS
    - AION CNV analysis does not hard filter on other quality metrics due to the high variability of the scales of these metrics between different variant callers. However, AION CNV analysis considers variants fulfilling the criteria outlined above (VCF format | CNV / SV variants VCFs) as high quality and ranks them higher.
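As a rough local pre-check of the small-variant thresholds above, the sketch below tests a single VCF record. It assumes that the first sample column carries `DP` and `AD` in FORMAT, and that VAF can be approximated as the depth of the first ALT allele divided by the summed allele depths. How AION itself derives these values is not described here, so the function name and the field choice are illustrative only.

```python
def passes_small_variant_thresholds(vcf_line, min_depth=10, min_vaf=0.20):
    """Return True if the first sample has DP > min_depth and VAF > min_vaf.

    Assumption (not from the AION documentation): DP and AD are present in
    FORMAT, and VAF = AD[first ALT] / sum(AD).
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no FORMAT/sample columns to inspect
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    try:
        depth = int(sample["DP"])
        allele_depths = [int(x) for x in sample["AD"].split(",")]
    except (KeyError, ValueError):
        return False  # missing or "." values: quality cannot be confirmed
    total = sum(allele_depths)
    if total == 0 or len(allele_depths) < 2:
        return False
    vaf = allele_depths[1] / total
    return depth > min_depth and vaf > min_vaf


# Hypothetical records used only to exercise the function.
good = "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP:AD\t0/1:30:18,12"
shallow = "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP:AD\t0/1:8:6,2"
low_vaf = "chr1\t300\t.\tG\tA\t50\tPASS\t.\tGT:DP:AD\t0/1:40:36,4"
```

The comparisons are strict, matching the "> 10x" and "> 20%" wording, so a record with a depth of exactly 10 does not pass.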
- What variant callers can be used to generate VCF files?
  - In principle, all variant callers can be used to generate the VCF for small variants, but some tools do not normalise variant calls by default (e.g. their output contains multi-allelic variants that have not been standardised following best practices). This can result in variant annotation problems: please check that variant calls have been normalised before submitting the files.
  - For CNVs and SVs, AION has been validated on Canvas, Manta, Dragen CNV and Dragen SV. Other variant callers may be supported if they adhere to the same standards.
- Should I apply any variant filtering before submitting the files?
  - Best results are guaranteed when low-quality and artifactual variants (generated during the sequencing process) have been filtered out from the VCF files.
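One visible symptom of non-normalised small-variant calls is a record whose ALT column lists several alleles. The sketch below scans VCF lines for such multi-allelic records. It is a partial check under that single assumption: it does not verify left-alignment or trimming, and the function name is hypothetical.

```python
def find_multiallelic_records(vcf_lines):
    """Yield (chrom, pos, alt) for records whose ALT column lists several alleles."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # malformed record: not enough columns to read ALT
        chrom, pos, alt = fields[0], fields[1], fields[4]
        if "," in alt:  # comma-separated ALT means more than one alternate allele
            yield chrom, int(pos), alt


# Hypothetical input used only to exercise the function.
example_lines = [
    "##fileformat=VCFv4.3",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT,G\t50\tPASS\t.",
]
multiallelic = list(find_multiallelic_records(example_lines))
```

If the list is not empty, the file probably needs to be normalised with a dedicated tool before submission.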
Error codes
If an error occurs, the user is informed through one of the following messages:
Error code | Category | Explanation | What to do |
---|---|---|---|
E103 | General | Failed on decompressing input file. Only gzipped and bgzipped VCF files are supported | Use another tool to compress the VCF file or contact support if needed |
E104 | General | Some of the provided gene symbols or HGNC IDs are invalid | Validate the input and contact support if needed |
E140 | VCF integrity | Provided samples do not have unique sample IDs. More than one file has the same sample ID | Merge the VCFs with repeated IDs and submit again |
E141 | VCF integrity | Required fields check failed for input file! Some required fields are missing from the VCF | See required fields above and contact support if needed |
E142 | VCF integrity | Import check failed for input file | Contact support |
E143 | VCF integrity | VCF file could not be checked for mandatory columns | Contact support |
E144 | VCF integrity | VCF file is missing some columns | See required fields above and contact support if needed |
E145 | VCF integrity | VCF file truncated | Validate the input and contact support if needed |
E146 | VCF integrity | GVCF file detected. | GVCF is currently not a supported input format; please convert to VCF. Contact support if needed |
E147 | VCF integrity | Structural variant file submitted to SNV pipeline or the other way around. | See Creating a case for information on how to submit a case. Contact support if needed. |
E150 | VCF Processing | VCF consistency checks failed due to missing definitions in header | See required fields above and contact support if needed |
E151 | VCF Processing | No variants in the selected region. Update your in silico panel filters if they were applied | If you added in silico panel filters, resubmit the case selecting other genes; otherwise, the content of your VCF may not overlap with AION’s supported region. Contact support if needed. |
E162 | Multisample VCF | Found more than one sample in input VCF! Currently, AION only supports singletons or trios. | Split your VCF into individual samples and submit again. Contact support if needed. |
E170 | Reference genome mismatch | A mismatch was found between the reference genome provided and the input VCF files | Verify the provided reference genome is correct |
E301 | Sample matching | The sample IDs in the small and structural variant VCFs do not match for the same sample. | Verify the sample IDs in the VCF files |
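Some of these conditions can be spotted locally before a file is submitted. The sketch below checks two of them: whether the file starts with the gzip magic bytes (related to E103) and how many sample columns the `#CHROM` header line declares (related to E162). The helper names and the mapping to error codes are assumptions made for illustration; AION's own checks are not described here and may differ.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip file (bgzip output is gzip-compatible)


def is_gzip_compressed(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def read_sample_ids(path):
    """Return the sample IDs from the #CHROM header line of a gzipped VCF."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                columns = line.rstrip("\n").split("\t")
                return columns[9:]  # columns after FORMAT are sample IDs
            if not line.startswith("#"):
                break  # reached data lines without finding a header
    return []


def preflight(path):
    """Return a list of likely problems, labelled with the related error code."""
    if not is_gzip_compressed(path):
        return ["E103: file is not gzip/bgzip compressed"]
    problems = []
    samples = read_sample_ids(path)
    if len(samples) > 1:
        problems.append("E162: more than one sample: " + ", ".join(samples))
    return problems
```

An empty list from `preflight` only means that these two checks passed; it does not guarantee that the file will be accepted.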
Additional resources:
- Guidelines for VCF files: https://samtools.github.io/hts-specs/VCFv4.3.pdf
- Tools to validate VCF files:
-