AVDB

Help Page – AVDB: Arab Variation and Disease Burden

About AVDB

AVDB is a curated, publicly accessible database cataloging pathogenic and likely pathogenic genetic variants in the Emirati population. It serves as a valuable tool for clinicians, genetic counselors, researchers, and policymakers focused on rare diseases, carrier screening, and population genomics.

AVDB was built to:

  • Address the lack of Arab representation in major databases like gnomAD and ClinVar.
  • Empower carrier screening and preventive health programs in consanguineous populations.
  • Share population-specific variant data following ACMG/AMP guidelines.

Data Sources

AVDB integrates data from two major sources:

1. Clinical Rare Disease Cohort (N = 1,333)

  • 346 individuals tested internally at Al Jalila Children's Specialty Hospital (Dubai).
  • 987 individuals curated from published literature (2010–2022), screened and annotated using standard criteria.

2. Independent Emirati Exome Dataset (N = 1,194)

  • Samples sequenced in a CAP-accredited laboratory.
  • High-quality exome-wide variant calling and annotation.
  • Filtered for autosomal recessive pathogenic (P) and likely pathogenic (LP) variants.

Variant Curation & Classification

  • All variants are interpreted using ACMG/AMP guidelines, with support from ClinVar, dbSNP, segregation data, and functional annotations.
  • Only P/LP variants are included in the final dataset.
  • Internal data helped classify novel variants not present in existing databases.

Frequency & Risk Estimates

For each gene and variant, the following are calculated:

  • Allele frequencies in Emirati exomes
  • Carrier rates per gene and condition
  • First-cousin risk modeling, using 2q(1−q) × 1/8, where q is the pathogenic allele frequency (equivalently 2pq × 1/8 with p = 1 − q)
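
The three quantities listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not AVDB's actual pipeline: the function names are hypothetical, the denominator assumes the 1,194-exome dataset (2,388 alleles), and the allele count in the usage lines is made up.

```python
# Minimal sketch of the frequency and risk estimates described above.
# Function names are hypothetical; they are not part of AVDB itself.

TOTAL_ALLELES = 2 * 1194  # 1,194 exomes, two alleles each


def allele_frequency(allele_count: int, total_alleles: int = TOTAL_ALLELES) -> float:
    """Fraction of observed alleles carrying the variant (q)."""
    if not 0 <= allele_count <= total_alleles:
        raise ValueError("allele_count must be between 0 and total_alleles")
    return allele_count / total_alleles


def carrier_rate(q: float) -> float:
    """Hardy-Weinberg heterozygote frequency, 2q(1 - q)."""
    return 2 * q * (1 - q)


def first_cousin_risk(q: float) -> float:
    """Carrier rate scaled by 1/8, following the formula 2q(1 - q) x 1/8."""
    return carrier_rate(q) / 8


# Usage with a made-up allele count (not taken from the database).
q = allele_frequency(12)
print(f"allele frequency:  {q:.4%}")
print(f"carrier rate:      {carrier_rate(q):.4%}")
print(f"first-cousin risk: {first_cousin_risk(q):.6f}")
```

Values displayed in AVDB may be rounded or aggregated differently, so small differences from this sketch are possible.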

Annotation Tools

The platform includes links to external tools:

  • Ensembl – genome and transcript data
  • UCSC Genome Browser – region visualization and conservation
  • ClinVar and dbSNP – variant-level significance and population data

Data Access

  • Searchable by gene, variant, disorder, or genomic coordinates
  • Filterable by screening panel inclusion and carrier frequency
  • Exportable datasets for further use

All curated variants have been submitted to ClinVar (ID: SUB14857280).
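
An exported dataset can be filtered offline in the same way as in the web interface. The sketch below assumes a CSV export with the hypothetical columns `gene`, `in_panel`, and `carrier_rate_pct`; the real export may use different column names, and the sample rows are placeholders rather than AVDB data.

```python
import csv
import io

# Hypothetical export: column names and rows are illustrative only.
SAMPLE_CSV = """gene,in_panel,carrier_rate_pct
GENE_A,Yes,2.5
GENE_B,No,1.2
GENE_C,Yes,0.4
"""


def filter_genes(csv_text, min_carrier_rate_pct, panel_only=True):
    """Return gene symbols that pass the carrier-rate and panel filters."""
    selected = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Skip genes outside the screening panel when panel_only is set.
        if panel_only and row["in_panel"].strip().lower() != "yes":
            continue
        if float(row["carrier_rate_pct"]) >= min_carrier_rate_pct:
            selected.append(row["gene"])
    return selected


print(filter_genes(SAMPLE_CSV, min_carrier_rate_pct=1.0))
```

To work with a downloaded file instead of the inline sample, read the file's contents into a string and pass it as `csv_text`.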

How to Use the Database

1. Gene Browser
Functionality: Explore gene-level statistics from the Arab cohort.

Example: A clinician interested in common recessive disorders filters genes by:

  • Screening Panel: Included
  • Min Carrier Rate (%): 1.0

They identify CFTR with:

  • AJGC Frequency: 6.698%
  • Carrier Frequency: 10.67%
  • At-Risk Couples: 0.0144

They click on CFTR to investigate pathogenic variants linked to cystic fibrosis.

Screenshot:

Gene Browser IMG
Column descriptions:

  • Gene – Name of the gene based on HGNC nomenclature.
  • Disorder – The primary genetic disorder(s) associated with the gene, based on clinical annotations.
  • Allele Count – Total number of alternative alleles observed in the 1,194 Emirati individuals (out of 2,388 total alleles).
  • Allele Frequency – Combined frequency of pathogenic alleles in the gene in the Arab Genome cohort (AJGC); calculated as allele count / 2,388.
  • Carrier Frequency – Estimated frequency of heterozygous carriers in the population, calculated using the Hardy-Weinberg approximation (2pq).
  • Max At-Risk Couples Rate – Estimated proportion of couples at risk of having affected offspring if they are first cousins (2pq × 1/8).
  • In Panel – Indicates whether the gene is part of the Emirati premarital screening panel (Yes/No).
  • View In Ensembl – Direct link to the gene’s entry in the Ensembl Genome Browser.
  • View In UCSC – Direct link to the gene’s location on the UCSC Genome Browser.


2. Variant Browser
Functionality: View and filter pathogenic variants within genes.

Example: A geneticist searches for variants in CFTR with a Min Allele Frequency of 1%. They find:

  • Variant: NM_000492.3:c.1521_1523del
  • HGVS p.: p.Phe508del
  • Allele Count: 24
  • Carrier Rate: 4.753%
  • At-Risk Couples: 0.00475

Variant Browser IMG
Column descriptions:

  • Chrom – Chromosome number where the variant is located.
  • Chromosome Position – Genomic position (GRCh38/hg38) of the variant within the specified chromosome.
  • Gene Name – The gene in which the variant is located, based on standardized HGNC nomenclature.
  • HGVS c. (Clinically Relevant) – Coding-level Human Genome Variation Society (HGVS) notation describing the variant’s nucleotide change.
  • HGVS p. (Clinically Relevant) – Protein-level HGVS notation showing the predicted amino acid change caused by the variant.
  • Allele Count – Number of times the variant allele was observed in the 1,194 individuals (out of 2,388 total alleles).
  • Allele Frequency – Frequency of the variant allele in the AJGC cohort, calculated as allele count / 2,388.
  • Carrier Rate – Estimated proportion of individuals heterozygous for this variant, based on the Hardy-Weinberg approximation.
  • At-Risk Couples Rate – Estimated frequency of first-cousin couples both carrying the variant, used to assess reproductive risk.
  • View In Ensembl – Direct link to the variant's position in the Ensembl Genome Browser for additional functional context.
  • View In UCSC – Direct link to the variant in the UCSC Genome Browser to explore tracks like conservation and annotations.


3. Gene Detail View
Functionality: Summarized view of variants linked to a selected gene.

Example: Clicking on PAH from the Gene Browser opens its detail page, listing:

  • Columns: Variant, HGVS c., HGVS p., Allele Frequency, Carrier Rate, ACMG
  • Example row: c.1066-11G>A, -, -, 0.03%, 0.06%

This view helps clinicians assess variant-level significance and inheritance patterns.

Gene Detail IMG


4. Screening Panel
Functionality: Review recommended genes for premarital screening.

Example: A policymaker views the panel, identifying PAH and BTD as high-risk genes. They can view:

  • Rationale: Based on high carrier frequency and clinical severity
  • Carrier Rate: ≥0.3%
  • ML Risk Score: Computed using allele metrics + gene burden

Filters allow dynamic prioritization based on risk score or inclusion status.

Screening Panel IMG
Column descriptions:

  • Gene Name – The standard HGNC-approved symbol representing the gene included in the screening panel.
  • Disorder – The associated autosomal recessive disorder(s) linked to this gene, based on clinical and literature evidence.
  • Allele Freq – The frequency of pathogenic variants in this gene within the AJGC cohort (based on 1,194 individuals or 2,388 alleles).
  • Variant Count – Total number of unique pathogenic variants identified in the AJGC cohort for this gene.
  • Included – Indicates whether the gene is included (“Yes”) in the recommended screening panel based on risk and clinical criteria.
  • Rationale – Brief explanation for why the gene was included (e.g., high carrier rate, severe phenotype, high population frequency).
  • Carrier Rate – Estimated proportion of carriers for this gene assuming Hardy-Weinberg equilibrium, based on aggregated variant frequencies.
  • 1st Cousin Risk Rate – The estimated probability that both partners in a first-cousin marriage carry a pathogenic variant in the same gene.
  • ML Risk Score – A machine learning-derived score estimating the relative risk or priority of including the gene in a screening panel.
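
The gene-level columns described above can be approximated from variant-level allele counts. The sketch below is one plausible reading of "aggregated variant frequencies", not a confirmed description of AVDB's method: it assumes that allele counts of all pathogenic variants in a gene are summed before applying the Hardy-Weinberg approximation and the 1/8 first-cousin factor. The function name, gene labels, and counts are hypothetical, and the ML Risk Score is not modeled.

```python
from collections import defaultdict

TOTAL_ALLELES = 2388  # 1,194 individuals, two alleles each


def gene_level_metrics(variant_counts):
    """Aggregate (gene, allele_count) pairs into per-gene panel metrics.

    Assumption: per-gene allele frequency is the sum of its variants'
    allele counts divided by the total number of alleles.
    """
    totals = defaultdict(int)
    n_variants = defaultdict(int)
    for gene, count in variant_counts:
        totals[gene] += count
        n_variants[gene] += 1

    metrics = {}
    for gene, total in totals.items():
        q = total / TOTAL_ALLELES
        carrier = 2 * q * (1 - q)  # Hardy-Weinberg heterozygote frequency
        metrics[gene] = {
            "variant_count": n_variants[gene],
            "allele_freq": q,
            "carrier_rate": carrier,
            "first_cousin_risk": carrier / 8,
        }
    return metrics


# Made-up variant-level counts for two placeholder genes.
example = [("GENE_A", 3), ("GENE_A", 9), ("GENE_B", 6)]
for gene, values in gene_level_metrics(example).items():
    print(gene, values)
```

Summing counts treats each variant as listed once per gene and ignores individuals carrying two different variants in the same gene, so the result is an approximation.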


Key Features Highlighted

Features and their benefits:

  • Interactive filtering – Allows precise cohort-level queries
  • Gene-to-Variant Linking – Easy navigation from gene to pathogenic variants
  • ML Risk Scoring – Facilitates ranking genes for screening
  • External Links – Integrated access to Ensembl, UCSC, ClinVar, and dbSNP
  • Downloadable CSVs – Supports offline analysis

Intended Users

This platform benefits:

  • Clinicians – Diagnosis and counseling
  • Researchers – Population genomics studies
  • Policy Makers – Public health program planning
  • Genetic Counselors – Carrier risk evaluations

Citation

Please cite AVDB data as:

[Author List]. AVDB: Arab Variation and Disease Burden Database. [Journal, Year – once published].

Contact Us

Have questions or feedback?

Email: info@avdb-arabgenome.ae
Website: https://avdb-arabgenome.ae