Overview

Activity 3 of the larger Silent Genomes Research Project is focused on building the proposed Indigenous Background Variant Library (IBVL). 

What are DNA variants?

DNA (deoxyribonucleic acid) is the genetic material in all of our cells, which we inherited from our parents. DNA provides the instructions for how our bodies work and is made of a long string of four different genetic 'letters' (A, C, G, and T) that our cells can read. The four letters are repeated over and over in different combinations or sequences, with the entire length of DNA in each of our cells containing 3 billion DNA letters. 

DNA sequence variant
Figure 1: A DNA sequence from two individuals. Most of the sequence is the same for both individuals; however, a difference is visible (highlighted at one position). This difference is called a variant. Specifically, this variant is an A to C modification. 

All of humanity shares 99.9% of the same DNA letters. Even though we share most of the same DNA sequence, the 0.1% difference means that there are still many spots in our DNA where the letters differ between people — these letter differences are called 'variants'. In fact, each individual carries millions of variants, and this is part of what makes each of us unique. Most of these variants are harmless (non-disease-causing variants), but some may predispose to a health problem (disease-causing variants).

DNA variants can range from frequent to rare within any population (for example, as shown in Figure 2, 60% of people might have an A and 40% of people might have a C at a given DNA position).

Common variant distribution within population
Figure 2: Representation of common variant distribution within a population of five individuals. The A variant is present in 60% of the population and the C variant is present in 40%. Both variants are considered common in this population.
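For readers who like to see the arithmetic, the percentages in Figure 2 can be reproduced with a few lines of Python. This is only an illustrative sketch: the list of letters is made up to match the figure, it assumes one letter is recorded per person at this DNA position, and the function name `letter_frequencies` is hypothetical.

```python
from collections import Counter

# Hypothetical observations matching Figure 2: the letter each of five
# people carries at one DNA position (one letter recorded per person).
observed_letters = ["A", "A", "A", "C", "C"]


def letter_frequencies(letters):
    """Return the share of people carrying each letter, as a fraction of 1."""
    counts = Counter(letters)
    total = len(letters)
    return {letter: count / total for letter, count in counts.items()}


frequencies = letter_frequencies(observed_letters)
for letter, share in sorted(frequencies.items()):
    print(f"{letter}: {share:.0%}")
```

With these five made-up observations, three of five people carry an A (60%) and two of five carry a C (40%).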

What is a Background Variant Library (BVL)?

A Background Variant Library (BVL) is a list or collection of all the DNA variants in a group of people without severe genetic conditions, which helps determine which variants are common or rare in the general population. If a variant is common in a group of 'healthy' people without severe genetic conditions, it is less likely to be responsible for a rare disease.

A number of BVLs already exist. These libraries are not physical collections of books or lists; instead, the DNA variant information is stored in computer databases. One example of such a library that is publicly available is gnomAD (the Genome Aggregation Database).

Genetic specialists and doctors use these BVLs as references, to learn which DNA variants are common in people from various populations around the world. Remember that each person carries millions of DNA variants. When a patient with a suspected rare genetic disease has genetic testing to try to find the DNA variant causing their condition, it is a big job to sort through all the DNA variants found on their genetic test to find the 'suspect' disease variant that is causing their condition. 

Acting like detectives, doctors can compare their patient's list of variants to those seen in a BVL and quickly rule out any of the common DNA variants found in people without rare diseases from their patient's specific population group. This then allows the doctors to focus on any rare variants on their patient's test report that are not seen (or seen only very rarely) in the library and are more likely to be disease-causing.

In summary, BVLs allow quick elimination of common variants seen in the general population in order to narrow in on the real 'suspects' for quicker diagnosis of genetic conditions.
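The 'ruling out' step described above can be sketched in Python. Everything in this sketch is an assumption made for illustration: the variant names, the frequencies, the 1% cut-off for calling a variant common, and the function name `shortlist_rare_variants` are all hypothetical, and real diagnostic tools weigh much more evidence than frequency alone.

```python
# Hypothetical library: variant name -> fraction of people in the library
# who carry that variant.
background_library = {
    "chr1:1000 A>C": 0.40,
    "chr2:2000 G>T": 0.15,
    "chr3:3000 C>T": 0.0005,
}

# Hypothetical cut-off: variants seen in at least 1% of the library
# are treated as common.
COMMON_THRESHOLD = 0.01


def shortlist_rare_variants(patient_variants, library, threshold=COMMON_THRESHOLD):
    """Keep variants that are absent from the library or rarer than the threshold."""
    return [v for v in patient_variants if library.get(v, 0.0) < threshold]


# Hypothetical patient test report.
patient_variants = [
    "chr1:1000 A>C",
    "chr2:2000 G>T",
    "chr3:3000 C>T",
    "chr4:4000 G>A",  # not in the library at all
]

suspects = shortlist_rare_variants(patient_variants, background_library)
print(suspects)
```

In this made-up example, the two common variants are set aside, leaving one very rare variant and one variant never seen in the library as the 'suspects' to examine further.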

What is an Indigenous Background Variant Library (IBVL)?

The proposed Indigenous Background Variant Library (IBVL) will be a database showing the frequency of each DNA variant found in a population of Indigenous people who have not been diagnosed with a severe genetic condition. It is important that a diversity of populations are represented in BVLs, since variants not seen at all in one group of people may be very common in another.

To date, no IBVL exists and Indigenous Peoples are not well-represented in the BVLs that are currently available. For example, people of European, Asian, African, and Ashkenazi Jewish background are represented in the gnomAD library, but DNA from Indigenous Peoples is absent (Figure 3).

Populations and subpopulations in DNA variant databases
Figure 3: Number of individuals by population and subpopulation in different DNA variant databases. Several databases were produced over time, with an increasing number of individuals. gnomAD is currently the most used database for variant interpretation. However, Indigenous Peoples are under-represented within these databases; therefore, it is more complicated for genetic specialists and doctors to differentiate normal variation from potential disease-causing variants in Indigenous patients. | Source: MacArthur Lab

How will an IBVL be useful?

Each population around the world has certain variants that are unique to them and are not present in the DNA of other populations. The lack of an IBVL means that genetic specialists and doctors do not know which DNA variants are common and non-disease-causing in Indigenous Peoples. This makes diagnosing genetic conditions in Indigenous patients much more difficult and less efficient than in non-Indigenous patients, because doctors are not able to quickly rule out the common variants seen in the general population in order to find the real 'suspect' variants. This unfortunately can result in delayed diagnosis, potentially less effective health management and treatment for patients, and ongoing stress for families due to unanswered questions.

Rare disease variant
Figure 4: Example of how an Indigenous Background Variant Library could be used to diagnose a rare genetic disease. If the child with the T variant is Indigenous, without an IBVL, healthcare providers would not know if the T variant is common in Indigenous Peoples, and therefore would not be able to determine whether it is likely to be disease-causing. However, with an IBVL to compare to, healthcare providers would see that the T variant is absent from the IBVL and therefore is a potential cause of the rare disorder.

In summary, the proposed IBVL will:

  • Help healthcare providers understand what DNA variants are common in Indigenous Peoples in Canada.
  • Help increase equity and reduce health disparities by allowing more efficient diagnosis of genetic diseases for Indigenous families. 

Better diagnosis means improved overall medical care and wellbeing for families. 

How will the IBVL information be stored?

A key principle of the proposed IBVL is that no individual data (a list of all the DNA variants detected in a single person) will be available. Only population frequencies of the variants (how often each variant was seen in the overall population) will be included. That is, variant information from each participant will be merged as one set of data to ensure that no one person’s data could be recognized or accessed in any way. Access to the data will be for the purpose of clinical diagnosis only. Downloading the full data set from the IBVL database will not be allowed.

Individual IBVL data safety
Figure 5: Individual data will not be present in the IBVL; only population data will be present.
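The idea of keeping only population frequencies can be sketched in Python. This is not the project's actual software: the variant names, the four-person cohort, and the function name `population_frequencies` are hypothetical, and the sketch only shows that the merged table records how often each variant was seen, not who carried it.

```python
from collections import Counter


def population_frequencies(per_person_variants):
    """Merge per-person variant sets into one table:
    variant -> fraction of people who carry it."""
    total_people = len(per_person_variants)
    carriers = Counter()
    for variants in per_person_variants:
        # Count each person at most once per variant.
        carriers.update(set(variants))
    return {variant: count / total_people for variant, count in carriers.items()}


# Hypothetical input: one set of variants per de-identified participant.
cohort = [
    {"var1", "var2"},
    {"var1"},
    {"var1", "var3"},
    {"var2"},
]

library = population_frequencies(cohort)
for variant, share in sorted(library.items()):
    print(f"{variant}: {share:.0%}")
```

The resulting table holds one frequency per variant and no per-person lists, which mirrors the principle that only the merged population data would be stored.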

The IBVL database will be housed on an institutional server at the BC Children's Hospital Research Institute (BCCHR) in Vancouver, BC. The server is protected by strict security measures, regularly audited, password-protected, and accessible only to authorized personnel.

How will it be done? 

All the activities within the Silent Genomes Research Project are integrated and teams work in close collaboration. As part of Activity 1 (First Nations, Inuit and Métis Engagement, Governance and Capacity Building), an Indigenous-led Steering Committee is being established in partnership with participating individuals, communities, and provincial and national Indigenous organizations to govern safe use of, and access to, the IBVL.

Once the governance and safe protocols are in place, 1,500 Indigenous participants from across Canada will be invited into the study. Participants who consent will be asked to provide a DNA sample (usually through a blood sample). Alternatively, samples collected previously with consent may be transferred for sequencing with community approval.

To protect privacy, each DNA sample will be de-identified (labelled only with a unique code, which does not contain any identifying information, such as name, date of birth, or community). The de-identified DNA samples will be read (by Whole Genome Sequencing) at the Genome Sciences Centre in Vancouver, BC. After sequencing is complete, the variant frequency (% of people with each variant) will be calculated and displayed in the IBVL. No identifying information will be displayed on participants' samples, sequencing/variant data, or the IBVL database at any point.

Who will be included in the IBVL?

The more individuals the IBVL includes, the more powerful it will be. We are aiming for at least 1,500 Indigenous participants across Canada to provide sufficient background variant information for medical diagnostic purposes, but that will only be the start. Once the governance is in place and additional funding is secured, we hope that the IBVL will include many more Indigenous participants.

Who is doing this study?

The IBVL construction is led by Dr. Wyeth W. Wasserman, a specialist in the use of computers in medicine. Dr. Laura Arbour, a clinical genetics doctor at UBC, in Victoria, BC, is the lead for the overall Silent Genomes Project. Dr. Nadine Caron, Canada's first Indigenous surgeon, is the co-lead for the overall Silent Genomes Project. Other experts in this field (Dr. Ripan Malhi and Dr. Simon Gravel) and a number of trainees are also involved.

Please see our Activity 3 Team.

 

Definitions

DNA: DNA stands for 'deoxyribonucleic acid'. It is the genetic material in all of our cells, which we inherited from our parents. DNA is made of a long string of genetic 'letters' that our cells can read. 

DNA Variant: A 'spelling change' in DNA (e.g., when a genetic letter is changed, added, deleted or moved) is called a 'variant'. Each person carries millions of DNA variants. Most variants are harmless, but some cause a section of our DNA to be misread by our cells and may lead to health problems. 

Gene: A gene is a section of DNA that gives a specific instruction that tells our cells how to function. For example, certain genes code for eye colour, while other genes tell our kidneys or heart how to work. Each person has about 25,000 different genes. Genes come in pairs, as we inherit one set of genes from our father and the other set from our mother.

Genetic Condition: A health condition that is caused by change(s) in a person's DNA (disease-causing DNA variants). Some genetic conditions are passed down from our parents, while others happen for the first time in an individual (due to a new DNA variant).

Genome: The whole package of all the DNA in a person is called their genome. The genome is like the book of all the DNA letters in a person.

Whole Genome Sequencing: A genetic test that can take a detailed look at all the DNA in a person at once. The test reads the sequence of all the 'DNA letters' a person carries (i.e., reads their entire genome).

 
