Predicting COVID-19 susceptibility and severity
Recent reports suggest that both clinical and genetic risk factors may contribute to COVID-19 susceptibility and severity. Catherine Ball, Chief Scientific Officer of Ancestry®, discusses results of the company’s COVID-19 Research Study, designed to explore non-genetic and genetic associations with disease outcomes.
One of the more puzzling aspects of SARS-CoV-2, the causative agent of COVID-19, is that infection can produce a remarkably diverse spectrum of outcomes, ranging from asymptomatic to fatal. In the US, most infections result in mild illness that can be managed at home, yet about 14 percent of cases are hospitalised and approximately five percent are fatal.
Known risk factors for severe COVID-19, as identified by epidemiological studies, include common health conditions such as hypertension, diabetes and obesity as well as older age and male sex. For example, reports of higher susceptibility to and severity of SARS-CoV-2 infections in men could suggest important differences in immune response to the virus in men relative to women.
Emerging evidence suggests that genetic variation may contribute to COVID-19 susceptibility and severity. An early genome-wide association study (GWAS) of COVID-19 cases with respiratory failure identified two genetic loci that achieved genome-wide significance: one signal on chromosome 9 near the ABO gene, which determines blood type, and one signal on chromosome 3 near a cluster of genes with known immune function.1 Both genetic signals were later replicated by meta-analyses conducted by the COVID-19 Host Genetics Initiative (HGI), which combines more than 30 individual GWAS. The HGI additionally identified novel associations on chromosome 6, near FOXP4; on chromosome 12, near a gene cluster encoding antiviral restriction enzyme activators; on chromosome 19 near TYK2; and on chromosome 21, near IFNAR2. Multiple studies have also reported evidence of rare-variant associations, though such discoveries have not yet successfully been replicated in independent cohorts.
The growing toll of the COVID-19 pandemic has heightened the urgency of identifying those who are most at risk of infection and severe outcomes; hence, the need for further investigation to assess patterns of susceptibility and severity in large datasets. The Ancestry® COVID-19 Research Study, one of the largest studies of infection susceptibility and severity to date, was designed to:
- determine genetic and non-genetic factors that may be associated with an increased risk of COVID-19 susceptibility or severity; and
- develop risk models to predict an individual’s COVID-19 susceptibility and severity risk using self-reported survey data at scale.
Examining susceptibility and severity through self-reported data
To replicate and discover non-genetic and genetic associations with COVID-19 outcomes, we engaged AncestryDNA adult members in the US – a majority of the 18 million individuals in our global network. On 22 April 2020, we issued a 54-question COVID-19 survey intended to assess exposure, risk factors, symptomatology and demographic information that had previously been identified as associated with COVID-19 susceptibility and severity. Within four weeks, more than 500,000 AncestryDNA customers from all 50 states who consented to participate in research responded, including more than 4,700 individuals with COVID-19, as measured by a self-reported positive nasal swab test. All data were de-identified prior to subsequent analyses.
Rates of hospitalisation calculated from the self-reported positive cases in the Ancestry data are consistent with characteristics seen in a CDC data analysis (10 percent of individuals reported hospitalisation in the Ancestry data compared to 14 percent in the CDC dataset). In addition, these data represent a unique view of the US population, including the range of symptoms experienced by those who tested positive for COVID-19 as well as those who have been exposed to SARS‑CoV-2 but have not experienced any symptoms. From these self-reported outcomes, we assessed susceptibility by comparing those who reported a positive COVID-19 nasal swab test result to those who reported a negative swab test result. We also looked at severity by comparing COVID-19 positive individuals who were hospitalised to COVID-19 positive individuals who were not hospitalised.
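The two outcome definitions above can be sketched in a few lines of Python. This is only an illustration of the case/control logic as described; the field names `swab_result` and `hospitalised` are hypothetical, since the actual survey schema is not given here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    # Hypothetical fields; the real survey schema is not described in the article.
    swab_result: Optional[str]  # "positive", "negative", or None if not tested
    hospitalised: bool


def susceptibility_label(r: Response) -> Optional[int]:
    """1 = positive swab (case), 0 = negative swab (control), None = excluded."""
    if r.swab_result == "positive":
        return 1
    if r.swab_result == "negative":
        return 0
    return None  # untested respondents are not part of this comparison


def severity_label(r: Response) -> Optional[int]:
    """Among positives only: 1 = hospitalised, 0 = not hospitalised."""
    if r.swab_result != "positive":
        return None  # severity is only defined for COVID-19 positive respondents
    return 1 if r.hospitalised else 0
```

Returning `None` for respondents outside a comparison keeps the two analyses separate: a respondent with a negative swab is a control for susceptibility but plays no part in the severity analysis.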
Males have elevated risk of testing positive even after accounting for exposures
We observed significant associations between several risk factors and COVID-19 susceptibility and severity outcomes.2 Given the scale of our database, we were able to account for known exposures to COVID-19, something most other work has not done, in order to understand potential risk factors not explained by differences in exposure. We found males were more likely than females to test positive for COVID-19 (odds ratio [OR]=1.36), even among people of the same age with the same known exposures to COVID-19. This exposure‑adjusted result is novel and distinct from previous reports of elevated severity risk in males. Among those who tested positive for COVID-19, males (6.6 percent) were more likely than females (3.9 percent) to report progression to a critical case of the disease, consistent with CDC findings.
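For readers less familiar with odds ratios, the short sketch below converts the two critical-case percentages quoted above (6.6 percent of males, 3.9 percent of females) into a crude odds ratio. This figure is not reported by the study and is not adjusted for age, exposure or anything else; it only illustrates the arithmetic. The adjusted ORs quoted in the text would come from a regression model, which is not reproduced here.

```python
def odds(p: float) -> float:
    """Convert a probability (0 < p < 1) into odds."""
    return p / (1.0 - p)


def odds_ratio(p_group: float, p_reference: float) -> float:
    """Crude (unadjusted) odds ratio of one group relative to a reference group."""
    return odds(p_group) / odds(p_reference)


# Proportions reporting a critical case among those who tested positive.
crude_or = odds_ratio(0.066, 0.039)  # males relative to females
print(round(crude_or, 2))  # about 1.74
```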
Understanding and predicting risk from self-reported data
People aged 18-29 reported higher exposure to COVID-19 than all other age groups and were at a slightly elevated risk (OR=1.28) for positive diagnosis compared to those aged 50-64, even among people with the same exposure and sex. People aged 65 and older were significantly more likely to be hospitalised (OR=1.60) compared to those aged 50-64, even when accounting for differences in health conditions, obesity and biological sex.
African‑Americans were more likely to develop COVID-19 (OR=1.23) and were also significantly more likely to report progression to a critical case compared to those with European ancestry (OR=2.34), after accounting for health conditions, obesity, age and biological sex.
We developed risk models to predict individualised COVID-19 outcomes and were able to accurately predict an individual’s susceptibility risk based on self-reported demographics, exposures and symptoms. For comparison, we trained a previously published, peer-reviewed susceptibility model3 on our training cohort and found that our models performed slightly better (Ancestry area under the curve [AUC]=0.94, Lit‑model AUC=0.90). We were also able to accurately predict an individual’s severity risk based on self‑reported demographics, pre‑existing conditions and symptoms. The severity risk models performed slightly better than previously reported clinical models despite not relying on clinical risk factors (eg, bloodwork), suggesting that self‑reported data can be used to accurately assess risk of both susceptibility and severity in lieu of clinical data. We assessed the risk models across different age, sex and genetic ancestry cohorts and can report reasonably high performance in all cohorts, highlighting the potential utility and generalisability of these models to the broader population. To our knowledge, the assessment by genetic ancestry is the first of its kind in the COVID-19 risk modelling literature.
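The AUC values quoted above can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. A minimal sketch of that calculation follows; the labels and scores are invented toy values, not study data, and the function is a generic illustration rather than the evaluation code used in the study.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up numbers): three cases, three non-cases.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)  # 8 of 9 case/non-case pairs ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic, so a rank-based implementation would be preferable for cohorts of the size described here.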
Novel genetic associations with COVID-19
To explore possible differences in biological response to COVID-19 infection, we analysed both susceptibility and severity outcomes using sex-stratified GWAS and sex-combined meta-analyses. The aim was to identify genetic determinants associated with COVID-19 susceptibility and severity among the more than 500,000 respondents reporting COVID-19 symptoms, outcomes, risk factors and exposures. These analyses included over 2,400 individuals with COVID-19 and 250 hospitalised cases in a cohort of European ancestry individuals.
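To illustrate how sex-stratified results can be combined into a single estimate, the sketch below applies fixed-effect inverse-variance weighting to per-stratum effect sizes for one variant. The article does not state which meta-analysis method was used, so this choice is an assumption, and the numbers in the example are made up.

```python
import math
from typing import List, Tuple


def ivw_meta(estimates: List[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Fixed-effect inverse-variance meta-analysis.

    estimates: (beta, standard_error) pairs, e.g. one per sex stratum.
    Returns the combined beta, its standard error and the z-score.
    """
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


# Hypothetical log-odds-ratio estimates for one variant: (male, female).
combined_beta, combined_se, z = ivw_meta([(0.2, 0.1), (0.4, 0.1)])
```

Strata with smaller standard errors receive more weight. A signal present in only one sex is diluted in the combined estimate, which is one reason to examine the stratified results alongside the meta-analysis.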
Importantly, we identified three novel loci indicating genetic associations with COVID-19 outcomes.4 The strongest association was near IVNS1ABP, a gene involved in influenza virus replication, and it was observed only in males. It is unclear why this association is present only in males, though it may provide a clue as to why males appear to be at higher risk of COVID-19 infection, hospitalisation and mortality. We speculate that sex hormones or behavioural differences might trigger different cellular responses to COVID-19 infection in men and in women, and one such difference may involve differential expression of IVNS1ABP. The other two novel loci harbour genes with established roles in viral replication or immunity.
Clinical, behavioural and genetic insights into COVID-19 using self-reported data: what is next and important for future study?
Our results add to a growing body of evidence that individual genetic variation contributes to both susceptibility to COVID-19 and severity of illness. These results also suggest that identification of these genetic risk factors could provide profound insight into why COVID-19 manifests differently in individuals, particularly in men.
This research highlights the value of self‑reported epidemiological data at scale to provide public health insights into the evolving COVID-19 pandemic. Further, these survey responses, coupled with genomic data for over 500,000 individuals who have consented to research, provide Ancestry with the unique ability to quickly contribute to the global effort to better understand this disease. We are working to gain a deeper understanding of COVID-19 by investigating genomic and clinical components that influence how people contract and respond to the virus. We believe this information may be useful in the effort to develop treatments, preventatives or vaccines for the disease. In that spirit, we are making a subset of data from this study available to other qualified scientists through the European Genome-phenome Archive (EGA) to help inform their research.
About the author
Cathy Ball, PhD has served as Chief Scientific Officer for AncestryDNA, LLC since September 2016. She joined as Vice President of Genomics and Bioinformatics in 2011, helping to establish the company’s approach to genetic genealogy leading to the launch of AncestryDNA. Cathy is a genomic scientist who has annotated and mined the genomes of various organisms and created resources to help clinicians, citizens and other scientists exploit and explore genome data. Cathy also led the Stanford Microarray Database, the largest academic database of its kind. She has presented seminars at leading universities and contributes to National Institutes of Health committees. She received a BS in Biology and a PhD in Molecular Biology from the University of California, Los Angeles. Cathy was a post-doctoral fellow at the University of California, Berkeley prior to her research in the Departments of Genetics and Biochemistry at Stanford University School of Medicine.
References
1. Ellinghaus D, et al. Genomewide association study of severe Covid-19 with respiratory failure. N Engl J Med. (2020).
2. Knight SC, McCurdy SR, et al. COVID-19 Susceptibility and Severity Risks in a Survey of Over 500,000 People. medRxiv (2020).
3. Allen WE, et al. Population-scale longitudinal mapping of COVID-19 symptoms, behaviour and testing. Nat. Hum. Behav. 4, 972–982 (2020).
4. Roberts GHL, Park DS, et al. AncestryDNA COVID-19 Host Genetic Study Identifies Three Novel Loci. medRxiv (2020).