Understanding racial and ethnic representation in cancer biorepositories
Racial and ethnic cancer disparities persist despite tremendous strides in our understanding of cancer biology. Large biorepositories, which are essentially databases that contain biological samples and, often, genomic sequencing data, have been critical to enabling cancer research, but they have historically consisted of mostly white patients. Prior studies have shown insufficient representation of minority groups in biorepositories, which limits the generalizability of findings from studies that draw on such a narrow patient population. One such biorepository is the American Association for Cancer Research (AACR) Project Genomics Evidence Neoplasia Information Exchange (GENIE), a large, publicly available international cancer biorepository containing both clinical and genomic data.
Multiple groups have used Project GENIE to describe novel genomic variations between racial categories in different tumors. Without a better understanding of this resource, however, our concern is that inadequate representation of racial/ethnic minorities within Project GENIE may produce spurious findings or inadvertently contribute to health care disparities by excluding certain groups. Our recent publication therefore sought to compare the population included in Project GENIE with the general cancer population of the United States, using the Centers for Disease Control and Prevention Wide-ranging Online Data for Epidemiologic Research (CDC WONDER) database, and to determine whether Project GENIE is adequately powered for genomic comparisons between racial/ethnic groups.
Racial/ethnic minority groups are under-represented across most cancer types
The six racial/ethnic groups examined were White, Asian, Pacific Islander, Black, Hispanic, and Native American. The representation of these groups was analyzed in 16 types of cancer, and we defined “representation” as the ratio of the actual number of GENIE samples to the expected number of samples per cancer type based on the respective incidences in CDC WONDER, so that a ratio below 1 indicates under-representation. Overall, White patients had adequate or over-representation across nearly all cancer types, as did the combined Asian/Pacific Islander group. In contrast, Black and Hispanic patients were largely under-represented in the database, and Native American patients often had no samples at all in a given cancer type. These disparities in representation suggest that Project GENIE does not accurately reflect the true distribution of cancers in the United States, and may limit our ability to address disparities in cancer outcomes for these minority groups.
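The ratio defined above can be illustrated with a minimal sketch. It assumes the expected count for a group is the total number of GENIE samples for a cancer type multiplied by that group's share of incident cases in CDC WONDER; the study's exact calculation may differ. The function name and all counts below are hypothetical placeholders, not data from GENIE or CDC WONDER.

```python
def representation_ratios(genie_counts, wonder_incidence):
    """Return observed/expected sample ratios per group for one cancer type.

    genie_counts:     {group: number of samples in the biorepository}
    wonder_incidence: {group: number of incident cases in the reference population}
    Assumption: expected = total repository samples * group's share of incidence.
    """
    total_samples = sum(genie_counts.values())
    total_incidence = sum(wonder_incidence.values())
    ratios = {}
    for group, incidence in wonder_incidence.items():
        expected = total_samples * incidence / total_incidence
        observed = genie_counts.get(group, 0)  # groups with no samples count as 0
        ratios[group] = observed / expected if expected > 0 else float("nan")
    return ratios


# Hypothetical counts for a single cancer type (illustration only).
genie = {"White": 800, "Black": 100, "Hispanic": 100}
wonder = {"White": 7000, "Black": 2000, "Hispanic": 1000}

for group, ratio in representation_ratios(genie, wonder).items():
    status = "under-represented" if ratio < 1 else "adequate or over-represented"
    print(f"{group}: ratio {ratio:.2f} ({status})")
```

In this made-up example the Black group makes up 20% of incident cases but only 10% of samples, which gives a ratio of 0.5.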
Project GENIE is not powered to detect mutational differences between all racial/ethnic groups
The field of precision oncology fundamentally aims to identify why one patient might respond extremely well to a treatment while another might not be affected at all. This is often achieved by comparing genomic profiles between two distinct phenotypes, such as responders and non-responders, to identify potential driver mutations. Various groups have used Project GENIE to compare mutational profiles in this way between racial/ethnic minority and white cohorts, with the hope of explaining observed disparities in cancer outcomes. However, this approach has two potential drawbacks. First, race is a social construct whose definition can vary by generation and geography. Any discoveries about mutational differences between such groups lack context, since Project GENIE does not contain the relevant socioeconomic data or other factors that might confound a comparison. Second, prior to our study, it was unclear whether this database was sufficiently powered to make such a comparison in the first place. Our findings suggest that Project GENIE is not powered to detect subtle mutational differences between white and non-white patients in several common cancer types. For example, there was a paucity of samples from Black, Asian, Hispanic, Native American, and Pacific Islander patients for both primary and metastatic prostate cancer, which limits our ability to detect potentially relevant mutational differences between such cohorts.
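To make the notion of "powered" concrete, the sketch below approximates the power of a two-sided test comparing a mutation's frequency between two cohorts, using a normal approximation with unpooled variance. This is a generic textbook calculation and not necessarily the method used in the study; the function name, frequencies, and cohort sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in proportions.

    p1, p2: assumed true mutation frequencies in the two cohorts
    n1, n2: cohort sizes
    Uses the normal approximation with unpooled standard error.
    """
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    if se == 0:
        return 0.0  # degenerate case: no variability in either cohort
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    effect = abs(p1 - p2) / se
    # Probability of rejecting in either tail under the assumed frequencies.
    return std_normal.cdf(effect - z_crit) + std_normal.cdf(-effect - z_crit)


# Hypothetical scenario: a mutation at 10% vs 15% frequency,
# with a large majority cohort and minority cohorts of varying size.
for minority_n in (50, 200, 1000):
    power = two_proportion_power(0.10, 0.15, n1=5000, n2=minority_n)
    print(f"minority cohort n={minority_n}: power ~ {power:.2f}")
```

Under these assumed frequencies, power is driven almost entirely by the smaller cohort: enlarging the majority cohort does little when the minority cohort has only a few dozen samples.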
Minority groups have historically been excluded from large biorepositories for a variety of reasons, and disparities in cancer outcomes will likely continue without additional efforts to improve representation and inclusion. Project GENIE provides public access to a rich clinico-genomic dataset and is critical for democratizing and accelerating precision oncology research. However, our findings suggest that these data do not reflect the true landscape of cancer patients in the US and thus may misrepresent the disease burden in many racial/ethnic minority populations. In addition, many more variables than genomics alone contribute to cancer disparities, and it is important to recognize what types of questions these data can answer. Our hope is that this study will shed light on the strengths and limitations of this valuable resource and inform future projects that aim to address racial/ethnic disparities in precision oncology research.
- Giaquinto, A. N. et al. Cancer statistics for African American/Black People 2022. CA Cancer J. Clin. 72, 202–229 (2022).
- Spratt, D. E. et al. Racial/ethnic disparities in genomic sequencing. JAMA Oncol. 2, 1070–1074 (2016).
- AACR Project GENIE Consortium. AACR Project GENIE: powering precision medicine through an international consortium. Cancer Discov. 7, 818–831 (2017).
- Mahal, B. A. et al. Racial differences in genomic profiling of prostate cancer. N. Engl. J. Med. 383, 1083–1085 (2020).
- Goel, N. et al. Racial differences in genomic profiles of breast cancer. JAMA Netw. Open 5, e220573 (2022).
- Nassar, A. H., Adib, E. & Kwiatkowski, D. J. Distribution of KRAS G12C somatic mutations across race, sex, and cancer type. N. Engl. J. Med. 384, 185–187 (2021).
- Kamran, S. C. et al. Tumor mutations across racial groups in a real-world data registry. JCO Precis. Oncol. 5, 1654–1658 (2021).
- Schumacher, F. R. et al. Race and genetic alterations in prostate cancer. JCO Precis. Oncol. 5, PO.21.00324 (2021).
- Bustamante, C. D., De La Vega, F. M. & Burchard, E. G. Genomics for the world. Nature 475, 163–165 (2011).
- Aldrighetti, C. M., Niemierko, A., Van Allen, E., Willers, H. & Kamran, S. C. Racial and ethnic disparities among participants in precision oncology clinical studies. JAMA Netw. Open 4, e2133205 (2021).
- United States Cancer Statistics – Incidence: 1999–2017, WONDER Online Database. United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Published 2020. http://wonder.cdc.gov/cancer-v2017.html