Racial and ethnic disparities in a real-world precision oncology data registry

Biorepositories like Project GENIE enable precision oncology by sharing clinico-genomic data, but it remains unknown whether such registries reflect the true distribution of cancers in minorities. Our publication assesses racial/ethnic representation within Project GENIE and its implications.
Published in Cancer

Understanding racial and ethnic representation in cancer biorepositories

Racial and ethnic cancer disparities persist despite tremendous strides in our understanding of cancer biology. Large biorepositories (essentially databases that contain biological samples and, often, genomic sequencing data) have been critical to enabling cancer research, but they have historically consisted of mostly white patients. Prior studies have shown insufficient representation of minority groups in biorepositories, which limits the generalizability of findings from studies that draw on such a narrow patient population. One such biorepository is the American Association for Cancer Research (AACR) Project Genomics Evidence Neoplasia Information Exchange (GENIE), a large, publicly available international cancer biorepository containing both clinical and genomic data.

Multiple groups have used Project GENIE to describe novel genomic variations between racial categories in different tumors. Without a better understanding of this resource, however, our concern is that inadequate representation of racial/ethnic minorities within Project GENIE may produce spurious findings or inadvertently widen health care disparities by excluding certain groups. Our recent publication therefore sought to compare the population included in Project GENIE to the general cancer population of the United States using the Centers for Disease Control and Prevention Wide-ranging Online Data for Epidemiologic Research (CDC WONDER) database, and to determine whether Project GENIE is indeed powered for genomic comparisons between racial/ethnic groups.

Racial/ethnic minority groups are under-represented across most cancer types

The six racial/ethnic groups examined were White, Asian, Pacific Islander, Black, Hispanic, and Native American. We analyzed the representation of these groups in 16 types of cancer, defining “representation” as the ratio of the actual number of GENIE samples to the expected number of samples per cancer type, based on the respective incidences in CDC WONDER. Overall, white patients had adequate or over-representation across nearly all cancer types, as did the combined Asian/Pacific Islander group. In contrast, Black and Hispanic patients were largely under-represented in the database, and Native American patients often had zero samples for certain cancer types. These disparities in representation suggest that Project GENIE does not accurately reflect the true distribution of cancers in the United States, which may limit our ability to address disparities in cancer outcomes for these minority groups.
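To make the representation ratio concrete, here is a minimal Python sketch. It is not the code used in the publication. It assumes that the expected count for a group is the registry's total sample count for a cancer type multiplied by that group's share of national incidence, which is one plausible reading of the definition above. The function names, group labels, and numbers are all hypothetical and purely illustrative.

```python
def expected_counts(registry_total, incidence_by_group):
    """Expected registry samples per group if the registry mirrored incidence.

    Assumption (not confirmed by the publication): expected count equals
    registry_total times the group's share of total incidence.
    """
    total_incidence = sum(incidence_by_group.values())
    return {
        group: registry_total * cases / total_incidence
        for group, cases in incidence_by_group.items()
    }


def representation_ratios(observed_by_group, incidence_by_group):
    """Ratio of observed to expected samples for one cancer type.

    A ratio near 1 suggests proportional representation, below 1
    under-representation, and above 1 over-representation.
    """
    registry_total = sum(observed_by_group.values())
    expected = expected_counts(registry_total, incidence_by_group)
    return {
        group: observed_by_group.get(group, 0) / expected[group]
        for group in incidence_by_group
        if expected[group] > 0  # skip groups with no recorded incidence
    }


# Hypothetical counts for a single cancer type (not real data).
observed = {"group_a": 900, "group_b": 100}      # registry samples
incidence = {"group_a": 8000, "group_b": 2000}   # national incident cases

ratios = representation_ratios(observed, incidence)
print(ratios)  # {'group_a': 1.125, 'group_b': 0.5}
```

In this toy example, group_b accounts for 20% of incident cases but only 10% of registry samples, so its ratio is 0.5.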

Project GENIE is not powered to detect mutational differences between all racial/ethnic groups

The field of precision oncology fundamentally aims to identify why one patient might respond extremely well to a treatment while another might not respond at all. This is often achieved by comparing genomic profiles between two distinct phenotypes, such as responders and non-responders, to identify potential driver mutations. Various groups have used Project GENIE to compare mutational profiles in this way between racial/ethnic minority and white cohorts, with the hope of explaining observed disparities in cancer outcomes. However, this approach has two potential drawbacks. First, race is a social construct whose definition can vary by generation and geography. Any discoveries about mutational differences between such groups lack context, since Project GENIE does not contain the relevant socioeconomic data or other factors that might confound a comparison. Second, prior to our study, it was unclear whether this database was sufficiently powered to make such a comparison in the first place. Our findings suggest that Project GENIE is not powered to detect subtle mutational differences between white and non-white patients in several common cancer types. For example, there was a paucity of samples from Black, Asian, Hispanic, Native American, and Pacific Islander patients for both primary and metastatic prostate cancer, which limits our ability to detect potentially relevant mutational differences between such cohorts.
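The effect of a small cohort on statistical power can be illustrated with a short Python sketch. This is not the power analysis from the publication. It uses a standard normal approximation for a two-sided test comparing mutation frequencies between two cohorts, and the function name, mutation frequencies, and cohort sizes are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    p1, p2: assumed true mutation frequencies in each cohort
            (must be strictly between 0 and 1).
    n1, n2: number of samples in each cohort.
    Uses the normal approximation, which is rough for very small cohorts.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)

    # Standard error under the null hypothesis (pooled frequency).
    p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)
    se_null = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

    # Standard error under the alternative (true frequencies differ).
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

    diff = abs(p1 - p2)
    upper_tail = std_normal.cdf((diff - z_crit * se_null) / se_alt)
    lower_tail = std_normal.cdf((-diff - z_crit * se_null) / se_alt)
    return upper_tail + lower_tail


# Hypothetical scenario: a mutation present in 10% of one cohort and
# 15% of another, with a large reference cohort and varying minority cohorts.
for minority_n in (50, 500, 5000):
    power = two_proportion_power(0.10, 0.15, n1=5000, n2=minority_n)
    print(f"minority cohort n={minority_n}: approximate power = {power:.2f}")
```

Holding the mutation frequencies fixed, the approximate power rises as the smaller cohort grows, which is why sparse sampling of a group limits the differences a registry can detect.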

Conclusion

Minority groups have historically been excluded from large biorepositories for a variety of reasons, and disparities in cancer outcomes will likely continue without additional efforts to improve representation and inclusion. Project GENIE provides public access to a rich clinico-genomic dataset and is critical for democratizing and accelerating precision oncology research. However, our findings suggest that these data do not reflect the true landscape of cancer patients in the US and thus may misrepresent the disease burden in many racial/ethnic minority populations. In addition, many more variables than genomics alone contribute to cancer disparities, and it is important to recognize what types of questions these data can answer. Our hope is that this study will shed light on the strengths and limitations of this valuable resource and inform future projects that aim to address racial/ethnic disparities in precision oncology research.

References

  1. Giaquinto, A. N. et al. Cancer statistics for African American/Black People 2022. CA Cancer J. Clin. 72, 202–229 (2022).
  2. Spratt, D. E. et al. Racial/ethnic disparities in genomic sequencing. JAMA Oncol. 2, 1070–1074 (2016).
  3. AACR Project GENIE Consortium. AACR Project GENIE: powering precision medicine through an international consortium. Cancer Discov. 7, 818–831 (2017).
  4. Mahal, B. A. et al. Racial differences in genomic profiling of prostate cancer. N. Engl. J. Med. 383, 1083–1085 (2020).
  5. Goel, N. et al. Racial differences in genomic profiles of breast cancer. JAMA Netw. Open 5, e220573 (2022).
  6. Nassar, A. H., Adib, E. & Kwiatkowski, D. J. Distribution of KRASG12C somatic mutations across race, sex, and cancer type. N. Engl. J. Med. 384, 185–187 (2021).
  7. Kamran, S. C. et al. Tumor mutations across racial groups in a real-world data registry. JCO Precis. Oncol. 5, 1654–1658 (2021).
  8. Schumacher, F. R. et al. Race and genetic alterations in prostate cancer. JCO Precis. Oncol. 5, PO.21.00324 (2021).
  9. Bustamante, C. D., De La Vega, F. M. & Burchard, E. G. Genomics for the world. Nature 475, 163–165 (2011).
  10. Aldrighetti, C. M., Niemierko, A., Van Allen, E., Willers, H. & Kamran, S. C. Racial and ethnic disparities among participants in precision oncology clinical studies. JAMA Netw. Open 4, e2133205 (2021).
  11. United States Cancer Statistics – Incidence: 1999-2017, WONDER Online Database. United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Published 2020. http://wonder.cdc.gov/cancer-v2017.html
