Cancer Genome Atlas – Interactive Exploration of Patient Gender, Race and Age
enpicom.com
I wonder why they chose these variables; they certainly aren't the first things that come to mind from that dataset. In particular, they do not necessarily mirror cancer incidence. Better to use actual incidence data if that's what one wants to explore.
The whole point of the dataset was the molecular side: gene expression, copy number changes, and mutations.
The gene expression, copy number changes, and mutations in the TCGA data are used to discover and develop new cancer treatments.
These (and other) molecular traits differ by gender, race, and age. Overrepresentation of a specific gender or race might affect how well the resulting therapies work for other gender/race combinations.
This visualization is meant to show at a glance how these clinical variables are currently distributed in one of the most widely used and relevant cancer datasets.
The people at The Cancer Genome Atlas did a great job, but much more has to be done to achieve the ambitious goal of Precision Medicine and have therapies personalized to each person's genetic makeup.
Right, which is what makes this visualization not interesting. If the comparison had been between TCGA cohort characteristics and the cancer population characteristics, that would be far more interesting.
Even weighting the tissue types by actual incidence rather than number of samples would be far more interesting.
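For what it's worth, here is a rough sketch of what that reweighting could look like in Python. The counts and incidence weights below are made-up placeholders, not TCGA or registry figures, and the function names are hypothetical. The idea is just that each tissue type contributes to the overall demographic breakdown in proportion to its incidence rather than to how many samples happened to be collected.

```python
def pool_by_samples(sample_counts):
    """Demographic shares when every sample counts equally."""
    totals = {}
    for groups in sample_counts.values():
        for group, count in groups.items():
            totals[group] = totals.get(group, 0) + count
    n = sum(totals.values())
    return {group: count / n for group, count in totals.items()}


def reweight_by_incidence(sample_counts, incidence):
    """Demographic shares when each tissue type is weighted by incidence.

    sample_counts: {tissue: {group: number of samples}}
    incidence:     {tissue: relative incidence, any consistent unit}
    """
    total_incidence = sum(incidence[tissue] for tissue in sample_counts)
    weighted = {}
    for tissue, groups in sample_counts.items():
        n = sum(groups.values())
        if n == 0:
            raise ValueError(f"no samples for tissue {tissue!r}")
        tissue_weight = incidence[tissue] / total_incidence
        for group, count in groups.items():
            # Share of this group within the tissue, scaled by the tissue's weight.
            weighted[group] = weighted.get(group, 0.0) + tissue_weight * count / n
    return weighted


# Placeholder numbers for illustration only (NOT real data).
example_counts = {
    "breast": {"female": 990, "male": 10},
    "prostate": {"male": 500},
}
example_incidence = {"breast": 1.0, "prostate": 1.0}  # assumed equal incidence

by_samples = pool_by_samples(example_counts)
by_incidence = reweight_by_incidence(example_counts, example_incidence)
```

With these placeholder inputs the sample-pooled view is dominated by whichever tissue has more samples, while the incidence-weighted view is not. Plugging in real registry incidence would be the interesting part.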
I was one of (many many many) co-authors on several TCGA consortium papers, so I'm quite familiar with it, and with the challenges going forward, but this visualization addresses none of those challenges.
Not an expert in cancer by any means, but I know a thing or two about leukemia, and race, sex, and age are precisely the most common 'risk factors' that are looked into.
The Cancer Genome Atlas was meant to be a collection of molecular data though, not a measure of incidence data.
It's "The Cancer Genome Atlas", as the acronym they use is TCGA.