NCI 60 Data
NCI microarray data. The data contain expression levels for 6830 genes measured on 64 cancer cell lines. The cancer type of each cell line is also recorded.
The dataset is loaded as a dictionary with two entries: ‘data’ and ‘labels’.
data
: is a 64 by 6830 matrix of the expression values, with one row per cell line and one column per gene.
labels
: lists the cancer types for the 64 cell lines.
Source
The data come from Ross et al. (Nat Genet., 2000). More information can be obtained at http://genome-www.stanford.edu/nci60.
from ISLP import load_data
NCI60 = load_data('NCI60')
NCI60.keys()
dict_keys(['data', 'labels'])
NCI60['labels'].value_counts()
label
NSCLC 9
RENAL 9
MELANOMA 8
BREAST 7
COLON 7
LEUKEMIA 6
OVARIAN 6
CNS 5
PROSTATE 2
K562A-repro 1
K562B-repro 1
MCF7A-repro 1
MCF7D-repro 1
UNKNOWN 1
Name: count, dtype: int64
NCI60['data'].shape
(64, 6830)
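As a quick consistency check, the label counts printed above should account for every row of the expression matrix. The sketch below is only illustrative: it copies the counts by hand from the output shown here rather than calling `load_data`, so the dictionary `label_counts` and the variable names are assumptions introduced for this example, not part of ISLP.

```python
# Counts copied by hand from the value_counts() output shown above
# (illustrative only; not read from the ISLP package).
label_counts = {
    "NSCLC": 9, "RENAL": 9, "MELANOMA": 8, "BREAST": 7, "COLON": 7,
    "LEUKEMIA": 6, "OVARIAN": 6, "CNS": 5, "PROSTATE": 2,
    "K562A-repro": 1, "K562B-repro": 1, "MCF7A-repro": 1,
    "MCF7D-repro": 1, "UNKNOWN": 1,
}

# Shape reported for NCI60['data']: one row per cell line, one column per gene.
n_rows, n_genes = 64, 6830

# The per-label counts should sum to the number of rows.
total_cell_lines = sum(label_counts.values())

# Labels that occur exactly once.
singletons = sorted(k for k, v in label_counts.items() if v == 1)

print(total_cell_lines == n_rows)
print(len(label_counts))
print(singletons)
```

Five of the 14 labels occur only once, which is worth keeping in mind before comparing any grouping of the cell lines against the recorded cancer types.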