Microarray big data integrated analysis to identify robust diagnostic signature for triple negative breast cancer
Keywords: Cancer; Gene expression; Big data; Pattern classification; Biology computing; Genetics; Molecular biophysics; Microarray; Early detection
Triple negative breast cancers (TNBC) are a clinically heterogeneous, aggressive subtype with poor diagnosis and strong resistance to therapy. There is a need to identify novel, robust biomarkers with high specificity for early detection and therapeutic intervention. Microarray gene expression-based studies have offered significant advances in molecular classification and in the identification of diagnostic/prognostic signatures; however, sample scarcity and cohort heterogeneity remain areas of concern. In this study, we performed an integrated analysis of independent microarray big data studies and identified a robust 880-gene signature for TNBC diagnosis. We further identified 16 genes (OGN, ESR1, GPC3, LHFP, AGR3, LPAR1, LRRC17, TCEAL1, CIRBP, NTN4, TUBA1C, TMSB10, RPL27, RPS3A, RPS18, and NOSTRIN) that are associated with TNBC tissues. The 880-gene signature achieved excellent classification accuracy on each independent expression data set, with an overall average of 99.06%, which indicates its diagnostic power. Gene ontology enrichment analysis of the 880-gene signature shows that cell-cycle pathways/processes are important clinical targets for triple negative breast cancer. Further verification of the 880-gene signature could provide additional knowledge for better understanding and for the future direction of triple negative breast cancer research.