Recent breakthroughs in high-throughput technologies have produced vast big-data resources. However, transferring knowledge from public data to a new research project remains a significant challenge because of gaps in study design and differences in data organization. Focusing on cancer immunology research, we integrated large-scale omics data and developed web platforms with interactive analysis modules.
In the first project, we processed omics data for more than 33K samples in 188 tumor cohorts from public databases, 998 tumors from 12 immune checkpoint blockade (ICB) clinical studies, and eight CRISPR screens that identified gene modulators of the anticancer immune response. By integrating these datasets with three interactive analysis modules, our web platform TIDE enables the reuse of public data for hypothesis generation, biomarker optimization, and patient stratification in immuno-oncology research.
In the second project, we manually labeled 20,608 cytokine and growth factor treatment profiles from the NCBI GEO and ArrayExpress databases. With these curated datasets, our web platform CellSig can reveal how the expression of query genes changes in response to diverse cell signals and predict cytokine and growth factor responses from a user’s transcriptomic data.