IB Core Projects

Cross-tissue Gene Expression Meta-analysis in Dermatomyositis (Dr. Jessica Neely). We conducted a comprehensive gene expression meta-analysis in dermatomyositis (DM) muscle and skin tissues to identify disease-relevant genes and pathways shared across tissues. Each data set was first processed individually, and the data sets were then combined by cross-study normalization. Complementary single-gene and network analyses, using Significance Analysis of Microarrays (SAM) and Weighted Gene Co-expression Network Analysis (WGCNA) respectively, were conducted to identify genes significantly associated with DM. Cell-type enrichment analysis was performed using xCell. There were 544 differentially expressed genes (fold change [FC] ≥ 1.3, q < 0.05) in muscle and 300 in skin. The 94 upregulated genes shared across tissues were enriched in type I and II interferon (IFN) signaling and major histocompatibility complex (MHC) class I antigen-processing pathways. In the network analysis, we identified eight significant gene modules in muscle and seven in skin. The most highly correlated modules were enriched in pathways consistent with the single-gene analysis. There is striking similarity in gene expression across DM target tissues, with enrichment of type I and II IFN pathways, MHC class I antigen processing, T-cell activation, and antigen-presenting cells. These results suggest that IFN-γ may contribute to the global IFN signature in DM, and that altered auto-antigen presentation through the class I MHC pathway may be important in disease pathogenesis.


Neely J, Rychkov D, Paranjpe M, Waterfield M, Kim S, Sirota M. Gene Expression Meta‐Analysis Reveals Concordance in Gene Activation, Pathway, and Cell‐Type Enrichment in Dermatomyositis Target Tissues. ACR Open Rheumatology. PubMed PMID: 31872188.
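
As an illustration of the cross-tissue overlap step in the dermatomyositis meta-analysis, the following is a minimal Python sketch, not the pipeline used in the study. It assumes that FC is expressed on a linear scale and that "upregulated" means FC ≥ 1.3 with q < 0.05; the gene names, values, and the helper upregulated() are hypothetical.

```python
# Thresholds quoted in the project summary.
FC_MIN = 1.3   # minimum fold change (assumed to be on a linear scale)
Q_MAX = 0.05   # maximum q-value

def upregulated(results, fc_min=FC_MIN, q_max=Q_MAX):
    """Return the set of genes with FC >= fc_min and q < q_max."""
    return {gene for gene, (fc, q) in results.items()
            if fc >= fc_min and q < q_max}

# Hypothetical per-tissue results: gene -> (fold change, q-value).
# These numbers are invented for illustration only.
muscle = {
    "GENE_A": (4.2, 0.001),
    "GENE_B": (3.1, 0.004),
    "GENE_C": (1.6, 0.030),
    "GENE_D": (1.0, 0.900),
}
skin = {
    "GENE_A": (3.5, 0.002),
    "GENE_B": (1.2, 0.010),   # fails the fold-change threshold
    "GENE_C": (1.4, 0.040),
    "GENE_E": (1.5, 0.200),   # fails the q-value threshold
}

# Genes that pass both thresholds in both tissues.
shared = sorted(upregulated(muscle) & upregulated(skin))
print(shared)  # ['GENE_A', 'GENE_C']
```

The shared set would then be passed to pathway enrichment, which is not shown here.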

Cross-Tissue Transcriptomic Analysis Leveraging Machine Learning Approaches Identifies New Biomarkers for Rheumatoid Arthritis (Dr. Marina Sirota, with GT and GMR Cores). There is an urgent need to identify effective biomarkers for early diagnosis of rheumatoid arthritis (RA) and accurate monitoring of disease activity. We defined an RA meta-profile by reprocessing publicly available, cross-tissue gene expression data (13 synovium datasets with 284 samples and 14 blood datasets with 1,885 samples from NCBI GEO), and applied machine learning to identify putative biomarkers, which we further validated on independent datasets. We developed and applied a robust machine learning pipeline to select genes and to build an RA Score based on the expression of genes that are dysregulated and highly associated with RA in both tissues. In validation with five independent datasets, we demonstrated the clinical utility of the RA Score. The RA Score was significantly correlated with DAS28 (r = 0.33, p = 7e-9) and was able to distinguish osteoarthritis (OA) from RA samples (OR 0.57, p = 8e-10). The RA Score was also able to track treatment effect among RA patients (t-test of treated vs. untreated, p = 2e-4). We have filed a provisional patent on the methodology and the RA Score. Results of these analyses have also been used in several other PREMIER projects (Dr. Judith Ashouri and Dr. Sook Wah Yee).


Ashouri JF, Hsu LY, Yu S, Rychkov D, Chen Y, Cheng DA, Sirota M, Hansen E, Lattanza L, Zikherman J, Weiss A. Reporters of TCR signaling identify arthritogenic T cells in murine and human autoimmune arthritis. Proc Natl Acad Sci U S A. 2019. PubMed PMID: 31455730.


Rychkov D, Neely J, Oskotsky T, Sirota M. Cross-Tissue Transcriptomic Analysis Leveraging Machine Learning Approaches Identifies New Biomarkers for Rheumatoid Arthritis. bioRxiv. 2020. DOI name: 10.1101/2020.07.24.220483.
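
The RA project summary above does not give the formula for the RA Score. As a rough illustration of how an expression-based score of this kind can be built, the following Python sketch uses one generic construction: the mean z-score of upregulated genes minus the mean z-score of downregulated genes, per sample. This construction, the function names, and all values are assumptions for illustration and should not be read as the study's method.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize one gene's expression values across samples."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def signature_score(expr, up_genes, down_genes):
    """Per-sample score: mean z of up genes minus mean z of down genes.

    ASSUMPTION: a generic signature score, not the published RA Score.
    """
    z = {gene: zscores(vals) for gene, vals in expr.items()}
    n_samples = len(next(iter(expr.values())))
    return [
        mean(z[g][i] for g in up_genes) - mean(z[g][i] for g in down_genes)
        for i in range(n_samples)
    ]

# Hypothetical expression matrix: gene -> values for four samples
# (two controls followed by two RA samples). Invented numbers.
expr = {
    "UP_1":   [1.0, 1.2, 3.0, 3.4],
    "UP_2":   [2.0, 2.1, 4.0, 4.5],
    "DOWN_1": [5.0, 4.8, 2.0, 1.9],
}

scores = signature_score(expr, ["UP_1", "UP_2"], ["DOWN_1"])
print([round(s, 2) for s in scores])
```

In this toy example the two RA samples receive higher scores than the two controls; a real score would be evaluated on independent datasets, as described in the summary.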

Integrative Multi-Omics Analysis Identifies Three Distinct Subtypes of Systemic Lupus Erythematosus (Dr. Cristina Lanata with GT and CDI Cores). Systemic lupus erythematosus (SLE) is a heterogeneous autoimmune disease in which progression and outcomes vary significantly among different racial groups. The reasons underlying these health disparities remain unknown. To enable precise patient stratification and guide the development of molecular therapeutics, we aimed to identify SLE patient subgroups within a multi-ethnic cohort using an unsupervised clustering approach based on high-dimensional DNA methylation data, and American College of Rheumatology (ACR) classification criteria for SLE as clinical phenotypes. We identified and validated three patient clusters based on clinical data, two severe and one mild. We then performed a methylation association analysis and identified a set of 256 differentially methylated CpGs across clusters, including 101 CpGs in genes in the type I interferon pathway, and validated these associations in an external cohort. A cis-methylation quantitative trait loci analysis identified 744 significant CpG-SNP associations, with 397 SNPs controlling methylation of 61 of the 256 differentially methylated CpGs. Unlike previous studies, our computational approach highlights molecular differences associated with SLE patient clusters derived from all ACR phenotypes, rather than single outcome measures. This work is a collaboration across all PREMIER cores and demonstrates the utility of applying integrative methods to address clinical heterogeneity in multifactorial, multi-ethnic disease settings. A patent has been filed on the results of this work. Results are accessible as an RShiny application: http://comphealth.ucsf.edu/sle_clustering/. The raw data have been uploaded to dbGaP and GEO.
To further understand racial/ethnic differences in SLE, we performed cell-specific transcriptomic analysis of patients enrolled in the SLE cohort. RNA-Seq data for four immune cell types sorted from PBMCs (monocytes, B cells, CD4 T cells, and NK cells) have been generated for 120 patients from the cohort (63 Asian and 57 White individuals). We leverage these data to study the cell-type-specific transcriptomic differences between individuals of Asian and White ethnicity using a four-tier approach: unsupervised clustering, differential expression analysis, gene co-expression analysis, and machine learning. Our computational approach highlights molecular differences associated with clinical and demographic features to address clinical heterogeneity in multifactorial, multi-ethnic disease settings.


Andreoletti G, Lanata CM, Paranjpe I, Jain TS, Nititham J, Taylor KE, Combes AJ, Maliskova L, Ye CJ, Katz P, Dall'Era M, Yazdany J, Criswell LA, Sirota M. Ethnicity-specific transcriptomic variation in immune cells and correlation with disease activity in systemic lupus erythematosus. bioRxiv. 2020. DOI name: 10.1101/2020.10.30.362715.


Lanata CM, Paranjpe I, Nititham J, Taylor KE, Gianfrancesco M, Paranjpe M, Andrews S, Chung SA, Rhead B, Barcellos LF, Trupin L, Katz P, Dall'Era M, Yazdany J, Sirota M, Criswell LA. A phenotypic and genomics approach in a multi-ethnic cohort to subtype systemic lupus erythematosus. Nat Commun. 2019. PubMed PMID: 31467281.
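
The SLE summary above mentions a cis-methylation quantitative trait loci (meQTL) analysis without describing the model. A common way to test a single CpG-SNP pair is a linear regression of the methylation level on allele dosage (0, 1, or 2 copies of the minor allele). The Python sketch below shows only that core regression under this assumption; covariates, significance testing, and multiple-testing correction are omitted, and the data and the helper ols_fit() are hypothetical.

```python
def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data for one CpG-SNP pair (invented numbers):
# allele dosage per patient and the methylation beta value at the CpG.
dosage = [0, 0, 1, 1, 2, 2]
methylation = [0.20, 0.22, 0.30, 0.32, 0.40, 0.42]

slope, intercept = ols_fit(dosage, methylation)
# slope is the estimated change in methylation per additional allele copy.
print(round(slope, 3), round(intercept, 3))  # 0.1 0.21
```

A genome-wide analysis would repeat this test for every CpG and every SNP within a chosen cis window, which is why dedicated meQTL software is normally used instead of a loop like this.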

Single Cell Data Analysis and Visualization (Dr. Daniel Bunis in collaboration with Dr. Gabi Fragiadakis). We have developed dittoSeq, an R visualization suite for all major forms of bulk and single-cell RNA-seq data. dittoSeq is part of Bioconductor and is downloaded about 200 times per month.


Bunis DG, Andrews J, Fragiadakis GK, Burt TD, Sirota M. dittoSeq: Universal User-Friendly Single-Cell and Bulk RNA Sequencing Visualization Toolkit. Bioinformatics. 2020. DOI name: 10.1093/bioinformatics/btaa1011.