Effect of RNA-Seq data normalization on protein interactome mapping for Alzheimer's disease
Abstract
High-throughput RNA sequencing brings a new perspective to the elucidation of the molecular mechanisms of diseases. Normalization is the first and most important step in RNA-Seq data analysis, and the appropriate method depends on the purpose of the analysis. Within-sample normalization methods (e.g., TPM) are preferred when genes within a sample are compared with each other, whereas between-sample normalization methods (e.g., DESeq2, TMM, voom) are used when the samples in a dataset are compared. Normalization approaches rescale the data and therefore affect the results of the downstream analysis. Here, we selected two of the most commonly used Alzheimer's disease RNA-Seq datasets, from the ROSMAP and Mayo Clinic cohorts, and mapped the differentially expressed genes onto the human protein interactome to discover disease-specific subnetworks. To this end, the raw count data were first processed with four commonly used RNA-Seq normalization methods (DESeq2, TMM, voom, and TPM). Then, covariate adjustment was applied to the normalized data for gender, age at death, and post-mortem interval. Each normalized dataset was separately mapped onto the human protein-protein interaction network in either covariate-adjusted or non-adjusted form. The normalization methods were compared by how well the discovered subnetworks captured known Alzheimer's disease genes and genes associated with disease-related functional terms. Based on our results, covariate adjustment has a positive effect on normalization by removing confounder effects. Covariate-adjusted TMM and covariate-adjusted DESeq2 performed better in both transcriptome datasets.
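To make the two preprocessing steps concrete, the following is a minimal Python sketch of within-sample TPM normalization followed by a simple linear covariate adjustment. It is not the pipeline used in the study: the function names `tpm` and `adjust_covariates`, the gene lengths, and the covariate matrix are hypothetical, and the adjustment shown (removing fitted covariate effects by ordinary least squares while keeping each gene's mean level) is only one common way such an adjustment might be done.

```python
import numpy as np

def tpm(counts, lengths_bp):
    """Transcripts per million.

    counts: genes x samples matrix of raw read counts.
    lengths_bp: gene lengths in base pairs (hypothetical values here).
    """
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float) / 1000.0
    rate = counts / lengths_kb[:, None]  # reads per kilobase
    # Scale each sample (column) so that its values sum to one million.
    return rate / rate.sum(axis=0, keepdims=True) * 1e6

def adjust_covariates(expr, covariates):
    """Remove linear covariate effects from each gene.

    expr: genes x samples matrix of normalized (e.g., log-scale) expression.
    covariates: samples x k numeric matrix (e.g., sex, age at death, PMI).
    Assumes more samples than covariates.
    """
    expr = np.asarray(expr, dtype=float)
    cov = np.asarray(covariates, dtype=float)
    cov_centered = cov - cov.mean(axis=0)
    design = np.column_stack([np.ones(cov.shape[0]), cov_centered])
    # Fit one least-squares model per gene: expression ~ intercept + covariates.
    beta, *_ = np.linalg.lstsq(design, expr.T, rcond=None)
    # Subtract only the covariate terms; the intercept (gene mean) is kept.
    return expr - (cov_centered @ beta[1:]).T

# Toy usage with simulated data (not real ROSMAP or Mayo Clinic values).
rng = np.random.default_rng(0)
counts = rng.poisson(50, size=(5, 8))            # 5 genes, 8 samples
lengths = np.array([1000, 2000, 1500, 500, 3000])
covs = rng.normal(size=(8, 3))                   # 3 numeric covariates
tpm_values = tpm(counts, lengths)
adjusted = adjust_covariates(np.log2(tpm_values + 1), covs)
```

After adjustment, each gene's values are uncorrelated with the supplied covariates, so differences that remain between samples cannot be attributed to their linear effects.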








