Generalized correlation measure using count statistics for gene expression data with ordered samples.

Citation
Wang, Y. X. R., et al. “Generalized Correlation Measure Using Count Statistics For Gene Expression Data With Ordered Samples.” Bioinformatics (Oxford, England), vol. 34, no. 4, 2018, pp. 617-624.
Center UCSD-UCLA
Author Y X Rachel Wang, Ke Liu, Elizabeth Theusch, Jerome I Rotter, Marisa W Medina, Michael S Waterman, Haiyan Huang, Oliver Stegle
Abstract

Motivation: Capturing association patterns in gene expression levels under different conditions or time points is important for inferring gene regulatory interactions. In practice, temporal changes in gene expression may result in complex association patterns that require more sophisticated detection methods than simple correlation measures. For instance, the effect of regulation may lead to time-lagged associations and interactions local to a subset of samples. Furthermore, expression profiles of interest may not be aligned or directly comparable (e.g. gene expression profiles from two species).

Results: We propose a count statistic for measuring association between pairs of gene expression profiles consisting of ordered samples (e.g. time-course), where correlation may only exist locally in subsequences separated by a position shift. The statistic is simple and fast to compute, and we illustrate its use in two applications. In a cross-species comparison of developmental gene expression levels, we show our method not only measures association of gene expressions between the two species, but also provides alignment between different developmental stages. In the second application, we applied our statistic to expression profiles from two distinct phenotypic conditions, where the samples in each profile are ordered by the associated phenotypic values. The detected associations can be useful in building correspondence between gene association networks under different phenotypes. On the theoretical side, we provide asymptotic distributions of the statistic for different regions of the parameter space and test its power on simulated data.
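
The abstract does not define the statistic itself, so the following is only a rough illustration of the general idea of a count-based, shift-tolerant association measure, not the authors' method. It assumes one plausible construction: slide a window of length k over each ordered profile, reduce each window to its rank pattern, and count the window pairs whose patterns agree and whose start positions differ by at most a given shift. The names `rank_pattern` and `count_matching_windows` and the parameters `k` and `max_shift` are hypothetical.

```python
def rank_pattern(window):
    """Return the ordering of positions in a window, from smallest value to largest."""
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))


def count_matching_windows(x, y, k=3, max_shift=1):
    """Count pairs of length-k windows (one from x, one from y) that share
    the same rank pattern and start at most max_shift positions apart.

    Illustrative assumption only; the published statistic may differ.
    """
    px = [rank_pattern(x[i:i + k]) for i in range(len(x) - k + 1)]
    py = [rank_pattern(y[j:j + k]) for j in range(len(y) - k + 1)]
    count = 0
    for i, pattern in enumerate(px):
        lo = max(0, i - max_shift)
        hi = min(len(py), i + max_shift + 1)
        count += sum(1 for j in range(lo, hi) if py[j] == pattern)
    return count


# A profile and a copy of it delayed by one position.
x = [1, 3, 2, 5, 4, 6]
y = [0, 1, 3, 2, 5, 4]
aligned = count_matching_windows(x, y, k=3, max_shift=0)
lagged = count_matching_windows(x, y, k=3, max_shift=1)
```

In this toy case no windows match when the profiles are compared position by position, while allowing a shift of one recovers the lagged agreement. A real analysis would also need a null distribution for the count, which the paper derives asymptotically.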

Availability and implementation: The code used to perform the analysis is available as part of the Supplementary Material.

Contact: msw@usc.edu or hhuang@stat.berkeley.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

Year of Publication
2018
Journal
Bioinformatics (Oxford, England)
Volume
34
Issue
4
Number of Pages
617-624
Date Published
02/2018
ISSN Number
1367-4811
DOI
10.1093/bioinformatics/btx641
Alternate Journal
Bioinformatics
PMID
29040382
PMCID
PMC5860612