Single cell genomic technologies allow for much more than average gene expression to be examined among discrete groups of cells. With single cell resolution, we can better understand how gene expression distributions differ between populations of cells, especially in terms of higher order interactions such as gene variation and covariation. This is particularly relevant in developmental contexts, where continuous and subtle distributional shifts occur before and after cell fate choice. These shifts are not necessarily reflected by differences in average gene expression across the inferred developmental time. Additionally, for technologies that retain the spatial locations of cells, we can examine these distributional shifts in gene expression to understand the behavior of cells in their spatial context.
From a methodological perspective, existing tests can compare higher order interactions among discrete groups of cells (e.g. Fisher’s Z-transform test for differences in Pearson correlation between two groups), but they require an arbitrary choice of how to dichotomize the data. These existing tests also cannot identify non-linear changes that occur over a dynamic differentiation trajectory or in the spatial context.
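To make the group-based approach concrete, here is a minimal sketch of the textbook Fisher Z-transform test for a difference between two Pearson correlations. The function name `fisher_z_test` and its arguments are illustrative assumptions, not part of any package mentioned here.

```python
import math


def fisher_z_test(r1, n1, r2, n2):
    """Two-sided Fisher Z-transform test for a difference between two
    Pearson correlations, r1 (from n1 cells) and r2 (from n2 cells).

    Hypothetical helper for illustration only.
    """
    if min(n1, n2) <= 3:
        raise ValueError("each group needs more than 3 observations")
    # Fisher's transform: atanh(r) is approximately normal with variance 1 / (n - 3).
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Example: correlation of 0.8 in one group of 100 cells, 0.1 in another.
z, p = fisher_z_test(0.8, 100, 0.1, 100)
print(f"z = {z:.2f}, p = {p:.2g}")
```

The inputs are one correlation and one sample size per group, so the cells must already have been split into two groups before the test can be run.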
For this reason, we sought to develop a method for testing changes in higher order interactions that fulfilled three important criteria. Firstly, it would not require pre-defined groups of cells; secondly, it would be able to identify non-linear and non-monotonic changes in these higher order interactions; and thirdly, it would be flexible enough to accommodate the vast range of higher order interaction measures that are of interest to researchers.
To that end, we present single cell Higher Order Testing (scHOT) as a flexible framework for performing tests of higher order interactions of single or multiple genes.
scHOT has three main components: (1) an underlying weighted higher order function, for example Pearson or Spearman correlation; (2) a cell weighting scheme, with a user-defined span; and (3) a testing scaffold of single or multiple genes that are passed to the weighted higher order function. scHOT then uses sample permutations to build an empirical null distribution, against which it assesses statistical significance for the single or multiple genes.
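The three components can be illustrated with a small sketch in plain Python. This is not the scHOT implementation or its API: the triangular weighting, the use of the standard deviation of the local correlations as the test statistic, and all function names (`triangular_weights`, `weighted_pearson`, `local_spread`, `permutation_test`) are assumptions chosen for illustration. Cells are assumed to be already ordered along a trajectory.

```python
import math
import random
import statistics


def triangular_weights(n_cells, span):
    """Row i holds weights centred on cell i that decay linearly to zero.
    `span` is the fraction of cells covered by each window (assumed scheme)."""
    half_width = max(1.0, span * n_cells / 2.0)
    return [[max(0.0, 1.0 - abs(i - j) / half_width) for j in range(n_cells)]
            for i in range(n_cells)]


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation: the 'higher order function' here."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)


def local_spread(x, y, weights):
    """Assumed test statistic: spread of the local correlations along the ordering."""
    return statistics.pstdev([weighted_pearson(x, y, w) for w in weights])


def permutation_test(x, y, span=0.25, n_perm=200, seed=0):
    """Permute the cell order to build an empirical null for the statistic."""
    rng = random.Random(seed)
    weights = triangular_weights(len(x), span)
    observed = local_spread(x, y, weights)
    order = list(range(len(x)))
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        # The same permutation is applied to both genes, so their overall
        # relationship is kept but its link to the cell ordering is broken.
        xp = [x[i] for i in order]
        yp = [y[i] for i in order]
        if local_spread(xp, yp, weights) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)
```

As a usage example, two simulated genes whose correlation flips sign halfway along the ordering should give a large statistic and a small permutation p-value:

```python
rng = random.Random(1)
n = 60
gene_a = [rng.gauss(0, 1) for _ in range(n)]
gene_b = [(a if i < n // 2 else -a) + rng.gauss(0, 0.1) for i, a in enumerate(gene_a)]
statistic, p_value = permutation_test(gene_a, gene_b, span=0.25, n_perm=100)
print(statistic, p_value)
```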
Additionally, the scHOT test statistic can detect non-linear or non-monotonic changes in the higher order interaction, which makes it well suited to identifying key changes along a particularly long trajectory or across a complex physical space.
One of the contexts in which we demonstrated scHOT was gene variability in mouse liver development, where immature hepatoblasts differentiate into either mature hepatocytes or cholangiocytes. We identified genes involved in cell division and chromosome organisation that increased in variability towards the point of cell fate decision. Our scHOT analysis suggests that increased transcriptional plasticity can precede cell fate commitment in this context.
Overall, scHOT is an analytical approach that allows us to gain novel insights from the high resolution offered by single cell genomic technologies. We believe it offers a new lens for interrogating single cell data and describing continuous patterns of variation and covariation, giving additional insights beyond those offered by tests for differential expression alone. scHOT is implemented as an R Bioconductor package, available at https://bioconductor.org/packages/scHOT, with a detailed vignette, and is interoperable with the state-of-the-art SingleCellExperiment object class.