Comparing model-based unconstrained ordination methods in the analysis of high-dimensional compositional count data
DOI:
https://doi.org/10.52933/jdssv.v5i6.133
Keywords:
Community-level modeling, copula, latent variable model, overdispersion, zero-inflation
Abstract
Model-based ordination of ecological community data has recently gained significant popularity among practitioners, largely due to the increased availability and use of computational resources. Specifically, generalized linear latent variable models (GLLVMs), a factor-analytic and rank-reduced form of mixed effects models, have proven to be both accurate and computationally efficient. GLLVMs have been implemented for a wide range of response types common in ecological community data, including presence-absence data, biomass, and overdispersed and/or zero-inflated counts. In this paper, we demonstrate how GLLVMs can be applied to the analysis of high-dimensional compositional count data. These methods are useful, for example, in the analysis of microbiome data, which are typically collected using modern lab-based sampling tools and are inherently compositional due to the finite capacity of sequencing instruments. We use simulation studies to compare ordination methods based on GLLVMs with algorithmic compositional data analysis methods that rely on log-transformations. Our comparisons also include recently developed fast model-based ordination methods that use Gaussian copula models. The methods are illustrated with a microbiome data example.
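For readers unfamiliar with the log-transformation approach mentioned in the abstract, the following is a minimal sketch of one common variant: a centered log-ratio (CLR) transform followed by principal component analysis. It is illustrative only and is not taken from the paper. The function names, the pseudocount of 0.5, and the toy Poisson data are all assumptions made for this example.

```python
import numpy as np

def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of a samples-by-taxa count matrix.

    The pseudocount is an assumed, ad hoc way to avoid log(0).
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    log_x = np.log(x)
    # Subtract each sample's mean log value (log of its geometric mean).
    return log_x - log_x.mean(axis=1, keepdims=True)

def pca_ordination(z, n_axes=2):
    """Project samples onto the leading principal axes using an SVD."""
    centered = z - z.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_axes] * s[:n_axes]

# Toy data (hypothetical): 6 samples, 10 taxa.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=3.0, size=(6, 10))

clr = clr_transform(counts)
scores = pca_ordination(clr)  # one row of ordination scores per sample
```

Each row of the CLR-transformed matrix sums to zero, which reflects the fact that only relative abundances are retained. A model-based method such as a GLLVM instead models the counts directly, so it does not need a pseudocount.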