Introduction

Author

Mathew Crowther

Objectives

The aim of this component of the practical series is to introduce you to several commonly used techniques for analysing multivariate data. By the end of the series, you will:

  • Understand the principles underlying Principal Components Analysis (PCA) and Non-metric Multidimensional Scaling (nMDS)
  • Be familiar with the concepts behind multivariate hypothesis testing using permutational techniques such as ANOSIM and PERMANOVA
  • Be able to plan and conduct experiments that test multivariate hypotheses
  • Know how to carry out these analyses using relevant statistical software
  • Be able to interpret, present, and report your findings clearly and effectively

1 Background

Multivariate analyses are becoming increasingly common in biology, largely due to the growing accessibility of software tools capable of handling complex data. However, as discussed in lectures, this accessibility has also led to the uncritical and sometimes inappropriate use of multivariate methods. It is therefore essential that biologists develop a working understanding of when and how to apply these techniques, as well as how to recognise when others have used them incorrectly.

Multivariate approaches are widely used for:

  • Summarising complex patterns in biological data,
  • Identifying groupings or patterns among samples or species,
  • Testing statistical hypotheses involving multivariate datasets.

2 What you’ll be doing

In this part of the course, you will:

  1. Practise key multivariate techniques using sample datasets provided during practical sessions.
  2. Apply those techniques to your own data, which you will collect and collate in groups.
  3. Explore and compare the outcomes of different multivariate methods, observing how each highlights different features of the same dataset.

Although best scientific practice recommends selecting appropriate analyses before data collection, this exercise is designed to give you the opportunity to trial multiple analytical approaches and better understand their application and interpretation.

3 Analyses and programs we’ll use

The main programs we will use in this course are Jamovi and PRIMER. However, if you prefer working in R, complete R scripts (RStudio files) for all the methods covered will be made available on Canvas.

Principal Components Analysis

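Principal Components Analysis (PCA) reduces the complexity of a dataset with many correlated variables by transforming them into a smaller set of uncorrelated variables called principal components. Each component is a linear combination of the original variables; the first captures as much of the variance in the data as possible, and each subsequent component captures as much of the remaining variance as it can.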

This technique is especially useful for analysing habitat data, where multiple environmental variables can be summarised into a few key components that describe the main gradients in the data.
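The arithmetic behind PCA can be sketched in a few lines of Python. This is an illustration only, not the Jamovi or R workflow used in the practicals. The site values are made up, the variables are standardised so that the analysis runs on the correlation matrix (an assumption; a covariance-based PCA is also possible), and the first component is found by power iteration rather than a full eigendecomposition.

```python
import math

# Hypothetical habitat data: rows are sites; columns are
# canopy cover (%), leaf litter depth (cm) and soil moisture (%).
sites = [
    [80.0, 5.0, 30.0],
    [60.0, 4.0, 25.0],
    [40.0, 2.5, 18.0],
    [20.0, 1.0, 12.0],
    [10.0, 0.5, 10.0],
]

def standardise(rows):
    """Centre each column on 0 and scale it to unit sample standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardised data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v

z = standardise(sites)
corr = correlation_matrix(z)
eigenvalue, loadings = first_component(corr)

# The trace of a correlation matrix equals the number of variables,
# so eigenvalue / p is the proportion of variance on the first component.
explained = eigenvalue / len(corr)

# Site scores: each site's position along the first component.
scores = [sum(l * x for l, x in zip(loadings, row)) for row in z]

print("loadings:", [round(l, 3) for l in loadings])
print("proportion of variance explained:", round(explained, 3))
print("site scores:", [round(s, 2) for s in scores])
```

Because the three made-up variables are strongly correlated, the first component accounts for most of the variance, and each site's score on it summarises the three measurements in a single number.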

We will perform PCA using Jamovi (or R, if preferred).

4 nMDS and multivariate hypothesis testing (ANOSIM, PERMANOVA, and SIMPER)

In this section of the course, we will explore techniques for visualising and testing patterns in complex multivariate datasets (e.g., species abundances across sites). Ordination reduces the dimensionality of the data to reveal patterns and relationships, and permutation tests then allow us to test explicit hypotheses about those patterns, often in relation to predefined groups (e.g., treatment types) or environmental variables.

We will primarily use PRIMER 7 for these analyses, although equivalent R code will also be provided.

nMDS – Non-metric multidimensional scaling

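nMDS is an ordination technique used to visualise patterns in multivariate data. It places samples in a low-dimensional space (usually two dimensions) so that the rank order of the distances between points matches, as closely as possible, the rank order of the dissimilarities between samples (usually Bray-Curtis for abundance data). It helps us see groupings or gradients in community composition or other multivariate datasets.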
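The starting point for nMDS is a matrix of pairwise dissimilarities, and only the rank order of those values is used. The Python sketch below shows that first step under stated assumptions: the site names and abundances are invented, and the code computes Bray-Curtis dissimilarities and ranks them. It does not perform the iterative ordination itself, which PRIMER or R will handle.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: 0 = identical samples, 1 = no shared species."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))

# Hypothetical abundances of four species at three sites.
abundances = {
    "site_A": [10, 0, 5, 3],
    "site_B": [8, 1, 6, 2],
    "site_C": [0, 12, 1, 9],
}

names = list(abundances)
pairs = [(names[i], names[j])
         for i in range(len(names))
         for j in range(i + 1, len(names))]

# Lower triangle of the resemblance matrix, stored by site pair.
dissimilarity = {pair: bray_curtis(abundances[pair[0]], abundances[pair[1]])
                 for pair in pairs}

# nMDS works from this ordering only (most similar pair first).
ranked = sorted(pairs, key=dissimilarity.get)

for rank, pair in enumerate(ranked, start=1):
    print(rank, pair, round(dissimilarity[pair], 3))
```

In an nMDS plot of these three sites, site_A and site_B would sit closest together, because that pair has the lowest-ranked dissimilarity.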

ANOSIM – Analysis of similarities

ANOSIM tests whether groups of samples differ in composition by comparing dissimilarities within groups with dissimilarities between groups. It works on the ranks of the dissimilarities rather than their raw values and is best suited to simple experimental designs.

  • Software: PRIMER 7 (or R)
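  • Output includes: the global R statistic (close to 0 when groups do not differ, and 1 when every between-group dissimilarity exceeds every within-group dissimilarity) and a permutation-based p-value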
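As a rough illustration of how the R statistic and its p-value arise, the Python sketch below computes R = (mean between-group rank - mean within-group rank) / (M / 2), where M is the number of sample pairs, and then reshuffles the group labels to build a null distribution. The abundances, the group names "burnt" and "unburnt", and the choice of 999 permutations are all assumptions made for the example; PRIMER's output may differ in detail.

```python
import random
from itertools import combinations

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def anosim_r(ranks, pairs, groups):
    """R statistic from ranked dissimilarities and a set of group labels."""
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)

def anosim(samples, groups, permutations=999, seed=1):
    """Observed R and a permutation p-value for a one-way design."""
    pairs = list(combinations(range(len(samples)), 2))
    ranks = average_ranks([bray_curtis(samples[i], samples[j]) for i, j in pairs])
    observed = anosim_r(ranks, pairs, groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    as_extreme = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # reassign group labels at random
        if anosim_r(ranks, pairs, shuffled) >= observed - 1e-12:
            as_extreme += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (as_extreme + 1) / (permutations + 1)

# Hypothetical abundances of three species in eight samples.
samples = [
    [12, 1, 0], [10, 2, 1], [11, 0, 1], [9, 1, 0],   # burnt
    [1, 8, 9], [0, 10, 7], [2, 9, 8], [1, 7, 10],    # unburnt
]
groups = ["burnt"] * 4 + ["unburnt"] * 4

r_stat, p_value = anosim(samples, groups)
print("R =", round(r_stat, 3), " p =", round(p_value, 3))
```

Note that the smallest achievable p-value depends on how many distinct label arrangements exist, so very small designs cannot give very small p-values however distinct the groups are.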

5 PERMANOVA – Permutational multivariate analysis of variance

PERMANOVA is a more flexible method than ANOSIM for testing hypotheses about multivariate data. It partitions the variation in a resemblance matrix among sources in the same way as ANOVA, builds a pseudo-F statistic for each term, and obtains p-values by permutation. It can accommodate complex experimental designs, including multiple fixed and random factors.

  • Software: PRIMER 7 (or R)
  • Capable of testing main effects, interactions, and nested designs
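For the simplest case, a single factor, the calculation can be sketched in Python as follows. Sums of squares are computed directly from squared dissimilarities, a pseudo-F ratio is formed as in ANOVA, and the group labels are permuted to obtain a p-value. The data, group names and 999 permutations are assumptions for illustration, and the sketch does not cover interactions, nested terms or random factors.

```python
import random
from itertools import combinations

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))

def pseudo_f(d2, groups):
    """One-way pseudo-F; d2 maps each index pair (i, j), i < j, to a squared dissimilarity."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(d2.values()) / n
    ss_within = 0.0
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(d2[pair] for pair in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))

def permanova(samples, groups, permutations=999, seed=1):
    """Observed pseudo-F and a permutation p-value for a one-way design."""
    d2 = {(i, j): bray_curtis(samples[i], samples[j]) ** 2
          for i, j in combinations(range(len(samples)), 2)}
    observed = pseudo_f(d2, groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    as_extreme = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # reassign group labels at random
        if pseudo_f(d2, shuffled) >= observed - 1e-12:
            as_extreme += 1
    return observed, (as_extreme + 1) / (permutations + 1)

# Hypothetical abundances of three species in eight samples.
samples = [
    [12, 1, 0], [10, 2, 1], [11, 0, 1], [9, 1, 0],   # burnt
    [1, 8, 9], [0, 10, 7], [2, 9, 8], [1, 7, 10],    # unburnt
]
groups = ["burnt"] * 4 + ["unburnt"] * 4

f_stat, p_value = permanova(samples, groups)
print("pseudo-F =", round(f_stat, 2), " p =", round(p_value, 3))
```

With a single variable and Euclidean distances, this pseudo-F equals the ordinary ANOVA F ratio, which is a useful check on the logic.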

6 SIMPER – Similarity percentages

When a significant difference is detected (e.g., by ANOSIM or PERMANOVA), SIMPER helps identify which variables (e.g., species) are contributing most to the observed differences between groups.

  • It calculates the percentage contribution of each variable to the average dissimilarity between groups.
  • Useful for interpreting the biological meaning behind multivariate differences.
  • Software: PRIMER 7 (or R)
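A minimal Python sketch of the SIMPER idea, assuming Bray-Curtis dissimilarity and two groups: for every between-group pair of samples, each species' share of the dissimilarity is its absolute abundance difference divided by the pair's total abundance. Averaging those shares over all pairs gives each species' contribution. The species names and abundances are hypothetical.

```python
def simper(group_a, group_b, species):
    """Average contribution of each species to between-group Bray-Curtis dissimilarity."""
    totals = [0.0] * len(species)
    n_pairs = 0
    for a in group_a:
        for b in group_b:
            denominator = sum(a) + sum(b)
            for k in range(len(species)):
                totals[k] += abs(a[k] - b[k]) / denominator
            n_pairs += 1
    average = [t / n_pairs for t in totals]
    overall = sum(average)  # average between-group dissimilarity
    table = [(name, avg, 100 * avg / overall)
             for name, avg in zip(species, average)]
    # Largest contributors first.
    return sorted(table, key=lambda row: row[1], reverse=True), overall

# Hypothetical abundances of three species in two groups of samples.
species = ["sp_1", "sp_2", "sp_3"]
burnt = [[12, 1, 0], [10, 2, 1], [11, 0, 1], [9, 1, 0]]
unburnt = [[1, 8, 9], [0, 10, 7], [2, 9, 8], [1, 7, 10]]

table, overall = simper(burnt, unburnt, species)
print("average dissimilarity:", round(overall, 3))
for name, contribution, percent in table:
    print(name, round(contribution, 3), str(round(percent, 1)) + "%")
```

The percentages sum to 100, and the species at the top of the table are the ones to examine first when interpreting why the groups differ.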

7 Timeline

  • Week 9: Introduction, PCA of lecture data, and designing model systems (in groups) to test multivariate hypotheses for your reports.
  • Week 10: Group presentations of model systems and experimental designs; performing nMDS, PERMANOVA (and ANOSIM), and SIMPER on sample data.
  • Week 11: Submitting your complete data files to Canvas and analysing your report data.
  • Week 12: Additional analyses, a summary of findings, and guidelines for describing multivariate methods and results in your report.