Microbiome Insights Blog

Shotgun Metagenomic Sequencing: Determining Depth

Written by Ruairi Robertson, PhD | Oct 12, 2022 7:29:20 PM

Before conducting a microbiome study with shotgun metagenomic sequencing, it is important to consider how deeply you will sequence your samples.

Microbiome sequencing technologies and databases have advanced rapidly in the last 5-10 years, allowing businesses, researchers and public organisations to analyse microbiomes in greater depth than ever before, using a number of different technologies. While metagenomic sequencing could previously identify only a handful of microbial families, it is now possible to identify thousands of microbial species and strains within a sample, and simultaneously examine their genetic make-up in fine detail. With greater depth, however, comes greater cost and greater analytical complexity. In this blog, we outline the strengths and limitations of “shallow” and “deep” shotgun sequencing and the factors to consider when choosing sequencing depth, so that you can make informed decisions when designing your microbiome study.

What is shotgun sequencing?

In the past, a majority of microbiome research was conducted using 16S rRNA gene sequencing. Although 16S sequencing has benefits, microbiome research is increasingly relying on shotgun sequencing because of its ability to analyse microbiomes in greater detail. A quick recap on 16S vs shotgun sequencing:

16S sequencing is a form of amplicon sequencing, alongside 18S and ITS sequencing. Amplicon sequencing amplifies only one or more hypervariable marker regions rather than the whole genome. This means that each amplicon method can only identify certain groups of microbes (bacteria and archaea with 16S, microeukaryotes with 18S, fungi with ITS) and cannot examine their genetic potential.

Shotgun sequencing works by breaking DNA down into small pieces, sequencing each of these pieces, and then either stitching them back together computationally or aligning them to reference databases. In microbiome sequencing, this allows researchers to identify thousands of different bacteria, viruses, fungi and other microbes within one sample. Unlike 16S sequencing, shotgun sequencing can read all parts of the genome, meaning that it can also identify microbial genes and their potential functions. Not only can shotgun sequencing identify many different types of microbes already in databases, it can also be used to recover genomes directly from the sequencing data, known as metagenome-assembled genomes (MAGs). Shotgun sequencing can therefore provide a more complete insight into microbiome composition, gene diversity and genomic diversity. Although shotgun sequencing is more expensive than 16S rRNA gene sequencing and other forms of amplicon sequencing, the cost per unit of sequencing data has been decreasing over time.

The importance of sequencing depth

The depth of sequencing refers to the number of sequencing “reads” generated per sample, which varies with the instrument and with the number of samples co-sequenced in the same run. As sequencing technologies have advanced, throughput has increased from thousands of reads to tens of billions of reads per sequencing run, making far greater depth per sample achievable. This has made it possible to identify extremely low-abundance species and to discover new species with relatively high confidence. However, ultra-deep metagenomic sequencing is not always necessary. Shallow shotgun sequencing can provide comprehensive taxonomic and functional microbiome data at a cost similar to amplicon sequencing. For instance, one study found that shallow shotgun sequencing (0.5 million reads) and ultra-deep sequencing (2.5 billion reads) were 97% correlated for species composition and 99% correlated for metagenomic profiles. The results from shallow shotgun sequencing were also highly correlated with 16S sequencing. This makes shallow shotgun sequencing a suitable alternative for certain studies, including those where budgets are limited but functional microbiome data are needed, studies with many samples (such as longitudinal studies), and those that do not rely on strain characterization or in-depth analysis of genetic variation within a microbiome.
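Because depth per sample depends on how many samples share a run, a quick back-of-the-envelope calculation can help at the planning stage. The sketch below is only illustrative: the function names are hypothetical, the 20 billion read run size is an assumed round figure, and it assumes reads are split perfectly evenly between samples.

```python
def reads_per_sample(reads_per_run: int, n_samples: int) -> float:
    """Average reads per sample if a run is shared evenly between samples."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return reads_per_run / n_samples


def max_samples_per_run(reads_per_run: int, target_reads_per_sample: int) -> int:
    """Largest number of samples that still reach the target depth."""
    if target_reads_per_sample <= 0:
        raise ValueError("target_reads_per_sample must be positive")
    return reads_per_run // target_reads_per_sample


if __name__ == "__main__":
    run_output = 20_000_000_000  # assumed run size, for illustration only
    # Depths taken from the text: 0.5 million (shallow), 20 million (deep)
    for label, depth in [("shallow", 500_000), ("deep", 20_000_000)]:
        print(label, max_samples_per_run(run_output, depth))
```

Under these assumptions the same run could hold up to 40,000 shallow samples or 1,000 deep samples. This is an upper bound from read counts alone; in practice the number of available sample indexes, uneven pooling and reads lost to quality filtering will all lower it.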

In addition to cost, what factors should you consider when choosing the sequencing depth for each sample in your microbiome study?

Detection limit

Growing evidence suggests that rare, low-abundance species can have major implications for overall microbiome function. The depth of sequencing determines the confidence with which you can identify the rarest microbes and genes within your sample. However, in certain circumstances, such as identifying fungi in the human gut, amplicon sequencing may be preferable to deep metagenomic sequencing. There are also increasing efforts to identify new species and strains living within the human gut microbiome, which requires deep sequencing and subsequent bioinformatic assembly of metagenome-assembled genomes (MAGs). A number of recent studies show that deep sequencing (>20 million reads per sample) is required to identify low-abundance taxa (<0.1% abundance) and novel strains.
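To see why depth drives the detection limit, it can help to treat read sampling as a simple counting process. The sketch below is a simplification rather than a validated detection model: it assumes the number of reads from a taxon follows a Poisson distribution with mean equal to total reads multiplied by the taxon's share of reads, and it ignores genome size, host DNA and mapping losses. The function names, the 0.001% abundance and the 10-read threshold are all hypothetical values chosen for illustration.

```python
import math


def expected_taxon_reads(total_reads: int, relative_abundance: float) -> float:
    """Expected reads from a taxon that contributes this fraction of all reads."""
    return total_reads * relative_abundance


def prob_at_least(min_reads: int, expected_reads: float) -> float:
    """P(X >= min_reads) for X ~ Poisson(expected_reads)."""
    term = math.exp(-expected_reads)  # P(X = 0)
    cumulative = 0.0
    for i in range(min_reads):
        cumulative += term                # add P(X = i)
        term *= expected_reads / (i + 1)  # step to P(X = i + 1)
    return max(0.0, 1.0 - cumulative)


if __name__ == "__main__":
    abundance = 1e-5  # hypothetical taxon at 0.001% of reads
    min_reads = 10    # hypothetical minimum read count for a confident call
    for depth in (500_000, 20_000_000):
        mean = expected_taxon_reads(depth, abundance)
        print(depth, mean, round(prob_at_least(min_reads, mean), 3))
```

With these assumed values, the shallow run expects only about 5 reads from the taxon and rarely reaches the threshold, while the deep run expects about 200 and almost always does. The right threshold depends on the profiling tool and database, so treat the numbers as a way to compare designs rather than as a guarantee of detection.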

Genetic resolution

In addition to identifying rare microbial species and assembling whole genomes, deep sequencing is also able to identify single nucleotide variants (SNVs) in individual microbial strains within a sample. This genetic resolution can allow researchers to examine how different microbial species evolve and mutate within particular microbiomes, environments or individuals. For example, SNVs in individual microbiome species can distinguish healthy individuals from those with type 2 diabetes, amongst other diseases. Sequencing depth is important for identifying such SNVs. One study found that shallow-depth shotgun sequencing was insufficient to comprehensively identify functionally important SNVs within the human gut microbiome, something that could only be achieved with ultra-deep sequencing. Another study found that the observed diversity of antimicrobial resistance (AMR) genes within environmental samples is highly dependent on sequencing depth: at least 80 million reads were required to capture the full richness of AMR genes within a sample.

Study type

Microbiome studies can vary widely, from large-scale population screening studies to in-depth analysis of the genetic variation of microbial species within a particular microbiome. These different study types may warrant different sequencing approaches and sequencing depths. Studies examining the genetic diversity of whole microbial genomes or distinguishing individual strains may require greater sequencing depth, while larger population studies or those examining broad taxonomic and functional characteristics of microbiome composition may only need shallow sequencing.

Sample type

The diversity, complexity and functionality of the microbiome within a given sample can vary widely, from relatively simple skin microbiome samples to highly diverse soil microbiome samples. Because “deep” sequencing increases the ability to detect, with confidence, microbes present at very low abundance, depth deserves particular attention for samples with high diversity or low evenness, where low-abundance microbes may be important. For example, fungi in the human gut are very low in abundance but may be very important for health, and hence may require deep sequencing or ITS amplicon sequencing to detect. Host DNA can also affect microbiome sequencing results: the more reads that come from the host, the fewer remain for the microbes, so greater sequencing depth is required. Samples from skin swabs, for example, can yield >90% human reads when sequenced. Finally, the overall microbial biomass of a sample may also influence the sequencing depth that is required. Wastewater samples have high biomass, while saliva samples will have relatively low biomass.
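The effect of host DNA on the depth you need is simple arithmetic, sketched below. The function names are hypothetical, and the example assumes the host fraction is known in advance (for instance from a pilot run) and that all non-host reads are microbial.

```python
def microbial_reads(total_reads: float, host_fraction: float) -> float:
    """Reads left for the microbiome after host reads are removed."""
    if not 0.0 <= host_fraction <= 1.0:
        raise ValueError("host_fraction must be between 0 and 1")
    return total_reads * (1.0 - host_fraction)


def total_reads_needed(target_microbial_reads: float, host_fraction: float) -> float:
    """Total reads to sequence so that the target number are microbial."""
    if not 0.0 <= host_fraction < 1.0:
        raise ValueError("host_fraction must be at least 0 and below 1")
    return target_microbial_reads / (1.0 - host_fraction)


if __name__ == "__main__":
    # Skin-swab style example: 90% of reads are human (figure from the text)
    print(round(microbial_reads(20_000_000, 0.9)))     # microbial reads from a 20M-read sample
    print(round(total_reads_needed(20_000_000, 0.9)))  # total reads for 20M microbial reads
```

At 90% host reads, a 20 million read sample leaves roughly 2 million microbial reads, and reaching 20 million microbial reads takes roughly 200 million reads in total, a tenfold increase.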

Bioinformatic approaches

Your scientific question will determine which bioinformatic approach you use in your study and therefore how deeply you need to sequence your microbiome samples. There are two main approaches to analysing metagenomic sequencing data: direct-read mapping and metagenomic assembly. Direct-read mapping involves aligning your sequencing reads to reference genomes and, for functional profiling, to a collection of genes. This approach relies on curated reference databases but is a simple way to assess microbiome composition and function. The second approach, metagenomic assembly, involves the de novo assembly of microbial genomes, but is more complex and computationally expensive. As direct-read mapping only requires reads from certain parts of the genome in order to identify a microbial gene or species, it generally requires less sequencing depth. On the other hand, deep sequencing is required in order to fully assemble metagenome-assembled genomes (MAGs) from a complex microbiome sample containing thousands of species and strains. Your bioinformatic approach will also depend on your computational resources. While amplicon data can be analysed on a well-powered modern desktop computer, deep metagenomic sequencing typically requires servers, data storage and high-performance computing.
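One way to see why assembly needs more depth than read mapping is to estimate genome coverage, meaning the average number of times each base of a genome is sequenced: coverage = (reads from the genome × read length) / genome size. The sketch below applies that standard relation to a single taxon in a mixed sample. The function names, the 150 bp read length, the 5 Mb genome size and the 10x coverage target are assumptions for illustration, not recommendations, and relative abundance is treated as the taxon's share of reads.

```python
def expected_coverage(total_reads: float, relative_abundance: float,
                      read_length_bp: int, genome_size_bp: int) -> float:
    """Average per-base coverage of one genome in a mixed sample."""
    taxon_reads = total_reads * relative_abundance
    return taxon_reads * read_length_bp / genome_size_bp


def reads_for_coverage(target_coverage: float, relative_abundance: float,
                       read_length_bp: int, genome_size_bp: int) -> float:
    """Total reads per sample needed to reach the target coverage."""
    if relative_abundance <= 0:
        raise ValueError("relative_abundance must be positive")
    return target_coverage * genome_size_bp / (read_length_bp * relative_abundance)


if __name__ == "__main__":
    read_len = 150        # assumed read length in bp
    genome = 5_000_000    # assumed bacterial genome size in bp
    abundance = 0.001     # taxon at 0.1% of reads
    print(expected_coverage(20_000_000, abundance, read_len, genome))
    print(round(reads_for_coverage(10, abundance, read_len, genome)))
```

Under these assumptions, 20 million reads cover a 0.1% abundance genome at well under 1x, which may be enough for reads to hit reference markers but is far from enough to assemble the genome. Reaching the assumed 10x target would take hundreds of millions of reads per sample.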

Do you need a shotgun sequencing partner? 

Sequencing depth is just one factor that is important when designing your microbiome study. For other tips, consult our microbiome study guide or get in touch with us. Our team of experts has extensive experience supporting researchers, businesses and other professionals in designing their microbiome studies. Microbiome Insights uses the most advanced sequencing technologies available to analyse your samples with >20 billion sequencing reads per run. In addition, our bioinformatics support team can help you to analyse your microbiome data using standard or bespoke analytical pipelines. Reach out to the team today and we will be happy to help.