Before conducting a microbiome study with shotgun metagenomic sequencing, it is important to consider how deeply you will sequence your samples.
Microbiome sequencing technologies and databases have advanced rapidly in the last 5-10 years, allowing businesses, researchers and public organisations to analyse microbiomes in greater depth than ever before, using a number of different technologies. Whereas metagenomic sequencing could previously identify only a handful of microbial families, it is now possible to identify thousands of microbial species and strains within a sample and simultaneously examine their genetic make-up in fine detail. With greater depth, however, comes greater cost and greater analytical complexity. In this blog, we outline the pros and cons of “shallow” and “deep” shotgun sequencing and the factors to consider when choosing sequencing depth, so that you can make informed decisions when designing your microbiome study.
What is shotgun sequencing?
In the past, a majority of microbiome research was conducted using 16S rRNA gene sequencing. Although 16S sequencing has benefits, microbiome research is increasingly relying on whole genome shotgun sequencing because of its ability to analyse microbiomes in greater detail. A quick recap on 16S vs shotgun sequencing:
16S sequencing is a form of amplicon sequencing, alongside 18S and ITS sequencing. Amplicon sequencing involves the amplification of only one or a few hypervariable regions of a marker gene. This means that amplicon sequencing can only identify certain microbes (bacteria and archaea with 16S; microeukaryotes and fungi with 18S and ITS) and cannot examine their genetic potential.
Shotgun sequencing works by breaking DNA into small pieces, sequencing each of these pieces, and then either stitching them back together computationally or aligning them to reference databases. In microbiome sequencing, this allows researchers to identify thousands of different bacteria, viruses, fungi and other microbes within one sample. Unlike 16S sequencing, shotgun sequencing can read all parts of the genome, meaning that it can also identify microbial genes and their potential functions. Not only can shotgun sequencing identify many different types of microbes already in databases, it can also be used to discover new species by reconstructing metagenome-assembled genomes (MAGs). Shotgun sequencing can therefore provide detailed insight into microbiome composition, gene diversity and genomic diversity. Although shotgun sequencing is more expensive than 16S and other forms of amplicon sequencing, the cost per unit of sequencing data has been decreasing over time.
The importance of sequencing depth
The depth of sequencing refers to the number of times a given nucleotide in a genome has been “read”. Naturally, this depends on the number of sequencing “reads” generated per sample, which varies with the sequencing instrument and the number of samples analysed in the same run. As sequencing technologies have advanced, throughput has increased from thousands of reads to tens of billions of reads per sequencing run. This, alongside improving databases, increasingly makes it possible to identify species at extremely low abundance (<0.01%) and to assemble the genomes of completely new species with relatively high confidence. However, ultra-deep metagenomic sequencing isn’t always necessary. Shallow shotgun sequencing can provide comprehensive taxonomic and functional microbiome data at a cost similar to 16S sequencing, making it useful for large or longitudinal studies. So what factors do you need to consider when choosing the depth of sequencing for each sample in your microbiome study?
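To make the link between read count and depth concrete, here is a minimal sketch of the standard back-of-the-envelope estimate: mean coverage is the number of bases sequenced from a genome divided by the size of that genome. The function name, the parameters and the example numbers are illustrative assumptions rather than values from any particular study, and the estimate treats the fraction of reads from a genome as known and ignores uneven coverage.

```python
def expected_coverage(total_reads, read_length_bp, genome_size_bp, read_fraction=1.0):
    """Estimate the mean number of times each base of one genome is read.

    total_reads    -- reads generated for the sample (count each mate of a pair)
    read_length_bp -- average read length in base pairs
    genome_size_bp -- size of the genome of interest in base pairs
    read_fraction  -- assumed fraction of the sample's reads that come from this genome
    """
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    bases_from_genome = total_reads * read_fraction * read_length_bp
    return bases_from_genome / genome_size_bp


# Hypothetical example: 10 million 150 bp reads, with 1% of reads
# coming from a 5 Mbp bacterial genome.
coverage = expected_coverage(10_000_000, 150, 5_000_000, read_fraction=0.01)
print(f"Expected mean coverage: {coverage:.1f}x")
```

In this made-up scenario the genome is covered about three times on average, which illustrates why a species that contributes only a small share of reads needs many more total reads before each of its bases has been read repeatedly.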
Factors to consider when choosing sequencing depth
Often, the methods you choose to sequence your microbiome samples come down to cost. Ultimately, greater sequencing depth means greater cost. Ultra-deep sequencing may be too costly for a study with hundreds or thousands of samples, but may be good value for money for a smaller study examining rare species within a particular microbiome. One way to lower cost is to use shallow shotgun sequencing, an approach that provides shotgun sequencing at a similar cost to 16S sequencing. One study found that shallow shotgun sequencing (0.5 million reads) and ultra-deep sequencing (2.5 billion reads) were 97% correlated for species composition and 99% correlated for metagenomic profiles. The results from shallow shotgun sequencing were also highly correlated with 16S sequencing. This makes shallow shotgun sequencing a suitable alternative for certain studies, including those where budgets are limited but functional microbiome data are needed, studies with many samples, such as longitudinal studies, and those that do not rely on strain characterisation or in-depth analysis of genetic variation within a microbiome.
Growing evidence suggests that rare, low-abundance species can have major implications for overall microbiome function. The depth of sequencing determines how confidently you can identify the rarest microbes within your sample. However, in certain circumstances, such as identifying fungi in the human gut, amplicon sequencing can be more useful than deep metagenomic sequencing. There are also increasing efforts to identify new species and strains living within the human gut microbiome, something which requires deep sequencing and subsequent bioinformatic assembly of metagenome-assembled genomes (MAGs). A number of recent studies show that deep sequencing (>20 million reads per sample) is required to identify these low-abundance taxa (<0.1% abundance) and to identify novel strains.
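A simple way to see why rare taxa need more reads is to model the number of reads drawn from a taxon as a Poisson count. The sketch below is an illustration under that assumption only: the function name, the `min_reads` detection threshold and the example abundances are hypothetical, and real detection limits also depend on genome size, database coverage and the classifier used.

```python
import math


def detection_probability(total_reads, read_fraction, min_reads=10):
    """Probability of observing at least `min_reads` reads from one taxon.

    Assumes reads are sampled independently, so the count from a taxon that
    contributes `read_fraction` of all reads is approximately Poisson with
    mean total_reads * read_fraction.
    """
    mean_reads = total_reads * read_fraction
    if min_reads <= 0:
        return 1.0
    if mean_reads == 0:
        return 0.0
    # Sum P(X = k) for k < min_reads in log space to avoid overflow.
    p_below = sum(
        math.exp(-mean_reads + k * math.log(mean_reads) - math.lgamma(k + 1))
        for k in range(min_reads)
    )
    return max(0.0, 1.0 - p_below)


# Hypothetical taxon contributing 0.001% of reads, at two sequencing depths.
for depth in (500_000, 20_000_000):
    p = detection_probability(depth, 1e-5, min_reads=10)
    print(f"{depth:>12,} reads -> P(at least 10 reads) = {p:.3f}")
```

Under these assumed numbers the taxon is expected to yield only about 5 reads at 0.5 million reads per sample, so it rarely reaches a 10-read threshold, whereas at 20 million reads it is expected to yield about 200 reads and is detected almost every time.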
In addition to identifying rare microbial species and assembling whole genomes, deep sequencing can even identify single nucleotide variants (SNVs) in individual microbial strains within a sample. This genetic resolution allows researchers to examine how different microbial species evolve and mutate within particular microbiomes, environments or individuals. For example, SNVs in individual microbiome species can distinguish healthy individuals from those with type 2 diabetes, amongst other diseases. Sequencing depth is important for identifying such SNVs. One study found that shallow shotgun sequencing was insufficient to comprehensively identify functionally important SNVs within the human gut microbiome, something that could only be achieved with ultra-deep sequencing. Another study found that the observed diversity of antimicrobial resistance (AMR) genes within environmental samples is highly dependent on sequencing depth, with at least 80 million reads required to capture the full richness of AMR genes within a sample.
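The dependence of observed richness on depth can be illustrated with a small rarefaction simulation. This is a toy sketch and not the method used in the studies mentioned above: the community, the gene names and the depths are invented, and reads are drawn at random in proportion to assumed abundances.

```python
import random


def rarefaction_curve(abundances, depths, seed=0):
    """Count distinct features observed at increasing read depths.

    abundances -- dict mapping feature name to relative weight (hypothetical)
    depths     -- non-empty list of read depths to evaluate
    One read stream is simulated at the largest depth and the shallower depths
    are taken as prefixes of it, so richness can never decrease with depth.
    """
    rng = random.Random(seed)
    features = list(abundances)
    weights = [abundances[name] for name in features]
    ordered_depths = sorted(depths)
    reads = rng.choices(features, weights=weights, k=ordered_depths[-1])

    seen = set()
    curve = {}
    position = 0
    for depth in ordered_depths:
        seen.update(reads[position:depth])  # add only the newly sampled reads
        position = depth
        curve[depth] = len(seen)
    return curve


# Hypothetical community: 5 abundant genes and 95 genes that are 1000x rarer.
community = {f"common_gene_{i}": 1000 for i in range(5)}
community.update({f"rare_gene_{i}": 1 for i in range(95)})

curve = rarefaction_curve(community, [100, 1_000, 10_000, 100_000])
for depth, richness in curve.items():
    print(f"{depth:>8,} reads -> {richness} of {len(community)} genes observed")
```

In a simulation like this the abundant genes appear almost immediately, while the rare genes keep accumulating as depth increases, which is the pattern that makes richness estimates sensitive to how deeply a sample is sequenced.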
Microbiome studies can vary widely from large-scale population screening studies to in-depth analysis of genetic variation of microbial species within a particular microbiome. These different study types may warrant different sequencing approaches and sequencing depths. Studies examining genetic diversity of full microbiome genomes and distinguishing individual strains may require greater sequencing depth, while larger population studies or those examining broad taxonomic and functional characteristics of microbiome composition may only need shallow sequencing.
The diversity, complexity and functionality of the microbiome within a given sample can vary widely, from relatively simple skin microbiome samples to highly diverse soil microbiome samples. Because “deep” sequencing increases both the ability to detect microbes present at very low abundance and the confidence in that detection, it is important to consider how deeply to sequence samples with high diversity or low evenness, where low-abundance microbes may be important. For example, fungi in the human gut are very low in abundance but may be very important for health, and hence may require deep sequencing or ITS amplicon sequencing to detect. Host DNA content can also affect microbiome sequencing results and increase the sequencing depth required. Sequencing data from skin swabs, for example, can consist of >90% human reads, meaning that microbial reads are diluted. Finally, the overall microbial biomass of a sample may also influence the sequencing depth that is required. Wastewater samples have high biomass, while saliva samples will have relatively low biomass.
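The effect of host DNA on usable depth is simple arithmetic, sketched below. The function names and the target of 5 million microbial reads are hypothetical choices for illustration; only the idea that host reads dilute microbial reads comes from the text above.

```python
def microbial_reads(total_reads, host_fraction):
    """Approximate number of non-host reads left after host reads are removed."""
    if not 0.0 <= host_fraction <= 1.0:
        raise ValueError("host_fraction must be between 0 and 1")
    return round(total_reads * (1.0 - host_fraction))


def total_reads_needed(target_microbial_reads, host_fraction):
    """Approximate total reads required to reach a target number of microbial reads."""
    if not 0.0 <= host_fraction < 1.0:
        raise ValueError("host_fraction must be at least 0 and below 1")
    return round(target_microbial_reads / (1.0 - host_fraction))


# Hypothetical skin swab in which 90% of reads are human.
print(microbial_reads(50_000_000, 0.90))
print(total_reads_needed(5_000_000, 0.90))
```

With 90% host reads, a sample needs roughly ten times as many total reads to yield the same number of microbial reads as a sample with no host contamination, which is why host-rich sample types often call for deeper sequencing or host depletion before sequencing.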
Your scientific question will determine which bioinformatic approach you use in your study and therefore how deeply you need to sequence your microbiome samples. There are two main approaches to analysing metagenomic sequencing data: direct read mapping and metagenomic assembly. Direct read mapping involves aligning your sequencing reads to reference genomes. This approach relies on curated reference databases but is a simple way to assess microbiome composition and function. The second approach, metagenomic assembly, involves the de novo assembly of microbial genomes, but is more complex and computationally expensive. As direct read mapping only requires reads from certain parts of the genome in order to identify a microbial gene or species, it generally requires less sequencing depth. On the other hand, deep sequencing is required in order to fully assemble metagenome-assembled genomes (MAGs) from a complex microbiome sample containing thousands of species and strains. Your bioinformatic approach will also depend on your computational resources. While 16S data can be analysed on your own computer, deep metagenomic sequencing typically requires servers, ample data storage and high-performance computing.
Do you need a shotgun sequencing partner?
Sequencing depth is just one factor that is important when designing your microbiome study. For other tips, consult our microbiome study guide or get in touch with us. Our team of experts has extensive experience supporting researchers, businesses and other professionals in designing their microbiome studies. Microbiome Insights uses the most advanced sequencing technologies available to analyse your samples with >20 billion sequencing reads per run. In addition, our bioinformatics support team can help you to analyse your microbiome data using standard or bespoke analytical pipelines. Reach out to the team today and we will be happy to help.