Shotgun Metagenomic Sequencing Guide

Microbiome research relies on accurate identification and classification of thousands of microbial species in a single biological sample. To date, this has largely been conducted using amplicon sequencing, such as 16S rRNA gene sequencing for the identification of bacteria and archaea. However, shotgun metagenomic sequencing is a powerful tool that has distinct advantages, such as the identification of fungi, viruses, and other microorganisms in addition to elucidation of microbial gene functions. For someone who has never conducted shotgun metagenomic sequencing, it may seem difficult to know where to start.

This guide will explore the principles and applications of shotgun metagenomic sequencing, including sample preparation and data analysis. We will also discuss the limitations and potential pitfalls of shotgun sequencing and provide tips for optimizing your experiment.

Shotgun Metagenomic Sequencing – What is it?

Shotgun metagenomic sequencing is a powerful form of DNA sequencing that can identify multiple types of microbes and microbe functions within a biological sample. Unlike amplicon sequencing, which sequences individually selected gene regions such as the 16S rRNA gene found in bacteria and archaea, shotgun metagenomic sequencing involves sequencing all regions of genomic DNA from microorganisms in a sample.

The term “shotgun” sequencing is derived from the process used, whereby the DNA within the sample is fragmented into many small pieces, much like a shotgun would break something into pieces. The small DNA pieces are then sequenced, and their gene sequences are stitched back together using bioinformatics. This allows for the analysis of a wide range of genes and provides comprehensive insights into the microorganisms in the sample and their genetic potential.
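
The fragment-and-stitch idea can be illustrated with a toy sketch. This is not how production assemblers work: it assumes error-free reads cut at a fixed step (real fragmentation is random), a short made-up sequence with no repeats, and a simple greedy merge of the two reads with the longest overlap. The function names `fragment`, `overlap`, and `greedy_assemble` are hypothetical and chosen only for illustration.

```python
def fragment(sequence: str, read_len: int, step: int) -> list[str]:
    """Cut a sequence into overlapping reads (fixed step; a simplification)."""
    return [sequence[i:i + read_len]
            for i in range(0, len(sequence) - read_len + 1, step)]


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)


genome = "ATGGCGTACCTTGAAC"  # made-up 16-base sequence, not real data
reads = fragment(genome, read_len=8, step=4)
print(reads)                   # three 8-base reads, each overlapping the next by 4
print(greedy_assemble(reads))  # a single contig equal to the original sequence
```

With real data, sequencing errors, repeats, and uneven coverage make this much harder, which is why dedicated assembly software is used in practice.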

Shotgun Metagenomic Sequencing – What is it used for?

Shotgun metagenomic sequencing is used to identify the wide range of microbes and their genetic functions within a particular sample. It therefore has a number of applications in industry, medicine, public health, and more:

Environmental microbiology: Shotgun metagenomic sequencing can be used to identify and classify microorganisms present in soil, water, and air samples, and to gain insights into their functional capabilities and the roles they play in the environment. For example, it has been used to study microbes in soil samples and permafrost, which may give insights into the effects of climate change on microbial life.

Medical microbiology: Shotgun metagenomic sequencing can be used to identify and classify microorganisms present in clinical samples, such as those collected from the human microbiome or from infected tissue. This can help diagnose and treat infections, provide insights into the role of the microbiome in health and disease, or help detect and prevent hospital infection outbreaks.

Food microbiology: Shotgun metagenomic sequencing can be used to identify and classify microorganisms present in food products, such as fermented foods and beverages, or to ensure food safety and quality. It can also help to identify and track food-borne disease outbreaks.

Industrial microbiology: Shotgun metagenomic sequencing can be used to identify and classify microorganisms present in industrial processes, such as those involved in the production of biotechnology products or the treatment of wastewater.

Conducting Your Shotgun Metagenomic Sequencing Study

Sample Preparation

As with all microbiome studies, the collection and storage of your samples are critical to obtaining accurate, reliable, and reproducible gene sequencing results. Samples can be collected in a number of ways, depending on whether they are human fecal samples, soil samples, water samples, swabs, or any other type of sample. In general, the three most important factors to consider are:

  1. Sterility: Sample containers must be sterile in order to prevent contamination from other microbes.

  2. Temperature: To preserve the microbes within a sample, it is critical to freeze the sample as soon as possible after collection. Options include storing samples in -20 °C or -80 °C freezers or snap-freezing them in liquid nitrogen. Repeated freeze-thaw cycles reduce the consistency of microbiome results, so it may be necessary to aliquot samples prior to freezing.

  3. Time: Timing is essential to sample preservation. In general, it is optimal to freeze samples as quickly as possible. When this is not possible, temporary storage at 4 °C may be suitable, or preservation buffers can be used to maintain the integrity of samples for hours to days before they are frozen.

DNA Extraction

Once the sample has been collected and prepared, the next step is to extract the DNA from the microorganisms in the sample. This is typically done using a DNA extraction kit, which uses a combination of chemical and physical methods to separate the DNA from other cellular components. There are different DNA extraction kits available, and the choice of kit will depend on the type of sample being analyzed and the specific goals of the experiment. The selection of the DNA extraction kit has a significant impact on the picture of the microbial community that will be observed, and influences the ability to compare studies. In general, all DNA extraction kits include the following steps:

  1. Lysis: Lysis refers to the process of breaking open cells (and, in eukaryotic microbes such as fungi, their nuclei) in order to release their contents, including DNA. This is usually done with a combination of chemical or enzymatic treatment (such as adding detergents or enzymes) and mechanical disruption (such as shaking or bead beating).

  2. Precipitation: Once broken open, it is necessary to separate the DNA from other cell contents. This is usually done by adding a salt solution and alcohol.

  3. Purification: The precipitated DNA is then washed to remove remaining impurities and resuspended in water or an aqueous buffer.

In some cases, additional pre- or post-treatment steps are needed to break hard-to-lyse structures such as spores, separate specific components of the microbiome such as viruses, or remove contaminants that co-purify with DNA, such as soil humic acids. This may involve adding enzymes to break down cell walls, using heat to denature proteins, or using physical methods to separate cells from debris. Read our DNA extraction blog for more details about how to optimize DNA extraction for microbiome studies.

Library Preparation

Library preparation refers to the series of steps taken to prepare the DNA for sequencing. For shotgun metagenomic sequencing, these steps include the following:

  1. Fragmenting the DNA: This process uses mechanical or enzymatic methods to break up the DNA into short pieces so that it can be sequenced.

  2. Ligating molecular ‘barcodes’ (‘index adapters’) to the DNA: This allows each individual sample to be identified after it is sequenced.

  3. Cleaning up the DNA: This ensures that the DNA is of the right size and is free from impurities for the best sequencing results.
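
The role of the index adapters in step 2 can be sketched in a few lines. The example below is a simplified illustration, assuming each read arrives together with the index sequence that was read for it and that indexes match exactly (no sequencing errors). The index sequences, sample names, and the function name `demultiplex` are all hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group reads by sample using an exact match on the index sequence.

    `reads` is an iterable of (index_sequence, read_sequence) pairs.
    Reads whose index is not in the sample sheet go to "undetermined".
    """
    by_sample = defaultdict(list)
    for index_seq, read_seq in reads:
        sample = index_to_sample.get(index_seq, "undetermined")
        by_sample[sample].append(read_seq)
    return dict(by_sample)


# Hypothetical sample sheet: index sequence -> sample name
sample_sheet = {"ACGTAC": "stool_01", "TGCATG": "soil_01"}

# Made-up pooled reads as (index, read) pairs
pooled = [
    ("ACGTAC", "GGATTACA"),
    ("TGCATG", "CCGGTTAA"),
    ("ACGTAC", "TTGACCAG"),
    ("NNNNNN", "AAAAAAAA"),  # unreadable index
]

for sample, sample_reads in demultiplex(pooled, sample_sheet).items():
    print(sample, len(sample_reads))
```

In practice this step is normally handled by the sequencing platform's own software, which also tolerates a small number of mismatches in the index.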

Sequencing

Once the library has been prepared, the next step is to sequence the DNA fragments using a high-throughput sequencing platform. During sequencing, the DNA fragments are randomly amplified and sequenced using a combination of chemistry and optics. The resulting data consist of short DNA sequences (reads) that can be aligned to databases to identify which microbial species they belong to and which genes they encode. A number of different sequencing platforms are available for shotgun sequencing, with read length and throughput being the most important factors in the selection of the platform.

Data Analysis

Once the sequencing is complete, the next step is to analyze the data to identify and classify the microorganisms present in the sample. This is conducted using bioinformatics pipelines that help to clean up the sequencing data by removing human reads and sequencing errors, and align the cleaned reads to public databases such as the National Center for Biotechnology Information (NCBI) database. There are two primary approaches to data analysis of shotgun metagenomic sequencing, depending on the specific goals of the experiment:

  1. Metagenome assembly creates partial or full microbial genomes by stitching together the sequenced fragments of DNA. This is particularly useful if you want to identify new species or strains within a sample, but it is only possible with sufficient sequencing coverage, which can be more expensive. Assembly recovers large contiguous regions (contigs) that show the genomic context of genes, improves taxonomic classification, and, in some cases, allows the recovery of partial or complete genomes.

  2. Comparison of reads to reference databases of microbial genomes and marker genes (using programs such as Kraken, MetaPhlAn, and HUMAnN). This approach relies on curated databases, which let you compare your results to known reference sets, and it requires less sequencing coverage. However, it is less useful for identifying novel species that are not represented within the databases. Many of the profiling and processing pipelines have online tutorials to assist those without bioinformatics expertise. The final results provide the relative abundances of bacteria, fungi, viruses, and other microbes in the sample as well as the relative abundances of microbial genes (e.g. antibiotic resistance genes).
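
To show what "relative abundance" means in the second approach, here is a minimal sketch that turns per-read taxonomic assignments into proportions. It assumes a classifier has already labelled each read with a taxon name (or `None` when no database match was found); the assignments below are made up, and the function name `relative_abundance` is hypothetical rather than part of any of the tools named above.

```python
from collections import Counter


def relative_abundance(assignments):
    """Convert per-read taxon labels into proportions of classified reads.

    Reads labelled None (unclassified) are excluded from the denominator.
    """
    counts = Counter(taxon for taxon in assignments if taxon is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Made-up assignments for eight reads; None means "no database match"
assignments = [
    "Escherichia coli", "Escherichia coli", "Escherichia coli",
    "Bacteroides fragilis", "Bacteroides fragilis",
    "Saccharomyces cerevisiae",
    None, None,
]

for taxon, share in sorted(relative_abundance(assignments).items()):
    print(f"{taxon}: {share:.1%}")

unclassified = assignments.count(None) / len(assignments)
print(f"unclassified reads: {unclassified:.1%}")
```

Note that raw read proportions like these are influenced by genome size, since larger genomes yield more reads per cell, so profiling tools generally apply their own normalization. The unclassified share is worth tracking as well, because it indicates how well the chosen database covers the sample.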

Strengths and Limitations of Shotgun Metagenomic Sequencing

While shotgun metagenomic sequencing is a powerful tool for identifying and characterizing microorganisms, it is important to be aware of both its strengths and its limitations. The main advantage of shotgun metagenomics is higher resolution in the taxonomic profiles: whereas amplicon sequencing typically classifies groups to the genus or species level, shotgun metagenomics allows classification to the species or strain level. In addition, shotgun metagenomics gives access to the genetic potential of the community. On the technical side, since there is no targeted PCR amplification of a marker gene, the primer bias, copy-number bias, and PCR artifacts such as chimeras associated with amplicon sequencing are avoided. The main limitations are as follows:

Cost: Shotgun metagenomic sequencing is more expensive than amplicon sequencing methods such as 16S rRNA gene sequencing. However, depending on how much detail is needed on microbial genomes and on the sample type, it is possible to perform shallow shotgun sequencing at costs similar to 16S amplicon sequencing.

Bioinformatics expertise: Shotgun metagenomic sequencing requires more computational power as well as expertise in bioinformatics to process and analyze the resulting data. However, tutorials and pipelines are publicly available for non-experts.

Databases: Depending on the analysis approach, shotgun metagenomic sequencing often relies on comparison to particular databases, and the accuracy of the taxonomic and functional assignments depends on the quality and coverage of the databases. In addition, since amplicon sequencing has a longer history of use, there are more environments for which relevant information may be available if you want to compare your results to those from other studies.

“Contaminating” reads: As shotgun sequencing looks at all genomic DNA within a sample, there is a higher risk of sequencing DNA from unwanted, non-microbial sources. For example, when analyzing the skin microbiome from a skin swab, a large proportion of sequencing reads will come from human DNA and only a small proportion from microbial DNA.
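
The practical effect of host reads on a sequencing budget can be estimated with simple arithmetic. The sketch below assumes you have a rough estimate of the fraction of reads that will come from the host; the 90% figure and the read targets are illustrative numbers only, not measurements, and the function names are hypothetical.

```python
def microbial_reads(total_reads: float, host_fraction: float) -> float:
    """Reads left for microbial analysis after host reads are removed."""
    if not 0 <= host_fraction < 1:
        raise ValueError("host_fraction must be in the range [0, 1)")
    return total_reads * (1 - host_fraction)


def total_reads_needed(target_microbial_reads: float, host_fraction: float) -> float:
    """Total reads to sequence in order to retain a target number of microbial reads."""
    if not 0 <= host_fraction < 1:
        raise ValueError("host_fraction must be in the range [0, 1)")
    return target_microbial_reads / (1 - host_fraction)


# Illustrative assumption: 90% of reads from a swab are human
host_fraction = 0.9
print(f"{microbial_reads(20_000_000, host_fraction):,.0f} microbial reads from 20 million total")
print(f"{total_reads_needed(5_000_000, host_fraction):,.0f} total reads needed for 5 million microbial reads")
```

Under these assumed numbers, only about a tenth of the sequencing output is usable, so the total depth has to be roughly ten times the microbial depth you are aiming for.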

Tips for Optimizing Your Shotgun Metagenomic Sequencing Experiment

To optimize your shotgun metagenomic sequencing experiments and obtain accurate and reliable results, there are a few key things to consider:

Think about sample selection: Microbiomes can exhibit large temporal and spatial variation over different sampling regions and times. It is therefore essential to select a consistent sample set that is representative of your population or environment.

Create rigorous sample collection protocols: Microbiome samples are at high risk of contamination from other microbes. It is therefore essential to collect, transport, and store your samples as safely as possible to ensure your results are truly representative of the microbiome that you are studying.

Optimize the sequencing coverage: The sequencing coverage, or the number and length of the reads obtained, can significantly impact the accuracy and resolution of the results. It is important to optimize the sequencing coverage to ensure that enough data is obtained to accurately classify and annotate the microorganisms present in the sample. This is particularly important for the identification of different microbial strains or single nucleotide variants. Sequencing depth also controls whether assembly, and all the analyses that depend on it, are possible.

Use negative/positive controls: Include appropriate controls, such as negative controls (e.g. blanks carried through the protocol) and positive controls (e.g. DNA standards), to detect errors and confirm the accuracy of the results. Negative controls help to reveal whether any contamination occurred in your protocol, and positive controls show whether your results are capturing all types of microbes.

Use appropriate data analysis methods: Shotgun metagenomic sequencing analysis relies on a number of different assembly or alignment tools. The choice of these methods will depend on your study and should be considered carefully beforehand.
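
To make the coverage tip above more concrete, the average depth of coverage for a single genome is commonly estimated as (number of reads × read length) / genome size. In a mixed community, only a share of the reads belongs to any one organism, so the sketch below scales by that share. All numbers are illustrative assumptions, and the function name `expected_coverage` is hypothetical.

```python
def expected_coverage(n_reads: int, read_length: int, genome_size: int,
                      fraction_of_reads: float = 1.0) -> float:
    """Average depth of coverage for one genome.

    coverage = n_reads * read_length * fraction_of_reads / genome_size
    `fraction_of_reads` is the assumed share of all reads that comes
    from this organism (1.0 for a pure culture).
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    if not 0 <= fraction_of_reads <= 1:
        raise ValueError("fraction_of_reads must be between 0 and 1")
    return n_reads * read_length * fraction_of_reads / genome_size


# Illustrative assumptions: 10 million reads of 150 bases,
# and a 5-megabase genome that contributes 1% of the reads
depth = expected_coverage(
    n_reads=10_000_000,
    read_length=150,
    genome_size=5_000_000,
    fraction_of_reads=0.01,
)
print(f"expected coverage: about {depth:.1f}x")
```

In this example a low-abundance organism ends up with only about 3x coverage even though 1.5 billion bases were sequenced in total, which illustrates why strain-level analysis and assembly of rarer community members call for deeper sequencing.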

Shotgun Metagenomic Sequencing with Microbiome Insights

The team at Microbiome Insights are experts in shotgun metagenome sequencing and provide a full set of services from sample preparation to bioinformatics analysis. If you still have questions about using shotgun metagenome sequencing for your microbiome study, reach out and the Microbiome Insights team will be happy to help.

About Microbiome Insights

Microbiome Insights, Inc. is a global leader providing end-to-end microbiome sequencing and comprehensive bioinformatic analysis. The company is headquartered in Vancouver, Canada, where samples from around the world are processed in its College of American Pathologists (CAP)-accredited laboratory. Working with clients from pharma, biotech, nutrition, cosmetic, and agriculture companies as well as with world-leading academic and government research institutions, Microbiome Insights has supported over 925 microbiome studies from basic research to commercial R&D and clinical trials. The company's team of expert bioinformaticians and data scientists delivers industry-leading insights including biomarker discovery, machine-learning-based modelling, and customized bioinformatics analysis.