Microbial Metagenome Aligned to Pangenome (gig-map)
After building a microbial pangenome, it can be useful to quantify the relative abundance of each gene group within a collection of metagenome samples. Because organisms are identified on the basis of their gene content, this approach makes it possible to compare the relative abundance of organisms between groups of samples.
Analysis Steps:
- Align metagenome reads to the pangenome using DIAMOND
- Deduplicate any multi-mapping reads with FAMLI
- Filter samples by a minimum number of aligned reads and detected genes
- Summarize the sequencing depth for each gene bin
- Compare the relative abundance of gene bins between samples
graph TD;
Metagenomes --> align[Align Metagenomes to Pangenome];
catalog[Gene Catalog] --> align;
align --> dedup[Deduplicate Alignments];
dedup --> filter_samples[Filter Samples];
bins[Gene Bins] --> summarize[Summarize Gene Bins];
filter_samples --> summarize;
Metadata --> compare[Compare Samples];
summarize --> compare;
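The filtering, summarizing, and comparison steps above can be sketched in a few lines of Python. This is a minimal illustration and not the gig-map implementation: the function names, the nested-dictionary layout of the read counts, and the use of summed read counts as a stand-in for sequencing depth are all assumptions made for the example.

```python
from collections import defaultdict

def filter_samples(counts, min_reads, min_genes):
    """Keep samples with enough aligned reads and enough detected genes."""
    kept = {}
    for sample, gene_counts in counts.items():
        n_reads = sum(gene_counts.values())
        n_genes = sum(1 for n in gene_counts.values() if n > 0)
        if n_reads >= min_reads and n_genes >= min_genes:
            kept[sample] = gene_counts
    return kept

def summarize_bins(counts, gene_to_bin):
    """Sum reads per gene bin, then scale to proportions within each sample."""
    abundance = {}
    for sample, gene_counts in counts.items():
        bin_totals = defaultdict(int)
        for gene, n in gene_counts.items():
            if gene in gene_to_bin:  # genes without a bin are ignored
                bin_totals[gene_to_bin[gene]] += n
        total = sum(bin_totals.values())
        abundance[sample] = (
            {b: n / total for b, n in bin_totals.items()} if total else {}
        )
    return abundance

def mean_by_group(abundance, sample_to_group):
    """Average the relative abundance of each bin within each metadata group.

    A bin missing from one sample counts as zero for that sample; a bin
    missing from every sample in a group is omitted for that group.
    """
    sums = defaultdict(lambda: defaultdict(float))
    sizes = defaultdict(int)
    for sample, bins in abundance.items():
        group = sample_to_group[sample]
        sizes[group] += 1
        for b, frac in bins.items():
            sums[group][b] += frac
    return {
        g: {b: s / sizes[g] for b, s in bins.items()} for g, bins in sums.items()
    }

# Hypothetical toy data: aligned read counts per gene, per sample.
counts = {
    "S1": {"geneA": 30, "geneB": 10, "geneC": 60},
    "S2": {"geneA": 50, "geneB": 50},
    "S3": {"geneA": 2},  # too few reads and genes; dropped by the filter
}
gene_to_bin = {"geneA": "bin1", "geneB": "bin1", "geneC": "bin2"}
sample_to_group = {"S1": "case", "S2": "control", "S3": "control"}

kept = filter_samples(counts, min_reads=10, min_genes=2)
abundance = summarize_bins(kept, gene_to_bin)
group_means = mean_by_group(abundance, sample_to_group)
```

Working with proportions rather than raw counts keeps samples of different sequencing depth comparable, which is why the sketch normalizes within each sample before averaging by group.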
Multiple input datasets can be combined in a single analysis.
Parameters:
- Minimum Alignment Score: Alignment bitscore threshold applied on a per-read level (default: 50)
- Minimum Percent Identity: Minimum percent identity of the amino acid alignment required to retain the alignment (default: 90)
- Maximum E-Value: Maximum E-value threshold used to filter all alignments (default: 0.001)
- Minimum Number of Reads: Minimum number of aligned reads required to retain a sample
- Minimum Number of Genes: Minimum number of aligned genes required to retain a sample
- Category (optional): Metadata variable used to compare groups of samples
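To make the three alignment thresholds concrete, the sketch below applies them to alignments written in the standard 12-column BLAST-style tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). It assumes that gig-map's alignments are available in this layout and that the thresholds are inclusive; neither is confirmed here, and the helper names are hypothetical. DIAMOND also exposes comparable options at alignment time (`--min-score`, `--id`, `--evalue`), so the workflow may well apply these limits inside the aligner instead of afterwards.

```python
from typing import Iterable, Iterator, NamedTuple

class Alignment(NamedTuple):
    read_id: str
    gene_id: str
    pident: float    # percent identity of the amino acid alignment
    evalue: float
    bitscore: float

def parse_tabular(lines: Iterable[str]) -> Iterator[Alignment]:
    """Parse 12-column tab-separated alignment records."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        f = line.split("\t")
        yield Alignment(f[0], f[1], float(f[2]), float(f[10]), float(f[11]))

def passes_filters(
    aln: Alignment,
    min_score: float = 50.0,     # Minimum Alignment Score (bitscore)
    min_identity: float = 90.0,  # Minimum Percent Identity
    max_evalue: float = 0.001,   # Maximum E-Value
) -> bool:
    """Return True if the alignment meets all three thresholds."""
    return (
        aln.bitscore >= min_score
        and aln.pident >= min_identity
        and aln.evalue <= max_evalue
    )

# Hypothetical example records, one alignment per line.
rows = [
    "read1\tgeneA\t98.0\t50\t1\t0\t1\t150\t10\t59\t1e-20\t95.2",
    "read2\tgeneB\t85.0\t50\t7\t0\t1\t150\t10\t59\t1e-10\t70.1",  # identity too low
    "read3\tgeneA\t95.0\t20\t1\t0\t1\t60\t10\t29\t0.5\t30.0",     # E-value and score fail
]

retained = [aln for aln in parse_tabular(rows) if passes_filters(aln)]
retained_reads = [aln.read_id for aln in retained]
```

The Minimum Number of Reads and Minimum Number of Genes parameters act later, on whole samples rather than on individual alignments.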
Workflow Repository: github.com/FredHutch/gig-map
Citations:
- Buchfink, B., Xie, C. & Huson, D. Fast and sensitive protein alignment using DIAMOND. Nat Methods 12, 59–60 (2015). https://doi.org/10.1038/nmeth.3176
- Golob, J.L. & Minot, S.S. In silico benchmarking of metagenomic tools for coding sequence detection reveals the limits of sensitivity and precision. BMC Bioinformatics 21, 459 (2020). https://doi.org/10.1186/s12859-020-03802-0. PMID: 33059593; PMCID: PMC7559173.