Add links to gg2 formatted database
pschloss committed Jun 25, 2024
1 parent 34015cb commit 3bfd4d2
Showing 4 changed files with 17 additions and 3 deletions.
2 changes: 1 addition & 1 deletion _wiki/classify.seqs.md
@@ -10,7 +10,7 @@ neighbor consensus and zap. Taxonomy outlines and reference
sequences can be obtained from the [taxonomy
outline](/wiki/taxonomy_outline) page. The command requires that
you provide a fasta-formatted input and database sequence file and a
-taxonomy file for the reference sequences. To run through the example below, download [Example Data](https://mothur.s3.us-east-2.amazonaws.com/wiki/ExampleDataSet.zip)
+taxonomy file for the reference sequences. To run through the example below, download [Example Data](https://mothur.s3.us-east-2.amazonaws.com/wiki/exampledataset.zip)
and [mothur-formatted version of the RDP training set
(v.9)](https://mothur.s3.us-east-2.amazonaws.com/wiki/trainset9_032012.pds.zip).

10 changes: 10 additions & 0 deletions _wiki/greengenes-formatted_databases copy.md
@@ -0,0 +1,10 @@
+---
+title: 'greengenes2-formatted databases'
+redirect_from: '/wiki/greengenes2-formatted_databases'
+---
+
+The [biocore group](https://github.com/biocore/greengenes2/) released an updated version of the greengenes taxonomy in [October 2022](https://ftp.microbio.me/greengenes_release/2022.10/), which was published in [Nature Biotechnology](https://www.nature.com/articles/s41587-023-01845-1). If you use these files, you should cite McDonald et al.
+
+I have modified the version made available on the greengenes2 ftp server. The most notable difference is that I removed the species level names since more than two thirds of the genera only have one species name. In my opinion, this would give an overly specific sense of the classification of your sequences since there is insufficient diversity within each species. If you would like to see how to get the species names and see how else I modified the files, please see the [mothur blog post](/blog/2014/greengenes-v13_8_99-reference-files) which is the same as the README file found within the download that I am making available.
+
+* [greengenes2 (2020_10, wo/ species level names)](https://mothur.s3.us-east-2.amazonaws.com/wiki/greengenes2_2020_10.wo_sp.tgz)
5 changes: 3 additions & 2 deletions _wiki/miseq_sop.md
@@ -75,8 +75,9 @@ For this tutorial you will need `mothur` and several sets of files:
You can easily substitute these choices (and should) for the reference
and taxonomy alignments using the updated [Silva reference
files](/wiki/Silva_reference_files), [RDP reference
-files](/wiki/RDP_reference_files), and [Greengenes-formatted
-databases](/wiki/Greengenes-formatted_databases). We use the above
+files](/wiki/RDP_reference_files), [Greengenes-formatted
+databases](/wiki/Greengenes-formatted_databases), and [Greengenes2-formatted
+databases](/wiki/greengenes2-formatted_databases). We use the above
files because they're compact and do a pretty good job. The various
classification references perform differently with different sample
types so your mileage may vary. It is generally easiest to decompress
3 changes: 3 additions & 0 deletions _wiki/taxonomy_outline.md
@@ -27,3 +27,6 @@ You can download our version of the..
files](/wiki/Greengenes-formatted_databases): The fasta and
taxonomic outline that greengenes uses with their classifier and can
be used with the Bayesian classifier
+- [ greengenes2 reference
+files](/wiki/greengenes2-formatted_databases): The fasta and
+taxonomic outline that was modified by McDonald et al.
