Here are all the databases we have formatted (on demand) for RDPClassifier and NCBI Blast+.
Please pay attention to each database's licence and to how it should be cited.

For the Silva databases, we also provide reduced versions based on a pintail score threshold. The pintail score indicates the quality of a sequence.
See https://www.arb-silva.de/documentation/faqs/ (section "What do the green, yellow and orange quality bars tell me?") for a brief explanation, or http://aem.asm.org/content/71/12/7724.abstract for the pintail score paper.

##### SILVA/
    16S/ --> data related to the SILVA 16S database (https://www.arb-silva.de/download/arb-files/)
        silva_138.1_16S.tar.gz --> data related to the version 138.1
        silva_138.1_16S_pintail50.tar.gz --> data related to the version 138.1, filtered on pintail score >= 50
        silva_138.1_16S_pintail80.tar.gz --> data related to the version 138.1, filtered on pintail score >= 80
        silva_138.1_16S_pintail100.tar.gz --> data related to the version 138.1, filtered on pintail score = 100
        silva_138_16S.tar.gz --> data related to the version 138
        silva_138_16S_pintail50.tar.gz --> data related to the version 138, filtered on pintail score >= 50
        silva_138_16S_pintail80.tar.gz --> data related to the version 138, filtered on pintail score >= 80
        silva_138_16S_pintail100.tar.gz --> data related to the version 138, filtered on pintail score = 100
        silva_132_16S.tar.gz --> data related to the version 132
        silva_132_16S_pintail50.tar.gz --> data related to the version 132, filtered on pintail score >= 50
        silva_132_16S_pintail80.tar.gz --> data related to the version 132, filtered on pintail score >= 80
        silva_132_16S_pintail100.tar.gz --> data related to the version 132, filtered on pintail score = 100
        silva_128_16S.tar.gz --> data related to the version 128
        silva_128_16S_pintail50.tar.gz --> data related to the version 128, filtered on pintail score >= 50
        silva_128_16S_pintail80.tar.gz --> data related to the version 128, filtered on pintail score >= 80
        silva_128_16S_pintail100.tar.gz --> data related to the version 128, filtered on pintail score = 100
        silva_123_16S.tar.gz --> data related to the version 123
    18S/ --> data related to the SILVA 18S database (https://www.arb-silva.de/download/arb-files/)
        silva_138.1_18S.tar.gz --> data related to the version 138.1
        silva_138_18S.tar.gz --> data related to the version 138
        silva_132_18S.tar.gz --> data related to the version 132
        silva_128_18S.tar.gz --> data related to the version 128
        silva_123_18S.tar.gz --> data related to the version 123
        silva_119-1_18S.tar.gz --> data related to the version 119-1
    SSU/ --> data related to the SILVA SSU database (https://www.arb-silva.de/download/arb-files/)
        silva_138_SSU.tar.gz --> data related to the version 138
    23S/ --> data related to the SILVA 23S database (https://www.arb-silva.de/download/arb-files/)
        silva_138.1_23S.tar.gz --> data related to the version 138.1
        silva_132_23S.tar.gz --> data related to the version 132
        silva_128_23S.tar.gz --> data related to the version 128
        silva_123_23S.tar.gz --> data related to the version 123
    28S/ --> data related to the SILVA 28S database (https://www.arb-silva.de/download/arb-files/)
        SILVA_138.1_28S.tar.gz --> data related to the version 138.1
        SILVA_132_28S.tar.gz --> data related to the version 132
    LSU/ --> data related to the SILVA LSU database (https://www.arb-silva.de/download/arb-files/)
        SILVA_132_LSU.tar.gz --> data related to the version 132

##### Greengenes/ --> 16S data related to the Greengenes database (http://greengenes.secondgenome.com/)
    greengenes_13_5.tar.gz --> data related to the version 13.5

##### DAIRYdb/ --> 16S data related to the DAIRYdb database (16S rRNA gene sequences from dairy products, https://github.com/marcomeola/DAIRYdb)
    DAIRYdb_v1.1.2.tar.gz --> data related to the version 1.1.2
    DAIRYdb_v1.2.4_20200604.tar.gz --> data related to the version v1.2.4_20200604
    DAIRYdb_v2.0_20210401.tar.gz --> data related to the version v2.0_20210401

##### EZBioCloud/ --> 16S
data related to the EZBioCloud database (https://www.ezbiocloud.net/resources/16s_download)
    EZBioCloud_052018.tar.gz --> release 05/2018

##### PR2/ --> 18S data related to the Protist Ribosomal Reference (PR2) database (https://github.com/vaulot/pr2_database/releases)
    pr2_gb203_4.5.tar.gz --> data related to the version v4.5
    pr2_4.11.0 --> data related to the version v4.11.0
    pr2_4.12.0 --> data related to the version v4.12.0
    pr2_4.13.0 --> data related to the version v4.13.0
    PR2_4.14.0 --> data related to the version v4.14.0
    PR2_5.0.1 --> data related to the version v5.0.1

##### Unite/ --> data related to the UNITE ITS database (https://unite.ut.ee/)
    Unite_s_7.1_20112016_ITS.tar.gz --> data related to the UNITE 7.1 database
    Unite_Fungi_8.0_18112018.tar.gz --> data related to the UNITE 8.0 database, restricted to fungal species
    Unite_Euka_8.0_18112018.tar.gz --> data related to the UNITE 8.0 database, all eukaryote species
    Unite_Fungi_8.2_20200204.tar.gz --> data related to the UNITE 8.2 database, restricted to fungal species
    Unite_Euka_8.2_20200204.tar.gz --> data related to the UNITE 8.2 database, all eukaryote species
    Unite_Fungi_8.3_20210510.tar.gz --> data related to the UNITE 8.3 database, restricted to fungal species
    Unite_Euka_8.3_20210510.tar.gz --> data related to the UNITE 8.3 database, all eukaryote species
    Unite_Fungi_9.0_20221016.tar.gz --> data related to the UNITE 9.0 database, restricted to fungal species
    Unite_Euka_9.0_20221016.tar.gz --> data related to the UNITE 9.0 database, all eukaryote species

##### MiDas/ (Microbial Database for Activated Sludge: http://www.midasfieldguide.org/)
    MiDAS_S119_1.20.tar.gz --> data related to the MiDAS S119 1.20 database (based on Silva 119)
    MiDAS_S123_2.1.3.tar.gz --> data related to the MiDAS S123 2.1.3 database (based on Silva 123)
    MiDAS_S132_3.6.tar.gz --> data related to the MiDAS S132 3.6 database (based on Silva 132)
    MiDAS_S138.1_v4.8.1.tar.gz --> data related to the MiDAS S138 4.8.1 database (based on Silva 138)
    MiDAS_v5.0.tar.gz --> data related to the MiDAS 5.0 database

##### rpoB/ (for the rpoB marker, not yet published, see rpoB/README.txt)
    rpoB_122017.tar.gz

##### Diat.barcode/ (rbcL diatom barcodes: https://www6.inrae.fr/carrtel-collection_eng/Barcoding-database), initially named R-Syst::diatom
    RSyst_Diatom_7.tar.gz --> data related to the R-Syst::diatom version 7 (http://138.102.89.206/new_rsyst_alg/)
    Diat.barcode_rbcL_10.1.tar.gz --> data related to the Diat.barcode version 10.1 (https://www6.inrae.fr/carrtel-collection_eng/Barcoding-database/Database-download)

##### PHYMYCO_DB/ (EF1 and 18S fungal DNA markers: http://phymycodb.genouest.org/)
    PHYMYCO-DB_2013.tar.gz --> data related to the curated version of 2013

##### COI/ --> reference databases for the COI amplicon gene
    BOLD_COI-5P/ --> data related to the BOLD database (http://v3.boldsystems.org), with a selection of phyla
        BOLD_COI-5P_022019 --> version downloaded in February 2019 (see readme for the phyla selected)
        BOLD_COI-5P_1percentN_022019 --> version downloaded in February 2019, filtered on a maximum of 1% of N (see readme for the phyla selected)
        BOLD_COI-5P_1percentN_630nt_022019 --> version downloaded in February 2019, filtered on a maximum of 1% of N and a minimum length of 630 nt (see readme for the phyla selected)
        BOLD_COI-5P_marin_052022 --> version downloaded in May 2022 (see readme for the taxa selected)
        BOLD_COI-5P_082023 --> version downloaded in August 2023 (see readme for the phyla selected)
        BOLD_COI-5P_marin_20230913 --> version downloaded in September 2023 (see readme for the taxa selected)
    MIDORI/ --> COI reference database related to the MIDORI database (http://reference-midori.info/index.html)
        MIDORI_LONGEST_SP_COI_GB242 --> version GB242 of longest species sequences
        MIDORI_UNIQUE_COI_20180221 --> version 20180221, unique amplicon sequences
        MIDORI_UNIQUE_COI_MARINE_20180221 --> version 20180221, unique amplicon sequences restricted to marine organisms
        MIDORI_UNIQUE_SP_COI_GB249 --> version GB249 of unique amplicon sequences
        MIDORI_LONGEST_SP_COI_GB249 --> version GB249 of longest species sequences
        MIDORI2_LONGEST_SP_COI_GB253 --> version GB253 of longest species sequences
        MIDORI2_UNIQ_SP_COI_GB253 --> version GB253 of unique amplicon sequences
    COInr/ --> COI reference database from https://github.com/meglecz/mkCOInr
        COInr_2022_05_06 --> version published on Zenodo on May 17, 2022

##### MaarjAM/ --> reference database for 18S or 28S amplicons of arbuscular mycorrhizal fungi (Glomeromycota)
    MaarjAM_18S_05-06-2019 --> MaarjAM version 05-06-2019 of 18S sequences
    MaarjAM_28S_25-05-2019 --> MaarjAM version 25-05-2019 of 28S sequences

##### rbcL/ --> reference database for rbcL amplicons
    KBell_plant_rbcL_2021-07.tar.gz --> version 2021 (see readme.txt)

##### microgreen-db/ --> reference database for 23S amplicons of photosynthetic eukaryotic algae and cyanobacteria, provided with 3 different taxonomies
    microgreen-db_algae_v1.2.tar.gz --> version 1.2 with algae taxonomies (see readme.txt)
    microgreen-db_ncbi_v1.2.tar.gz --> version 1.2 with NCBI taxonomies (see readme.txt)
    microgreen-db_pr2-silva_v1.2.tar.gz --> version 1.2 with PR2/Silva taxonomies (see readme.txt)

##### REFSeq/ --> data related to the REFSeq database (https://www.ncbi.nlm.nih.gov/refseq/targetedloci/)
    NCBIdb_archaea_16S_v1.20230726.tar.gz --> version 1 of Archaea sequences, downloaded on 20230726
    NCBIdb_bacteria_16S_v1.20230726.tar.gz --> version 1 of Bacteria sequences, downloaded on 20230726

##### PSH_Laffon/ --> personal database of ITS2 and rbcL amplicon sequences from plant species commonly found in apple orchards in the Lower Durance Valley
    ITS2_PSH_laffon_v1_20230731.tar.gz --> version 1 of ITS2 sequences, created on 2023 07 31
    rbcL_PSH_laffon_v1_20230731.tar.gz --> version 1 of rbcL sequences, created on 2023 07 31

##### GTDB/ --> long rRNA sequences (16S-ITS-23S) extracted from the GTDB database (https://gtdb.ecogenomic.org/)
    16S-ITS-23S-DB_GTDB_08-RS214.tar.gz --> data related to the GTDB version 214

##### 12S/ --> database related to the 12S gene
    mitochondrial_12S_Tasmania_20240226.tar.gz --> personal NCBI selection of mitochondrial ribosomal sequences from Tasmanian vertebrates
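
Example: choosing a SILVA 16S archive by pintail threshold

The SILVA 16S archives listed in the SILVA/16S section appear to follow the naming pattern silva_<version>_16S[_pintail<threshold>].tar.gz. The Python sketch below is only an illustration of how that pattern could be used to pick an archive for a given version and minimum pintail score. It is not part of any tool shipped with these databases: the function names parse_silva_16s and pick_archive are hypothetical, and the pattern is an assumption inferred from the file names in this README (the 28S and LSU archives, for instance, use an upper-case SILVA_ prefix and are not covered).

```python
import re

# Naming pattern inferred from the SILVA 16S archives listed in this README
# (an assumption, not an official specification):
#   silva_<version>_16S[_pintail<threshold>].tar.gz
SILVA_16S = re.compile(
    r"^silva_(?P<version>[\d.]+)_16S(?:_pintail(?P<pintail>\d+))?\.tar\.gz$"
)


def parse_silva_16s(name):
    """Return (version, pintail_threshold), or None if the name does not match.

    pintail_threshold is None for the unfiltered archive.
    """
    match = SILVA_16S.match(name)
    if match is None:
        return None
    pintail = match.group("pintail")
    return match.group("version"), (int(pintail) if pintail else None)


def pick_archive(names, version, min_pintail=None):
    """Pick the archive of `version` with the least strict pintail filter
    that still guarantees a score of at least `min_pintail`.

    With min_pintail=None the unfiltered archive is returned.
    Returns None when no listed archive qualifies.
    """
    best = None  # (threshold, name) of the best candidate so far
    for name in names:
        parsed = parse_silva_16s(name)
        if parsed is None or parsed[0] != version:
            continue
        threshold = parsed[1]
        if min_pintail is None:
            if threshold is None:
                return name
            continue
        if threshold is not None and threshold >= min_pintail:
            if best is None or threshold < best[0]:
                best = (threshold, name)
    return best[1] if best else None


if __name__ == "__main__":
    listed = [
        "silva_138.1_16S.tar.gz",
        "silva_138.1_16S_pintail50.tar.gz",
        "silva_138.1_16S_pintail80.tar.gz",
        "silva_138.1_16S_pintail100.tar.gz",
        "silva_123_16S.tar.gz",
    ]
    # A minimum score of 60 is not offered as such, so the next stricter
    # filter that is listed gets selected.
    print(pick_archive(listed, "138.1", min_pintail=60))
```

Choosing the smallest listed threshold that is still at or above the requested score keeps as many sequences as possible while respecting the quality requirement. Versions for which no filtered archive is listed (version 123 above) yield None when a threshold is requested.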