The metabarcoding versions of the Diat.barcode v9 database were curated for a specific use, following a procedure based on our own expertise. The original database, Diat.barcode v9, remains the reference, and you can curate it differently to meet your own requirements. We decline any responsibility in the event of misuse of this database.

Diat.barcode v9 was adapted for metabarcoding in the following way:
1/ The 263 bp rbcL barcode (length after subtraction of the Diat_rbcL_108F and R3 primers) was extracted from the rbcL alignment of the initial Diat.barcode v9 database;
2/ Non-diatom taxa that can be amplified with these primers were included;
3/ Sequences were realigned using the MUSCLE algorithm;
4/ Sequences that contain ambiguous bases and homopolymers (>8) were deleted;
5/ Potentially conflicting species names were harmonized (e.g. "aff." and "cf." removed, "Nanofrustulum_sp._SZCZCH285" changed to "Nanofrustulum_sp.");
6/ Sequences that are identical over the 263 bp rbcL barcode region were dereplicated (duplicate sequences were removed);
7/ The taxonomy of conflicting cases was harmonized with the help of expert knowledge, and a unique ID was given to each unique sequence;
8/ Ready-to-use metabarcoding versions were prepared for DADA2 and Mothur.

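For readers who want to curate the original database differently, steps 4 to 6 can be approximated with a short Python sketch. This is not the script used to build the released files; it is an illustration under stated assumptions: sequences are ungapped strings, any character other than A, C, G or T counts as ambiguous, "(>8)" is read as a run of 9 or more identical bases, either condition alone is enough to drop a sequence, and names use "_" as separator. The function names keep_sequence, harmonize_name and curate are hypothetical.

```python
import re

# Assumption: "homopolymers (>8)" means 9 or more identical bases in a row.
HOMOPOLYMER = re.compile(r"A{9,}|C{9,}|G{9,}|T{9,}")
# Assumption: anything outside A/C/G/T is an ambiguous base.
UNAMBIGUOUS = re.compile(r"[ACGT]+")


def keep_sequence(seq):
    """Step 4: True if seq has no ambiguous base and no long homopolymer."""
    seq = seq.upper()
    return bool(UNAMBIGUOUS.fullmatch(seq)) and not HOMOPOLYMER.search(seq)


def harmonize_name(name):
    """Step 5: drop "aff."/"cf." qualifiers and strain codes after "sp."."""
    name = re.sub(r"(^|_)(aff|cf)\._?", r"\1", name)
    name = re.sub(r"(_sp\.).*$", r"\1", name)
    return name.strip("_")


def curate(records):
    """Steps 4 to 6 on (name, sequence) pairs.

    Returns a dict mapping each retained unique sequence to the set of
    harmonized names attached to it.
    """
    unique = {}
    for name, seq in records:
        seq = seq.upper()
        if not keep_sequence(seq):
            continue
        unique.setdefault(seq, set()).add(harmonize_name(name))
    return unique
```

In this sketch, a sequence whose set holds more than one name corresponds to a conflicting case of the kind resolved with expert knowledge in step 7; that resolution is manual and is not reproduced here.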
Output files:
1/ for DADA2:
- diat_barcode_v9_tax_assign_dada2.gz (compressed fasta file containing the DNA sequences and the corresponding taxonomy, for standard taxonomic assignment)
- diat_barcode_v9_sp_assign_dada2.gz (compressed fasta file containing the DNA sequences and the species assignment, for exact species assignment)
2/ for Mothur (common sequence ID in both files):
- diat_barcode_v9_263bp_mothur.fasta (fasta file containing the DNA sequences)
- diat_barcode_v9_263bp_mothur_tax.txt (text file containing the corresponding taxonomy)
3/ correspondence file:
- v9_correspondence.csv (csv file giving the correspondence between the original Diat.barcode v9 and the version adapted for metabarcoding)
• ID_seq_original: sequence ID in the original Diat.barcode v9;
• ID_seq_barcoding: unique ID for dereplicated sequences used in the versions adapted for metabarcoding (see point 7 above);
• sequence: the 263 bp barcode (see point 1 above);
• species_v9: species name as in the original Diat.barcode v9, column "Species";
• species_v9_spellcheck: species name after spellcheck, harmonization of conflicting names (see point 5 above) and replacement of spaces with "_";
• species_compromise: species name corrected with the help of expert knowledge after taxonomic harmonization of identical sequences (see point 7 above);
• omni_code: Omnidia code corresponding to the species_compromise;
• cf_v2: biovolume correction factor (modified after Vasselon et al. 2018) corresponding to the species_compromise
4/ readme:
- readme_diat_bc_v9_dada2_mothur.txt (text file containing the description of the procedure used to adapt Diat.barcode v9 for metabarcoding and the output files)
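
The correspondence file can be used to trace each dereplicated sequence back to the original Diat.barcode v9 entries it replaces. The Python sketch below shows one way to do this with the standard csv module. The column names ID_seq_original and ID_seq_barcoding are taken from the description above; the field delimiter is an assumption (comma by default, pass ";" if the file uses semicolons), and the function name group_originals is hypothetical.

```python
import csv
from collections import defaultdict


def group_originals(lines, delimiter=","):
    """Map each ID_seq_barcoding to the list of ID_seq_original it covers.

    `lines` is any iterable of text lines, for example an open file handle.
    The delimiter is an assumption; check the actual file before use.
    """
    reader = csv.DictReader(lines, delimiter=delimiter)
    groups = defaultdict(list)
    for row in reader:
        groups[row["ID_seq_barcoding"]].append(row["ID_seq_original"])
    return dict(groups)


# Example use on the distributed file (not executed here):
#   with open("v9_correspondence.csv", newline="") as handle:
#       groups = group_originals(handle)
#   merged = {k: v for k, v in groups.items() if len(v) > 1}
```

Entries of the resulting dictionary that list more than one original ID are the sequences merged during dereplication (point 6 above).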