Reads and pairwise distances from 10 samples of diatoms in Geneva lake

Document record

Date

21 February 2023




Cite this document

Alain Franc et al., "Reads and pairwise distances from 10 samples of diatoms in Geneva lake", Recherche Data Gouv, ID: 10.57745/NKTRHO



Abstract

This dataset contains 55 HDF5 files related to 10 samples of benthic diatoms collected in Lake Geneva at monthly intervals at the same location (close to UMR Carrtel, on the shore of the lake). For each sample, DNA was extracted, a marker was amplified (a 312 bp fragment of rbcL), and sequenced. All pairwise distances between reads were then computed (from Smith-Waterman local alignment scores), both within and between samples. This yields 55 HDF5 files, each containing the following h5 datasets: (i) sequence identifiers (seqid): one h5 dataset for a within-sample file, two for a between-samples file; (ii) sequences (word): one h5 dataset for a within-sample file, two for a between-samples file; (iii) pairwise distances between sequences (h5 dataset distances). Pairwise distances were computed through DARI project i2015037360 (8 million hours, 2016, given to AF) at IDRIS on the Turing and Ada machines. As there are 10 samples, there are 10 files of within-sample distances and 45 files (n(n-1)/2 with n = 10) of between-sample distances. The 55 HDF5 files are labeled L1 to L10 for within-sample distances and Lx_Ly for between-sample distances, with x < y. (Note that the files are ordered according to the lexicographic order of their names.)
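The naming scheme and file count described above can be checked with a short Python sketch. It only generates the labels (L1 to L10, and Lx_Ly with x < y) and sorts them lexicographically. The function name `dataset_labels` is hypothetical, and the file extension is left out because the record does not state it.

```python
from itertools import combinations


def dataset_labels(n_samples=10):
    """Return (within, between) label lists for n_samples samples.

    Hypothetical helper: it reproduces only the labels given in the
    abstract, not the actual file names or extensions.
    """
    samples = [f"L{i}" for i in range(1, n_samples + 1)]
    # One label per sample for within-sample distances.
    within = list(samples)
    # One label per unordered pair (x < y) for between-sample distances.
    between = [f"{a}_{b}" for a, b in combinations(samples, 2)]
    return within, between


within, between = dataset_labels()
# Lexicographic order, as noted in the abstract.
labels = sorted(within + between)

print(len(within), len(between), len(labels))  # 10 45 55
print(labels[:4])
```

Lexicographic ordering is not numeric ordering: L10 sorts directly after L1 and before L1_L10, which matters when iterating over the files in listing order.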
