GeneCore, a trusted data producer at EMBL
EMBL hosts several data producers, including Core Facilities such as GeneCore (Genomics Core Facility). GeneCore is the primary source of genomic sequencing data at EMBL. It operates as a service, providing cutting-edge sequencing data to EMBL scientists. As such, it began collaborating with LabID early on, as part of our trusted data provider concept.
GeneCore and LabID collaboration
GeneCore provides LabID with a trusted source of data. In return, LabID provides the advanced interface needed by the end users to register, annotate and manage this data (e.g. browse, archive, delete).
GeneCore is considered a trusted source not only because it is a team of highly skilled technicians and scientists, but also because it delivers the data together with extensive metadata about the data and the scientific processes performed (instrument, run type, etc.). We have developed a dedicated import pipeline for this data, which bypasses the safeguards that LabID applies when data from other sources is imported manually.
Data from trusted sources is managed and comes with additional protection
Data (assay and associated datasets) registered from trusted sources is flagged as managed. As such, the data can only be deleted by the LabID administrator; even the data owner does not have the privilege to delete it. It is still possible to remove the data files from disk if (1) they are QC-flagged as failed or (2) a backup copy exists (e.g. on tape). This ensures that no assay received from a trusted source ever disappears. In contrast, non-managed data that was loaded manually can be deleted by its owner.
Once registered, managed data is not only available from the web interface but is also placed in the group repository, which means it remains directly accessible from the file system.
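The deletion rules described above can be summarised in a short Python sketch. This is only an illustration of the policy as stated here, not LabID's actual implementation or API: the `Assay` class, its fields, and both function names are hypothetical, and the handling of administrators for non-managed data is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Assay:
    """Hypothetical stand-in for a registered assay (not LabID's data model)."""
    owner: str
    managed: bool = False      # True when registered from a trusted source
    qc_failed: bool = False    # the data was QC-flagged as failed
    has_backup: bool = False   # a backup copy exists, e.g. on tape


def can_delete_record(assay: Assay, user: str, is_admin: bool = False) -> bool:
    """Return True if `user` may delete the assay record itself."""
    if assay.managed:
        # Managed data: only the administrator, not even the owner.
        return is_admin
    # Non-managed data: the owner may delete it.
    # Assumption: an administrator may as well.
    return is_admin or user == assay.owner


def can_remove_managed_files(assay: Assay) -> bool:
    """Return True if the files of a managed assay may be removed from disk."""
    return assay.qc_failed or assay.has_backup


if __name__ == "__main__":
    run = Assay(owner="alice", managed=True)
    print(can_delete_record(run, "alice"))   # owner of managed data
    print(can_remove_managed_files(run))     # no QC failure, no backup
```

In this sketch the record and its files are checked separately, mirroring the text: the record of a managed assay is protected, while its files can leave the disk under the two stated conditions.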
Interfacing more trusted data providers
LabID is in active discussion with other data providers at EMBL. We would like to integrate more of them in future releases and to develop adapted pipelines for seamless registration and advanced management. For now, we also provide manual import of several raw data types.
Video: Registering NGS data from GeneCore ~18min
This video explains how to register your Illumina sequencing data.
The user interface shown in this video is outdated, but the concepts remain the same.
Many thanks to GeneCore, in particular to Jan Provaznik and Jonathan Landry for the collaboration on the automated data transfer procedure.