GeneCore, a trusted data producer at EMBL

EMBL hosts several data producers, including Core Facilities such as GeneCore (Genomics Core Facility). GeneCore is the primary source of genomic sequencing data at EMBL. It operates as a service, providing cutting-edge sequencing data to EMBL scientists. As such, it has collaborated with LabID from early on, as part of our trusted data provider concept.

GeneCore and LabID collaboration

GeneCore provides LabID with a trusted source of data. In return, LabID provides the advanced interface needed by the end users to register, annotate and manage this data (e.g. browse, archive, delete).

GeneCore is considered a trusted source, not only because it is a team of highly skilled technicians and scientists, but also because it delivers the data together with extensive metadata about the data and the scientific processes performed (instrument, run type, etc.). We have developed a dedicated import pipeline for it, which bypasses the safeguards that LabID applies when data from other sources is imported manually.

Data from trusted sources is managed and comes with additional protection

Data (assays and associated datasets) registered from trusted sources is flagged as managed. As such, it can only be deleted by the LabID administrator; even the data owner does not have the privilege to delete it. It is still possible to remove the data files from disk if (1) the data is flagged as failed by QC or (2) a backup copy exists (e.g. on tape). This ensures that no assay received from a trusted source ever disappears. In contrast, non-managed data that was loaded manually can be deleted by its owner.
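The rules above can be summarised in a short sketch. This is only an illustration of the policy as described on this page, not LabID's actual implementation: the `Assay` class, its field names and both functions are hypothetical, and allowing the administrator to delete non-managed data is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Assay:
    """Hypothetical stand-in for an assay record; not LabID's data model."""
    owner: str
    managed: bool = False     # True when registered from a trusted source
    qc_failed: bool = False   # flagged as failed by quality control
    has_backup: bool = False  # a backup copy exists, e.g. on tape


def can_delete_record(assay: Assay, user: str, is_admin: bool = False) -> bool:
    """Decide whether `user` may delete the assay record itself."""
    if assay.managed:
        # Managed data: only the administrator, not even the owner.
        return is_admin
    # Non-managed data: the owner may delete it.
    # Letting the administrator do so as well is an assumption.
    return is_admin or user == assay.owner


def managed_files_removable(assay: Assay) -> bool:
    """Decide whether the files of a managed assay may be removed from disk."""
    return assay.qc_failed or assay.has_backup
```

For example, with these assumptions the owner of a managed assay cannot delete the record, but its files can still be removed from disk once a backup copy exists.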

Once registered, managed data is not only available from the web interface; it is also placed in the group repository, which means it remains directly accessible from the file system.

Interfacing more trusted data providers

LabID is actively involved in discussions with other data providers at EMBL. We aim to integrate more of them in future releases and to develop dedicated pipelines for seamless registration and advanced management. For now, we also provide manual import of several raw data types.

Video: Registering NGS data from GeneCore ~18min

This video explains how to register your Illumina Sequencing data.

The user interface shown in this video is outdated, but the concepts remain valid.

Many thanks to GeneCore, in particular to Jan Provaznik and Jonathan Landry for the collaboration on the automated data transfer procedure.