LabID Workflow System

This section provides a comprehensive overview of LabID's workflow management system and serves as a navigation guide to all workflow-related documentation.

LabID's workflow system manages computational workflows through Git-based version control and integrates with external platforms such as WorkflowHub and Galaxy. Rather than executing workflows itself, the system acts as a registry and metadata repository: it handles workflow organization, version control, and metadata management, and delegates execution to dedicated workflow engines. Workflows can therefore be linked with datasets, permissions, and other components, so researchers can maintain provenance records that connect data inputs, workflow runs, and outputs.

Detailed user documentation starts with the Workflows and Workflow Runs submenus below.

Key Features

  • Git-Based Version Control: Workflows are tracked with Git, using release-based versioning and commit hashes for integrity
  • Workflow Storage and Organization: Store and version workflow files with semantic file type classification (MAIN, CONFIG, TEST, etc.)
  • Dual Workflow Types: Support for both "manual" workflows (uploaded directly) and "imported" workflows (from external Git repositories)
  • Metadata Management: Track workflow metadata, versions, and relationships between components
  • Standards-Compliant Export: Export workflows as RO-Crates and publish them to WorkflowHub
  • Dataset Provenance: Link workflow runs to input/output datasets and record execution metadata
  • External Integration: Connect with workflow engines like Galaxy for importing workflow metadata and invocations
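
The file classification and the two workflow types listed above could be modelled roughly as follows. This is a minimal Python sketch, not LabID's actual data model: the class names, the fields, and the `OTHER` file type are assumptions made for illustration. Only `MAIN`, `CONFIG`, `TEST`, "manual", and "imported" come from the feature list.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class FileType(Enum):
    """Semantic file types; MAIN, CONFIG and TEST are named in the feature list."""
    MAIN = "MAIN"
    CONFIG = "CONFIG"
    TEST = "TEST"
    OTHER = "OTHER"  # hypothetical catch-all for the "etc." in the feature list


class WorkflowType(Enum):
    MANUAL = "manual"      # files uploaded directly
    IMPORTED = "imported"  # files tracked from an external Git repository


@dataclass
class WorkflowFile:
    path: str
    file_type: FileType


@dataclass
class Workflow:
    name: str
    workflow_type: WorkflowType
    files: list[WorkflowFile] = field(default_factory=list)
    # Hypothetical field: only meaningful for imported workflows.
    source_repository: Optional[str] = None

    def files_of_type(self, file_type: FileType) -> list[WorkflowFile]:
        """Return all files carrying the given semantic type."""
        return [f for f in self.files if f.file_type is file_type]


# Example with invented file names.
workflow = Workflow(
    name="variant-calling",
    workflow_type=WorkflowType.MANUAL,
    files=[
        WorkflowFile("main.nf", FileType.MAIN),
        WorkflowFile("nextflow.config", FileType.CONFIG),
        WorkflowFile("tests/test_small.yml", FileType.TEST),
    ],
)
```

Calling `workflow.files_of_type(FileType.MAIN)` returns the single entry for `main.nf`, which is the kind of lookup a publishing step would need.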

Common Use Cases

Research Workflow Development

Scenario: Developing a new bioinformatics pipeline

  1. Create Workflow: Upload existing workflow files to create a new manual workflow
  2. Iterative Refinement: Upload additional files, test, modify
  3. Version Management: Release stable versions
  4. Documentation: Include README and test files
  5. Publication: Export RO-Crate for sharing
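
For step 5, an RO-Crate is a directory or archive whose `ro-crate-metadata.json` file describes its contents. The sketch below builds a minimal metadata document following the RO-Crate 1.1 layout and the Workflow RO-Crate convention of typing the main workflow as a `ComputationalWorkflow`. The function name, file names, and workflow name are hypothetical, and LabID's real export probably carries more metadata than shown here.

```python
import json


def build_ro_crate_metadata(name: str, main_file: str, other_files: list[str]) -> dict:
    """Build a minimal ro-crate-metadata.json document (RO-Crate 1.1 layout)."""
    all_files = [main_file, *other_files]
    graph = [
        {
            # Metadata descriptor: identifies this file as the crate's metadata.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root dataset: the crate itself, pointing at its main workflow.
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "mainEntity": {"@id": main_file},
            "hasPart": [{"@id": path} for path in all_files],
        },
        {
            # The main workflow file.
            "@id": main_file,
            "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
            "name": name,
        },
    ]
    # Remaining files are listed as plain File entities.
    graph.extend({"@id": path, "@type": "File"} for path in other_files)
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


# Example with invented names; the JSON string is what would be written to disk.
metadata = build_ro_crate_metadata(
    name="variant-calling",
    main_file="main.nf",
    other_files=["README.md", "tests/test_small.yml"],
)
crate_json = json.dumps(metadata, indent=2)
```

The `mainEntity` reference is why a single designated MAIN file matters: the crate needs one unambiguous entry point.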

Collaborative Workflow Sharing

Scenario: Sharing workflows across research teams

  1. Git Integration: Import from team's Git repository
  2. Selective Tracking: Choose relevant files to track
  3. Local Customization: Add local configurations
  4. Sync Updates: Pull updates from upstream
  5. Community Sharing: Publish to WorkflowHub
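
Step 2, selective tracking, amounts to narrowing a repository's file listing down to the files worth registering. The sketch below does this with glob-style patterns from Python's standard `fnmatch` module. The patterns and paths are invented for illustration, and LabID may offer file selection through its interface rather than through patterns.

```python
from fnmatch import fnmatch


def select_tracked_files(repo_files: list[str], patterns: list[str]) -> list[str]:
    """Keep only the paths that match at least one glob-style pattern."""
    return [
        path for path in repo_files
        if any(fnmatch(path, pattern) for pattern in patterns)
    ]


# Hypothetical repository listing.
repo_files = [
    "Snakefile",
    "config/config.yaml",
    "docs/figures/overview.png",
    "tests/test_small.yaml",
    "README.md",
]
tracked = select_tracked_files(repo_files, ["Snakefile", "*.yaml", "README.md"])
```

Note that `fnmatch` lets `*` match across directory separators, so `*.yaml` also selects files in subdirectories.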

Workflow Run Metadata Tracking

Scenario: Recording workflow runs and dataset provenance

  1. Workflow Setup: Ensure workflow is properly versioned
  2. Run Registration: Create WorkflowRun records to document external executions
  3. Dataset Linking: Associate input/output datasets from completed runs
  4. Provenance: Maintain complete execution history and metadata
  5. Reproducibility: Support workflow reproduction through version control and metadata
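
The provenance chain in steps 2 to 4 can be pictured as run records that point to a workflow version and to input and output datasets. The sketch below is illustrative only: `WorkflowRun` is named in the steps above, but the fields shown here, the dataset identifiers, and the `trace_lineage` helper are assumptions, not LabID's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkflowRun:
    run_id: str
    workflow_name: str
    workflow_version: str  # e.g. a release tag or commit hash
    input_datasets: tuple[str, ...]
    output_datasets: tuple[str, ...]


def producing_run(dataset: str, runs: list[WorkflowRun]) -> Optional[WorkflowRun]:
    """Return the run that produced a dataset, or None if no run did."""
    for run in runs:
        if dataset in run.output_datasets:
            return run
    return None


def trace_lineage(dataset: str, runs: list[WorkflowRun]) -> list[str]:
    """Walk backwards from a dataset and list the IDs of the runs behind it."""
    lineage: list[str] = []
    pending = [dataset]
    seen: set[str] = set()
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        run = producing_run(current, runs)
        if run is not None:
            if run.run_id not in lineage:
                lineage.append(run.run_id)
            # Follow the inputs of this run further upstream.
            pending.extend(run.input_datasets)
    return lineage


# Two hypothetical runs chained through an intermediate dataset.
runs = [
    WorkflowRun("run-1", "alignment", "v1.0.0", ("raw-reads",), ("aligned-reads",)),
    WorkflowRun("run-2", "variant-calling", "v2.1.0", ("aligned-reads",), ("variants",)),
]
```

Here `trace_lineage("variants", runs)` walks from the final dataset back through both runs, while a raw input such as `"raw-reads"` has no producing run.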

Execution Delegation

LabID does not execute workflows directly. It serves as a registry for workflow metadata and run records. Actual workflow execution happens in external systems (Galaxy, local environments, HPC clusters, etc.), and the results are registered in LabID for provenance tracking.
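
This division of labour can be sketched as two separate steps: an external system runs the workflow, and only the outcome is recorded afterwards. In the sketch below, a trivial Python subprocess stands in for the external engine, and `register_run` with its in-memory list is a hypothetical stand-in for creating a run record. Neither reflects LabID's actual API.

```python
import subprocess
import sys
from datetime import datetime, timezone

# Hypothetical in-memory stand-in for a run registry.
run_registry: list[dict] = []


def register_run(workflow: str, version: str, command: list[str],
                 exit_code: int, started: datetime, finished: datetime) -> dict:
    """Record metadata about an execution that already happened elsewhere."""
    record = {
        "workflow": workflow,
        "version": version,
        "command": command,
        "status": "succeeded" if exit_code == 0 else "failed",
        "started": started.isoformat(),
        "finished": finished.isoformat(),
    }
    run_registry.append(record)
    return record


# Step 1: execution happens outside the registry (placeholder command).
command = [sys.executable, "-c", "print('workflow finished')"]
started = datetime.now(timezone.utc)
result = subprocess.run(command, capture_output=True, text=True)
finished = datetime.now(timezone.utc)

# Step 2: only the outcome is registered.
record = register_run("variant-calling", "v2.1.0", command,
                      result.returncode, started, finished)
```

Keeping the two steps apart means the registry never needs access to the compute environment; it only needs the facts about what ran.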

Troubleshooting Quick Reference

Common Issues

| Issue                 | Likely Cause                          | Solution                                  |
|-----------------------|---------------------------------------|-------------------------------------------|
| Cannot commit version | No file changes                       | Upload or modify files first              |
| Publishing fails      | Missing MAIN file                     | Designate exactly one MAIN file           |
| Git import fails      | SSH key issues                        | Check SSH configuration                   |
| Large repository      | Too many files                        | Use .gitignore or selective import        |
| Slow operations       | Large repositories or database issues | Check repository size and database health |
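
The "Publishing fails" entry reflects a rule that can be checked before attempting to publish: a workflow needs exactly one file designated as MAIN. A minimal pre-publish check might look like the following. The function name and the mapping of paths to type labels are assumptions for illustration, not LabID's validation code.

```python
def check_main_file(file_types: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the check passed."""
    main_files = sorted(p for p, t in file_types.items() if t == "MAIN")
    if not main_files:
        return ["No MAIN file designated; mark one workflow file as MAIN."]
    if len(main_files) > 1:
        return ["Multiple MAIN files designated: " + ", ".join(main_files)]
    return []


# Hypothetical file listings covering the three possible outcomes.
ok = check_main_file({"main.nf": "MAIN", "nextflow.config": "CONFIG"})
missing = check_main_file({"nextflow.config": "CONFIG"})
duplicate = check_main_file({"a.nf": "MAIN", "b.nf": "MAIN"})
```

Returning messages instead of raising lets a caller show every problem at once before the user retries publishing.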

🔧 Technical Guide

The Technical reference is aimed at developers and system architects.

🛠️ Administration Guide

Administrators will find full details on system setup, configuration, and maintenance in the Complete administrative guide, including:

  • System requirements and dependencies
  • Git repository management and SSH configuration
  • WorkflowHub integration setup
  • System monitoring and maintenance
  • Security configuration and troubleshooting