LabID Workflow System
This section provides a comprehensive overview of LabID's workflow management system and serves as a navigation guide to all workflow-related documentation.
LabID's workflow system manages computational workflows through Git-based version control and integrates with external platforms such as WorkflowHub and Galaxy. Rather than executing workflows itself, the system acts as a registry and metadata repository: it handles workflow organization, version control, and metadata management, and delegates execution to dedicated workflow engines. Because workflows can be linked to datasets, permissions, and other LabID components, researchers can maintain provenance records that connect data inputs, workflow runs, and outputs.
Detailed user documentation starts with the Workflows and Workflow Runs submenus below.
Key Features
- Git-Based Version Control: Workflows are tracked in Git, with release-based versioning in which each release is tied to a commit hash for integrity
- Workflow Storage and Organization: Store and version workflow files with semantic file type classification (MAIN, CONFIG, TEST, etc.)
- Dual Workflow Types: Support for both "manual" workflows (uploaded directly) and "imported" workflows (from external Git repositories)
- Metadata Management: Track workflow metadata, versions, and relationships between components
- Standards-Compliant Export: Export workflows as RO-Crates and publish them to WorkflowHub
- Dataset Provenance: Link workflow runs to input/output datasets and record execution metadata
- External Integration: Connect with workflow engines like Galaxy for importing workflow metadata and invocations
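The features above imply a small data model: files with a semantic type, workflows that are either manual or imported, and versions tied to a commit hash. The following is a minimal Python sketch of such a model, not LabID's actual schema. All class and field names (`FileType`, `WorkflowKind`, `WorkflowFile`, `WorkflowVersion`, `Workflow`, `source_url`) and the example values are assumptions made for illustration, and `OTHER` is a hypothetical stand-in for the file types this page abbreviates as "etc.".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class FileType(Enum):
    """Semantic file types; only MAIN, CONFIG, and TEST are named in this page."""
    MAIN = "MAIN"
    CONFIG = "CONFIG"
    TEST = "TEST"
    OTHER = "OTHER"  # hypothetical placeholder for the remaining types


class WorkflowKind(Enum):
    MANUAL = "manual"      # files uploaded directly
    IMPORTED = "imported"  # files tracked from an external Git repository


@dataclass(frozen=True)
class WorkflowFile:
    path: str
    file_type: FileType


@dataclass(frozen=True)
class WorkflowVersion:
    name: str
    commit_hash: str  # ties the release to one exact Git commit
    files: Tuple[WorkflowFile, ...] = ()

    def main_files(self) -> List[WorkflowFile]:
        """Return the files classified as MAIN in this version."""
        return [f for f in self.files if f.file_type is FileType.MAIN]


@dataclass
class Workflow:
    title: str
    kind: WorkflowKind
    source_url: Optional[str] = None  # assumed to be set only for imported workflows
    versions: List[WorkflowVersion] = field(default_factory=list)


# Example values are invented for illustration.
v1 = WorkflowVersion(
    name="1.0.0",
    commit_hash="3f2a9c1",
    files=(
        WorkflowFile("main.ga", FileType.MAIN),
        WorkflowFile("params.yml", FileType.CONFIG),
    ),
)
wf = Workflow("rna-seq", WorkflowKind.IMPORTED, "git@example.org:team/rna-seq.git", [v1])
```

Keeping versions immutable (`frozen=True`) mirrors the idea that a released version should not change once it is tied to a commit.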
Common Use Cases
Research Workflow Development
Scenario: Developing a new bioinformatics pipeline
- Create Workflow: Upload existing workflow files to create a new manual workflow
- Iterative Refinement: Upload additional files, test, modify
- Version Management: Release stable versions
- Documentation: Include README and test files
- Publication: Export RO-Crate for sharing
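For readers unfamiliar with the export format in the last step, an RO-Crate is a directory that carries an `ro-crate-metadata.json` file describing its contents. The sketch below builds a minimal metadata document by hand, following the general shape of RO-Crate 1.1 and the Workflow RO-Crate profile. It is not LabID's exporter: the function name `build_ro_crate_metadata` and the example values are assumptions, and the result is deliberately incomplete (a publishable crate would also carry fields such as a description, publication date, and license).

```python
import json


def build_ro_crate_metadata(name: str, main_file: str, language: str) -> dict:
    """Return a minimal RO-Crate 1.1 metadata document for a single workflow."""
    language_id = "#" + language.lower()
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Metadata descriptor: says which entity is the crate root.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: points at the main workflow file.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "mainEntity": {"@id": main_file},
                "hasPart": [{"@id": main_file}],
            },
            {
                # The main workflow file itself.
                "@id": main_file,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": name,
                "programmingLanguage": {"@id": language_id},
            },
            {
                "@id": language_id,
                "@type": "ComputerLanguage",
                "name": language,
            },
        ],
    }


# Example values are invented for illustration.
crate = build_ro_crate_metadata("rna-seq", "main.ga", "Galaxy")
print(json.dumps(crate, indent=2))
```

The `mainEntity` reference is the reason a workflow needs one clearly designated main file before it can be exported.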
Collaborative Workflow Sharing
Scenario: Sharing workflows across research teams
- Git Integration: Import from team's Git repository
- Selective Tracking: Choose relevant files to track
- Local Customization: Add local configurations
- Sync Updates: Pull updates from upstream
- Community Sharing: Publish to WorkflowHub
Workflow Run Metadata Tracking
Scenario: Recording workflow runs and dataset provenance
- Workflow Setup: Ensure workflow is properly versioned
- Run Registration: Create WorkflowRun records to document external executions
- Dataset Linking: Associate input/output datasets from completed runs
- Provenance: Maintain complete execution history and metadata
- Reproducibility: Support workflow reproduction through version control and metadata
Execution Delegation
LabID does not execute workflows directly. It serves as a registry for workflow metadata and run records. Actual workflow execution happens in external systems (Galaxy, local environments, HPC clusters, etc.), and the results are registered in LabID for provenance tracking.
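To make the registry role concrete, the sketch below models a run record that is created after an external execution has finished. It stores what ran, where, and which datasets were involved, but contains no execution logic. This is an illustration under assumptions, not LabID's `WorkflowRun` model or API: the field names, the `provenance` helper, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass
class WorkflowRun:
    """Record of a run that already happened in an external system."""
    workflow_title: str
    workflow_commit: str   # commit hash of the workflow version that was run
    engine: str            # where it ran, e.g. "Galaxy" or an HPC cluster
    finished_at: datetime
    input_datasets: List[str] = field(default_factory=list)
    output_datasets: List[str] = field(default_factory=list)

    def provenance(self) -> List[Tuple[str, str, str]]:
        """Return (input, workflow@commit, output) triples for this run."""
        step = f"{self.workflow_title}@{self.workflow_commit}"
        return [
            (inp, step, out)
            for inp in self.input_datasets
            for out in self.output_datasets
        ]


# Example values are invented for illustration.
run = WorkflowRun(
    workflow_title="rna-seq",
    workflow_commit="3f2a9c1",
    engine="Galaxy",
    finished_at=datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc),
    input_datasets=["reads.fastq"],
    output_datasets=["counts.tsv"],
)
print(run.provenance())
# [('reads.fastq', 'rna-seq@3f2a9c1', 'counts.tsv')]
```

Recording the commit hash alongside the datasets is what lets a later reader identify exactly which workflow version produced a given output.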
Troubleshooting Quick Reference
Common Issues
| Issue | Likely Cause | Solution |
|---|---|---|
| Cannot commit version | No file changes | Upload or modify files first |
| Publishing fails | Missing MAIN file | Designate exactly one MAIN file |
| Git import fails | SSH key issues | Check SSH configuration |
| Large repository | Too many files | Use .gitignore or selective import |
| Slow operations | Large repositories or database issues | Check repository size and database health |
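The first two rows of the table describe conditions that can be checked before committing or publishing. The sketch below shows one way such a check could look on the client side. It is an illustration only: the function `preflight`, its arguments, and the message wording are assumptions, not part of LabID.

```python
from typing import Dict, List


def preflight(files: Dict[str, str], changed_paths: List[str]) -> List[str]:
    """Return a list of known blockers; an empty list means none were found.

    files maps each file path to its file type (e.g. "MAIN", "CONFIG").
    changed_paths lists files uploaded or modified since the last version.
    """
    problems = []
    if not changed_paths:
        problems.append("Cannot commit version: upload or modify files first")
    main_count = sum(1 for file_type in files.values() if file_type == "MAIN")
    if main_count != 1:
        problems.append(
            f"Publishing fails: found {main_count} MAIN files, expected exactly one"
        )
    return problems


# Example values are invented for illustration.
for problem in preflight({"main.ga": "MAIN", "alt.ga": "MAIN"}, changed_paths=[]):
    print(problem)
```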
🔧 Technical Guide
The technical reference is where developers and system architects can learn about:
- System Architecture - Technical implementation details
- File Types and Organization - Complete file type system reference
- Versioning Concept - Version management and Git integration
- Publishing and RO-Crate Export - Standards compliance and export
- External Integrations - Galaxy and other platform integrations
- Workflow Run API - Execution tracking API
🛠️ Administration Guide
Administrators will find details on system setup, configuration, and maintenance in the complete administrative guide, which covers:
- System requirements and dependencies
- Git repository management and SSH configuration
- WorkflowHub integration setup
- System monitoring and maintenance
- Security configuration and troubleshooting