Capturing Workflow Executions in LabID
LabID captures the execution of computational workflows so that you can track the provenance of generated datasets throughout an analysis. It does so through Workflow Runs: each run represents the execution of a specific version of a workflow, together with its input and output datasets, execution parameters, configuration details, reports, and logs.
Working with Workflow Runs
Workflow Runs represent the execution or invocation of a specific workflow version, serving as the link between a WorkflowVersion and its associated input and output datasets in LabID. Each Workflow Run captures the metadata and provenance information for a particular execution instance.
Workflow versions associated with runs
LabID allows you to associate a WorkflowVersion that has not yet been released with a Workflow Run. We nevertheless recommend first finalizing the workflow version (for example, checking that all relevant files have been added) and releasing it before associating it with a run.
A released workflow version cannot be changed, except for the types assigned to its files, so releasing first guarantees that the run references the correct, fixed set of files.
Associating non-released versions with workflow runs can be convenient for testing functionality, but it is not recommended for proper tracking of data provenance.
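The recommendation above can be illustrated with a minimal sketch. This is not LabID's API: the `WorkflowVersion` and `WorkflowRun` classes, their fields, and the `create_run` helper are hypothetical stand-ins that only model the behavior described here (a released version is frozen, and an unreleased version can still be attached to a run but is flagged). The exception for changing file types on released versions is not modeled.

```python
import warnings
from dataclasses import dataclass, field


@dataclass
class WorkflowVersion:
    """Hypothetical stand-in for a LabID WorkflowVersion (not the real model)."""
    name: str
    files: list[str] = field(default_factory=list)
    released: bool = False

    def add_file(self, path: str) -> None:
        # A released version is frozen, so its set of files cannot change.
        if self.released:
            raise ValueError(f"{self.name} is released; files cannot be added")
        self.files.append(path)

    def release(self) -> None:
        self.released = True


@dataclass
class WorkflowRun:
    """Hypothetical stand-in for a LabID Workflow Run (not the real model)."""
    name: str
    version: WorkflowVersion


def create_run(name: str, version: WorkflowVersion) -> WorkflowRun:
    # Unreleased versions are accepted but flagged, mirroring the
    # recommendation to release a version before linking it to a run.
    if not version.released:
        warnings.warn(
            f"{version.name} is not released; its files may still change",
            UserWarning,
            stacklevel=2,
        )
    return WorkflowRun(name=name, version=version)


# Recommended order: finalize the version, release it, then create the run.
version = WorkflowVersion(name="rnaseq-1.0")
version.add_file("main.nf")
version.release()
run = create_run("run-001", version)
```

In this sketch, calling `version.add_file(...)` after `release()` raises a `ValueError`, which is the property that makes a released version a reliable provenance reference.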
Create a Workflow Run from scratch
The workflow run list page offers a "New" button to create a Workflow Run using a standard web form capturing the mandatory metadata.
Linking datasets to a workflow run
Once a WorkflowRun has been created, open its edit view. This page contains tabs organized by data type:
- Configs tab: Shows both WorkflowFiles marked as CONFIG type (from the WorkflowVersion) and WorkflowDatasets marked as CONFIG type (associated with the WorkflowRun)
- Input, Output, Logs & Reports tabs: Show WorkflowDatasets with the corresponding data_type (INPUT, OUTPUT, LOGS, or REPORTS)
Each WorkflowDataset has a data_type that categorizes it within the workflow execution context. Additional WorkflowDatasets can be associated or disassociated in the relevant tabs.
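The way items are distributed across tabs can be sketched as a simple grouping by data type. The classes below are illustrative only and do not reflect LabID's actual models: the `file_type` field on `WorkflowFile`, the `"MAIN"` file type, and the `build_tabs` helper are assumptions made for this example. Only the `data_type` values (CONFIG, INPUT, OUTPUT, LOGS, REPORTS) come from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    CONFIG = "CONFIG"
    INPUT = "INPUT"
    OUTPUT = "OUTPUT"
    LOGS = "LOGS"
    REPORTS = "REPORTS"


@dataclass(frozen=True)
class WorkflowFile:
    """A file belonging to the WorkflowVersion (hypothetical shape)."""
    name: str
    file_type: str  # assumed field; only the value "CONFIG" matters here


@dataclass(frozen=True)
class WorkflowDataset:
    """A dataset associated with the WorkflowRun (hypothetical shape)."""
    name: str
    data_type: DataType


def build_tabs(version_files, run_datasets):
    """Group item names by the data type whose tab would display them."""
    tabs = defaultdict(list)
    # The configs view also lists CONFIG files from the workflow version.
    for workflow_file in version_files:
        if workflow_file.file_type == "CONFIG":
            tabs[DataType.CONFIG].append(workflow_file.name)
    # Every dataset of the run is shown under its own data_type.
    for dataset in run_datasets:
        tabs[dataset.data_type].append(dataset.name)
    return dict(tabs)


files = [
    WorkflowFile("nextflow.config", "CONFIG"),
    WorkflowFile("main.nf", "MAIN"),
]
datasets = [
    WorkflowDataset("params.yaml", DataType.CONFIG),
    WorkflowDataset("sample.fastq.gz", DataType.INPUT),
    WorkflowDataset("counts.tsv", DataType.OUTPUT),
]
tabs = build_tabs(files, datasets)
```

Here `tabs[DataType.CONFIG]` contains both the version-level config file and the run-level config dataset, while non-config workflow files such as `main.nf` appear in no tab.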
For technical details on the API for registering workflow runs, please refer to the developer documentation.