Hands-on: Python Client - Exclusive Features
- 15 min
- Expert
Overview
In this hands-on we will explore the features that are currently exclusive to the CLI.
- Export of a Study - as a basis for an ArrayExpress data submission
- Linking files on disk
Walkthrough
Prerequisites
The labid-cli client package and jq have to be installed on a Unix machine (e.g. Linux, macOS). See the 101 Step2.
In the 101 Getting Started with CLI we set up the client to query the training instance. Here we need to reconfigure the client to talk to our production instance at https://labid-demo.embl.de
Step 1. Configure another instance
- Execute the following
stocks config --url https://labid-demo.embl.de --username <username> # replace <username> by your username
stocks config --show
Switch instance
If you need to switch to another instance, you can run e.g. stocks config --url <other_instance_url>.
Step 2. ArrayExpress-like export
ArrayExpress-like export
The export is still a few updates away from being a ready-to-use submission tool. However, it has proven to be a good starting point to check that users have properly annotated their samples and datasets. With a few adjustments the export XLS can be used to submit the (meta)data to external portals.
For this exercise we will use an already published study that has been annotated on our production instance.
- Using the UI, locate the Study "ATAC-seq of in vitro reprogrammed mouse pericytes to oligodendrocytes" (3fe9b49a-93df-43a2-b81e-1ee2b76195e5).
- This Study is published and has been marked public in LabID, so that all of us can view it once logged in.
- Take note of the associated datasets and the linked samples and assays. Also note how thoroughly the samples have been annotated with information about the material type, the sequencing prep and the succession of applied protocols.
- Also notice that the Study is annotated with an ArrayExpress ID. Searching for this ID will lead you to the published Study at EBI under E-MTAB-8452.
API Endpoint
The client sends this Study ID to the API endpoint /api/v2/data_management/datasets/arrayexpress_export. The result is saved as a CSV or XLSX file, depending on the format we choose.
- Execute the following to generate the export (here written to output.csv)
labid export study --study 3fe9b49a-93df-43a2-b81e-1ee2b76195e5 --format tabular --odir . --filename output.csv
Inspecting the resulting file you will see the aggregated data of this Study.
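For a quick overview of the exported file without opening a spreadsheet, a short Python script can list its columns and count its rows. This is a minimal sketch, not part of the client: the helper name `summarize_export` is hypothetical, and the comma delimiter is an assumption (pass `delimiter="\t"` if the tabular export turns out to be tab-separated).

```python
import csv
from pathlib import Path


def summarize_export(path, delimiter=","):
    """Return (column_names, row_count) for a delimited text export."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])
        # Count only non-empty data rows that follow the header line.
        row_count = sum(1 for row in reader if any(cell.strip() for cell in row))
    return header, row_count


if __name__ == "__main__":
    export_file = Path("output.csv")  # file name used in the export command
    if export_file.exists():
        columns, rows = summarize_export(export_file)
        print(f"{rows} rows, {len(columns)} columns")
        for name in columns:
            print(" -", name)
    else:
        print(f"{export_file} not found; run the export first")
```

Comparing the printed column names with the annotations seen in the UI is a simple way to spot samples or datasets that still lack metadata.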
Step 3. Finding your managed datasets
At EMBL, data managed by LabID is stored on your group share, following the general convention /g/<group_name>/LabID/. This directory is typically readable by the group, and the data is organized by assay name or by date. Although the directory can be browsed, doing so is not always convenient or desirable.
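To illustrate what browsing the share by hand amounts to, the following sketch counts the files below each top-level folder of a LabID directory. It is only an illustration: the function name `count_files_by_folder` is hypothetical, and the root path has to be replaced with your own group's directory.

```python
from collections import Counter
from pathlib import Path


def count_files_by_folder(root):
    """Count the files below each top-level folder of `root`."""
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        parts = path.relative_to(root).parts
        # Files directly in root are grouped under "."; everything else is
        # grouped by its first folder (e.g. an assay name or a date).
        top_level = parts[0] if len(parts) > 1 else "."
        counts[top_level] += 1
    return counts


# Example call (replace <group_name> with your group before running):
# for folder, n in sorted(count_files_by_folder("/g/<group_name>/LabID/").items()):
#     print(f"{n:6d}  {folder}")
```

Walking a large share like this can be slow, and it tells you nothing about ownership or annotations.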
The client provides another way to find and work with your data.
Working with your data
Note that for this exercise, you or your group need to have data available on LabID. If this is not the case, you can load some data using the dropbox, or connect to the training instance. Alternatively, you can just follow along without executing the steps.
Remember, to switch to the training instance, you can use: stocks config --url <other_instance_url>
- Run the following
# here $USER will be your username, you can replace it with any filter to find your datasets
stocks list datafilecopies --filter owner=$USER --filter fields=id,name,shortname| jq
For the next step, the client has to run on a machine that has access to the filesystem used by LabID. At EMBL, this means the group share needs to be reachable. To demonstrate the result, however, we can also run it on our local machine. Note that the -p flag makes xargs ask for confirmation before each ln -s call. Feel free to experiment.
stocks list datafilecopies --filter owner=$USER | jq -r '.results[]|[.uri, .shortname] | @tsv' | xargs -p -n 2 ln -s
As the result shows, a symbolic link has been created for each file under its shortname, and each link points to the original path on your group share. The heavy lifting is done by jq and some unix pipes.
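The same linking step can also be written in Python, which may be easier to extend inside a workflow. The sketch below assumes only what the jq filter above relies on: a JSON document with a `results` list whose entries carry `uri` and `shortname`. The function name `link_datafile_copies` and the sample record are hypothetical, and unlike the one-liner this version skips names that already exist instead of prompting.

```python
import json
import os
import tempfile
from pathlib import Path


def link_datafile_copies(listing_json, target_dir="."):
    """Create one symlink per entry in `results`, named after its shortname."""
    target_dir = Path(target_dir)
    created = []
    for entry in json.loads(listing_json).get("results", []):
        uri, shortname = entry.get("uri"), entry.get("shortname")
        if not uri or not shortname:
            continue  # skip records without a path or a link name
        link = target_dir / shortname
        if link.exists() or link.is_symlink():
            continue  # never overwrite an existing file or link
        link.symlink_to(uri)  # same effect as: ln -s <uri> <shortname>
        created.append(link)
    return created


if __name__ == "__main__":
    # Made-up sample record standing in for the client's JSON output.
    sample = json.dumps(
        {"results": [{"uri": "/g/<group_name>/LabID/example.fastq.gz",
                      "shortname": "example.fastq.gz"}]}
    )
    with tempfile.TemporaryDirectory() as workdir:
        for link in link_datafile_copies(sample, workdir):
            print(link.name, "->", os.readlink(link))
```

In real use, the JSON text would come from the output of the stocks list datafilecopies command shown above, for example by piping it into the script and reading standard input.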
Closing remarks and call for contributions
The commands in this hands-on show the potential of consuming the LabID API. The current client can be seen as a starting point, and we welcome contributions from any of you. Please contact us or directly start hacking the code base :).
Note that this client can be used in your workflows, or as a library in any other tool. But be warned, it still has its rough edges.