Archiving Datasets¶
The Archive feature sends a list of datasets to an archiving solution, which at EMBL is a tape archive. Files stored there are kept for at least 10 years. Once datasets are archived and no longer in use, they can be safely removed from disk to free up storage. You can start an Archive operation from the Projects, Studies, and Datasets pages. Let's go through the whole process of creating an archive from the Datasets page.
Bulk archive button¶
The Archive button can be found on the dataset list page, next to the other bulk-action buttons (edit, delete, share, etc.).
By default these bulk-action buttons are disabled; you can enable them by selecting one or more items in the table.
After you click the Archive button, a details view opens with a list of the datafiles associated with the selected datasets. Let's look at each section of this view:
Section A: Datafiles Info¶
This section contains a table in which each row represents one datafile and shows its size, its status, and whether it is archivable. The datafiles are grouped by Dataset. Remember, a Dataset can consist of one or more Datafiles, which in turn can have one or more datafile copies.
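To make the Dataset, Datafile, and datafile copy relationship concrete, here is a minimal Python sketch. It is only an illustration and not LabID's actual data model: the class names, the field names (`size_bytes`, `status`, `archivable`, `location`), and the helper `archivable_size_by_dataset` are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatafileCopy:
    location: str  # hypothetical field: where this copy is stored


@dataclass
class Datafile:
    name: str
    size_bytes: int
    status: str       # hypothetical status label
    archivable: bool
    copies: List[DatafileCopy] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    datafiles: List[Datafile] = field(default_factory=list)


def archivable_size_by_dataset(datasets: List[Dataset]) -> Dict[str, int]:
    """Sum the sizes of archivable datafiles, grouped by dataset name."""
    return {
        ds.name: sum(f.size_bytes for f in ds.datafiles if f.archivable)
        for ds in datasets
    }


# Two made-up datasets, mirroring the grouping shown in the table.
example = [
    Dataset("dataset-1", [
        Datafile("a.fastq", 100, "online", True, [DatafileCopy("disk")]),
        Datafile("b.fastq", 50, "online", False, [DatafileCopy("disk")]),
    ]),
    Dataset("dataset-2", [
        Datafile("c.bam", 200, "online", True,
                 [DatafileCopy("disk"), DatafileCopy("tape")]),
    ]),
]
totals = archivable_size_by_dataset(example)
```

In this sketch, only the datafiles flagged as archivable count towards each dataset's total, which is roughly what the table lets you check by eye before you continue.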
Section B: Archive Information¶
You as a user need to fill in this form before creating an archive.
- Name: A relevant name that might help you to identify the archive in the future.
- Billing Details: Refers to the Budget Number which will be used to charge for the archiving service. In general, this will be your group's budget number.
- Description: We're automatically injecting a description, but you can edit or add to this, to describe the archive better.
- Checkbox: If some of the selected datafiles have been archived before, you can opt to remove those datafiles from this archive. This will save you costs, but it might make it harder to retrieve these files in the future. This is especially true if you are archiving a whole Project or Study, where you may want to keep all the files together in one big archive.
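The effect of the checkbox can be pictured as a simple filter over the selected datafiles. The following Python sketch is hypothetical and does not show how LabID implements it: `SelectedDatafile`, its `previously_archived` flag, and `files_to_archive` are names invented for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SelectedDatafile:
    name: str
    size_bytes: int
    previously_archived: bool  # hypothetical flag: already in an older archive


def files_to_archive(
    selected: List[SelectedDatafile],
    skip_previously_archived: bool,
) -> List[SelectedDatafile]:
    """Return the datafiles that would go into the new archive."""
    if not skip_previously_archived:
        # Checkbox unchecked: keep everything together in one archive.
        return list(selected)
    # Checkbox checked: leave out files that were archived before.
    return [f for f in selected if not f.previously_archived]


selection = [
    SelectedDatafile("raw_1.tif", 400, previously_archived=True),
    SelectedDatafile("raw_2.tif", 600, previously_archived=False),
]
everything = files_to_archive(selection, skip_previously_archived=False)
only_new = files_to_archive(selection, skip_previously_archived=True)
```

With the option enabled, the new archive is smaller and therefore cheaper, but `raw_1.tif` would then have to be retrieved from the earlier archive rather than from this one.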
Section C: Confirmation¶
After reviewing the information in Section A and filling in all the necessary details in Section B, it's time to confirm and create the archive. Check the confirmation checkbox and click the button labeled Archive. If the job starts successfully, you will be directed to the Archive detail page.
Archiving is an asynchronous job, meaning it runs in the background. LabID will send you an email when the archive is done or when further action is needed.
You can also track the status of your archive job from the user tasks page. Its icon is in the top-right corner of the screen, next to the cart icon.
If there are any unfinished tasks, a number is displayed on the icon. Click the icon to open the user tasks page and see information about the tasks you've started.