Details
Feature
Not Assigned
None
None
None
SRCnet
24.2
Description
There are two implementation options, both of which use the SRC DM prototype APIs and related services to transfer the data.
Red items are new and have never been tried before, blue items need to be developed from scratch, and green items have been done before.
1) Non-deterministic Aus object storage RSE (total copies of data in Aus = 1)
- Set up a non-deterministic object storage RSE that corresponds to the object store hosting the AAVS3 data.
- Modify the ingestion service to poll the object store (OS) based on a parent bucket (??), register each data object with a PFN and DID, and add the metadata to the file.
- Ingestion service runs 'next to' Acacia
- Set up a new RSE in the SDH&P tenant (for accounting purposes) with mounted volumes of size (??).
- Initiate a third-party copy (TPC) between the AUS_OS RSE and the SDH&P RSE using the DM API.
- Download the data and reassemble it into a familiar structure (??)
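The registration step in option 1 can be sketched as follows. Because the RSE is non-deterministic, the storage path cannot be derived from the DID, so each record has to carry an explicit PFN. This is a minimal sketch under stated assumptions, not the ingestion service's actual code: `StoredObject`, `new_replica_records`, the use of the object key as the DID name, and the example PFN prefix are all hypothetical. The record layout is loosely modelled on the replica dictionaries accepted by the Rucio client, and how the object listing is obtained from the parent bucket is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredObject:
    """One entry from a listing of the parent bucket (hypothetical shape)."""
    key: str      # object key relative to the parent bucket
    size: int     # size in bytes
    adler32: str  # checksum, assumed to be available from the listing or a sidecar


def new_replica_records(objects, already_registered, scope, pfn_prefix, metadata):
    """Build registration records for objects not seen in earlier polls.

    `already_registered` is a set of object keys that is updated in place,
    so calling this function again with the same listing yields nothing new.
    """
    records = []
    prefix = pfn_prefix.rstrip("/")
    for obj in objects:
        if obj.key in already_registered:
            continue
        records.append({
            "scope": scope,
            "name": obj.key,                # assumption: the object key is the DID name
            "bytes": obj.size,
            "adler32": obj.adler32,
            "pfn": f"{prefix}/{obj.key}",   # explicit PFN: required on a non-deterministic RSE
            "meta": dict(metadata),         # copied so records do not share one dict
        })
        already_registered.add(obj.key)
    return records
```

A polling loop would call `new_replica_records` once per poll with the current listing and pass the result to whichever registration call the ingestion service ends up using.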
2) Deterministic, file-system-based Aus RSE (total copies of data in Aus = 3)
- Move the data from the OS to a staging area that looks like a typical file system.
- Modify the ingestion client to use the extended Rucio client so that it can ingest hierarchical data and set the corresponding metadata.
- Run the ingestion service 'next to' the file-system-based staging area and set metadata so that every file in the hierarchy carries metadata.
- Set up a new RSE in the SDH&P tenant (for accounting purposes) with mounted volumes of size (??).
- Initiate a TPC between AUS RSE and SDH&P RSE using the DM API
- Run the download client on the SDH&P RSE to build the data in the expected structure.
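The difference between the two options comes down to how a file's location is found. On a deterministic RSE the storage path is computed from the DID itself, so no per-file PFN has to be registered. The sketch below illustrates the idea with an md5-based layout similar to the hash convention commonly used for deterministic Rucio RSEs. Which algorithm the Aus and SDH&P RSEs would actually be configured with is an assumption, and `deterministic_path` and `local_path` are hypothetical helper names. The second helper shows one way the download client could rebuild the hierarchy, assuming the DID name encodes the relative path.

```python
import hashlib
from pathlib import PurePosixPath


def deterministic_path(scope, name):
    """Derive a storage path from the DID alone (md5-based layout, assumed)."""
    digest = hashlib.md5(f"{scope}:{name}".encode("utf-8")).hexdigest()
    # Two levels of hash-derived directories spread files evenly across the tree.
    return f"{scope}/{digest[0:2]}/{digest[2:4]}/{name}"


def local_path(download_root, name):
    """Map a DID name back onto a local hierarchy under `download_root`.

    Assumes the DID name encodes the original relative path, e.g. 'obs1/file_a.dat'.
    Leading slashes are stripped so the result always stays under the root.
    """
    return str(PurePosixPath(download_root) / name.lstrip("/"))
```

The same `scope` and `name` always give the same path, which is what lets any client locate a replica without a lookup of stored PFNs.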
Many of these steps will be tried for the first time or developed from scratch, so this feature should be timeboxed. If it cannot be achieved within the timebox, the backup option is to set up an alternative data transfer service such as bbcp.
Assumptions: a) There are no significant data privacy requirements on this data. Access will require a valid OIDC token, but the data will be accessible to the wider SRCNet ecosystem and its users. b) A small amount of resource is available 'next to' the OS to run the ingestion services.