  1. SAFe Program
  2. SP-3542

Perform TPC and/or streamed transfers to/from an S3 storage endpoint in the Rucio data lake



    • SRCnet

      Exploring Rucio support for S3 storage has been ongoing over the last several PIs, and we have a (Swiss) S3 endpoint added to the data lake that supports uploads. However, until we look at how data can be transferred in and out of this endpoint, it will remain a bit of an island. This feature starts exploring how TPC might work between an S3 endpoint and a more traditional POSIX-like storage endpoint.
      Auth flows are considerably different for the two storage endpoint types (signed URLs vs user tokens) so it is important to understand how the different layers like Rucio and FTS deal with this. 

      Object storage is becoming increasingly commonplace and several infrastructure providers are prioritising deploying and offering it, so we need to understand whether this type of storage can be fully integrated with Rucio. Identifying any problems early will be important. Additionally, some AAVS data might be ported into an object store at Pawsey; if this indicates that Observatory data might be stored in an object store at its source, we need to start understanding whether that is a problem.
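
The difference between the two auth flows mentioned above (signed URLs vs user tokens) can be sketched roughly as follows. This is a toy illustration only: the signing scheme is a simplified HMAC stand-in and is not the real S3 signature algorithm, and the function names, hostnames and secrets are all hypothetical. It is meant to show where the credential travels in each case, not how Rucio, FTS or gfal implement it.

```python
import hashlib
import hmac
from typing import Dict, Tuple
from urllib.parse import parse_qs, urlencode, urlsplit


def presign_url(base_url: str, secret: bytes, expires_at: int) -> str:
    """Toy signed URL: the credential is a signature embedded in the query string.

    NOTE: simplified, hypothetical scheme; not the actual S3 signing algorithm.
    """
    payload = f"GET\n{base_url}\n{expires_at}".encode()
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "signature": signature})
    return f"{base_url}?{query}"


def verify_signed_url(url: str, secret: bytes, now: int) -> bool:
    """Storage-side check: recompute the signature and reject expired URLs."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    expires_at = int(query["expires"][0])
    expected = presign_url(base_url, secret, expires_at)
    return now <= expires_at and hmac.compare_digest(url, expected)


def token_request(url: str, token: str) -> Tuple[str, Dict[str, str]]:
    """Token flow: the URL stays plain and the credential goes in a header."""
    return url, {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # Hypothetical endpoints and credentials, for illustration only.
    signed = presign_url("https://s3.example.org/bucket/file", b"secret-key", 1_700_000_000)
    print(signed)
    print(token_request("https://storage.example.org/path/file", "user-token"))
```

In this sketch, anyone holding the signed URL can use it until it expires, whereas the token request needs the header to be forwarded separately. That is one reason the layers involved in a third-party copy may have to treat the two endpoint types differently.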


      A demo that illustrates Rucio-based data movement between an S3 and a non-S3 endpoint using TPC, or using streaming transfers if TPC is technically not feasible; OR a document outlining the hurdles in doing so, why TPC didn't work (if it didn't), and the development work needed in the software stack involved (gfal, Rucio, FTS, etc.).

    • 1
    • 1
    • 0
    • Team_MAGENTA
    • Sprint 4

      Demo: https://confluence.skatelescope.org/display/SRCSC/2024-02-08+SRC+ART+System+Demo+21.5+Part+1+AM 

      Note: we opted not to perform a streamed transfer with FTS. Streaming would be hard to scale, and we will eventually need our collaborators in the FTS team to provide a fix as soon as reasonably possible.

      (Extended notes at: https://confluence.skatelescope.org/display/SRCSC/SP-3542+Rucio+S3+TPC+transfers ) 

    • 21.6
    • Stories Completed, Demonstrated, Satisfies Acceptance Criteria, Accepted by FO
    • PI23 - UNCOVERED

    • PI19-PB PI20-PB PI21-PB


      Try to move data between an S3 RSE and a non-S3 RSE, in both directions. Perform TPC and streaming transfers on the FTS instance being used to move Rucio data.

      This might be best done once an SRCNet FTS instance is up and running as we will have more control over it and communication will be less async.






                r.bolton Bolton, Rosie
                r.joshi Joshi, Rohini

                Feature Progress

                  Story Point Burn-up: (100.00%)

                  Feature Estimate: 1.0

                                Issues  Story Points
                  To Do         0       0.0
                  In Progress   0       0.0


