  1. SAFe Program
  2. SP-3542

Perform TPC and/or streamed transfers to/from an S3 storage endpoint in the Rucio data lake


Details

    • SRCnet

      Rucio support for S3 storage has been explored over the last several PIs, and a (Swiss) S3 endpoint that supports uploads has been added to the data lake. However, until we understand how data can be transferred into and out of this endpoint, it will remain something of an island. This feature starts exploring how TPC (third-party copy) might work between an S3 endpoint and a more traditional POSIX-like storage endpoint.
      Auth flows differ considerably between the two storage endpoint types (signed URLs vs user tokens), so it is important to understand how the different layers, such as Rucio and FTS, deal with this.
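
      To make the signed-URL vs user-token difference concrete, here is a minimal sketch of how a pull-mode HTTP-TPC request might look in each case. It assumes the general HTTP-TPC convention (a COPY request sent to the destination, a Source header, and TransferHeader-prefixed headers forwarded to the source with the prefix removed). The class name, function names, URLs and tokens are all hypothetical, and this is not a description of what Rucio, FTS or gfal actually do internally.

```python
from dataclasses import dataclass, field
from urllib.parse import parse_qs, urlsplit


@dataclass
class CopyRequest:
    """Hypothetical container for a pull-mode TPC request sent to the destination."""
    method: str
    url: str
    headers: dict = field(default_factory=dict)


def pull_from_token_source(dest_url, dest_token, source_url, source_token):
    # Token flow: the destination is handed a bearer token to present to the
    # source. Headers prefixed with "TransferHeader" are forwarded to the
    # source with the prefix stripped (assumed HTTP-TPC convention).
    return CopyRequest("COPY", dest_url, {
        "Authorization": f"Bearer {dest_token}",
        "Source": source_url,
        "TransferHeaderAuthorization": f"Bearer {source_token}",
    })


def pull_from_s3_source(dest_url, dest_token, presigned_source_url):
    # Signed-URL flow: the credential lives in the query string of the source
    # URL itself, so no Authorization header is forwarded to the source.
    query = parse_qs(urlsplit(presigned_source_url).query)
    if "X-Amz-Signature" not in query:
        raise ValueError("source URL does not look presigned")
    return CopyRequest("COPY", dest_url, {
        "Authorization": f"Bearer {dest_token}",
        "Source": presigned_source_url,
    })
```

      In this sketch the S3 side is always the passive party. An S3 endpoint is not expected to accept a COPY request itself, so a transfer into S3 would presumably need the non-S3 endpoint to push to a presigned PUT URL instead; that is an assumption, not something this ticket confirms.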

      Object storage is becoming increasingly commonplace: several infrastructure providers are prioritising deploying and offering it, so we need to understand whether this type of storage can be fully integrated with Rucio. Identifying any problems early will be important. Additionally, some AAVS data might be ported into an object store at Pawsey; if this indicates that Observatory data might be stored in an object store at its source, we need to start understanding whether that is a problem.


      Demo that illustrates Rucio-based data movement between an S3 and a non-S3 endpoint using TPC, or using streaming transfers if TPC is technically not feasible; OR a document outlining the hurdles in doing so, why TPC didn't work (if it didn't), and the development work needed in the software stack involved (gfal, Rucio, FTS, etc.).

    • 1
    • 1
    • 0
    • Team_MAGENTA
    • Sprint 4

      Demo: https://confluence.skatelescope.org/display/SRCSC/2024-02-08+SRC+ART+System+Demo+21.5+Part+1+AM 

      Note: we opted not to perform a streamed transfer with FTS. This would be hard to scale, and we will need our collaborators in the FTS team to provide a fix as soon as reasonably possible.

      (Extended notes at: https://confluence.skatelescope.org/display/SRCSC/SP-3542+Rucio+S3+TPC+transfers ) 

    • 21.6
    • Stories Completed, Demonstrated, Satisfies Acceptance Criteria, Accepted by FO
    • PI23 - UNCOVERED

    • PI19-PB PI20-PB PI21-PB

    Description

      Try to move data between an S3 RSE and a non-S3 RSE, in both directions. Perform TPC and streaming transfers on the FTS instance being used to move Rucio data.

      This might be best done once an SRCNet FTS instance is up and running, as we will have more control over it and communication will be less asynchronous.


              People

                r.bolton Bolton, Rosie
                r.joshi Joshi, Rohini
                Votes: 0
                Watchers: 1

                Feature Progress

                  Story Point Burn-up: (100.00%)

                  Feature Estimate: 1.0

                                Issues   Story Points
                  To Do         0        0.0
                  In Progress   0        0.0
                  Complete      6        8.0
                  Total         6        8.0

