  1. SAFe Program
  2. SP-4274

Develop metadata-based replication tests as part of regular functional testing


Details

    • SRCnet

      As we head towards SRCNet0.1 delivery, more robust test coverage will be required to assess component performance. A fast way to achieve this is full-stack data transfer tests, together with descriptive monitoring infrastructure that flags issues for operators to act on. Therefore increasing the scope of the Rucio automated test campaign is a high-value task.


      (To be refined at Backlog Prioritisation/PI Planning)

      AC1: Additional tests added to the regular functional test suite which demonstrate additional Rucio functionality (at least subscription-based data transfers)

      AC2: Modifications to the monitoring infrastructure (including dashboard views) to make clear success/failure criteria (and/or performance) for new tests

      AC3: Additional tests added to the regular functional test suite which demonstrate larger scale transfers between sites, including at least one intercontinental link (e.g. AusSRC to Europe)

    • 2
    • 1.5
    • 0
    • Team_MAGENTA
    • Sprint 5

      AC1/2: Test added which demonstrates metadata-based replications, with a simple dashboard view showing tabular results of recent runs (tests are run hourly as part of the functional test suite): https://monit.srcdev.skao.int/grafana/d/cdtfeqszc19tsf/subscription?orgId=1&from=now-24h&to=now
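
      As a rough illustration of what "metadata-based replication" means here, the sketch below models a subscription as a metadata filter paired with replication rules: when a new data identifier's metadata matches the filter, the rules apply. This is a simplified conceptual model, not the test code from the repository and not Rucio's own matching logic. The subscription name, the metadata keys (`scope`, `obs_type`) and the RSE name are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical subscription definition: each filter key maps to a list of
# accepted values, and each rule says where matching data should be copied.
SUBSCRIPTION: Dict[str, Any] = {
    "name": "example-metadata-subscription",  # hypothetical name
    "filter": {"scope": ["testing"], "obs_type": ["calibration"]},  # hypothetical keys
    "replication_rules": [
        {"rse_expression": "EXAMPLE_RSE", "copies": 1, "lifetime": 3600},  # hypothetical RSE
    ],
}


def matches(filter_: Dict[str, List[Any]], metadata: Dict[str, Any]) -> bool:
    """Return True if every filter key is present in metadata with an accepted value."""
    return all(metadata.get(key) in accepted for key, accepted in filter_.items())


def rules_for(metadata: Dict[str, Any], subscription: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the replication rules that would be triggered for this metadata."""
    if matches(subscription["filter"], metadata):
        return list(subscription["replication_rules"])
    return []
```

      A functional test built on this idea would upload a file tagged with matching metadata, then check that a replication rule (and eventually a replica) appears at the target RSE within the expected time.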

      AC3: Tests written and merged into the main repository (https://gitlab.com/ska-telescope/src/src-dm/ska-src-dm-da-rucio-task-manager/-/tree/aussrc-tests). However, these are not yet running regularly: the AusSRC RSE is currently out of action, and we are waiting for the switch to SKA IAM before asking the RSE operator to bring it back up.

    • 24.3
    • Outcomes Reviewed
    • PI24 - UNCOVERED

    • SRC23-PB SRCNet0.1 data-ingestion-dissemination-and-replication network-tests team_DAAC tests-compilation

    Description

      To date, the Rucio tests built in the rucio-task-manager (https://gitlab.com/ska-telescope/src/ska-rucio-task-manager) have been fairly small-scale, full-mesh functional tests.

      Now that Rucio has been adopted as the DDM solution of choice for SRCNet 0.1, and as sites start to increase the storage resources they make available to the network, it is a good time to increase the scope of the tests running on a regular cadence.

      This could include:

      • Increasing the size of test data files moved around the network
      • Increasing Rucio code coverage through adding e.g. subscription-based transfer tests
      • Moving real/simulated science data (in addition to randomly generated data) around the network
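
      For the first item, a minimal sketch of generating a randomly filled test file of configurable size is given below. It writes in chunks so that large files do not need to fit in memory, and computes an Adler-32 checksum along the way so that the file can be verified after transfer. The function name and parameters are hypothetical and are not taken from the rucio-task-manager code.

```python
import os
import zlib


def make_random_file(path: str, size_bytes: int, chunk_bytes: int = 1024 * 1024) -> str:
    """Write size_bytes of random data to path and return its Adler-32 as 8 hex digits."""
    checksum = 1  # Adler-32 starting value
    remaining = size_bytes
    with open(path, "wb") as fh:
        while remaining > 0:
            # Write at most one chunk at a time to bound memory use.
            block = os.urandom(min(chunk_bytes, remaining))
            fh.write(block)
            checksum = zlib.adler32(block, checksum)
            remaining -= len(block)
    return f"{checksum & 0xFFFFFFFF:08x}"
```

      Scaling up the tests would then amount to raising `size_bytes` per site as storage allows, for example from a few megabytes to several gigabytes.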

      Corresponding improvements to the monitoring/dashboards should also be made to facilitate easier 'quick-look' assessment of network health and performance.

              People

                r.bolton Bolton, Rosie
                j.collinson Collinson, James

                Feature Progress

                  Story Point Burn-up: (93.33%)

                  Feature Estimate: 2.0

                  Status        Issues   Story Points
                  To Do         1        1.0
                  In Progress   0        0.0
                  Complete      11       14.0
                  Total         12       15.0
