SP-2944 (SAFe Program)

Implement consistent strategy for writing SDP data products


Details

    • True
    • Data Processing
    • See "Why?" in description.
    • See "What?" in description.
    • Intra Program
    • 2
    • 2
    • 0
    • Team_ORCA, Team_YANDA
    • Sprint 5
    • AC1:

      ORCA:

      Demo of PVC generation at System Demo 17.4 (slides).

      The SDP Helm chart can now create a Persistent Volume Claim (PVC) in both its control system namespace and the processing namespace via a simple setting in values.yaml. It can also use existing PVCs for this purpose. The choice is made at installation time by the user who installs the chart (there are no checks whether the PVC really exists). The PVC name is passed to the Processing Controller as an environment variable, and the Processing Controller in turn passes that variable on to the scripts.
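The hand-off described above can be sketched from the script's side. This is only an illustration, not the Processing Controller's or any script's actual code: the variable name `SDP_DATA_PVC_NAME` and the volume name `data-products` are assumptions, since the ticket does not state them. The one non-hypothetical part is the shape of the Kubernetes volume entry (`persistentVolumeClaim.claimName`).

```python
import os

# Hypothetical variable name, chosen for illustration; the ticket does not name it.
PVC_ENV_VAR = "SDP_DATA_PVC_NAME"


def data_pvc_name(environ=None):
    """Return the PVC name passed down through the environment.

    Raises RuntimeError when the variable is missing or empty, so that a
    misconfigured deployment fails early rather than writing data elsewhere.
    """
    env = os.environ if environ is None else environ
    name = env.get(PVC_ENV_VAR, "").strip()
    if not name:
        raise RuntimeError(f"{PVC_ENV_VAR} is not set; cannot locate the data product volume")
    return name


def pvc_volume_spec(claim_name, volume_name="data-products"):
    """Build a Kubernetes pod volume entry that refers to an existing PVC.

    The volume name is a hypothetical default; the claim is referenced by name
    only, so (as in the chart) nothing here checks that the PVC exists.
    """
    return {
        "name": volume_name,
        "persistentVolumeClaim": {"claimName": claim_name},
    }
```

A script could call `pvc_volume_spec(data_pvc_name())` and place the result in the `volumes` list of whatever pod it deploys.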

      The changes are part of the following releases:

      SDP: https://jira.skatelescope.org/browse/REL-191 (0.13.0)
      Processing Controller: ska-sdp-proccontrol==0.11.3

      (Note that SDP 0.13.0 does not contain Processing Controller 0.11.3; the latest version will be included in SDP 0.14.0.)

      AC2:

      YANDA:
      The modified visibility receiver now reads the PVC name from an environment variable and uses it to populate the Helm values of the receive chart, so that the receiver pod uses that PVC.

      AC3:

      ORCA:

      The metadata Python package has been added to ska-sdp-dataproduct-metadata. It allows, for example, the measurement set (MS) writer to write a file containing MS metadata into a given directory. Its documentation is at: https://developer.skao.int/projects/ska-sdp-dataproduct-metadata/en/latest/index.html.

      Data Product Metadata: ska-sdp-dataproduct-metadata==0.1.1
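To make the AC3 idea concrete, here is a minimal standard-library sketch of writing a metadata file into a given directory next to the output data. It does not use or reproduce the ska-sdp-dataproduct-metadata API. The file name `metadata.json`, the JSON encoding and the example keys are all assumptions made for illustration; the real file format is the one defined in ADR-55.

```python
import json
from pathlib import Path

# Hypothetical file name; the real name and format come from ADR-55.
METADATA_FILENAME = "metadata.json"


def write_metadata(directory, metadata):
    """Write `metadata` (a JSON-serialisable dict) into `directory`.

    The file is written under a temporary name and then renamed, so a reader
    scanning the volume does not pick up a half-written file.
    """
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / METADATA_FILENAME
    tmp = target.with_suffix(".tmp")
    tmp.write_text(json.dumps(metadata, indent=2, sort_keys=True))
    tmp.replace(target)  # rename over the final name
    return target
```

An MS writer might call, for instance, `write_metadata(output_dir, {"files": ["output.ms"]})` once the data has been flushed, where `output_dir` and the dictionary contents are placeholders.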
    • 17.6
    • Stories Completed, Integrated, BDD Testing Passes (no errors), Outcomes Reviewed, Demonstrated, Satisfies Acceptance Criteria, Accepted by FO
    • PI23 - UNCOVERED

    • LOW_SUT1 LOW_SUT2 MID_SUT1 MID_SUT3

    Description

      Introduction

      In the short term, we have decided to store the SDP data products in a single large Kubernetes volume. However, configuring the SDP to use a pre-existing volume or create a new one, and configuring the processing scripts to write the data to that volume, is a haphazard process at present.

      We need a joined-up way to configure the volume when deploying the SDP and get the scripts to discover it and write their data in the appropriate location. We must also ensure the data products are written with the appropriate metadata.

      We need to consider the various kinds of SDP deployment: ephemeral (for testing in the CI pipeline) and persistent, both stand-alone and as part of the integrated system in Skampi.

      Who?

      • SDP developers
      • SDP users
      • AIV engineers

      What?

      • SDP can be configured to use a pre-existing volume or create a new volume for storing data products.
      • Processing scripts discover the volume automatically and their data is written to the volume in the appropriate location.
      • Processing scripts write appropriate metadata files alongside the output data.

      Why?

      • Simplifies configuration and deployment of SDP and processing scripts.
      • Ensures data products and metadata are stored in a systematic way.
      • Enables discovery of data products via user interfaces.

      References

      • ADR-55 defines the paths to be used in the volume and the format for the metadata files.

            People

              m.ashdown Ashdown, Mark

              Feature Progress

                Story Point Burn-up: (100.00%)

                Feature Estimate: 2.0

                Status        Issues   Story Points
                To Do         0        0.0
                In Progress   0        0.0
                Complete      12       17.0
                Total         12       17.0
