SAFe Program / SP-4058

Prepare AA0.5 software deployment ahead of station integration

Details

    • Feature
    • Must have
    • PI22
    • None
    • None
    • LOW ART

      A key step in station integration is the integration of software with hardware. Station integrators need this in order to verify the hardware integration, i.e. to obtain station telemetry and to run the station short functional tests. If we are not ready to integrate the software with the hardware in time, station integration will be delayed, or stations may have to be handed over to AIV without having been properly verified.

    • Inter Program
    • 2
    • 2
    • 0
    • Team_VULCAN
    • Sprint 4

      A repository structure, chart structure and deployment strategy for AA0.5 were designed; see https://confluence.skatelescope.org/display/SE/Deployment+strategy+for+SKA-Low+telescope+software for details. (The architecture has since evolved from this initial proposal.) Repositories were created and bootstrapped, and the charts were created.

      An important aspect of the proposed chart structure involved abstracting away from specific target facilities. That is, rather than have an AAVS3 chart, a Low ITF chart, a CPF chart, an SPC chart, and so on, we defined generic charts that can be configured for specific facilities.
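
      The idea of a generic chart configured per facility can be illustrated with a minimal sketch of layered configuration: one set of generic defaults, plus a small override per facility, merged recursively. This is not the actual chart content. Every key, value and facility entry below is a hypothetical placeholder, and the merge function only approximates how layered values are commonly combined (nested mappings merged, other values replaced).

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which override wins; nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale, not combined.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults for a generic chart (illustrative keys only).
GENERIC_DEFAULTS = {
    "mccs": {"enabled": True, "stations": []},
    "monitoring": {"grafana": {"enabled": True}},
}

# Hypothetical per-facility overrides; the real values are not in this ticket.
FACILITY_OVERRIDES = {
    "aavs3": {"mccs": {"stations": ["aavs3"]}},
    "low-itf": {
        "mccs": {"stations": ["itf1", "itf2"]},
        "monitoring": {"grafana": {"enabled": False}},
    },
}


def values_for(facility: str) -> dict:
    """Build the effective configuration for one facility."""
    return deep_merge(GENERIC_DEFAULTS, FACILITY_OVERRIDES[facility])
```

      With this layout, adding a facility means adding one override entry rather than a new chart; the generic defaults are never mutated.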

      The new charts were deployed to the SPC, and the CSP-SDP integration test (originally written for the Low ITF Integration Event) was successfully run in the SPC.

      They were also deployed to the tCPF, and are currently in use by the station integration team as they integrate the first stations. A strong testimony to the success of this work is that, as soon as the s8-1 SPS was powered on for the first time, the Grafana dashboards started displaying telemetry data: the application software was already deployed, correctly configured, and ready to go before the hardware was powered on.

      The AAVS3 and Low ITF deployments were both refactored to use the new chart structure, without any loss of functionality.

      Aside from the charts themselves, specific work done under this feature to prepare for successful deployment to AA0.5 includes:

      • Updating the platform specification file used to configure the software as new information became available, e.g. station rotations, hardware IP addresses and subnets, surveyed antenna positions and cable lengths
      • Shipping of DAQ data from fast local storage to cluster-wide storage
      • Configuration and deployment of the SKA Portal
      • Configuration of Grafana to pull dashboards from a ConfigMap, allowing dashboards to be kept under version control
      • Implementation of Grafana dashboards
      • Workaround for configuring the Tango archiver, given that the archiver configurator cannot handle large numbers of devices
      • Refactoring of MCCS charts to work with constrained permissions
      • Implementation of SDN gateway support in MCCS charts and devices
      • Working with Bang Team to resolve issues including cluster access, DAQ network interfaces and access to server GPUs
      • Setup of Jupyter environments for the clusters, e.g. inclusion of config capture functionality

      Demos:

      • Low ART System Demo 22.1
      • Low ART System Demo 22.2
      • Low ART System Demo 22.4

      Most of the above work can be seen in the ska-low-software repo; the platform specification currently lives in the ska-low-deployment repo.
    • PI24 - UNCOVERED

    Description

      Architect and implement the repository and chart structure for AA0.5 and beyond. Rearchitect the chart structure of other deployment platforms to align with the AA structure.
       
      The SKA-Low software deployment platform will comprise two distinct Kubernetes clusters: one in the CPF, another in the SPC. The software deployment needs to be structured so that:

      • the MCCS subsystem is deployed to the CPF, along with anything else needed there (monitoring and control of cooling system, power supplies, PDUs, etc.? JupyterHub / Taranta for debugging? RFI monitoring?)
      • the TMC, CSP and SDP subsystems are deployed to the SPC, along with anything else needed there (monitoring and control of SKAO assets?)
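
      The split between the two clusters can be written down as data, which makes it easy to check what each cluster should receive. The sketch below is illustrative only: the lowercase identifiers and the `PLACEMENT` and `subsystems_for` names are hypothetical, and it covers just the four subsystems named above, not the still-open "anything else needed there" items.

```python
# Subsystem -> target cluster, as stated in the description above.
# The lowercase identifiers are illustrative, not real chart or release names.
PLACEMENT = {
    "mccs": "cpf",
    "tmc": "spc",
    "csp": "spc",
    "sdp": "spc",
}


def subsystems_for(cluster: str) -> list[str]:
    """Return the subsystems assigned to a cluster, in alphabetical order."""
    return sorted(name for name, target in PLACEMENT.items() if target == cluster)
```

      For example, `subsystems_for("cpf")` yields only the MCCS entry, while `subsystems_for("spc")` yields the CSP, SDP and TMC entries.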

      Once this structure is established, the Low ITF chart structure needs to be rearchitected to align with the SKA-Low structure, so that the charts we deploy to the Low ITF for testing are the same charts that end up deployed to production.
      Similarly, the AAVS3 software deployment is currently defined in an AAVS3-specific chart; this should be refactored to allow the CPF chart to be deployed to AAVS3.

              People

                b.mort Mort, Ben
                Drew.Devereux Devereux, Drew
                Votes: 0
                Watchers: 1

                Feature Progress

                  Story Point Burn-up: (82.05%)

                  Feature Estimate: 2.0

                                Issues   Story Points
                  To Do         6        7.0
                  In Progress   1        0.0
                  Complete      24       32.0
                  Total         31       39.0
