SAFe Program / SP-4042

Enhance monitoring of SDP real-time pipelines


Details

    • Spike
    • Should have
    • PI23
    • COM SDP SW
    • None
    • Data Processing

      SDP is now used for real processing in current integration efforts (as opposed to executing test scripts), and it will soon be used in the same way for AA0.5 testing. AIV engineers, commissioning scientists, and operators will need to be able to debug and analyze any problems that occur during pipeline execution and visibility data capture.

      At the moment, SDP does not report when a processing script or one of its execution engines fails to start or fails during execution. As a result, the SDP subarray device times out (on AssignResources or Configure), or any sub-system waiting for pipeline results gets stuck (e.g. pointing offsets are not updated, and TMC waits indefinitely or times out).

      The aim is to explore ways we can monitor processing scripts and the execution engines they deploy, and come up with a design and strategy to report their status in a way that is useful for operators and other telescope sub-systems. We may also want to store relevant log outputs or raise alarms.
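The difference between a silent timeout and an explicit failure report can be sketched from the point of view of a sub-system waiting for pipeline results. This is only an illustration: `wait_for_result`, `PipelineFailed`, and the `status`/`error` keys are hypothetical names, not part of any SDP or TMC interface.

```python
import time
from typing import Callable, Dict


class PipelineFailed(Exception):
    """Raised when the pipeline reports a failure instead of a result."""


def wait_for_result(
    get_status: Callable[[], Dict[str, str]],  # hypothetical status getter
    timeout_s: float,
    poll_s: float = 0.1,
) -> None:
    """Poll a status source until it finishes, fails, or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state = get_status()
        if state["status"] == "FAILED":
            # A reported failure ends the wait immediately, with a reason.
            raise PipelineFailed(state.get("error", "no reason reported"))
        if state["status"] == "FINISHED":
            return
        time.sleep(poll_s)
    # Without failure reporting, this is the only outcome the waiter ever sees.
    raise TimeoutError("no result and no failure report before the timeout")
```

With a reported failure, the waiting side can stop as soon as the status changes and pass the reason on, instead of learning nothing until the timeout expires.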

      1. Design for monitoring processing scripts and the processes they deploy - ORCA
      2. Initial implementation of the design in a chosen processing script - ORCA
    • Intra Program
    • 3
    • 2
    • 0
    • Team_ORCA
    • Sprint 4

      Demo: System Demo 23.5 (slides)

      Design work is documented on Confluence (page and its sub-pages): https://confluence.skatelescope.org/display/SE/Monitoring+of+Processing+script+execution

      Final design can be accessed here: https://confluence.skatelescope.org/pages/viewpage.action?pageId=282497976

      Based on the design, we chose one scenario to implement. This work is in progress and the code has not been merged to the main branches. The aim is not to provide a fully functioning feature, but to explore and refine the design in practice. The comprehensive implementation is part of SP-4502.

      The scenario: an execution engine requested by a processing script doesn't start. In this case, the helm deployer updates the deployment state with the reason (error message). The processing script/block monitors the deployment state and, based on that information, reports to the subarray. The subarray uses the processing block state to report the error on a new "errorMessage" tango attribute, and moves the obsState to FAULT if needed.
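The propagation chain in this scenario can be sketched roughly as follows. This is a simplified in-memory model, not the actual implementation: the class names, field names, and status strings are assumptions made for illustration, and the real components exchange state through the SDP configuration and Tango rather than plain Python objects.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ObsState(Enum):
    IDLE = "IDLE"
    FAULT = "FAULT"


@dataclass
class DeploymentState:
    """State written by the helm deployer (hypothetical shape)."""
    status: str = "PENDING"
    error_message: Optional[str] = None


@dataclass
class ProcessingBlockState:
    """State written by the processing script (hypothetical shape)."""
    status: str = "WAITING"
    error_message: Optional[str] = None


@dataclass
class Subarray:
    """Stand-in for the subarray device and its errorMessage attribute."""
    obs_state: ObsState = ObsState.IDLE
    error_message: str = ""


def deployer_report_failure(deployment: DeploymentState, reason: str) -> None:
    # Step 1: the deployer records why the execution engine did not start.
    deployment.status = "FAILED"
    deployment.error_message = reason


def script_check_deployment(
    deployment: DeploymentState, pb_state: ProcessingBlockState
) -> None:
    # Step 2: the processing script copies a deployment failure into its own state.
    if deployment.status == "FAILED":
        pb_state.status = "FAILED"
        pb_state.error_message = deployment.error_message


def subarray_update(
    subarray: Subarray, pb_state: ProcessingBlockState, fault_on_error: bool = True
) -> None:
    # Step 3: the subarray exposes the error and moves to FAULT if needed.
    if pb_state.status == "FAILED":
        subarray.error_message = pb_state.error_message or "unknown error"
        if fault_on_error:
            subarray.obs_state = ObsState.FAULT
```

The `fault_on_error` flag stands in for the "if needed" condition in the scenario; which errors should actually move the obsState to FAULT is a design decision the sketch does not try to settle.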

      Relevant Merge Requests:

      • Processing script: the chosen script for testing is pointing-offset
      • Scripting Library
      • LMC
      • Helm Deployer
    • 23.5
    • Stories Completed, Outcomes Reviewed, Satisfies Acceptance Criteria, Accepted by FO

            People

              m.ashdown Ashdown, Mark

              Feature Progress

                Story Point Burn-up: (100.00%)

                Feature Estimate: 3.0

                              Issues   Story Points
                To Do         0        0.0
                In Progress   0        0.0
                Complete      5        29.5
                Total         5        29.5
