  1. SAFe Program
  2. SP-4685

Improvements to PST pipeline recoverability and health visibility


Details

    • Feature
    • High
    • PI24
    • COM PST SW
    • None
    • Data Processing
    • Acceptance criteria:
      1. a published document describes the conditions under which TANGO alarms will be raised by PST and how these are handled
      2. a published document describes how the Engineering Data Archive (EDA) will be configured for PST
      3. PST can recover from failure states and be brought back to an Idle state without having to be redeployed
      4. PST regularly performs a health state check with all of its components, rolls up this information and accurately reports overall health state
      5. Taranta dashboard(s) are available to provide PST monitoring and health information for operators and AIV engineers.
      6. A release of PST is published that provides these improvements.
    • 3.5
    • 3
    • 0
    • Team_PST
    • Sprint 5
    • PI24 - UNCOVERED

    • DP_ART_24.5

    Description

      From Miro:

      Why?
      When using complex systems composed of a large number of connected distributed components, a high level of observability and monitoring is vital to track health and performance and to expose the internal state of the system. This provides the insights that AIV, and soon Operations and Commissioning Scientists, will need to quickly understand and diagnose issues, which is essential to preventing future incidents and improving reliability.

      What?

      1. Provide Startup telescope functionality
      2. Provide Shutdown telescope functionality
      3. Provide Monitoring and roll up capability
      4. Provide alarm handling
      5. Provide Control functionality
      6. Provide auto-cabinet shutdown

      Most relevant to PST in the above list are the provision of monitoring and roll up capability (from the monitoring data produced by core C++ applications to the Taranta dashboards used by operators) and the handling of alarms (e.g. when disk space is low or overall health is degraded).
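
      The roll-up and alarm conditions mentioned above can be illustrated with a minimal sketch. It is not the PST implementation: the Health enum, the 10% free-space threshold, the "worst component wins" roll-up rule and the component names are all assumptions made for illustration.

```python
from enum import IntEnum
from typing import List, Mapping


class Health(IntEnum):
    """Hypothetical health levels, ordered from best (0) to worst (2)."""
    OK = 0
    DEGRADED = 1
    FAILED = 2


def disk_health(free_bytes: int, total_bytes: int,
                min_free_fraction: float = 0.1) -> Health:
    """Return DEGRADED when free space falls below an assumed threshold."""
    if total_bytes <= 0:
        return Health.FAILED  # no usable capacity information
    if free_bytes / total_bytes < min_free_fraction:
        return Health.DEGRADED
    return Health.OK


def roll_up(components: Mapping[str, Health]) -> Health:
    """Overall health is the worst health reported by any component."""
    if not components:
        # Nothing reported: treat as a failure rather than assume OK.
        return Health.FAILED
    return max(components.values())


def alarm_candidates(components: Mapping[str, Health]) -> List[str]:
    """Names of components that are not OK, sorted for stable output."""
    return sorted(name for name, h in components.items() if h != Health.OK)


# Placeholder component names, chosen only for this example.
report = {
    "recv": Health.OK,
    "dsp_disk": disk_health(free_bytes=5, total_bytes=100),
}
overall = roll_up(report)            # Health.DEGRADED
to_alarm = alarm_candidates(report)  # ["dsp_disk"]
```

      Taking the maximum of an ordered enum keeps the roll-up rule in one place; a real system might instead weight components or tolerate a quorum of degraded ones.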

              People

                A.Noutsos Noutsos, Aristeidis
                A.Noutsos Noutsos, Aristeidis

                Feature Progress

                  Story Point Burn-up: (71.83%)

                  Feature Estimate: 3.5

                  Status        Issues   Story Points
                  To Do              2            4.0
                  In Progress        2            6.0
                  Complete           8           25.5
                  Total             12           35.5
