SAFe Program / SP-2616

CSP LMC Resilient ON/OFF implementation

Details

    • Enabler
    • Must have
    • PI15
    • None
    • Obs Mgt & Controls
    • Knowing when tasks and subtasks fail to complete makes it much easier to diagnose and understand the root causes of failures triggered by the compute infrastructure.
    • A new release in SKAMPI with refactored functionality, such that the end-to-end tests pass consistently (e.g. more than 100 times back to back) or, in case of failure, clearly indicate the possible cause or causes.

      The acceptance tests can be verified on either Low or Mid.

      Evidence of unhappy path tests in the CSP repo must also be given.
    • 1
    • 2.3
    • 0
    • Team_CREAM
    • Sprint 4
    • A consistent logging strategy was adopted in CSP.LMC. The long running command attributes longRunningCommandStatus and longRunningCommandResult were implemented to provide more information about the execution of the On/Off/Standby commands.

      https://gitlab.com/ska-telescope/ska-csp-lmc-common/-/merge_requests/67

      https://gitlab.com/ska-telescope/ska-csp-lmc-common/-/merge_requests/69

      Unhappy path tests, with exceptions and timeouts provoked on the CBF/PSS/PST subsystems while executing the On/Off/Standby commands, are provided in the CSP.LMC MID repository:

      https://gitlab.com/ska-telescope/ska-csp-lmc-mid/-/merge_requests/52

      The latest release, 0.11.7, which contains these changes, has been introduced to SKAMPI, and the tests have been run 100 times to show that they pass consistently.

      https://gitlab.com/ska-telescope/ska-skampi/-/jobs/2921553614

      The feature was demoed at OMC ART System Demo 15.6.
    • 15.6
    • Stories Completed, Integrated, BDD Testing Passes (no errors), Accepted by FO
    • PI24 - UNCOVERED
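
      The long running command attributes mentioned in the outcome notes above can be read on the client side to find out which command failed and why. The following is only a minimal sketch, not the CSP.LMC implementation. It assumes that longRunningCommandStatus is a flat sequence of alternating command IDs and status names, and that longRunningCommandResult is a (command ID, JSON payload) pair. The helper names, command IDs, and payload contents are hypothetical.

      ```python
      import json
      from typing import Dict, Sequence, Tuple


      def parse_lrc_status(flat: Sequence[str]) -> Dict[str, str]:
          """Turn ("id1", "STATUS1", "id2", "STATUS2", ...) into {id: status}.

          The alternating layout is an assumption made for this sketch.
          """
          if len(flat) % 2 != 0:
              raise ValueError("expected alternating command ID and status entries")
          return {flat[i]: flat[i + 1] for i in range(0, len(flat), 2)}


      def parse_lrc_result(pair: Sequence[str]) -> Tuple[str, object]:
          """Split an assumed (command_id, json_payload) pair and decode the payload."""
          command_id, payload = pair
          return command_id, json.loads(payload)


      # Hypothetical attribute values, as a client might read them.
      status = parse_lrc_status(("1660_On", "COMPLETED", "1661_Off", "FAILED"))
      failed = [cid for cid, state in status.items() if state == "FAILED"]
      command_id, result = parse_lrc_result(("1661_Off", '[3, "Timeout on CBF"]'))
      ```

      Keeping the parsing in small pure functions means it can be unit tested without a running device server. The numeric code and message in the payload are illustrative only.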

    Description

      Refactor the current CSP ON/OFF functionality so that the CSP LMC can handle "compute infrastructure-related" failures on CBF (and/or PSS, PST) that occur while switching on or off (e.g. after the command has returned OK but before the subsystem has actually switched ON/OFF).

      Note that the handling shall be limited to "passive" measures only, e.g. appropriate log messages on the client side that tell an investigator when and why something went wrong in the execution of ON/OFF on one of the servers.

      Verification shall be done by means of tests on the CSP LMC repo side in which server-side failures can be injected using mocks.
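
      The passive handling and mock-based verification described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in the CSP LMC repositories: the function switch_on, the logger name, and the error messages are hypothetical, and the subsystem proxies are plain unittest.mock objects rather than real device proxies.

      ```python
      import logging
      from unittest import mock

      logger = logging.getLogger("csp_lmc_sketch")  # hypothetical logger name


      def switch_on(subsystems):
          """Call On() on every subsystem proxy; log failures instead of raising.

          `subsystems` maps a name such as "CBF" to a proxy-like object.
          Returns the names of the subsystems on which On() failed.
          """
          failed = []
          for name, proxy in subsystems.items():
              try:
                  proxy.On()
              except TimeoutError as exc:
                  logger.error("On timed out on %s: %s", name, exc)
                  failed.append(name)
              except Exception as exc:  # any other server-side failure
                  logger.error("On failed on %s: %s", name, exc)
                  failed.append(name)
          return failed


      def test_failures_are_logged():
          cbf, pss, pst = mock.Mock(), mock.Mock(), mock.Mock()
          # Inject server-side failures through the mocks.
          pss.On.side_effect = TimeoutError("no reply")
          pst.On.side_effect = RuntimeError("server unavailable")
          with mock.patch.object(logger, "error") as log_error:
              failed = switch_on({"CBF": cbf, "PSS": pss, "PST": pst})
          assert failed == ["PSS", "PST"]
          assert log_error.call_count == 2
          cbf.On.assert_called_once_with()


      test_failures_are_logged()
      ```

      The handling stays passive here: failures are recorded in the log with the subsystem name and the reason, and no retry or recovery is attempted.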

       

      Also, see below for technical discussion:

      https://docs.google.com/document/d/15LAXGcrQVfUEQp3-1nPpcUaNNxhH8oszA2u8z1nlS3w/edit#

              People

                g.leroux Le Roux, Gerhard [X] (Inactive)

                Feature Progress

                  Story Point Burn-up: (100.00%)

                  Feature Estimate: 1.0

                  Status        Issues   Story Points
                  To Do         0        0.0
                  In Progress   0        0.0
                  Complete      10       22.0
                  Total         10       22.0
