  1. SAFe Program
  2. SP-2617

CSP LMC Resilient assign resources implementation


Details

    • Enabler
    • Must have
    • PI15
    • None
    • Obs Mgt & Controls
    • Knowledge of failures in completing tasks and subtasks will greatly increase the ability to diagnose and understand the root causes of failures triggered by compute infrastructure.
    • A new release in SKAMPI with refactored functionality such that end-to-end tests pass consistently (e.g. more than 100 times back to back) or, in case of failure, clearly indicate the cause.

      Evidence of unhappy-path tests in the tmc repo must also be given.
    • 1
    • 1
    • 0
    • Team_CREAM
    • Sprint 5
    • A consistent logging strategy was adopted in CSP.LMC. The long-running command attributes longRunningCommandStatus and longRunningCommandResult were implemented to provide more information about the execution of the Assign/Release/ReleaseAllResources commands.

      https://gitlab.com/ska-telescope/ska-csp-lmc-common/-/merge_requests/69

      https://gitlab.com/ska-telescope/ska-csp-lmc-common/-/merge_requests/77

      Unhappy-path tests, in which exceptions and timeouts are provoked on the CBF/PSS/PST subsystems while the Assign/Release/ReleaseAllResources commands execute, are provided in the CSP.LMC MID repository:

      https://gitlab.com/ska-telescope/ska-csp-lmc-mid/-/merge_requests/55

      The latest release, 0.11.7, which contains these changes, has been introduced into SKAMPI, and the tests have been run 100 times to show that they pass consistently.

      https://gitlab.com/ska-telescope/ska-skampi/-/jobs/2921553614

      The feature was demoed at OMC ART System Demo 15.6.
    • 15.6
    • Stories Completed, Integrated, BDD Testing Passes (no errors), Accepted by FO
    • PI22 - UNCOVERED

    • Team_CREAM

    Description

      Refactor the current CSP Subarray Assign/Release resources functionality so that CSP LMC can handle "compute infrastructure-related" failures on CBF (and/or PSS, PST) that occur while resourcing is in progress (e.g. after the command has returned OK but before the resources have been successfully assigned or released).

      Note that the handling shall be limited to "passive" measures only, e.g. appropriate log messages on the client side that tell an investigator when and why something went wrong during the execution of assign/release on one of the servers.

      Verification shall be done by means of tests on the CSP LMC repo side in which server-side failures can be injected using mocks.
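
The passive handling and mock-based verification described above could look roughly like the following. This is a minimal sketch using only the Python standard library, not the CSP.LMC implementation: the function `assign_resources`, the outcome strings, the logger name and the JSON payload are all hypothetical, and the `mock.Mock` objects merely stand in for the CBF/PSS/PST sub-system proxies.

```python
import logging
from unittest import mock

logger = logging.getLogger("csp_lmc_sketch")  # hypothetical logger name


def assign_resources(subsystems, resources):
    """Forward an assign request to every sub-system and record each outcome.

    Failures are handled passively: they are logged and reported, never retried.
    """
    outcome = {}
    for name, proxy in subsystems.items():
        try:
            proxy.AssignResources(resources)
        except TimeoutError as exc:
            # The sub-system did not answer in time; say which one and why.
            logger.error("AssignResources timed out on %s: %s", name, exc)
            outcome[name] = "TIMEOUT"
        except Exception as exc:
            # Any other server-side failure raised by the sub-system.
            logger.error("AssignResources failed on %s: %r", name, exc)
            outcome[name] = "FAILED"
        else:
            logger.info("AssignResources completed on %s", name)
            outcome[name] = "OK"
    return outcome


def run_unhappy_path_demo():
    """Inject one timeout and one exception using mocks, then run the command."""
    cbf, pss, pst = mock.Mock(), mock.Mock(), mock.Mock()
    pss.AssignResources.side_effect = TimeoutError("no reply from PSS")
    pst.AssignResources.side_effect = RuntimeError("PST unreachable")
    subsystems = {"CBF": cbf, "PSS": pss, "PST": pst}
    # Hypothetical payload; the real command input is not shown in this ticket.
    return assign_resources(subsystems, '{"subarray_id": 1}')


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(run_unhappy_path_demo())
```

Setting `side_effect` on the mocked command is what injects the server-side failure. The per-sub-system outcome and the log lines are the only reaction, which matches the "passive measures only" constraint.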

       

      Also, see below for technical discussion:

      https://docs.google.com/document/d/15LAXGcrQVfUEQp3-1nPpcUaNNxhH8oszA2u8z1nlS3w/edit#

              People

                g.leroux Le Roux, Gerhard [X] (Inactive)
                g.leroux Le Roux, Gerhard [X] (Inactive)

                Feature Progress

                  Story Point Burn-up: (100.00%)

                  Feature Estimate: 1.0

                                Issues   Story Points
                  To Do         0        0.0
                  In Progress   0        0.0
                  Complete      2        5.0
                  Total         2        5.0
