  1. SAFe Program
  2. SP-636

Measure performance for a CSP.LMC-like deployment architecture


Details

    • Spike
    • Not Assigned
    • None
    • None
    • Obs Mgt & Controls
    • Detect the limits of performance and, if possible, identify whether the source of the performance issues is the current deployment architecture. If so, the issues can be addressed in the early stages of development, and if they cannot be fixed within the current design, the SKA project can pivot to more suitable design choices.
    • Primary acceptance criterion will be to "Get a baseline metric of to what extent we can scale on the integration k8s cluster, using trivial devices". This should, however, closely replicate the way CSP.LMC currently proposes to deploy (in terms of the expected hierarchy of devices, containers, Pods etc.), although it is not required to use actual CSP code.

      Synchronous commands replaced with asynchronous commands where required (see feature description).

      Source of the performance issues identified. This is a nice-to-have; otherwise, document the level up to which no performance degradation was identified.

      A plan for mitigation / resolution is proposed if the analysis clearly reveals what the issues are; otherwise, the next steps for further work are identified.
    • 2
    • 7.5
    • Sprint 5
    • 16.3

    Description

      We propose a short Spike, timeboxed to one Sprint, to look at the measurement possibilities and the as-is performance. Further work could be covered later (a cloned Feature, SP-712, has been created to capture all the initial descriptions).

      Serious performance issues have been detected when a realistic number of CSP sub-arrays and other TANGO Servers and Devices are instantiated. The cause has not been identified; it may be inherent to the pyTango implementation, or caused by sub-optimal container configuration, blocking due to extensive use of forwarded attributes, or the use of synchronous (as opposed to asynchronous) commands.

       

      In this Spike the team can baseline the current performance limits under a deployment architecture that replicates, as closely as possible, the number and hierarchy of Pods, Containers and Tango Devices expected in CSP.LMC, and document the results. If possible, extend and scale until the performance degrades seriously or devices crash. This should be run on the Engage cluster.
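A minimal sketch of how such a baseline could be recorded, assuming a hypothetical `probe(n)` callable that stands in for one command round trip against a deployment of `n` trivial devices. None of the names below come from CSP.LMC or Tango, and the degradation rule (median latency exceeding a multiple of the first measurement) is an illustrative assumption, not an agreed criterion.

```python
import statistics
import time


def measure_latency(call, repeats=5, clock=time.perf_counter):
    """Return the median duration of `call()` over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = clock()
        call()
        samples.append(clock() - start)
    return statistics.median(samples)


def find_scaling_limit(probe, device_counts, degradation_factor=3.0,
                       clock=time.perf_counter):
    """Step through increasing device counts until latency degrades.

    `probe(n)` is a hypothetical stand-in for one command round trip
    against a deployment with `n` devices. Degradation is declared when
    the median latency exceeds `degradation_factor` times the baseline
    measured at the first device count.
    Returns (results, limit); limit is the first degraded count or None.
    """
    results = []
    baseline = None
    for n in device_counts:
        latency = measure_latency(lambda: probe(n), clock=clock)
        results.append((n, latency))
        if baseline is None:
            baseline = latency
        elif latency > degradation_factor * baseline:
            return results, n
    return results, None


def simulated_probe(n):
    # Assumption for illustration only: latency grows linearly with n.
    time.sleep(0.001 * n)


if __name__ == "__main__":
    results, limit = find_scaling_limit(simulated_probe, [1, 2, 4, 8, 16])
    for n, latency in results:
        print(f"{n:3d} devices: {latency * 1000:.1f} ms")
    print("first degraded device count:", limit)
```

In a real run, `simulated_probe` would be replaced by a call into the deployed devices, and the recorded `(count, latency)` pairs would form the documented baseline.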

       

      Some of the work described below may not take place in this Spike but could be part of the follow-up Feature:

      These are some of the options that will be explored: 

      • Where appropriate, replace synchronous with asynchronous commands (in particular when command completion depends on a large number of other devices or involves parsing large JSON objects).
      • Experiment using CSP.LMC and Mid.CBF images that instantiate a large number of TANGO Servers/Devices.
      • Experiment with different container configurations.
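To illustrate why the first option matters when a command fans out to many subordinate devices, here is a small simulation using only Python's standard library. It does not use the Tango API; `device_command` is a hypothetical stand-in for a subordinate device that takes a fixed time to complete a command.

```python
import asyncio
import time


async def device_command(device_id, duration=0.02):
    """Hypothetical stand-in for one subordinate device running a command."""
    await asyncio.sleep(duration)
    return device_id


async def run_sequential(n_devices, duration=0.02):
    # Synchronous style: the caller waits for each device in turn,
    # so total time grows roughly as n_devices * duration.
    return [await device_command(i, duration) for i in range(n_devices)]


async def run_concurrent(n_devices, duration=0.02):
    # Asynchronous style: all commands are issued first and the replies
    # are collected afterwards, so total time stays close to `duration`.
    return await asyncio.gather(
        *(device_command(i, duration) for i in range(n_devices))
    )


def timed(coro_fn, *args):
    """Run a coroutine function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn(*args))
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = timed(run_sequential, 10)
    _, t_con = timed(run_concurrent, 10)
    print(f"sequential: {t_seq:.3f} s, concurrent: {t_con:.3f} s")
```

The simulation only captures waiting time; it says nothing about pyTango internals, forwarded attributes, or container configuration, which would have to be measured on the actual deployment.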

       

              People

                v.mohile Mohile, Vivek
                s.vrcic Vrcic, Sonja

                Feature Progress

                  Story Point Burn-up: (0%)

                  Feature Estimate: 2.0

                                Issues  Story Points
                  To Do         0       0.0
                  In Progress   0       0.0
                  Complete      0       0.0
                  Total         0       0.0
