Details

Type: Enabler
Priority: Must have
Fix Version/s: PI12
Component/s: None
Labels: None
ARTs: Data Processing, Services, Obs Mgt & Controls
Benefit hypothesis:

It takes less time for teams to integrate Features in SKAMPI

SKAMPI Tests work reliably as integration tests

SKAMPI infrastructure status is decoupled from SKAMPI software quality

Team morale is increased, as they are not hampered by SKAMPI

Teams are shown that we care about software quality above pure output
Acceptance criteria:

In PI12 we will focus on rebuilding the SKAMPI integration from the ground up. This will start with only a subset of components (see description) and proceed in inverse dependency order. At any stage we will:

Make available a SKAMPI release that satisfies these criteria:

SKAMPI deployment is predictable, testable and repeatable

The CI pipeline is GREEN, and tests can be reliably executed to identify bugs and errors

Tests are meaningful and their results can correctly predict system behaviour

Stress tests are executed correctly

The components to be re-integrated are:

platform

tango-base

test harness

central node

Logging database and centralised logging capability

(uncommitted)

skuid

EDA

landing page

For all components re-integrated in the integration environment:

use best practices for container images, tools, and test processes (and update/disseminate docs when the docs aren't current)

re-align with SKA CI/CD policies

artefacts are released and published in the CAR following the Release Process

Code Ownership is correctly defined within SKAMPI, assigning clear responsibility for different areas of the integration repository

Note that this may include only a subset of the subsystems and of the test base.

The end goal of this capability is to reach a state where, at both component and integrated SKAMPI level, the following are in place:

Common deployment process that is flexible and scales with the environment and partial deployment of Skampi

Common pipeline machinery, including GitLab CI steps and Makefile targets

Common machinery for runtime applications (e.g. device servers), including command line options, environment variable handling/standardisation, memory/CPU/resource handling, and built-in health checkers, to support robust deployment and management

Testing framework that is flexible, with easy to understand output to support diagnosing problems

Testing that equally focuses on unhappy as well as happy paths

A suite of tests for SKAMPI software that must pass in order for any SKAMPI version to be accepted.

A suite of tests for SKAMPI infrastructure (pipelines, virtual machines, attached storage…) that indicates when the underlying infrastructure is failing.

A demonstration that all included tests run reliably and are not flaky. Flaky tests should be isolated for investigation.

A dashboard, accessible to everyone, that shows the status of SKAMPI infrastructure and SKAMPI tests for a range of versions (at minimum, from the version of the last PI demo up to the current ones; at most, those from the last 6 months).
Feature Points: 13
Epic Link: Evolutionary Prototype
Feature Point Burn-up:

Overdue:
Requirement Status: PI24 - UNCOVERED

Description

SKAMPI is brittle: it easily ends up in a state in which tests do not pass, and in which reverting to a past configuration still does not make them pass. We need to fix this before more development on SKAMPI features can happen.

To do that, we need to: a) provide tests that are robust and that prove whether SKAMPI is working or not; b) fix SKAMPI so that the tests can pass, or temporarily fix the test, annotating it with an SKB to make sure that the underlying cause is fixed; c) provide a well-known version of SKAMPI to work against.

Many initiatives have been promoted to try to address this issue programmatically. These relate to improved processes for testing and bug fixing, better release management for better coordination, and refactoring of some core aspects of the control system, but the problem persists. After careful evaluation, the obvious choice seems to be to refocus on the internals of the SKAMPI integration and make sure that it conforms to the quality standards needed for smoother integration and testing.

To this end, it has been decided that in PI12 a task team, composed of members from different teams, shall be coordinated to rebuild the SKAMPI integration to a level where it is reliable and controllable. The team will proceed by rebuilding the integration from the ground up and verifying some core properties. Its activity will focus on the integration of some key components:

Test harness: skallop, organisation of test code and .features
Integration platform: deployment and monitoring of the k8s cluster and additional services
TMC (common, TANGO DB, central node, EDA), skuid - possibly starting with a version that only deploys the central node.

INTEGRATION ORDER:

platform (what? k8s, Elastic, MariaDB, TimescaleDB)
tango-base (tango-cpp, tango-db, tango-dsconfig)
TMC (central node)
skuid
Verifying that logging and transaction ID are correctly implemented in the integrated components
EDA
landing page
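
Integrating in inverse dependency order means each component is brought in only after everything it depends on. A minimal sketch of how such an order could be derived with Python's standard `graphlib` module is shown here. The dependency map is an assumption invented for this example (the ticket gives only the resulting order, not the underlying dependencies), and `integration_order` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: component -> components it depends on.
# These edges are illustrative assumptions, not taken from SKAMPI.
DEPENDS_ON = {
    "platform": set(),
    "tango-base": {"platform"},
    "central node": {"tango-base"},
    "skuid": {"platform"},
    "EDA": {"tango-base"},
    "landing page": {"central node"},
}


def integration_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return components so that every dependency precedes its dependants.

    TopologicalSorter raises graphlib.CycleError if the map contains a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


order = integration_order(DEPENDS_ON)

if __name__ == "__main__":
    for position, component in enumerate(order, start=1):
        print(position, component)
```

Several valid orders can exist for the same map, so the list above is one acceptable sequence rather than the only one; what the sort guarantees is that no component appears before something it depends on.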

Other activities related to ~~SS-82~~ can proceed in parallel to this integration effort in order to improve future integrations of components that are not touched in this initial effort. In particular:

ruthless bug fixing activity
skallop code refactoring and tests standardisation
Taranta - resource usage and deployability
Archiver - refactoring
SDP - updates to the component level testing to be more robust to failure modes
TMC - updates to the component level testing to be more robust to failure modes
OET - updates to the component level testing to be more robust to failure modes
Work already planned in relation to the refactoring of the Control System guidelines and their implementation

Please note that training of users so that they can make better use of SKAMPI is not part of the scope of this Enabler/Capability.

Attachments

Issue Links

depends on

SS-80 All artefacts are released and uploaded to the CAR

Done

Parent Of

SP-1942 Define and organise test sets, test plans and aggregation of results for SKA MID in XRay

Implementing

SP-1828 TMC deployed and tested autonomously.

Done

SP-1845 Further elaboration of acceptance tests for standalone SDP

Done

SP-1884 Ensure all SDP components can be (re)deployed or be restarted independently and will auto-heal

Done

SP-1887 Contribute to revised integration and acceptance testing of SDP in SKAMPI

Done

SP-1953 Recoverable Tango Devices in TMC

Done

SP-1983 OET robustness improvements in support of SS-82

Done

SP-1951 Fast and agile skampi pipeline

Discarded

SP-1961 TMC component testing

Discarded

relates to

SP-1848 Improve SKALLOP to be a well documented and tested product

Releasing

SP-1941 Establish / refine process / reporting tool(s) for determining the current status of functionality for each product

Analyzing

links to

Google Docs: SS-82 Enabler/Capability definition discussion

mentioned on

Commit - Merge branch 'ss-82' into 'master'

Commit - SS-82: Update unit test test_tmc_state

Commit - SS-82 Address review comments

Commit - SS-82 Code cleanup

Commit - SS-82 model improvements

Commit - SS-82 re-structure folders

Merge request - AT2-938 Improve OET BDD tests

Merge request - AT-19 added documentation on Testing SKAMPI

Merge request - Draft: AT-59 run test a lot of times

Merge request - SAR-315 Added chart template unit tests for the ska-tango-base charts

Merge request - SS-82

Merge request - ST-854: Migrate tango examples to poetry

Merge request - YAN-688 Add new skamid-emulated umbrella chart

(5 Parent Of, 2 relates to, 1 links to, 60 mentioned in, 13 mentioned on, 1 Wiki Page)

Features

Key	Summary	Status	Assignee	ARTs	Agile Teams	FPs	FixVersion	Due Sprint
SP-1941	Establish / refine process / reporting tool(s) for determining the current status of functionality for each product	Analyzing	Bartolini, Marco	Services	Team_ATLAS	2	PI14
~~SP-1951~~	Fast and agile skampi pipeline	Discarded	Harding, Piers	Services		2	PI13
~~SP-1884~~	Ensure all SDP components can be (re)deployed or be restarted independently and will auto-heal	Done	Wortmann, Peter	Data Processing	Team_ORCA	2	PI12	Sprint 2
~~SP-1953~~	Recoverable Tango Devices in TMC	Done	Avison, Adam	Obs Mgt & Controls	Team_NCRA	2	PI12	Sprint 3
~~SP-1887~~	Contribute to revised integration and acceptance testing of SDP in SKAMPI	Done	Wortmann, Peter	Data Processing	Team_ORCA	3	PI12	Sprint 5
~~SP-1828~~	TMC deployed and tested autonomously.	Done	Vrcic, Sonja	Obs Mgt & Controls	Team_NCRA	2	PI12	Sprint 5
SP-1848	Improve SKALLOP to be a well documented and tested product	Releasing	Le Roux, Gerhard [X] (Inactive)	Services	Team_VULCAN	2	PI12	Sprint 5
~~SP-1845~~	Further elaboration of acceptance tests for standalone SDP	Done	Fenech, Danielle	Data Processing	Team_ORCA	3	PI12	Sprint 5
~~SP-1983~~	OET robustness improvements in support of SS-82	Done	Bridger, Alan	Obs Mgt & Controls	Team_BUTTONS	2	PI12	Sprint 5
SP-1942	Define and organise test sets, test plans and aggregation of results for SKA MID in XRay	Implementing	Le Roux, Gerhard [X] (Inactive)	Services		1	PI12, PI14, PI15, PI16, PI17
~~SP-1961~~	TMC component testing	Discarded	Vrcic, Sonja	Obs Mgt & Controls			PI12

Structure

Activity

People

Assignee: Bartolini, Marco

Reporter: Santander-Vela, Juande

Votes: 0

Watchers: 1

Feature Progress

Story Point Burn-up: (98.96%)

Feature Estimate: 13.0

	Issues	Story Points
To Do	0	0.0
In Progress	1	1.0
Complete	38	95.5
Total	39	96.5

Capability Progress

Feature Point Burn-up: (93.33%)

Capability Estimate: 13

	Count	Feature Points
Todo	0	0
In Progress	1	2
Done	38	14
Total	39	15
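
The two burn-up percentages are consistent with completed points divided by total points in the tables above. A small sketch of that arithmetic follows; the helper name `burn_up_percent` is an assumption for this example, and the formula is inferred from the figures rather than taken from the tool's documentation.

```python
def burn_up_percent(done: float, total: float) -> float:
    """Return completed work as a percentage of the total, rounded to 2 places."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * done / total, 2)


# Story points: 95.5 complete out of 96.5 in total.
story_points = burn_up_percent(95.5, 96.5)
# Feature points: 14 done out of 15 in total.
feature_points = burn_up_percent(14, 15)

if __name__ == "__main__":
    print(story_points, feature_points)
```

Both percentages are taken against the tabulated totals (96.5 and 15), not against the stated estimates of 13.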

Dates

Created: 12/Aug/21 11:53 AM

Updated: 21/May/24 12:54 AM

Resolved: 21/May/24 12:54 AM

Make sure SKAMPI is stable for developers and users