Details

Type: Feature
Priority: Must have
Fix Version/s: PI15
Component/s: COM SDP SW
Labels:
None

ARTs:

Data Processing
Benefit hypothesis:

(see why in the description)

SDP will need to be able to manage long-running processing jobs that will be semi-independent of the overall observatory schedule. This means that it is critical to be able to operate SDP long-term without the need for regular shut-downs for either internal (storage filling up) or external reasons (interface changes).

Acceptance criteria:
Deploy SDP into a shared namespace on the DP testing platform (and/or the new k8s cluster provided by the System Team for SS-94), and document how it would be done in different environments

In particular, be ready to give some support to others in replicating it as needed (e.g. help SKA operations to set up a persistent demonstration system in their operations room). The timescale for this is not yet clear and might have to be clarified as part of PI planning.

Expose operator interfaces (e.g. notebooks?) that allow operators to run processing blocks on SDP

Demonstrate that long-term usage will not cause unmanageable "leaks" - such as configuration database entries, storage usage or Kubernetes entities.

Stretch: Show how we can migrate to new SDP (LMC?) versions without interrupting ongoing processing.
Feature Points:
4
Initial Size:
4
WSJF:
0
Informed By:

USE-24 Access, Inspect, and Analyse Telescope Data Products

USE-22 Operate and Monitor Telescope

USE-3 SDP: Configure and execute visibility receive workflow
Delivered By:

REL-106 SDP v1.0.0
Epic Link:
SDP AA0.5 system
Agile Teams:

Team_ORCA
Due Sprint:
Sprint 5
Story Point Burn-up:
Overdue:
Outcomes:
We have made a persistent stand-alone deployment of the SDP in shared namespaces on the DP cluster (called dp-shared and dp-shared-p).

In order to exercise the deployment and get data on its behaviour, we run an automated test once every hour using a scheduled run of the SDP integration CI pipeline. The test runs the visibility receive processing script only. See the pipeline runs in GitLab.
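One way the data from the hourly runs could be checked for "leaks" is sketched below. It is a minimal illustration only, not the check used by the team: it assumes a snapshot of resource counts (for example configuration database entries or Kubernetes entities) is recorded after each run has finished and cleaned up, so the counts should return to their baseline. The function name `growing_resources`, the resource names and the numbers are all hypothetical.

```python
from typing import Dict, List


def growing_resources(snapshots: List[Dict[str, int]], tolerance: int = 0) -> List[str]:
    """Return the resources whose count grew between the first and last snapshot.

    Each snapshot maps a (hypothetical) resource name to the count observed
    after a test run has finished. Growth larger than `tolerance` is flagged.
    """
    if len(snapshots) < 2:
        return []  # nothing to compare yet
    first, last = snapshots[0], snapshots[-1]
    return sorted(
        name for name in last if last[name] - first.get(name, 0) > tolerance
    )


# Illustrative numbers only, not measurements from the deployment.
history = [
    {"config_db_entries": 10, "k8s_pods": 4},
    {"config_db_entries": 11, "k8s_pods": 4},
    {"config_db_entries": 12, "k8s_pods": 4},
]
suspects = growing_resources(history)
```

With the made-up history above, only `config_db_entries` is flagged, because its count keeps rising while the pod count returns to the same value after every run.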

The configuration of the persistent deployment differs from the "ephemeral" deployments for testing in a number of ways:

- Persistence is enabled for the configuration database, so if the server is restarted then the state of the system is not lost. We have the option of deploying multiple servers for even greater reliability, but so far that has not proved necessary.
- Two subarrays are deployed, one for the automated testing and the other for manual testing by users. If it becomes necessary to support multiple simultaneous manual testers, we can deploy more subarrays (the Helm chart supports an arbitrary number).

More information is available in Confluence in Persistent SDP. We have set up a preliminary page to capture feedback from users and a Slack channel for users to ask for help: #help-sdp.

We have implemented continuous deployment for the SDP integration repository. The CI pipeline is configured to deploy the SDP to the following environments:

- Integration, which is upgraded when a branch is merged to master (namespaces: sdp-integration and sdp-integration-p), and
- Staging, which is upgraded when a release is tagged (namespaces: sdp-staging and sdp-staging-p).

If we follow the integration → staging → production deployment model, we could consider the shared persistent deployment as one of the production environments. The deployment to "production" is not automated at present, so it is done manually.
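The deployment rules described above can be summarised as a small lookup. This is only an illustrative sketch, not the actual CI configuration: the namespace names are taken from the text, while the event labels (`merge_to_master`, `release_tag`, `manual`) and the function names are hypothetical.

```python
from typing import Dict, List

# Namespace pairs as listed in the outcomes above.
ENVIRONMENTS: Dict[str, List[str]] = {
    "integration": ["sdp-integration", "sdp-integration-p"],
    "staging": ["sdp-staging", "sdp-staging-p"],
    "production": ["dp-shared", "dp-shared-p"],
}

# Hypothetical event labels mapped to the environment they upgrade.
TRIGGERS: Dict[str, str] = {
    "merge_to_master": "integration",  # automated by the CI pipeline
    "release_tag": "staging",          # automated by the CI pipeline
    "manual": "production",            # not automated at present
}


def target_namespaces(event: str) -> List[str]:
    """Return the namespaces that would be upgraded for a given event."""
    try:
        environment = TRIGGERS[event]
    except KeyError:
        raise ValueError(f"unknown deployment event: {event!r}") from None
    return ENVIRONMENTS[environment]
```

For example, `target_namespaces("release_tag")` gives the two staging namespaces, and any unrecognised event raises a `ValueError` rather than silently deploying nowhere.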

We have created Jupyter notebooks to explain how to control the SDP via its Tango interface and the ska-sdp CLI. They can be run on the DP cluster using the BinderHub set up by the System Team. The notebooks are in the ska-sdp-notebooks repository and they are documented in the developer portal.
Resolved PI.Sprint:
17.4

Feature Checklist:

Stories Completed, Integrated, Solution Intent Updated, BDD Testing Passes (no errors), Outcomes Reviewed, NFRS met, Demonstrated, Satisfies Acceptance Criteria, Accepted by FO

Demos:
- DP_ART_15.5
Requirement Status:

PI22 - UNCOVERED
Goals_MIRO:
SPO-1775 SPO-1782

Description

Who?

AA0.5 Operators
AIV Engineers

What? (outcomes)

Contribution to SS-94 for SDP being available in a persistent sandbox deployment (i.e. the PI15 "operations sandbox")
Coordinate and contribute to the publishing of two SDP releases:
1. ~~REL-9~~ (Targeting 15.2-3)
  - SDP (or SDP as part of SKAMPI) is deployed to one or more persistent environments in a way that allows long-term maintenance and is configurable to be deployed for either MID or LOW control device naming
2. ~~REL-106~~ (Targeting 15.4)
  1. Operator-oriented interfaces to use and manage the instance (collaboration with ~~SP-2551~~)
  2. A script/notebook is provided along with sufficient documentation (similar to that for the standalone SDP walkthrough) to
    - Start up and shut down SDP in standalone mode and as part of a SKAMPI deployment (USE-1, USE-21)
    - Run the visibility receive workflow to capture and inspect data using a CBF emulator (USE-22, USE-3, USE-24) in standalone mode and as part of a SKAMPI deployment, ideally using the latest release of visibility receive capable of running multiple successive scans (~~SP-2483~~)
Help support exploratory testing (15.5)

Why?

We believe that a persistent sandbox deployment of the SDP system will provide the valuable experience and feedback needed to deliver a robust and resilient system for use at early Array Assemblies, starting with AA0.5.
We also believe that the developer experience and knowledge needed for introducing new workflows or other service components necessary to roll out new SDP capabilities will be greatly improved by developing these against a well-tested deployment of the latest stable SDP system.

References

Related solution goal: SPO-1775
Services ART are planning to roll out a notebook service by ~15.3 (~~SP-2590~~) which should be ready for use with ~~REL-106~~ scripts

Attachments

Issue Links

Child Of

SS-94 MID software released and deployed in the software "operations sandbox"

Implementing

Is delivered by

REL-106 SDP v1.0.0

Discarded

relates to

SP-2590 Jupyter/BinderHub as Service

Done
