Details

Type: Enabler
Priority: Must have
Fix Version/s: PI18
Component/s: COM TMC SW
Labels:
- Team_SAHYADRI

ARTs:

Obs Mgt & Controls
Benefit hypothesis:

The TMC is the overall telescope control and monitoring system. It is important that it remains "live" and able to cope with the absence or presence of the other systems, allowing the user to observe the state of the whole system.

Acceptance criteria:
When starting the system the TMC will start cleanly and be able to report the absence of any key subsystem (dishes, CSP, SDP, MCCS) if that subsystem does not start (either deliberately or because of failure).

The TMC should cleanly show that it has not been able to read attributes from the missing subsystem and reflect this in any aggregated report.

The TMC should reject attempts to command the missing subsystem.

STRETCH - the TMC should be able to detect the loss of a key subsystem that was present and then act as in the points above.

STRETCH - the TMC should be able to detect the successful starting of a key subsystem that was not present and then act normally.

Clarification for point 3 above: There is a subtlety for commands such as "assign resources" and "configure". There may be reasons to take individual subsystems through the Observing State Machine without all being present - or even if all are present. A particular example is to be able to instruct the SDP to run a pipeline on data it already has for re-processing. In this example only SDP resources are assigned to a subarray and only the SDP is commanded. The JSON requesting the resources and configurations will only contain an SDP structure. In this example the TMC should not be instructing Dish/MCCS and CSP because no resources are requested of them, and no configuration is requested of them. This subtlety may need more discussion during development.
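A minimal sketch of the clarification above, assuming the request JSON carries one top-level structure per subsystem. The key names, the `KEY_TO_SUBSYSTEM` mapping and both function names are hypothetical illustrations, not the TMC's actual schema or API. It shows a command being checked only against the subsystems the JSON actually addresses, so an SDP-only request is not blocked by an absent CSP or Dish.

```python
from __future__ import annotations

import json

# Hypothetical mapping from top-level JSON keys to subsystem names (assumption).
KEY_TO_SUBSYSTEM = {"sdp": "SDP", "csp": "CSP", "dish": "Dish", "mccs": "MCCS"}


def targeted_subsystems(request_json: str) -> set[str]:
    """Return the subsystems that the request contains a structure for."""
    request = json.loads(request_json)
    return {name for key, name in KEY_TO_SUBSYSTEM.items() if key in request}


def check_command(request_json: str, available: set[str]) -> tuple[bool, str]:
    """Accept the command only if every targeted subsystem is available."""
    targets = targeted_subsystems(request_json)
    missing = sorted(targets - set(available))
    if missing:
        return False, "rejected: unavailable subsystem(s): " + ", ".join(missing)
    return True, "accepted for: " + ", ".join(sorted(targets))


# SDP-only re-processing request while only the SDP is present.
sdp_only = '{"sdp": {"processing_blocks": []}}'
sdp_result = check_command(sdp_only, {"SDP"})

# A request that also addresses the (absent) CSP is rejected.
mixed = '{"sdp": {}, "csp": {}}'
mixed_result = check_command(mixed, {"SDP"})
```

Whether a request that addresses both present and absent subsystems should be rejected as a whole, as done here, or partially executed is one of the points the ticket leaves open for discussion.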

Note: this can be tested in the integration repo.
Feature Points:
2
Initial Size:
2
WSJF:
0
Delivered By:

REL-512 SKA-LOW.23.3-rc1

REL-456 SKA-MID.23.3-rc1

REL-508 TMC Mid v0.11.0

REL-509 TMC LOW v0.6.0
Epic Link:
AA0.5 Observation Execution
Agile Teams:

Team_SAHYADRI
Due Sprint:
Sprint 3
Story Point Burn-up:
Overdue:
Outcomes:
Design:
The design for availability reporting was discussed with the FO and is documented on the Confluence page:
https://confluence.skatelescope.org/display/SWSI/Availability+Attribute
It covers availability reporting for the CSP and SDP subarray devices by their respective leaf nodes, and aggregation of those reports by the TMC subarray node.
The Dish leaf nodes are not considered in the subarray availability reporting (this may need further consideration).
Implementation:

TMC subarray leaf nodes detect the unavailability of CSP and SDP subarray devices. This is reported to the subarray node, which in turn reports an aggregated 'availability' value.

TMC master leaf nodes detect the unavailability of CSP and SDP master devices. This is reported to the central node, which in turn reports it on its 'availability' attribute.

TMC blocks command execution when the SDP or CSP system is unavailable.
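The aggregation described above can be sketched as follows. This is an illustration under stated assumptions, not the merged implementation: the function name, the dictionary keys and the choice to treat an unreadable value (`None`) as unavailable are all hypothetical.

```python
from typing import Dict, Optional


def aggregate_availability(reports: Dict[str, Optional[bool]]) -> bool:
    """Aggregate leaf-node reports into a single 'availability' value.

    Each value is True (available), False (unavailable) or None (the leaf
    node could not read the device). Assumption: anything other than True
    makes the aggregate False, and so does having no reports at all.
    """
    return bool(reports) and all(value is True for value in reports.values())


# Both CSP and SDP subarray devices reported as available.
all_up = aggregate_availability({"csp_subarray": True, "sdp_subarray": True})

# The SDP subarray device could not be read by its leaf node.
sdp_unknown = aggregate_availability({"csp_subarray": True, "sdp_subarray": None})
```

Dish leaf nodes are left out of the example inputs because the design note states they are not part of the subarray availability reporting.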

The implementation is verified on the individual TMC node repos.

The related merged MRs are listed below:
CentralNode: https://gitlab.com/ska-telescope/ska-tmc/ska-tmc-centralnode/-/merge_requests/103
SubarrayNode: https://gitlab.com/ska-telescope/ska-tmc/ska-tmc-subarraynode/-/merge_requests/102
https://gitlab.com/ska-telescope/ska-tmc/ska-tmc-subarraynode/-/merge_requests/104
TMC leaf nodes: https://gitlab.com/ska-telescope/ska-tmc/ska-tmc-cspleafnodes/-/merge_requests/59
https://gitlab.com/ska-telescope/ska-tmc/ska-tmc-sdpleafnodes/-/merge_requests/338
https://gitlab.com/ska-telescope/ska-tmc/ska-tmc-dishleafnode/-/merge_requests/39

A System Demo was provided showing the TMC reporting the unavailability of the CSP and SDP subsystems.

Integration Updates/Issues (tmc integration repo):
It is observed that, on startup, the Central node misses the subarray node's availability events and hence does not report the subarray availability correctly.
During subsequent operation, if the csp/sdp devices become unavailable, it is reflected correctly on the subarray node's aggregated 'availability'. In this case, the Central node also receives the events and reports the subarray 'availability' correctly.
Due to the issue mentioned above, the integration of the implemented functionality is not complete.
This integration activity needs to continue in the next PI, so the feature may be carried forward or cloned, or a new feature may be created to take the work to closure.
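The startup behaviour reported above resembles a general publish/subscribe pitfall: a value published before a subscriber registers is never delivered to it. The sketch below illustrates one common guard, reading the current value once immediately after subscribing. It is not a diagnosis of the actual root cause or the fix adopted by the team, and the stub classes are hypothetical stand-ins written in plain Python, not Tango devices.

```python
from typing import Callable, List, Optional


class SubarrayNodeStub:
    """Hypothetical stand-in for a node that publishes 'availability'."""

    def __init__(self) -> None:
        self.availability = False
        self._callbacks: List[Callable[[bool], None]] = []

    def subscribe(self, callback: Callable[[bool], None]) -> None:
        self._callbacks.append(callback)

    def set_availability(self, value: bool) -> None:
        # Store the value, then notify whoever is subscribed right now.
        self.availability = value
        for callback in self._callbacks:
            callback(value)


class CentralNodeStub:
    """Hypothetical stand-in for a node that mirrors the subarray value."""

    def __init__(self) -> None:
        self.subarray_availability: Optional[bool] = None  # None = not yet known

    def on_event(self, value: bool) -> None:
        self.subarray_availability = value

    def connect(self, subarray: SubarrayNodeStub, read_initial: bool = True) -> None:
        subarray.subscribe(self.on_event)
        if read_initial:
            # Seed the state so events sent before subscription are not lost.
            self.subarray_availability = subarray.availability


subarray = SubarrayNodeStub()
subarray.set_availability(True)  # published before any subscriber exists

late = CentralNodeStub()
late.connect(subarray, read_initial=False)  # relies on events only

seeded = CentralNodeStub()
seeded.connect(subarray)  # subscribes, then reads the current value
```

After these calls, `late.subarray_availability` is still `None` while `seeded.subarray_availability` is `True`. Later changes reach both stubs, which matches the observation that availability is reported correctly during subsequent operation.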
Resolved PI.Sprint:
19.6

Feature Checklist:

Stories Completed, Integrated, BDD Testing Passes (no errors), Outcomes Reviewed, Demonstrated, Satisfies Acceptance Criteria

Demos:
- OMC_ART_SystemDemo_5
Requirement Status:

PI24 - UNCOVERED
Labels_MIRO:
Team_SAHYADRI
Goals_MIRO:
OMC-G1 SOL-G4

Description

The TMC should be able to handle the absence of key subsystems (CSP, SDP, MCCS, Dish,...) in a graceful manner.

If a key subsystem is not present then the TMC should report this to the user and be able to show the absence cleanly in a dashboard.
When a subsystem is not present then the TMC should handle the failure to obtain key attributes gracefully, e.g. attributes that feed into an aggregated status.
If the TMC is asked (by the OET or by a user) to issue commands to an absent subsystem it should handle this cleanly and report the problem to the user (but note the "clarification" in the acceptance criteria, which may need further discussion).
The TMC should be able to detect that a key subsystem has disappeared while the whole system is running, and then it should act as above. (stretch)
The TMC should be able to detect the appearance of a key subsystem and absorb it cleanly into the system. (stretch)
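The two stretch goals amount to comparing successive snapshots of subsystem presence. A minimal sketch follows, assuming presence is available as a name-to-boolean mapping. The function name and the "lost"/"appeared" labels are hypothetical and are not taken from the TMC code.

```python
from typing import Dict, List, Tuple


def detect_transitions(
    previous: Dict[str, bool], current: Dict[str, bool]
) -> List[Tuple[str, str]]:
    """List the subsystems whose presence changed between two snapshots.

    Only subsystems named in `current` are examined. A subsystem missing
    from `previous` is treated as having been absent (assumption).
    """
    events: List[Tuple[str, str]] = []
    for name in sorted(current):
        was_present = previous.get(name, False)
        is_present = current[name]
        if was_present and not is_present:
            events.append((name, "lost"))
        elif is_present and not was_present:
            events.append((name, "appeared"))
    return events


# CSP disappears while running; SDP starts after having been absent.
changes = detect_transitions(
    {"CSP": True, "SDP": False, "MCCS": True},
    {"CSP": False, "SDP": True, "MCCS": True},
)
```

A "lost" entry would trigger the reporting and command-rejection behaviour described earlier in the list, and an "appeared" entry would return that subsystem to normal operation.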

Note: testing these scenarios can be done in the integration repo only.

Attachments

Issue Links

Is delivered by

REL-456 SKA-MID.23.3-rc1

Discarded

REL-508 TMC Mid v0.11.0

Discarded

REL-509 TMC LOW v0.6.0

Discarded

REL-512 SKA-LOW.23.3-rc1

Discarded

mentioned on

Merge request - Draft: SAH-1335 : Callbacks related changes

People

Assignee:: Le Roux, Gerhard [X] (Inactive)

Reporter:: Bridger, Alan

Votes:: 0

Watchers:: 2

Feature Progress

Story Point Burn-up: (100.00%)

Feature Estimate: 2.0

	Issues	Story Points
To Do	0	0.0
In Progress	0	0.0
Complete	8	37.0
Total	8	37.0

Dates

Created:: 28/Feb/23 12:02 PM

Updated:: 07/Aug/24 1:42 PM

Resolved:: 29/Aug/23 12:59 PM

TMC should be robust against the absence of subsystems