Details
-
Enabler
-
Must have
-
None
-
True
-
Data Processing
-
-
-
Intra Program
-
2
-
2
-
0
-
Team_ORCA
-
Sprint 5
-
-
-
-
18.6
-
Stories Completed, Integrated, Solution Intent Updated, BDD Testing Passes (no errors), Outcomes Reviewed, NFRS met, Demonstrated, Satisfies Acceptance Criteria, Accepted by FO
-
-
SOL-G4
Description
The SDP subarray, like all subarray devices, goes through the AssignResources command into the IDLE obsState, and then through Configure to READY.
In the first command, all the resources needed by an observation are specified (e.g., the PB to create, its script and its parameters, etc.), and the command only transitions the subarray to IDLE once the resources have been created. For example, in the case of the vis-receive script, the subarray transitions to IDLE as soon as the vis-receive script launches the underlying receive Helm chart.
In the second command, the subarray transitions to READY directly without further ado (except for checking it is in a valid obsState that allows the transition).
This situation creates a race condition for external SDP users: if the Configure command is issued immediately after AssignResources returns, the resources that have just been created might not be ready yet. Following the example above, the receiver pod might take a while to reach its Ready condition, depending on how long its initContainers take to run, any filesystem latencies, Kubernetes scheduling latencies, etc. When the Configure command arrives, the subarray nevertheless transitions to READY, thus allowing a Scan to start even though the visibility receiver might not be ready to actually receive any data.
To overcome this limitation, different techniques have been developed in different places. For instance:
- After issuing AssignResources during the visibility receive tests, the SDP integration tests wait both for the subarray to go to IDLE and for the receiver k8s pod to achieve its Ready condition (https://gitlab.com/ska-telescope/sdp/ska-sdp-integration/-/blob/master/tests/integration/test_visibility_receive.py#L268-280), all before issuing further Configure and Scan commands.
- More optimistically, the "SKA SDP Vis Receive Low with CBF-Emulator Example" notebook in ska-sdp-notebooks waits for the subarray to go to IDLE, then sleeps for 20 seconds (https://gitlab.com/ska-telescope/sdp/ska-sdp-notebooks/-/blob/main/src/ska-sdp-vis-receive-low-example.ipynb?plain=1#L629-650), after which it starts issuing Configure and Scan commands.
- In old experiments with SKAMPI and the emulated CBF subarray device we used a similar trick, whereby after issuing the AssignResources command via the TMC we would have to wait for some amount of time before issuing a Configure and Scan command (https://confluence.skatelescope.org/display/SE/SPO-944+implementation+notes#SPO944implementationnotes-TorunitonPSILow, expand "OET script" and look for "HACK HACK HACK").
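For illustration, the workarounds above boil down to either a fixed sleep or an explicit readiness poll on the client side. The following is a minimal sketch of the polling variant only; `wait_until` and its parameters are hypothetical names, not part of any SDP package, and the readiness check itself (e.g. querying the receiver pod) is left to the caller.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 0.5,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical client-side helper: returns True if the condition was met
    in time, False otherwise. Unlike a fixed `time.sleep(20)`, it returns as
    soon as the condition holds and reports failure instead of carrying on.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A client would call it as `wait_until(receiver_is_ready, timeout=120)` between AssignResources and Configure, where `receiver_is_ready` is whatever check the client has available. This is exactly the kind of logic that users should not have to write themselves.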
To avoid these horrible hacks, after receiving the Configure command the subarray should check that the visibility receiver is indeed ready before entering the READY obsState. If it is not ready immediately, the subarray can enter the transitional CONFIGURING obsState and subsequently go to READY once that condition is met.
In order to do this, the processing script needs to monitor the status of the pods (via the Helm deployer) and report their status in the processing block state. The subarray can use this information to make the appropriate state transitions (cf. the EMPTY-RESOURCING-IDLE transition in response to the AssignResources command).
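The intended behaviour could be sketched as follows. This is not the actual subarray implementation: the `deployments_ready` key in the processing block state and the two function names are assumptions made for illustration, and only the obsStates relevant here are modelled.

```python
from enum import Enum


class ObsState(Enum):
    IDLE = "IDLE"
    CONFIGURING = "CONFIGURING"
    READY = "READY"


def on_configure(current: ObsState, pb_state: dict) -> ObsState:
    """ObsState to enter when Configure is received.

    `pb_state` stands in for the processing block state; the
    `deployments_ready` flag is a hypothetical field that the processing
    script would set once all of its pods are ready.
    """
    if current not in (ObsState.IDLE, ObsState.READY):
        raise ValueError(f"Configure not allowed in obsState {current.name}")
    if pb_state.get("deployments_ready", False):
        return ObsState.READY
    return ObsState.CONFIGURING


def on_pb_state_change(current: ObsState, pb_state: dict) -> ObsState:
    """ObsState after the processing block state has been updated.

    Only a subarray waiting in CONFIGURING reacts: it moves to READY once
    the deployments are reported ready. Other states are left unchanged.
    """
    if current is ObsState.CONFIGURING and pb_state.get("deployments_ready", False):
        return ObsState.READY
    return current
```

Keeping the decision in pure functions of the current obsState and the processing block state mirrors the way the subarray already reacts to state changes during AssignResources, and makes the logic easy to unit test.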
Who?
- SDP developers.
- AIV engineers.
- Commissioning scientists.
What?
- SDP Subarrays ensure that the resources created as part of AssignResources are ready by the time the READY obsState is reached.
- A new test checks the behavior by starting a script with a configurable minimum time-to-ready delay, checking that the transition to READY obsState takes no less than said time.
- Workarounds are removed from known, relevant locations (SDP integration tests and SDP notebook mentioned above).
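The new test mentioned above might take roughly the following shape. This is only a sketch against a stand-in: `FakeClock`, `FakeSubarray` and `measure_time_to_ready` are hypothetical names, and a simulated clock replaces the real subarray device and script so that the example runs on its own.

```python
class FakeClock:
    """Simulated clock so the example runs instantly and deterministically."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class FakeSubarray:
    """Stand-in for an SDP subarray whose script needs `time_to_ready`
    seconds after AssignResources before its resources are ready."""

    def __init__(self, clock: FakeClock, time_to_ready: float) -> None:
        self._clock = clock
        self._time_to_ready = time_to_ready
        self._assigned_at = None
        self._configured = False

    def assign_resources(self) -> None:
        self._assigned_at = self._clock()

    def configure(self) -> None:
        self._configured = True

    @property
    def obs_state(self) -> str:
        if self._assigned_at is None:
            return "EMPTY"
        if not self._configured:
            return "IDLE"
        elapsed = self._clock() - self._assigned_at
        return "READY" if elapsed >= self._time_to_ready else "CONFIGURING"


def measure_time_to_ready(subarray, clock, step=1.0, limit=600.0) -> float:
    """Issue AssignResources and Configure back to back, then return how
    long (in simulated seconds) the subarray took to reach READY."""
    subarray.assign_resources()
    start = clock()
    subarray.configure()
    while subarray.obs_state != "READY":
        if clock() - start > limit:
            raise TimeoutError("subarray never reached READY")
        clock.advance(step)
    return clock() - start
```

The test itself would then assert that `measure_time_to_ready(subarray, clock)` is no less than the configured delay. Against a real deployment, the simulated clock would be replaced by wall-clock time and the stand-in by the actual subarray device.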
Why?
- Because otherwise users have to work around this oversight and figure out by themselves whether SDP is really ready to perform a scan.
Attachments
Issue Links
Is delivered by:
- REL-456 SKA-MID.23.3-rc1 (Discarded)
- REL-654 SDP Scripting Library v0.5.0 (Released)
- REL-655 SDP Local Monitoring and Control v0.22.0 (Released)
- REL-656 SDP Helm Deployer v0.12.0 (Released)
- REL-664 SDP Scripting Library v0.5.2 (Released)
- REL-665 SDP Helm Deployer v0.12.1 (Released)
- REL-670 SDP 0.16.0 (Released)
- REL-688 SDP Helm Deployer v0.12.2 (Released)
- mentioned on