# Startup integration test
- Sequence of steps to start up CSP (https://skaoffice.jamacloud.com/perspective.req#/testCases/1085128?projectId=328)
- Check what to expect from the commands available via the `csp_controller`

References:
- [LOW.CSP LMC Documentation](https://developer.skatelescope.org/projects/ska-csp-lmc-low/en/latest/lmc/low_csp_lmc.html)
- [LOW.CSP LMC Tango Clients Examples](https://developer.skatelescope.org/projects/ska-csp-lmc-low/en/latest/example/example.html)
- [CSP LMC commands for AA05](https://confluence.skatelescope.org/display/SE/CSP+LMC+commands+for+AA05)

Visual inspection:

The notebook will interrogate device states and report attribute values as part of the verification output.    
For visual inspection of device attributes, the Taranta interface is used.    
You can access the interface using the links below. Please note the KUBE_NAMESPACE parameter as defined in the "Tango config" section of this notebook.    
- without HW: http://k8s.clp.skao.int/ska-low-csp-baseline-no-hw/taranta/devices/low-csp/
- with HW: http://k8s.clp.skao.int/ska-low-csp-baseline/taranta/devices/low-csp/

### Prerequisites

- All necessary equipment is installed and verified
- A network is available and all equipment/systems are powered
- The P4 switch is configured to control CBF
- LOW CSP has been deployed to the k8s cluster

### Imports

In [10]:
import json
import os
from contextlib import suppress
from time import sleep, time

import tango
from ska_control_model import AdminMode, ObsState

### Helper functions

In [32]:
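def wait_until(
    predicate: callable, message_on_fail: str = None, timeout: int = 300, poll_frequency: int = 2
) -> object:
    """
    Poll a predicate until it returns a truthy value.

    :param predicate: Zero-argument callable to evaluate
    :param message_on_fail: Text included in the exception on time-out
    :param timeout: Time-out period in seconds
    :param poll_frequency: Seconds to sleep between evaluations
    :return: the first truthy value returned by the predicate
    :raises TimeoutError: if no truthy value is seen before timing out
    """
    start = time()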
    while True:
        try:
            return_val = predicate()
            if return_val:
                return return_val
        except (IndexError, TypeError):
            sleep(0.1)
        if time() - start > timeout:
            raise TimeoutError(f"Timeout occurred: {message_on_fail}")
        sleep(poll_frequency)


def wait_for_attribute_value(
    device: tango.DeviceProxy,
    attribute: str,
    value=True,
    failure_message: str = "Timed out waiting for attribute value",
    max_duration: int = 120,
) -> None:
    """
    Wait until an attribute has a certain value

    :param device: Tango device proxy with the attribute to check
    :param attribute: The name of the attribute
    :param value: Expected value (defaults to True)
    :param failure_message: Message for the exception on failure.
    Defaults to "Timed out waiting for attribute value".
    A note about duration is appended.
    :param max_duration: Approximate time-out period (in reality
    it could be longer due to delays waiting for each attribute read)
    :raises RuntimeError: if expected value not seen before timing out
    """
    sleep_per_loop = 2
    max_sleeps = max_duration // sleep_per_loop
    for _ in range(max_sleeps + 1):
        if getattr(device, attribute) == value:
            break
        sleep(sleep_per_loop)
    else:
        raise RuntimeError(f"{failure_message} after {max_duration}s")


def wait_for_device_response(
    device: tango.DeviceProxy,
    failure_message: str = "Timed out waiting for device to respond",
    max_duration: int = 120,
) -> None:
    """
    Wait until a device responds.

    :param device: Tango device proxy to wait for
    :param failure_message: Message for the exception on failure.
    Defaults to "Timed out waiting for device to respond".
    A note about duration is appended.
    :param max_duration: Approximate time-out period
    :raises RuntimeError: if the device does not respond in time
    """
    timeout = time() + max_duration
    interval = 1.5
    while time() < timeout:
        try:
            device.ping()
            return
        except tango.ConnectionFailed:
            sleep(interval)
    raise RuntimeError(f"{failure_message} after {max_duration}s")
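
The polling pattern behind these helpers can be tried out without a Tango device. The sketch below is a simplified stand-in, not the notebook's own helper: `poll_until` and `FakeDevice` are hypothetical names introduced for illustration, and the short timings are chosen only so the example finishes quickly.

```python
from time import monotonic, sleep


def poll_until(predicate, timeout=1.0, interval=0.05):
    """Call predicate repeatedly until it is truthy or the timeout expires."""
    deadline = monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        sleep(interval)


class FakeDevice:
    """Hypothetical stand-in: the attribute reads False a few times, then True."""

    def __init__(self, reads_before_true):
        self._remaining = reads_before_true

    @property
    def isCommunicating(self):
        self._remaining -= 1
        return self._remaining < 0


device = FakeDevice(reads_before_true=3)
poll_until(lambda: device.isCommunicating)
print("device is communicating")
```

Using `monotonic()` for the deadline is a design choice of this sketch; it avoids surprises if the wall clock is adjusted while waiting.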

In [12]:
# Colored printing functions for strings that use universal ANSI escape sequences.
# fail: bold red, pass: bold green, warn: bold yellow,
# debug: bold blue, info: no styling, bold: bold white


def print_fail(message, start="", end="\n"):
    print(f"{start} \x1b[1;31m{message.strip()}\x1b[0m", end=end)


def print_pass(message, start="", end="\n"):
    print(f"{start} \x1b[1;32m{message.strip()}\x1b[0m", end=end)


def print_warn(message, start="", end="\n"):
    print(f"{start} \x1b[1;33m{message.strip()}\x1b[0m", end=end)


def print_debug(message, start="", end="\n"):
    print(f"{start} \x1b[1;34m{message.strip()}\x1b[0m", end=end)


def print_info(message, start="", end="\n"):
    print(f"{start} {message.strip()}", end=end)


def print_bold(message, start="", end="\n"):
    print(f"{start} \x1b[1;37m{message.strip()}\x1b[0m", end=end)

### Tango config

This section links the notebook execution to the Tango devices on the cluster.
The most important parameter is the namespace name: KUBE_NAMESPACE.    
It identifies the k8s namespace with which you intend to interact.
For running notebooks on the CLP k8s cluster this needs to be "ska-low-csp-integration"

In [13]:
# specify here the namespace to connect in this cluster
KUBE_NAMESPACE_EX_HW = "ska-low-csp-baseline-no-hw"  # run on deployment without HW
KUBE_NAMESPACE = "ska-low-csp-baseline"  # run on deployment with HW
# set the name of the databaseds service
DATABASEDS_NAME = "ska-low-csp-databaseds"

# finally set the TANGO_HOST
os.environ["TANGO_HOST"] = f"{DATABASEDS_NAME}.{KUBE_NAMESPACE_EX_HW}.svc.cluster.local:10000"
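
To make the choice between the two deployments explicit, the TANGO_HOST string could be built by a small helper. This is only a sketch: `tango_host` and the `NAMESPACES` mapping are hypothetical names, while the namespace strings, service name, and port are the ones used in the cell above.

```python
import os

DATABASEDS_NAME = "ska-low-csp-databaseds"
NAMESPACES = {
    True: "ska-low-csp-baseline",         # deployment with HW
    False: "ska-low-csp-baseline-no-hw",  # deployment without HW
}


def tango_host(with_hw: bool, port: int = 10000) -> str:
    """Return the databaseds address for the selected deployment."""
    namespace = NAMESPACES[with_hw]
    return f"{DATABASEDS_NAME}.{namespace}.svc.cluster.local:{port}"


os.environ["TANGO_HOST"] = tango_host(with_hw=False)
print(os.environ["TANGO_HOST"])
```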

### Tango proxy devices

In [14]:
csp_controller = tango.DeviceProxy("low-csp/control/0")
subarray_id = 1
csp_subarray_1 = tango.DeviceProxy(f"low-csp/subarray/{subarray_id:02}")

In [15]:
cbf_controller = tango.DeviceProxy("low-cbf/control/0")
cbf_subarray_1 = tango.DeviceProxy(f"low-cbf/subarray/{subarray_id:02}")

In [16]:
csp_devices = (csp_controller, csp_subarray_1)
cbf_devices = (cbf_controller, cbf_subarray_1)
all_devices = csp_devices + cbf_devices

### Current state of devices

In [17]:
def color_print(device):
    wait_for_device_response(device)
    # compare DevState values directly; a str never equals a DevState member
    if device.state() == tango.DevState.FAULT:
        print_fail(f"{device.status()}".strip(), start="\t")
    elif device.state() == tango.DevState.ALARM:
        print_warn(f"{device.status()}".strip(), start="\t")
    else:
        print_info(f"{device.status()}".strip(), start="\t")


def show_state():
    for device in all_devices:
        print(f"TANGO device: {device.name()}")
        color_print(device)
        try:
            print(f"\t{str(device.adminMode)}")
        except AttributeError:
            print("raises error in this state")
        print(f"\t{str(device.healthState)}")
        with suppress(AttributeError):
            print(f"\t{str(device.obsState)}")

If the system is freshly deployed, the following states are expected.

`Deviation: devices should be in STANDBY after deployment`

`TANGO device: low-cbf/control/0`    
`	The device is in DISABLE state.`    
`	adminMode.OFFLINE`    
`	healthState.UNKNOWN`    
`TANGO device: low-csp/control/0`   
`	The device is in DISABLE state.`   
`	adminMode.OFFLINE`   
`	healthState.UNKNOWN`   
`TANGO device: low-cbf/subarray/01`   
`	The device is in DISABLE state.`   
`	adminMode.OFFLINE`   
`	healthState.UNKNOWN`   
`	obsState.EMPTY`   
`TANGO device: low-csp/subarray/01`   
`	The device is in DISABLE state.`   
`	adminMode.OFFLINE`   
`	healthState.UNKNOWN`   
`	obsState.EMPTY`  

In [18]:
show_state()

TANGO device: low-csp/control/0
	 The device is in DISABLE state.
	adminMode.OFFLINE
	healthState.UNKNOWN
TANGO device: low-csp/subarray/01
	 The device is in DISABLE state.
	adminMode.OFFLINE
	healthState.UNKNOWN
	obsState.EMPTY
TANGO device: low-cbf/control/0
	 The device is in DISABLE state.
	adminMode.OFFLINE
	healthState.UNKNOWN
TANGO device: low-cbf/subarray/01
	 The device is in DISABLE state.
	adminMode.OFFLINE
	healthState.UNKNOWN
	obsState.EMPTY


### Init devices
`Init() should not be needed as an external command`

In [19]:
for device in all_devices:
    print(f"Initializing TANGO device: {device.name()}")
    device.set_timeout_millis(60_000)
    device.Init()

Initializing TANGO device: low-csp/control/0
Initializing TANGO device: low-csp/subarray/01
Initializing TANGO device: low-cbf/control/0
Initializing TANGO device: low-cbf/subarray/01


### AdminMode ONLINE
The adminMode of the Low CSP.LMC Controller and Subarrays has to be set to MAINTENANCE or ONLINE to start the connection with the subordinate Low CBF TANGO devices.

`Attribute setters should not be needed`

In [22]:
csp_controller.adminMode = AdminMode.ONLINE
wait_for_attribute_value(csp_controller, "isCommunicating", True)

The Low CSP.LMC Controller forwards the adminMode value to its Subarrays and to the devices of its subordinate systems.

In [27]:
show_state()

TANGO device: low-csp/control/0
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
TANGO device: low-csp/subarray/01
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
	obsState.EMPTY
TANGO device: low-cbf/control/0
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
TANGO device: low-cbf/subarray/01
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
	obsState.EMPTY
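
The forwarding of adminMode could also be checked programmatically rather than by reading the listing. The sketch below uses stand-in objects instead of real device proxies so that it runs anywhere; `devices_not_in_mode` is a hypothetical helper, and the local `AdminMode` class only imitates the members of `ska_control_model.AdminMode` that are needed here.

```python
from enum import IntEnum
from types import SimpleNamespace


class AdminMode(IntEnum):
    """Illustrative stand-in for ska_control_model.AdminMode."""

    ONLINE = 0
    OFFLINE = 1


def devices_not_in_mode(devices, expected):
    """Return the names of devices whose adminMode differs from expected."""
    return [d.name() for d in devices if d.adminMode != expected]


# Stand-ins that mimic the name() method and adminMode attribute of a proxy.
fake_devices = [
    SimpleNamespace(name=lambda: "low-csp/control/0", adminMode=AdminMode.ONLINE),
    SimpleNamespace(name=lambda: "low-csp/subarray/01", adminMode=AdminMode.ONLINE),
    SimpleNamespace(name=lambda: "low-cbf/subarray/01", adminMode=AdminMode.OFFLINE),
]

lagging = devices_not_in_mode(fake_devices, AdminMode.ONLINE)
print(lagging or "all devices ONLINE")
```

With the notebook's real proxies, the same function could presumably be called as `devices_not_in_mode(all_devices, AdminMode.ONLINE)`.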


### Transition LOW CSP to OFF state
OFF: power is disconnected. This state cannot be reported by CSP itself.

The Off command disables any signal processing capability of a subarray, and all its allocated resources are also released. As per ADR-8, this command can be issued from any observing state.    
https://confluence.skatelescope.org/pages/viewpage.action?pageId=105416556

In [28]:
csp_controller.Off([])
print(csp_controller.commandResult)
print(csp_controller.longRunningCommandResult)

('off', '2')
('1697197236.241705_138129604691925_Off', '[2, "Task queued"]')


In [29]:
csp_controller.commandResult

('off', '1')

The result of the next cell should be similar to: ('1695980314.1356044_88107131321202_Off', '[3, "off completed  1/1"]')

In [33]:
wait_until(
    lambda: "off completed" in csp_controller.longRunningCommandResult[1],
    "Off is not completed after 300s",
)
csp_controller.longRunningCommandResult
# TODO check state when deviation will be fixed

('1697197236.241705_138129604691925_Off', '[3, "off completed  1/1"]')
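
Judging by the outputs above, `longRunningCommandResult` is a pair of a command identifier and a JSON-encoded `[code, message]` list. A minimal sketch of unpacking it follows; `parse_lrc_result` is a hypothetical helper, and the meaning of the numeric code is deliberately left uninterpreted because this notebook does not define it.

```python
import json


def parse_lrc_result(result):
    """Split a (command_id, payload) pair into (command_id, code, message)."""
    command_id, payload = result
    code, message = json.loads(payload)
    return command_id, int(code), message


# Sample value copied from the cell output above.
sample = ("1697197236.241705_138129604691925_Off", '[3, "off completed  1/1"]')
command_id, code, message = parse_lrc_result(sample)
print(command_id, code, message)
```

Matching on the parsed message rather than on the raw string avoids accidental matches against the command identifier.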

### Prove OFF state
### `Deviation: the command does not change the state to OFF`

In [34]:
# TODO update when OFF starts working as expected
if csp_controller.state() == tango.DevState.OFF:
    print_pass(f"{csp_controller.name()} is {csp_controller.state()}")
    print_pass(f"{csp_subarray_1.name()} is {csp_subarray_1.state()}")
    print_pass("Test passed")
else:
    print_fail(f"{csp_controller.name()} is {csp_controller.state()}")
    print_fail(f"{csp_subarray_1.name()} is {csp_subarray_1.state()}")
    print_fail(f"{csp_subarray_1.name()} in {csp_subarray_1.obsState} state")
    print_fail("Test failed")

 low-csp/control/0 is ON
 low-csp/subarray/01 is ON
 low-csp/subarray/01 in 0 state
 Test failed
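
The output above reports the obsState as a bare integer ("in 0 state") because the enum value is formatted inside an f-string. Assuming the attribute is returned as an enum member, as the `obsState.EMPTY` output elsewhere in this notebook suggests, formatting its `.name` gives a readable label. The `ObsState` class below is a stand-in limited to the two values visible in this notebook's outputs, and `describe` is a hypothetical helper.

```python
from enum import IntEnum


class ObsState(IntEnum):
    """Stand-in with only the members seen in this notebook's outputs."""

    EMPTY = 0
    IDLE = 2


def describe(device_name, obs_state):
    """Format the state by name so the message does not depend on int formatting."""
    return f"{device_name} in {obs_state.name} state"


print(describe("low-csp/subarray/01", ObsState.EMPTY))
```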


### Transition LOW CSP to STANDBY state
### `Deviation: CSP does not enter STANDBY mode`

`LOW CSP.LMC and LOW CBF controllers and subarrays report STANDBY state`  
`LOW CSP.LMC and LOW CBF obsState reports EMPTY`   
`The LOW CSP healthState is OK`

In [35]:
csp_controller.standby([])
wait_until(
    lambda: "standby completed" in csp_controller.longRunningCommandResult[1],
    "Standby is not completed after 300s",
)
print(csp_controller.commandResult)
print(csp_controller.longRunningCommandResult)
show_state()

('standby', '3')
('1697197376.9392414_233720228317634_Standby', '[3, "standby completed  1/1"]')
TANGO device: low-csp/control/0
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
TANGO device: low-csp/subarray/01
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
	obsState.EMPTY
TANGO device: low-cbf/control/0
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
TANGO device: low-cbf/subarray/01
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
	obsState.EMPTY


### `Deviation: CSP does not enter STANDBY mode`

In [36]:
if csp_controller.state() == tango.DevState.STANDBY:
    print_pass(f"{csp_controller.name()} is {csp_controller.state()}")
    print_pass(f"{csp_subarray_1.name()} is {csp_subarray_1.state()}")
    print_pass("Test passed")
else:
    print_fail(f"{csp_controller.name()} is {csp_controller.state()}")
    print_fail(f"{csp_subarray_1.name()} is {csp_subarray_1.state()}")
    print_fail(f"{csp_subarray_1.name()} in {csp_subarray_1.obsState} state")
    print_fail("Test failed")

 low-csp/control/0 is ON
 low-csp/subarray/01 is ON
 low-csp/subarray/01 in 0 state
 Test failed


### Assign and configure a subarray

In [37]:
print("Assign resources")
# resources can only be assigned if the array is empty
print(f"{csp_subarray_1.dev_name()} in {str(csp_subarray_1.obsState)}")
assert csp_subarray_1.obsState == ObsState.EMPTY

assign_resources_json = {
    "interface": "https://schema.skao.int/ska-low-csp-assignresources/2.0",
    "common": {
        "subarray_id": subarray_id,
    },
    "lowcbf": {},
}
print(assign_resources_json)

Assign resources
low-csp/subarray/01 in obsState.EMPTY
{'interface': 'https://schema.skao.int/ska-low-csp-assignresources/2.0', 'common': {'subarray_id': 1}, 'lowcbf': {}}


In [38]:
csp_subarray_1.AssignResources(json.dumps(assign_resources_json))
print("Waiting for subarray to become IDLE")
# TODO change to EMPTY
wait_for_attribute_value(csp_subarray_1, "obsState", ObsState.IDLE, "Assignment not finished")
print(f"{csp_subarray_1.dev_name()} in {str(csp_subarray_1.obsState)}")

Waiting for subarray to become IDLE
low-csp/subarray/01 in obsState.IDLE
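
Since the same AssignResources payload is built more than once in this notebook, it could be produced by a small function. The sketch below is one possible way to do that; `build_assign_resources` is a hypothetical name, and the payload fields are copied from the cell above without any claim about what else the schema allows.

```python
import json


def build_assign_resources(subarray_id: int) -> str:
    """Return the AssignResources argument as a JSON string."""
    payload = {
        "interface": "https://schema.skao.int/ska-low-csp-assignresources/2.0",
        "common": {"subarray_id": subarray_id},
        "lowcbf": {},
    }
    return json.dumps(payload)


argument = build_assign_resources(1)
print(argument)
```

The returned string could then be passed directly to `csp_subarray_1.AssignResources(...)`.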


### `Deviation: the subarray is assigned resources because STANDBY is not reachable`

### Temporary step to release resources while STANDBY is not working

In [39]:
csp_subarray_1.ReleaseAllResources()
print("Waiting for subarray to become EMPTY")
wait_for_attribute_value(
    csp_subarray_1, "obsState", ObsState.EMPTY, "Release resources is not finished"
)
print(f"{csp_subarray_1.dev_name()} in {str(csp_subarray_1.obsState)}")

Waiting for subarray to become EMPTY
low-csp/subarray/01 in obsState.EMPTY


### Transition LOW CSP to ON state
### `Deviation: the command cannot actually be tested; this step passes only because adminMode was set to ONLINE`
On() activates LMC and CBF (along with subarray 1) and can be used to switch on the PST controller and additional subarrays.    
This functionality is, however, not yet available, so a valid test cannot be performed.

`LOW CSP.LMC and LOW CBF controllers and subarrays report ON state`  
`LOW CSP.LMC and LOW CBF obsState reports EMPTY`   
`The LOW CSP healthState is OK`

In [40]:
csp_controller.on([])
wait_for_attribute_value(csp_controller, "isCommunicating", True)
show_state()

TANGO device: low-csp/control/0
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
TANGO device: low-csp/subarray/01
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
	obsState.EMPTY
TANGO device: low-cbf/control/0
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
TANGO device: low-cbf/subarray/01
	 The device is in ON state.
	adminMode.ONLINE
	healthState.UNKNOWN
	obsState.EMPTY


### Assign and configure a subarray

`CSP and CBF subarray state transitions to IDLE and resources are assigned to the subarray`

In [41]:
print("Assign resources")
# resources can only be assigned if the array is empty
print(f"{csp_subarray_1.dev_name()} in {str(csp_subarray_1.obsState)}")
assert csp_subarray_1.obsState == ObsState.EMPTY

assign_resources_json = {
    "interface": "https://schema.skao.int/ska-low-csp-assignresources/2.0",
    "common": {
        "subarray_id": subarray_id,
    },
    "lowcbf": {},
}
print(assign_resources_json)
csp_subarray_1.AssignResources(json.dumps(assign_resources_json))
print("Waiting for subarray to become IDLE")
wait_for_attribute_value(csp_subarray_1, "obsState", ObsState.IDLE, "Assignment not finished")
print(f"{csp_subarray_1.dev_name()} in {str(csp_subarray_1.obsState)}")

Assign resources
low-csp/subarray/01 in obsState.EMPTY
{'interface': 'https://schema.skao.int/ska-low-csp-assignresources/2.0', 'common': {'subarray_id': 1}, 'lowcbf': {}}
Waiting for subarray to become IDLE
low-csp/subarray/01 in obsState.IDLE


### Prove ON state
Prove the ON state by assigning resources to the subarray.   
This step is selected because it can be executed even without HW in the loop.

In [47]:
if csp_subarray_1.obsState == ObsState.IDLE:
    print_pass(f"{csp_controller.name()} is {csp_controller.state()}")
    print_pass(f"{csp_subarray_1.name()} is {csp_subarray_1.state()}")
    print_pass(
        (
            f"{csp_subarray_1.name()} successfully transitioned "
            f"from EMPTY to {csp_subarray_1.obsState}"
        )
    )
else:
    print_fail(f"{csp_controller.name()} is {csp_controller.state()}")
    print_fail(f"{csp_subarray_1.name()} is {csp_subarray_1.state()}")
    print_fail(f"{csp_subarray_1.name()} unsuccessfully transitioned from EMPTY to IDLE")
    print_fail(f"{csp_subarray_1.name()} in {csp_subarray_1.obsState} state")

 low-csp/control/0 is ON
 low-csp/subarray/01 is ON
 low-csp/subarray/01 successfully transitioned from EMPTY to 2


### Release resources

`Subarray resources are released and the subarray state transitions to EMPTY`

In [48]:
csp_subarray_1.ReleaseAllResources()
print("Waiting for subarray to become EMPTY")
wait_for_attribute_value(
    csp_subarray_1, "obsState", ObsState.EMPTY, "Release resources is not finished"
)
print(f"{csp_subarray_1.dev_name()} in {str(csp_subarray_1.obsState)}")

Waiting for subarray to become EMPTY
low-csp/subarray/01 in obsState.EMPTY


### Return CSP to OFFLINE state

`CSP returns to OFFLINE state`

In [49]:
csp_controller.adminMode = AdminMode.OFFLINE
wait_for_attribute_value(csp_controller, "isCommunicating", False)
show_state()

TANGO device: low-csp/control/0
	 The device is in DISABLE state.
	adminMode.OFFLINE
	healthState.UNKNOWN
TANGO device: low-csp/subarray/01
	 The device is in DISABLE state.
	adminMode.OFFLINE
	healthState.UNKNOWN
	obsState.EMPTY
TANGO device: low-cbf/control/0
	 The device is in DISABLE state.
	adminMode.OFFLINE
	healthState.UNKNOWN
TANGO device: low-cbf/subarray/01
	 The device is in DISABLE state.
	adminMode.OFFLINE
	healthState.UNKNOWN
	obsState.EMPTY
