Details
-
Spike
-
Won't have (this time)
-
None
-
None
-
Data Processing
-
-
- Definition of the deployment interface.
- Test implementation of the interface using Helm as a back-end (TBD).
-
-
Description
SDP processing deployments are currently tied to Helm and Kubernetes. This results in the following problems:
- It is difficult for a developer unfamiliar with these technologies to write a processing script effectively. The scripting library attempts to abstract away the details for some kinds of deployment (e.g. Dask), but receive and real-time processing deployments still use Helm charts directly.
- Deploying SDP processing scripts on a different platform would require extensive rewriting of the scripting library and scripts, as well as changes to the Processing Controller. Examples of such platforms are:
  - a developer's laptop, where the processing scripts and execution engines could be launched as processes;
  - a shared cluster with a job scheduler like SLURM.
Both of these problems stem from a poor separation of concerns: the code needs to express what to deploy, but this is tightly bound up with how to deploy it. This could be mitigated by developing a platform-agnostic deployment interface.
So far, we have a limited number of kinds of deployment:
- Processing script (a Python process running the script)
- Execution deployments:
  - Receive, e.g. PSS receive or "simple" visibility receive (a single process)
  - Receive + real-time processing with Plasma store (a group of processes sharing memory via the store)
  - Dask execution engine (scheduler and worker processes)
  - DALiuGE execution engine
- Buffer reservation (filesystem / directory mount)
so the interface could be reasonably limited in scope.
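To illustrate how small such an interface could be, the following is a minimal sketch of a platform-agnostic description of "what to deploy", covering only the kinds listed above. All names here (DeploymentKind, DeploymentSpec, the parameters field, n_workers) are hypothetical and chosen for illustration; they are not taken from the existing scripting library or configuration database schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class DeploymentKind(Enum):
    """Kinds of deployment listed above (identifiers are illustrative)."""

    PROCESSING_SCRIPT = "processing-script"
    RECEIVE = "receive"
    RECEIVE_WITH_PLASMA = "receive-with-plasma"
    DASK = "dask"
    DALIUGE = "daliuge"
    BUFFER_RESERVATION = "buffer-reservation"


@dataclass(frozen=True)
class DeploymentSpec:
    """Describes what to deploy, with no reference to Helm or Kubernetes."""

    name: str
    kind: DeploymentKind
    # Kind-specific settings (hypothetical), e.g. the number of Dask workers.
    parameters: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        """Serialise to plain data, e.g. for storing as a deployment request."""
        return {
            "name": self.name,
            "kind": self.kind.value,
            "parameters": dict(self.parameters),
        }


# Example: a Dask execution engine with four workers.
spec = DeploymentSpec("proc-dask-1", DeploymentKind.DASK, {"n_workers": 4})
```

Nothing in the specification names a chart, a namespace or a container image, so the same request could in principle be handed to any back-end.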
As part of the spike, you should investigate how best to implement this interface. Should it be a library wrapped around the deployment requests in the configuration database, with the choice of deployment back-end implemented as a switch in the library? Or should the deployment requests in the config DB be changed to be platform-agnostic?
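As a starting point for comparing the two options, the first one (a library with a back-end switch) might look roughly like the sketch below. It only plans a deployment and returns plain data; it does not call Helm or start any process. The class names, the registry, the request layout and the script name are all assumptions made for illustration, and the mapping from deployment kind to chart name is invented.

```python
from abc import ABC, abstractmethod


class DeploymentBackend(ABC):
    """Hypothetical interface: turns a neutral request into back-end instructions."""

    @abstractmethod
    def plan(self, kind: str, name: str, params: dict) -> dict:
        """Return a back-end specific description of how to deploy the request."""


class HelmBackend(DeploymentBackend):
    def plan(self, kind: str, name: str, params: dict) -> dict:
        # Assumption: one chart per deployment kind, with params passed as values.
        return {"backend": "helm", "release": name, "chart": kind, "values": dict(params)}


class LocalProcessBackend(DeploymentBackend):
    def plan(self, kind: str, name: str, params: dict) -> dict:
        # Only the processing script is sketched here; other kinds are left open.
        if kind != "processing-script":
            raise NotImplementedError(f"no local plan for kind {kind!r}")
        return {"backend": "local", "name": name, "command": ["python", params["script"]]}


# The "switch": back-ends are looked up by name in a registry.
BACKENDS = {"helm": HelmBackend, "local": LocalProcessBackend}


def get_backend(name: str) -> DeploymentBackend:
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown deployment back-end: {name!r}") from None


# The same neutral request is planned by two different back-ends.
request = {"kind": "processing-script", "name": "script-1", "params": {"script": "my_script.py"}}
plans = {name: get_backend(name).plan(**request) for name in ("helm", "local")}
```

Under the second option, the neutral request itself would be what is stored in the config DB, and the translation step would move out of the library into whichever component carries out the deployment on the target platform.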
This spike was created as part of SP-1660.