Details
- Enabler
- Data Processing
- Team_ESCAPEES
- Sprint 5
- 5.6
Team_ESCAPEES goal_D1
Description
Gain the experience required to test and demonstrate the use of Jupyter notebooks (possibly via JupyterHub), accelerated by a Dask cluster and deployed using Kubernetes, for executing SKA workflows in an SRC-like environment.
In recent years this approach and technology stack has gained rapid adoption and ever-increasing sophistication, and it has already been taken up by big-data geoscience platforms such as Pangeo. It would therefore be useful to assess whether this model might also prove a useful approach for developing SKA1 workflows. It is an attractive option for workflow development because it would allow semi-interactive data analysis during development and commissioning of the telescope, but also batch execution (e.g. using papermill) in the SDP system from the same notebook representation.
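The interactive-versus-batch duality described above can be sketched with dask.delayed: the same task graph assembled in a notebook can be computed locally during development, or handed to a Dask cluster deployed on Kubernetes for batch runs. This is a minimal illustration only; the stage names (load_vis, calibrate, reduce_all) are hypothetical placeholders and are not functions from the SDP ARL.

```python
import dask

# Hypothetical workflow stages -- illustrative names, not ARL functions.
@dask.delayed
def load_vis(chunk_id):
    # Stand-in for reading one chunk of visibility data.
    return list(range(chunk_id * 4, (chunk_id + 1) * 4))

@dask.delayed
def calibrate(vis):
    # Stand-in for a per-chunk calibration step.
    return [2 * v for v in vis]

@dask.delayed
def reduce_all(parts):
    # Stand-in for a final reduction across all chunks.
    return sum(sum(p) for p in parts)

def build_graph(n_chunks=3):
    # The same lazily-built task graph serves both semi-interactive
    # notebook sessions and unattended batch execution.
    return reduce_all([calibrate(load_vis(i)) for i in range(n_chunks)])

if __name__ == "__main__":
    # A threaded scheduler suffices locally; on an SRC-like deployment a
    # dask.distributed Client connected to a Kubernetes-hosted Dask
    # cluster would execute the identical graph.
    print(build_graph().compute(scheduler="threads"))  # 132
```

The key design point is that nothing in the graph definition depends on where it runs: swapping the local scheduler for a distributed Client is a one-line change.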
Note that a number of Dask-based Jupyter notebook workflows are already available as part of the SDP ARL and could be used for this test.