SAFe Program / SP-673

Gain experience with Jupyter + Dask + k8s workflows in an SRC-like environment


Details

    • Data Processing
    • Jupyter notebooks, accelerated by a scalable data processing framework such as Dask and deployed using Kubernetes, have already been adopted by big-data geoscience platforms like Pangeo. This model may prove useful for developing and deploying SKA workflows. To assess whether this is the case, it is useful to gain more experience with these technologies.

      • Identify, demonstrate, and document the steps taken to run a simple Jupyter notebook, accelerated by a Dask cluster deployed using Kubernetes. The notebook should implement a simple but relevant radio astronomy data processing script, examples of which can already be found in the SDP ARL (see the sketch after this list).
      • Stretch: Helm deployment script developed and added to the SKA Helm chart repo.
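      A minimal sketch of the kind of notebook cell this criterion asks for, assuming the dask-kubernetes package; the pod template file name, worker count, and array computation are illustrative placeholders, not part of this ticket:

          # Sketch: launch Dask workers as Kubernetes pods and run a trivial
          # computation from a notebook. Assumes dask-kubernetes is installed
          # and worker-spec.yml is a (hypothetical) worker pod template.
          from dask.distributed import Client
          from dask_kubernetes import KubeCluster
          import dask.array as da

          cluster = KubeCluster.from_yaml("worker-spec.yml")
          cluster.scale(4)           # request 4 worker pods
          client = Client(cluster)   # point Dask at the new cluster

          # Stand-in for a radio astronomy processing step: a chunked
          # array reduction executed across the Kubernetes workers.
          x = da.random.random((20_000, 20_000), chunks=(2_000, 2_000))
          print(x.mean().compute())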
    • 2
    • 2
    • 2
    • Team_ESCAPEES
    • Sprint 5
    • See: https://confluence.skatelescope.org/pages/viewpage.action?pageId=96174408
    • 5.6
    • PI24 - UNCOVERED

    • Team_ESCAPEES goal_D1

    Description

      Gain the experience required to test and demonstrate the use of Jupyter notebooks (possibly via JupyterHub), accelerated by a Dask cluster and deployed using Kubernetes, for executing SKA workflows in an SRC-like environment.
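      As a rough sketch of the interactive side, a notebook served by JupyterHub could attach to a Dask scheduler already running in the same Kubernetes cluster. The service address below follows a typical in-cluster DNS pattern and is purely illustrative:

          # Sketch: connect a notebook session to an existing Dask scheduler
          # exposed as a Kubernetes Service. The address is a placeholder,
          # not one defined by this ticket.
          from dask.distributed import Client

          client = Client("tcp://dask-scheduler.dask.svc.cluster.local:8786")
          print(client)  # reports workers/cores once connected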

      This approach and technology stack has gained rapid adoption and ever-increasing sophistication in recent years, and has already been taken up by big-data geoscience platforms such as Pangeo. It would therefore be useful to assess whether this model might also prove a useful approach for developing SKA1 workflows. It is an attractive option for workflow development because it would allow semi-interactive data analysis during development and commissioning of the telescope, as well as batch execution (e.g. using papermill) in the SDP system using the same representation.
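      For the batch side, a hedged illustration of papermill executing such a notebook; the notebook file names and parameter are hypothetical:

          # Sketch: batch-execute a notebook with papermill, passing
          # parameters instead of editing cells interactively. File names
          # and the n_workers parameter are placeholders.
          import papermill as pm

          pm.execute_notebook(
              "dask_workflow.ipynb",         # hypothetical input notebook
              "dask_workflow_run001.ipynb",  # executed copy with outputs
              parameters={"n_workers": 4},
          )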

      Note that a number of Dask-based Jupyter notebook workflows are available as part of the SDP ARL and could be used for this test.
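      As a stand-in for one of those ARL notebooks (this does not use the ARL API; the shapes and the FFT step are illustrative only), a chunked Dask computation with a radio astronomy flavour might look like:

          # Sketch: FFT a stack of (random) visibility grids into
          # dirty-image-like arrays. Each grid is a single chunk, so the
          # 2-D FFTs parallelise across the stack on the Dask cluster.
          import dask.array as da

          grids = da.random.random((64, 1024, 1024), chunks=(1, 1024, 1024))
          images = da.fft.fftshift(da.fft.fft2(grids), axes=(-2, -1))
          print(da.absolute(images).max().compute())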

      People

        Assignee: b.mort Mort, Ben
        Reporter: b.mort Mort, Ben
        Votes: 0
        Watchers: 5

      Feature Progress

        Story Point Burn-up: (100.00%)

        Feature Estimate: 2.0

        Status         Issues   Story Points
        To Do          0        0.0
        In Progress    0        0.0
        Complete       5        23.0
        Total          5        23.0

      Dates

        Created:
        Updated:
        Resolved:
