Details
-
Enabler
-
Should have
-
None
-
Data Processing
-
-
-
Inter Program
-
5
-
5
-
5
-
1
-
Team_ORCA
-
Sprint 5
-
-
-
-
17.4
-
Stories Completed, Outcomes Reviewed, NFRS met, Demonstrated, Satisfies Acceptance Criteria, Accepted by FO
-
-
SPO-1593
Description
Follows on from the work in SP-2086. Aims to address the questions posed in goal SPO-1593.
Who?
SDP pipeline developers, SDP architects
What?
- Identify and investigate the performance of Dask through focused testing, using an I/O-insensitive distributed processing pattern that targets one of the most challenging SDP imaging use cases.
- This specifically means addressing the central scaling challenge: Keeping both image and grid data (and therefore visibilities once we get to gridding) distributed independently.
- To prove this, we should show that we can effectively distribute both the image and subgrid load (i.e. demonstrate that memory requirements per node decrease as we add more nodes) - while still being able to correctly transform one into the other.
- Note that checking correctness would generally require solving a full-size FFT here - it is therefore advisable to test with simple patterns where only a few (known) image or uv grid points are set in the input, as that means we can relatively cheaply predict any output uv grid or image point by direct evaluation of the Fourier transformation.
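The correctness check described in the last point can be sketched as follows. This is a minimal illustration and not part of the pipeline: the function names `direct_dft_point` and `fft_reference`, the centred pixel coordinates, and the sign convention of the exponent are all assumptions made for the example. It predicts individual uv grid points from a handful of known image points by direct evaluation, then compares them against a full FFT on a grid small enough to solve.

```python
import numpy as np


def direct_dft_point(sources, u, v, n):
    """Predict one uv-grid value by direct evaluation of the DFT.

    sources: iterable of (l, m, flux), with integer pixel coordinates
             in [-n/2, n/2) relative to the image centre (assumed layout).
    u, v:    integer uv-grid coordinates in the same range.
    n:       image / grid size along each axis.
    """
    return sum(
        flux * np.exp(-2j * np.pi * (u * l + v * m) / n)
        for l, m, flux in sources
    )


def fft_reference(sources, n):
    """Full-size FFT of the same sparse image, only feasible for small n."""
    image = np.zeros((n, n), dtype=complex)
    for l, m, flux in sources:
        image[l + n // 2, m + n // 2] += flux
    # Move the image centre to index 0, transform, then re-centre the grid.
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image)))


# Hypothetical test pattern: two point sources on a 64 x 64 image.
n = 64
sources = [(3, -5, 1.0), (-10, 7, 0.5)]
grid = fft_reference(sources, n)

for u, v in [(0, 0), (4, -9), (-32, 31)]:
    predicted = direct_dft_point(sources, u, v, n)
    assert np.isclose(predicted, grid[u + n // 2, v + n // 2])
```

The cost of `direct_dft_point` depends only on the number of sources, not on `n`, so the same prediction stays cheap at image sizes where the reference FFT is no longer practical on a single node.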
Why?
In the not-too-distant future (ideally by the end of PI14) we need to make a pivot-or-persevere decision: either Dask will be used as an EF for SDP development, or we will need to focus on other options. Dask is an attractive option because it has strong community support and may improve interoperability with other astronomy software outside of SKA. However, we believe there may be a number of problems with how Dask scales, and investigating them is important if we are to adopt Dask for most SKA pipeline development for the foreseeable future.
Attachments
Issue Links
- depends on
-
SP-2086 Dask implementation of distributed, I/O-intensive pipeline
- Done
- informs
-
SP-2570 Evaluate scaling properties of Distributed FT algorithm in Dask
- Done
- relates to
-
SP-2374 Streaming distributed FT processing functions adapted to SDP Processing Function Library interface
- Funnel