Details

Type: Enabler
Priority: Must have
Fix Version/s: PI23
Component/s: COM SDP SW
Labels:
- AA2

ARTs:

Data Processing
Benefit hypothesis:
The scaling strategy implemented in the AA2 pipelines might not be enough to cover AA* Mid ICAL use cases.

We also need to evaluate technology options:

- on the storage side (the existing measurement set implementation is a known and recurring bottleneck, even though it likely won't matter until large scales)
- on execution frameworks and networking (we have demonstrated that Dask can orchestrate the pipeline, but can it provide the necessary throughput?)
- for portability to compute platforms (e.g. accelerators)

See details in Miro: https://miro.com/app/board/uXjVKZ900uo=/?moveToWidget=3458764589420690446&cot=14
Acceptance criteria:
- Implement imaging portion of a major loop (predict + invert, so from model images to dirty image) using visibility streaming while holding facet data in-memory in a distributed fashion. (PANDO)
- Integration of processing functions with storage interface and workflow (PANDO)
- ~~Address wide-field issues (w-stacking)~~
- Demonstrate on AA2+-scale datasets (PANDO)
- Investigate and optimise performance
  - Considering scheduler, network and storage throughput (PANDO)
- ~~Investigate hybridisation of pipeline (GPU processing functions).~~

See details in Miro: https://miro.com/app/board/uXjVKZ900uo=/?moveToWidget=3458764589420690446&cot=14
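
For readers new to the terminology, here is a minimal sketch of what "predict + invert with visibility streaming while holding facet data in memory" could look like. It is not the prototype's implementation. It assumes a direct Fourier transform in place of the real de-gridding/gridding kernels, and it models "distributed" facets as plain in-process objects. All names (`predict`, `Facet`, `stream_chunks`) and values (`CELL`, the uv layout, the fluxes) are hypothetical.

```python
import cmath
import math

CELL = 0.01  # pixel size in direction cosines (illustrative value)


def phase(u, v, l, m):
    return 2.0 * math.pi * (u * l + v * m)


def predict(sky, uv_chunk):
    """Sky model -> visibilities. Assumption: direct DFT stands in for de-gridding."""
    return [
        sum(flux * cmath.exp(-1j * phase(u, v, l, m)) for (l, m), flux in sky.items())
        for (u, v) in uv_chunk
    ]


class Facet:
    """A block of image pixels kept in memory while visibilities stream past."""

    def __init__(self, pixels):
        self.pixels = pixels
        self.sums = [0.0] * len(pixels)
        self.count = 0

    def invert_chunk(self, uv_chunk, vis_chunk):
        """Add one chunk's contribution. Assumption: direct DFT stands in for gridding."""
        for i, (l, m) in enumerate(self.pixels):
            for (u, v), vis in zip(uv_chunk, vis_chunk):
                self.sums[i] += (vis * cmath.exp(1j * phase(u, v, l, m))).real
        self.count += len(uv_chunk)

    def image(self):
        return [s / self.count for s in self.sums]


def stream_chunks(uv_points, sky, chunk_size):
    """Yield (uv, observed visibilities) chunks, as if read from storage."""
    for start in range(0, len(uv_points), chunk_size):
        uv_chunk = uv_points[start:start + chunk_size]
        yield uv_chunk, predict(sky, uv_chunk)


# A 4x4 image split into two facets (left and right halves).
grid = [((i - 2) * CELL, (j - 2) * CELL) for i in range(4) for j in range(4)]
facets = [Facet(grid[:8]), Facet(grid[8:])]

source_a = (-1 * CELL, 0 * CELL)  # already in the model
source_b = (1 * CELL, 1 * CELL)   # not yet in the model
true_sky = {source_a: 2.0, source_b: 1.0}
model = {source_a: 2.0}
uv_points = [(10.0 * k, 10.0 * n) for k in range(-3, 4) for n in range(-3, 4)]

# Each chunk is predicted, subtracted and inverted, then discarded;
# only the facets persist between chunks.
for uv_chunk, observed in stream_chunks(uv_points, true_sky, chunk_size=7):
    model_vis = predict(model, uv_chunk)
    residual = [o - p for o, p in zip(observed, model_vis)]
    for facet in facets:
        facet.invert_chunk(uv_chunk, residual)

residual_images = [facet.image() for facet in facets]
```

In this toy setup the residual dirty image peaks at the unmodelled source, which lies in the second facet. In a real deployment each facet would presumably live on a different worker, with chunks sent over the network, which is where the scheduler, network and storage throughput questions arise.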
Feature Points:
5
Initial Size:
5
WSJF:
0
Epic Link:
SDP AA* pipeline scaling
Agile Teams:

Team_PANDO
Due Sprint:
Sprint 5
Story Point Burn-up:
Overdue:
Outcomes:
Summary:

We don't believe all ACs (and the associated objective) have been met, as there is currently no released pipeline that can do both gridding and de-gridding in a distributed way on AA2+-scale datasets. Summary of PI23 work:

- Refactored existing notebook functionality into working scripts, including tests, documentation and tutorials.
- Created a release (v0.1.0); see Releases · SKAO / Science Data Processor / Science Pipeline Workflows / ska-sdp-distributed-self-cal-prototype · GitLab. The docs are published as "SKA SDP Distributed Self-Cal Prototype documentation" (skao.int). Users can now install the pipeline and perform dirty imaging in a distributed fashion.
- De-gridding and subtraction can be performed from the CLI (pan-219 branch) using a FITS model image and visibilities as input, but not in a distributed way.
- Created a new release of Swiftly, solving dependency issues between Swiftly and PFL.
- Made optimisations to XRADIO to allow conversion of AA2-scale datasets; see https://confluence.skatelescope.org/display/SE/2024-07-31+DP+ART+System+Demo+23.4
- See the main branch (predict and subtract not yet merged): Files · main · SKAO / Science Data Processor / Science Pipeline Workflows / ska-sdp-distributed-self-cal-prototype · GitLab

Acceptance Criteria (each criterion is followed by its outcome)

- Implement imaging portion of a major loop (predict + invert, so from model images to dirty image) using visibility streaming while holding facet data in-memory in a distributed fashion.
  Outcome: Made progress. Visibilities -> dirty image is implemented. Predict functionality is implemented but not fully integrated, and not tested in a distributed fashion.
- Integration of processing functions with storage interface and workflow
  Outcome: Integrated ska-sdp-func for de-gridding. This exists in a branch: notebooks/degrid_wtower_test.py · pan-219-predict-visibilities · SKAO / Science Data Processor / Science Pipeline Workflows / ska-sdp-distributed-self-cal-prototype · GitLab
- ~~Address wide-field issues (w-stacking)~~
- Demonstrate on AA2+-scale datasets
  Outcome: Not achieved because of bugs and schema changes in MSv4 and broken build systems on ska-sdp-func. New changes to xradio (which can convert large MSv2, but includes schema-breaking changes) mean significant refactoring of existing code.
- Investigate and optimise performance
  - Considering scheduler, network and storage throughput
    Outcome: Not done, as we are awaiting pipeline functionality and the ability to convert AA2+ datasets for proper metrics.
- ~~Investigate hybridisation of pipeline (GPU processing functions).~~

Demos:
- DP_ART_23.4
Requirement Status:

PI24 - UNCOVERED
Labels_MIRO:
AA2 DP_ART_23.4

Description

See details in Miro: https://miro.com/app/board/uXjVKZ900uo=/?moveToWidget=3458764589420690446&cot=14

Attachments


3C123_dirty_image.png (756 kB, 12/Sep/24 1:51 PM)

Issue Links

mentioned on

Commit - Merge branch 'Update-project-details' into 'main'

Commit - SP-4386: Update project name and URL in build files

Merge request - SP-4386 Update project name and URL in build files

People

Assignee: Wortmann, Peter

Reporter: Graser, Ferdl

Votes: 0

Watchers: 0

Feature Progress

Story Point Burn-up: 68.00%

Feature Estimate: 5.0

| Status | Issues | Story Points |
| --- | --- | --- |
| To Do | 4 | 8.0 |
| In Progress | 0 | 0.0 |
| Complete | 9 | 17.0 |
| Total | 13 | 25.0 |
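
The burn-up figure appears to be completed story points as a share of total story points. A small check, assuming that is how the percentage is derived (the dictionary below is just the table transcribed by hand):

```python
# Story points per status, copied from the Feature Progress table.
story_points = {"To Do": 8.0, "In Progress": 0.0, "Complete": 17.0}

total = sum(story_points.values())
# Assumption: burn-up = completed points / total points.
burn_up = 100.0 * story_points["Complete"] / total
print(f"{burn_up:.2f}%")
```

Under that assumption the result matches the reported 68.00%.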

Dates

Created: 21/May/24 4:03 PM

Updated: 17/Sep/24 2:41 PM

Due Sprint Date: 20/Aug/24

Distributed visibility streaming - widefield imaging
