As with the scaling feature, for the benefit of system scientists (Commissioning) and operations we need to care about the raw performance of the (self-)calibration and imaging pipeline. Since we are dealing with an exceptionally data-intensive domain, one of the main ways we can improve performance (or unlock further performance improvements down the line) is to carefully engineer our pipelines to maximise I/O efficiency: specifically, reading and writing visibilities and images (the largest data items) from or to storage as rarely as we can get away with.
For self-calibration specifically, the theoretical optimum is to read the visibilities only once per major loop or self-calibration iteration. At present, we instead appear to read the measured visibilities once per facet in every major loop, on top of a full read-write cycle for the model visibilities. Furthermore, we are writing and re-writing visibilities when handing data between DP3 and WSClean. All of these are obvious inefficiencies that we should work towards resolving, even if resolving them might not yield immediate performance returns.
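To make the gap between the current scheme and the theoretical optimum concrete, the following is a crude back-of-envelope sketch of visibility traffic per self-calibration run. All parameter values (visibility volume, facet count, major-loop count) are hypothetical placeholders, not measured numbers, and the cost model is a deliberate simplification of the paragraph above:

```python
# Hypothetical I/O model for one self-calibration run.
# All constants below are illustrative assumptions, not measurements.

VIS_SIZE_TB = 1.0    # volume of the measured visibility set (assumed)
N_FACETS = 25        # number of imaging facets (assumed)
N_MAJOR_LOOPS = 5    # major loops in the self-cal run (assumed)


def io_current() -> float:
    """Terabytes moved under the current scheme: measured visibilities
    read once per facet every major loop, a full read-write cycle for
    the model visibilities, and a write-plus-reread handoff between
    DP3 and WSClean."""
    measured_reads = VIS_SIZE_TB * N_FACETS * N_MAJOR_LOOPS
    model_cycle = 2 * VIS_SIZE_TB * N_MAJOR_LOOPS    # model read + write
    tool_handoff = 2 * VIS_SIZE_TB * N_MAJOR_LOOPS   # DP3 write, WSClean reread
    return measured_reads + model_cycle + tool_handoff


def io_optimum() -> float:
    """Theoretical optimum: measured visibilities read exactly once per
    major loop, with no intermediate visibility writes."""
    return VIS_SIZE_TB * N_MAJOR_LOOPS


print(f"current : {io_current():.0f} TB moved")   # 145 TB with these assumptions
print(f"optimum : {io_optimum():.0f} TB moved")   # 5 TB with these assumptions
print(f"ratio   : {io_current() / io_optimum():.1f}x")
```

Even under this simplified model the per-facet reads dominate, which suggests that consolidating them would be the highest-leverage fix of the three inefficiencies listed.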