  SAFe Program / SP-3795

Prototype distributed visibility retrieval from buffer

Details

    • Feature
    • Should have
    • PI20
    • COM SDP SW
    • None
    • Data Processing

      Who?

      • Pipeline developers
      • Platform developers

      What?

      • A better idea of how effectively the considered technologies could utilise storage infrastructure for retrieving visibilities

      Why?

      • Persistent source of bottlenecks in existing software
      • Potentially far-reaching consequences for design of new pipeline software
      • Might also affect platform design and interfaces (object stores?)
      • Develop a prototype that loads a visibility data set from disk in a distributed fashion
        • Sized appropriately for an SKA use case - at least AA2, ideally AA3+
        • Utilise technologies currently envisioned to be used mid-term (xarray, zarr, dask)
        • Implement different access patterns (base: frequency slices, then time order; advanced: by u/v/w chunks); see the sketch after this list
      • Identify platforms to run benchmarks on
      • Measure throughput of prototype for identified datasets and access patterns
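
      The base access patterns above could look roughly like the following with xarray + zarr + dask. This is a hypothetical sketch only: the store name "vis.zarr", the variable name "VISIBILITY" and the dimension names are placeholders, not the converter's actual schema.

          import xarray as xr

          # Lazily open the zarr store; dask chunking follows the on-disk layout.
          ds = xr.open_zarr("vis.zarr")
          vis = ds["VISIBILITY"]  # assumed dims: (time, baseline, frequency, polarisation)

          # Base pattern: one frequency slice at a time ...
          for chan in range(vis.sizes["frequency"]):
              slice_sum = vis.isel(frequency=chan).sum().compute()

          # ... then in time order within a slice (the advanced u/v/w-chunked pattern is not sketched here).
          for t in range(vis.sizes["time"]):
              row = vis.isel(frequency=0, time=t).compute()
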
    • 2
    • 2
    • 0
    • Team_PANDO
    • Sprint 4

      Links

      Demo slides: /edit?usp=drive_link&ouid=105107180717003823827&rtpof=true&sd=true

      GitHub repo (to be migrated to SKA GitLab): https://github.com/sstansill/msconverter/tree/main

      GitHub repo including demo notebook: https://github.com/sstansill/msconverter/tree/converter_demo

       

      Outcomes

      We have produced a prototype that converts a MeasurementSet to a chunked and compressed zarr store. Parallel reading is supported through the xarray library, and chunk sizes are tuned to at least ~100 MB, as recommended by Dask best practices.
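
      For illustration, writing such a store with xarray might look like the sketch below. This is not the msconverter code: the toy dimensions, chunk sizes and the zstd/Blosc compressor are assumptions chosen only to show the chunking-plus-compression idea, and encoding keys may differ between zarr/xarray versions.

          import numpy as np
          import xarray as xr
          import numcodecs

          # Toy visibility cube (the real converter reads this from a MeasurementSet).
          ntime, nbl, nchan, npol = 64, 351, 128, 4
          data = (np.random.rand(ntime, nbl, nchan, npol)
                  + 1j * np.random.rand(ntime, nbl, nchan, npol)).astype("complex64")
          ds = xr.Dataset({"VISIBILITY": (("time", "baseline", "frequency", "polarisation"), data)})

          # Chunk and compress on write; real chunk sizes should target >~100 MB.
          encoding = {"VISIBILITY": {"compressor": numcodecs.Blosc(cname="zstd", clevel=5)}}
          ds.chunk({"time": 16, "frequency": 32}).to_zarr("vis.zarr", mode="w", encoding=encoding)
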

      The converter was demoed to the DP ART (slides: https://docs.google.com/presentation/d/1MPrI3OMuR3_Zn5T7U31TMZ2z_Sc4k88T). Local benchmarking on an M2 Pro MacBook Pro (512 GB SSD) shows that we can saturate the SSD I/O using 7 of 8 cores (~3 GB/s read). Preliminary CSD3 benchmarking shows that the Lustre I/O is not saturated with 32 Cascade Lake CPU cores.

      CSD3 effective read rate on a zarr store converted from a 43 GB MeasurementSet:

       

      Cores   Memory (GiB)   Read throughput (GiB/s)
          8             16                      1.52
         16             32                      2.18
         32             64                      3.31

       

      The benchmark was an embarrassingly parallel summation of the complex visibilities, used as a proxy for raw read speed. Casacore read rates on a MeasurementSet peak at ~250 MB/s on CSD3's Lustre storage, so the zarr approach gives a ~10x speedup.
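
      The benchmark loop was roughly of the following shape. This is a minimal sketch rather than the exact script used: the store path, variable name and LocalCluster sizing are placeholders.

          import time
          import xarray as xr
          from dask.distributed import Client, LocalCluster

          # Spin up local dask workers (replace with the cluster sizes from the table above).
          client = Client(LocalCluster(n_workers=8, threads_per_worker=1))

          vis = xr.open_zarr("vis.zarr")["VISIBILITY"]

          start = time.perf_counter()
          total = vis.sum().compute()              # embarrassingly parallel reduction
          elapsed = time.perf_counter() - start

          # Effective read rate: bytes that had to be read, divided by wall-clock time.
          print(f"sum = {total.values}, {vis.nbytes / elapsed / 2**30:.2f} GiB/s")
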

       

      Next Steps

      • Migrate the GitHub repo to GitLab (risk: we should merge into xradio and mirror xradio)
      • Add tests to comply with SKA coding standards
      • Improve chunk management for AA*-scale data sets, which will produce billions of tasks in a Dask graph (see the back-of-envelope sketch after this list)
      • Implement partitions as well as chunks to work towards the planned self-calibration design (https://docs.google.com/document/d/1AtTlGSHDuwnPDL_c8aTp1oJNDWPyLEIRvlev81UInfc/edit?usp=sharing)
      • Investigate u/v/w traversal, either within a chunk or across the whole data set, to work towards the planned self-calibration design (same document as above)
      • Improve metadata coverage by the converter (risk 1: the SKA MeasurementSet standard is not finalised; risk 2: compatibility with other radio telescopes is not guaranteed, which prevents merging with xradio)
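
      To make the chunk-management point concrete, a back-of-envelope sketch is given below. The array dimensions are hypothetical placeholders for an AA*-scale visibility cube, not agreed SKA figures; only the scaling argument matters.

          # Rough scaling: the number of Dask tasks grows with the number of zarr chunks.
          ntime, nbl, nchan, npol = 86400, 130816, 65536, 4   # hypothetical AA*-scale dims
          bytes_total = ntime * nbl * nchan * npol * 8        # complex64 visibilities
          chunk_bytes = 128 * 2**20                           # ~128 MiB chunks
          n_chunks = bytes_total // chunk_bytes
          print(f"{bytes_total / 2**50:.1f} PiB -> ~{n_chunks:,} chunks, "
                f"so a few passes over the data already approach 10^9 tasks")
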
    • PI22 - UNCOVERED

    Description

      For visibility processing we will need to read back petabytes of visibility data from the SDP buffer, potentially from hundreds of nodes in parallel. This is not what existing data access layers (e.g. casacore) were designed for, and we therefore expect to need to investigate alternative technologies.

      As part of the NRAO collaboration, we are specifically considering using xarray APIs with zarr backends. This feature is to develop prototypes to investigate whether this can achieve throughput appropriate for SKA-style use cases.

      This could inform a number of ongoing efforts, both long-term (data modelling / API design, technology selection) and short-term (an option for fixing WSClean bottlenecks, or use in benchmarking prospective platforms).
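
      At that scale, the same xarray/zarr read pattern would be driven from many nodes at once. A hedged sketch of how this might be scaled out with dask-jobqueue on a SLURM system such as CSD3 is given below; the queue name, resource figures and store path are assumptions, not project settings.

          from dask.distributed import Client
          from dask_jobqueue import SLURMCluster
          import xarray as xr

          # Request SLURM jobs as dask workers (partition name is hypothetical).
          cluster = SLURMCluster(queue="cclake", cores=32, memory="64GiB", walltime="01:00:00")
          cluster.scale(jobs=4)                    # e.g. 4 nodes -> 128 cores
          client = Client(cluster)

          # Each worker reads only the zarr chunks backing its own tasks.
          vis = xr.open_zarr("vis.zarr")["VISIBILITY"]
          print(vis.sum().compute())
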

              People

                Assignee: Wortmann, Peter (p.wortmann)
                Reporter: Wortmann, Peter (p.wortmann)

                Feature Progress

                  Story Point Burn-up: (86.36%)

                  Feature Estimate: 2.0

                                Issues   Story Points
                  To Do              1            3.0
                  In Progress        0            0.0
                  Complete           6           19.0
                  Total              7           22.0
