SP-1585

Standardise SDP processing performance benchmarking

Details

    • Enabler
    • Not Assigned
    • PI10
    • COM SDP SW
    • None
    • Services
    • The SDP hardware and software environment is expected to be continuously evolving, so we need to make sure that we are able to track its performance on typical SDP tasks.
    • Automated pipeline/script that can establish the performance and scaling properties of a radio-astronomy related benchmark (such as the imaging I/O test); a minimal submission sketch follows this list.
      • Onboard the benchmark (repository is included in the SKA GitLab organisation; code matches coding guidelines; basic unit/regression/integration tests are implemented in the GitLab CI/CD pipeline; code documentation matches SKA guidelines).
      • Characterise the scalability of the prototype beyond 16 nodes (~100 nodes).
      • Establish the relative performance of the parallel file systems (LUSTRE and GPFS?).
    • Identify requirements for and demonstrate portability across different HPC environments; to this end we might want to show support for e.g. SLURM and/or containerisation.
    • Show how we can systematically compare cluster environments and benchmark configurations using this framework.
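      A minimal sketch of such an orchestrator, assuming a SLURM cluster: it submits the benchmark at several node counts via sbatch and records the job IDs for a later scaling analysis. The binary path, walltime and node counts are illustrative placeholders, not the actual benchmark-suite configuration.

          #!/usr/bin/env python3
          """Scaling-sweep sketch: submit a benchmark at several node counts
          via SLURM's sbatch and record the job IDs for later analysis."""

          import csv
          import subprocess

          BENCHMARK = "./iotest"  # placeholder path to the imaging I/O test binary
          NODE_COUNTS = [1, 2, 4, 8, 16, 32, 64, 128]

          def submit(nodes: int) -> str:
              """Submit one batch job and return its SLURM job ID."""
              script = (
                  "#!/bin/bash\n"
                  f"#SBATCH --nodes={nodes}\n"
                  "#SBATCH --time=01:00:00\n"
                  f"#SBATCH --job-name=iotest-{nodes}\n"
                  f"srun {BENCHMARK}\n"
              )
              # sbatch reads the job script from stdin; --parsable prints the job ID.
              result = subprocess.run(["sbatch", "--parsable"], input=script,
                                      capture_output=True, text=True, check=True)
              return result.stdout.strip().split(";")[0]

          if __name__ == "__main__":
              with open("jobs.csv", "w", newline="") as f:
                  writer = csv.writer(f)
                  writer.writerow(["nodes", "job_id"])
                  for n in NODE_COUNTS:
                      writer.writerow([n, submit(n)])

      Once the jobs finish, wall-clock times can be pulled from SLURM accounting (e.g. sacct) and fed into a scaling analysis.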
    • 5
    • 5
    • 6.6
    • Team_PLANET
    • Sprint 5
    • The imaging-iotest code is onboarded into the SKA GitLab repository: https://gitlab.com/ska-telescope/sdp/ska-sdp-exec-iotest
    • Scalability tests were performed up to 128 nodes on the LUSTRE and IBM Spectrum Scale file systems; results are documented at https://confluence.skatelescope.org/pages/viewpage.action?pageId=142970142
    • The prototype is containerised using Singularity, and image building is integrated into the CI pipeline: https://gitlab.com/ska-telescope/sdp/ska-sdp-exec-iotest/container_registry/1884364
    • The prototype can be run on different clusters using orchestrator scripts developed in Python, hosted at https://gitlab.com/ska-telescope/platform-scripts/-/tree/master/ska-sdp-benchmark-suite . Although the scripts include only the imaging I/O test at the moment, any radio-astronomy pipeline can be integrated to run on HPC platforms.
    • A toolkit to monitor performance metrics as pipelines run in an HPC environment is implemented: https://gitlab.com/ska-telescope/platform-scripts/-/tree/master/ska-sdp-monitor-cpu-metrics . It can be used on any HPC machine that submits jobs through batch schedulers such as SLURM, PBS or OAR; a minimal sampling sketch follows this list.
    • Demonstrations of the orchestrator scripts and monitor toolkit have been given in System demos.
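      The core of such a monitor can be as simple as periodic sampling of host metrics while the pipeline runs. The sketch below uses psutil to record CPU and memory utilisation at a fixed interval; it illustrates the sampling idea only and is not the ska-sdp-monitor-cpu-metrics implementation. The interval and output path are placeholders.

          """Periodic CPU/memory sampling sketch for a running pipeline."""

          import csv
          import time

          import psutil

          INTERVAL_S = 5          # sampling period (placeholder)
          OUTPUT = "metrics.csv"  # output path (placeholder)

          def sample_forever() -> None:
              """Append one CPU/memory sample per interval until interrupted."""
              with open(OUTPUT, "w", newline="") as f:
                  writer = csv.writer(f)
                  writer.writerow(["timestamp", "cpu_percent", "mem_percent"])
                  try:
                      while True:
                          # cpu_percent measures utilisation since the previous call,
                          # so the very first sample may read as 0.0.
                          writer.writerow([time.time(),
                                           psutil.cpu_percent(interval=None),
                                           psutil.virtual_memory().percent])
                          f.flush()
                          time.sleep(INTERVAL_S)
                  except KeyboardInterrupt:
                      pass  # stop cleanly when the job wrapper interrupts us

          if __name__ == "__main__":
              sample_forever()

      In practice such a sampler would be started alongside the pipeline by the batch script and stopped when the job ends.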
    • 17.4
    • Stories Completed, Integrated, Outcomes Reviewed, NFRs met, Demonstrated, Satisfies Acceptance Criteria, Accepted by FO
    • PI22 - UNCOVERED

    • SPO-1002

    Description

      Sibling feature to SP-1548, but focused on hardware and software performance testing: just as we need to establish and maintain the scientific performance of our pipelines, it is equally important that we stay on top of computational performance. After all, given the immense volume of data the SDP is meant to ingest, falling behind in processing is equivalent to data loss, so we will need high assurance that we can finish within the allotted time.

      The goal for this feature is to establish a first prototype of the infrastructure that will allow us to track and model the performance of our workflows.
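      To make "track and model performance" concrete, one simple starting point is fitting an Amdahl-style model T(n) = t_serial + t_parallel / n to runtimes measured at different node counts. The sketch below does this with scipy; the timings are made-up illustration values, not measured results.

          """Fit a simple serial + parallel runtime model to scaling measurements."""

          import numpy as np
          from scipy.optimize import curve_fit

          def model(n, t_serial, t_parallel):
              """Runtime as a fixed serial part plus a perfectly parallel part."""
              return t_serial + t_parallel / n

          # Hypothetical (nodes, seconds) measurements from a scaling sweep.
          nodes = np.array([1, 2, 4, 8, 16])
          runtime = np.array([1000.0, 520.0, 280.0, 160.0, 100.0])

          params, _ = curve_fit(model, nodes, runtime)
          t_serial, t_parallel = params
          print(f"serial part: {t_serial:.1f} s, parallel part: {t_parallel:.1f} s")
          print(f"predicted runtime on 128 nodes: {model(128, *params):.1f} s")

      A model like this gives a first-order answer to whether a workflow can finish within its allotted time as the node count grows.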

People

    p.wortmann Wortmann, Peter
    M.Paipuri Paipuri, Mahendra [X] (Inactive)
    Votes: 0
    Watchers: 3

Feature Progress

    Story Point Burn-up: 89.58%
    Feature Estimate: 5.0

                 Issues   Story Points
    To Do             0            0.0
    In Progress       1            5.0
    Complete         23           43.0
    Total            24           48.0

Dates

    Created:
    Updated:
    Resolved: