Details

Type: Feature
Priority: Must have
Fix Version/s: PI23
Component/s: SRCnet Science Enabling
Labels:
ARTs: SRCnet
Benefit hypothesis:

Several different storage options can be used at SRCNet nodes. An understanding of storage performance, per SRC site and per storage type, will enable SRCNet to plan and optimise storage utilisation and to identify future requirements.

Whilst some storage performance tests are documented on Confluence, all SRCNet storage benchmarking tests need to be made reusable and easily reproducible by all SRCs.

These tests are mentioned in the SRCNet documents Implementation plan v0.1 and SRCNet v0.1 Node requirements.

This feature is to enable existing tests to be run by the participating v0.1 sites, and to develop further storage benchmark tests, so that we can build a collective understanding of SRCNet storage performance.

Acceptance criteria:

AC 0: A set of storage benchmark workflow tests is identified and developed. These will give sufficient information to identify v0.1 storage requirements and optimisation opportunities.
(Profiling all of the v0.1 sites and actually identifying requirements and/or optimisation opportunities is not a requirement for PI23, but likely will be for PI24.)

AC 1: Benchmark workflows are run (statistical performance data gathered) for at least 3 types of storage.
AC 2: Benchmark workflows are run (statistical performance data gathered) for at least 3 sites, belonging to at least 2 SRCs.
AC 3: The outcomes of these benchmarks are shared with SRCNet participants.
AC 4: The benchmark workflow tests, container images, container image definitions, and scripts for running the tests are made available via an appropriate SRCNet test repository, with usage guidelines included.

(Container images must not be .sif images, i.e. it should be possible to run them via any container engine, including but not limited to Singularity. They can be made available either in a new repo or consolidated here: https://gitlab.com/ska-telescope/src/src-workloads/-/tree/master/bench?ref_type=heads The latter option would need to be discussed with the maintainers of the repo.)
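
To illustrate the engine-independence requirement in AC 4, the following is a minimal sketch of how one benchmark image could be launched under several container engines. The image reference, benchmark command and mount paths are hypothetical placeholders, not names taken from the SRCNet repositories; the sketch assumes the image is published in a standard container registry.

```python
import shlex

# Hypothetical image reference and benchmark command, for illustration only.
IMAGE = "registry.example.org/srcnet/storage-bench:latest"
BENCH_CMD = ["run-benchmark", "--target", "/data"]


def container_command(engine, image, command, bind=None):
    """Build the argv that runs `command` in `image` under the given engine.

    `bind` is an optional (host_path, container_path) pair used to expose
    the storage under test inside the container.
    """
    if engine in ("docker", "podman"):
        argv = [engine, "run", "--rm"]
        if bind:
            argv += ["-v", f"{bind[0]}:{bind[1]}"]
        return argv + [image] + list(command)
    if engine in ("singularity", "apptainer"):
        argv = [engine, "exec"]
        if bind:
            argv += ["--bind", f"{bind[0]}:{bind[1]}"]
        # Singularity/Apptainer can pull registry images via the docker:// scheme,
        # so no pre-built .sif file is needed.
        return argv + [f"docker://{image}"] + list(command)
    raise ValueError(f"unsupported container engine: {engine}")


if __name__ == "__main__":
    for engine in ("docker", "podman", "singularity"):
        argv = container_command(engine, IMAGE, BENCH_CMD, bind=("/mnt/storage", "/data"))
        print(shlex.join(argv))
```

The function only assembles the command line; whether a site runs it directly or through a batch scheduler is left open.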

Feature Points: 3.5
Initial Size: 4.5
WSJF: 0
Agile Teams: Team_DAAC, Team_TEAL
Due Sprint: Sprint 5
Story Point Burn-up:
Overdue:
Outcomes:
AC 0: A set of storage benchmark workflow tests is identified and developed.

Main documentation: https://confluence.skatelescope.org/x/OGVEEQ

AC 1: Benchmark workflows are run (statistical performance data gathered) for at least 3 types of storage.

DP3 and IO500 tests have been run on 5 different types of storage.

AC 2: Benchmark workflows are run (statistical performance data gathered) for at least 3 sites, belonging to at least 2 SRCs.

Benchmarks have been run at one site (DP3 run and IO500), and one more site (espSRC) is expected.

AC 3: The outcomes of these benchmarks are shared with SRCNet participants.

TBC.

AC 4: The benchmark workflow tests, container images, container image definitions, and scripts for running the tests are made available via an appropriate SRCNet test repository, with usage guidelines included.

DP3 and IO500 code repositories.
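
The numeric thresholds in AC 1 and AC 2 can be checked mechanically against a list of completed runs. The sketch below is one possible way to do that; the `BenchmarkRun` record and the storage, site and SRC labels are hypothetical placeholders rather than actual SRCNet data, chosen only to mirror the status reported above (5 storage types, one site).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRun:
    benchmark: str      # e.g. "DP3" or "IO500"
    storage_type: str   # label for the storage backend under test
    site: str           # site where the run took place
    src: str            # SRC that the site belongs to


def check_coverage(runs, min_storage_types=3, min_sites=3, min_srcs=2):
    """Report whether the runs meet the AC 1 and AC 2 thresholds."""
    storage_types = {r.storage_type for r in runs}
    sites = {r.site for r in runs}
    srcs = {r.src for r in runs}
    return {
        "AC 1": len(storage_types) >= min_storage_types,
        "AC 2": len(sites) >= min_sites and len(srcs) >= min_srcs,
    }


if __name__ == "__main__":
    # Placeholder records: five storage types, all benchmarked at a single site.
    runs = [BenchmarkRun("IO500", f"storage-{c}", "site-1", "src-1") for c in "ABCDE"]
    print(check_coverage(runs))  # {'AC 1': True, 'AC 2': False}
```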
URL: https://confluence.skatelescope.org/x/OGVEEQ
Requirement Status: PI24 - UNCOVERED
Labels_MIRO: PI24-PB SRC23-PB SRCNet0.1 storage-tests team_DAAC team_TEAL tests-compilation
Goals_MIRO: SPO-3479

Description

Benchmarks will be run against a number of potential storage options, including Slurm BeeGFS and others, to establish best practice for configuring storage at SRCNet nodes.

One variety of test could be CASA measurement set tests, run on a single node, covering serial read, serial write, parallel read and parallel write operations. These CASA tests will make use of a representative set of visibilities and image cubes, likely using a combination of publicly available LOFAR data and simulated SKA data sets.
The initial version of those tests can be found here: https://confluence.skatelescope.org/display/SRCSC/Benchmarking
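
As a rough illustration of the four access patterns named above (serial and parallel, read and write) and of gathering statistical performance data over repeated runs, here is a minimal single-node sketch. It is not the CASA test suite: it uses plain binary files as a stand-in for measurement sets, and the file sizes, worker count and function names are assumptions made for the example.

```python
import os
import statistics
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024 * 1024  # 1 MiB per write/read call


def write_file(path, size_mib):
    """Write size_mib MiB of random data and return the amount written."""
    buf = os.urandom(CHUNK)
    with open(path, "wb") as f:
        for _ in range(size_mib):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push data to storage, not just cache it
    return size_mib


def read_file(path):
    """Read the whole file and return the number of MiB read."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            total += len(chunk)
    return total // CHUNK


def timed(fn):
    """Run fn (which returns MiB transferred) and return throughput in MiB/s."""
    start = time.perf_counter()
    mib = fn()
    elapsed = max(time.perf_counter() - start, 1e-9)
    return mib / elapsed


def run_suite(directory, size_mib=8, workers=4, repeats=3):
    paths = [os.path.join(directory, f"bench_{i}.bin") for i in range(workers)]

    def serial_write():
        return sum(write_file(p, size_mib) for p in paths)

    def parallel_write():
        with ThreadPoolExecutor(workers) as pool:
            return sum(pool.map(lambda p: write_file(p, size_mib), paths))

    def serial_read():
        return sum(read_file(p) for p in paths)

    def parallel_read():
        with ThreadPoolExecutor(workers) as pool:
            return sum(pool.map(read_file, paths))

    # Writes come first so that the files exist before the read passes.
    modes = {
        "serial write": serial_write,
        "parallel write": parallel_write,
        "serial read": serial_read,
        "parallel read": parallel_read,
    }
    results = {}
    for name, fn in modes.items():
        samples = [timed(fn) for _ in range(repeats)]
        results[name] = {
            "mean": statistics.mean(samples),
            "median": statistics.median(samples),
            "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        }
    return results


if __name__ == "__main__":
    # Pass dir=<mount point> to TemporaryDirectory to target a specific storage system.
    with tempfile.TemporaryDirectory() as tmp:
        for mode, stats in run_suite(tmp).items():
            print(f"{mode:15s} mean={stats['mean']:.1f} median={stats['median']:.1f} "
                  f"stdev={stats['stdev']:.1f} MiB/s")
```

Note that the read passes in this sketch are likely to be served from the operating system's page cache, so a realistic read benchmark would need larger data sets or explicit cache control.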

Issue Links

duplicates: SP-4216 Compile v0.1 storage benchmarking tests in common gitlab repo (Discarded)

relates to: SP-4680 Measuring storage performance through Storage Benchmarks for SRCNet v.0.1 (Implementing)

People

Assignee: Parra, Manuel

Reporter: Watson, Duncan

Votes: 0

Watchers: 3

Feature Progress

Story Point Burn-up: (94.74%)

Feature Estimate: 3.5

	Issues	Story Points
To Do	0	0.0
In Progress	1	1.0
Complete	15	18.0
Total	16	19.0

Dates

Created: 13/May/24 8:39 AM

Updated: 28/Oct/24 11:46 PM

Due Sprint Date: 20/Aug/24

Develop the Storage Benchmarking functionality required for v0.1