Details
-
Feature
-
Could have
-
SRCnet
-
-
-
Intra Program
-
2
-
2
-
0
-
Team_CORAL
-
Sprint 5
-
-
-
-
24.3
-
Stories Completed, Outcomes Reviewed, Demonstrated
-
-
SRCNet0.1 compute-tests tests-compilation
Description
To measure the hardware usage of the workflows used by demonstrator cases in the UKSRC, the PyProfQueue package was created. The package takes bash scripts and submits them as jobs to HPC queuing systems, adding the declarations and initialisation needed to track hardware usage with Prometheus and Node Exporter, and to take a performance measurement using Likwid. The recorded hardware usage can be used to understand the resources a workflow requires and to track hardware usage throughout the lifetime of the workload's execution.
To test the robustness of this package and adapt it to the wider community's needs, we aim to use it with SRCNet repository workloads. Doing so lets us validate whether it performs as expected on a wider range of workloads and change it where it falls short, and also learn more about the metrics that other workload developers may require or desire. By combining this with a STARScore measure of UKSRC resources, we can centralise information on the SRC-owned or SRC-sponsored hardware available in the UK, while also providing feedback for the development of the STARS suite.
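As an illustration of the script-wrapping idea in the description, the sketch below shows one way a user's bash script could be placed between monitoring start-up and tear-down before submission. It is not PyProfQueue's actual API: the function name `wrap_with_profiling`, its parameters, the choice of SLURM as the queuing system, and the `prometheus.yml` path are all assumptions made for this example. Likwid is left out for brevity.

```python
def wrap_with_profiling(user_script: str,
                        job_name: str = "profiled_job",
                        nodes: int = 1,
                        time_limit: str = "01:00:00",
                        exporter_port: int = 9100,
                        prometheus_port: int = 9090) -> str:
    """Return a batch script that runs `user_script` between monitoring
    start-up and tear-down. Hypothetical helper, not part of PyProfQueue."""
    lines = user_script.strip().splitlines()
    # Drop the user's shebang so the wrapped script has exactly one.
    if lines and lines[0].startswith("#!"):
        lines = lines[1:]
    body = "\n".join(lines)

    wrapped = [
        "#!/bin/bash",
        # Scheduler declarations (SLURM is assumed here).
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --time={time_limit}",
        "",
        # Start the monitoring services in the background.
        f"node_exporter --web.listen-address=:{exporter_port} &",
        "EXPORTER_PID=$!",
        # prometheus.yml is an assumed config that scrapes the exporter port.
        f"prometheus --config.file=prometheus.yml "
        f"--web.listen-address=:{prometheus_port} &",
        "PROMETHEUS_PID=$!",
        "",
        "# --- original workload ---",
        body,
        "# --- end of workload ---",
        "",
        # Stop the monitoring services once the workload has finished.
        "kill $PROMETHEUS_PID $EXPORTER_PID",
    ]
    return "\n".join(wrapped) + "\n"


if __name__ == "__main__":
    print(wrap_with_profiling("#!/bin/bash\necho hello\n", job_name="demo"))
```

Returning the wrapped script as text, rather than submitting it directly, keeps the sketch independent of any particular scheduler command and makes the result easy to inspect before it is queued.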
Attachments
Issue Links
- Child Of
-
SP-4870 Science Enabling v0.1 - Roadmap
- Implementing
- relates to
-
SP-4286 Add additional peak performance benchmark metrics alongside STARS
- Program Backlog