Details
-
Spike
-
Must have
-
SRCnet
-
-
-
Team_CORAL
-
Sprint 5
-
-
-
-
21.1
-
Stories Completed, Demonstrated, Satisfies Acceptance Criteria, Accepted by FO
-
PI20-PB
Description
The Workload Management System (WMS) is a fundamental part of a scientific compute network such as the SRCNet. It is essential for job submission, scheduling, monitoring, and accounting. It also benefits both users and tools by providing a unified submission interface and scalability, and by ensuring the reliability, efficiency, and security of job submissions at SRCNet scale.
The biggest and most successful workload management systems have been WLCG deployments: ATLAS uses PanDA, CMS uses GlideinWMS/HTCondor, ALICE uses AliEn, and LHCb uses DIRAC.
Given the importance of the WMS to the SRCNet, it would be useful to study these existing solutions. We could gather information about the following aspects:
- Assumptions made for WLCG use cases that may differ from SRCNet use cases. This may include, but isn't limited to, the WLCG data model and the requirements placed on the data management system.
- Level of maturity regarding OIDC token support.
- Architectural aspects:
  - Architecture diagram detailing the services that run in the ecosystem (central services, local site services, enabling services).
  - Comment on how well it fits with existing SRCNet components, especially the SRCNet compute and data APIs.
  - Comment on how well the WMS solutions could be made to work for astronomy use cases, such as those requiring IVOA interfaces.
- Ease of deployment: how many moving parts there are, how easy they are to set up and configure, and how modular the deployment design is.
- Receptiveness/willingness of the maintainers to assist us with:
  - prototyping, learning, and getting started
  - collaboration in driving the technology forward jointly in the future (bug fixing, feature contributions, etc.)
- How easy would it be to fix bugs in and contribute improvements to the project? (Including code/language, project lifecycle, and contribution methodology.)
The Coral and Magenta teams have both the interest and the ability to drive this work and to arrange conversations with the relevant technical people in the WLCG community.