Details
-
Feature
-
Should have
-
SRCnet
-
-
-
1
-
2
-
0
-
Team_CORAL, Team_DAAC
-
Sprint 4
-
-
-
-
24.1
-
Accepted by FO
-
-
SRCNet0.1 Teal-D operations-and-infrastructure site-provisioning
Description
Some SRCNet services are expected to run best on Kubernetes. For SRCNet sites that are already looking to use OpenStack, it would be good to share effort on good practices for creating a performant, maintainable and upgradable Kubernetes cluster running on OpenStack. There are currently many ways to achieve that aim, each with different trade-offs depending on the use case.
Azimuth can provide performant, on-demand and upgradable Kubernetes clusters using regular project-level user access to an OpenStack project. (There is a work-in-progress effort on a Magnum driver to expose this stack via the Magnum APIs, but that requires changes to the cloud itself, which cannot be assumed.)
Details on how to install Azimuth on an OpenStack cloud are already available. Additional feedback on how repeatable this process is would be very welcome:
https://stackhpc.github.io/azimuth-config/
We propose working with the UK STFC Cloud team to get this up and running while following all required security policies. They already make some use of the capi-helm-charts directly. It would be good to see whether there are other SRCNet sites where sharing effort around Kubernetes on OpenStack would be helpful, to compare the approach with other efforts globally, and to start building consensus on how best to share efforts in this area.
In particular, it would be good to cover areas such as:
- Baremetal cluster support (via Ironic+Nova)
- GPU support (VMs with passthrough, not vGPU)
- RDMA support, Ethernet and InfiniBand