Details

Type: Enabler
Priority: Not Assigned
Fix Version/s: PI8
Component/s: None
Labels:
None

ARTs:

Services
Benefit hypothesis:

In order to migrate testing and integration workloads away from isolated environments, it is necessary to establish what the resourcing requirements are so that the shared services can be sized accordingly.

Acceptance criteria:
Each team will provide resource estimates for each containerised application it delivers, covering:

CPU (millicores)

Memory (MB/GB)

Ephemeral storage (temporary storage used by the running application)

Persistent storage (block, network filesystem, and object storage)

Any additional requirements such as GPUs, high-performance networks, etc.

Values and Helm charts associated with Skampi are updated to reflect these estimates (requests and limits for CPU, memory, and ephemeral storage).
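
As an illustration of how per-application estimates could be rolled up to size a shared service, the following is a minimal sketch. It assumes estimates are written as Kubernetes-style quantity strings (CPU such as `250m` or `0.5`, memory and ephemeral storage such as `128Mi` or `1Gi`). The application names, replica counts, and values are hypothetical and are not taken from any team's submission.

```python
from decimal import Decimal

# Multipliers for the binary and decimal suffixes used in Kubernetes quantities.
_BYTE_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '250m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def to_bytes(quantity: str) -> int:
    """Convert a memory or storage quantity such as '128Mi' to bytes."""
    for suffix, multiplier in _BYTE_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * multiplier)
    return int(quantity)  # plain number of bytes


def total_requests(estimates: list) -> dict:
    """Sum the requests of all applications, weighted by replica count."""
    totals = {"cpu_millicores": 0, "memory_bytes": 0, "ephemeral_bytes": 0}
    for app in estimates:
        replicas = app.get("replicas", 1)
        req = app["requests"]
        totals["cpu_millicores"] += replicas * cpu_to_millicores(req["cpu"])
        totals["memory_bytes"] += replicas * to_bytes(req["memory"])
        totals["ephemeral_bytes"] += replicas * to_bytes(req["ephemeral-storage"])
    return totals


# Hypothetical estimates for two applications (illustrative values only).
ESTIMATES = [
    {"name": "app-a", "replicas": 2,
     "requests": {"cpu": "100m", "memory": "128Mi", "ephemeral-storage": "256Mi"}},
    {"name": "app-b", "replicas": 1,
     "requests": {"cpu": "0.5", "memory": "1Gi", "ephemeral-storage": "1Gi"}},
]

if __name__ == "__main__":
    print(total_requests(ESTIMATES))
```

The same aggregation could be repeated over limits rather than requests to obtain an upper bound for sizing.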
Feature Points:
4
Initial Size:
4
WSJF:
0
Epic Link:
DevSecOps Implementation
Agile Teams:

Team_BUTTONS, Team_CIPA, Team_CREAM, Team_KAROO, Team_MCCS, Team_NCRA, Team_PERENTIE, Team_SYSTEM
Due Sprint:
Sprint 5
Story Point Burn-up:
Overdue:
Outcomes:

NCRA Team: Application resource requirements were logged and provided to the System Team.
Resource estimates are captured in the following spreadsheet:
https://docs.google.com/spreadsheets/d/14agPTjY88eqjL3rBuxHxnPwPPND3QmawoT7hXScyCdk/edit#gid=0
Resource stats were merged into SKAMPI master through MR https://gitlab.com/ska-telescope/skampi/-/merge_requests/106.
We later updated the CPU values (request tag) through MR https://gitlab.com/ska-telescope/skampi/-/merge_requests/116 to resolve the CORBA exception in AssignResources on the Central Node.

CREAM Team:

Values for CSP.LMC:
Reference stories are CT-111 and CT-112. Outcomes are summarised here for convenience.
CT-111: see the document linked to the story, also linked here:
https://www.dropbox.com/s/aif3n5vh3afyd6t/CT-111-k8s-resources.pdf?dl=0

The k8s resource values have been set in the CSP.LMC and MID.CBF Helm charts (see links below):
https://gitlab.com/ska-telescope/skampi/-/blob/master/charts/skampi/charts/csp-proto/values.yaml
https://gitlab.com/ska-telescope/skampi/-/blob/master/charts/skampi/charts/cbf-proto/values.yaml

The updated charts have been added to the skampi master repo and the tests ran successfully (see link below):
https://gitlab.com/ska-telescope/skampi/-/pipelines/188670445

Values for WebJive Suite:
Reference stories are CT-114, CT-131, CT-132, and CT-133; see the comments on the stories.
Merge request:
https://gitlab.com/ska-telescope/skampi/-/merge_requests/114
Chart: https://gitlab.com/ska-telescope/skampi/-/blob/master/charts/skampi/charts/webjive/values.yaml

Buttons Team:
Summary provided in comments of AT2-563
Merge request: https://gitlab.com/ska-telescope/observation-execution-tool/-/merge_requests/49

Perentie Team:
As our Tango devices are not yet integrated in skampi, we have not measured usage. We have applied 'default' values from tango-example to our charts, and will update once integrated with skampi and we can use the monitoring tools there to measure actual resource usage. (See also AT6-661)

MCCS Team:
We initially set out to establish metrics as requested by this feature via MCCS-152. A Confluence page records the process we followed to establish our values (https://confluence.skatelescope.org/display/SE/MCCS-152+Analyse+MCCS+k8s+metrics+to+add+resource+settings). However, once this was done, we found it difficult to exercise MCCS enough to obtain realistic values, because many areas of the software are still relatively immature (this is only our second PI). Following advice, MCCS-207 was created and implemented to refactor the MCCS Helm chart to use the official tango-util library. This provided a set of defaults that were appropriate at this stage to fulfil what this feature required of MCCS.

System Team:
Applied ResourceQuotas automatically to Skampi namespaces, and developed LimitRanges (defaults) that enable Skampi to run and test in its current form.
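
To illustrate how LimitRange defaults fill in missing values, the sketch below models a container-level LimitRange as a Python dictionary and applies it to a container's `resources` block. It is a simplified model of the defaulting behaviour, not the System Team's actual configuration: the object name and every CPU and memory value are hypothetical. In a LimitRange, `default` supplies default limits and `defaultRequest` supplies default requests; where a container sets a limit but no request, the request follows the limit.

```python
import json

# Hypothetical LimitRange manifest; the name and all values are illustrative.
LIMIT_RANGE = {
    "apiVersion": "v1",
    "kind": "LimitRange",
    "metadata": {"name": "example-defaults"},
    "spec": {
        "limits": [
            {
                "type": "Container",
                "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
                "default": {"cpu": "200m", "memory": "256Mi"},
            }
        ]
    },
}


def apply_defaults(resources: dict, limit_range: dict) -> dict:
    """Return the container resources after LimitRange defaulting."""
    item = next(
        entry for entry in limit_range["spec"]["limits"]
        if entry["type"] == "Container"
    )
    explicit_limits = resources.get("limits", {})
    requests = dict(resources.get("requests", {}))
    limits = dict(explicit_limits)

    # Missing limits are taken from 'default'.
    for name, value in item.get("default", {}).items():
        limits.setdefault(name, value)

    # Missing requests follow an explicit limit if one was given,
    # otherwise they are taken from 'defaultRequest'.
    for name, value in item.get("defaultRequest", {}).items():
        if name not in requests:
            requests[name] = explicit_limits.get(name, value)

    return {"requests": requests, "limits": limits}


if __name__ == "__main__":
    # A container that declares nothing receives both sets of defaults.
    print(json.dumps(apply_defaults({}, LIMIT_RANGE), indent=2))
    # A container that declares only a CPU limit keeps it, and its
    # CPU request follows that limit.
    print(json.dumps(apply_defaults({"limits": {"cpu": "500m"}}, LIMIT_RANGE), indent=2))
```

Defaults of this kind let charts that do not yet declare their own requests and limits be admitted to a namespace that has a ResourceQuota on those resources.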

Resolved PI.Sprint:
9.3

Feature Checklist:

Stories Completed, Integrated, BDD Testing Passes (no errors), Outcomes Reviewed, NFRS met, Demonstrated, Satisfies Acceptance Criteria, Accepted by FO

Demos:
- OMC_ART_SystemDemo_4
Requirement Status:

PI22 - UNCOVERED
Goals_MIRO:
SPO-718

Description

In order to migrate testing and integration workloads away from isolated environments, it is necessary to establish what the resourcing requirements are so that the shared services can be sized accordingly.

Each team will provide resource estimates for each containerised application it delivers, covering:

CPU (millicores)

Memory (MB/GB)

Ephemeral storage (temporary storage used by the running application)

Persistent storage (block, network filesystem, and object storage)

Any additional requirements such as GPUs, high-performance networks, etc.

Attachments

Issue Links

depends on

SP-1127 Migrate away from Docker Compose exclusively to k8s

Done

relates to

ROAM-125 k8s Resource Management not set correctly causing Resource Eviction on SKAMPI (All teams, Checkpoints with teams during SoS)

Retired

mentioned on

Commit - Merge branch 'SAR-180/limit_deployment_resources' into 'master'

Commit - SP-1259 SAR-180 Restored helm charts and added k8s resource constraints

Merge request - SP-1259 SAR-180 Restored helm charts and added k8s resource constraints

(24 mentioned in, 3 mentioned on)

People

Assignee:: Bridger, Alan

Reporter:: Valame, Snehal

Votes:: 0

Watchers:: 12

Feature Progress

Story Point Burn-up: (100.00%)

Feature Estimate: 4.0

	Issues	Story Points
To Do	0	0.0
In Progress	0	0.0
Complete	19	27.3
Total	19	27.3

Dates

Created:: 25/Aug/20 11:54 AM

Updated:: 13/Feb/24 2:50 PM

Resolved:: 15/Jan/21 5:10 PM

Establish resource requirements for each application workload in terms of CPU, Memory, ephemeral storage, and persistent storage