Details
Spike | Not Assigned | None | None | Obs Mgt & Controls | 2 | 7 | 11.5
Description
After d-carlo.matteo's lightning talk about using Grafana as a possible alternative to WebJive, we need to assess the extent to which the architecture underlying Grafana poses performance risks to us.
This work entails performance-testing the architecture described by Matteo di Carlo in https://docs.google.com/presentation/d/1H1fd5Arkl7b93nbgebzqxcNy0WyRDMTMcO-7Vy3ET6A/edit#slide=id.p1 against the performance requirements that were stated for WebJive, namely 1000 changes in a dashboard in less than 1 s.
Underlying this throughput specification is a latency requirement that was assumed in the event-driven architecture of WebJive but may not be achievable in a polled monitoring system like Grafana. We need to be able to make control changes and see the monitoring feedback immediately, with no discernible lag. Hence we need to measure the round-trip time: how long it takes for a command to get from a dashboard to a Tango device and for the associated monitoring response to return. This should be in the 100 ms range for the lag to be below the detection threshold.
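As a rough illustration of the round-trip measurement, here is a minimal Python sketch. It is not tied to Tango or Grafana: `send_command` and `read_back` are hypothetical callables that a real test would wrap around the dashboard control action and the monitoring query, and `FakeDevice` is only a stand-in so the sketch runs on its own. The 95th-percentile check against 100 ms is one possible reading of "in the 100 ms range", not an agreed acceptance criterion.

```python
import math
import statistics
import time
from typing import Callable, List, Optional

LATENCY_BUDGET_S = 0.100  # the "100 ms range" target from this ticket


def measure_round_trip(
    send_command: Callable[[int], None],
    read_back: Callable[[], Optional[int]],
    value: int,
    timeout_s: float = 2.0,
    poll_interval_s: float = 0.001,
) -> float:
    """Return the seconds between issuing a command and observing its effect."""
    start = time.perf_counter()
    send_command(value)
    deadline = start + timeout_s
    while time.perf_counter() < deadline:
        if read_back() == value:
            return time.perf_counter() - start
        time.sleep(poll_interval_s)
    raise TimeoutError(f"value {value} not observed within {timeout_s} s")


def summarise(samples: List[float]) -> dict:
    """Summarise latency samples and compare the 95th percentile to the budget."""
    ordered = sorted(samples)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median_s": statistics.median(ordered),
        "p95_s": ordered[p95_index],
        "max_s": ordered[-1],
        "within_budget": ordered[p95_index] <= LATENCY_BUDGET_S,
    }


class FakeDevice:
    """Hypothetical stand-in: a written value becomes readable after a delay."""

    def __init__(self, delay_s: float) -> None:
        self.delay_s = delay_s
        self._visible: Optional[int] = None
        self._pending: Optional[int] = None
        self._ready_at = 0.0

    def write(self, value: int) -> None:
        self._pending = value
        self._ready_at = time.perf_counter() + self.delay_s

    def read(self) -> Optional[int]:
        if self._pending is not None and time.perf_counter() >= self._ready_at:
            self._visible = self._pending
            self._pending = None
        return self._visible


if __name__ == "__main__":
    device = FakeDevice(delay_s=0.010)
    samples = [measure_round_trip(device.write, device.read, i) for i in range(20)]
    print(summarise(samples))
```

In a polled setup the measured value would include the polling period of every hop in the chain, which is exactly the contribution this spike is meant to quantify.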
One way to do this could be:
- Set up a Tango device that is capable of producing that many changes per second across several attributes.
- Have these changes triggered by a command or attribute.
- Create a Grafana dashboard that has widgets for displaying attribute values and a basic control mechanism.
- Run a test for each combination of settings.
See what was done for https://jira.skatelescope.org/browse/SP-296
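The test loop outlined above could be organised along the following lines. This is only a sketch under stated assumptions: the setting names (`n_attributes`, `refresh_interval_s`) are hypothetical examples of what "each combination of settings" might cover, and `apply_change` is a placeholder for whatever triggers a change on the Tango device. Here the elapsed time is taken on the producing side; a real run would need to time the changes as they appear in the dashboard.

```python
import itertools
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List

TARGET_CHANGES = 1000   # changes required by this ticket
TARGET_WINDOW_S = 1.0   # to be displayed in less than one second


@dataclass
class RunResult:
    settings: Dict[str, float]
    changes: int
    elapsed_s: float

    @property
    def meets_target(self) -> bool:
        return self.changes >= TARGET_CHANGES and self.elapsed_s < TARGET_WINDOW_S


def settings_matrix(options: Dict[str, List[float]]) -> Iterator[Dict[str, float]]:
    """Yield one settings dict per combination of the supplied option values."""
    names = list(options)
    for values in itertools.product(*(options[name] for name in names)):
        yield dict(zip(names, values))


def run_once(
    settings: Dict[str, float],
    apply_change: Callable[[int, int], None],
) -> RunResult:
    """Spread TARGET_CHANGES updates over the configured attributes and time them."""
    n_attrs = int(settings["n_attributes"])
    start = time.perf_counter()
    for i in range(TARGET_CHANGES):
        apply_change(i % n_attrs, i)  # (attribute index, new value)
    return RunResult(settings, TARGET_CHANGES, time.perf_counter() - start)


if __name__ == "__main__":
    # Hypothetical option values; the real matrix is still to be decided.
    options = {"n_attributes": [10, 100], "refresh_interval_s": [0.1, 1.0]}
    for settings in settings_matrix(options):
        store: Dict[int, int] = {}  # in-memory stand-in for the device
        result = run_once(settings, store.__setitem__)
        print(settings, f"{result.elapsed_s:.4f} s", result.meets_target)
```

Keeping the settings as a plain dictionary makes it straightforward to add or drop dimensions of the matrix once the relevant parameters are known.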