SAFe Program / SP-4326

Service monitoring v0.1 infrastructure

Details

    • SRCnet
    • To enable a centralised dashboard view of the local and global services running at the v0.1 SRCNodes, we need a data source containing information from these services. This monitoring 'plumbing' infrastructure is needed to:

      • capture SRCNode services running at Sites as they start to come online in preparation for v0.1
      • provide the ability to monitor service health centrally
      • provide historical data for services running at Sites, to capture the reliability of services over time
    • A data source is configured on a Grafana instance and is collecting persistent information, and the 'explore' functionality has been used to view the information from 2 sets of services running at 2 different sites.
    • 1.5
    • 2
    • 0
    • Team_CHOCOLATE, Team_CORAL
    • Sprint 4
    • PI23 - UNCOVERED

    • SRC23-PB SRCNet0.1 Team_Chocolate multi-team operations-and-infrastructure tests-compilation

    Description

      To enable a centralised dashboard view on the local and global services running at the v0.1 SRCNodes, we need a data source containing information from these services.

      This will entail information being collected in a central data store and a dashboard (Grafana) to build a meaningful view of SRCNet service monitoring metrics.
      There are several ways of doing this:

      • In its simplest form, this could be events from services being pushed into a central DB serving as a Prometheus data source that is used to build dashboard views.
      • It could also mean sites running a metrics/events exporter service that is probed or scraped by the central monitoring service.

      In either approach (push or pull, via ActiveMQ, Kafka, etc.), an initial format for the corresponding events needs to be established.
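
      To make the "initial format" discussion concrete, here is a minimal sketch of what a service health event could look like and how the same event might feed either approach. Everything named here is an assumption for illustration only: the `ServiceEvent` fields, the `srcnet_service_up` metric name, and the example site and service names are hypothetical and not an agreed SRCNet format. The push form is plain JSON; the pull form follows the Prometheus text exposition format that an exporter would serve for scraping.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ServiceEvent:
    """Hypothetical health event for one SRCNode service (field names are assumptions)."""
    site: str         # site hosting the service
    service: str      # service name
    scope: str        # "local" or "global"
    up: bool          # True if the last health probe succeeded
    timestamp: float  # Unix time of the probe, in seconds


def to_push_payload(event: ServiceEvent) -> str:
    """Push model: serialise the event as JSON for a message queue or central store."""
    return json.dumps(asdict(event), sort_keys=True)


def _escape(value: str) -> str:
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_prometheus_text(events: list[ServiceEvent]) -> str:
    """Pull model: render events as exposition lines a central Prometheus could scrape."""
    lines = [
        "# HELP srcnet_service_up 1 if the last health probe succeeded, else 0.",
        "# TYPE srcnet_service_up gauge",
    ]
    for e in events:
        labels = (
            f'site="{_escape(e.site)}",'
            f'service="{_escape(e.service)}",'
            f'scope="{_escape(e.scope)}"'
        )
        lines.append(f"srcnet_service_up{{{labels}}} {int(e.up)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values are placeholders, not real SRCNet sites or services.
    event = ServiceEvent(
        site="site-a",
        service="example-service",
        scope="local",
        up=True,
        timestamp=1700000000.0,
    )
    print(to_push_payload(event))
    print(to_prometheus_text([event]), end="")
```

      Keeping a single event definition with two renderings would let the push vs pull decision be deferred: sites could emit the same fields whether they publish to a queue or expose an endpoint for scraping.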

              People

                b.mort Mort, Ben
                r.joshi Joshi, Rohini

                Feature Progress

                  Story Point Burn-up: (15.38%)

                  Feature Estimate: 1.5

                                Issues   Story Points
                  To Do         6        11.0
                  In Progress   0        0.0
                  Complete      1        2.0
                  Total         7        13.0
