  1. SAFe Program
  2. SP-1886

Implement Liveness, Readiness, and Startup Probes for SDP services


Details

    • Enabler
    • Must have
    • PI13
    • COM SDP SW
    • None
    • Data Processing

      Probes allow the platform to check whether components are responsive, and therefore to detect both successful start-up and degradation events. Combined with robust component restart (see SP-1884), this should enable the SDP sub-system to self-heal in a number of failure scenarios.

      • Define the nature of the probe (i.e. the interface to check) for every part of the sub-system, weighing false negatives (service unresponsive, probe succeeds) against false positives (service responsive, probe fails). Options might include, but are not limited to:
        1. Simple "does process exist"
        2. Check that a lease key exists in the configuration database (leases need constant keep-alive messages from the creator, or get deleted)
        3. Check that Tango devices exist, attributes are readable and/or a "ping" command gets accepted
        4. When in doubt, add a simple local HTTP server to the component solely so that it can answer probes
      • Implement probes as designed
      • Stretch: an integration test demonstrating that a probe on a critical component works (the Tango interface? That's where we had a deadlock at least once!)
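
      As a starting point for option 4, here is a minimal sketch of a local HTTP probe endpoint that uses only the Python standard library. It is not an agreed design. It assumes the component has a main loop that can call `beat()` regularly, so that a deadlocked loop makes the probe fail even though the process still exists (the false-negative case of option 1). The names `Heartbeat`, `start_probe_server`, the `/healthz` path and the default port are hypothetical.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Heartbeat:
    """Timestamp that the component's main loop refreshes on every iteration."""

    def __init__(self, max_age_s: float):
        self.max_age_s = max_age_s
        self._last = time.monotonic()
        self._lock = threading.Lock()

    def beat(self) -> None:
        # Called by the main loop to signal that it is still making progress.
        with self._lock:
            self._last = time.monotonic()

    def is_alive(self) -> bool:
        # True while the last beat is recent enough.
        with self._lock:
            return time.monotonic() - self._last <= self.max_age_s


def make_handler(heartbeat: Heartbeat):
    class ProbeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/healthz":  # hypothetical probe path
                self.send_error(404)
                return
            ok = heartbeat.is_alive()
            body = b"ok\n" if ok else b"stale heartbeat\n"
            # 200 tells the prober the component is healthy, 503 that it is not.
            self.send_response(200 if ok else 503)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep probe traffic out of the component's logs

    return ProbeHandler


def start_probe_server(heartbeat: Heartbeat, host: str = "127.0.0.1", port: int = 8080):
    """Serve probes on a daemon thread, separate from the main loop."""
    server = ThreadingHTTPServer((host, port), make_handler(heartbeat))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

      The server runs on its own thread, so a hung main loop cannot block the probe response. The trade-off lies in `max_age_s`: too short a value causes false positives during legitimately long iterations, and too long a value delays detection of a real deadlock.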
    • 1
    • 5
    • 5
    • 12.5
    • PI24 - UNCOVERED

          People

            D.Fenech Fenech, Danielle
            b.mort Mort, Ben

            Feature Progress

              Story Point Burn-up: (0%)

              Feature Estimate: 1.0

                            Issues   Story Points
              To Do         0        0.0
              In Progress   0        0.0
              Complete      0        0.0
              Total         0        0.0
