Platform-level services need to be easy to find and access, and need appropriate safeguards in place, such as authentication. To achieve this, all services need to be DNS addressable, integrated with AzureAD authentication (where appropriate/feasible), and indexed in our platform documentation, so that our users can find and interact with them.
See Problem Solving Workshop session 3 - https://miro.com/app/board/uXjVN4TI7cU=/?moveToWidget=3458764580717560138&cot=14
Restated Problem:
From the Services ART perspective, the ART built monitoring tools (e.g. Prometheus) and a logging system (Kibana, Elastic Stack) that are available to everyone for debugging, but these do not appear to be used effectively.
The primary reasons they are not being used effectively are:
- DevPortal has some shortcomings:
  - Not perceived to be the single entry point for software development information
  - Information is not consolidated; it is also scattered across Confluence and Google Docs
  - Developers do not know when the DevPortal is updated with useful information
  - DevPortal does not list all tools involved in this sort of analysis
  - DevPortal search functionality is poor
  - DevPortal does not list access paths for different datacenters
- Kibana has some shortcomings:
  - Is slow across different environments (MID, LOW, STFC)
  - Does not show the pipeline logs
  - Sometimes does not work at all
The problem occurs throughout the PI and results in delayed problem resolution, which impacts timely delivery and risks wasted effort building tools that are not used effectively.