  1. SAFe Program
  2. SP-3000

Data Discovery/Server Discovery Service Definition


Details

    • Feature
    • Must have
    • PI17
    • None
    • SRCnet

      Currently, there is no clear way to avoid latency when visualising big data cubes within the SRC Net data lake. Because these files may reside in different repositories, a preliminary transfer (or stream) of the data cube to the site where the parser server is located (tools usually use a client-server architecture to parse big files) could cause a high-latency start-up phase before the visualisation begins. It could also put pressure on cache storage and network resources.

      A discovery service could avoid this data movement by allowing remote operations to run where the data is located.


      AC1: Discussion and definition of the discovery service specification. Agree on the input parameters and the output response (including example output responses per tool) that the tools need in order to communicate. The specification should also describe the metadata included in the response and provide the expected response (or responses). It should stay as close as possible to TAP/ObsCore so that changes to the implementation (ObsCore extension) are minimised.

      AC2: Identification of the metadata needed to characterise the parsing services of the visualisation tools.

    • 2
    • 2
    • 0
    • Team_ORANGE
    • Sprint 3

      Outcome in this Confluence page

    • 20.6
    • Accepted by FO
    • PI23 - UNCOVERED

    • SRC-Vis SRCPB

    Description

      To visualise big data files in the SRC Net data lake without the latency of moving data between repositories before parsing starts, the following approach is proposed:

      • Visualisation tool servers (ideally containerised versions) will be deployed in all the SRCs that hold big data files to be visualised (e.g. big data cubes)
      • A REST (or similar) service will be used to discover the server that is close to a particular data entity
      • The visualisation tool will connect to the selected remote server to start the visualisation

      This could be integrated into an ObsCore response, in line with the VLKB ObsCore service definition.

      For every record (observation or file level), the service invocation would return one entry with the location of the server to contact to parse the remote file. If two or more copies of the data exist across the network for high availability, it would return an equivalent number of entries, marked as primary and secondary locations. Where multiple copies offer a similar quality of service in terms of storage repository I/O rate, the IP address used to invoke the service could be used to identify the closest server location. This enhancement is not needed for a proof of concept (it is a possible idea for the science data challenge).
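
      The entry-per-copy behaviour described above could be sketched as follows. This is only an illustration under stated assumptions: the names `Replica`, `ServerEntry` and `discover`, the `io_rate_mbps` field, and the rule of ranking copies by I/O rate are all hypothetical and are not part of any agreed specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One copy of a data record (hypothetical model)."""
    src: str              # name of the SRC holding this copy
    io_rate_mbps: float   # assumed quality-of-service metric for the storage


@dataclass(frozen=True)
class ServerEntry:
    """One entry of the discovery response (hypothetical model)."""
    record_id: str
    server_url: str
    role: str             # "primary" or "secondary"


def discover(record_id, replicas, servers_by_src):
    """Return one entry per copy whose SRC hosts a visualisation server.

    Assumption: the copy with the highest I/O rate is marked primary and
    all other copies are marked secondary.
    """
    candidates = [r for r in replicas if r.src in servers_by_src]
    ranked = sorted(candidates, key=lambda r: r.io_rate_mbps, reverse=True)
    return [
        ServerEntry(record_id, servers_by_src[r.src],
                    "primary" if index == 0 else "secondary")
        for index, r in enumerate(ranked)
    ]


# Example with made-up sites and URLs.
entries = discover(
    "cube-001",
    [Replica("src-a", 200.0), Replica("src-b", 800.0)],
    {"src-a": "https://vis.src-a.example/", "src-b": "https://vis.src-b.example/"},
)
```

      A record with a single copy yields a single primary entry, and a record whose copies are all in SRCs without a server yields an empty list.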

      Notes:

      • Tools should send a User-Agent header with the HTTP/HTTPS request, so that the response can be personalised for each tool, pointing to its relevant server locations and adding any metadata needed to allow the connection (e.g. server ports or any other server signature parameters)
      • The internal hierarchy of this service could be inspired by the IVOA DataLink standard https://www.ivoa.net/documents/DataLink/
      • This also implies that a mapping of SRC->servers should be available in a queryable database or similar. This is also not needed for the proof of concept
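
      The User-Agent personalisation and the SRC->servers lookup in the notes above could be sketched as follows. This is a minimal illustration, not a proposed implementation: the `TOOL_SERVERS` table, the tool name, hosts and ports, and both function names are hypothetical, and an in-memory dictionary stands in for the queryable database.

```python
# Hypothetical SRC->servers mapping, keyed by (SRC name, tool name).
TOOL_SERVERS = {
    ("src-a", "toolx"): {"host": "vis.src-a.example", "port": 3002},
    ("src-b", "toolx"): {"host": "vis.src-b.example", "port": 3002},
}


def tool_from_user_agent(user_agent):
    """Extract a lower-case tool name from a User-Agent value.

    Assumes the common "Product/Version (comment)" layout,
    e.g. "ToolX/1.2 (linux)" gives "toolx".
    """
    product = user_agent.strip().split(" ", 1)[0]
    return product.split("/", 1)[0].lower()


def servers_for(user_agent, srcs):
    """Return the server metadata of the calling tool for each given SRC.

    SRCs without a registered server for that tool are skipped.
    """
    tool = tool_from_user_agent(user_agent)
    return [
        {**TOOL_SERVERS[(src, tool)], "src": src}
        for src in srcs
        if (src, tool) in TOOL_SERVERS
    ]


# Example: a tool asking about a record stored in two SRCs.
result = servers_for("ToolX/1.2 (linux)", ["src-a", "src-c"])
```

      An unknown or missing User-Agent simply produces an empty list here; how the real service should respond in that case is left open.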

       

      This feature covers only the definition of the service, not its implementation.

              People

                Jesus.Salgado Salgado, Jesus

                Feature Progress

                  Story Point Burn-up: (100.00%)

                  Feature Estimate: 2.0

                  Status        Issues  Story Points
                  To Do         0       0.0
                  In Progress   0       0.0
                  Complete      7       20.0
                  Total         7       20.0
