SAFe Program / SP-4086

Update example workflows to use SRCNet astroquery module for data download


Details

    • SRCnet

      Adoption of the examples in the src-workloads repo depends on the workflows being understandable to a wide range of audiences. There is therefore value in providing clear, accessible examples (such as notebooks) alongside scripts designed to run with no user intervention.

      Additionally, src-workloads provides a common set of tools for functional and performance testing of the SRCNet software. Storing the data in the data lake is therefore a useful demonstration, and gives users examples they can repurpose for their own tasks.


      AC1: The image-mosaicking task can retrieve data from the Rucio data lake using Astroquery and run against it. An example notebook is provided.

      AC2: The image-cutouts task can retrieve data from the Rucio data lake using Astroquery and run against it. An example notebook is provided.

      AC3: The source-finding task can retrieve data from the Rucio data lake using Astroquery and run against it. An example notebook is provided.

      AC4 (stretch): crossmatching, CNN classifier

    • Intra Program
    • 2
    • 2
    • 0
    • PI23 - UNCOVERED

    • example-workflows-and-benchmarks tests-compilation

    Description

      Continue the work shown in PI21 by extending it to more example workflows, so that we have notebook (Astroquery) examples that use data from an SRCNet location or an external archive source rather than a local copy.

      In PI21, the image-coadding task (https://gitlab.com/ska-telescope/src/src-workloads/-/tree/master/tasks/image-coadding-swarp) was modified. The source data was added to the Rucio data lake, with metadata, and the Dockerfiles were modified so that a) the SRCNet astroquery module was included, and b) a user could pass in an access token, which allows astroquery to download the data from Rucio (with the fallback, if no token is passed, being to use the SDSS SAS). Additionally, a Jupyter notebook was added to the repo as an example of how to discover and access the data using the Astroquery module.

      For this feature, the remaining tasks should, where possible, be modified similarly. This will involve:

      • Adding tasks' requisite data into the src-workloads scope of the Rucio data lake, with metadata set.
      • Modifying Dockerfiles to allow an optional DM_ACCESS_TOKEN env var to be set
      • Modifying download scripts to retrieve data from Rucio if the token is set
      • Providing an example notebook that demonstrates how to discover/download data, and then run the task.
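
      The token-or-fallback behaviour described above could be sketched roughly as follows. This is only an illustrative outline, not the code in src-workloads: the function names get_access_token and fetch_task_data are hypothetical, and the two download callables are placeholders standing in for whatever the SRCNet astroquery module and each task's existing download script actually provide. Only the DM_ACCESS_TOKEN variable name comes from this ticket.

```python
import os
from typing import Callable, Mapping, Optional

# Env var name taken from the ticket; everything else here is hypothetical.
TOKEN_ENV_VAR = "DM_ACCESS_TOKEN"


def get_access_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the access token if it is set and non-empty, otherwise None."""
    env = os.environ if env is None else env
    token = env.get(TOKEN_ENV_VAR, "").strip()
    return token or None


def fetch_task_data(
    dest_dir: str,
    download_from_rucio: Callable[[str, str], None],
    download_from_fallback: Callable[[str], None],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Download a task's input data into dest_dir and report the source used.

    download_from_rucio(token, dest_dir) and download_from_fallback(dest_dir)
    are placeholders supplied by the caller; they are not real library calls.
    """
    token = get_access_token(env)
    if token is not None:
        # Token supplied: retrieve the data from the Rucio data lake.
        download_from_rucio(token, dest_dir)
        return "rucio"
    # No token: keep the task's original download path (e.g. a public archive).
    download_from_fallback(dest_dir)
    return "fallback"
```

      Passing the download functions in as arguments keeps the selection logic independent of any particular client, so the same helper could be reused across tasks and exercised in tests without network access.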


              People

                r.bolton Bolton, Rosie

                Feature Progress

                  Story Point Burn-up: (0%)

                  Feature Estimate: 2.0

                                Issues  Story Points
                  To Do         0       0.0
                  In Progress   0       0.0
                  Complete      0       0.0
                  Total         0       0.0
