SAFe Program / SP-3867

Scale up a task in the Example workflow repository


Details

    • SRCnet
    • All of the current tasks in the workloads GitLab repository use small/minimal datasets to demonstrate the task. Moving forward we want to scale some of these up to use much larger datasets. This is partly to test that the task can handle significant datasets, and partly to test the compute infrastructure running the task. Note that this is not just running the same task over and over again, but using a significantly larger dataset.
    • AC1: Develop a scaled-up version of PYBDSF: run on multiple images.

      AC2: Add the alternate version to GitLab.
    • 1
    • 1
    • 0
    • Team_MAGENTA
    • Sprint 5
    • Example notebook: https://gitlab.com/ska-telescope/src/src-workloads/-/blob/master/tasks/source-finding-pybdsf/jupyter/scripts/pybdsf-sf-with-rucio.ipynb?ref_type=heads
      Demoed: https://confluence.skatelescope.org/display/SRCSC/2024-02-15+SRC+ART+System+Demo+21.5+Part+2+PM
    • 21.6
    • Stories Completed, Demonstrated, Satisfies Acceptance Criteria, Accepted by FO
    • PI24 - UNCOVERED

    • Team_Magenta

    Description

      All of the current tasks in the workloads GitLab repository use small/minimal datasets to demonstrate the task. Moving forward we want to scale some of these up to use much larger datasets. This is partly to test that the task can handle significant datasets, and partly to test the compute infrastructure running the task. This will significantly increase the compute time, and will help test more realistic workloads being run at SRCs. Note that this is not just running the same task over and over again, but using a significantly larger dataset.

      This feature would select at least one of the tasks and scale it up. It could be split into multiple features, one per task that is relevant to scale up. Candidate tasks include:

      • Run PYBDSF source-finding on a significantly larger set of images (see the sketch after this list).
      • Mosaicking large areas of sky.
      • Image cutouts for thousands/millions of sources from multiple images.
      • Image convolution: convolving many images to alter the resolution.
      • CNN image classifier: use a much larger dataset of images.
      • Develop a script that runs all tasks in their default state, to test the functionality of an SRC node.
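
      The PYBDSF item corresponds directly to AC1. Below is a minimal Python sketch of what a scaled-up run could look like, assuming PyBDSF is installed as the bdsf package and the input FITS images have already been staged locally (the linked task notebook retrieves data via Rucio). The directory names and source-finding thresholds are illustrative assumptions, not values taken from the repository.

          # Sketch: run the PyBDSF source finder over a directory of FITS images
          # and write one source catalogue per image. Paths and thresholds are
          # example values, not the repository's actual configuration.
          from pathlib import Path

          import bdsf  # PyBDSF

          image_dir = Path("images")          # assumed local staging area for FITS images
          catalogue_dir = Path("catalogues")  # output directory for per-image catalogues
          catalogue_dir.mkdir(exist_ok=True)

          for fits_path in sorted(image_dir.glob("*.fits")):
              # Source finding on a single image; thresholds are illustrative.
              result = bdsf.process_image(str(fits_path), thresh_isl=3.0, thresh_pix=5.0)
              # Write the source list (srl) catalogue for this image.
              result.write_catalog(
                  outfile=str(catalogue_dir / (fits_path.stem + "_srl.fits")),
                  format="fits",
                  catalog_type="srl",
                  clobber=True,
              )

      Scaling up then amounts to pointing the same loop at a much larger image set, which also exercises the compute infrastructure (memory, I/O and runtime) of the SRC node running the task.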


People

    Bolton, Rosie (r.bolton)
    Clarke, Alex (A.Clarke)
    Votes: 0
    Watchers: 1

Feature Progress

    Story Point Burn-up: 100.00%
    Feature Estimate: 1.0

                   Issues   Story Points
    To Do          0        0.0
    In Progress    0        0.0
    Complete       3        5.0
    Total          3        5.0

Dates

    Created:
    Updated:
    Resolved:
