SAFe Program / SP-4109

Integrate and demonstrate multi-node imaging distribution in Low and Mid AA2 self-calibration pipelines

Details

    • Data Processing

      To reach the performance required for AA2 commissioning and science verification, processing within the self-calibration pipeline must be distributed.

      See details here: https://miro.com/app/board/uXjVN2k7zp0=/?moveToWidget=3458764579524561331&cot=14 

      • Demonstrate distribution of the self-calibration pipelines across 3-10 nodes (SCHAAP).
      • Systematically monitor CPU and I/O performance in all stages of the pipeline and identify bottlenecks, i.e. if running on 10 nodes does not give a roughly 10x speedup, explain why not (SCHAAP).
      • For the self-calibration pipeline using DP3 and WSClean, we have documented a "reasonable" extrapolation of the performance we would expect for Low/Mid AA2 ICAL, and compared it with an objective formulated at the PI planning meeting (SCHAAP).
    • 18
    • 18
    • 0
    • Team_HIPPO, Team_SCHAAP
    • Sprint 5
    • PI22 - UNCOVERED

    Description

      Whilst distribution of the calibration stage has been demonstrated within both the Mid and Low pipelines, the imaging and deconvolution stages have so far been exercised only with single-node processing. This work aims to demonstrate that these latter stages can scale by exercising multi-node distribution over ~10 nodes. It should be run as part of the self-calibration pipeline to understand overall performance and ability to scale.
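
One way to quantify the "roughly 10x speedup on 10 nodes" question from the acceptance criteria is to compute speedup, parallel efficiency and an estimated serial fraction from measured wall-clock times. The sketch below is illustrative only: the function names and the timings in the usage example are hypothetical and not taken from the pipelines, and the serial-fraction estimate uses the standard Karp-Flatt metric, which is an assumption about how the team might choose to analyse the results.

```python
def speedup(t_single: float, t_multi: float) -> float:
    """Ratio of single-node to multi-node wall-clock time."""
    return t_single / t_multi


def parallel_efficiency(t_single: float, t_multi: float, n_nodes: int) -> float:
    """Speedup divided by node count; 1.0 means ideal linear scaling."""
    return speedup(t_single, t_multi) / n_nodes


def karp_flatt_serial_fraction(t_single: float, t_multi: float, n_nodes: int) -> float:
    """Experimentally determined serial fraction (Karp-Flatt metric).

    Requires n_nodes > 1. A value that grows with n_nodes suggests
    overhead (I/O, communication) rather than inherently serial work.
    """
    s = speedup(t_single, t_multi)
    return (1.0 / s - 1.0 / n_nodes) / (1.0 - 1.0 / n_nodes)


def amdahl_speedup(serial_fraction: float, n_nodes: int) -> float:
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_nodes)


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    t1, t10, nodes = 1000.0, 250.0, 10
    frac = karp_flatt_serial_fraction(t1, t10, nodes)
    print(f"speedup:         {speedup(t1, t10):.2f}")
    print(f"efficiency:      {parallel_efficiency(t1, t10, nodes):.2f}")
    print(f"serial fraction: {frac:.3f}")
    print(f"Amdahl check:    {amdahl_speedup(frac, nodes):.2f}")
```

Applying the same calculation per pipeline stage (calibration, imaging, deconvolution) rather than only to the end-to-end time would show which stage limits scaling.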

      Related work (SP-4089) will clarify the appropriate parameters for testing this capability, but the tests should include representative precursor data and simulated data that can be realistically extrapolated to AA2 scale in order to gauge expected performance.

      We are now aiming to make the pipelines available to appropriate stakeholders, which requires that they are versioned, documented and released with appropriate information for setup and usage. For project-wide guidance as a starting point, see the developer docs (https://developer.skao.int/en/latest/tools/documentation.html#documenting-the-public-api) and the release management pages (https://confluence.skatelescope.org/display/SE/Guidelines%3A+Release+Management+Process).

      See more details here: https://miro.com/app/board/uXjVN2k7zp0=/?moveToWidget=3458764579524561331&cot=14 

              People

                D.Fenech Fenech, Danielle

                Feature Progress

                  Story Point Burn-up: (73.95%)

                  Feature Estimate: 18.0

                  Status        Issues   Story Points
                  To Do              4            9.5
                  In Progress        4            6.0
                  Complete          20           44.0
                  Total             28           59.5
