SAFe Program / SP-3859

Configurable self-calibration pipeline

Details

    • Data Processing
    • See Why? in description
    • See What? in description
    • Intra Program
    • 13
    • 13
    • 0
    • Team_HIPPO, Team_SCHAAP
    • Sprint 5
    • Overdue
    • HIPPO PI21:
      The documentation was updated and a new release was published in the CAR.

      Following agreement with FO, it was decided not to implement multiple stages of DDECal in PI21 and to implement multi-scale and wideband deconvolution instead.

      During PI21 the MID Self-cal pipeline was enhanced with new features, and a number of technical debts were resolved. The following were added or changed:

       * The data model classes, which define how sky models, sources, calibration patches/facets and sky tessellations are represented internally, were substantially improved. For example, this has enabled arbitrary facet creation strategies to be implemented entirely de-coupled from the rest of the code, allowing more developers to work independently.
       * Sky model clustering and filtering were added at the end of each self-calibration cycle to help with convergence.
       * The ability to run both multi-scale and wideband deconvolution was enabled. Wideband deconvolution was found to have a measurable positive impact on RMS residual. (https://jira.skatelescope.org/browse/HIP-794)
       * The tessellation code was improved so that it works reliably with unfiltered, real-world source lists returned by WSClean and creates facets with better flux balance. This is expected to improve calibration solutions and calibration convergence speed in general. This thread of work will carry on into the next PI. (https://jira.skatelescope.org/browse/HIP-808)
       * The MID pipeline can now reliably perform an arbitrary number of self-calibration cycles on real-world data. Previously, in PI20, there were edge cases where bad facet definitions or sky models could be passed to the next calibration cycle.
       * An integration test for the whole pipeline was added, which significantly improves test coverage (presented in DP System Demo 21.6). It is expected to run systematically as part of the CI pipeline soon. (https://jira.skatelescope.org/browse/HIP-806, https://jira.skatelescope.org/browse/HIP-822) To keep the test fast, a special tiny version of the AA2 Mid dataset was created last PI. (https://jira.skatelescope.org/browse/HIP-741)
       * Better logging and a system monitoring app were added to aid benchmarking and the identification of bottlenecks in WSClean. This was presented in System Demo 21.5 and is expected to provide new insights in the next round of benchmarks/scaling tests.
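
      The cycle structure described in the list above can be sketched as follows. This is only an illustration, not the pipeline's actual code: `Source`, `filter_sky_model`, `run_selfcal`, the `image_and_extract` callback and the flux threshold are all hypothetical names and parameters. The sketch shows a filtering step at the end of each cycle and a guard so that an empty sky model is never passed to the next cycle.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Source:
    """Hypothetical minimal sky-model entry."""
    name: str
    ra_deg: float
    dec_deg: float
    flux_jy: float


def filter_sky_model(sources: List[Source], min_flux_jy: float) -> List[Source]:
    """Keep only sources at or above a flux threshold (assumed criterion)."""
    return [s for s in sources if s.flux_jy >= min_flux_jy]


def run_selfcal(
    initial_model: List[Source],
    num_cycles: int,
    image_and_extract: Callable[[List[Source], int], List[Source]],
    min_flux_jy: float,
) -> List[Source]:
    """Run up to num_cycles cycles, filtering the sky model after each one.

    image_and_extract stands in for calibration, imaging and source
    extraction; it receives the current model and the cycle index.
    """
    model = list(initial_model)
    for cycle in range(num_cycles):
        extracted = image_and_extract(model, cycle)
        filtered = filter_sky_model(extracted, min_flux_jy)
        if not filtered:
            # Guard: never hand an empty model to the next cycle;
            # stop and keep the last usable one instead.
            break
        model = filtered
    return model
```

      The number of cycles is an ordinary argument here, so running more cycles does not require any structural change to the loop.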

      The MID pipeline was also tested on a MeerKAT dataset (https://archive.sarao.ac.za/search/20190627-0017/ ; https://arxiv.org/abs/2110.00347).
      While processing these data with the MID pipeline, several issues were identified, such as a missing bootstrapping sky model for MeerKAT, a bug in DDECal due to which it does not apply the MeerKAT primary beam, the lack of a suitable faceting method, and the need to filter the sky model produced by WSClean. Details are on the Confluence page https://confluence.skatelescope.org/display/SE/Getting+a+MeerKAT+dataset+through+the+Mid+pipeline.

      Based on these findings, we identified the following improvements to be made to the MID Self-cal pipeline:
       * improve the tessellation so that it reflects extended sources and balances flux across facets,
       * enable multi-scale and wideband deconvolution in the MID Self-cal pipeline,
       * implement a sky model filtering step to improve source localization.
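
      As a rough illustration of what "balance out flux per facet" could mean in practice, the sketch below assigns each source to its nearest facet centre and reports the ratio between the brightest and faintest facet. It is an assumption-laden toy, not the pipeline's tessellation code: the function names, the `(ra_deg, dec_deg, flux_jy)` tuple layout and the max/min imbalance metric are all hypothetical, and the distance uses a small-field approximation that ignores RA wrap-around.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]          # (ra_deg, dec_deg)
SourceRow = Tuple[float, float, float]  # (ra_deg, dec_deg, flux_jy)


def approx_separation_deg(a: Point, b: Point) -> float:
    """Small-field separation: RA offset scaled by cos(mean declination).

    Ignores RA wrap-around at 0/360 degrees; adequate only for a toy example.
    """
    mean_dec = math.radians(0.5 * (a[1] + b[1]))
    d_ra = (a[0] - b[0]) * math.cos(mean_dec)
    d_dec = a[1] - b[1]
    return math.hypot(d_ra, d_dec)


def flux_per_facet(sources: List[SourceRow], centres: List[Point]) -> List[float]:
    """Sum source flux per facet, assigning each source to its nearest centre."""
    totals = [0.0] * len(centres)
    for ra, dec, flux in sources:
        nearest = min(
            range(len(centres)),
            key=lambda i: approx_separation_deg((ra, dec), centres[i]),
        )
        totals[nearest] += flux
    return totals


def flux_imbalance(totals: List[float]) -> float:
    """Ratio of brightest to faintest facet; 1.0 means perfectly balanced."""
    lowest = min(totals)
    return max(totals) / lowest if lowest > 0 else math.inf
```

      A facet that receives no flux at all yields an infinite imbalance, which flags a tessellation in which some facet would have nothing to calibrate against.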

       

    • PI22 - UNCOVERED

    • Low G4 Mid G3

    Description

      See frame in PI21 Backlog board


      Who? (Beneficiaries)

      • System scientists (Commissioning).
      • Commissioning and Operations staff planning for science commissioning & verification ahead of and during AA2.

      Why? (Benefit hypothesis)

      • We need to start moving towards a production-ready pipeline in order to be ready for AA2.
      • This especially means being able to parametrise the pipeline for different self-calibration loops and input parameter setups.
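
      One way to picture such parametrisation is a configuration object that lists the self-calibration cycles and their settings. The sketch below is hypothetical: `CycleConfig`, `PipelineConfig` and every field name (`solution_interval`, `multiscale`, `wideband_channels`) are illustrative assumptions and do not describe the pipeline's real interface.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class CycleConfig:
    """Settings for one self-calibration cycle (all fields hypothetical)."""
    solution_interval: int          # time slots per calibration solution
    multiscale: bool = False        # enable multi-scale deconvolution
    wideband_channels: int = 1      # 1 means no wideband deconvolution

    def __post_init__(self) -> None:
        if self.solution_interval < 1:
            raise ValueError("solution_interval must be >= 1")
        if self.wideband_channels < 1:
            raise ValueError("wideband_channels must be >= 1")


@dataclass(frozen=True)
class PipelineConfig:
    """An ordered sequence of cycles; its length sets the number of loops."""
    cycles: Tuple[CycleConfig, ...]

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "PipelineConfig":
        """Build a validated config from a plain dict, e.g. parsed YAML/JSON."""
        return cls(cycles=tuple(CycleConfig(**c) for c in raw["cycles"]))
```

      With this shape, a different number of loops or a different per-loop setup is expressed purely as data, and invalid values are rejected when the configuration is loaded rather than part-way through a run.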

      What? (Acceptance criteria)

      • An identifiable release of the pipeline, runnable on relevant platforms (CSD3). (Low: SCHAAP, Mid: HIPPO)
      • Usage examples for at least standard agreed AA2 relevant data sets (covering at least Low & Mid), and performance results measured against these. (SCOOP, SCHAAP, HIPPO)
      • Document expected data products produced by the pipeline; this should include a description of pipeline parameters and the range of outputs produced as a result. (Mid: HIPPO, Low (stretch): SCHAAP)

    People

      D.Fenech Fenech, Danielle
      m.ashdown Ashdown, Mark

    Feature Progress

      Story Point Burn-up: (100.00%)

      Feature Estimate: 13.0

                    Issues   Story Points
      To Do         0        0.0
      In Progress   0        0.0
      Complete      38       76.0
      Total         38       76.0
