  1. SAFe Program
  2. SP-3822

Prototype visibility streaming pipeline with distributed facets

Details

    • Enabler
    • Should have
    • PI21
    • COM SDP SW
    • None
    • Data Processing
    • See Why? in description
    • See What? in description
    • 8
    • 8
    • 0
    • Team_HIPPO, Team_PANDO
    • Sprint 5
    • Overdue
    • HIPPO PI21: In PI21, development of the GPU gridder continued. We investigated other available gridders, such as the MeerKAT spectral-line GPU imager and DUCC (a CPU gridder). We examined the polynomial method that DUCC uses to evaluate convolution kernels, but it has not yet proven to be faster on GPUs. Improvements to the ska-sdp-func GPU gridder also continued: we added a lookup table for convolution kernel evaluation, which improved performance by a factor of 1.47. The investigation of the multi-GPU gridder also progressed, and an initial prototype has been added to the ska-sdp-func repository. The performance gain from multi-GPU is marginal at the moment, but it allows us to produce larger images than a single-GPU implementation can. We also created a version of the GPU gridder that grids visibilities into a sub-grid and does not produce an image. This is to support the PANDO team, who are planning to implement an imager based on SwiFTly. A modified version of the GPU gridder was added to the SKA SDP processing function library (https://gitlab.com/ska-telescope/sdp/ska-sdp-func; https://jira.skatelescope.org/browse/HIP-820). The API is currently provisional and awaiting integration tests. It was discussed with the FO (Peter Wortmann) and with the PANDO team, who are planning to use it in tests with a streaming imager.
    • PI22 - UNCOVERED

    • Low G4 Mid G3
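
    The lookup-table kernel evaluation and the sub-grid-only gridding mentioned in the HIPPO PI21 notes can be illustrated with a small CPU sketch. This is not the ska-sdp-func implementation or its API: the kernel shape (an exponential-of-semicircle form), the value of BETA, the nearest-sample lookup and all function names are assumptions chosen for illustration only.

```python
import math

BETA = 9.0  # illustrative kernel shape parameter (assumption, not from the ticket)


def es_kernel(x, beta=BETA):
    """Illustrative kernel on |x| <= 1 (x is the offset in units of half the support)."""
    if abs(x) > 1.0:
        return 0.0
    return math.exp(beta * (math.sqrt(1.0 - x * x) - 1.0))


def build_lut(beta, n_samples):
    """Sample the symmetric kernel once on [0, 1]."""
    return [es_kernel(i / (n_samples - 1), beta) for i in range(n_samples)]


def kernel_from_lut(x, lut):
    """Nearest-sample lookup: one index operation instead of exp and sqrt."""
    if abs(x) > 1.0:
        return 0.0
    return lut[round(abs(x) * (len(lut) - 1))]


def grid_to_subgrid(uv, vis, size, support, lut):
    """Accumulate visibilities onto a size x size sub-grid (no FFT, no image).

    uv holds (u, v) positions in sub-grid pixel units; vis holds complex values.
    Contributions that fall outside the sub-grid are dropped.
    """
    subgrid = [[0j] * size for _ in range(size)]
    half = support / 2.0
    for (u, v), value in zip(uv, vis):
        v_lo = max(0, math.ceil(v - half))
        v_hi = min(size - 1, math.floor(v + half))
        u_lo = max(0, math.ceil(u - half))
        u_hi = min(size - 1, math.floor(u + half))
        for iv in range(v_lo, v_hi + 1):
            kv = kernel_from_lut((iv - v) / half, lut)
            for iu in range(u_lo, u_hi + 1):
                ku = kernel_from_lut((iu - u) / half, lut)
                subgrid[iv][iu] += value * kv * ku
    return subgrid


lut = build_lut(BETA, 1024)

# Largest difference between the lookup and the direct evaluation on a fine scan.
max_err = max(
    abs(kernel_from_lut(x, lut) - es_kernel(x))
    for x in (-1.0 + 2.0 * i / 4999 for i in range(5000))
)

subgrid = grid_to_subgrid(
    uv=[(8.0, 8.0), (5.3, 10.7)],
    vis=[1.0 + 0.0j, 0.5 - 0.5j],
    size=16,
    support=4,
    lut=lut,
)
```

    The table size trades memory for accuracy: max_err shrinks as n_samples grows, and linear interpolation between samples would be an alternative to the nearest-sample lookup used here.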

    Description

      See frame in PI21 Backlog board


      Who? (Beneficiaries)

      • Pipeline developers.
      • System scientists (Commissioning).
      • Commissioning and Operations staff planning for science commissioning & verification ahead of and during AA2.

      Why? (Benefit hypothesis)

      • The scaling strategy implemented in AA2 pipelines might not be enough to cover AA* Mid ICAL use cases.
      • We also need to evaluate technology options:
        • on the storage side (the existing measurement set implementation is a known and recurring bottleneck, even though it likely won't matter until large scales);
        • on execution frameworks and networking (we have demonstrated that Dask can orchestrate a pipeline, but can it provide the necessary throughput?);
        • on portability to compute platforms (e.g. accelerators).

      What? (Acceptance criteria)

      • Implement a performance prototype that streams visibilities while holding facet data in memory in a distributed fashion.
        • An obvious option would be to start with the "distributed Fourier Transformation / SwiFTly" implementation. (Gridder: HIPPO; memory access patterns and gridding of zarr stores: PANDO)
      • Investigate performance, especially considering scheduler and network throughput.
      • Demonstrate integration with:
        • processing functions;
        • storage backends for loading visibilities.
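
      The first acceptance criterion (streaming visibilities past facets that stay resident in memory) can be sketched in a few lines. This is a toy model under stated assumptions, not the SwiFTly algorithm and not the prototype itself: plain in-process objects stand in for distributed workers, a direct Fourier sum stands in for gridding, and every name (FacetWorker, stream_chunks, image_from_stream, simulate_point_source) is hypothetical.

```python
import cmath
import random


class FacetWorker:
    """Hypothetical stand-in for a worker that keeps one facet resident in memory."""

    def __init__(self, row0, col0, facet_size, image_size):
        self.row0, self.col0 = row0, col0
        centre = image_size // 2
        self.ls = [(col0 + j - centre) / image_size for j in range(facet_size)]
        self.ms = [(row0 + i - centre) / image_size for i in range(facet_size)]
        self.facet = [[0.0] * facet_size for _ in range(facet_size)]

    def accumulate(self, chunk):
        """Add one chunk of (u, v, vis) samples to the facet by direct Fourier sum."""
        for u, v, vis in chunk:
            for i, m in enumerate(self.ms):
                for j, l in enumerate(self.ls):
                    phase = cmath.exp(2j * cmath.pi * (u * l + v * m))
                    self.facet[i][j] += (vis * phase).real


def stream_chunks(samples, chunk_size):
    """Yield visibilities in chunks, as a storage backend or receiver might."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def image_from_stream(samples, image_size, facet_size, chunk_size):
    starts = range(0, image_size, facet_size)
    workers = [FacetWorker(r, c, facet_size, image_size) for r in starts for c in starts]
    for chunk in stream_chunks(samples, chunk_size):
        for worker in workers:  # in a distributed setting this would be a network send
            worker.accumulate(chunk)
    # Facets are only gathered once the stream is exhausted.
    image = [[0.0] * image_size for _ in range(image_size)]
    for worker in workers:
        for i, row in enumerate(worker.facet):
            image[worker.row0 + i][worker.col0:worker.col0 + facet_size] = row
    return image


def simulate_point_source(n_vis, image_size, src_row, src_col, seed=0):
    """Visibilities of a unit point source at pixel (src_row, src_col)."""
    rng = random.Random(seed)
    l0 = (src_col - image_size // 2) / image_size
    m0 = (src_row - image_size // 2) / image_size
    samples = []
    for _ in range(n_vis):
        u = rng.uniform(-image_size / 2, image_size / 2)
        v = rng.uniform(-image_size / 2, image_size / 2)
        samples.append((u, v, cmath.exp(-2j * cmath.pi * (u * l0 + v * m0))))
    return samples


IMAGE_SIZE, N_VIS, SRC_ROW, SRC_COL = 16, 64, 5, 11
samples = simulate_point_source(N_VIS, IMAGE_SIZE, SRC_ROW, SRC_COL)

# Four 8x8 facets fed in chunks of 10, versus one full-size facet fed in one pass.
image = image_from_stream(samples, IMAGE_SIZE, facet_size=8, chunk_size=10)
reference = image_from_stream(samples, IMAGE_SIZE, facet_size=IMAGE_SIZE, chunk_size=N_VIS)
```

      In this model every chunk is sent to every facet, so the data volume moved scales with the number of facets times the visibility rate. That is the scheduler and network throughput cost the second criterion asks to investigate.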

              People

                p.wortmann Wortmann, Peter
                m.ashdown Ashdown, Mark
                Votes: 0
                Watchers: 2

                Feature Progress

                  Story Point Burn-up: (93.42%)

                  Feature Estimate: 8.0

                                Issues   Story Points
                  To Do              2            3.0
                  In Progress        1            2.0
                  Complete          23           71.0
                  Total             26           76.0
