SP-1107

Enhance the GPU imaging/gain calibration pipeline code to support FP16 visibilities


Details

    • Enabler
    • Not Assigned
    • PI7, PI8
    • COM SDP SW
    • None
    • Data Processing
      This feature will enable the work in SP-419 to be realised, allowing comparisons between FP16 and FP32 representations of visibilities using data sets large enough to require hardware accelerators.

      Code tracked in GitLab with Kubernetes integration and suitable test cases.

      The following project files will be updated to provide a compiler switch to optionally handle representing visibilities in FP16:

      • CUDA kernels: gridder.cu (accepts visibilities and kernels in FP16 but accumulates on an FP32 grid), direct_fourier_transform.cu (outputs FP16 predicted visibilities), gains.cu (accepts FP16 predicted and measured visibilities, outputs FP16 gains).
      • C/C++ host: controller.cpp (accepts FP16 visibilities in ingest memory regions), wprojection.c (generates W-kernels using at least FP32 arithmetic but outputs kernels in FP16).

      The FP16 compiler switch should result in FP32 calculations for the other steps in the pipeline (e.g. grids, residual images and cleaned sources will be represented in FP32).
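The split between FP16 inputs and an FP32 grid can be illustrated with a small host-side sketch. It is not taken from the pipeline: the function names are hypothetical, the samples are real-valued rather than complex, and the conversion helpers are a plain C++ stand-in for the half type a CUDA build would use. They assume the default round-to-nearest-even floating-point mode.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <vector>

// Round an FP32 value to IEEE 754 binary16 and return the raw 16 bits.
// Assumes the default round-to-nearest-even floating-point environment.
std::uint16_t float_to_half_bits(float f) {
    const std::uint32_t sign = std::signbit(f) ? 0x8000u : 0u;
    const float a = std::fabs(f);
    std::uint32_t bits = 0;
    if (std::isnan(a)) {
        bits = 0x7E00u;                              // quiet NaN
    } else if (a >= 65520.0f) {
        bits = 0x7C00u;                              // overflow (or Inf) becomes Inf
    } else if (a < std::ldexp(1.0f, -14)) {
        // Zero or subnormal: representable values are multiples of 2^-24.
        bits = static_cast<std::uint32_t>(std::nearbyint(std::ldexp(a, 24)));
    } else {
        int e = 0;
        const float m = std::frexp(a, &e);           // a = m * 2^e, m in [0.5, 1)
        const std::uint32_t q =
            static_cast<std::uint32_t>(std::nearbyint(m * 2048.0f)); // 1024..2048
        // Biased exponent is e + 14; q == 2048 carries into the exponent field.
        bits = (static_cast<std::uint32_t>(e + 14) << 10) + (q - 1024u);
    }
    assert(bits <= 0x7E00u);                         // sanity check on the magnitude bits
    return static_cast<std::uint16_t>(sign | bits);
}

// Expand raw binary16 bits to FP32 (always exact).
float half_bits_to_float(std::uint16_t h) {
    const std::uint32_t sign = (h & 0x8000u) << 16;
    const std::uint32_t exp = (h >> 10) & 0x1Fu;
    const std::uint32_t man = h & 0x3FFu;
    if (exp == 0) {                                  // zero or subnormal
        const float mag = std::ldexp(static_cast<float>(man), -24);
        return sign ? -mag : mag;
    }
    const std::uint32_t bits = (exp == 31)
        ? (sign | 0x7F800000u | (man << 13))         // Inf or NaN
        : (sign | ((exp + 112u) << 23) | (man << 13));
    float out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}

// FP16 samples, FP32 accumulator (the arrangement described for the grid).
float sum_with_fp32_accumulator(const std::vector<std::uint16_t>& samples) {
    float acc = 0.0f;
    for (std::uint16_t h : samples) acc += half_bits_to_float(h);
    return acc;
}

// FP16 samples, FP16 accumulator: the running sum is rounded to half each step.
float sum_with_fp16_accumulator(const std::vector<std::uint16_t>& samples) {
    std::uint16_t acc = 0;
    for (std::uint16_t h : samples) {
        acc = float_to_half_bits(half_bits_to_float(acc) + half_bits_to_float(h));
    }
    return half_bits_to_float(acc);
}
```

For 4096 samples that each hold 1.0, the FP32 accumulator returns 4096, whereas the FP16 accumulator stalls at 2048, because 2049 is not representable in binary16 and the tie rounds back to 2048. This is one reason to keep the grid in FP32 even when the visibilities and kernels are stored in FP16.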

      Note that switching between FP64/FP32/FP16 will require the user to rebuild the project (since for performance the switch should result in conditional compilation, removing any unnecessary code).
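A compile-time precision switch of this kind might look like the following host-side sketch. The macro names (VIS_FP16, VIS_FP64), type aliases and struct names are assumptions made for illustration, since the ticket does not name the actual directive, and std::uint16_t stands in for the half type a CUDA build would probably use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical macro names: the ticket only states that a compiler switch exists.
#if defined(VIS_FP16) && defined(VIS_FP64)
#error "Select at most one of VIS_FP16 and VIS_FP64"
#endif

#if defined(VIS_FP16)
using vis_real_t  = std::uint16_t;   // raw binary16 bits (stand-in for a half type)
using grid_real_t = float;           // grids stay in FP32
constexpr const char* kVisPrecisionName = "FP16";
#elif defined(VIS_FP64)
using vis_real_t  = double;
using grid_real_t = double;
constexpr const char* kVisPrecisionName = "FP64";
#else
using vis_real_t  = float;           // default build: FP32 throughout
using grid_real_t = float;
constexpr const char* kVisPrecisionName = "FP32";
#endif

// One complex visibility sample in the selected storage precision.
struct Visibility {
    vis_real_t re;
    vis_real_t im;
};

// One complex grid cell; never narrower than FP32 in this sketch.
struct GridCell {
    grid_real_t re;
    grid_real_t im;
};

static_assert(sizeof(Visibility) == 2 * sizeof(vis_real_t),
              "Visibility is expected to have no padding");

// Bytes needed for a buffer of n visibilities in the selected precision.
constexpr std::size_t visibility_buffer_bytes(std::size_t n) {
    return n * sizeof(Visibility);
}
```

Because the choice is made by the preprocessor, only one set of types exists in any given binary, which is why changing precision needs a rebuild. With GCC, Clang or nvcc the macro would be passed as a -D option (for example -DVIS_FP16), and in this sketch the FP16 build halves the size of a visibility buffer relative to the default FP32 build.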

      Note that degridder.cu (GitLab project CUDA_Degridder) is not currently planned to be adapted for FP16 in this feature (due to resource constraints in PI#7 and its accuracy-to-performance ratio compared with the direct_fourier_transform for FP32).

    • 2
    • 2
    • 0
    • Team_NZAPP
    • Sprint 5
      FP16 visibility support has now been fully implemented in the SEP Imaging Pipeline, following an initial start on design and development in the last program increment (NZAPP-226). We have added a compiler directive that enables 16-bit host- and device-side visibility buffers, which are used to transfer incoming and outgoing visibilities between CUDA host and device memory (NZAPP-249). If the compiler directive is not set, single-precision (32-bit) visibilities are used instead. Only the complex visibility value itself can be set to half precision, not the UVW coordinates of the visibilities. We have also modified the W-projection gridding kernels to support half precision, since they are multiplied with visibilities within the gridding module. Visibilities are still accumulated onto a single- or double-precision grid, independently of the 16-bit compiler directive. We have also used several 16-bit compute operations in the DFT, Gridding and Gain Subtraction modules for efficiency (NZAPP-131).

      We have performed some initial testing with the GLEAM small dataset to validate the use of 16-bit visibilities within the pipeline (NZAPP-269). The RRMSE between the dirty image produced by the SEP Imaging Pipeline with 16-bit visibilities enabled and the dirty image produced by the pipeline with 32-bit visibilities was only 0.018%. When comparing the dirty image generated from 16-bit visibilities with the ideal image generated by a Direct Fourier Transform, the RRMSE was 0.423% (note that this used 16x oversampling and 50 UV planes). More thorough testing on the larger GLEAM datasets, along with throughput timings for data transfer, should be done in another program increment (NZAPP-134).
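For readers who want to reproduce this kind of image comparison, a minimal sketch of an RRMSE calculation follows. The ticket does not state which RRMSE definition was used, so the normalisation below (RMS of the pixel differences divided by the RMS of the reference image, expressed as a percentage) is an assumption, as is the function name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Relative root-mean-square error between two images, in percent.
// Assumed definition: 100 * sqrt( sum (test - ref)^2 / sum ref^2 ).
// Images are passed as flat pixel arrays of equal length.
double rrmse_percent(const std::vector<float>& test,
                     const std::vector<float>& reference) {
    assert(test.size() == reference.size());
    double err_sq = 0.0;   // accumulated squared difference
    double ref_sq = 0.0;   // accumulated squared reference value
    for (std::size_t i = 0; i < reference.size(); ++i) {
        const double r = static_cast<double>(reference[i]);
        const double d = static_cast<double>(test[i]) - r;
        err_sq += d * d;
        ref_sq += r * r;
    }
    assert(ref_sq > 0.0);  // an all-zero reference has no defined relative error
    return 100.0 * std::sqrt(err_sq / ref_sq);
}
```

The sums are accumulated in double precision so that the metric itself does not add rounding error comparable to the small differences being measured. Under this definition, identical images give 0%, and a test image of {3.0, 4.5} against a reference of {3.0, 4.0} gives 10%.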

      The updated GitLab repository can be found here:
      https://gitlab.com/ska-telescope/sep_pipeline_imaging

    • 9.3
    • Stories Completed, Outcomes Reviewed, Satisfies Acceptance Criteria, Accepted by FO
    • PI22 - UNCOVERED

    Description

      As the precursor to SP-419, this feature prepares a GPU-accelerated imaging/gain calibration pipeline that supports the storage and movement of visibilities in FP16 through multiple major cycles. This will be achieved by adapting the code developed for SP-417.

      It is envisaged that this will be able to accept FP16 visibilities from the CSP, and pass them through imaging and gain calibration cycles in FP16 format. This includes using FP16 for visibilities and kernels in the gridder (but accumulating on an FP32 grid), predicting visibilities in FP16 in the DFT, and calculating antenna gains using FP16 visibilities.

              People

                b.mort Mort, Ben
                A.Griffin Griffin, Anthony

                Feature Progress

                  Story Point Burn-up: (78.57%)

                  Feature Estimate: 2.0

                                Issues   Story Points
                  To Do         1        3.0
                  In Progress   0        0.0
                  Complete      4        11.0
                  Total         5        14.0
