Details

Type: Enabler
Priority: Not Assigned
Fix Version/s: PI7, PI8
Component/s: COM SDP SW
Labels:
None

ARTs:

Data Processing
Benefit hypothesis:

This feature will enable the work in SP-419 to be realised, allowing comparisons between FP16 and FP32 representations of visibilities using data sets of sizes that require the use of hardware accelerators.

Acceptance criteria:
Code tracked in GitLab with Kubernetes integration and suitable test cases.

The following project files will be updated to provide a compiler switch to optionally handle representing visibilities in FP16:

CUDA kernels: gridder.cu (accepts visibilities and kernels in FP16 but accumulates on an FP32 grid), direct_fourier_transform.cu (outputs FP16 predicted visibilities), gains.cu (accepts FP16 predicted and measured visibilities, outputs FP16 gains).

C/C++ host: controller.cpp (accepts FP16 visibilities in ingest memory regions), wprojection.c (generates W-kernels using at least FP32 arithmetic but outputs kernels in FP16).

The FP16 compiler switch should result in FP32 calculations for the other steps in the pipeline (e.g. grids, residual images and cleaned sources will be represented in FP32).

Note that switching between FP64/FP32/FP16 will require the user to rebuild the project (since for performance the switch should result in conditional compilation, removing any unnecessary code).

Note that degridder.cu (GitLab project CUDA_Degridder) is not currently planned to be adapted for FP16 in this feature (due to resource constraints in PI#7 and its accuracy-to-performance ratio compared with direct_fourier_transform for FP32).
Feature Points:
2
Initial Size:
2
WSJF:
0
Epic Link:
SDP workflow prototyping
Agile Teams:

Team_NZAPP
Due Sprint:
Sprint 5
Story Point Burn-up:
Overdue:
Outcomes:

FP16 visibility support has now been fully implemented in the SEP Imaging Pipeline, following an initial start on the design and development in the last program increment (NZAPP-226). We have added a compiler directive that enables 16-bit host-side and device-side visibility buffers, which are used to transfer incoming and outgoing visibilities between CUDA host and device memory (NZAPP-249). If the compiler directive is not set, single-precision (32-bit) visibilities are used instead. Only the complex visibility value itself can be set to half precision, not the UVW coordinates of the visibilities. We have also modified the W-projection gridding kernels to support half precision, since they are multiplied with visibilities within the gridding module. Visibilities are still accumulated onto a single- or double-precision grid, independently of the 16-bit compiler directive. We have also used several 16-bit compute operations in the DFT, Gridding and Gain Subtraction modules for efficiency (NZAPP-131).
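
To illustrate what the 16-bit visibility buffers mean for transfer size and stored values, here is a minimal sketch in Python using only the standard library. It is not the pipeline's code: the function names are hypothetical, and a runtime flag stands in for the compile-time directive. It assumes visibilities are packed as interleaved real and imaginary parts, which this ticket does not confirm.

```python
import struct


def pack_visibilities(visibilities, half_precision):
    """Pack complex visibilities as interleaved (real, imag) pairs.

    half_precision=True stores IEEE 754 binary16 values, otherwise binary32.
    The runtime flag is a stand-in for the pipeline's compiler directive.
    """
    code = "e" if half_precision else "f"
    flat = []
    for vis in visibilities:
        flat.extend((vis.real, vis.imag))
    return struct.pack(f"<{len(flat)}{code}", *flat)


def unpack_visibilities(buffer, half_precision):
    """Recover complex visibilities from a buffer made by pack_visibilities."""
    code, width = ("e", 2) if half_precision else ("f", 4)
    count = len(buffer) // width
    flat = struct.unpack(f"<{count}{code}", buffer)
    return [complex(flat[i], flat[i + 1]) for i in range(0, count, 2)]


# Illustrative values only, not taken from any dataset.
visibilities = [complex(0.1, -0.25), complex(1.5, 3.0)]
fp16_buffer = pack_visibilities(visibilities, half_precision=True)
fp32_buffer = pack_visibilities(visibilities, half_precision=False)

# The half-precision buffer needs half the bytes of the single-precision one.
print(len(fp16_buffer), len(fp32_buffer))
# Values that are not exactly representable in binary16 come back rounded.
print(unpack_visibilities(fp16_buffer, half_precision=True))
```

The sketch only touches the complex visibility values; UVW coordinates would be packed separately at their original precision, in line with the restriction described above.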

We have performed some initial testing with the GLEAM small dataset to validate the use of 16-bit visibilities within the pipeline (NZAPP-269). We found that the RRMSE between the dirty image produced by the SEP Imaging Pipeline with 16-bit visibilities enabled and the dirty image produced by the pipeline with 32-bit visibilities was only 0.018%. When comparing the dirty image generated from 16-bit visibilities with the ideal image generated by a Direct Fourier Transform, the RRMSE was 0.423% (note that this used 16x oversampling and 50 UV planes). More thorough testing on the larger GLEAM datasets, as well as throughput timings for data transfer, should be done in another program increment (NZAPP-134).
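
For readers who want to reproduce this kind of comparison, the following is a minimal sketch of an RRMSE calculation in Python. The ticket does not state the exact normalisation used, so this assumes RRMSE is the root of the summed squared pixel differences divided by the summed squared reference pixels. The function name and the sample pixel values are hypothetical.

```python
import math


def rrmse(test_image, reference_image):
    """Relative RMS error of test_image against reference_image.

    Both images are flat sequences of pixel values of equal length.
    Assumed definition: sqrt(sum((t - r)^2) / sum(r^2)).
    """
    if len(test_image) != len(reference_image):
        raise ValueError("images must have the same number of pixels")
    squared_error = sum((t - r) ** 2 for t, r in zip(test_image, reference_image))
    reference_power = sum(r ** 2 for r in reference_image)
    if reference_power == 0.0:
        raise ValueError("reference image is all zeros")
    return math.sqrt(squared_error / reference_power)


# Made-up pixel values, only to show the call; not pipeline output.
reference = [1.0, 2.0, 2.0]
test = [1.0, 2.0, 2.3]
print(f"RRMSE = {100.0 * rrmse(test, reference):.3f}%")
```

Under this definition the result is a fraction, so it is multiplied by 100 to express it as a percentage like the figures quoted above.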

The updated GitLab repository can be found here:
https://gitlab.com/ska-telescope/sep_pipeline_imaging

Resolved PI.Sprint:
9.3

Feature Checklist:

Stories Completed, Outcomes Reviewed, Satisfies Acceptance Criteria, Accepted by FO

Requirement Status:

PI22 - UNCOVERED

Description

As the precursor to SP-419, this feature prepares a GPU-accelerated imaging/gain calibration pipeline that supports the storage and movement of visibilities in FP16 through multiple major cycles. This will be achieved by adapting the code developed for SP-417.

It is envisaged that this will be able to accept FP16 visibilities from the CSP, and pass them through imaging and gain calibration cycles in FP16 format. This includes using FP16 for visibilities and kernels in the gridder (but accumulating on an FP32 grid), predicting visibilities in FP16 in the DFT, and calculating antenna gains using FP16 visibilities.
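
The reason for accumulating on an FP32 grid even when the inputs are FP16 can be shown with a small sketch. This Python example only emulates the rounding of a running total to half or single precision with the standard library; it is not the gridder, and the helper names and sample values are hypothetical.

```python
import struct


def round_to(fmt, value):
    """Round a Python float to the nearest value representable in fmt."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def accumulate(values, fmt):
    """Sum values, rounding the running total to fmt after every addition."""
    total = 0.0
    for value in values:
        total = round_to(fmt, total + value)
    return total


# Stand-ins for many small contributions landing on one grid cell.
samples = [1.0] * 4096

fp16_total = accumulate(samples, "<e")  # running total held in half precision
fp32_total = accumulate(samples, "<f")  # running total held in single precision
print(fp16_total, fp32_total)
```

In this emulation the half-precision total stops growing once the spacing between representable values exceeds the size of each contribution, whereas the single-precision total reaches the exact sum. That is the kind of loss an FP32 grid avoids.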

Attachments

Issue Links

depends on

SP-417 Enhance imaging evolutionary prototype to include gain calibration

Done

followed by

SP-1411 Enhance the GPU imaging/gain calibration pipeline to support visibility weighting schemes

Done

is required by

SP-419 Evaluate effects of FP16 visibilities on imaging & calibration in support of the 'half-precision' ECP

Funnel


People

Assignee: Mort, Ben

Reporter: Griffin, Anthony

Votes: 0

Watchers: 3

Feature Progress

Story Point Burn-up: (78.57%)

Feature Estimate: 2.0

Status	Issues	Story Points
To Do	1	3.0
In Progress	0	0.0
Complete	4	11.0
Total	5	14.0

Dates

Created: 03/Jun/20 9:58 AM

Updated: 17/Feb/24 1:25 PM

Resolved: 26/Jan/21 12:11 AM

Enhance the GPU imaging/gain calibration pipeline code to support FP16 visibilities