SAFe Program / SP-3783

Benchmark the performance of selected GPU and CPU implementations of the Complex Fourier Transform (CXFT) for future improvement of the PSS CXFT module


Details

    • Feature
    • Should have
    • PI20
    • COM PSS SW
    • None
    • Data Processing
    • Provides reproducible measurements of the performance of different implementations of the Complex Fourier Transform (CXFT) and helps converge on an architecture for the PSS CXFT component based on those measurements.
    • Documentation that:
      • describes the setup of the benchmarking tests and gives instructions for running them
      • summarises the results and identifies the bottlenecks
      • describes the outlook based on the results, including a comparison of the suitability of the CXFT implementations for the final (AA2+) design
    • 2
    • 2
    • 0
    • Team_PSS
    • Sprint 4
    • Results:
      • We verified that cuFFT (GPU-based FFT) and FFTW (CPU-based FFT) work correctly in simple signal-recovery tests.
      • We found cuFFT to be about a factor of 10 faster than FFTW.
      • Tests were run on three different machines in Oxford (Sharabha, Tengu, and Hippalectryon). Sharabha's hardware setup gave significantly better performance than the other two.
      • Overall, the performance was sufficient, so we decided that there is no need to test more advanced FFT algorithms (such as heFFT; see also https://jira.skatelescope.org/browse/AT4-1090 ).
      • The bottlenecks are outliers/jitter, which in exceptional cases made the total time for 5000 FFTs a factor of 1.5 larger than the average.

      A summary document, FFTW_cuFFT_SP3783.pdf, has been copied to the PSS Google Drive:

      https://drive.google.com/drive/folders/1r-qi-pGBrE6TdHEakjQo_zDfpD9TB7ri?usp=drive_link
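
      To illustrate what a simple signal-recovery test can look like, the sketch below injects a complex tone into Gaussian noise and checks that the strongest FFT bin matches the injected one. It is only an assumption about the general shape of such a test, not the test used for this feature: NumPy's FFT stands in for cuFFT and FFTW, and the function name, transform length, tone bin, and noise level are all hypothetical.

```python
import numpy as np


def recover_tone(n_samples=4096, tone_bin=300, amplitude=1.0,
                 noise_sigma=1.0, seed=0):
    """Inject a complex tone into noise and return the strongest FFT bin.

    All parameter values are illustrative assumptions, not values
    taken from the benchmark itself.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples)

    # Complex sinusoid that falls exactly on FFT bin `tone_bin`.
    tone = amplitude * np.exp(2j * np.pi * tone_bin * t / n_samples)

    # Complex Gaussian noise with the given per-component sigma.
    noise = noise_sigma * (rng.standard_normal(n_samples)
                           + 1j * rng.standard_normal(n_samples))

    # np.fft.fft is a stand-in for the FFT backend under test.
    spectrum = np.fft.fft(tone + noise)
    power = np.abs(spectrum) ** 2
    return int(np.argmax(power))


injected_bin = 300
recovered_bin = recover_tone(tone_bin=injected_bin)
print("injected bin:", injected_bin, "recovered bin:", recovered_bin)
```

      The same check could be repeated with each backend in place of np.fft.fft, comparing the recovered bin and, if desired, the spectra themselves within a floating-point tolerance.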
    • 20.6
    • Stories Completed, Outcomes Reviewed, Satisfies Acceptance Criteria, Accepted by FO
    • PI22 - UNCOVERED

    Description

      PSS currently has no CPU implementation of the FFT, and the performance of the GPU implementation is unclear. A CPU implementation based on FFTW is under consideration. An evaluation of the current FFT literature (P19 spike) also identified new, potentially more efficient implementations.

      The performance of FFTW (CPU) and cuFFT (GPU) will be tested on realistic data.

      Whether another FFT implementation can be run within the PSS framework needs to be investigated. If so, its performance will also be tested.
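
      As a rough sketch of how such a performance test might be structured, the code below times repeated FFTs in batches and reports the mean batch time together with the ratio of the slowest batch to the mean, which is one way to expose outliers/jitter. It is an assumption about the approach, not the benchmark code used for this feature: np.fft.fft is only a stand-in backend, the function names and batch sizes are hypothetical, and a GPU backend would additionally need device synchronisation before each timer reading.

```python
import time

import numpy as np


def time_fft_batches(fft_func, n_batches=20, ffts_per_batch=50,
                     n_samples=4096, seed=0):
    """Time `ffts_per_batch` transforms per batch and return batch times (s).

    Batch counts and sizes are illustrative; they are kept small here
    so that the sketch runs quickly.
    """
    rng = np.random.default_rng(seed)
    data = (rng.standard_normal(n_samples)
            + 1j * rng.standard_normal(n_samples)).astype(np.complex64)

    fft_func(data)  # warm-up call, excluded from the timings

    batch_times = []
    for _ in range(n_batches):
        start = time.perf_counter()
        for _ in range(ffts_per_batch):
            fft_func(data)
        batch_times.append(time.perf_counter() - start)
    return np.array(batch_times)


def summarise(batch_times):
    """Return the mean, the worst case, and the worst-to-mean ratio."""
    mean = float(batch_times.mean())
    worst = float(batch_times.max())
    return {"mean_s": mean, "max_s": worst, "max_over_mean": worst / mean}


# np.fft.fft is a stand-in; any callable taking one array could be passed.
times = time_fft_batches(np.fft.fft)
stats = summarise(times)
print(stats)
```

      Passing each backend as a callable keeps the timing loop identical across implementations, so that differences in the summary reflect the backend rather than the harness.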

       

    People

      Noutsos, Aristeidis (A.Noutsos)
      Levin-Preston, Lina (L.Levin-Preston)

    Feature Progress

      Story Point Burn-up: 100.00%

      Feature Estimate: 2.0

                     Issues   Story Points
      To Do          0        0.0
      In Progress    0        0.0
      Complete       6        17.0
      Total          6        17.0
