Simulate Diffraction Patterns Using CUDA FFT Libraries

This example shows how to use GPU Coder™ to leverage the CUDA® Fast Fourier Transform library (cuFFT) to compute a two-dimensional FFT on an NVIDIA® GPU. The two-dimensional Fourier transform is used in optics to calculate far-field diffraction patterns. You can observe these diffraction patterns when a monochromatic light source passes through a small aperture, such as in Young's double-slit experiment. This example also shows how to pass GPU inputs to an entry-point function when generating CUDA MEX, source code, static libraries, dynamic libraries, and executables. Passing GPU inputs improves the performance of the generated code by minimizing the number of cudaMemcpy calls.

Third-Party Prerequisites

Required

This example generates CUDA MEX and has the following third-party requirements.

CUDA-enabled NVIDIA GPU and compatible driver.

Optional

For non-MEX builds such as static libraries, dynamic libraries, or executables, this example has the following additional requirements.

NVIDIA CUDA toolkit.
Environment variables for the compilers and libraries. For more information, see Third-Party Hardware and Setting Up the Prerequisite Products.

Verify GPU Environment

To verify that the compilers and libraries necessary for running this example are set up correctly, use the coder.checkGpuInstall function.

envCfg = coder.gpuEnvConfig('host');
envCfg.BasicCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

Define the Coordinate System

Before simulating the light that has passed through an aperture, you must define your coordinate system. To get the correct numeric behavior when you call fft2, you must carefully arrange $x$ and $y$ so that the zero value is in the correct place. N2 is half the size in each dimension.

N2 = 1024;
[gx, gy] = meshgrid(-1:1/N2:(N2-1)/N2);
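
As a quick check of this grid, the following sketch rebuilds one axis with a small, purely illustrative half-size (N2small = 4 is an assumption chosen for readability, not a value used by the example) and reports how many samples it contains and where the zero coordinate falls. The names N2small, nPoints, and zeroIndex are hypothetical.

```matlab
% Illustrative only: a small half-size so the vector is easy to inspect.
N2small = 4;

% Same construction as the meshgrid argument in the example.
x = -1:1/N2small:(N2small-1)/N2small;

nPoints   = numel(x);       % 2*N2small samples per dimension
zeroIndex = find(x == 0);   % position of the zero coordinate

fprintf('%d samples, zero at index %d\n', nPoints, zeroIndex);
```

With this construction the zero coordinate sits at index N2 + 1, which is also where fftshift places the zero-frequency term for an even-length dimension.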

Simulate the Diffraction Pattern for a Rectangular Aperture

Simulate the effect of passing a parallel beam of monochromatic light through a small rectangular aperture. The two-dimensional Fourier transform describes the light field at a large distance from the aperture. Form the aperture as a mask of ones and zeros based on the coordinate system, and use a double-precision version of the aperture as the light source. Find the far-field light signal by using the fft2 function.

aperture       = (abs(gx) < 4/N2) .* (abs(gy) < 2/N2);
lightsource    = double(aperture);
farfieldsignal = fft2(lightsource);
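
One simple sanity check on the transform is that its zero-frequency term equals the sum of the input, which here is the number of open aperture samples. The sketch below repeats the construction on a smaller grid (N2 = 16 is an illustrative assumption to keep it fast) and compares the two values. The names openSamples and dcTerm are hypothetical.

```matlab
% Illustrative grid size; the example itself uses N2 = 1024.
N2 = 16;
[gx, gy] = meshgrid(-1:1/N2:(N2-1)/N2);

% Rectangular aperture: 7 open columns (|gx| < 4/N2) by 3 open rows (|gy| < 2/N2).
lightsource = double((abs(gx) < 4/N2) & (abs(gy) < 2/N2));
farfieldsignal = fft2(lightsource);

openSamples = sum(lightsource(:));   % number of ones in the mask
dcTerm      = farfieldsignal(1,1);   % zero-frequency term of the 2-D FFT

fprintf('Open samples: %d, DC term magnitude: %g\n', openSamples, abs(dcTerm));
```

Because the aperture is defined in units of 1/N2, the number of open samples does not depend on the grid size chosen here.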

Display the Light Intensity for a Rectangular Aperture

The visualize.m function displays the light intensity for a rectangular aperture. Calculate the far-field light intensity from the magnitude squared of the light field. To aid visualization, use the fftshift function.

type visualize

function visualize(farfieldsignal, titleStr)

farfieldintensity = real( farfieldsignal .* conj( farfieldsignal ) );
imagesc( fftshift( farfieldintensity ) );
axis( 'equal' ); axis( 'off' );
title(titleStr);

end

str = sprintf('Rectangular Aperture Far-Field Diffraction Pattern in MATLAB');
visualize(farfieldsignal,str);
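
The two operations inside visualize can be tried on a small vector. The following sketch, which uses a made-up eight-sample signal, shows that real(z .* conj(z)) matches abs(z).^2 up to rounding and that fftshift moves the first (zero-frequency) element to the center of an even-length vector. All variable names here are hypothetical.

```matlab
% Small hypothetical signal, chosen only for illustration.
z = fft([1 1 0 0 0 0 0 1]);

% Two equivalent ways to compute the intensity (magnitude squared).
intensityA = real(z .* conj(z));
intensityB = abs(z).^2;
maxDiff = max(abs(intensityA - intensityB));

% fftshift on bin labels 0..7: bin 0 moves from the first position to the center.
shifted = fftshift(0:7);
centerIndex = find(shifted == 0);

fprintf('Max difference: %g, zero bin at index %d of %d\n', ...
    maxDiff, centerIndex, numel(shifted));
```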

Generate CUDA MEX for the Function

You do not have to create an entry-point function. You can directly generate code for the MATLAB® fft2 function. To generate CUDA MEX for the MATLAB fft2 function, set the EnableCUFFT property in the configuration object and use the codegen function. GPU Coder replaces fft, ifft, fft2, ifft2, fftn, and ifftn function calls in your MATLAB code with the appropriate cuFFT library calls. For two-dimensional transforms and higher, GPU Coder creates multiple 1-D batched transforms. These batched transforms have higher performance than single transforms. After generating the MEX function, you can verify that it has the same functionality as the original MATLAB fft2 function. Run the generated fft2_mex and plot the results.

cfg = coder.gpuConfig('mex');
cfg.GpuConfig.EnableCUFFT = 1;
codegen -config cfg -args {lightsource} fft2

farfieldsignalGPU = fft2_mex(lightsource);
str = sprintf('Rectangular Aperture Far-Field Diffraction Pattern on GPU');
visualize(farfieldsignalGPU,str);

Code generation successful.
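
In addition to comparing the plots, you can compare the two results numerically. The following is a minimal sketch that assumes fft2_mex has been generated and that lightsource and farfieldsignal still hold the rectangular-aperture data from the earlier steps. The names absErr, relErr, and ranCheck are hypothetical, and the size of the difference you see depends on your GPU and library versions.

```matlab
% Assumes fft2_mex, lightsource, and farfieldsignal exist from the steps above.
ranCheck = false;
haveData = exist('lightsource', 'var') && exist('farfieldsignal', 'var');

if exist('fft2_mex', 'file') == 3 && haveData
    farfieldsignalGPU = fft2_mex(lightsource);

    % Largest elementwise difference, relative to the largest reference magnitude.
    absErr = max(abs(farfieldsignalGPU(:) - farfieldsignal(:)));
    relErr = absErr / max(abs(farfieldsignal(:)));
    fprintf('Maximum relative difference: %g\n', relErr);
    ranCheck = true;
else
    disp('fft2_mex or its inputs not found; run the previous steps first.');
end
```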

Simulate Young's Double-Slit Experiment

Young's double-slit experiment shows light interference when an aperture comprises two parallel slits. A series of bright points is visible where constructive interference takes place. In this case, form the aperture representing two slits. Restrict the aperture in the $y$ direction to ensure that the resulting pattern is not entirely concentrated along the horizontal axis.

slits          = (abs(gx) <= 10/N2) .* (abs(gx) >= 8/N2);
aperture       = slits .* (abs(gy) < 20/N2);
lightsource    = double(aperture);
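
The spacing of the bright points can be estimated from the slit geometry. The slits are centered at about ±9/N2, so they are 18 samples apart, and for a transform of length 2*N2 the fringes should repeat roughly every 2*N2/18 frequency bins, with the first dark fringe near half that distance. The sketch below checks this estimate on a one-dimensional cut through the slits. It is an illustration under those assumptions, and the names slitProfile, slitSeparation, fringeSpacing, and firstMinimum are hypothetical.

```matlab
N2 = 1024;
N  = 2*N2;                          % samples per dimension
x  = -1:1/N2:(N2-1)/N2;

% One-dimensional cut through the two slits (6 open samples in total).
slitProfile = double((abs(x) <= 10/N2) & (abs(x) >= 8/N2));

slitSeparation = 18;                % samples between the slit centers
fringeSpacing  = N / slitSeparation;   % predicted bins between bright fringes

% Intensity of the 1-D transform; element k+1 holds frequency bin k.
intensity = abs(fft(slitProfile)).^2;

% Search bins 1..99 for the first dark fringe; idx equals the bin number.
[~, idx] = min(intensity(2:100));
firstMinimum = idx;

fprintf('Predicted spacing: %.1f bins, first minimum at bin %d\n', ...
    fringeSpacing, firstMinimum);
```

In the two-dimensional pattern, the row through zero vertical frequency shows the same fringe positions, scaled by the number of open rows.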

Display the Light Intensity for Young's Double-Slit

Because the size, type, and complexity of the inputs remain the same, reuse the generated fft2_mex MEX function and display the intensity as before.

farfieldsignalGPU = fft2_mex(lightsource);
str = sprintf('Double Slit Far-Field Diffraction Pattern on GPU');
visualize(farfieldsignalGPU,str);

Generate CUDA MEX That Accepts GPU Input

In the CUDA MEX generated above, the input provided to the MEX function is copied from CPU to GPU memory, the computation is performed on the GPU, and the result is copied back to the CPU. Alternatively, you can generate CUDA code that accepts inputs from the GPU directly. For MEX targets, you can pass GPU inputs from MATLAB to CUDA MEX by using gpuArray. For other targets, you must allocate GPU memory and copy the inputs from CPU to GPU inside the handwritten main function before passing them to the entry-point function.

lightsource_gpu = gpuArray(lightsource);
cfg = coder.gpuConfig('mex');
cfg.GpuConfig.EnableCUFFT = 1;
codegen -config cfg -args {lightsource_gpu} fft2 -o fft2_gpu_mex

Code generation successful.

You can pass GPU inputs of numeric or logical matrix types to the entry-point function. Pass inputs of other, unsupported data types as CPU inputs. During code generation, if at least one of the inputs provided to the entry-point function is a GPU input, GPU Coder tries to return the output directly from the GPU. However, if the data type of the output is not supported as a GPU output, such as a struct or a cell array, the generated code returns the output on the CPU. For more information on passing GPU inputs to the entry-point function, see Support for GPU Arrays.

Notice the difference in the generated CUDA code when you use the lightsource_gpu GPU input. The generated code avoids copying the input from CPU to GPU memory and avoids copying the result back from GPU to CPU memory. This results in fewer cudaMemcpy calls and improves the performance of the generated CUDA MEX.
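
One way to observe the effect of the avoided copies is to time both MEX functions. The following sketch assumes that fft2_mex and fft2_gpu_mex have been generated, that lightsource and lightsource_gpu exist in the workspace, and that Parallel Computing Toolbox is available for gputimeit. The names tCpuInput, tGpuInput, and ranTiming are hypothetical, and the measured times depend entirely on your hardware.

```matlab
% Assumes both MEX functions and their inputs exist from the steps above.
ranTiming = false;
haveMex  = exist('fft2_mex', 'file') == 3 && exist('fft2_gpu_mex', 'file') == 3;
haveData = exist('lightsource', 'var') && exist('lightsource_gpu', 'var');

if haveMex && haveData
    % CPU input: the timing includes host-to-device and device-to-host copies.
    tCpuInput = timeit(@() fft2_mex(lightsource));

    % GPU input: input and output stay in GPU memory.
    tGpuInput = gputimeit(@() fft2_gpu_mex(lightsource_gpu));

    fprintf('CPU input: %g s, GPU input: %g s\n', tCpuInput, tGpuInput);
    ranTiming = true;
else
    disp('Generate fft2_mex and fft2_gpu_mex before running this timing sketch.');
end
```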

Verify Results of CUDA MEX That Accepts GPU Input

To verify that the generated CUDA MEX using gpuArray has the same functionality, run the generated fft2_gpu_mex, gather the results on the host and plot the results.

farfieldsignal_gpu = fft2_gpu_mex(lightsource_gpu);
farfieldsignal_cpu = gather(farfieldsignal_gpu);
str = sprintf('Double Slit Far-Field Diffraction Pattern on GPU using gpuArray');
visualize(farfieldsignal_cpu,str);
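
You can also confirm that the output stays on the GPU and agrees with MATLAB numerically. This sketch assumes fft2_gpu_mex and lightsource_gpu exist from the previous steps. The names onGpu, reference, relErr, and ranGpuCheck are hypothetical.

```matlab
% Assumes fft2_gpu_mex and lightsource_gpu exist from the steps above.
ranGpuCheck = false;

if exist('fft2_gpu_mex', 'file') == 3 && exist('lightsource_gpu', 'var')
    out = fft2_gpu_mex(lightsource_gpu);

    % The output is expected to be a gpuArray when the input is a gpuArray.
    onGpu = isa(out, 'gpuArray');

    % Reference result computed on the CPU with the MATLAB fft2 function.
    reference = fft2(gather(lightsource_gpu));
    relErr = max(abs(gather(out(:)) - reference(:))) / max(abs(reference(:)));

    fprintf('Output on GPU: %d, maximum relative difference: %g\n', onGpu, relErr);
    ranGpuCheck = true;
else
    disp('fft2_gpu_mex or lightsource_gpu not found; run the previous steps first.');
end
```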