Simulate Diffraction Patterns Using CUDA FFT Libraries
This example shows how to use GPU Coder™ to leverage the CUDA® Fast Fourier Transform library (cuFFT) to compute a two-dimensional FFT on an NVIDIA® GPU. The two-dimensional Fourier transform is used in optics to calculate far-field diffraction patterns. You can observe these diffraction patterns when a monochromatic light source passes through a small aperture, such as in Young's double-slit experiment. This example also shows you how to use GPU pointers as inputs to an entry-point function when generating CUDA MEX, source code, static libraries, dynamic libraries, and executables. Passing GPU pointers improves the performance of the generated code by minimizing the number of cudaMemcpy calls.
Third-Party Prerequisites
Required
This example generates CUDA MEX and has the following third-party requirements.
CUDA-enabled NVIDIA GPU and compatible driver.
Optional
For non-MEX builds such as static libraries, dynamic libraries, or executables, this example has the following additional requirements.
NVIDIA CUDA toolkit.
Environment variables for the compilers and libraries. For more information, see Third-Party Hardware and Setting Up the Prerequisite Products.
Verify GPU Environment
To verify that the compilers and libraries necessary for running this example are set up correctly, use the coder.checkGpuInstall function.
envCfg = coder.gpuEnvConfig('host');
envCfg.BasicCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);
Define the Coordinate System
Before simulating the light that has passed through an aperture, you must define your coordinate system. To get the correct numeric behavior when you call fft2, you must carefully arrange the x and y coordinates so that the zero value is in the correct place. N2 is half the size in each dimension.
N2 = 1024;
[gx, gy] = meshgrid(-1:1/N2:(N2-1)/N2);
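As a quick sanity check on the grid, the following sketch inspects a single axis of the coordinate system. The names x, numSamples, and zeroIdx are illustrative and do not appear in the original example.

```matlab
% Rebuild one axis of the grid (x, numSamples, and zeroIdx are illustrative names).
N2 = 1024;
x = -1:1/N2:(N2-1)/N2;

% The axis holds 2*N2 samples, so gx and gy are 2*N2-by-2*N2 matrices.
numSamples = numel(x);

% Locate the sample closest to zero; with this spacing it falls at index N2+1.
[~, zeroIdx] = min(abs(x));
fprintf('Samples per axis: %d, zero located at index %d\n', numSamples, zeroIdx);
```

For an even-length axis, index N2+1 is also where fftshift places the zero-frequency component.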
Simulate the Diffraction Pattern for a Rectangular Aperture
Simulate the effect of passing a parallel beam of monochromatic light through a small rectangular aperture. The two-dimensional Fourier transform describes the light field at a large distance from the aperture. Form aperture as a logical mask based on the coordinate system. The light source is a double-precision version of the aperture. Find the far-field light signal by using the fft2 function.
aperture = (abs(gx) < 4/N2) .* (abs(gy) < 2/N2);
lightsource = double(aperture);
farfieldsignal = fft2(lightsource);
Display the Light Intensity for a Rectangular Aperture
The visualize.m function displays the light intensity for a rectangular aperture. Calculate the far-field light intensity from the magnitude squared of the light field. To aid visualization, use the fftshift function.
type visualize
function visualize(farfieldsignal, titleStr)
    farfieldintensity = real( farfieldsignal .* conj( farfieldsignal ) );
    imagesc( fftshift( farfieldintensity ) );
    axis( 'equal' ); axis( 'off' );
    title(titleStr);
end
str = sprintf('Rectangular Aperture Far-Field Diffraction Pattern in MATLAB');
visualize(farfieldsignal,str);
Generate CUDA MEX for the Function
You do not have to create an entry-point function. You can directly generate code for the MATLAB® fft2 function. To generate CUDA MEX for the MATLAB fft2 function, in the configuration object, set the EnableCUFFT property and use the codegen function. GPU Coder replaces fft, ifft, fft2, ifft2, fftn, and ifftn function calls in your MATLAB code with the appropriate cuFFT library calls. For two-dimensional transforms and higher, GPU Coder creates multiple 1-D batched transforms. These batched transforms have higher performance than single transforms. After generating the MEX function, you can verify that it has the same functionality as the original MATLAB function. Run the generated fft2_mex and plot the results.
cfg = coder.gpuConfig('mex');
cfg.GpuConfig.EnableCUFFT = 1;
codegen -config cfg -args {lightsource} fft2
farfieldsignalGPU = fft2_mex(lightsource);
str = sprintf('Rectangular Aperture Far-Field Diffraction Pattern on GPU');
visualize(farfieldsignalGPU,str);
Code generation successful.
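Besides comparing the plots visually, you can check the MEX output numerically against the MATLAB result computed earlier. The following is a minimal sketch that continues from the workspace variables above. The names relErr and tol are illustrative, and the tolerance is an assumed value rather than a documented guarantee.

```matlab
% Compare the CUDA MEX output with the MATLAB fft2 result for the same input.
% relErr and tol are illustrative names; the tolerance is an assumed value.
relErr = max(abs(farfieldsignalGPU(:) - farfieldsignal(:))) / max(abs(farfieldsignal(:)));
tol = 1e-10;
fprintf('Maximum relative error: %g\n', relErr);
assert(relErr < tol, 'CUDA MEX output differs from the MATLAB fft2 result.');
```

Small differences at the level of floating-point round-off are expected, because cuFFT and the MATLAB FFT implementation can order their arithmetic differently.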
Simulate Young's Double-Slit Experiment
Young's double-slit experiment shows light interference when an aperture comprises two parallel slits. A series of bright points is visible where constructive interference takes place. In this case, form the aperture representing two slits. Restrict the aperture in the y direction to ensure that the resulting pattern is not entirely concentrated along the horizontal axis.
slits = (abs(gx) <= 10/N2) .* (abs(gx) >= 8/N2);
aperture = slits .* (abs(gy) < 20/N2);
lightsource = double(aperture);
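The slit geometry above determines how far apart the bright fringes appear in the transform. The following sketch estimates that spacing from the Fourier shift theorem, treating the aperture as two identical slits centered 9 samples on either side of the origin (each slit spans samples 8 through 10). The names N, slitCenters, d, and fringeSpacing are illustrative and are not part of the original example.

```matlab
% Estimate the fringe spacing, in FFT bins, for the double-slit aperture.
% All names below are illustrative; the estimate assumes two identical slits.
N2 = 1024;
N = 2*N2;                 % number of samples along each axis
slitCenters = [-9 9];     % slit centers in samples relative to the origin
d = diff(slitCenters);    % center-to-center slit separation in samples

% Two copies of a slit separated by d samples modulate the intensity by
% cos(pi*k*d/N)^2, which repeats every N/d frequency bins.
fringeSpacing = N/d;
fprintf('Estimated fringe spacing: %.1f bins\n', fringeSpacing);
```

With these values the estimate is roughly 114 bins between neighboring bright fringes along the horizontal axis.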
Display the Light Intensity for Young's Double-Slit
Because the size, type, and complexity of the inputs remain the same, reuse the generated fft2_mex MEX function and display the intensity as before.
farfieldsignalGPU = fft2_mex(lightsource);
str = sprintf('Double Slit Far-Field Diffraction Pattern on GPU');
visualize(farfieldsignalGPU,str);
Generate CUDA MEX Using GPU Pointer as Input
In the CUDA MEX generated above, the input provided to the MEX function is copied from CPU to GPU memory, the computation is performed on the GPU, and the result is copied back to the CPU. Alternatively, you can generate CUDA code that accepts GPU pointers directly. For MEX targets, you can pass GPU pointers from MATLAB® to CUDA MEX by using gpuArray. For other targets, you must allocate GPU memory and copy the inputs from CPU to GPU inside the handwritten main function before passing them to the entry-point function.
lightsource_gpu = gpuArray(lightsource);
cfg = coder.gpuConfig('mex');
cfg.GpuConfig.EnableCUFFT = 1;
codegen -config cfg -args {lightsource_gpu} fft2 -o fft2_gpu_mex
Code generation successful.
Only numeric and logical input matrix types can be passed as GPU pointers to the entry-point function. Inputs of other, unsupported data types can be passed as CPU inputs. During code generation, if at least one of the inputs provided to the entry-point function is a GPU pointer, the outputs returned from the function are also GPU pointers. However, if the data type of an output is not supported as a GPU pointer, such as a struct or a cell array, the output is returned as a CPU pointer. For more information on passing GPU pointers to an entry-point function, see Support for GPU Arrays.
Notice the difference in the generated CUDA code when using the lightsource_gpu GPU input. The code avoids copying the input from CPU to GPU memory and avoids copying the result back from GPU to CPU memory. This results in fewer cudaMemcpy calls and improves the performance of the generated CUDA MEX.
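To see the effect of the avoided copies on your own hardware, you can time both MEX functions. The following is a measurement sketch that continues from the workspace variables above. The names tCpuInput and tGpuInput are illustrative, and the timings depend on the GPU, driver, and input size.

```matlab
% Time the CPU-input MEX; each call includes host-to-device and
% device-to-host copies in addition to the FFT itself.
tCpuInput = timeit(@() fft2_mex(lightsource));

% Time the gpuArray-input MEX; gputimeit waits for the GPU work to
% finish before it records the elapsed time.
tGpuInput = gputimeit(@() fft2_gpu_mex(lightsource_gpu));

fprintf('CPU input: %.3f ms, GPU input: %.3f ms\n', 1e3*tCpuInput, 1e3*tGpuInput);
```

Both timeit and gputimeit run the function several times, so the reported values are less sensitive to one-time startup costs.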
Verify Results of CUDA MEX Using GPU Pointer as Input
To verify that the CUDA MEX generated with a gpuArray input has the same functionality, run the generated fft2_gpu_mex, gather the results on the host, and plot them.
farfieldsignal_gpu = fft2_gpu_mex(lightsource_gpu);
farfieldsignal_cpu = gather(farfieldsignal_gpu);
str = sprintf('Double Slit Far-Field Diffraction Pattern on GPU using gpuArray');
visualize(farfieldsignal_cpu,str);
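The plot can be complemented with a numeric check. The following sketch confirms that the output of fft2_gpu_mex is a gpuArray and compares the gathered result with the output of the CPU-input fft2_mex for the same double-slit source. The names outputOnGpu, maxDiff, and tol are illustrative, and the tolerance is an assumed value.

```matlab
% Outputs of a MEX generated with a gpuArray input are expected to stay on the GPU.
outputOnGpu = isa(farfieldsignal_gpu, 'gpuArray');

% Compare against the CPU-input MEX result for the same double-slit source.
% maxDiff and tol are illustrative names; the tolerance is an assumed value.
maxDiff = max(abs(farfieldsignal_cpu(:) - farfieldsignalGPU(:)));
tol = 1e-10 * max(abs(farfieldsignalGPU(:)));
fprintf('Output on GPU: %d, maximum difference: %g\n', outputOnGpu, maxDiff);
assert(outputOnGpu && maxDiff <= tol, 'GPU-pointer MEX output does not match.');
```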
See Also
Functions
codegen | coder.gpu.kernel | coder.gpu.kernelfun | gpucoder.matrixMatrixKernel | coder.gpu.constantMemory | gpucoder.stencilKernel | coder.checkGpuInstall