
Changing the output image size in GAN code

imds = imageDatastore("C:\Users\COMPUTER\Documents\MATLAB\pixel\rpgan\512\sand\20.jpg",IncludeSubfolders=true);
augmenter = imageDataAugmenter(RandXReflection=true);
augimds = augmentedImageDatastore([64 64],imds,DataAugmentation=augmenter)
filterSize = 5;
numFilters = 64;
numLatentInputs = 100;
projectionSize = [4 4 512];
layersGenerator = [
    featureInputLayer(numLatentInputs)
    projectAndReshapeLayer(projectionSize)
    transposedConv2dLayer(filterSize,4*numFilters)
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,2*numFilters,'Stride',2,'Cropping','same')
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,numFilters,'Stride',2,'Cropping','same')
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,3,'Stride',2,'Cropping','same')
    tanhLayer];
netG = dlnetwork(layersGenerator);
dropoutProb = 0.5;
numFilters = 64;
scale = 0.2;
inputSize = [512 512 3];
filterSize = 5;
layersDiscriminator = [
    imageInputLayer(inputSize,Normalization="none")
    dropoutLayer(dropoutProb)
    convolution2dLayer(filterSize,numFilters,Stride=2,Padding="same")
    leakyReluLayer(scale)
    convolution2dLayer(filterSize,2*numFilters,Stride=2,Padding="same")
    batchNormalizationLayer
    leakyReluLayer(scale)
    convolution2dLayer(filterSize,4*numFilters,Stride=2,Padding="same")
    batchNormalizationLayer
    leakyReluLayer(scale)
    convolution2dLayer(filterSize,8*numFilters,Stride=2,Padding="same")
    batchNormalizationLayer
    leakyReluLayer(scale)
    convolution2dLayer(4,1)
    sigmoidLayer];
netD = dlnetwork(layersDiscriminator);
numEpochs = 100;
miniBatchSize = 128;
learnRate = 0.0004;
gradientDecayFactor = 0.5;
squaredGradientDecayFactor = 0.999;
flipProb = 0.5;
validationFrequency = 25;
augimds.MiniBatchSize = miniBatchSize;
mbq = minibatchqueue(augimds, ...
    MiniBatchSize=miniBatchSize, ...
    PartialMiniBatch="discard", ...
    MiniBatchFcn=@preprocessMiniBatch, ...
    MiniBatchFormat="SSCB");
trailingAvgG = [];
trailingAvgSqG = [];
trailingAvg = [];
trailingAvgSqD = [];
numValidationImages = 20;
ZValidation = randn(numLatentInputs,numValidationImages,"single");
ZValidation = dlarray(ZValidation,"CB");
if canUseGPU
    ZValidation = gpuArray(ZValidation);
end
f = figure;
f.Position(3) = 2*f.Position(3);
imageAxes = subplot(1,2,1);
scoreAxes = subplot(1,2,2);
C = colororder;
lineScoreG = animatedline(scoreAxes,Color=C(1,:));
lineScoreD = animatedline(scoreAxes,Color=C(2,:));
legend("Generator","Discriminator");
ylim([0 1])
xlabel("Iteration")
ylabel("Score")
grid on
iteration = 0;
start = tic;
% Loop over epochs.
for epoch = 1:numEpochs
    % Reset and shuffle datastore.
    shuffle(mbq);
    % Loop over mini-batches.
    while hasdata(mbq)
        iteration = iteration + 1;
        % Read mini-batch of data.
        X = next(mbq);
        % Generate latent inputs for the generator network. Convert to
        % dlarray and specify the format "CB" (channel, batch). If a GPU is
        % available, then convert latent inputs to gpuArray.
        Z = randn(numLatentInputs,miniBatchSize,"single");
        Z = dlarray(Z,"CB");
        if canUseGPU
            Z = gpuArray(Z);
        end
        % Evaluate the gradients of the loss with respect to the learnable
        % parameters, the generator state, and the network scores using
        % dlfeval and the modelLoss function.
        [~,~,gradientsG,gradientsD,stateG,lossG,lossD] = ...
            dlfeval(@modelLoss,netG,netD,X,Z,flipProb);
        netG.State = stateG;
        % Update the discriminator network parameters.
        [netD,trailingAvg,trailingAvgSqD] = adamupdate(netD, gradientsD, ...
            trailingAvg, trailingAvgSqD, iteration, ...
            learnRate, gradientDecayFactor, squaredGradientDecayFactor);
        % Update the generator network parameters.
        [netG,trailingAvgG,trailingAvgSqG] = adamupdate(netG, gradientsG, ...
            trailingAvgG, trailingAvgSqG, iteration, ...
            learnRate, gradientDecayFactor, squaredGradientDecayFactor);
        % Every validationFrequency iterations, display batch of generated
        % images using the held-out generator input.
        if mod(iteration,validationFrequency) == 0 || iteration == 1
            % Generate images using the held-out generator input.
            XGeneratedValidation = predict(netG,ZValidation);
            % Tile and rescale the images in the range [0 1].
            I = imtile(extractdata(XGeneratedValidation));
            I = rescale(I);
            % Display the images.
            subplot(1,2,1);
            image(imageAxes,I)
            xticklabels([]);
            yticklabels([]);
            title("Generated Images");
        end
        % Update the scores plot.
        subplot(1,2,2)
        lossG = double(extractdata(lossG));
        addpoints(lineScoreG,iteration,lossG);
        lossD = double(extractdata(lossD));
        addpoints(lineScoreD,iteration,lossD);
        % Update the title with training progress information.
        D = duration(0,0,toc(start),Format="hh:mm:ss");
        title(...
            "Epoch: " + epoch + ", " + ...
            "Iteration: " + iteration + ", " + ...
            "Elapsed: " + string(D))
        drawnow
    end
end
numObservations = 5;
ZNew = randn(numLatentInputs,numObservations,"single");
ZNew = dlarray(ZNew,"CB");
if canUseGPU
    ZNew = gpuArray(ZNew);
end
XGeneratedNew = predict(netG,ZNew);
I = imtile(extractdata(XGeneratedNew));
I = rescale(I);
figure
image(I)
axis off
title("Generated Images")
I'm using the code above for a GAN, but the output images only come out at 64 by 64. What do I need to do to generate images at the output size I want?

Answers (1)

Angelo Yeo on 6 Dec 2023
When a transposed convolution layer generates an image, the output image size is determined by hyperparameters such as the filter size and the stride. For example, in the example you are following, changing the filterSize value from 5 to 4 makes the generated images 56 x 56. If you want to output larger images, you can do things such as adding more transposed convolution layers.
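As a rough sanity check (my own arithmetic, not part of the original answer), you can trace the spatial size through the generator from the question:

% Transposed convolution output size with no output cropping:
%   outputSize = (inputSize - 1)*Stride + FilterSize
% With Cropping="same":
%   outputSize = inputSize*Stride
%
% Generator in the question (filterSize = 5):
%   projectAndReshapeLayer                        ->  4 x 4
%   transposedConv2dLayer(5,4*numFilters)         -> (4-1)*1 + 5 = 8 x 8
%   three blocks with Stride 2, Cropping "same"   -> 16 -> 32 -> 64 x 64
% With filterSize = 4, the first layer gives 7 x 7 and the final output is 56 x 56.

% You can confirm the per-layer activation sizes with:
analyzeNetwork(layersGenerator)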
You can find how Conv2DLayer and TransposedConv2DLayer work in the official documentation for each layer.
Finally, to train the GAN properly, don't forget to match the size of the images produced by the generator to the size of the images fed into the discriminator.
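As a minimal sketch of that idea (my illustration, not code from the answer): one way to generate 128 x 128 images is to add one more stride-2 transposed convolution block to the generator and then keep the training images and the discriminator input at the same size. projectAndReshapeLayer is the custom layer shipped with the MathWorks DCGAN example that the question's code follows.

filterSize = 5;
numFilters = 64;
numLatentInputs = 100;
projectionSize = [4 4 512];

layersGenerator = [
    featureInputLayer(numLatentInputs)
    projectAndReshapeLayer(projectionSize)                                   %   4 x 4
    transposedConv2dLayer(filterSize,8*numFilters)                           %   8 x 8
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,4*numFilters,Stride=2,Cropping="same")  %  16 x 16
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,2*numFilters,Stride=2,Cropping="same")  %  32 x 32
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,numFilters,Stride=2,Cropping="same")    %  64 x 64
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,3,Stride=2,Cropping="same")             % 128 x 128
    tanhLayer];
netG = dlnetwork(layersGenerator);

% Resize the training images to the size the generator now produces ...
augimds = augmentedImageDatastore([128 128],imds,DataAugmentation=augmenter);

% ... and make the discriminator accept that same size.
inputSize = [128 128 3];

Each extra stride-2 transposed convolution block doubles the output size (128 -> 256 -> 512), and the discriminator then needs a matching extra stride-2 convolution block so that its final convolution2dLayer(4,1) still receives a 4 x 4 feature map.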
