Steps to perform VDSR:

Mahalakshmi Sundaresan
May 4, 2020

1.1 Steps followed:

1.1.1 Training:

1. Conversion of the image to its luminance and chrominance channels (Y, Cb, Cr colorspace).

2. In the luminance channel, downsizing by different scaling factors to obtain a low-resolution image, followed by resizing back to the original size using bicubic interpolation.

3. Finding the difference between the original image and the resized image obtained from the previous step to get the residual image.
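
The three steps above can be sketched in plain Python. This is a minimal illustration, not the article's MATLAB code: it assumes the full-range BT.601 coefficients for the Y, Cb, Cr conversion and channel values in [0, 1], and the names `rgb_to_ycbcr`, `luminance`, and `residual` are hypothetical. The bicubic downsize and resize of step 2 is not implemented here; `y_resized` is a hand-written stand-in for its result.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion for channel values in [0, 1] (assumed convention)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.5 + (b - y) / 1.772
    cr = 0.5 + (r - y) / 1.402
    return y, cb, cr


def luminance(image):
    """Step 1: keep only the Y channel of an image given as rows of (r, g, b) tuples."""
    return [[rgb_to_ycbcr(*pixel)[0] for pixel in row] for row in image]


def residual(y_original, y_resized):
    """Step 3: pixel-wise difference, original luminance minus resized luminance."""
    return [[o - u for o, u in zip(row_o, row_u)]
            for row_o, row_u in zip(y_original, y_resized)]


# A tiny 1-by-2 image: one white pixel and one black pixel.
original = [[(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]]
y_original = luminance(original)

# Hypothetical stand-in for step 2 (downsize, then bicubic resize back).
y_resized = [[0.75, 0.25]]

y_residual = residual(y_original, y_resized)
```

The residual holds exactly the detail that the resize lost, which is why it serves as the training target for the network.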

1.1.2 Preprocessing of the training data

Augmentation increases the amount of training data. In the chosen model, it consists of random rotations by 90 degrees and random reflections along the x-direction.
Random patch extraction is also used, which extracts several small image patches from a single image.
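
As a rough sketch of this preprocessing, the snippet below applies the two augmentations and the random patch extraction to an image stored as a list of rows. All function names are hypothetical, and the 50% probability for each augmentation is an assumption, since the article does not state one.

```python
import random


def rotate90(patch):
    """Rotate a 2-D list by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*patch)][::-1]


def reflect_x(patch):
    """Mirror a 2-D list left to right (reflection along the x-direction)."""
    return [row[::-1] for row in patch]


def random_patch(image, size, rng):
    """Cut one size-by-size patch at a random position."""
    top = rng.randrange(len(image) - size + 1)
    left = rng.randrange(len(image[0]) - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]


def augment(patch, rng):
    """Rotate by 0 or 90 degrees, then optionally reflect (assumed 50% chance each)."""
    if rng.random() < 0.5:
        patch = rotate90(patch)
    if rng.random() < 0.5:
        patch = reflect_x(patch)
    return patch


def sample_patches(image, size, count, seed=0):
    """Extract several augmented patches from a single image."""
    rng = random.Random(seed)
    return [augment(random_patch(image, size, rng), rng) for _ in range(count)]


image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = sample_patches(image, size=2, count=5)
```

In training, the same crop position and the same transform have to be applied to the network input and to its residual target, so that each pair stays aligned.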

1.1.3 VDSR layers:

In MATLAB, using the Deep Learning Toolbox, the model comprises 41 layers for a network with 20 convolutional layers.
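
The count of 41 can be checked with a few lines of Python: one input layer, 20 convolutional layers, a ReLU after every convolution except the last, and one regression layer. The layer names below are hypothetical labels for illustration, not Deep Learning Toolbox identifiers.

```python
def vdsr_layer_names(num_conv=20):
    """List the layers of a VDSR-style stack in order."""
    layers = ["image_input"]
    for i in range(1, num_conv + 1):
        layers.append(f"conv_{i}")
        if i < num_conv:  # no ReLU after the final, reconstructing convolution
            layers.append(f"relu_{i}")
    layers.append("regression_output")
    return layers


layers = vdsr_layer_names(20)
layer_count = len(layers)  # 1 + 20 + 19 + 1
```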

The first layer is the image input layer, which operates on the image patches. The patch size is based on the receptive field of the network, which is (2D+1)-by-(2D+1) for a network with D convolutional layers.
It is followed by a 2D convolutional layer whose weights are initialized randomly according to He’s method [6], which breaks the symmetry in neuron learning.
This layer is followed by a ReLU layer to introduce non-linearity.
The middle layers consist of alternating convolutional and ReLU layers, and the penultimate layer is a convolutional layer that reconstructs the image.
The final layer is a regression layer that calculates the mean squared error between the network’s prediction and the residual image.
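
Three of the quantities mentioned above are easy to compute directly. The sketch below assumes 3-by-3 kernels with stride 1, which is what the (2D+1) formula implies but which the text does not state explicitly. It also uses the usual fan-in form of He initialization and a plain mean squared error. All function names are hypothetical.

```python
import math


def receptive_field(num_conv, kernel=3):
    """Side length of the receptive field of stacked stride-1 convolutions."""
    size = 1
    for _ in range(num_conv):
        size += kernel - 1  # each 3-by-3 layer widens the field by 2 pixels
    return size


def he_std(kernel, in_channels):
    """Standard deviation for He initialization: sqrt(2 / fan_in)."""
    fan_in = kernel * kernel * in_channels
    return math.sqrt(2.0 / fan_in)


def mse(prediction, target):
    """Mean squared error between two equally sized 2-D lists."""
    diffs = [(p - t) ** 2
             for row_p, row_t in zip(prediction, target)
             for p, t in zip(row_p, row_t)]
    return sum(diffs) / len(diffs)


patch_side = receptive_field(20)              # (2D+1) with D = 20
loss = mse([[0.0, 0.5]], [[0.5, 0.5]])        # toy prediction vs. toy residual
```

With D = 20 the receptive field, and hence the patch side, comes out at 41 pixels.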

Comparison to the traditional method of Bicubic interpolation that does not involve deep learning:

Bicubic interpolation steps:
The original image is taken as the reference image.
It is scaled down by a factor (0.1, 0.2, 0.25, 0.5, or 0.75), so that the high-frequency details are lost, resulting in a low-resolution image.
To obtain a high-resolution image, the low-resolution image from the previous step is upscaled using bicubic interpolation and resized to match the size of the original image.
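
To make the upscaling step concrete, here is a minimal one-dimensional sketch of cubic convolution interpolation, the building block of bicubic resizing. It assumes the Keys kernel with a = -0.5 and clamped edges. Library routines may differ in edge handling and in the antialiasing they apply when shrinking, and the function names are hypothetical.

```python
import math


def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel (a = -0.5 is an assumed, common choice)."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x ** 3 - (a + 3) * x ** 2 + 1
    if x < 2:
        return a * x ** 3 - 5 * a * x ** 2 + 8 * a * x - 4 * a
    return 0.0


def sample(signal, pos):
    """Interpolate at a fractional position from the 4 nearest samples."""
    base = math.floor(pos)
    total = 0.0
    for k in range(base - 1, base + 3):
        idx = min(max(k, 0), len(signal) - 1)  # clamp indices at the edges
        total += signal[idx] * cubic_kernel(pos - k)
    return total


def upscale(signal, factor):
    """Upscale a 1-D signal by an integer factor."""
    n_out = len(signal) * factor
    # Map each output sample centre back to input coordinates.
    return [sample(signal, (i + 0.5) / factor - 0.5) for i in range(n_out)]


low_res = [0.0, 1.0, 2.0, 3.0]
high_res = upscale(low_res, 2)
```

For a two-dimensional image, the same interpolation is applied along the rows and then along the columns. Interpolation can only smooth between the samples it is given, so the high-frequency detail lost in the downscaling step does not come back.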

1.2 VDSR steps:

The low-resolution image is converted from RGB colorspace to luminance and chrominance color space.
The luminance and chrominance channels are upscaled by using bicubic interpolation.
Only the upscaled luminance channel is passed to the trained VDSR network, and the residual image is obtained from its final layer.
To get a high-resolution luminance component, the residual image is added to the upscaled luminance channel.
To get a high-resolution color image, the high-resolution VDSR luminance component from the previous step is concatenated with the upscaled chrominance channels and converted back to RGB colorspace.
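
The last two steps can be sketched as follows. The residual here is a hand-written stand-in for the network output, the conversion assumes full-range BT.601 with values in [0, 1], and clamping the result to [0, 1] is an added assumption. The function names are hypothetical.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Inverse of the full-range BT.601 conversion (assumed convention)."""
    r = y + 1.402 * (cr - 0.5)
    b = y + 1.772 * (cb - 0.5)
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


def clamp01(value):
    """Keep a channel value inside the displayable range [0, 1]."""
    return min(1.0, max(0.0, value))


def reconstruct(y_up, cb_up, cr_up, residual):
    """Add the residual to the upscaled luminance, then rebuild an RGB image."""
    image = []
    for row_y, row_cb, row_cr, row_res in zip(y_up, cb_up, cr_up, residual):
        row = []
        for y, cb, cr, res in zip(row_y, row_cb, row_cr, row_res):
            y_high = clamp01(y + res)  # high-resolution luminance
            row.append(tuple(clamp01(c) for c in ycbcr_to_rgb(y_high, cb, cr)))
        image.append(row)
    return image


# One pixel: bicubic-upscaled channels plus a hypothetical predicted residual.
rgb = reconstruct([[0.25]], [[0.5]], [[0.5]], [[0.25]])
```

Only the luminance changes here. The chrominance channels pass through with bicubic upscaling alone.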

1.3 Visual comparison:

A region of interest is chosen from the bicubic interpolation and VDSR results, and the two are compared side by side to judge qualitatively which is clearer and sharper.
PSNR and SSIM values are calculated for the bicubic interpolation and VDSR results with respect to their original reference images.
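
For reference, both metrics can be written in a few lines. This is a simplified sketch: it assumes pixel values in [0, 1], and the SSIM here is evaluated once over the whole image, whereas standard implementations average the same formula over local windows and so give different numbers. The function names are hypothetical.

```python
import math


def flatten(image):
    """Turn a 2-D list into a flat list of pixel values."""
    return [value for row in image for value in row]


def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    ref, tst = flatten(reference), flatten(test)
    mse = sum((r - t) ** 2 for r, t in zip(ref, tst)) / len(ref)
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)


def global_ssim(reference, test, peak=1.0):
    """SSIM formula applied to whole-image statistics (no sliding window)."""
    x, y = flatten(reference), flatten(test)
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((v - mu_x) ** 2 for v in x) / n
    var_y = sum((v - mu_y) ** 2 for v in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2  # usual stabilizing constants
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


reference = [[0.0, 0.0], [0.0, 0.0]]
degraded = [[0.1, 0.1], [0.1, 0.1]]
score_db = psnr(reference, degraded)
```

Higher values of either metric indicate a result closer to the reference image.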