Using Python and Keras for principal component analysis, neural network construction and image reconstruction

Introduction

These days, there is a huge amount of data in just about every app we use: listening to music, browsing a friend's photos, or watching a new trailer.

Recently we were asked by a client to write a research report on image reconstruction, including some graphical and statistical output.


For a single user this is not a problem. However, imagine processing thousands, if not millions, of such requests with large data at the same time. These streams of data have to be reduced somehow for us to physically deliver them to users - this is where data compression comes in.

There are many compression techniques and they vary in usage and compatibility.

There are two main types of compression:

  • Lossless: data integrity and accuracy are preferred, even if the compression is not very efficient
  • Lossy: data integrity and accuracy are not as important as the speed of delivery - imagine live video streaming, where "real-time" delivery matters more than high-quality video

For example, using autoencoders, we can decompose an image and represent it as a 32-float code. Using that code, we can reconstruct the image. Of course, this is an example of lossy compression, since we've lost quite a lot of information.

However, we can use the exact same technique to do it more precisely by allocating more space for the representation:

Keras is a Python framework that simplifies the construction of neural networks.

First, let's install Keras using pip:

$ pip install keras

Preprocessing the Data

We will use the LFW (Labeled Faces in the Wild) dataset. As usual for such projects, we will preprocess the data.

To do this, we will first define a few paths:

ATTRS_NAME = "lfw_attributes.txt"
IMAGES_NAME = "lfw-deepfunneled.tgz"
RAW_IMAGES_NAME = "lfw.tgz"

Then, we'll use two functions. The first converts raw image bytes into an image matrix and changes the color system to RGB:

import cv2
import numpy as np

def decode_image_from_raw_bytes(raw_bytes):
    # Decode the raw bytes into a BGR image matrix, then convert it to RGB
    img = cv2.imdecode(np.asarray(bytearray(raw_bytes), dtype=np.uint8), 1)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    return img

The other is to actually load the dataset and adapt it to our needs:



Our data X holds each image as a 3D matrix, which is the default representation for RGB images: three matrices - red, green and blue - whose combination produces the image's color.

Each pixel of these images has a value ranging from 0 to 255. In machine learning we usually keep values small and centered around 0, as this helps the model train faster and get better results, so let's normalize the images:

X = X.astype('float32') / 255.0 - 0.5

Now, if we check the min and max of the X array, they will be -0.5 and 0.5, which you can verify:

print(X.max(), X.min())
0.5 -0.5

To be able to view the images, let's create a show_image function. It adds 0.5 to each pixel, since pixel values cannot be negative for display:
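The function itself was not included in the post; here is a plausible version, assuming matplotlib, with a small clipping helper (to_display, introduced here) that performs the 0.5 shift:

```python
import numpy as np
import matplotlib.pyplot as plt

def to_display(x):
    # Shift pixels from [-0.5, 0.5] back into [0, 1] and clip stray values
    return np.clip(x + 0.5, 0, 1)

def show_image(x):
    plt.imshow(to_display(x))
```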


Now, let's take a quick look at our data:


Now let's split the data into training and testing sets:
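The splitting code is missing from the post; here is a sketch, assuming a 10% test ratio (the exact ratio used originally is not shown) and a random stand-in for the X array built earlier:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Random stand-in for the normalized image array from the previous steps
X = np.random.rand(100, 32, 32, 3).astype('float32') - 0.5

X_train, X_test = train_test_split(X, test_size=0.1, random_state=42)
print(X_train.shape, X_test.shape)  # (90, 32, 32, 3) (10, 32, 32, 3)
```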


The sklearn train_test_split() function splits the data given the test ratio; the rest, of course, becomes the training set. random_state, which you'll see a lot in machine learning, is used to produce the same results no matter how many times you run the code.

Now it's time to model:
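The model-building code is missing from the post; the following reconstruction is a sketch that reproduces the parameter counts shown in the summary later on. Judging by that summary, the original used Sequential sub-models; the functional API is used here instead, which is equivalent:

```python
import numpy as np
from keras import Input
from keras.layers import Dense, Flatten, Reshape
from keras.models import Model

def build_autoencoder(img_shape, code_size):
    # Encoder: flatten the image, then project it down to code_size floats
    inp = Input(img_shape)
    code = Dense(code_size)(Flatten()(inp))
    encoder = Model(inp, code)

    # Decoder: expand the code back to a flat image and restore its shape
    code_inp = Input((code_size,))
    flat = Dense(int(np.prod(img_shape)))(code_inp)
    decoder = Model(code_inp, Reshape(img_shape)(flat))

    return encoder, decoder

encoder, decoder = build_autoencoder((32, 32, 3), 32)
```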


This function takes image_shape (image dimensions) and code_size (size of the output representation) as parameters.

Logically, the smaller the value of code_size, the more the image is compressed, but the fewer features are preserved, and the more the reconstructed image will differ from the original.

Since the network architecture does not accept 3D matrices, the job of the Flatten layer is to flatten the (32, 32, 3) matrix into a 1D array of 3072 elements.

Now, wire it all together and start our model:
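The wiring code is also missing; here is a self-contained sketch. The encoder and decoder are inlined (mirroring the architecture described above) so the snippet runs on its own:

```python
import numpy as np
from keras import Input
from keras.layers import Dense, Flatten, Reshape
from keras.models import Model

IMG_SHAPE = (32, 32, 3)

# Encoder and decoder as described above, inlined for self-containment
enc_inp = Input(IMG_SHAPE)
encoder = Model(enc_inp, Dense(32)(Flatten()(enc_inp)))
dec_inp = Input((32,))
decoder = Model(dec_inp, Reshape(IMG_SHAPE)(Dense(int(np.prod(IMG_SHAPE)))(dec_inp)))

# Wire encoder and decoder into a single autoencoder model and compile it
inp = Input(IMG_SHAPE)
reconstruction = decoder(encoder(inp))

autoencoder = Model(inp, reconstruction)
autoencoder.compile(optimizer='adamax', loss='mse')
autoencoder.summary()
```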


After that, we link them through Model with the inp and reconstruction parameters and compile it with the adamax optimizer and mse loss function.

Compiling a model here means defining its goals and how to achieve them. In our context, the goal is to minimize mse and we do this by using an optimizer - essentially an algorithm tuned to find the global minimum.

The result:

Layer (type)                 Output Shape              Param #
input_6 (InputLayer)         (None, 32, 32, 3)         0
sequential_3 (Sequential)    (None, 32)                98336
sequential_4 (Sequential)    (None, 32, 32, 3)         101376
Total params: 199,712
Trainable params: 199,712
Non-trainable params: 0

Here we can see that the input is (32, 32, 3), the hidden code layer is 32, and the decoder output is (32, 32, 3).



In this example, we will compare the constructed image with the original image, so both x and y are equal to X_train. Ideally, input equals output.

The epochs variable defines how many times we want the training data to pass through the model, and validation_data is the validation set we use to evaluate the trained model:
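The call to fit is missing from the post; here is a sketch of how it would look. Stand-in random data and an inlined model are used so the snippet runs on its own; the log that follows comes from training on the full LFW data:

```python
import numpy as np
from keras import Input
from keras.layers import Dense, Flatten, Reshape
from keras.models import Model

# Stand-in data and model; in the article these come from the earlier steps
X_train = np.random.rand(64, 32, 32, 3).astype('float32') - 0.5
X_test = np.random.rand(16, 32, 32, 3).astype('float32') - 0.5

inp = Input((32, 32, 3))
code = Dense(32)(Flatten()(inp))
out = Reshape((32, 32, 3))(Dense(32 * 32 * 3)(code))
autoencoder = Model(inp, out)
autoencoder.compile(optimizer='adamax', loss='mse')

# Input and target are the same images: the model learns to reproduce its input
history = autoencoder.fit(x=X_train, y=X_train, epochs=20,
                          validation_data=(X_test, X_test), verbose=0)
```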

Train on 11828 samples, validate on 1315 samples
Epoch 1/20
11828/11828 [==============================] - 3s 272us/step - loss: 0.0128 - val_loss: 0.0087
Epoch 2/20
11828/11828 [==============================] - 3s 227us/step - loss: 0.0078 - val_loss: 0.0071
Epoch 20/20
11828/11828 [==============================] - 3s 237us/step - loss: 0.0067 - val_loss: 0.0066

We can visualize the loss to get an overview.

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.legend(['train', 'test'], loc='upper left')

We can see that after the third epoch, the loss does not improve significantly. Training for longer can also cause the model to overfit, making it perform poorly on new data outside the training and testing datasets.

Now, the most anticipated part - let's visualize the result:

def visualize(img, encoder, decoder):
    """Draw original, encoded and decoded image"""
    # img[None] has shape (1, 32, 32, 3), which is the same as the model input
    code = encoder.predict(img[None])[0]
    reco = decoder.predict(code[None])[0]

    plt.subplot(1, 3, 1); plt.title("Original"); show_image(img)
    plt.subplot(1, 3, 2); plt.title("Code"); plt.imshow(code.reshape([code.shape[-1] // 2, -1]))
    plt.subplot(1, 3, 3); plt.title("Reconstructed"); show_image(reco)
    plt.show()




for i in range(5):
    img = X_test[i]
    visualize(img, encoder, decoder)

Now, let's increase code_size to 1000 and train again; with more space for the representation, the reconstruction loses far less information:

What we just did is closely related to Principal Component Analysis (PCA), a dimensionality reduction technique. We can use it to reduce the size of a feature set by generating new, smaller features that still capture the important information.
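For comparison, here is the same compress-and-reconstruct round trip done with scikit-learn's PCA, a toy sketch on random stand-in data (3072 is the length of a flattened 32x32x3 image):

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for flattened images: 100 samples, 3072 features each
rng = np.random.RandomState(42)
X_flat = rng.rand(100, 3072)

pca = PCA(n_components=32)
codes = pca.fit_transform(X_flat)      # compressed 32-float representation
X_back = pca.inverse_transform(codes)  # lossy reconstruction from the codes

print(codes.shape)  # (100, 32)
```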

Compressing images like this is one popular application of principal component analysis.

Image Denoising

Another popular usage is denoising. Let's add some random noise to the picture:

def apply_gaussian_noise(X, sigma=0.1):
    noise = np.random.normal(loc=0.0, scale=sigma, size=X.shape)
    return X + noise

Here we add Gaussian noise drawn from a normal distribution with standard deviation sigma, which defaults to 0.1.

For reference, this is what the noise looks like with different values of sigma:


As we can see, once sigma increases to 0.5 the image is barely visible. We will try to regenerate the original image from noisy images with σ = 0.1.

The model we will use is the same as the previous one, although we will train it differently. This time, we'll train it on pairs of original and corresponding noisy images:
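The denoising training loop is missing from the post; here is a sketch with stand-in data and an inlined model. A common approach, assumed here, is to regenerate fresh noise each epoch so the model never sees the same corruption twice:

```python
import numpy as np
from keras import Input
from keras.layers import Dense, Flatten, Reshape
from keras.models import Model

def apply_gaussian_noise(X, sigma=0.1):
    noise = np.random.normal(loc=0.0, scale=sigma, size=X.shape)
    return X + noise

# Stand-in data and model; in the article these come from the earlier steps
X_train = np.random.rand(64, 32, 32, 3).astype('float32') - 0.5
X_test = np.random.rand(16, 32, 32, 3).astype('float32') - 0.5
inp = Input((32, 32, 3))
out = Reshape((32, 32, 3))(Dense(32 * 32 * 3)(Dense(32)(Flatten()(inp))))
autoencoder = Model(inp, out)
autoencoder.compile(optimizer='adamax', loss='mse')

# Noisy images in, clean images out; fresh noise every epoch
for i in range(10):
    X_train_noise = apply_gaussian_noise(X_train)
    X_test_noise = apply_gaussian_noise(X_test)
    autoencoder.fit(x=X_train_noise, y=X_train, epochs=1,
                    validation_data=(X_test_noise, X_test), verbose=0)
```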


Now let's look at the model results:


Conclusion

We used Keras to build an autoencoder closely related to principal component analysis, a dimensionality reduction technique, and applied it to image reconstruction and image denoising.

Tags: Python neural networks keras

Posted by ranjita on Wed, 07 Dec 2022 01:48:04 +0300