Day 4 of deep learning - generative adversarial network (GAN): handwritten digit generation

>- **🍨 This article is from the 🔗365 Day deep learning training camp**
>- **🍦 Reference article: 🔗100 cases of deep learning - Generative adversarial network (GAN) handwritten digit generation | Day 18**
>- **🍖 Author: K Classmate**

1. Set GPU

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    # Enable memory growth so GPU memory is allocated on demand
    tf.config.experimental.set_memory_growth(gpus[0], True)

# Print the GPU information to confirm that the GPU is available
print(gpus)

Results obtained:

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

from tensorflow.keras import layers, datasets, Sequential, Model, optimizers
from tensorflow.keras.layers import LeakyReLU, UpSampling2D, Conv2D

import matplotlib.pyplot as plt
import numpy             as np
import sys,os,pathlib

2. Define training parameters

img_shape  = (28, 28, 1)
latent_dim = 200

2, What is a generative adversarial network

  1. Brief introduction
    A generative adversarial network (GAN) consists of a generator and a discriminator. The two models learn and improve continuously by being trained against each other.

Generator: generates data (mostly images) that tries to fool the discriminator.

Discriminator: judges whether an image is real or machine-generated, with the goal of picking out the fake data produced by the generator.

  2. Application areas
    GANs have a wide range of applications, including image synthesis, style transfer, photo restoration, photo editing, and data augmentation.

1) Style transfer

Image style transfer applies the style of image A to image B to obtain a new image.

2) Image generation

GAN can generate not only faces, but also other types of images, such as cartoon characters.

3, Network structure
Put simply, the generator produces images of handwritten digits, and the discriminator judges whether an image is authentic. The two compete with each other and keep improving through adversarial training, until the generator produces images realistic enough that the discriminator can no longer tell real from fake.

GAN steps:

1. A Generator receives random numbers and returns a generated image.
2. The generated digit image is sent to a Discriminator together with digit images from the real data set.
3. The Discriminator receives the real and fake images and returns a probability between 0 and 1, where 1 means real and 0 means fake.
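
The three steps above can be traced with a tiny NumPy-only sketch. It is not the Keras model built later in this article: the weights `W_g` and `w_d` and the helper names `toy_generator` and `toy_discriminator` are made up for illustration, random values stand in for real MNIST digits, and nothing is trained. It only shows the shapes and value ranges that flow between the two networks.

```python
import numpy as np

rng = np.random.default_rng(0)

latent_dim = 200          # same value as in this article
img_shape = (28, 28, 1)   # same value as in this article
n_pixels = int(np.prod(img_shape))

# Hypothetical, untrained weights (illustration only)
W_g = rng.normal(0, 0.05, (latent_dim, n_pixels))
w_d = rng.normal(0, 0.05, (n_pixels, 1))

def toy_generator(noise):
    """Step 1: map random noise to images with pixel values in [-1, 1]."""
    flat = np.tanh(noise @ W_g)
    return flat.reshape((-1,) + img_shape)

def toy_discriminator(imgs):
    """Step 3: map each image to a probability in (0, 1) via a sigmoid."""
    flat = imgs.reshape(len(imgs), -1)
    return 1.0 / (1.0 + np.exp(-(flat @ w_d)))

noise = rng.normal(0, 1, (4, latent_dim))
fake_imgs = toy_generator(noise)                   # step 1
real_imgs = rng.uniform(-1, 1, (4,) + img_shape)   # stand-in for real digits
batch = np.concatenate([real_imgs, fake_imgs])     # step 2
probs = toy_discriminator(batch)                   # step 3

print(fake_imgs.shape, probs.shape)
```

In a trained GAN the probabilities for the first four rows (real) would move toward 1 and those for the last four (generated) toward 0; here they are arbitrary because the weights are random.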

4, Build the generator

def build_generator():
    # ======================================= #
    #     Generator: takes a vector of random numbers and produces an image
    # ======================================= #
    model = Sequential([
        layers.Dense(256, input_dim=latent_dim),
        layers.LeakyReLU(alpha=0.2),               # Leaky ReLU activation
        layers.BatchNormalization(momentum=0.8),   # Batch normalization

        # Output layer: the unit count is missing in the source; the pixel count
        # of img_shape (784) is assumed so the output can be reshaped to an image
        layers.Dense(int(np.prod(img_shape)), activation='tanh'),
        layers.Reshape(img_shape)
    ])

    noise = layers.Input(shape=(latent_dim,))
    img = model(noise)

    return Model(noise, img)

5, Build discriminator

def build_discriminator():
    # ===================================== #
    #   Discriminator: decides whether the input image is real or generated
    # ===================================== #
    model = Sequential([
        # Flatten is assumed here: Dense needs one vector per image to return one score
        layers.Flatten(input_shape=img_shape),
        layers.Dense(1, activation='sigmoid')
    ])

    img = layers.Input(shape=img_shape)
    validity = model(img)

    return Model(img, validity)

  • Discriminator training principle: it improves by learning to classify the input images as real or generated
  • Generator training principle: it improves by having the discriminator judge its generated images, and it is rewarded when they are classified as real
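
Both principles reduce to binary cross-entropy with different target labels, which is the loss this article uses. The sketch below computes it by hand with NumPy; the function name `bce` and the discriminator outputs are invented for illustration.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(np.mean(-(target * np.log(pred) + (1 - target) * np.log(1 - pred))))

# Hypothetical discriminator outputs (probability that an image is real)
d_on_real = np.array([0.9, 0.8, 0.7])   # real images: should be near 1
d_on_fake = np.array([0.2, 0.1, 0.4])   # generated images: should be near 0

# Discriminator: real images are labeled 1, generated images 0
d_loss = 0.5 * (bce(d_on_real, 1.0) + bce(d_on_fake, 0.0))

# Generator: it wants the discriminator to output 1 on generated images
g_loss = bce(d_on_fake, 1.0)

print(f"d_loss={d_loss:.4f}  g_loss={g_loss:.4f}")
```

With these made-up numbers the discriminator is doing well, so its loss is small while the generator's loss is large. Training the generator pushes `d_on_fake` upward, which lowers `g_loss` and raises `d_loss`.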
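
# Create the discriminator
discriminator = build_discriminator()
# Define the optimizer
optimizer = tf.keras.optimizers.Adam(1e-4)
# Compile the discriminator; the training loop reads both loss and accuracy from it
discriminator.compile(loss='binary_crossentropy',
                      optimizer=optimizer,
                      metrics=['accuracy'])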

# Create generator 
generator = build_generator()
gan_input = layers.Input(shape=(latent_dim,))
img = generator(gan_input)

# Freeze the discriminator while training the generator through the combined model
discriminator.trainable = False

# Have the discriminator score the generated images
validity = discriminator(img)
combined = Model(gan_input, validity)
combined.compile(loss='binary_crossentropy', optimizer=optimizer)

6, Train the model
1. Save sample images

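def sample_images(epoch):
    """Save a 4x4 grid of generated sample images."""
    row, col = 4, 4
    noise = np.random.normal(0, 1, (row*col, latent_dim))
    gen_imgs = generator.predict(noise)

    fig, axs = plt.subplots(row, col)
    cnt = 0
    for i in range(row):
        for j in range(col):
            axs[i,j].imshow(gen_imgs[cnt, :,:,0], cmap='gray')
            axs[i,j].axis('off')
            cnt += 1
    os.makedirs("images", exist_ok=True)   # savefig fails if the folder is missing
    fig.savefig("images/%05d.png" % epoch)
    plt.close(fig)                         # release the figure so open figures do not pile up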
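
Because the generator ends in `tanh`, its pixel values lie in [-1, 1], the same range the training images are normalized to. `imshow` rescales grayscale input automatically, but writing raw image files requires mapping the values back to 0-255. A minimal sketch of that inverse mapping, using a hypothetical helper name `to_uint8`:

```python
import numpy as np

def to_uint8(imgs):
    """Invert (x - 127.5) / 127.5: map [-1, 1] floats back to 0..255 integers."""
    pixels = np.rint(imgs * 127.5 + 127.5)
    return np.clip(pixels, 0, 255).astype(np.uint8)

# Round trip on a few pixel values
original = np.array([0, 64, 128, 255], dtype=np.uint8)
normalized = (original - 127.5) / 127.5
restored = to_uint8(normalized)

print(normalized.min(), normalized.max(), restored)
```

The clip guards against values that drift slightly outside [-1, 1] through floating-point error.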

2. Train the model
train_on_batch: the function accepts a single batch of data, runs one forward and backward pass, and updates the model parameters once. The batch can be any size, so no fixed batch size has to be declared in advance. It gives fine-grained control over the training loop.
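
To make the idea concrete without Keras, here is a NumPy sketch of what a single-batch update does conceptually, using a one-layer logistic model. The function `train_on_batch_sketch`, the weights, and the learning rate are hypothetical stand-ins, not Keras internals. The point is only that each call consumes one batch of any size, takes one gradient step, and returns the loss.

```python
import numpy as np

def train_on_batch_sketch(w, x, y, lr=0.1):
    """One forward pass and one gradient step; returns the loss before the update."""
    p = 1.0 / (1.0 + np.exp(-(x @ w)))                 # forward pass
    p_safe = np.clip(p, 1e-7, 1 - 1e-7)
    loss = float(np.mean(-(y * np.log(p_safe) + (1 - y) * np.log(1 - p_safe))))
    grad = x.T @ (p - y) / len(x)                      # gradient of the mean loss
    w -= lr * grad                                     # in-place parameter update
    return loss

rng = np.random.default_rng(0)
w = np.zeros((3, 1))
true_w = np.array([[2.0], [-1.0], [0.5]])              # made-up target rule

losses = []
for batch_size in [8, 32, 128] * 50:                   # batch size may vary per call
    x = rng.normal(0, 1, (batch_size, 3))
    y = (x @ true_w > 0).astype(float)
    losses.append(train_on_batch_sketch(w, x, y))

print(f"first loss={losses[0]:.3f}  last loss={losses[-1]:.3f}")
```

A GAN training loop relies on this per-batch control because the discriminator and the generator have to be updated in turn within the same iteration.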

def train(epochs, batch_size=128, sample_interval=50):
    # Load data
    (train_images,_), (_,_) = tf.keras.datasets.mnist.load_data()

    # Normalize the images to the [-1, 1] interval
    train_images = (train_images - 127.5) / 127.5
    # Add a channel dimension: (60000, 28, 28) -> (60000, 28, 28, 1)
    train_images = np.expand_dims(train_images, axis=3)

    # Create labels: 1 for real images, 0 for generated images
    true = np.ones((batch_size, 1))
    fake = np.zeros((batch_size, 1))
    # Training loop (each "epoch" here is one batch update)
    for epoch in range(epochs): 

        # Randomly select batch_size pictures
        idx = np.random.randint(0, train_images.shape[0], batch_size)
        imgs = train_images[idx]      
        # Generate noise
        noise = np.random.normal(0, 1, (batch_size, latent_dim))
        # The generator turns noise into images; gen_imgs has shape (batch_size, 28, 28, 1)
        gen_imgs = generator.predict(noise)
        # Train the discriminator
        d_loss_true = discriminator.train_on_batch(imgs, true)
        d_loss_fake = discriminator.train_on_batch(gen_imgs, fake)
        # Average the loss and accuracy over the real and generated batches
        d_loss = 0.5 * np.add(d_loss_true, d_loss_fake)

        # Train the generator
        noise = np.random.normal(0, 1, (batch_size, latent_dim))
        g_loss = combined.train_on_batch(noise, true)
        print("%d [D loss: %f, acc.: %.2f%%] [G loss: %f]" % (epoch, d_loss[0], 100*d_loss[1], g_loss))

        # Save sample images every sample_interval iterations
        if epoch % sample_interval == 0:
            sample_images(epoch)

train(epochs=30000, batch_size=256, sample_interval=200)

Results obtained:

0 [D loss: 0.587824, acc.: 67.77%] [G loss: 0.634870]
1 [D loss: 0.387015, acc.: 74.22%] [G loss: 0.541133]
2 [D loss: 0.380705, acc.: 63.67%] [G loss: 0.455188]
3 [D loss: 0.408720, acc.: 56.25%] [G loss: 0.405431]
4 [D loss: 0.445802, acc.: 52.34%] [G loss: 0.343866]
176 [D loss: 0.394246, acc.: 66.41%] [G loss: 0.648134]
177 [D loss: 0.393966, acc.: 66.21%] [G loss: 0.640118]
178 [D loss: 0.402815, acc.: 65.62%] [G loss: 0.641665]
179 [D loss: 0.404573, acc.: 65.82%] [G loss: 0.647686]
180 [D loss: 0.394707, acc.: 67.19%] [G loss: 0.631329]
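
Each log line is built from `d_loss = 0.5 * np.add(d_loss_true, d_loss_fake)`. The print statement indexes `d_loss[0]` and `d_loss[1]`, which implies that `train_on_batch` returned a `[loss, accuracy]` pair for the discriminator (Keras returns such a list when a model is compiled with a metric). `np.add` sums the two pairs element-wise and the factor 0.5 averages them. The numbers below are made up purely to show the arithmetic:

```python
import numpy as np

# Hypothetical [loss, accuracy] pairs, in the form train_on_batch returns them
d_loss_true = [0.40, 0.75]   # on real images
d_loss_fake = [0.60, 0.25]   # on generated images

# Element-wise sum, then halve: [mean loss, mean accuracy]
d_loss = 0.5 * np.add(d_loss_true, d_loss_fake)

g_loss = 0.63                # hypothetical scalar loss of the combined model

print("%d [D loss: %f, acc.: %.2f%%] [G loss: %f]" % (0, d_loss[0], 100 * d_loss[1], g_loss))
```

A discriminator accuracy that hovers well above 50% while the generator loss stays roughly flat, as in the log above, indicates that neither network has overwhelmed the other yet.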

7, Generate an animated GIF

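import imageio

def compose_gif():
    # Folder containing the saved sample images
    data_dir = "F:/jupyter notebook/DL-100-days/code/images"
    data_dir = pathlib.Path(data_dir)
    paths    = sorted(data_dir.glob('*'))   # sort so frames follow training order
    gif_images = []
    for path in paths:
        print(path)
        gif_images.append(imageio.imread(path))
    # The loop body and save call are cut off in the source; the output name is an assumption
    imageio.mimsave("digits.gif", gif_images)

compose_gif()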

Results obtained:

F:\jupyter notebook\DL-100-days\code\images\00000.png
F:\jupyter notebook\DL-100-days\code\images\00200.png
F:\jupyter notebook\DL-100-days\code\images\00400.png
F:\jupyter notebook\DL-100-days\code\images\00600.png
F:\jupyter notebook\DL-100-days\code\images\00800.png
F:\jupyter notebook\DL-100-days\code\images\01000.png

Tags: neural networks Deep Learning

Posted by kaszu on Thu, 18 Aug 2022 12:29:07 +0300