Keras vs PyTorch vs Caffe: CNN implementation comparison

Compiled by | VK
Source | Analytics Indiamag

In today's world, AI is used in most business operations, and advanced deep learning frameworks make it easy to deploy. These frameworks provide high-level programming interfaces for designing deep learning models. By providing built-in library functions, they reduce developers' work and let us build models faster and more easily.

In this article, we will build the same deep learning model, a convolutional neural network (CNN) for image classification, on the same dataset in Keras, PyTorch, and Caffe, and compare the implementations. Finally, we will see how the CNN model built in PyTorch outperforms its counterparts built in Keras and Caffe.

Topics covered in this article

  • How to choose a framework for deep learning
  • Advantages and disadvantages of Keras
  • Advantages and disadvantages of PyTorch
  • Advantages and disadvantages of Caffe
  • Implementing CNN models in Keras, PyTorch, and Caffe

Select a framework for deep learning

When choosing a deep learning framework, a few criteria help identify the best fit. The framework should support parallel computing, offer a good interface for running models, ship with a large number of built-in packages, and optimize performance. We should also consider our business problem and the flexibility we need. These are the basic points to weigh before choosing a deep learning framework. Let's compare three of the most commonly used deep learning frameworks: Keras, PyTorch, and Caffe.


Keras

Keras is an open source deep learning framework developed by Google engineer François Chollet. It lets us build and evaluate models with just a few lines of code.

If you are new to deep learning, Keras is the best framework to start with. It is beginner-friendly, works naturally with Python, and ships with many pre-trained models (VGG, Inception, etc.). Besides being easy to learn, it runs on TensorFlow as a backend.

Limitations of using Keras

  • Some Keras features still need improvement.
  • Its user-friendliness comes at the cost of speed.
  • Training can take a long time even on a GPU.

Implementation using the Keras framework

In the code snippet below, we will import the required libraries.

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K


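batch_size = 128
num_classes = 10
epochs = 12
img_rows, img_cols = 28, 28  # MNIST image size

# Load MNIST, add a channel axis, and scale pixels to [0, 1]
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, img_rows, img_cols, 1).astype('float32') / 255
x_test = x_test.reshape(-1, img_rows, img_cols, 1).astype('float32') / 255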

In the code snippet below, we will build a deep learning model with several layers and assign optimizers, activation functions, and loss functions.

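model = Sequential()
# Arguments after kernel_size were cut off in the source; these are assumed.
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                 input_shape=(img_rows, img_cols, 1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())  # needed before Dense; assumed missing from the source
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
# Assumed compile step: labels are integers, hence the sparse loss
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])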
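
To see what the layer sizes above imply, the feature-map sizes and parameter counts can be worked out by hand. The following is a rough sketch in plain Python rather than Keras code. It assumes the Keras defaults (stride 1 and 'valid' padding for Conv2D, non-overlapping 2x2 pooling) and assumes the pooled feature maps are flattened before the first Dense layer; the helper names are made up for this illustration.

```python
# Trace feature-map sizes and parameter counts for the layer sizes used above.
# Assumptions: stride 1 and 'valid' padding for Conv2D, non-overlapping 2x2
# windows for MaxPooling2D, one bias per filter or unit.

def conv_output_size(size, kernel):
    """Spatial size after a stride-1 'valid' convolution."""
    return size - kernel + 1

def conv_params(kernel, in_channels, filters):
    """Weights plus biases of a square-kernel convolution layer."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

size = 28                                   # MNIST images are 28x28x1
after_conv1 = conv_output_size(size, 3)     # Conv2D(32, 3x3)
after_conv2 = conv_output_size(after_conv1, 3)  # Conv2D(64, 3x3)
size = after_conv2 // 2                     # MaxPooling2D(2x2)
flattened = size * size * 64                # inputs to Dense(128)

total_params = (conv_params(3, 1, 32) + conv_params(3, 32, 64)
                + dense_params(flattened, 128) + dense_params(128, 10))

print("feature maps:", after_conv1, after_conv2, size)
print("flattened length:", flattened)
print("trainable parameters:", total_params)
```

Almost all of the parameters sit in the first Dense layer, which is why the pooling step before it matters for model size.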

In the code snippet below, we will train and evaluate the model.

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])


PyTorch

PyTorch is an open source framework developed by the Facebook research team and used to implement deep learning models. It integrates closely with the Python environment. Its automatic differentiation computes gradients for us, which speeds up backpropagation. PyTorch also provides modules such as torchvision, torchaudio, and torchtext, which make it flexible for work in computer vision, audio, and NLP. PyTorch is more flexible for researchers than for developers.
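
The idea behind automatic differentiation can be illustrated without PyTorch itself. The following is a minimal sketch of reverse-mode differentiation for scalars in plain Python. The class name `Value` and its attributes are hypothetical and are not part of the PyTorch API; PyTorch's own implementation works on tensors and is far more complete.

```python
# Minimal reverse-mode automatic differentiation for scalars.
# Illustrative only: `Value` is a hypothetical class, not a PyTorch API.

class Value:
    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data            # result of the forward computation
        self.grad = 0.0             # d(output)/d(self), filled in by backward()
        self._parents = parents     # Values this one was computed from
        self._grad_fns = grad_fns   # maps the incoming gradient to each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Order the graph so each node is handled after all nodes that use it.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, grad_fn in zip(node._parents, node._grad_fns):
                parent.grad += grad_fn(node.grad)   # chain rule, accumulated


# y = w * x + b with w = 3, x = 2, b = 1
w, x, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b
y.backward()
print(y.data, w.grad, x.grad, b.grad)
```

After `backward()`, each input holds the derivative of `y` with respect to it, which is the same bookkeeping a framework does for every weight in a network.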

Limitations of PyTorch

  • PyTorch is more popular among researchers than among developers.
  • It is less geared toward production use.

Implementation using the PyTorch framework

Import required libraries

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data.dataloader as dataloader
import torch.optim as optim
from torch.utils.data import TensorDataset
from torchvision import transforms
from torchvision.datasets import MNIST

In the code snippet below, we will load the dataset and split it into training and test sets.

# The transform list was lost in the source; ToTensor() is assumed here as the
# minimum needed for DataLoader to batch the images.
train = MNIST('./data', train=True, download=True, transform=transforms.Compose([
    transforms.ToTensor(),
]), )
test = MNIST('./data', train=False, download=True, transform=transforms.Compose([
    transforms.ToTensor(),
]), )
dataloader_args = dict(shuffle=True, batch_size=64, num_workers=1, pin_memory=True)
train_loader = dataloader.DataLoader(train, **dataloader_args)
test_loader = dataloader.DataLoader(test, **dataloader_args)
train_data = train.data  # raw uint8 images; `train.train_data` is deprecated
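
With a batch size of 64, the number of batches per epoch follows directly from the dataset sizes. The short sketch below does that arithmetic in plain Python. It assumes the loader keeps a final partial batch unless told otherwise, and `batches_per_epoch` is a helper name invented for this illustration.

```python
import math

def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches produced in one pass over the data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

train_batches = batches_per_epoch(60000, 64)    # MNIST training set
test_batches = batches_per_epoch(10000, 64)     # MNIST test set
last_batch_size = 60000 - (train_batches - 1) * 64

print(train_batches, test_batches, last_batch_size)
```

The last training batch is smaller than the rest, which is worth remembering when code assumes a fixed batch dimension.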

In the code snippet below, we will build our model and set up activation functions and optimizers.

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.fc1 = nn.Linear(784, 548)
        self.bc1 = nn.BatchNorm1d(548) 
        self.fc2 = nn.Linear(548, 252)
        self.bc2 = nn.BatchNorm1d(252)
        self.fc3 = nn.Linear(252, 10)              
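    def forward(self, x):
        a = x.view((-1, 784))  # flatten each 28x28 image to a 784-vector
        b = self.fc1(a)
        b = self.bc1(b)
        b = F.relu(b)
        b = F.dropout(b, p=0.5, training=self.training)  # no dropout in eval mode
        b = self.fc2(b)
        b = self.bc2(b)
        b = F.relu(b)
        b = F.dropout(b, p=0.2, training=self.training)
        b = self.fc3(b)
        out = F.log_softmax(b, dim=1)  # log-probabilities over the 10 classes
        return out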
model = Model()
optimizer = optim.SGD(model.parameters(), lr=0.001)
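
The model above applies dropout after each hidden layer. As a rough illustration of what dropout does, here is a plain-Python sketch of "inverted" dropout on a list of numbers. It is not PyTorch's implementation, and the function name and `rng` parameter are assumptions made for this example.

```python
import random

def dropout(values, p, training, rng=random):
    """Inverted dropout on a list of floats (plain-Python sketch)."""
    if not training or p == 0.0:
        return list(values)                 # evaluation: values pass through
    keep = 1.0 - p
    # Zero each value with probability p and rescale the survivors by 1/keep,
    # so the expected value matches the evaluation-time output.
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)                      # fixed seed for repeatability
activations = [1.0] * 10000
train_out = dropout(activations, p=0.5, training=True, rng=rng)
eval_out = dropout(activations, p=0.5, training=False)

print("mean while training:", sum(train_out) / len(train_out))
print("mean while evaluating:", sum(eval_out) / len(eval_out))
```

Because dropout behaves differently in the two modes, a model should be switched to evaluation mode before its accuracy is measured.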

In the code snippet below, we will train our model, using cross-entropy as the loss function.

losses = []
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
model.train()
for epoch in range(12):
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        y_pred = model(data)
        # The model returns log-probabilities, so NLL loss is the cross-entropy
        loss = F.nll_loss(y_pred, target)
        losses.append(loss.item())
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 1:
            print('\r Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch,
                batch_idx * len(data),
                len(train_loader.dataset), 100. * batch_idx / len(train_loader),
                loss.item()), end='')
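
Since the model ends in a log-softmax, the cross-entropy loss reduces to the negative log-likelihood of the correct class. The sketch below works this out for a single example in plain Python. The function names mirror the PyTorch ones for readability only; this is not the library code, and the scores are made-up numbers.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax of one row of scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def nll_loss(log_probs, target):
    """Negative log-likelihood of the correct class."""
    return -log_probs[target]

def cross_entropy(logits, target):
    """Cross-entropy computed directly from raw scores."""
    return nll_loss(log_softmax(logits), target)

logits = [2.0, 0.5, -1.0]       # made-up scores for three classes
target = 0                      # index of the correct class
log_probs = log_softmax(logits)

loss_from_logits = cross_entropy(logits, target)
loss_from_log_probs = cross_entropy(log_probs, target)  # log-softmax applied twice

print(loss_from_logits, loss_from_log_probs)
```

The two losses agree because log-probabilities already sum to one after exponentiation, so a second log-softmax leaves them unchanged.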

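# Evaluate the model on the test set

model.eval()
device = next(model.parameters()).device
correct = 0
with torch.no_grad():
    for data, target in test_loader:
        output = model(data.to(device))
        pred = output.max(1)[1]  # index of the highest log-probability
        correct += pred.eq(target.to(device)).sum().item()
accuracy = correct / len(test_loader.dataset)
print('Accuracy:', accuracy)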


Caffe

Caffe (Convolutional Architecture for Fast Feature Embedding) is an open source deep learning framework developed by Yangqing Jia. It supports both researchers and industrial applications in the field of artificial intelligence.

Many developers choose Caffe for its speed: it can process 60 million images per day on a single NVIDIA K40 GPU. Caffe has many contributors who update and maintain the framework, and it works better for computer vision models than for other areas of deep learning.
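
To put the 60 million images per day figure in perspective, it can be converted to a per-second rate and a per-image time. The following is simple arithmetic in Python, averaged over a full day and saying nothing about how the work splits between training and inference.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def throughput(images_per_day):
    """Return (images per second, milliseconds per image)."""
    per_second = images_per_day / SECONDS_PER_DAY
    return per_second, 1000.0 / per_second

per_second, ms_per_image = throughput(60_000_000)
print(f"{per_second:.0f} images/s, {ms_per_image:.2f} ms per image")
```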

Limitations of Caffe

  • Caffe has no higher-level API, which makes it hard to experiment with.
  • To deploy a model in Caffe, we need to compile the source code.

Install Caffe

!apt install -y caffe-tools-cpu

Import required libraries

import os
import numpy as np
import math
import caffe
import lmdb

In the code snippet below, we will specify the hardware environment.

os.environ["GLOG_minloglevel"] = '2'  # glog level 2: log only errors and fatal messages
USE_GPU = True  # switch to False on a CPU-only Caffe build

In the following code snippet, we will define image_generator and batch_generator, which help with data conversion.

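def image_generator(db_path):
    # The open call was garbled in the source; lmdb.open is assumed
    db_handle = lmdb.open(db_path, readonly=True)
    with db_handle.begin() as db:
        cur = db.cursor()
        for _, value in cur:
            # Each LMDB value is a serialized Caffe Datum
            datum = caffe.proto.caffe_pb2.Datum()
            datum.ParseFromString(value)
            int_x = caffe.io.datum_to_array(datum)
            x = np.asarray(int_x, dtype=np.float32)
            yield x - 128  # roughly center the 0..255 pixel values

def batch_generator(shape, db_path):
    gen = image_generator(db_path)
    res = np.zeros(shape)
    while True:
        for i in range(shape[0]):
            try:
                res[i] = next(gen)
            except StopIteration:
                return  # database exhausted: end this pass over the data
        yield res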
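
The batching pattern used above can be tried out without Caffe or LMDB. The sketch below applies the same idea to an ordinary Python iterable. The function name `batches` is made up for this example, and it builds a fresh list per batch instead of reusing one buffer.

```python
def batches(items, batch_size):
    """Yield lists of exactly `batch_size` items; drop a trailing partial batch."""
    iterator = iter(items)
    while True:
        batch = []
        for _ in range(batch_size):
            try:
                batch.append(next(iterator))
            except StopIteration:
                return      # source exhausted: finish the generator cleanly
        yield batch

result = list(batches(range(7), 3))
print(result)
```

Unlike this sketch, batch_generator yields the same array object on every step, so a consumer should copy or use each batch before asking for the next one.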

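In the code snippet below, we will set the paths to the MNIST dataset and the training hyperparameters, and then run the training loop.

num_epochs = 0  # the loop below is skipped while this is 0; set to 1 or more to train
iter_num = 0  # running count of training iterations
db_path = "content/mnist/mnist_train_lmdb"
db_path_test = "content/mnist/mnist_test_lmdb"
base_lr = 0.01  # initial learning rate
gamma = 1e-4  # how quickly the learning rate decays
power = 0.75  # exponent of the learning rate decay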

# `net`, `test_net`, and `test_network` are not defined in this article;
# they are assumed to have been created beforehand.
for epoch in range(num_epochs):
    print("Starting epoch {}".format(epoch))
    input_shape = net.blobs["data"].data.shape
    for batch in batch_generator(input_shape, db_path):
        iter_num += 1
        # Load the batch and run the forward pass
        net.blobs["data"].data[...] = batch
        net.forward()
        # Clear the diffs, then run the backward pass
        for name, l in zip(net._layer_names, net.layers):
            for b in l.blobs:
                b.diff[...] = net.blob_loss_weights[name]
        net.backward()
        # Decay the learning rate, then apply a plain gradient descent update
        learning_rate = base_lr * math.pow(1 + gamma * iter_num, - power)
        for l in net.layers:
            for b in l.blobs:
                b.data[...] -= learning_rate * b.diff
        if iter_num % 50 == 0:
            print("Iter {}: loss={}".format(iter_num, net.blobs["loss"].data))
        if iter_num % 200 == 0:
            print("Testing network: accuracy={}, loss={}".format(*test_network(test_net, db_path_test)))
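
The learning rate in the loop above shrinks as training proceeds. The sketch below isolates that formula in plain Python, using the base_lr, gamma, and power values from this article as defaults. The function name is an assumption made for this example, and the iteration counts are arbitrary sample points.

```python
import math

def inv_learning_rate(iteration, base_lr=0.01, gamma=1e-4, power=0.75):
    """Decayed rate: base_lr * (1 + gamma * iteration) ** (-power)."""
    return base_lr * math.pow(1 + gamma * iteration, -power)

# Print the rate at a few arbitrary points in training
for iteration in (0, 1000, 10000, 100000):
    print(iteration, inv_learning_rate(iteration))
```

The rate starts at base_lr and falls smoothly, so early updates are large and later ones become finer adjustments.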

Using the code snippet below, we will get the final accuracy.

print("Training finished after {} iterations".format(iter_num))
print("Final performance: accuracy={}, loss={}".format(*test_network(test_net, db_path_test)))


In this article, we demonstrated the implementation of a CNN image classification model in three well-known frameworks: Keras, PyTorch, and Caffe. We can see that the CNN model built in PyTorch is both more accurate and faster than those built in Keras and Caffe.

As a beginner, I started with Keras, which is a very simple framework for beginners, but its applications are limited. PyTorch and Caffe, on the other hand, are very powerful frameworks in terms of speed, optimization, and parallel computing.


Tags: AI

Posted by centered effect on Thu, 19 May 2022 19:44:36 +0300