CNN: Watching the world through Neural Networks - Part 1

Introduction

Convolutional neural networks (CNNs, or convnets) are the go-to neural network architecture for image data.

The fundamental difference between a densely connected layer and a convolution layer is this: dense layers learn global patterns in their input feature space (for example, for an MNIST digit, patterns involving all pixels), whereas convolution layers learn local patterns.

Properties

This key characteristic gives convnets two interesting properties:

  • The patterns they learn are translation invariant: after learning a certain pattern in the lower-right corner of a picture, a convnet can recognize it anywhere. A densely connected network would have to learn the pattern anew if it appeared at a new location. This makes convnets data-efficient when processing images.
  • They can learn spatial hierarchies of patterns: a first convolution layer will learn small local patterns such as edges, a second convolution layer will learn larger patterns made of the features of the first layer, and so on. This allows convnets to efficiently learn increasingly complex and abstract visual concepts (because the visual world is fundamentally spatially hierarchical).

Feature Maps

Convolutions operate over 3D tensors, called feature maps, with two spatial axes (height and width) as well as a depth axis (also called the channels axis). For an RGB image, the dimension of the depth axis is 3, because the image has three color channels: red, green, and blue. For a black-and-white picture, like the MNIST digits, the depth is 1 (levels of gray).

The convolution operation extracts patches from its input feature map and applies the same transformation to all of these patches, producing an output feature map. This output feature map is still a 3D tensor: it has a width and a height. Its depth can be arbitrary, because the output depth is a parameter of the layer, and the different channels in that depth axis no longer stand for specific colors as in RGB input; rather, they stand for filters. The 2D tensor output[:, :, n] is the 2D spatial map of the response of filter n over the input.
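
As a quick illustration of these shapes (a minimal sketch, not part of the original notebook), passing a single fake RGB image through a Conv2D layer with 32 filters turns the depth-3 input into a depth-32 output feature map:

# Illustrative shape check (assumes a Keras/TensorFlow backend)
import numpy as np
from keras import layers, models

probe = models.Sequential([layers.Conv2D(32, (3, 3), input_shape=(150, 150, 3))])
rgb_batch = np.random.rand(1, 150, 150, 3).astype('float32')  # one fake RGB image
feature_map = probe.predict(rgb_batch)
print(feature_map.shape)  # (1, 148, 148, 32): the depth is now 32 filters, not colors
# feature_map[0, :, :, n] is the 2D response map of filter n over the input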

Filters

Filters encode specific aspects of the input data: at a high level, a single filter could encode the concept “presence of a face in the input,” for instance.

Parameters

Convolutions are defined by two key parameters:

  • Size of the patches extracted from the inputs—these are typically 3 × 3 or 5 × 5.
  • Depth of the output feature map—the number of filters computed by the convolution.

In Keras Conv2D layers, these parameters are the first arguments passed to the layer: Conv2D(output_depth, (window_height, window_width))

Convolution Mechanism

A convolution works by sliding these windows of size 3 × 3 or 5 × 5 over the 3D input feature map, stopping at every possible location, and extracting the 3D patch of surrounding features (shape (window_height, window_width, input_depth)). Each such 3D patch is then transformed (via a tensor product with the same learned weight matrix, called the convolution kernel) into a 1D vector of shape (output_depth,).

All of these vectors are then spatially reassembled into a 3D output map of shape (height, width, output_depth). Every spatial location in the output feature map corresponds to the same location in the input feature map.
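
To make this patch-by-patch transformation concrete, here is a naive NumPy sketch (purely illustrative, stride 1 and no padding; real convolution implementations are far more optimized):

import numpy as np

def naive_conv2d(input_map, kernel):
    # input_map: (height, width, input_depth); kernel: (wh, ww, input_depth, output_depth)
    wh, ww, in_depth, out_depth = kernel.shape
    h, w, _ = input_map.shape
    out = np.zeros((h - wh + 1, w - ww + 1, out_depth))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = input_map[i:i + wh, j:j + ww, :]            # extract the 3D patch
            out[i, j, :] = np.tensordot(patch, kernel, axes=3)  # -> vector of shape (output_depth,)
    return out

print(naive_conv2d(np.random.rand(7, 7, 3), np.random.rand(3, 3, 3, 16)).shape)  # (5, 5, 16)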

Note that the output width and height may differ from the input width and height. They may differ for two reasons:

  • Border effects
    • If you want to get an output feature map with the same spatial dimensions as the input, you can use padding. Padding consists of adding an appropriate number of rows and columns on each side of the input feature map so as to make it possible to fit center convolution windows around every input tile.
    • In Conv2D layers, padding is configurable via the padding argument, which takes two values: valid, which means no padding (only valid window locations will be used), and same, which means "pad in such a way as to have an output with the same width and height as the input." The padding argument defaults to valid (see the shape sketch after this list).
  • Stride
    • The description of convolution so far has assumed that the center tiles of the convolution windows are all contiguous. But the distance between two successive windows is a parameter of the convolution, called its stride, which defaults to 1.
    • Using stride 2 means the width and height of the feature map are downsampled by a factor of 2 (in addition to any changes induced by border effects). Strided convolutions are rarely used in practice, although they can come in handy for some types of models; it’s good to be familiar with the concept.
    • To downsample feature maps, instead of strides, we tend to use the max-pooling operation.
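
The following sketch (illustrative only, not from the original notebook) shows how padding and strides change the output shape of a Conv2D layer on a 150 × 150 × 3 input:

# Illustrative: output shapes under different padding/stride settings
from keras import layers, models

def conv_output_shape(**conv_kwargs):
    m = models.Sequential([layers.Conv2D(32, (3, 3), input_shape=(150, 150, 3), **conv_kwargs)])
    return m.output_shape

print(conv_output_shape(padding='valid'))             # (None, 148, 148, 32): border effect shrinks the map
print(conv_output_shape(padding='same'))              # (None, 150, 150, 32): width/height preserved
print(conv_output_shape(padding='valid', strides=2))  # (None, 74, 74, 32): stride 2 roughly halves width/height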

Max Pooling

The role of max pooling: to aggressively downsample feature maps, much like strided convolutions

Max pooling consists of extracting windows from the input feature maps and outputting the max value of each channel. It’s conceptually similar to convolution, except that instead of transforming local patches via a learned linear transformation (the convolution kernel), they’re transformed via a hardcoded max tensor operation

A big difference from convolution is that max pooling is usually done with 2 × 2 windows and stride 2, in order to downsample the feature maps by a factor of 2. On the other hand, convolution is typically done with 3 × 3 windows and no stride (stride 1)

The reason to use downsampling is to reduce the number of feature-map coefficients to process, as well as to induce spatial-filter hierarchies by making successive convolution layers look at increasingly large windows (in terms of the fraction of the original input they cover)
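
A quick illustrative check of this downsampling (again just a sketch): stacking a MaxPooling2D layer after a Conv2D layer halves the spatial dimensions while leaving the depth unchanged.

from keras import layers, models

pool_probe = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),  # 2 x 2 windows, stride 2 by default
])
print(pool_probe.output_shape)    # (None, 74, 74, 32): 148 x 148 halved to 74 x 74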

Training a CNN from scratch on a small dataset

Training on a large dataset is easy, but getting similarly good results from a small dataset is the real challenge, because in real life it is often hard to obtain a large dataset.

Because convnets learn local, translation-invariant features, they’re highly data efficient on perceptual problems. Training a convnet from scratch on a very small image dataset will still yield reasonable results despite a relative lack of data, without the need for any custom feature engineering.

Dataset

https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition/data

The train folder contains 25,000 images of dogs and cats. Each image in this folder has the label as part of the filename. The test folder contains 12,500 images, named according to a numeric id. For each image in the test set, you should predict a probability that the image is a dog (1 = dog, 0 = cat).

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All"
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session
/kaggle/input/test-save/cats_and_dogs_small_input_format_change_1.h5
/kaggle/input/test-save/cats_and_dogs_small_input_format_change_augment.h5
/kaggle/input/dogs-vs-cats-redux-kernels-edition/sample_submission.csv
/kaggle/input/dogs-vs-cats-redux-kernels-edition/train.zip
/kaggle/input/dogs-vs-cats-redux-kernels-edition/test.zip

Extracting zip files

import zipfile
with zipfile.ZipFile("../input/dogs-vs-cats-redux-kernels-edition/" + "train" + ".zip", "r") as z:
    z.extractall(".")
import zipfile
with zipfile.ZipFile("../input/dogs-vs-cats-redux-kernels-edition/" + "test" + ".zip", "r") as z:
    z.extractall(".")

Importing Dependencies

import os, cv2, re, random
import numpy as np
import pandas as pd
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing.image import img_to_array, load_img
from keras import layers, models, optimizers
from keras import backend as K
from sklearn.model_selection import train_test_split

Preparing Data

  • Image Size - (150 x 150) (somewhat arbitrary)
img_width = 150
img_height = 150
TRAIN_DIR = '/kaggle/working/train/'
TEST_DIR = '/kaggle/working/test/'
train_images_dogs_cats = [TRAIN_DIR+i for i in os.listdir(TRAIN_DIR)] # use this for full dataset
test_images_dogs_cats = [TEST_DIR+i for i in os.listdir(TEST_DIR)]
print(len(train_images_dogs_cats))
print(len(test_images_dogs_cats))
25000
12500

Some helper functions

def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    return [atoi(c) for c in re.split(r'(\d+)', text)]
print(natural_keys("cat.0.txt"))
['cat.', 0, '.txt']

We are not training on the full 25k dataset. Instead, we will train on 1,500 images from each class (3,000 total) and use 20% of that for validation.

num_of_each_sample = 1500
train_images_dogs_cats.sort(key=natural_keys)
print(len(train_images_dogs_cats))
25000
train_images_dogs_cats = train_images_dogs_cats[0:num_of_each_sample] + train_images_dogs_cats[12500:12500+num_of_each_sample] 
test_images_dogs_cats.sort(key=natural_keys)
print(len(train_images_dogs_cats))
3000

More helper functions

def prepare_data(list_of_images):
    """
    Returns two arrays:
    x is an array of resized images
    y is an array of labels
    """
    x = []  # images as arrays
    y = []  # labels

    for image in list_of_images:
        x.append(cv2.resize(cv2.imread(image), (img_width, img_height), interpolation=cv2.INTER_CUBIC))

    for i in list_of_images:
        if 'dog' in i:
            y.append(1)
        elif 'cat' in i:
            y.append(0)
        # else:
        #     print('neither cat nor dog name present in images')
    return x, y
X, Y = prepare_data(train_images_dogs_cats)
print(K.image_data_format())
channels_last

Train Validation Split

# First split the data into two sets: 80% for training, 20% for validation
X_train, X_val, Y_train, Y_val = train_test_split(X,Y, test_size=0.2, random_state=1)
nb_train_samples = len(X_train)
nb_validation_samples = len(X_val)
batch_size = 16

print(nb_train_samples)
print(nb_validation_samples)
2400
600

Building the CNN

from keras import layers
from keras import models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(img_width, img_height, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 128)         0         
_________________________________________________________________
flatten (Flatten)            (None, 6272)              0         
_________________________________________________________________
dense (Dense)                (None, 512)               3211776   
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 513       
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________
from keras import optimizers

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])

Data preprocessing

As you already know by now, data should be formatted into appropriately pre-processed floating point tensors before being fed into our network. Currently, our data sits on a drive as JPEG files, so the steps for getting it into our network are roughly:

  • Read the picture files.
  • Decode the JPEG content to RGB grids of pixels.
  • Convert these into floating point tensors.
  • Rescale the pixel values (between 0 and 255) to the [0, 1] interval (as you know, neural networks prefer to deal with small input values).

It may seem a bit daunting, but thankfully Keras has utilities to take care of these steps automatically. Keras has a module with image-processing helper tools, located at keras.preprocessing.image. In particular, it contains the class ImageDataGenerator, which allows you to quickly set up Python generators that can automatically turn image files on disk into batches of pre-processed tensors. This is what we will use here.

# All images will be rescaled by 1./255

train_datagen = ImageDataGenerator(rescale=1. / 255)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow(np.array(X_train), Y_train, batch_size=batch_size)
validation_generator = test_datagen.flow(np.array(X_val), Y_val, batch_size=batch_size)

Train / Fit

Let’s fit our model to the data using the generator. We do it using the fit_generator method, the equivalent of fit for data generators like ours.

It expects as its first argument a Python generator that will yield batches of inputs and targets indefinitely, like ours does. Because the data is being generated endlessly, the fitting process needs to know how many batches to draw from the generator before declaring an epoch over. This is the role of the steps_per_epoch argument: after having drawn steps_per_epoch batches from the generator, i.e. after having run for steps_per_epoch gradient descent steps, the fitting process will go to the next epoch.

So, steps_per_epoch = np.ceil(number_of_samples / batch_size). With 2,400 training samples and a batch size of 16, that gives 150 steps per epoch (which matches the 150/150 progress bars below).

history = model.fit_generator(
    train_generator,
    steps_per_epoch=np.ceil(nb_train_samples/batch_size),
    epochs=30,
    validation_data=validation_generator,
    validation_steps=np.ceil(nb_validation_samples/batch_size)
)
Epoch 1/30
150/150 [==============================] - 2s 14ms/step - loss: 0.6928 - acc: 0.5246 - val_loss: 0.6863 - val_acc: 0.5133
Epoch 2/30
150/150 [==============================] - 2s 12ms/step - loss: 0.6583 - acc: 0.6129 - val_loss: 0.6577 - val_acc: 0.5650
Epoch 3/30
150/150 [==============================] - 2s 13ms/step - loss: 0.6234 - acc: 0.6496 - val_loss: 0.5908 - val_acc: 0.7067
Epoch 4/30
150/150 [==============================] - 2s 13ms/step - loss: 0.5805 - acc: 0.6967 - val_loss: 0.5614 - val_acc: 0.7117
Epoch 5/30
150/150 [==============================] - 2s 13ms/step - loss: 0.5432 - acc: 0.7283 - val_loss: 0.5243 - val_acc: 0.7583
Epoch 6/30
150/150 [==============================] - 2s 13ms/step - loss: 0.5048 - acc: 0.7558 - val_loss: 0.5174 - val_acc: 0.7417
Epoch 7/30
150/150 [==============================] - 2s 12ms/step - loss: 0.4759 - acc: 0.7708 - val_loss: 0.5186 - val_acc: 0.7400
Epoch 8/30
150/150 [==============================] - 2s 12ms/step - loss: 0.4391 - acc: 0.7942 - val_loss: 0.4720 - val_acc: 0.7850
Epoch 9/30
150/150 [==============================] - 2s 13ms/step - loss: 0.4030 - acc: 0.8183 - val_loss: 0.4758 - val_acc: 0.7833
Epoch 10/30
150/150 [==============================] - 2s 12ms/step - loss: 0.3795 - acc: 0.8283 - val_loss: 0.4686 - val_acc: 0.7667
Epoch 11/30
150/150 [==============================] - 2s 13ms/step - loss: 0.3461 - acc: 0.8554 - val_loss: 0.4711 - val_acc: 0.7700
Epoch 12/30
150/150 [==============================] - 2s 12ms/step - loss: 0.3205 - acc: 0.8579 - val_loss: 0.4720 - val_acc: 0.7817
Epoch 13/30
150/150 [==============================] - 2s 12ms/step - loss: 0.2980 - acc: 0.8750 - val_loss: 0.5153 - val_acc: 0.7617
Epoch 14/30
150/150 [==============================] - 2s 13ms/step - loss: 0.2590 - acc: 0.9000 - val_loss: 0.5618 - val_acc: 0.7600
Epoch 15/30
150/150 [==============================] - 2s 12ms/step - loss: 0.2253 - acc: 0.9150 - val_loss: 0.5242 - val_acc: 0.7617
Epoch 16/30
150/150 [==============================] - 2s 15ms/step - loss: 0.2100 - acc: 0.9187 - val_loss: 0.5096 - val_acc: 0.7717
Epoch 17/30
150/150 [==============================] - 2s 12ms/step - loss: 0.1727 - acc: 0.9396 - val_loss: 0.5207 - val_acc: 0.7683
Epoch 18/30
150/150 [==============================] - 2s 12ms/step - loss: 0.1538 - acc: 0.9463 - val_loss: 0.5911 - val_acc: 0.7600
Epoch 19/30
150/150 [==============================] - 2s 13ms/step - loss: 0.1271 - acc: 0.9579 - val_loss: 0.6003 - val_acc: 0.7767
Epoch 20/30
150/150 [==============================] - 2s 12ms/step - loss: 0.1086 - acc: 0.9663 - val_loss: 0.6471 - val_acc: 0.7600
Epoch 21/30
150/150 [==============================] - 2s 12ms/step - loss: 0.0920 - acc: 0.9688 - val_loss: 0.7451 - val_acc: 0.7450
Epoch 22/30
150/150 [==============================] - 3s 17ms/step - loss: 0.0736 - acc: 0.9767 - val_loss: 0.7504 - val_acc: 0.7533
Epoch 23/30
150/150 [==============================] - 2s 13ms/step - loss: 0.0676 - acc: 0.9792 - val_loss: 0.7973 - val_acc: 0.7400
Epoch 24/30
150/150 [==============================] - 2s 12ms/step - loss: 0.0551 - acc: 0.9858 - val_loss: 0.7448 - val_acc: 0.7683
Epoch 25/30
150/150 [==============================] - 2s 12ms/step - loss: 0.0434 - acc: 0.9892 - val_loss: 1.0532 - val_acc: 0.7483
Epoch 26/30
150/150 [==============================] - 2s 12ms/step - loss: 0.0384 - acc: 0.9908 - val_loss: 0.8506 - val_acc: 0.7583
Epoch 27/30
150/150 [==============================] - 2s 13ms/step - loss: 0.0332 - acc: 0.9900 - val_loss: 0.8640 - val_acc: 0.7700
Epoch 28/30
150/150 [==============================] - 2s 13ms/step - loss: 0.0319 - acc: 0.9912 - val_loss: 0.8931 - val_acc: 0.7650
Epoch 29/30
150/150 [==============================] - 2s 13ms/step - loss: 0.0239 - acc: 0.9917 - val_loss: 1.0431 - val_acc: 0.7667
Epoch 30/30
150/150 [==============================] - 2s 12ms/step - loss: 0.0197 - acc: 0.9942 - val_loss: 1.1201 - val_acc: 0.7383
model.save('cats_and_dogs_small_input_format_change_1.h5')

import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

[Plot: Training and validation accuracy]

[Plot: Training and validation loss]

These plots are characteristic of overfitting. Our training accuracy increases steadily over time until it reaches nearly 100%, while our validation accuracy stalls in the 74-78% range. Our validation loss reaches its minimum after about ten epochs and then starts increasing, while the training loss keeps decreasing until it reaches nearly 0.

Because we have relatively few training samples (2,400), overfitting is going to be our number one concern. You already know about a number of techniques that can help mitigate overfitting, such as dropout and weight decay (L2 regularization). We are now going to introduce a new one, specific to computer vision and used almost universally when processing images with deep learning models: data augmentation.
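
For reference (a hedged sketch only; the next model in this post uses dropout but not L2), these two techniques would look roughly like this in Keras, applied to the classifier head that sits on top of the convolutional base:

# Illustrative only: a classifier head with dropout and L2 weight decay
from keras import layers, models, regularizers

regularized_head = models.Sequential([
    layers.Flatten(input_shape=(7, 7, 128)),                  # shape of the last conv block's output
    layers.Dropout(0.5),                                       # randomly zeroes 50% of activations during training
    layers.Dense(512, activation='relu',
                 kernel_regularizer=regularizers.l2(0.001)),   # L2 penalty on this layer's weights
    layers.Dense(1, activation='sigmoid'),
])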

Predict

print(len(test_images_dogs_cats))
X_test, Y_test = prepare_data(test_images_dogs_cats) #Y_test in this case will be []
12500
from keras.models import load_model
model = load_model('/kaggle/input/test-save/cats_and_dogs_small_input_format_change_1.h5')
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 128)         0         
_________________________________________________________________
flatten (Flatten)            (None, 6272)              0         
_________________________________________________________________
dense (Dense)                (None, 512)               3211776   
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 513       
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________
test_generator = test_datagen.flow(np.array(X_test), batch_size=batch_size)
prediction_probabilities = model.predict_generator(test_generator, verbose=1)
782/782 [==============================] - 4s 5ms/step
print(prediction_probabilities.shape)
(12500, 1)

Creating Submission File

counter = range(1, len(test_images_dogs_cats) + 1)
solution = pd.DataFrame({"id": counter, "label":list(prediction_probabilities)})
cols = ['label']

for col in cols:
    solution[col] = solution[col].map(lambda x: str(x).lstrip('[').rstrip(']')).astype(float)

solution.to_csv("dogsVScats2.csv", index = False)

Data Augmentation

Overfitting is caused by having too few samples to learn from, rendering us unable to train a model able to generalize to new data. Given infinite data, our model would be exposed to every possible aspect of the data distribution at hand: we would never overfit. Data augmentation takes the approach of generating more training data from existing training samples, by “augmenting” the samples via a number of random transformations that yield believable-looking images. The goal is that at training time, our model would never see the exact same picture twice. This helps the model get exposed to more aspects of the data and generalize better.

from keras import layers
from keras import models
from keras import optimizers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
)

# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)

These are just a few of the options available (for more, see the Keras documentation). Let’s quickly go over what we just wrote:

  • rotation_range is a value in degrees (0-180), a range within which to randomly rotate pictures.
  • width_shift_range and height_shift_range are ranges (as a fraction of total width or height) within which to randomly translate pictures horizontally or vertically.
  • shear_range is for randomly applying shearing transformations.
  • zoom_range is for randomly zooming inside pictures.
  • horizontal_flip is for randomly flipping half of the images horizontally – relevant when there are no assumptions of horizontal asymmetry (e.g. real-world pictures).
  • fill_mode is the strategy used for filling in newly created pixels, which can appear after a rotation or a width/height shift.
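
To see what these transformations actually do, here is a small illustrative sketch (assuming the X_train list and train_datagen defined above) that plots four random augmentations of a single training image:

# Illustrative: visualize a few augmented versions of one training image
import matplotlib.pyplot as plt

sample = np.expand_dims(X_train[0], axis=0)  # shape (1, 150, 150, 3)
for i, batch in enumerate(train_datagen.flow(sample, batch_size=1)):
    plt.subplot(1, 4, i + 1)
    plt.imshow(batch[0])  # values already rescaled to [0, 1]; cv2 loads BGR, so colors may look swapped
    plt.axis('off')
    if i == 3:
        break  # the generator loops forever, so stop after 4 images
plt.show()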

Note

Validation data should not be augmented.

train_generator = train_datagen.flow(np.array(X_train), Y_train, batch_size=batch_size)
validation_generator = test_datagen.flow(np.array(X_val), Y_val, batch_size=batch_size)
history = model.fit_generator(
    train_generator,
    steps_per_epoch=np.ceil(nb_train_samples/batch_size),
    epochs=100,
    validation_data=validation_generator,
    validation_steps=np.ceil(nb_validation_samples/batch_size))
Epoch 1/100
150/150 [==============================] - 12s 83ms/step - loss: 0.6913 - acc: 0.5317 - val_loss: 0.6714 - val_acc: 0.5983
Epoch 2/100
150/150 [==============================] - 13s 84ms/step - loss: 0.6788 - acc: 0.5700 - val_loss: 0.6493 - val_acc: 0.6267
Epoch 3/100
150/150 [==============================] - 12s 81ms/step - loss: 0.6621 - acc: 0.5925 - val_loss: 0.6307 - val_acc: 0.6183
Epoch 4/100
150/150 [==============================] - 12s 81ms/step - loss: 0.6475 - acc: 0.6133 - val_loss: 0.6504 - val_acc: 0.5850
Epoch 5/100
150/150 [==============================] - 14s 92ms/step - loss: 0.6345 - acc: 0.6267 - val_loss: 0.6253 - val_acc: 0.6267
Epoch 6/100
150/150 [==============================] - 12s 80ms/step - loss: 0.6183 - acc: 0.6579 - val_loss: 0.5527 - val_acc: 0.7067
Epoch 7/100
150/150 [==============================] - 13s 86ms/step - loss: 0.6148 - acc: 0.6488 - val_loss: 0.5648 - val_acc: 0.7067
Epoch 8/100
150/150 [==============================] - 12s 82ms/step - loss: 0.6049 - acc: 0.6733 - val_loss: 0.5266 - val_acc: 0.7517
Epoch 9/100
150/150 [==============================] - 12s 81ms/step - loss: 0.5902 - acc: 0.6842 - val_loss: 0.5117 - val_acc: 0.7450
Epoch 10/100
150/150 [==============================] - 14s 90ms/step - loss: 0.5824 - acc: 0.6938 - val_loss: 0.5332 - val_acc: 0.7300
Epoch 11/100
150/150 [==============================] - 12s 80ms/step - loss: 0.5740 - acc: 0.6988 - val_loss: 0.4963 - val_acc: 0.7633
Epoch 12/100
150/150 [==============================] - 13s 84ms/step - loss: 0.5878 - acc: 0.6913 - val_loss: 0.5169 - val_acc: 0.7383
Epoch 13/100
150/150 [==============================] - 13s 87ms/step - loss: 0.5559 - acc: 0.7142 - val_loss: 0.5108 - val_acc: 0.7467
Epoch 14/100
150/150 [==============================] - 13s 83ms/step - loss: 0.5529 - acc: 0.7237 - val_loss: 0.5053 - val_acc: 0.7550
Epoch 15/100
150/150 [==============================] - 14s 91ms/step - loss: 0.5610 - acc: 0.7075 - val_loss: 0.4721 - val_acc: 0.7717
Epoch 16/100
150/150 [==============================] - 12s 79ms/step - loss: 0.5489 - acc: 0.7163 - val_loss: 0.4639 - val_acc: 0.7683
Epoch 17/100
150/150 [==============================] - 13s 84ms/step - loss: 0.5369 - acc: 0.7292 - val_loss: 0.4881 - val_acc: 0.7500
Epoch 18/100
150/150 [==============================] - 13s 85ms/step - loss: 0.5464 - acc: 0.7321 - val_loss: 0.4489 - val_acc: 0.7817
Epoch 19/100
150/150 [==============================] - 13s 84ms/step - loss: 0.5376 - acc: 0.7375 - val_loss: 0.4577 - val_acc: 0.7750
Epoch 20/100
150/150 [==============================] - 14s 92ms/step - loss: 0.5441 - acc: 0.7233 - val_loss: 0.4655 - val_acc: 0.7733
Epoch 21/100
150/150 [==============================] - 12s 79ms/step - loss: 0.5329 - acc: 0.7258 - val_loss: 0.4382 - val_acc: 0.7950
Epoch 22/100
150/150 [==============================] - 13s 86ms/step - loss: 0.5405 - acc: 0.7292 - val_loss: 0.4337 - val_acc: 0.7933
Epoch 23/100
150/150 [==============================] - 14s 91ms/step - loss: 0.5170 - acc: 0.7379 - val_loss: 0.4228 - val_acc: 0.8017
Epoch 24/100
150/150 [==============================] - 13s 89ms/step - loss: 0.5247 - acc: 0.7450 - val_loss: 0.4446 - val_acc: 0.7833
Epoch 25/100
150/150 [==============================] - 13s 88ms/step - loss: 0.5146 - acc: 0.7483 - val_loss: 0.4597 - val_acc: 0.7850
Epoch 26/100
150/150 [==============================] - 12s 80ms/step - loss: 0.5237 - acc: 0.7529 - val_loss: 0.4326 - val_acc: 0.7917
Epoch 27/100
150/150 [==============================] - 13s 89ms/step - loss: 0.5112 - acc: 0.7446 - val_loss: 0.4885 - val_acc: 0.7633
Epoch 28/100
150/150 [==============================] - 13s 88ms/step - loss: 0.5107 - acc: 0.7521 - val_loss: 0.4418 - val_acc: 0.7967
Epoch 29/100
150/150 [==============================] - 13s 87ms/step - loss: 0.5008 - acc: 0.7563 - val_loss: 0.4096 - val_acc: 0.8050
Epoch 30/100
150/150 [==============================] - 14s 91ms/step - loss: 0.4918 - acc: 0.7588 - val_loss: 0.4319 - val_acc: 0.8050
Epoch 31/100
150/150 [==============================] - 12s 79ms/step - loss: 0.5029 - acc: 0.7550 - val_loss: 0.4339 - val_acc: 0.7983
Epoch 32/100
150/150 [==============================] - 13s 87ms/step - loss: 0.4960 - acc: 0.7675 - val_loss: 0.4259 - val_acc: 0.8150
Epoch 33/100
150/150 [==============================] - 13s 88ms/step - loss: 0.4961 - acc: 0.7483 - val_loss: 0.4117 - val_acc: 0.8033
Epoch 34/100
150/150 [==============================] - 12s 83ms/step - loss: 0.4986 - acc: 0.7633 - val_loss: 0.4326 - val_acc: 0.7967
Epoch 35/100
150/150 [==============================] - 14s 90ms/step - loss: 0.4892 - acc: 0.7646 - val_loss: 0.4050 - val_acc: 0.8217
Epoch 36/100
150/150 [==============================] - 12s 79ms/step - loss: 0.4891 - acc: 0.7646 - val_loss: 0.4288 - val_acc: 0.8100
Epoch 37/100
150/150 [==============================] - 13s 86ms/step - loss: 0.4780 - acc: 0.7704 - val_loss: 0.4018 - val_acc: 0.8083
Epoch 38/100
150/150 [==============================] - 14s 94ms/step - loss: 0.4876 - acc: 0.7617 - val_loss: 0.4317 - val_acc: 0.7817
Epoch 39/100
150/150 [==============================] - 13s 83ms/step - loss: 0.4804 - acc: 0.7721 - val_loss: 0.3901 - val_acc: 0.8217
Epoch 40/100
150/150 [==============================] - 14s 94ms/step - loss: 0.4748 - acc: 0.7729 - val_loss: 0.4204 - val_acc: 0.7967
Epoch 41/100
150/150 [==============================] - 13s 84ms/step - loss: 0.4834 - acc: 0.7758 - val_loss: 0.4114 - val_acc: 0.8017
Epoch 42/100
150/150 [==============================] - 14s 93ms/step - loss: 0.4674 - acc: 0.7763 - val_loss: 0.4137 - val_acc: 0.8067
Epoch 43/100
150/150 [==============================] - 14s 94ms/step - loss: 0.4671 - acc: 0.7842 - val_loss: 0.4207 - val_acc: 0.7967
Epoch 44/100
150/150 [==============================] - 13s 87ms/step - loss: 0.4643 - acc: 0.7896 - val_loss: 0.3832 - val_acc: 0.8183
Epoch 45/100
150/150 [==============================] - 14s 97ms/step - loss: 0.4554 - acc: 0.7846 - val_loss: 0.5359 - val_acc: 0.7700
Epoch 46/100
150/150 [==============================] - 14s 90ms/step - loss: 0.4634 - acc: 0.7767 - val_loss: 0.3861 - val_acc: 0.8250
Epoch 47/100
150/150 [==============================] - 12s 79ms/step - loss: 0.4548 - acc: 0.7821 - val_loss: 0.4177 - val_acc: 0.8067
Epoch 48/100
150/150 [==============================] - 14s 92ms/step - loss: 0.4544 - acc: 0.7921 - val_loss: 0.4165 - val_acc: 0.8100
Epoch 49/100
150/150 [==============================] - 13s 84ms/step - loss: 0.4559 - acc: 0.7837 - val_loss: 0.3776 - val_acc: 0.8367
Epoch 50/100
150/150 [==============================] - 14s 91ms/step - loss: 0.4528 - acc: 0.7917 - val_loss: 0.3620 - val_acc: 0.8417
Epoch 51/100
150/150 [==============================] - 13s 90ms/step - loss: 0.4499 - acc: 0.7858 - val_loss: 0.4067 - val_acc: 0.8233
Epoch 52/100
150/150 [==============================] - 12s 78ms/step - loss: 0.4483 - acc: 0.8004 - val_loss: 0.3932 - val_acc: 0.8233
Epoch 53/100
150/150 [==============================] - 15s 98ms/step - loss: 0.4530 - acc: 0.7946 - val_loss: 0.4569 - val_acc: 0.7750
Epoch 54/100
150/150 [==============================] - 12s 79ms/step - loss: 0.4479 - acc: 0.7962 - val_loss: 0.3650 - val_acc: 0.8367
Epoch 55/100
150/150 [==============================] - 14s 91ms/step - loss: 0.4396 - acc: 0.7992 - val_loss: 0.3847 - val_acc: 0.8250
Epoch 56/100
150/150 [==============================] - 13s 89ms/step - loss: 0.4324 - acc: 0.7967 - val_loss: 0.4703 - val_acc: 0.7917
Epoch 57/100
150/150 [==============================] - 12s 81ms/step - loss: 0.4382 - acc: 0.7975 - val_loss: 0.3553 - val_acc: 0.8500
Epoch 58/100
150/150 [==============================] - 15s 100ms/step - loss: 0.4264 - acc: 0.8083 - val_loss: 0.3588 - val_acc: 0.8400
Epoch 59/100
150/150 [==============================] - 12s 78ms/step - loss: 0.4220 - acc: 0.8163 - val_loss: 0.3755 - val_acc: 0.8333
Epoch 60/100
150/150 [==============================] - 13s 85ms/step - loss: 0.4356 - acc: 0.8112 - val_loss: 0.3613 - val_acc: 0.8300
Epoch 61/100
150/150 [==============================] - 15s 100ms/step - loss: 0.4293 - acc: 0.8087 - val_loss: 0.3683 - val_acc: 0.8333
Epoch 62/100
150/150 [==============================] - 12s 79ms/step - loss: 0.4221 - acc: 0.8125 - val_loss: 0.3842 - val_acc: 0.8200
Epoch 63/100
150/150 [==============================] - 14s 97ms/step - loss: 0.4298 - acc: 0.8079 - val_loss: 0.3976 - val_acc: 0.8267
Epoch 64/100
150/150 [==============================] - 13s 86ms/step - loss: 0.4262 - acc: 0.8050 - val_loss: 0.4451 - val_acc: 0.7867
Epoch 65/100
150/150 [==============================] - 13s 83ms/step - loss: 0.4156 - acc: 0.8067 - val_loss: 0.3641 - val_acc: 0.8250
Epoch 66/100
150/150 [==============================] - 14s 96ms/step - loss: 0.4161 - acc: 0.8054 - val_loss: 0.3554 - val_acc: 0.8400
Epoch 67/100
150/150 [==============================] - 12s 79ms/step - loss: 0.4098 - acc: 0.8158 - val_loss: 0.4377 - val_acc: 0.8167
Epoch 68/100
150/150 [==============================] - 13s 87ms/step - loss: 0.4072 - acc: 0.8200 - val_loss: 0.3489 - val_acc: 0.8350
Epoch 69/100
150/150 [==============================] - 15s 103ms/step - loss: 0.4057 - acc: 0.8142 - val_loss: 0.3842 - val_acc: 0.8183
Epoch 70/100
150/150 [==============================] - 12s 82ms/step - loss: 0.4007 - acc: 0.8150 - val_loss: 0.3725 - val_acc: 0.8367
Epoch 71/100
150/150 [==============================] - 13s 87ms/step - loss: 0.4097 - acc: 0.8192 - val_loss: 0.3371 - val_acc: 0.8367
Epoch 72/100
150/150 [==============================] - 13s 88ms/step - loss: 0.3939 - acc: 0.8183 - val_loss: 0.3465 - val_acc: 0.8450
Epoch 73/100
150/150 [==============================] - 13s 86ms/step - loss: 0.4022 - acc: 0.8146 - val_loss: 0.3362 - val_acc: 0.8550
Epoch 74/100
150/150 [==============================] - 15s 98ms/step - loss: 0.4003 - acc: 0.8271 - val_loss: 0.3516 - val_acc: 0.8500
Epoch 75/100
150/150 [==============================] - 13s 83ms/step - loss: 0.3979 - acc: 0.8221 - val_loss: 0.3615 - val_acc: 0.8383
Epoch 76/100
150/150 [==============================] - 12s 79ms/step - loss: 0.3997 - acc: 0.8271 - val_loss: 0.4037 - val_acc: 0.8200
Epoch 77/100
150/150 [==============================] - 16s 104ms/step - loss: 0.3882 - acc: 0.8179 - val_loss: 0.3508 - val_acc: 0.8550
Epoch 78/100
150/150 [==============================] - 12s 82ms/step - loss: 0.3911 - acc: 0.8308 - val_loss: 0.4287 - val_acc: 0.8200
Epoch 79/100
150/150 [==============================] - 13s 86ms/step - loss: 0.4035 - acc: 0.8192 - val_loss: 0.3530 - val_acc: 0.8383
Epoch 80/100
150/150 [==============================] - 14s 94ms/step - loss: 0.3823 - acc: 0.8250 - val_loss: 0.3689 - val_acc: 0.8500
Epoch 81/100
150/150 [==============================] - 12s 78ms/step - loss: 0.4039 - acc: 0.8138 - val_loss: 0.3573 - val_acc: 0.8417
Epoch 82/100
150/150 [==============================] - 17s 112ms/step - loss: 0.3857 - acc: 0.8254 - val_loss: 0.3612 - val_acc: 0.8417
Epoch 83/100
150/150 [==============================] - 12s 79ms/step - loss: 0.3851 - acc: 0.8304 - val_loss: 0.3506 - val_acc: 0.8433
Epoch 84/100
150/150 [==============================] - 12s 79ms/step - loss: 0.3875 - acc: 0.8342 - val_loss: 0.3266 - val_acc: 0.8567
Epoch 85/100
150/150 [==============================] - 12s 82ms/step - loss: 0.3790 - acc: 0.8325 - val_loss: 0.3239 - val_acc: 0.8750
Epoch 86/100
150/150 [==============================] - 12s 78ms/step - loss: 0.3691 - acc: 0.8354 - val_loss: 0.3252 - val_acc: 0.8683
Epoch 87/100
150/150 [==============================] - 15s 102ms/step - loss: 0.3833 - acc: 0.8296 - val_loss: 0.3486 - val_acc: 0.8500
Epoch 88/100
150/150 [==============================] - 13s 86ms/step - loss: 0.3758 - acc: 0.8275 - val_loss: 0.4556 - val_acc: 0.7933
Epoch 89/100
150/150 [==============================] - 12s 78ms/step - loss: 0.3875 - acc: 0.8275 - val_loss: 0.3343 - val_acc: 0.8583
Epoch 90/100
150/150 [==============================] - 12s 82ms/step - loss: 0.3691 - acc: 0.8446 - val_loss: 0.3399 - val_acc: 0.8533
Epoch 91/100
150/150 [==============================] - 13s 83ms/step - loss: 0.3715 - acc: 0.8408 - val_loss: 0.3991 - val_acc: 0.8117
Epoch 92/100
150/150 [==============================] - 14s 91ms/step - loss: 0.3627 - acc: 0.8400 - val_loss: 0.3537 - val_acc: 0.8400
Epoch 93/100
150/150 [==============================] - 15s 98ms/step - loss: 0.3830 - acc: 0.8296 - val_loss: 0.3945 - val_acc: 0.8350
Epoch 94/100
150/150 [==============================] - 12s 78ms/step - loss: 0.3808 - acc: 0.8325 - val_loss: 0.3871 - val_acc: 0.8283
Epoch 95/100
150/150 [==============================] - 12s 83ms/step - loss: 0.3621 - acc: 0.8396 - val_loss: 0.4357 - val_acc: 0.8183
Epoch 96/100
150/150 [==============================] - 13s 85ms/step - loss: 0.3833 - acc: 0.8371 - val_loss: 0.3107 - val_acc: 0.8800
Epoch 97/100
150/150 [==============================] - 13s 84ms/step - loss: 0.3619 - acc: 0.8383 - val_loss: 0.4303 - val_acc: 0.8250
Epoch 98/100
150/150 [==============================] - 16s 104ms/step - loss: 0.3667 - acc: 0.8358 - val_loss: 0.4421 - val_acc: 0.8233
Epoch 99/100
150/150 [==============================] - 12s 82ms/step - loss: 0.3714 - acc: 0.8321 - val_loss: 0.4067 - val_acc: 0.8267
Epoch 100/100
150/150 [==============================] - 12s 83ms/step - loss: 0.3736 - acc: 0.8296 - val_loss: 0.3370 - val_acc: 0.8550
model.save('cats_and_dogs_small_input_format_change_augment.h5')
from keras.models import load_model
model = load_model('/kaggle/input/test-save/cats_and_dogs_small_input_format_change_augment.h5')
model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_8 (Conv2D)            (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 74, 74, 32)        0         
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 72, 72, 64)        18496     
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_11 (MaxPooling (None, 7, 7, 128)         0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 6272)              0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 6272)              0         
_________________________________________________________________
dense_4 (Dense)              (None, 512)               3211776   
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 513       
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________
test_generator = test_datagen.flow(np.array(X_test), batch_size=batch_size)

prediction_probabilities = model.predict_generator(test_generator, verbose=1)
print(prediction_probabilities.shape)
782/782 [==============================] - 4s 5ms/step
(12500, 1)
counter = range(1, len(test_images_dogs_cats) + 1)
solution = pd.DataFrame({"id": counter, "label":list(prediction_probabilities)})
cols = ['label']

for col in cols:
    solution[col] = solution[col].map(lambda x: str(x).lstrip('[').rstrip(']')).astype(float)

solution.to_csv("dogsVScats2.csv", index = False)