# Generate CNN training data.

Posted on January 02, 2017 in notebooks

When training convolutional neural networks (CNNs) doesn't work, it's difficult to know what went wrong. A good starting point for debugging these networks is training them on clean data with clear patterns. In this notebook I create a simple image sequence of a moving square and attempt to predict its x (horizontal) coordinate.

```
import numpy as np
import random
from matplotlib.pyplot import imshow
from matplotlib import pyplot as plt
from matplotlib import cm
%matplotlib inline
```

This function can return both the x and y positions, but for now we just want to predict the x position of the square.

```
def moving_square(n_frames=100, return_x=True, return_y=True):
    '''
    Generate a sequence of images of a square bouncing around
    the frame, along with labels of its coordinates. Can be used
    as a simple performance test of convolutional networks.
    '''
    row = 120
    col = 160
    movie = np.zeros((n_frames, row, col, 3), dtype=float)
    labels = np.zeros((n_frames, 2), dtype=float)
    # initial position
    x = np.random.randint(20, col - 20)
    y = np.random.randint(20, row - 20)
    # direction of motion
    directionx = -1
    directiony = 1
    # size of the square
    w = 4
    for t in range(n_frames):
        # move
        x += directionx
        y += directiony
        # make the square bounce off the walls
        if y < 5 or y > row - 5:
            directiony *= -1
        if x < 5 or x > col - 5:
            directionx *= -1
        # draw the square and record its coordinates
        movie[t, y - w: y + w, x - w: x + w, 1] += 1
        labels[t] = np.array([x, y])
    # only return the requested labels
    if return_x and return_y:
        return movie, labels
    elif return_x and not return_y:
        return movie, labels[:, 0]
    else:
        return movie, labels[:, 1]
```
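As a quick sanity check (a side experiment, not part of the original notebook), the 1-D bounce rule used above can be simulated on its own to confirm the coordinate never leaves the frame:

```python
# Standalone simulation of the bounce rule from moving_square:
# step the coordinate, and reverse direction near either wall.
col = 160
x, direction = 50, -1
xs = []
for t in range(1000):
    x += direction
    if x < 5 or x > col - 5:
        direction *= -1
    xs.append(x)

print(min(xs), max(xs))
```

The center stays within [4, 156], so with a half-width of 4 the drawn square never clips the 160-pixel-wide frame.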

Here we create the images and the labels of the x position. Both are numpy arrays containing 2000 samples each.

```
movie, labels = moving_square(2000, return_y=False)
```
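A side note (a back-of-the-envelope calculation, not from the original notebook): at numpy's default float64 dtype, 2000 frames of 120×160 RGB add up to a fairly large array, which matters when moving data to the GPU:

```python
import numpy as np

# 2000 frames of 120x160 RGB stored as float64
n_frames, row, col, channels = 2000, 120, 160, 3
n_elements = n_frames * row * col * channels
bytes_f64 = n_elements * np.dtype(np.float64).itemsize
print('%.2f GB' % (bytes_f64 / 1e9))
```

Storing the movie as float32 would halve this footprint with no practical loss for this task.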

Here we can see how the square bounces around the image to give us a wide range of positions.

```
fig, ax = plt.subplots(1, 10, figsize=(15, 6),
                       subplot_kw={'adjustable': 'box-forced'})
axoff = np.vectorize(lambda ax: ax.axis('off'))
axoff(ax)
for i in range(10):
    frame = i * 20
    ax[i].imshow(movie[frame])
    ax[i].set_title(labels[frame])
```

```
def split_data(X, Y, train_frac=.8):
    '''Split samples sequentially: the first train_frac for training,
    the remainder for testing.'''
    assert len(X) == len(Y)
    cutoff = int(len(X) * train_frac)
    X_train = X[:cutoff]
    Y_train = Y[:cutoff]
    X_test = X[cutoff:]
    Y_test = Y[cutoff:]
    return X_train, Y_train, X_test, Y_test

movie_train, labels_train, movie_test, labels_test = split_data(movie, labels)
print('training samples: %s, test samples: %s' % (len(movie_train), len(movie_test)))
```
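Note that this split is sequential, so the test frames are a direct continuation of the training frames. If you want a test set that is less correlated with training, a shuffled split is one option. A minimal sketch with stand-in arrays (the names here are hypothetical, not from the notebook):

```python
import numpy as np

# Stand-ins for movie and labels, just to show the mechanics
X = np.arange(100).reshape(100, 1)
Y = np.arange(100)

rng = np.random.RandomState(0)
idx = rng.permutation(len(X))        # shuffle the sample order
cutoff = int(len(X) * .8)
X_train, X_test = X[idx[:cutoff]], X[idx[cutoff:]]
Y_train, Y_test = Y[idx[:cutoff]], Y[idx[cutoff:]]
print(len(X_train), len(X_test))
```

Whether shuffling helps depends on the goal: for this toy problem it broadens the position coverage in both sets, but it also means adjacent, nearly identical frames can land on both sides of the split.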

Now that we have split our data into training and test sets, we can create our model to test. For this example, we'll use a 3-layer convolutional network with a dense layer at the end.

```
from keras.layers import Input, Dense, merge
from keras.models import Model
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten
```

```
def cnn3_full1():
    img_in = Input(shape=(120, 160, 3), name='img_in')
    angle_in = Input(shape=(1,), name='angle_in')
    x = Convolution2D(8, 3, 3)(img_in)
    x = Activation('relu')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Convolution2D(16, 3, 3)(x)
    x = Activation('relu')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Convolution2D(32, 3, 3)(x)
    x = Activation('relu')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    merged = Flatten()(x)
    #merged = merge([flat, angle_in], mode='concat', concat_axis=-1)
    x = Dense(256)(merged)
    x = Activation('linear')(x)
    x = Dropout(.2)(x)
    angle_out = Dense(1, name='angle_out')(x)
    model = Model(input=[img_in], output=[angle_out])
    return model
```

Since we are estimating a floating point value between roughly 0 and 160, we've set our final activation function to be linear and we'll use a mean squared error (mse) loss function.
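For reference, mean squared error is just the average of the squared differences between predictions and labels; in numpy terms (with hypothetical x-coordinate values):

```python
import numpy as np

# Hypothetical x-coordinate labels and predictions
y_true = np.array([50.0, 80.0, 120.0])
y_pred = np.array([51.0, 78.0, 120.0])

# mse = mean of squared errors: (1 + 4 + 0) / 3
mse = np.mean((y_true - y_pred) ** 2)
print(mse)
```

Because errors are squared, a single prediction that is off by 10 pixels contributes as much loss as 100 predictions each off by 1 pixel, which pushes the network to avoid large misses.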

```
model = cnn3_full1()
model.compile(loss='mse', optimizer='adam')
```

```
model.fit(movie_train, labels_train, batch_size=32, nb_epoch=10,
          validation_data=(movie_test, labels_test))
```

Our model shows adequate training after 10 epochs. By comparing the actual vs predicted values on the test data, we can see that most predictions are within 1 pixel of the actual value. Now that we're sure our model can learn a simple environment, we can try it on more complicated ones.

```
import pandas as pd
predictions = model.predict(movie_test[:400])
results = {'angle_pred': predictions[:, 0],
           'angle_actual': labels_test[:400]}
df = pd.DataFrame(data=results)
df
```
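The "within 1 pixel" claim can be quantified by measuring the fraction of absolute errors below one pixel. A sketch with hypothetical arrays standing in for the model output:

```python
import numpy as np

# Hypothetical predictions and labels in place of model.predict output
angle_actual = np.array([40.0, 75.0, 110.0, 150.0])
angle_pred = np.array([40.4, 74.8, 110.9, 147.0])

# Fraction of samples whose absolute error is under one pixel
errors = np.abs(angle_pred - angle_actual)
within_1px = np.mean(errors < 1.0)
print(within_1px)
```

With the real arrays, `np.mean(np.abs(predictions[:, 0] - labels_test[:400]) < 1.0)` gives the same statistic for the plotted test samples.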

```
ax = df.plot()
ax.set_xlabel("samples")
ax.set_ylabel("x value")
```

These test-set results show the actual x value alongside the predicted one. Predictions in the middle of the image (x values between 40 and 100) are more accurate than those below 40 or above 100. This is likely due to the few samples that occurred in those ranges.
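One way to confirm this edge-of-image effect is to bin the absolute error by the actual x value. A sketch with hypothetical errors (the real version would use the trained model's predictions):

```python
import numpy as np

# Hypothetical x positions and absolute prediction errors
x_actual = np.array([20.0, 50.0, 70.0, 90.0, 130.0, 150.0])
abs_err = np.array([2.5, 0.5, 0.4, 0.6, 2.0, 3.1])

# Three regions: left edge (<40), middle (40-100), right edge (>100)
bins = np.digitize(x_actual, [40, 100])
for region, name in enumerate(['x < 40', '40 <= x < 100', 'x >= 100']):
    print(name, abs_err[bins == region].mean())
```

If the mean error in the edge bins is clearly larger than in the middle bin, collecting more frames near the walls (or training longer) would be the obvious next step.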
