In part 1 of this tutorial, we explained the advantages of using DCNNs for time series analysis and proposed a methodology that converts time series into gray-scale images. In part 2, we defined a Python class and various methods to perform data processing. In this third and final part, we will explain the topology of our model, discuss how to establish its computation graph, and demonstrate how to run this graph in a TensorFlow session.
Our Proposed DCNN Topology
As explained in part 1 of this series, CNNs contain only convolutional and pooling layers. DCNNs, on the other hand, also include one or more dense (fully connected) layers. There are many ways to combine and connect these layers, which gives rise to many possible topologies, and each choice of hyperparameters produces a different model. One challenge in solving our particular problem is therefore determining an effective architecture, since the architecture has a great impact on model performance.
In this tutorial, we will use a simple DCNN model with two convolutional layers, two pooling layers, and three dense layers. To avoid overfitting, we'll apply dropout, a technique that ignores a randomly chosen set of units (i.e., neurons) during the training phase. For developing this prediction model, we follow the guidelines published on the official TensorFlow website:
import tensorflow as tf  # TensorFlow 1.x API

def cnn_model():
    # ---- building the computation graph ----
    # the input images are 32*32, so each flattened image has 1024 pixels
    x = tf.placeholder(tf.float32, [None, 1024])
    # we have just two labels, one-hot encoded
    y = tf.placeholder(tf.float32, [None, 2])
    # flag that switches dropout on during training and off during evaluation
    training = tf.placeholder_with_default(False, shape=())
    # reshape the flattened input back into 32*32 single-channel images
    input_layer_of_load_images = tf.reshape(x, [-1, 32, 32, 1])
    # convolutional layer 1: 32 filters of size 5*5; "same" padding keeps the output 32*32
    convolute_layer_1 = tf.layers.conv2d(
        inputs=input_layer_of_load_images,
        filters=32,
        kernel_size=[5, 5],
        padding="same",
        activation=tf.nn.relu)
    # pooling layer 1: 2*2 max pooling with stride 2 halves each dimension to 16*16
    pool_layer_1 = tf.layers.max_pooling2d(
        inputs=convolute_layer_1,
        pool_size=[2, 2],
        strides=2)
    # convolutional layer 2: 64 filters of size 5*5; output stays 16*16
    convolute_layer_2 = tf.layers.conv2d(
        inputs=pool_layer_1,
        filters=64,
        kernel_size=[5, 5],
        padding="same",
        activation=tf.nn.relu)
    # pooling layer 2: halves each dimension again, to 8*8
    pool_layer_2 = tf.layers.max_pooling2d(
        inputs=convolute_layer_2,
        pool_size=[2, 2],
        strides=2)
    # flatten the 8*8*64 output of pooling layer 2 as input for the dense layers
    pool_layer_2_flat = tf.reshape(pool_layer_2, [-1, 8 * 8 * 64])
    # dense layer 1 specifications
    dense_layer_1 = tf.layers.dense(inputs=pool_layer_2_flat,
                                    units=1024,
                                    activation=tf.nn.relu)
    # dense layer 2 specifications
    dense_layer_2 = tf.layers.dense(inputs=dense_layer_1,
                                    units=600,
                                    activation=tf.nn.relu)
    # dropout to avoid overfitting: shuts down neurons at a rate of 40%,
    # but only while the `training` flag is fed as True
    dropout = tf.layers.dropout(inputs=dense_layer_2,
                                rate=0.4,
                                training=training)
    # dense layer 3 specifications
    dense_layer_3 = tf.layers.dense(inputs=dropout, units=400)
    # prediction and final outputs: softmax over the two classes
    prediction = tf.layers.dense(inputs=dense_layer_3,
                                 units=2,
                                 activation=tf.nn.softmax)
    # cross-entropy loss; clipping keeps tf.log away from log(0)
    cross_entropy = tf.reduce_mean(
        -tf.reduce_sum(y * tf.log(tf.clip_by_value(prediction, 1e-10, 1.0)), axis=1))
    train_step = tf.train.AdamOptimizer(0.001).minimize(cross_entropy)
    number_of_epochs = 15
    # ---- creating a session for running the computation graph ----
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch in range(number_of_epochs):
            epoch_loss = 0
            # makeImage, X_train, y_train, batch_size, and number_of_batches
            # were defined in part 2 of this tutorial
            makeImage.index_of_columns_2_be_batched = list(range(X_train.shape[0]))
            for _ in range(number_of_batches):
                epoch_x, epoch_y = makeImage.next_batch(X_train, y_train, batch_size)
                _, c = sess.run([train_step, cross_entropy],
                                feed_dict={x: epoch_x, y: epoch_y, training: True})
                epoch_loss += c
            print('Epoch', epoch + 1, 'completed out of', number_of_epochs, 'with loss:', epoch_loss)
            # evaluation runs with the default training=False, so dropout is disabled
            correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
            print('Accuracy in epoch', epoch + 1, 'is:', accuracy.eval({x: X_test, y: y_test}))

cnn_model()
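Note that makeImage, X_train, y_train, X_test, y_test, batch_size, and number_of_batches must already be defined (as in part 2 of this tutorial) before cnn_model() is called. As a reminder, here is a minimal sketch of how the batching variables might be set; the value of batch_size is a hypothetical choice, and the exact setup depends on your part 2 code:

# a minimal sketch, assuming the data-processing objects from part 2
batch_size = 100                                    # hypothetical batch size
number_of_batches = X_train.shape[0] // batch_size  # full batches per epoch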
Results
After running the model, the following results might be produced. (Your own results will differ from these due to the random initialization and batching used while building the model.) Notice that although the training loss decreases steadily from epoch to epoch, the test accuracy plateaus at roughly 88% after the first few epochs, which suggests the model begins to overfit the training data.
Epoch 1 completed out of 15 with loss: 51.4090359211
Accuracy in epoch 1 is: 0.853059
Epoch 2 completed out of 15 with loss: 36.055752486
Accuracy in epoch 2 is: 0.862016
Epoch 3 completed out of 15 with loss: 30.448112756
Accuracy in epoch 3 is: 0.87955
Epoch 4 completed out of 15 with loss: 24.6789700091
Accuracy in epoch 4 is: 0.885649
Epoch 5 completed out of 15 with loss: 19.7951619923
Accuracy in epoch 5 is: 0.874404
Epoch 6 completed out of 15 with loss: 15.8194133416
Accuracy in epoch 6 is: 0.883362
Epoch 7 completed out of 15 with loss: 12.7372778915
Accuracy in epoch 7 is: 0.877644
Epoch 8 completed out of 15 with loss: 10.4595579039
Accuracy in epoch 8 is: 0.881456
Epoch 9 completed out of 15 with loss: 7.60100067966
Accuracy in epoch 9 is: 0.879169
Epoch 10 completed out of 15 with loss: 6.00782689825
Accuracy in epoch 10 is: 0.877835
Epoch 11 completed out of 15 with loss: 5.18157571554
Accuracy in epoch 11 is: 0.872689
Epoch 12 completed out of 15 with loss: 4.55696826754
Accuracy in epoch 12 is: 0.883362
Epoch 13 completed out of 15 with loss: 3.64296960807
Accuracy in epoch 13 is: 0.876882
Epoch 14 completed out of 15 with loss: 2.26440465293
Accuracy in epoch 14 is: 0.883743
Epoch 15 completed out of 15 with loss: 2.38258726092
Accuracy in epoch 15 is: 0.879931
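To see this trend at a glance, you can plot the per-epoch loss and accuracy side by side. Below is a minimal matplotlib sketch; the lists losses and accuracies are names introduced here for illustration, filled with rounded values from the run above:

import matplotlib.pyplot as plt

# per-epoch values from the training run above (rounded)
losses = [51.41, 36.06, 30.45, 24.68, 19.80, 15.82, 12.74, 10.46,
          7.60, 6.01, 5.18, 4.56, 3.64, 2.26, 2.38]
accuracies = [0.853, 0.862, 0.880, 0.886, 0.874, 0.883, 0.878, 0.881,
              0.879, 0.878, 0.873, 0.883, 0.877, 0.884, 0.880]
epochs = range(1, len(losses) + 1)

fig, ax1 = plt.subplots()
ax1.plot(epochs, losses, 'b-')
ax1.set_xlabel('epoch')
ax1.set_ylabel('training loss', color='b')
ax2 = ax1.twinx()  # second y-axis so both curves share one plot
ax2.plot(epochs, accuracies, 'r-')
ax2.set_ylabel('test accuracy', color='r')
plt.title('Training loss vs. test accuracy per epoch')
plt.show()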
Conclusion
In this three-part blog, we provided a tutorial on how to use DCNNs for time series analysis. We first presented our proposed methodology, explaining how it's possible to convert a time series into gray-scale images. Next, we provided Python code for data processing and preparation. Finally, we used TensorFlow to build and run our prediction model. As seen in our results, this method allows us to reach an accuracy of about 88% when predicting trends for the next state of electricity consumption.
At Avenue Code, our data science teams enjoy providing solutions for real-world challenges by using DCNNs and many other technologies. We serve diverse industries, including retail, energy, and telecommunications, and we invite you to contact us to discover our solutions to your challenges.
Author
Hossein Javedani Sadaei
Hossein Javedani Sadaei is a Machine Learning Practice Lead at Avenue Code with post-doctoral experience in big data mining and a PhD in statistics. He works mostly with machine learning and deep learning in the retail, telecommunications, energy, and stock market domains. His main expertise is developing scalable machine learning and deep learning algorithms using fuzzy logic.