# Handwritten Digit Recognition using Neural Network

**Introduction:**

Handwritten digit recognition on the MNIST dataset is a classic project built with a neural network. At its core, the system recognizes scanned images of handwritten digits.

We have taken this a step further: our handwritten digit recognition system not only detects scanned images of handwritten digits but also lets you write digits on the screen, through an integrated GUI, for recognition.


**Approach:**

We will approach this project by using a three-layered Neural Network.

- **The input layer:** It distributes the features of our examples to the next layer for calculating the activations of that layer.
- **The hidden layer:** It is made of hidden units called activations, which provide the nonlinearity for the network. The number of hidden layers can vary according to our requirements.
- **The output layer:** The nodes here are called output units. They provide the final prediction of the neural network.

A neural network is a model inspired by how the brain works. It consists of multiple layers of units called activations, which resemble the neurons of our brain. A neural network tries to learn, from a set of data, the parameters that capture the underlying relationships. Neural networks can adapt to changing input, so the network generates the best possible result without needing to redesign the output criteria.
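Concretely, the forward pass through the three-layer network used here (matching the implementation in *Model.py* below) computes:

$$a^{(1)} = x, \qquad z^{(2)} = \Theta_1 a^{(1)}, \qquad a^{(2)} = g(z^{(2)}), \qquad z^{(3)} = \Theta_2 a^{(2)}, \qquad a^{(3)} = g(z^{(3)}) = h_\Theta(x)$$

where $g(z) = \frac{1}{1 + e^{-z}}$ is the sigmoid activation and a bias unit is appended to $a^{(1)}$ and $a^{(2)}$ before each multiplication.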

**Methodology:**

We have implemented a neural network with 1 hidden layer having *100* activation units (excluding bias units). The data is loaded from a *.mat* file, and features (X) and labels (y) are extracted. The features are then divided by *255* to rescale them into the range *[0, 1]* and avoid overflow during computation. The data is split into *60,000* training and *10,000* testing examples. Feedforward is performed on the training set to calculate the hypothesis, and then backpropagation is done to reduce the error between the layers. The regularization parameter lambda is set to 0.1 to address overfitting. The optimizer is run for up to 100 iterations to find the best-fit model.

**Note:**

- Save all *.py* files in the same directory.
- Download the dataset from https://www.kaggle.com/avnishnish/mnist-original/download
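After downloading, a quick sanity check (a minimal sketch; the shapes shown are what *mnist-original.mat* is expected to contain) confirms the file loads correctly:

```python
from scipy.io import loadmat

# Quick sanity check on the downloaded file
data = loadmat('mnist-original.mat')
print(data['data'].shape)   # expected: (784, 70000) -- one column per image
print(data['label'].shape)  # expected: (1, 70000)
```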

**Main.py**

Import all the required libraries and extract the data from the *mnist-original.mat* file. Then features and labels are separated from the extracted data, and the data is split into training (60,000) and testing (10,000) examples. Thetas are randomly initialized in the range [-0.15, +0.15] to break symmetry and get better results. Further, the optimizer is called to train the weights, minimizing the cost function for appropriate predictions. We have used the "*minimize*" optimizer from the "*scipy.optimize*" library with the "*L-BFGS-B*" method. We calculate the test set accuracy, training set accuracy, and precision using the "*predict*" function.

## Python3

```python
from scipy.io import loadmat
import numpy as np
from Model import neural_network
from RandInitialize import initialise
from Prediction import predict
from scipy.optimize import minimize

# Loading mat file
data = loadmat('mnist-original.mat')

# Extracting features from mat file
X = data['data']
X = X.transpose()

# Normalizing the data
X = X / 255

# Extracting labels from mat file
y = data['label']
y = y.flatten()

# Splitting data into training set with 60,000 examples
X_train = X[:60000, :]
y_train = y[:60000]

# Splitting data into testing set with 10,000 examples
X_test = X[60000:, :]
y_test = y[60000:]

m = X.shape[0]
input_layer_size = 784  # Images are of (28 X 28) px so there will be 784 features
hidden_layer_size = 100
num_labels = 10  # There are 10 classes [0, 9]

# Randomly initialising Thetas
initial_Theta1 = initialise(hidden_layer_size, input_layer_size)
initial_Theta2 = initialise(num_labels, hidden_layer_size)

# Unrolling parameters into a single column vector
initial_nn_params = np.concatenate((initial_Theta1.flatten(), initial_Theta2.flatten()))
maxiter = 100
lambda_reg = 0.1  # To avoid overfitting
myargs = (input_layer_size, hidden_layer_size, num_labels, X_train, y_train, lambda_reg)

# Calling minimize function to minimize cost function and to train weights
results = minimize(neural_network, x0=initial_nn_params, args=myargs,
                   options={'disp': True, 'maxiter': maxiter}, method="L-BFGS-B", jac=True)

nn_params = results["x"]  # Trained Theta is extracted

# Weights are split back to Theta1, Theta2
Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                    (hidden_layer_size, input_layer_size + 1))  # shape = (100, 785)
Theta2 = np.reshape(nn_params[hidden_layer_size * (input_layer_size + 1):],
                    (num_labels, hidden_layer_size + 1))  # shape = (10, 101)

# Checking test set accuracy of our model
pred = predict(Theta1, Theta2, X_test)
print('Test Set Accuracy: {:f}'.format(np.mean(pred == y_test) * 100))

# Checking train set accuracy of our model
pred = predict(Theta1, Theta2, X_train)
print('Training Set Accuracy: {:f}'.format(np.mean(pred == y_train) * 100))

# Evaluating precision of our model
true_positive = 0
for i in range(len(pred)):
    if pred[i] == y_train[i]:
        true_positive += 1
false_positive = len(y_train) - true_positive

print('Precision =', true_positive / (true_positive + false_positive))

# Saving Thetas in .txt file
np.savetxt('Theta1.txt', Theta1, delimiter=' ')
np.savetxt('Theta2.txt', Theta2, delimiter=' ')
```

**RandInitialize.py**

It randomly initializes theta in the range [-epsilon, +epsilon].
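Concretely, the uniform draw in $[0, 1)$ is shifted and scaled so that every entry lands in $[-\epsilon, +\epsilon)$:

$$\Theta_{ij} = \mathrm{rand}(0,1) \cdot 2\epsilon - \epsilon, \qquad \epsilon = 0.15$$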

## Python3

```python
import numpy as np


def initialise(a, b):
    epsilon = 0.15
    # Randomly initialises values of thetas between [-epsilon, +epsilon]
    c = np.random.rand(a, b + 1) * (2 * epsilon) - epsilon
    return c
```

**Model.py**

The function performs feed-forward and backpropagation.

- Forward propagation: Input data is fed forward through the network. Each hidden layer accepts the input, processes it with the activation function, and passes it to the successive layer. We will use the sigmoid function as our "activation function".
- Backward propagation: It is the practice of fine-tuning the weights of a neural net based on the error rate obtained in the previous iteration.

It also calculates the cross-entropy cost to measure the error between the predictions and the original values. In the end, the gradient of the optimization objective is calculated; the cost and gradients are written out below.
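For reference, the quantities computed in *Model.py* are the regularized cross-entropy cost

$$J(\Theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=0}^{9}\left[-y_k^{(i)}\log a_{3,k}^{(i)} - (1 - y_k^{(i)})\log(1 - a_{3,k}^{(i)})\right] + \frac{\lambda}{2m}\left(\sum_{i,\,j\ge 1}(\Theta_1)_{ij}^2 + \sum_{i,\,j\ge 1}(\Theta_2)_{ij}^2\right)$$

and the backpropagated errors

$$\delta^{(3)} = a^{(3)} - y, \qquad \delta^{(2)} = \left(\delta^{(3)}\Theta_2\right) \odot a^{(2)} \odot (1 - a^{(2)})$$

with gradients $\frac{\partial J}{\partial \Theta_l} = \frac{1}{m}\,\delta^{(l+1)\top} a^{(l)} + \frac{\lambda}{m}\Theta_l$, where the bias columns ($j = 0$) are excluded from the regularization term.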

## Python3

```python
import numpy as np


def neural_network(nn_params, input_layer_size, hidden_layer_size, num_labels, X, y, lamb):
    # Weights are split back to Theta1, Theta2
    Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                        (hidden_layer_size, input_layer_size + 1))
    Theta2 = np.reshape(nn_params[hidden_layer_size * (input_layer_size + 1):],
                        (num_labels, hidden_layer_size + 1))

    # Forward propagation
    m = X.shape[0]
    one_matrix = np.ones((m, 1))
    X = np.append(one_matrix, X, axis=1)  # Adding bias unit to first layer
    a1 = X
    z2 = np.dot(X, Theta1.transpose())
    a2 = 1 / (1 + np.exp(-z2))  # Activation for second layer
    one_matrix = np.ones((m, 1))
    a2 = np.append(one_matrix, a2, axis=1)  # Adding bias unit to hidden layer
    z3 = np.dot(a2, Theta2.transpose())
    a3 = 1 / (1 + np.exp(-z3))  # Activation for third layer

    # Changing the y labels into vectors of boolean values.
    # For each label between 0 and 9, there will be a vector of length 10
    # where the ith element will be 1 if the label equals i
    y_vect = np.zeros((m, 10))
    for i in range(m):
        y_vect[i, int(y[i])] = 1

    # Calculating cost function
    J = (1 / m) * (np.sum(np.sum(-y_vect * np.log(a3) - (1 - y_vect) * np.log(1 - a3)))) + (lamb / (2 * m)) * (
        sum(sum(pow(Theta1[:, 1:], 2))) + sum(sum(pow(Theta2[:, 1:], 2))))

    # backprop
    Delta3 = a3 - y_vect
    Delta2 = np.dot(Delta3, Theta2) * a2 * (1 - a2)
    Delta2 = Delta2[:, 1:]

    # gradient
    Theta1[:, 0] = 0  # Zero the bias column so it is not regularized
    Theta1_grad = (1 / m) * np.dot(Delta2.transpose(), a1) + (lamb / m) * Theta1
    Theta2[:, 0] = 0
    Theta2_grad = (1 / m) * np.dot(Delta3.transpose(), a2) + (lamb / m) * Theta2
    grad = np.concatenate((Theta1_grad.flatten(), Theta2_grad.flatten()))
    return J, grad
```

**Prediction.py**

It performs forward propagation to predict the digit.

## Python3

```python
import numpy as np


def predict(Theta1, Theta2, X):
    m = X.shape[0]
    one_matrix = np.ones((m, 1))
    X = np.append(one_matrix, X, axis=1)  # Adding bias unit to first layer
    z2 = np.dot(X, Theta1.transpose())
    a2 = 1 / (1 + np.exp(-z2))  # Activation for second layer
    one_matrix = np.ones((m, 1))
    a2 = np.append(one_matrix, a2, axis=1)  # Adding bias unit to hidden layer
    z3 = np.dot(a2, Theta2.transpose())
    a3 = 1 / (1 + np.exp(-z3))  # Activation for third layer
    p = np.argmax(a3, axis=1)  # Predicting the class on the basis of max value of hypothesis
    return p
```
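As a usage sketch (assuming *Main.py* has already been run, so *Theta1.txt* and *Theta2.txt* exist in the same directory), *predict* can be called directly on any (n, 784) matrix of rescaled pixel values:

```python
import numpy as np
from scipy.io import loadmat
from Prediction import predict

# Load the weights trained and saved by Main.py
Theta1 = np.loadtxt('Theta1.txt')
Theta2 = np.loadtxt('Theta2.txt')

# Re-load and rescale the data exactly as Main.py does
data = loadmat('mnist-original.mat')
X = data['data'].transpose() / 255
y = data['label'].flatten()

# Predict a single test-set image (row indices >= 60000 are test examples)
sample = X[60000:60001, :]  # shape (1, 784)
print('Predicted:', predict(Theta1, Theta2, sample)[0], '| Actual:', int(y[60000]))
```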

**GUI.py**

It launches a GUI for writing digits. The digit drawn on the canvas is captured with *ImageGrab*, converted to grayscale, and resized to *(28 X 28)* pixels before being passed to the network for prediction.

## Python3

```python
from tkinter import *
import numpy as np
from PIL import ImageGrab
from Prediction import predict

window = Tk()
window.title("Handwritten digit recognition")
l1 = Label()


def MyProject():
    global l1

    widget = cv
    # Setting co-ordinates of canvas
    x = window.winfo_rootx() + widget.winfo_x()
    y = window.winfo_rooty() + widget.winfo_y()
    x1 = x + widget.winfo_width()
    y1 = y + widget.winfo_height()

    # Image is captured from canvas and is resized to (28 X 28) px
    img = ImageGrab.grab().crop((x, y, x1, y1)).resize((28, 28))

    # Converting rgb to grayscale image
    img = img.convert('L')

    # Extracting pixel matrix of image and converting it to a vector of (1, 784)
    x = np.asarray(img)
    vec = np.zeros((1, 784))
    k = 0
    for i in range(28):
        for j in range(28):
            vec[0][k] = x[i][j]
            k += 1

    # Loading Thetas
    Theta1 = np.loadtxt('Theta1.txt')
    Theta2 = np.loadtxt('Theta2.txt')

    # Calling function for prediction
    pred = predict(Theta1, Theta2, vec / 255)

    # Displaying the result
    l1 = Label(window, text="Digit = " + str(pred[0]), font=('Algerian', 20))
    l1.place(x=230, y=420)


lastx, lasty = None, None


# Clears the canvas
def clear_widget():
    global cv, l1
    cv.delete("all")
    l1.destroy()


# Activate canvas
def event_activation(event):
    global lastx, lasty
    cv.bind('<B1-Motion>', draw_lines)
    lastx, lasty = event.x, event.y


# To draw on canvas
def draw_lines(event):
    global lastx, lasty
    x, y = event.x, event.y
    cv.create_line((lastx, lasty, x, y), width=30, fill='white',
                   capstyle=ROUND, smooth=TRUE, splinesteps=12)
    lastx, lasty = x, y


# Label
L1 = Label(window, text="Handwritten Digit Recognition", font=('Algerian', 25), fg="blue")
L1.place(x=35, y=10)

# Button to clear canvas
b1 = Button(window, text="1. Clear Canvas", font=('Algerian', 15), bg="orange", fg="black", command=clear_widget)
b1.place(x=120, y=370)

# Button to predict digit drawn on canvas
b2 = Button(window, text="2. Prediction", font=('Algerian', 15), bg="white", fg="red", command=MyProject)
b2.place(x=320, y=370)

# Setting properties of canvas
cv = Canvas(window, width=350, height=290, bg='black')
cv.place(x=120, y=70)
cv.bind('<Button-1>', event_activation)

window.geometry("600x500")
window.mainloop()
```

**Result:**

Training set accuracy of 99.440000%

Test set accuracy of 97.320000%

Precision of 0.9944


This article is contributed by:

- Utkarsh Shaw (https://auth.geeksforgeeks.org/user/utkarshshaw/profile)
- Tania (https://auth.geeksforgeeks.org/user/taniachanana02/profile)
- Rishab Mamgai (https://auth.geeksforgeeks.org/user/rishabmamgai/profile)