Optical Character Recognition
The goal of this project is to design and implement neural network models capable of recognizing handwritten digits (0–9). Each digit is represented as a binary 16×16 matrix (256 features), and several network architectures are trained and compared on how accurately they classify these inputs.
The project also includes the development of a simple graphical interface to test the trained models, allowing users to draw digits and visualize classification results.
The full repository and documentation can be found here: Optical Character Recognition Project Repository.
Tools and Technologies
- Programming Language: MATLAB (R2023b)
- Required Toolboxes: Image Processing Toolbox, Neural Network Toolbox
Objectives
- Implement and train neural networks for Optical Character Recognition (OCR).
- Compare multiple architectures:
  - A two-stage model with a filter + classifier.
  - A single-stage classifier (with one or two layers).
- Evaluate the influence of different activation functions and training algorithms.
- Develop a GUI that allows users to draw digits and test the classifiers interactively.
Methodology
Data Representation
Each digit is encoded as a binary 16×16 matrix, converted into a 256×1 vector. Training data is generated using the provided mpaper.m function and the PerfectArial.mat dataset.
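For illustration, the snippet below flattens one 16×16 binary image into the 256×1 input format and builds a one-hot target; the variable names (and the exact reshape orientation used by mpaper) are assumptions, not code from the repository.

```matlab
% Flatten a 16x16 binary digit into a 256x1 input vector and pair it with
% a one-hot 10x1 target. Names and reshape orientation are illustrative.
digitImage = false(16, 16);               % binary 16x16 digit (e.g. drawn with mpaper)
x = double(reshape(digitImage, 256, 1));  % 256x1 feature vector

label = 3;                                % true digit for this sample (0-9)
t = zeros(10, 1);
t(label + 1) = 1;                         % output neuron 4 codes digit 3
```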
Neural Network Architectures
- Filter + Classifier (both families are sketched below)
  - The filter can be an associative memory or a binary perceptron trained to correct imperfect inputs.
  - The classifier is a single-layer neural network with 10 output neurons (one per digit).
- Classifier Only
  - Direct classification without pre-filtering.
  - Tested with one or two layers (using patternnet for multi-layer architectures).
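A minimal sketch of the two families, assuming X is a 256×N matrix of digit vectors, T the corresponding 10×N one-hot targets, and Xnoisy/Xperfect are matching imperfect and prototype digits; the hidden-layer size is illustrative.

```matlab
% (a) Filter + classifier: a perceptron "filter" maps imperfect digits towards
%     their prototypes, then a single-layer classifier assigns a digit.
netFilter = perceptron();                          % 256 inputs -> 256 cleaned outputs
netFilter = train(netFilter, Xnoisy, Xperfect);    % learn to correct imperfect inputs

netClassifier = perceptron('hardlim', 'learnp');   % single layer, 10 output neurons
netClassifier = train(netClassifier, Xperfect, T);

% (b) Classifier only: a two-layer patternnet trained directly on the raw vectors.
netTwoLayer = patternnet(25);                      % 25 hidden neurons (illustrative)
netTwoLayer = train(netTwoLayer, X, T);
```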
Activation Functions
The following activation functions were tested (a configuration sketch follows the list):
- hardlim (binary threshold)
- purelin (linear)
- logsig (sigmoidal)
- softmax (for probabilistic classification)
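Switching between these functions amounts to changing a layer's transfer function on the candidate network; the network type and hidden-layer size below are illustrative.

```matlab
% Change transfer functions on a candidate network (sizes are illustrative).
net = patternnet(25);                     % hidden layer defaults to 'tansig'
net.layers{1}.transferFcn = 'logsig';     % sigmoidal hidden layer
net.layers{2}.transferFcn = 'softmax';    % probabilistic output layer
% For the single-layer experiments, 'hardlim' or 'purelin' can be set the same way.
```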
Training
Training was performed using MATLAB’s Neural Network Toolbox. Both incremental and batch learning modes were explored, with algorithms such as:
- learnp (Perceptron rule)
- learngd (Gradient descent)
- trainlm (Levenberg–Marquardt)
Datasets were split into training and validation sets to prevent overfitting.
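The sketch below outlines both modes under the same assumptions as above (X and T as the training inputs and one-hot targets); the hidden-layer size, split ratios, and the mse performance function are illustrative choices, not settings taken from the repository.

```matlab
% Batch training with Levenberg-Marquardt and a held-out validation split.
net = patternnet(25, 'trainlm', 'mse');   % trainlm is Jacobian-based, so use mse
net.divideFcn = 'dividerand';             % random train/validation split
net.divideParam.trainRatio = 0.8;
net.divideParam.valRatio   = 0.2;         % validation set used for early stopping
net.divideParam.testRatio  = 0.0;
net = train(net, X, T);

% Incremental (sample-by-sample) learning with the perceptron rule.
netInc = perceptron('hardlim', 'learnp');
netInc = configure(netInc, X, T);                  % set input/output dimensions
netInc = adapt(netInc, con2seq(X), con2seq(T));    % one weight update per sample
```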
Testing
Testing was carried out using new samples generated with mpaper, ensuring they differed from the training set. The sim function was used to evaluate generalization performance.
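In outline, evaluation reduces to running sim and comparing the winning output neuron with the true label for each new sample; the variable names below are illustrative.

```matlab
% Evaluate a trained classifier on fresh mpaper samples (names illustrative).
Y = sim(net, Xtest);                      % 10xM raw network outputs
[~, idx] = max(Y, [], 1);                 % most active output neuron per sample
predicted = idx - 1;                      % map neuron index back to digit 0-9
accuracy = mean(predicted == labelsTest); % fraction of correctly classified digits
```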
Implementation Details
Tools and Environment
- Programming Language: MATLAB (R2023b)
- Required Toolboxes: Image Processing Toolbox, Neural Network Toolbox
- Auxiliary Functions:
  - mpaper.m — draw digits and create binary input vectors
  - grafica.m — display digits
  - showim.m — visualize digit grids
  - ocrfun.m — call the classifier and display results
  - myclassify.m — custom classification script
GUI Application
A MATLAB App Designer interface (2023OCRGUIPLxGy) was built to:
- Draw digits using mpaper.
- Classify them using trained neural networks.
- Display both input and output digits in a 5×10 grid.
Evaluation
The following aspects were analyzed:
- Impact of dataset size and quality on model performance.
- Comparison of architectures (filter + classifier vs. classifier only).
- Effectiveness of different activation functions.
- Classification accuracy and robustness to imperfect inputs.
References
- MATLAB Documentation (R2023b): Neural Network Toolbox
- PerfectArial.mat — Reference digit dataset
- Statistical Pattern Recognition Toolbox (© 1999–2003 V. Franc and V. Hlavac)
Copyright © 2025 Joao ES Moreira
The contents of this website are licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License (CC-BY-ND 4.0).
The source code of this website is licensed under the MIT license and available in the GitHub repository. User-submitted contributions to the site are welcome, as long as the contributor agrees to license their submission under the CC-BY-ND 4.0 license.