ML Model Accuracy/Results
Handwritten Alphanumeric model
Iteration 1:
Model trained and inference run on | Existing dataset | 99.80%
Model inference run on | NIST dataset | 62.70%
Average across all datasets | - | 81.25%
Reasoning: Because the model is trained only on the existing dataset, it does not perform well on the NIST dataset.
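The average reported for this iteration is consistent with an unweighted mean of the two per-dataset accuracies. A minimal sketch of that arithmetic, assuming each dataset counts equally regardless of its size (the function name is illustrative and not taken from the project):

```python
def average_accuracy(accuracies):
    """Unweighted mean of per-dataset accuracies, in percent."""
    if not accuracies:
        raise ValueError("need at least one accuracy value")
    return sum(accuracies) / len(accuracies)


# Iteration 1 figures for the handwritten alphanumeric model.
iteration_1 = {"existing": 99.80, "nist": 62.70}
print(f"{average_accuracy(list(iteration_1.values())):.2f}%")  # 81.25%
```

A sample-weighted mean would give a different figure when the datasets differ in size, so it helps to state which one a reported average uses.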
Iteration 2:
Model trained on | Existing dataset + NIST misclassifications | 83.30%
Reasoning: After training the model on the NIST misclassifications, accuracy improves.
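The step of feeding NIST misclassifications back into training can be sketched as follows. This is only an illustration under assumptions: `predict_fn`, `collect_misclassified`, and `augment_training_set` are hypothetical names, and the data is assumed to be held in NumPy arrays. The actual training pipeline is not described here.

```python
import numpy as np


def collect_misclassified(predict_fn, images, labels):
    """Return the images and labels that predict_fn gets wrong."""
    predictions = np.asarray(predict_fn(images))
    wrong = predictions != labels  # boolean mask of misclassified samples
    return images[wrong], labels[wrong]


def augment_training_set(train_x, train_y, extra_x, extra_y):
    """Append the misclassified samples to the existing training data."""
    return (np.concatenate([train_x, extra_x], axis=0),
            np.concatenate([train_y, extra_y], axis=0))


# Toy stand-ins: four "images" of three pixels each and a fixed predictor.
nist_x = np.arange(12).reshape(4, 3)
nist_y = np.array([0, 1, 2, 3])
toy_predict = lambda x: np.array([0, 1, 0, 0])  # wrong on the last two samples

hard_x, hard_y = collect_misclassified(toy_predict, nist_x, nist_y)
train_x, train_y = augment_training_set(nist_x[:2], nist_y[:2], hard_x, hard_y)
print(hard_y.tolist(), train_x.shape)
```

The model would then be retrained on the augmented set, optionally starting from the previous checkpoint.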
Iteration 3:
Model trained on | Existing dataset + NIST misclassifications | 93.90%
Reasoning: After training the model on the NIST misclassifications, as per the previous checkpoint, accuracy improves.
Iteration 4:
Model trained on | Existing dataset + manually collected dataset from sheets (~1k samples) | 93.90%
Reasoning: Training the model from the previous checkpoint with the manually collected data added improves accuracy.
Handwritten Digits model
Iteration 1:
Model trained and inference run on | Existing dataset | 99.90%
Model inference run on | NIST dataset | 60.00%
Model inference run on | Obtained production data (~50 samples) | 97.70%
Average across all datasets | - | 85.80%
Reasoning: Because the model is trained only on the existing dataset, it does not perform well on the NIST dataset.
Iteration 2:
Model trained on | Existing dataset + NIST misclassifications | 96.40%
Reasoning: After training the model on the NIST misclassifications, accuracy improves.
Iteration 3:
Model trained on | Existing dataset + NIST misclassifications + production dataset (~50 samples) | 99.70%
Reasoning: After training the model on the NIST misclassifications and the production dataset, accuracy improves.
Iteration 4:
Model trained on | Existing dataset + NIST misclassifications + manually collected dataset from sheets (~8.6k samples) | 98.30%
Reasoning: Because the accuracy is averaged over a large production dataset, it dips slightly compared with iteration 3.
Sample dataset images
Existing dataset
Handwritten alphanumeric
Handwritten digits
NIST dataset
Handwritten alphanumeric
Handwritten digits
Manually collected
Handwritten alphanumeric
Handwritten digits
Some unhandled misclassifications
Reasoning: These generally occur when the digits are written in the corners of the cell.
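One possible way to flag such cells before inference is to measure how far the ink sits from the centre of the cell. The sketch below is an assumption and not part of the pipeline described here: `ink_offset` and `is_near_corner` are hypothetical helpers, the cell is assumed to be a 2-D array normalised to [0, 1] with ink as high values, and the `threshold` and `limit` defaults are illustrative.

```python
import numpy as np


def ink_offset(cell, threshold=0.5):
    """Offset of the ink centroid from the cell centre.

    Returns (dy, dx), each in [-1, 1]; values near +/-1 mean the ink
    lies close to an edge of the cell.
    """
    ink = np.asarray(cell) > threshold
    if not ink.any():
        return 0.0, 0.0  # empty cell: nothing to measure
    rows, cols = np.nonzero(ink)
    height, width = ink.shape
    centre_y, centre_x = (height - 1) / 2, (width - 1) / 2
    dy = (rows.mean() - centre_y) / centre_y if centre_y else 0.0
    dx = (cols.mean() - centre_x) / centre_x if centre_x else 0.0
    return float(dy), float(dx)


def is_near_corner(cell, limit=0.5):
    """True when the ink is displaced along both axes by more than limit."""
    dy, dx = ink_offset(cell)
    return abs(dy) > limit and abs(dx) > limit


# Toy cells: one with ink in the top-left corner, one with centred ink.
corner_cell = np.zeros((28, 28))
corner_cell[0:6, 0:6] = 1.0
centred_cell = np.zeros((28, 28))
centred_cell[11:17, 11:17] = 1.0
print(is_near_corner(corner_cell), is_near_corner(centred_cell))  # True False
```

Cells flagged this way could be re-cropped around the ink or set aside for review. Whether either helps would need to be checked against the actual misclassified samples.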