ML Model Accuracy/Results
Handwritten Alphanumeric model
Iteration 1:
Model details | Dataset | Accuracy achieved |
---|---|---|
Model trained and evaluated on | Existing dataset | 99.80% |
Model evaluated on (inference only) | NIST dataset | 62.70% |
Average across all datasets | - | 81.25% |
Reasoning - The model was trained only on the existing dataset, so it does not perform well on the NIST dataset
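The "Average across all datasets" figure appears to be an unweighted mean of the per-dataset accuracies. This is an assumption, although the numbers in the table above are consistent with it. A minimal sketch, where `average_accuracy` is a hypothetical helper name and not part of any project code:

```python
def average_accuracy(accuracies):
    """Unweighted mean of per-dataset accuracies, in percent.

    Assumption: every dataset counts equally, regardless of its size.
    """
    if not accuracies:
        raise ValueError("need at least one accuracy value")
    return sum(accuracies) / len(accuracies)


# Iteration 1 of the alphanumeric model: existing dataset and NIST dataset.
alphanumeric_iter1 = [99.80, 62.70]
print(f"{average_accuracy(alphanumeric_iter1):.2f}%")  # prints "81.25%"
```

An unweighted mean treats a ~50-sample dataset the same as a much larger one, so it is best read alongside the per-dataset rows rather than in place of them.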
Iteration 2:
Model details | Dataset | Accuracy achieved |
---|---|---|
Model trained on | Existing dataset + NIST misclassifications | 83.30% |
Reasoning - After the model was also trained on the NIST samples it had misclassified, accuracy improved
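How the NIST misclassifications were gathered is not described here, so the following is only a sketch under stated assumptions: the current model is run over labelled NIST samples, the ones it gets wrong are kept, and they are appended to the existing training set. The names `collect_misclassified`, `build_training_set`, and `toy_predict` are hypothetical, and `toy_predict` is a stand-in for a real model.

```python
def collect_misclassified(samples, labels, predict):
    """Return the (sample, label) pairs that `predict` gets wrong."""
    misclassified = []
    for sample, label in zip(samples, labels):
        if predict(sample) != label:
            misclassified.append((sample, label))
    return misclassified


def build_training_set(existing, misclassified):
    """Existing training pairs plus the newly collected hard examples."""
    return list(existing) + list(misclassified)


def toy_predict(sample):
    """Toy stand-in for a model: it confuses the letter O with the digit 0."""
    return "0" if sample == "img_O" else sample[-1]


nist_samples = ["img_A", "img_O", "img_7"]
nist_labels = ["A", "O", "7"]

hard_examples = collect_misclassified(nist_samples, nist_labels, toy_predict)
training_set = build_training_set([("img_B", "B")], hard_examples)
print(hard_examples)      # [('img_O', 'O')]
print(len(training_set))  # 2
```

Adding only the misclassified samples keeps the retraining set small while targeting the cases the model currently fails on.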
Iteration 3:
Model details | Dataset | Accuracy achieved |
---|---|---|
Model trained on | Existing dataset + NIST misclassifications | 93.90% |
Reasoning - After further training on the NIST misclassifications, starting from the previous checkpoint, accuracy improved
Iteration 4:
Model details | Dataset | Accuracy achieved |
---|---|---|
Model trained on | Existing dataset + manually collected dataset from sheets (~1K samples) | 93.90% |
Reasoning - Training continued from the previous checkpoint, with the manually collected data added, for an improvement in accuracy
Handwritten Digits model
Iteration 1:
Model details | Dataset | Accuracy achieved |
---|---|---|
Model trained and evaluated on | Existing dataset | 99.90% |
Model evaluated on (inference only) | NIST dataset | 60.00% |
Model evaluated on (inference only) | Production data (~50 samples) | 97.70% |
Average across all datasets | - | 85.87% |
Reasoning - The model was trained only on the existing dataset, so it does not perform well on the NIST dataset
Iteration 2:
Model details | Dataset | Accuracy achieved |
---|---|---|
Model trained on | Existing dataset + NIST misclassifications | 96.40% |
Reasoning - After the model was also trained on the NIST samples it had misclassified, accuracy improved
Iteration 3:
Model details | Dataset | Accuracy achieved |
---|---|---|
Model trained on | Existing dataset + NIST misclassifications + production dataset (~50 samples) | 99.70% |
Reasoning: After the model was trained on the NIST misclassifications and the production dataset, accuracy improved
Iteration 4:
Model details | Dataset | Accuracy achieved |
---|---|---|
Model trained on | Existing dataset + NIST misclassifications + manually collected dataset from sheets (~8.6K samples) | 98.30% |
Reasoning: Because accuracy is now averaged over a much larger production dataset, it dips slightly compared to iteration 3
Sample dataset images
Existing dataset
Handwritten alphanumeric
Handwritten digits
NIST dataset
Handwritten alphanumeric
Handwritten digits
Manually collected
Handwritten alphanumeric
Handwritten digits
Some unhandled misclassifications
Reasoning: These generally occur when the digits are written in the corners of the cell
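One way to flag such cells before inference is to measure how far the written ink sits from the cell center. The sketch below is illustrative only and is not taken from the project: it assumes each cell is a grayscale grid (a list of rows) with a background of 0, and the names `ink_offset`, `is_near_corner`, and `make_cell`, as well as the 0.25 limit, are hypothetical.

```python
def ink_offset(cell, threshold=0):
    """Offset of the ink bounding-box center from the cell center.

    Returns (dx, dy) as fractions of the cell width and height,
    or None if no pixel is above `threshold`.
    """
    height, width = len(cell), len(cell[0])
    rows = [r for r in range(height) if any(p > threshold for p in cell[r])]
    cols = [c for c in range(width) if any(row[c] > threshold for row in cell)]
    if not rows or not cols:
        return None
    box_cx = (min(cols) + max(cols) + 1) / 2
    box_cy = (min(rows) + max(rows) + 1) / 2
    return ((box_cx - width / 2) / width, (box_cy - height / 2) / height)


def is_near_corner(cell, limit=0.25):
    """True if the ink is shifted by more than `limit` on both axes."""
    offset = ink_offset(cell)
    if offset is None:
        return False
    dx, dy = offset
    return abs(dx) > limit and abs(dy) > limit


def make_cell(size, ink):
    """Build a size x size test cell with ink at the given (row, col) pixels."""
    return [[255 if (r, c) in ink else 0 for c in range(size)] for r in range(size)]


corner_cell = make_cell(8, {(0, 0), (0, 1), (1, 0), (1, 1)})
centered_cell = make_cell(8, {(3, 3), (3, 4), (4, 3), (4, 4)})
print(is_near_corner(corner_cell))    # True
print(is_near_corner(centered_cell))  # False
```

Cells flagged this way could be re-centered or routed for manual review; which of these is appropriate depends on the pipeline and is not specified here.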