
anubhavshrimal/Quick-Draw: Implementation of Google Quick Draw doodle recognition game in PyTorch and comparing other classifiers and features.

Project Overview

This project was done as part of a Computer Vision course.

In Quick, Draw! the AI system tries to classify a hand-drawn doodle into a predetermined category. Through this project we aim to achieve the same using different feature extraction techniques such as HOG, LBP, SIFT, SURF, and raw pixel values, with the feature reduction techniques PCA and LDA, and applying various classifiers such as Naive Bayes, Random Forest, SVM, XGBoost, Bagging, AdaBoost, KNN, and CNN to compare their performance on different evaluation metrics such as Accuracy, MAP@3, CMC Curve, and Confusion Matrix.

The project poster can be found in CV-Poster-Final.pdf.

Problem Usecase

  • It is a challenge in Computer Vision & Machine Learning to handle noisy data and datasets with many different representations of the same class. The Quick Draw Doodle Recognition challenge is a good example of these issues, because different users may draw the same object differently, or the doodles may be incomplete, which resembles noisy data.
  • This application can be used as a fast prototyping tool for designers or artists, suggesting accurate templates based on their rough doodles.
  • It can be extended by replacing the doodles with doodles of alphabets, thereby converting hand-written text into a digital text format.

Dataset

  • The Quick Draw dataset is a collection of millions of doodle drawings in 300+ categories. The drawings made by the players were captured as 28 x 28 grayscale images in .npy format, one file per category.
  • The complete dataset is huge (~73 GB), so we have used only a subset of the complete data (20 categories).
  • The dataset is split into training and test sets in an 80-20 ratio; the training set is further split into train and validation sets in a 70-30 ratio (a loading/splitting sketch follows this list).
  • The figure above shows a doodle image of each class in our sampled dataset.
  • The dataset can be downloaded from here.
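A minimal sketch of how the per-category .npy files could be loaded and split as described above. The `data/` directory layout, the use of scikit-learn's `train_test_split`, and the stratified splitting are illustrative assumptions, not the repository's exact code:

```python
# Sketch: load per-category Quick Draw .npy files and create the
# 80-20 train/test and 70-30 train/validation splits described above.
import glob

import numpy as np
from sklearn.model_selection import train_test_split

images, labels = [], []
for label, path in enumerate(sorted(glob.glob("data/*.npy"))):
    arrays = np.load(path)                      # shape: (n_drawings, 784)
    images.append(arrays.reshape(-1, 28, 28))   # 28 x 28 grayscale doodles
    labels.append(np.full(len(arrays), label))

X = np.concatenate(images)
y = np.concatenate(labels)

# 80-20 train/test split (stratification is an assumption, so every
# category appears in both sets).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# The training portion is further split 70-30 into train and validation.
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.3, stratify=y_train, random_state=42)
```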

Proposed Algorithm

  • CNN Model Architecture (an illustrative model sketch follows below)
  • We have followed a conventional computer vision pipeline to train our model. The figure below shows the training pipeline.
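The repository's exact CNN architecture is not reproduced here; as a hedged sketch, a small PyTorch CNN for 28 x 28 grayscale doodles over the 20 sampled categories might look like this (the layer widths and depth are assumptions):

```python
# Illustrative sketch only: a small PyTorch CNN for 28x28 grayscale doodles.
# Layer sizes are assumptions, not the repository's exact model.
import torch
import torch.nn as nn

class DoodleCNN(nn.Module):
    def __init__(self, num_classes: int = 20):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),  # 28x28 -> 28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 14x14
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 128),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, num_classes),                 # class logits
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Example: a batch of 8 grayscale doodles -> logits of shape (8, 20).
logits = DoodleCNN()(torch.randn(8, 1, 28, 28))
```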

  • Feature Extraction: extract texture information using HOG & LBP, spatial information using SIFT & SURF, and pixel information from the grayscale images (see the pipeline sketch after this list).
  • Preprocessing: feature normalization by Min-Max and Z-score to bring the features onto a similar scale.
  • Dimensionality Reduction: PCA or LDA is used to project the features with maximum separation. In PCA the number of components is selected by plotting the variance over the projected data.
  • Classification: different classifiers are trained and tested with different parameters and feature combinations.
  • Prediction and Evaluation Metrics: metrics such as Accuracy, MAP@3, and the CMC curve are computed to compare the performance of the classifiers.
  • At production time the following pipeline is used, where contours are used to detect the object.
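A hedged end-to-end sketch of this classical pipeline, using scikit-image HOG features, Min-Max normalization, PCA, and an SVM as one of the classifiers tried. All parameter choices (HOG cells, the 95% variance cutoff, SVM settings) are illustrative assumptions; `X_train`, `y_train`, `X_test`, `y_test` carry over from the dataset sketch above:

```python
# Sketch of the classical pipeline: HOG features -> Min-Max normalization
# -> PCA -> SVM. Parameter values are assumptions for illustration.
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

def hog_features(images):
    """Extract a HOG descriptor from each 28x28 grayscale doodle."""
    return np.array([
        hog(img, orientations=9, pixels_per_cell=(7, 7),
            cells_per_block=(2, 2))
        for img in images
    ])

F_train = hog_features(X_train)
F_test = hog_features(X_test)

# Preprocessing: Min-Max normalization fitted on the training features.
scaler = MinMaxScaler().fit(F_train)
F_train, F_test = scaler.transform(F_train), scaler.transform(F_test)

# Dimensionality reduction: keep components explaining 95% of the variance.
pca = PCA(n_components=0.95).fit(F_train)
F_train, F_test = pca.transform(F_train), pca.transform(F_test)

# Classification: an SVM, one of several classifiers compared in the project.
clf = SVC(probability=True).fit(F_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(F_test)))
```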

Evaluation Metrics and Results

Following are the results of the project:

  • The confusion matrix is plotted for the best performing classifier.

  • Mean Average Precision (MAP@3) scores are computed for each classifier to measure performance over the top-3 predictions (see the metrics sketch after this list).
  • The CMC curve is plotted to find the identification accuracy at different ranks.

  • The accuracy of the different classifiers is used to compare performance using PCA and LDA.
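A hedged sketch of how these metrics can be computed from predicted class probabilities. The names `clf`, `F_test`, and `y_test` carry over from the pipeline sketch above; the MAP@3 and CMC computations follow their standard definitions rather than the repository's exact code:

```python
# Sketch: MAP@3, CMC curve, and confusion matrix from predicted probabilities.
import numpy as np
from sklearn.metrics import confusion_matrix

proba = clf.predict_proba(F_test)          # shape: (n_samples, n_classes)
ranking = np.argsort(-proba, axis=1)       # classes sorted by confidence

# MAP@3: each sample scores 1/k if its true label appears at rank k <= 3.
hits = ranking[:, :3] == y_test[:, None]
map_at_3 = float(np.mean(np.where(hits.any(axis=1),
                                  1.0 / (hits.argmax(axis=1) + 1), 0.0)))

# CMC curve: fraction of samples whose true label lies within the top r ranks.
match_rank = (ranking == y_test[:, None]).argmax(axis=1)
cmc = np.array([(match_rank <= r).mean() for r in range(proba.shape[1])])

# Confusion matrix over the top-1 predictions.
cm = confusion_matrix(y_test, ranking[:, 0])
print(f"MAP@3: {map_at_3:.4f}")
```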

Interpretation of Results

  • Among the dimensionality reduction techniques, LDA performs better than PCA as it is able to separate the data on the basis of classes.
  • Texture-based features gave good classification accuracy compared to the other features.
  • XGBoost shows the best performance among the non-deep-learning models; the dataset includes images of multiple classes, over which XGBoost learns better because of its boosting technique.
  • CNN gives the best overall performance with a MAP@3 of 96.01%. This is because the kernels are able to learn different feature representations, which help the model differentiate between the classes well.

References

  1. Lu, W., & Tran, E. (2017). Free-hand Sketch Recognition Classification.
  2. Eitz, M., Hays, J., & Alexa, M. (2012). How do humans sketch objects? ACM Transactions on Graphics (Proc. SIGGRAPH), 31(4), 44:1–44:10.
  3. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition.
  4. Kim, J., Kim, B. S., & Savarese, S. (2012). Comparing image classification methods: K-nearest-neighbor and support-vector machines. Ann Arbor, 1001, 48109-2122.
  5. Ha, D., & Eck, D. (2017). A neural representation of sketch drawings. arXiv preprint arXiv:1704.03477.

Project Team Members

  1. Anubhav Shrimal
  2. Vrutti Patel