A Deep Learning Approach to Butterfly Species Identification 🌿

License: AGPL-3.0 Python TensorFlow Keras OpenCV Open Source Dataset

Are you fascinated by the beautiful world of butterflies? 🦋 With over 20,000 known species, these delicate creatures have long been a subject of interest for entomologists and naturalists alike. However, accurate identification of butterfly species remains a significant challenge, hindering our understanding of their behavior, habitat, and conservation. 🌿

About

This project uses deep learning to classify images of butterflies into their respective species. The dataset, from Kaggle, contains over 13,000 images of butterflies from 100 different species. 📸 The images were collected from various sources, including field observations, museum collections, and online repositories.

Dataset 📊

Category      Number of Images
Training      12594
Validation    500
Testing       500
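
As a quick sanity check on the split sizes above, the counts can be totalled and expressed as fractions of the whole dataset. This is only a small sketch; the variable names are illustrative and do not come from the project code.

```python
# Split sizes copied from the dataset table above.
splits = {"Training": 12594, "Validation": 500, "Testing": 500}

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}

print(f"Total images: {total}")
for name, frac in fractions.items():
    # Show each split's size and its share of the full dataset.
    print(f"{name:<10} {splits[name]:>6}  ({frac:.1%})")
```

The validation and test sets are small compared with the training set, so each holds only a handful of images per species on average.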

Inspiration 🌪️

  • Manual identification of butterfly species is a time-consuming and expertise-dependent process, prone to errors and inconsistencies. 📝
  • The lack of an efficient and accurate identification system hinders the study of butterfly populations, habitats, and behavior, ultimately affecting conservation efforts. 🌎
  • An automated system for butterfly species identification can have a profound impact on our understanding of these insects and their role in ecosystems. 🌟

===========================================================================

Methodology 🔍

Requirements

  • Python 3.x
  • TensorFlow 2.x
  • Keras
  • OpenCV
  • NumPy
  • Pandas
  • Matplotlib
  • Seaborn
  • Plotly
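
Before running anything, it can help to confirm that the libraries listed above are importable. The sketch below is one possible check, not part of the project itself: the `missing_modules` helper and the `REQUIRED_MODULES` list are hypothetical names, and the list uses import names (OpenCV is imported as `cv2`; Keras is assumed to come bundled with TensorFlow 2.x).

```python
import importlib.util

# Import names for the libraries listed above (assumption: Keras is used
# through tensorflow.keras, so it has no separate entry here).
REQUIRED_MODULES = ["tensorflow", "cv2", "numpy", "pandas",
                    "matplotlib", "seaborn", "plotly"]

def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found by the importer."""
    return [name for name in modules
            if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```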

Data Preprocessing 🔀

  • Data augmentation: Apply random transformations to the training images with tf.keras preprocessing layers, which artificially increases the effective size of the training set. 🔀
  • Image resizing: Resize images to a uniform size of 224x224 pixels.
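
The two preprocessing steps above could look roughly like this with Keras layers. The README does not list which random transformations are used, so the flip, rotation, and zoom layers and their strengths are assumptions chosen for illustration, as is the `preprocess` helper name.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)

# Resizing is applied to every image, during training and inference alike.
resize = tf.keras.layers.Resizing(*IMG_SIZE)

# Assumed augmentation pipeline: the actual transformations are not
# specified in this README.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

def preprocess(images, training=False):
    """Resize a batch of images and, when training, augment it."""
    images = resize(images)
    if training:
        # Random layers only perturb their input when training=True.
        images = augment(images, training=True)
    return images
```

Augmentation is applied only when `training=True`, so validation and test images are resized but otherwise left unchanged.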

Model Architecture 📚

  • Base model: MobileNetV3Large pre-trained on ImageNet, with input shape (224, 224, 3)
  • Custom classification head: Add a new classification head on top of the base model, consisting of global average pooling, batch normalization, and dense layers, ending in a 100-unit output layer (one unit per species).
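
Here is a minimal sketch of how the architecture described above could be assembled in Keras. The README does not say whether the base model is frozen, how many dense layers the head contains, or which activation is used, so the frozen base, the single 100-unit softmax layer, and the `build_model` name with its `weights` parameter are all assumptions made for illustration.

```python
import tensorflow as tf

def build_model(num_classes=100, weights="imagenet"):
    """Hypothetical builder: MobileNetV3Large base plus a small head."""
    base = tf.keras.applications.MobileNetV3Large(
        input_shape=(224, 224, 3),
        include_top=False,   # drop the original ImageNet classifier
        weights=weights,
    )
    base.trainable = False   # assumption: base is frozen for transfer learning

    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.BatchNormalization()(x)
    # One output unit per species; softmax is an assumed activation.
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```

Passing `weights=None` skips the ImageNet download, which is handy for a quick shape check but leaves the base untrained.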

Training 📊

  • Optimizer: Adam
  • Loss function: Sparse categorical crossentropy
  • Batch size: 32
  • Number of epochs: 50
  • Metrics: Accuracy
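
The settings above map onto Keras roughly as follows. This is a sketch under assumptions: the README does not state a learning rate, so Adam's default is used, and `compile_for_training`, `train_ds`, and `val_ds` are hypothetical names. Sparse categorical crossentropy expects integer class labels (0 to 99) rather than one-hot vectors.

```python
import math
import tensorflow as tf

BATCH_SIZE = 32
EPOCHS = 50
NUM_TRAIN_IMAGES = 12594  # training split size from the dataset table

def compile_for_training(model):
    """Attach the optimizer, loss, and metric listed above to `model`."""
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),  # default learning rate (assumed)
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=["accuracy"],
    )
    return model

# Number of batches processed per epoch at this batch size.
steps_per_epoch = math.ceil(NUM_TRAIN_IMAGES / BATCH_SIZE)

# Typical usage, with train_ds / val_ds as batched tf.data datasets
# (hypothetical names):
# model.fit(train_ds, validation_data=val_ds, epochs=EPOCHS)
```

The loss is used with its default `from_logits=False`, which matches a model whose final layer already applies softmax.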

Model Performance 📊

The model achieves a test accuracy of 0.96, a strong result for a 100-class problem! 🎉 Here's a breakdown of the results:

  • Training accuracy: 0.9996
  • Validation accuracy: 0.9420
  • Test accuracy: 0.9600
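
To make these figures concrete, the accuracies can be converted back into image counts using the split sizes from the dataset table. This sketch assumes each accuracy is computed over the full split with every image counted once; `correct_count` is an illustrative helper name.

```python
# Split sizes from the dataset table and accuracies from the list above.
results = {
    "Validation": {"images": 500, "accuracy": 0.9420},
    "Test": {"images": 500, "accuracy": 0.9600},
}

def correct_count(images, accuracy):
    """Number of correctly classified images implied by an accuracy."""
    return round(images * accuracy)

for split, r in results.items():
    correct = correct_count(r["images"], r["accuracy"])
    print(f"{split}: {correct} correct, {r['images'] - correct} misclassified")

# Gap between training and validation accuracy, a rough overfitting signal.
gap = 0.9996 - 0.9420
print(f"Train/validation gap: {gap:.4f}")
```

With only 500 images in each held-out split, a single misclassified image shifts the accuracy by 0.002, so small differences between validation and test accuracy should be read with care.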

Future Work 🚀

  • Experiment with different model architectures (such as ResNet or DenseNet 🤖) and hyperparameters, and try fine-tuning the model on a different dataset via transfer learning 📚, to improve performance.

===========================================================================

Acknowledgments 🙏

  • Kaggle dataset: 🐛 Butterfly & Moths Image Classification 100 species
  • TensorFlow and Keras libraries for deep learning
  • Matplotlib and Seaborn libraries for data visualization

🙅‍♂️ Disclaimer

This project is licensed under the AGPL-3.0 License. It is intended for personal use only and should not be used for commercial purposes. The pre-trained model may not always produce accurate results.

Get Involved! 😌

This project demonstrates the potential of deep learning for butterfly species identification. The model achieves high accuracy and can be used as a starting point for further research and development in this field.

I hope you found this project informative and engaging! 😊
If you're interested in collaborating and contributing to the project, please let me know! I'd love to hear from you.

Getting Started 🚀

To get started with this project, you'll need to:

  • Install the required libraries, including TensorFlow, Keras, and OpenCV 📦
  • Download the dataset from Kaggle 📈
  • Run the code to train and evaluate the model 🤖

Enjoy working with the content! 😊