Written by: on Thu Oct 17

Kathakali-Mudra_DinoV3 Vision Transformer

A computer vision project for recognizing and classifying Kathakali classical dance mudras (hand gestures) using Meta's DINOv3 vision transformer model.


Kathakali Mudra Recognition with DINOv3

A computer vision project for recognizing and classifying Kathakali classical dance mudras (hand gestures) using Meta's DINOv3 vision transformer model.

📖 Overview

Kathakali is a highly stylized classical Indian dance-drama from Kerala, known for its elaborate costumes, detailed gestures, and expressive movements. Central to this art form are the intricate hand gestures called Mudras, which form a sophisticated vocabulary for storytelling. This project leverages state-of-the-art computer vision to automatically recognize and classify these mudras, serving as a tool for learning, preservation, and digital analysis of this ancient art form.

The project uses DINOv3, Meta AI's self-supervised vision transformer, fine-tuned specifically for Kathakali mudra recognition.

✨ Features

  • Accurate Mudra Classification: Identifies single-hand (Asamyukta) and combined-hand (Samyukta) Kathakali mudras from images
  • State-of-the-Art Backbone: Built on DINOv3's self-supervised visual features
  • Multiple Inference Modes:
      • Single image classification
      • Batch processing for multiple images
      • Real-time webcam demonstration
  • Transfer Learning: Starts from pre-trained DINOv3 weights, so training remains practical with limited data
  • Modular Architecture: Clean, well-organized code for easy experimentation and extension

๐Ÿ› ๏ธ Installation

  1. Clone the repository:

    git clone https://github.com/Sree14hari/Kathakali-Mudra_DinoV3.git
    cd Kathakali-Mudra_DinoV3
  2. Create and activate a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt

🚀 Quick Start

1. Data Preparation

Organize your dataset in the following structure:

data/
  train/
    pataka/
      image_001.jpg
      image_002.jpg
      ...
    tripataka/
    mayura/
    ...
  val/
    pataka/
    tripataka/
    ...
  test/
    pataka/
    tripataka/
    ...
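
The layout above follows the common one-folder-per-class convention (the same shape that loaders such as torchvision's `ImageFolder` read). Before training, it can help to confirm that every split exists and that no class appears only in `val/` or `test/`. The following is a minimal sketch using only the Python standard library; the function names `summarize_dataset` and `unseen_classes`, and the accepted file extensions, are illustrative assumptions and not part of this repository.

```python
from pathlib import Path

# Assumption: images use one of these extensions; adjust to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}
SPLITS = ("train", "val", "test")


def summarize_dataset(root):
    """Return {split: {class_name: image_count}} for a data/<split>/<class>/ tree."""
    root = Path(root)
    summary = {}
    for split in SPLITS:
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        summary[split] = {
            class_dir.name: sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return summary


def unseen_classes(summary):
    """For val and test, list class folders that have no counterpart in train."""
    train_classes = set(summary["train"])
    return {
        split: sorted(set(classes) - train_classes)
        for split, classes in summary.items()
        if split != "train"
    }
```

Calling `summarize_dataset("data")` from the repository root returns the per-class image counts for each split, and passing that result to `unseen_classes` flags any mudra folder the model would never see during training.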

🧠 Model Architecture

This project uses DINOv3 (Vision Transformer) as the feature extractor with a custom classification head:

  • Backbone: DINOv3 ViT-B/14 or ViT-L/14
  • Feature Extraction: Self-supervised pre-trained weights
  • Classifier: Custom MLP head adapted for mudra classification
  • Input Resolution: 224×224 or 518×518 pixels
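
To make the resolution choice concrete, the sketch below works out how many patch tokens a ViT with patch size 14 produces at each listed input size. It is plain arithmetic and not code from this repository: `patch_grid` is a hypothetical helper, and the count covers patch tokens only (class and register tokens are excluded).

```python
def patch_grid(resolution: int, patch_size: int = 14) -> tuple[int, int]:
    """Return (patches per side, total patch tokens) for a square input image."""
    if resolution % patch_size != 0:
        raise ValueError(
            f"resolution {resolution} is not a multiple of patch size {patch_size}"
        )
    side = resolution // patch_size
    return side, side * side


for res in (224, 518):
    side, tokens = patch_grid(res)
    print(f"{res}x{res}: {side}x{side} grid, {tokens} patch tokens")
# prints:
# 224x224: 16x16 grid, 256 patch tokens
# 518x518: 37x37 grid, 1369 patch tokens
```

The larger input yields roughly 5.3 times as many tokens. Since self-attention cost grows quadratically with sequence length, 518×518 preserves finer finger detail at a noticeably higher compute and memory cost than 224×224.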

๐Ÿ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • Meta AI for the DINOv3 model and pre-trained weights
  • Kathakali artists and researchers who contributed to mudra datasets
  • The open-source community for PyTorch, OpenCV, and other essential libraries
  • Kerala Kalamandalam and other institutions preserving Kathakali heritage

๐Ÿค Contributing

We welcome contributions to improve Kathakali mudra recognition! Please feel free to:

  1. Report bugs and feature requests via GitHub Issues
  2. Submit pull requests for improvements
  3. Share additional mudra datasets or annotations
  4. Improve documentation and examples

Development workflow:

git checkout -b feature/your-feature-name
git commit -m "Add your feature"
git push origin feature/your-feature-name

📮 Contact

For questions or collaborations regarding this project:


Bridging centuries-old classical art with cutting-edge computer vision technology 🎭🤖
