Driver Monitoring System

An AI-powered platform that detects dangerous driving behaviors in real time using two YOLO models and an enhanced non-maximum suppression (NMS) step, running entirely in your browser.

System Architecture

The complete pipeline, from image upload to final detection results, using a dual-model approach.

Figure: Complete detection pipeline from upload to final results

Dual Model Architecture

Two distinct YOLO architectures working in parallel for robust detection.

Model A

Chaitanya Model

YOLOv8 architecture optimized for core safety violations. Uses C2f blocks for efficient feature extraction.

Architecture: YOLOv8
Core Block: C2f (CSPDarknet)
Output: [1, 9, 8400]
Classes: 5

Detected Classes

0: Cigarette
1: Drinking
2: Eating
3: Phone
4: Seatbelt

Training Distribution

Drinking: 771 samples
Cigarette: 365 samples
Seatbelt: 429 samples
Eating: 261 samples
Phone: 55 samples

Model B

Soham Model

YOLO11 architecture with C3k2 blocks and PSA attention modules, used here to detect subtler behaviors such as distraction and drowsiness.

Architecture: YOLO11
Core Block: C3k2 + PSA
Output: [1, 12, 8400]
Classes: 8

Detected Classes

0: Distracted
1: Drinking
2: Drowsy
3: Eating
4: PhoneUse
5: SafeDriving
6: Seatbelt
7: Smoking
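The two output shapes listed above are consistent with the class counts. The sketch below reproduces them, assuming the usual Ultralytics detection head: a 640×640 input, strides of 8, 16, and 32, and one row per box coordinate (4) plus one row per class, with no separate objectness row. The function name and the stride values are illustrative assumptions, not taken from the project code.

```python
def output_shape(num_classes, input_size=640, strides=(8, 16, 32)):
    """Return the (batch, rows, anchors) shape of a YOLOv8/YOLO11-style head.

    Assumption: 4 box rows (cx, cy, w, h) followed by one score row per class,
    and one candidate per grid cell at each stride.
    """
    rows = 4 + num_classes
    anchors = sum((input_size // s) ** 2 for s in strides)
    return (1, rows, anchors)


# Model A: 5 classes, Model B: 8 classes
shape_a = output_shape(5)
shape_b = output_shape(8)
```

With these assumptions the 8400 candidates come from 80×80 + 40×40 + 20×20 grid cells.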

Dataset Source

Both models were trained on datasets from Roboflow, a comprehensive platform for computer vision datasets and model training.


Class Name Unification

Class names from both models are mapped to one shared set of labels so that detection results are consistent.

Chaitanya Model	Soham Model		Unified Output
Cigarette	Smoking	→	Smoking
Phone	PhoneUse	→	Phone Usage
Drinking	Drinking	→	Drinking
Eating	Eating	→	Eating
Seatbelt	Seatbelt	→	Seatbelt
—	Distracted	→	Distracted
—	Drowsy	→	Drowsy
—	SafeDriving	→	Filtered out
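The mapping in the table can be expressed as two lookup tables. This is a minimal sketch: the labels are copied from the table, while the dictionary names, the model keys ("chaitanya", "soham"), and the detection dictionary layout are hypothetical.

```python
# Lookup tables built from the unification table; None means "filtered out".
UNIFIED_LABELS = {
    "chaitanya": {
        "Cigarette": "Smoking",
        "Phone": "Phone Usage",
        "Drinking": "Drinking",
        "Eating": "Eating",
        "Seatbelt": "Seatbelt",
    },
    "soham": {
        "Smoking": "Smoking",
        "PhoneUse": "Phone Usage",
        "Drinking": "Drinking",
        "Eating": "Eating",
        "Seatbelt": "Seatbelt",
        "Distracted": "Distracted",
        "Drowsy": "Drowsy",
        "SafeDriving": None,
    },
}


def unify_detections(model, detections):
    """Rename labels to the unified set and drop filtered-out classes.

    `detections` is assumed to be a list of dicts with a "label" key.
    """
    table = UNIFIED_LABELS[model]
    unified = []
    for det in detections:
        label = table.get(det["label"])
        if label is not None:
            unified.append({**det, "label": label})
    return unified
```

Keeping one table per model makes it explicit that "Phone" and "PhoneUse" end up under the same label before duplicates are removed.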

Image Preprocessing

Letterbox technique preserves aspect ratio while resizing to 640×640 pixels.

Images are scaled to fit inside the 640×640 frame, and the remaining area is filled with gray padding bars so the original proportions are kept. The preprocessing pipeline then normalizes pixel values to [0, 1] and separates the RGB channels into float32 tensors.

Figure: Preprocessing maintains aspect ratio using letterbox padding
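The letterbox geometry and the channel separation described above can be sketched in a few lines. The function names are hypothetical, and centering the image between equal padding bars is an assumption; the source only states that gray bars are used.

```python
def letterbox_params(width, height, target=640):
    """Compute the scale factor, resized size, and padding for a letterbox resize.

    Assumption: the resized image is centered, so padding is split evenly.
    """
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_x = (target - new_w) / 2
    pad_y = (target - new_h) / 2
    return scale, new_w, new_h, pad_x, pad_y


def to_planar_float(pixels):
    """Convert row-major (r, g, b) tuples in 0..255 to planar floats in [0, 1].

    The result lists all red values, then all green, then all blue, which is
    the channel-separated layout a [1, 3, 640, 640] tensor expects.
    """
    red = [p[0] / 255.0 for p in pixels]
    green = [p[1] / 255.0 for p in pixels]
    blue = [p[2] / 255.0 for p in pixels]
    return red + green + blue


# Example: a 1280x720 frame is halved to 640x360 and padded top and bottom.
scale, new_w, new_h, pad_x, pad_y = letterbox_params(1280, 720)
```

The scale and padding values are worth keeping, since they are needed later to map boxes back to the original image.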

Enhanced NMS Algorithm

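The core innovation: duplicate detections are removed using both IoU and center containment. IoU alone can miss a small box nested inside a much larger one, because their overlap ratio stays low; the center check catches that case.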

Figure: Enhanced NMS uses both IoU and center containment to remove duplicates
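One way the combined check could look is sketched below. The source does not give the exact rule or thresholds, so everything specific here is an assumption: the 0.45 IoU threshold, the detection dictionary layout, and the choice to treat a box as a duplicate when its center falls inside an already kept box of the same class.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def center_inside(inner, outer):
    """True if the center point of `inner` lies within `outer`."""
    cx = (inner[0] + inner[2]) / 2
    cy = (inner[1] + inner[3]) / 2
    return outer[0] <= cx <= outer[2] and outer[1] <= cy <= outer[3]


def enhanced_nms(detections, iou_threshold=0.45):
    """Per-class NMS that also suppresses boxes centered inside a kept box."""
    by_class = {}
    for det in detections:
        by_class.setdefault(det["label"], []).append(det)

    kept = []
    for dets in by_class.values():
        selected = []
        # Highest confidence first, so stronger detections win.
        for det in sorted(dets, key=lambda d: d["score"], reverse=True):
            duplicate = any(
                iou(det["box"], k["box"]) > iou_threshold
                or center_inside(det["box"], k["box"])
                for k in selected
            )
            if not duplicate:
                selected.append(det)
        kept.extend(selected)
    return kept
```

Because suppression only happens within a class, two well-separated boxes of the same class both survive, and overlapping boxes of different classes are never merged.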

Complete Detection Flow

Step-by-step process from image upload to final results.

1. Image Upload
The user uploads driver images through the browser interface. Files are validated and stored in memory as base64 data URLs.

2. Lazy Model Loading
When the user clicks "Process Images", both ONNX models are downloaded and initialized with WebAssembly. The models are then cached in memory for reuse.

3. Preprocessing
Images are resized to 640×640 with letterboxing, normalized to [0, 1], and converted to float32 tensors with separated RGB channels.

4. Parallel Inference
Both models process the tensors simultaneously. Each outputs 8400 candidate detections with bounding boxes and confidence scores.

5. Parse & Filter
Detections above the 0.25 confidence threshold are extracted, and their coordinates are converted from model space back to the original image dimensions.

6. Class Unification
The different class names are mapped to standardized labels (Phone/PhoneUse → Phone Usage, Cigarette → Smoking).

7. Enhanced NMS
Detections are grouped by class and sorted by confidence, then NMS removes duplicates while preserving separate instances.

8. Visualization
Color-coded bounding boxes are drawn on a canvas (red for dangerous, green for safe), and safety instructions are generated and displayed.
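The Parse & Filter step above can be sketched as follows. This assumes the usual Ultralytics export layout, in which the first four rows of the output hold box center x, center y, width, and height in 640×640 model space and the remaining rows hold per-class scores. It also assumes the `scale`, `pad_x`, and `pad_y` values from a centered letterbox resize are available. The function name and the detection dictionary layout are hypothetical.

```python
def parse_output(output, num_classes, scale, pad_x, pad_y, conf_threshold=0.25):
    """Turn a [4 + num_classes][num_anchors] output into filtered detections.

    `output` is the model output with the batch dimension removed, given as
    nested lists. Boxes are returned as (x1, y1, x2, y2) in original-image
    pixels.
    """
    num_anchors = len(output[0])
    detections = []
    for i in range(num_anchors):
        scores = [output[4 + c][i] for c in range(num_classes)]
        class_id = max(range(num_classes), key=scores.__getitem__)
        score = scores[class_id]
        if score <= conf_threshold:
            continue  # keep only detections above the threshold

        cx, cy, w, h = (output[row][i] for row in range(4))
        # Undo the letterbox: remove the padding, then undo the scaling.
        x1 = (cx - w / 2 - pad_x) / scale
        y1 = (cy - h / 2 - pad_y) / scale
        x2 = (cx + w / 2 - pad_x) / scale
        y2 = (cy + h / 2 - pad_y) / scale
        detections.append({"class_id": class_id, "score": score, "box": (x1, y1, x2, y2)})
    return detections
```

For a 1280×720 source image (scale 0.5, vertical padding 140), a model-space box centered at (320, 320) with size 100×50 maps back to (540, 310, 740, 410) in the original image.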

Team & Contributions

Developed under the Advanced Course on Green Skills and AI (Skills4Future Program).

Chaitanya Kulkarni

YOLOv8 Model Training

Soham Jadhav

YOLO11 Model Training

Divyanshu Mishra

CNN Model Training

Anurag Pawar

Backend Logic

Additional Models & Backend

Divyanshu Mishra trained a CNN-based model that achieves high accuracy but is too computationally demanding to run in the browser. It can be run locally for additional detection capability.

Anurag Pawar developed the backend logic and server infrastructure for advanced deployment scenarios.

For complete model access and backend implementation, visit our GitHub repository.

Program: Advanced Course on Green Skills and Artificial Intelligence
Organized by: Edunet Foundation, AICTE, Shell India Markets Pvt. Ltd.
Mentor: Professor Sarthak Narnor