An AI-powered platform that detects dangerous driving behaviors in real time using dual YOLO models, advanced computer vision, and intelligent detection algorithms, running entirely in your browser.
Complete pipeline from image upload to final detection results using dual-model approach.
Two distinct YOLO architectures working in parallel for robust detection.
YOLOv8 architecture optimized for core safety violations. Uses C2f blocks for efficient feature extraction.
YOLO11 next-gen architecture with PSA attention mechanisms and C3k2 blocks for detecting subtle behaviors.
Both models were trained on datasets from Roboflow, a comprehensive platform for computer vision datasets and model training.
Standardizing class names across both models for consistent detection results.
| Chaitanya Model | Soham Model | | Unified Output |
|---|---|---|---|
| Cigarette | Smoking | → | Smoking |
| Phone | PhoneUse | → | Phone Usage |
| Drinking | Drinking | → | Drinking |
| Eating | Eating | → | Eating |
| Seatbelt | Seatbelt | → | Seatbelt |
| — | Distracted | → | Distracted |
| — | Drowsy | → | Drowsy |
| — | SafeDriving | → | Filtered out |
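The mapping in the table above can be sketched as a lookup table plus a filter. The labels come from the table; the function and field names (`unifyClassNames`, `className`) are assumptions about the implementation:

```javascript
// Map each model's raw class names to the unified labels.
// SafeDriving is intentionally absent: it is filtered out, not renamed.
const CLASS_MAP = {
  Cigarette: 'Smoking',
  Smoking: 'Smoking',
  Phone: 'Phone Usage',
  PhoneUse: 'Phone Usage',
  Drinking: 'Drinking',
  Eating: 'Eating',
  Seatbelt: 'Seatbelt',
  Distracted: 'Distracted',
  Drowsy: 'Drowsy'
};

function unifyClassNames(detections) {
  return detections
    .filter(det => det.className !== 'SafeDriving')
    .map(det => ({ ...det, className: CLASS_MAP[det.className] ?? det.className }));
}
```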
Letterbox technique preserves aspect ratio while resizing to 640×640 pixels.
Images are resized using gray padding bars to maintain their original proportions. The preprocessing pipeline normalizes pixel values to [0, 1] and separates RGB channels into float32 tensors.
Core innovation that eliminates duplicate detections using IoU and a center-containment check.
Step-by-step process from image upload to final results.
User uploads driver images through browser interface. Files are validated and stored in memory as base64 data URLs.
On "Process Images" click, both ONNX models download and initialize with WebAssembly. Models cache in memory for reuse.
Images resize to 640×640 with letterbox, normalize to [0, 1], and convert to float32 tensors with separated RGB channels.
Both models process tensors simultaneously, each outputting 8,400 candidate detections with bounding boxes and class confidence scores.
Extract detections above 0.25 confidence threshold. Convert coordinates from model space back to original image dimensions.
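This decode step can be sketched as a pure function, assuming the standard YOLOv8-style ONNX layout of `[4 + numClasses, numBoxes]` (cx, cy, w, h rows followed by per-class score rows). The function name `decodeDetections` and the output object shape are assumptions; `scale`, `offsetX`, and `offsetY` are the letterbox parameters from preprocessing:

```javascript
// Decode raw model output: keep boxes above the confidence threshold
// and map coordinates from 640x640 letterboxed space back to the
// original image dimensions.
function decodeDetections(output, numClasses, numBoxes,
                          scale, offsetX, offsetY, confThreshold = 0.25) {
  const detections = [];
  for (let i = 0; i < numBoxes; i++) {
    // Find the best class score for this candidate box
    let bestScore = 0, bestClass = -1;
    for (let c = 0; c < numClasses; c++) {
      const score = output[(4 + c) * numBoxes + i];
      if (score > bestScore) { bestScore = score; bestClass = c; }
    }
    if (bestScore < confThreshold) continue;

    // Box is (cx, cy, w, h) in letterboxed 640x640 space
    const cx = output[i], cy = output[numBoxes + i];
    const w = output[2 * numBoxes + i], h = output[3 * numBoxes + i];

    // Undo the letterbox: subtract padding offsets, divide by scale
    detections.push({
      classId: bestClass,
      confidence: bestScore,
      bbox: {
        x1: (cx - w / 2 - offsetX) / scale,
        y1: (cy - h / 2 - offsetY) / scale,
        x2: (cx + w / 2 - offsetX) / scale,
        y2: (cy + h / 2 - offsetY) / scale
      }
    });
  }
  return detections;
}
```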
Map different class names to standardized labels (Phone/PhoneUse → Phone Usage, Cigarette → Smoking).
Group detections by class, sort by confidence, apply NMS to remove duplicates while preserving separate instances.
Draw color-coded bounding boxes on canvas (red for dangerous, green for safe). Generate and display safety instructions.
Developed under the Advanced Course on Green Skills and AI (Skills4Future Program).
YOLOv8 Model Training
YOLO11 Model Training
CNN Model Training
Backend Logic
Divyanshu Mishra trained a CNN-based model that achieves excellent accuracy but is not browser-compatible due to computational requirements. This model can be run locally for enhanced detection capabilities.
Anurag Pawar developed the backend logic and server infrastructure for advanced deployment scenarios.
For complete model access and backend implementation, visit our GitHub repository.
Program: Advanced Course on Green Skills and Artificial Intelligence
Organized by: Edunet Foundation, AICTE, Shell India Markets Pvt. Ltd.
Mentor: Professor Sarthak Narnor
```javascript
function preprocess(img) {
  const INPUT_SIZE = 640;
  const canvas = document.createElement('canvas');
  canvas.width = INPUT_SIZE;
  canvas.height = INPUT_SIZE;
  const ctx = canvas.getContext('2d');

  // Fill canvas with neutral gray background
  ctx.fillStyle = '#808080';
  ctx.fillRect(0, 0, INPUT_SIZE, INPUT_SIZE);

  // Calculate aspect-ratio preserving scale
  const scale = Math.min(
    INPUT_SIZE / img.width,
    INPUT_SIZE / img.height
  );
  const newWidth = img.width * scale;
  const newHeight = img.height * scale;

  // Center the image on canvas (letterbox technique)
  const offsetX = (INPUT_SIZE - newWidth) / 2;
  const offsetY = (INPUT_SIZE - newHeight) / 2;
  ctx.drawImage(img, offsetX, offsetY, newWidth, newHeight);

  // Extract pixel data and convert to tensor
  const imageData = ctx.getImageData(0, 0, INPUT_SIZE, INPUT_SIZE);
  const tensor = new Float32Array(3 * INPUT_SIZE * INPUT_SIZE);

  // Normalize to [0, 1] and separate RGB channels
  // Format: [R, R, R...], [G, G, G...], [B, B, B...]
  for (let i = 0; i < imageData.data.length; i += 4) {
    const idx = i / 4;
    tensor[idx] = imageData.data[i] / 255.0;                                   // R channel
    tensor[INPUT_SIZE * INPUT_SIZE + idx] = imageData.data[i + 1] / 255.0;     // G channel
    tensor[2 * INPUT_SIZE * INPUT_SIZE + idx] = imageData.data[i + 2] / 255.0; // B channel
  }

  return { tensor, scale, offsetX, offsetY };
}
```
```javascript
// Enhanced NMS with center containment check
function boxesOverlap(box1, box2, iouThreshold) {
  // Method 1: Traditional IoU calculation
  const iou = calculateIoU(box1, box2);
  if (iou > iouThreshold) {
    return true; // Boxes overlap significantly
  }

  // Method 2: Center containment check (our innovation)
  const center1 = { x: (box1.x1 + box1.x2) / 2, y: (box1.y1 + box1.y2) / 2 };
  const center2 = { x: (box2.x1 + box2.x2) / 2, y: (box2.y1 + box2.y2) / 2 };

  // Check if box1 center is inside box2
  if (center1.x >= box2.x1 && center1.x <= box2.x2 &&
      center1.y >= box2.y1 && center1.y <= box2.y2) {
    return true; // Same object detected
  }

  // Check if box2 center is inside box1
  if (center2.x >= box1.x1 && center2.x <= box1.x2 &&
      center2.y >= box1.y1 && center2.y <= box1.y2) {
    return true; // Same object detected
  }

  return false; // Different objects
}

function enhancedNMS(detections, iouThreshold) {
  // Sort detections by confidence score (highest first).
  // Declared with `let`, not `const`: the array is reassigned below.
  let sorted = [...detections].sort((a, b) => b.confidence - a.confidence);

  const keep = [];
  while (sorted.length > 0) {
    // Take the highest confidence detection
    const best = sorted.shift();
    keep.push(best);

    // Remove overlapping boxes using enhanced overlap check
    sorted = sorted.filter(box =>
      !boxesOverlap(best.bbox, box.bbox, iouThreshold)
    );
  }
  return keep;
}

// Main merging function - applies NMS per unified class
function mergeDetections(chaitanyaResults, sohamResults) {
  // Step 1: Combine all detections and unify class names
  const allDetections = unifyClassNames([
    ...chaitanyaResults,
    ...sohamResults
  ]);

  // Step 2: Group detections by unified class
  const grouped = groupByClass(allDetections);

  // Step 3: Apply Enhanced NMS to each class separately
  const final = {};
  for (const [className, detections] of Object.entries(grouped)) {
    final[className] = enhancedNMS(detections, 0.45);
  }
  return final;
}
```
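The `boxesOverlap` function above calls a `calculateIoU` helper that is not shown. A standard intersection-over-union sketch, assuming the same `{x1, y1, x2, y2}` box shape used throughout:

```javascript
// Intersection over Union of two axis-aligned boxes in {x1, y1, x2, y2} form.
function calculateIoU(box1, box2) {
  // Intersection rectangle (clamped to zero if the boxes are disjoint)
  const ix1 = Math.max(box1.x1, box2.x1);
  const iy1 = Math.max(box1.y1, box2.y1);
  const ix2 = Math.min(box1.x2, box2.x2);
  const iy2 = Math.min(box1.y2, box2.y2);
  const interArea = Math.max(0, ix2 - ix1) * Math.max(0, iy2 - iy1);

  // Union = sum of areas minus the intersection
  const area1 = (box1.x2 - box1.x1) * (box1.y2 - box1.y1);
  const area2 = (box2.x2 - box2.x1) * (box2.y2 - box2.y1);
  const union = area1 + area2 - interArea;

  return union > 0 ? interArea / union : 0;
}
```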