Eray.

AI Background Remover

Browser-based background removal using MMDetection and ONNX

• Source Code
Machine Learning · Computer Vision · MMDetection · ONNX · Instance Segmentation

Technologies: MMDetection · RTMDet-Tiny · ONNX Runtime · WASM · JavaScript

AI Background Remover - Interactive Demo

This project showcases RTMDet-Tiny, an instance segmentation model deployed with MMDeploy to ONNX format and running entirely in your browser. Upload any image and watch as AI removes the background in real-time!

The model performs accurate object detection and creates precise segmentation masks to isolate subjects from their backgrounds. All processing happens client-side using WASM for optimal performance.



Background Removal with MMDetection: Instance Segmentation in the Browser

Background removal is a common computer vision task that has traditionally required powerful servers. With modern web technologies like ONNX Runtime Web and optimized models like RTMDet-Tiny, however, we can perform high-quality background removal directly in the browser.

Project Overview

Goal: A web application that removes backgrounds from images with professional quality, running entirely in the browser.

Key Features:

  • Zero Server Cost - All inference runs client-side
  • Instance Segmentation - Precise object detection and masking
  • Optimized Performance - ~765ms inference on CPU (512x512 resolution)
  • Privacy First - Images never leave your device
  • Mobile Compatible - Works on phones and tablets

Technical Architecture

Model: RTMDet-Tiny

RTMDet (short for Real-Time Models for object Detection) is a state-of-the-art detector family from OpenMMLab. The "Tiny" variant offers an excellent balance between accuracy and speed:

Model Specifications:
├── Architecture: RTMDet-Tiny (Instance Segmentation)
├── Backbone: CSPNeXt-Tiny
├── Neck: CSPNeXtPAFPN
├── Head: RTMDetInsSepBNHead
├── Parameters: ~4M
└── Model Size: ~23 MB (ONNX)

Export Pipeline: MMDeploy

MMDeploy provides a streamlined pipeline to export MMDetection models to various deployment formats. For browser deployment, I used the SDK format with ONNX Runtime:

# MMDeploy export configuration
CONFIG="configs/mmdet/instance-seg/instance-seg_onnxruntime_dynamic.py"
MODEL_CFG="rtmdet-ins_tiny_8xb32-300e_coco.py"
CHECKPOINT="rtmdet-ins_tiny_8xb32-300e_coco_20221130_151727-ec670f7e.pth"

# Export command
python tools/deploy.py \
    "$CONFIG" \
    "$MODEL_CFG" \
    "$CHECKPOINT" \
    test_image.jpg \
    --work-dir output \
    --device cpu \
    --dump-info

SDK Format Benefits:

  • End-to-end processing (preprocessing + inference + NMS + mask generation)
  • Dynamic shapes for variable detection counts
  • Optimized for inference (no training components)
  • Output format: dets, labels, masks
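The SDK output layout can be made concrete with a small helper that reads the flat `dets` buffer row by row. This is an illustrative sketch, not code from the project; the function name and the 0.5 threshold are assumptions:

```javascript
// Filter SDK-format detections by confidence.
// Each row of `dets` is [x1, y1, x2, y2, score]; `labels` has one entry per row.
function filterDetections(dets, labels, scoreThreshold = 0.5) {
    const kept = [];
    for (let i = 0; i < labels.length; i++) {
        const score = dets[i * 5 + 4];
        if (score >= scoreThreshold) {
            kept.push({
                box: Array.from(dets.slice(i * 5, i * 5 + 4)),
                score,
                label: labels[i]
            });
        }
    }
    return kept;
}
```

Because the shapes are dynamic, `labels.length` is the only reliable way to know how many objects were detected in a given image.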

Browser Optimization

To achieve smooth performance in the browser, several optimizations were applied:

1. Resolution Optimization

// Reduced from 640x640 to 512x512 for 6% speedup
const TARGET_SIZE = 512;

2. WASM SIMD Acceleration

ort.env.wasm.numThreads = 1;  // single thread: multithreading needs cross-origin isolation
ort.env.wasm.simd = true;     // enable SIMD for faster computation

3. Preprocessing Pipeline

function preprocess(img, targetSize) {
    // Letterbox resizing (maintain aspect ratio)
    const scale = Math.min(targetSize / img.width, targetSize / img.height);
    const newW = Math.round(img.width * scale);
    const newH = Math.round(img.height * scale);
    // Pad to target size with gray (114, 114, 114)
    const canvas = document.createElement('canvas');
    canvas.width = canvas.height = targetSize;
    const ctx = canvas.getContext('2d');
    ctx.fillStyle = 'rgb(114, 114, 114)';
    ctx.fillRect(0, 0, targetSize, targetSize);
    ctx.drawImage(img, 0, 0, newW, newH);
    const { data } = ctx.getImageData(0, 0, targetSize, targetSize);
    // Normalize: (pixel - mean) / std; means/stds are in BGR order (MMDet default)
    const mean = [103.53, 116.28, 123.675];
    const std = [57.375, 57.12, 58.395];
    // Convert RGBA (HWC) to CHW format (channels-first), output shape [1, 3, 512, 512]
    const area = targetSize * targetSize;
    const chw = new Float32Array(3 * area);
    for (let i = 0; i < area; i++) {
        for (let c = 0; c < 3; c++) {
            // data[i * 4 + (2 - c)] reads B, G, R to match the BGR means above
            chw[c * area + i] = (data[i * 4 + (2 - c)] - mean[c]) / std[c];
        }
    }
    const tensor = new ort.Tensor('float32', chw, [1, 3, targetSize, targetSize]);
    return { tensor, scale, padW: targetSize - newW, padH: targetSize - newH };
}

Performance

Tested with different image types:

Test Results:

  • Portrait Photos - Accurate person detection with detailed edges
  • Vehicles - Clean car/vehicle segmentation
  • Animals - Precise fur texture preservation
  • Complex Scenes - Handles multiple objects and overlapping elements

Implementation Details

Model Loading

async function loadModel() {
    const response = await fetch('rtmdet_tiny_sdk.onnx');
    const buffer = await response.arrayBuffer();
    
    const session = await ort.InferenceSession.create(buffer, {
        executionProviders: ['wasm'],
        graphOptimizationLevel: 'all'
    });
    
    return session;
}

Inference Pipeline

async function runInference(session, imageElement) {
    // 1. Preprocess
    const { tensor, scale, padW, padH } = preprocess(imageElement, 512);
    
    // 2. Run inference
    const feeds = { [session.inputNames[0]]: tensor };
    const results = await session.run(feeds);
    
    // 3. Parse outputs (SDK format)
    const dets = results.dets.data;      // [N, 5] - bboxes + scores
    const labels = results.labels.data;  // [N] - class labels
    const masks = results.masks.data;    // [N, H, W] - segmentation masks
    
    // 4. Postprocess and visualize
    return { dets, labels, masks, scale, padW, padH };
}

Mask Application

function applyMask(originalImage, mask, scale, padW, padH) {
    const canvas = document.createElement('canvas');
    canvas.width = originalImage.width;
    canvas.height = originalImage.height;
    const ctx = canvas.getContext('2d');
    
    // Draw original image
    ctx.drawImage(originalImage, 0, 0);
    
    // Get image data
    const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
    const pixels = imageData.data;
    
    // Apply mask to alpha channel
    for (let y = 0; y < canvas.height; y++) {
        for (let x = 0; x < canvas.width; x++) {
            const maskValue = getMaskValue(mask, x, y, scale, padW, padH);
            const pixelIndex = (y * canvas.width + x) * 4;
            pixels[pixelIndex + 3] = maskValue > 0.5 ? 255 : 0;
        }
    }
    
    ctx.putImageData(imageData, 0, 0);
    return canvas;
}
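`applyMask` relies on a `getMaskValue` helper that is not shown above. A minimal sketch, assuming the SDK returns masks at the letterboxed model resolution (512x512 here) with the image placed top-left and padding on the right and bottom; the signature and default size are assumptions:

```javascript
// Sample the model-resolution mask at an original-image pixel,
// undoing the letterbox transform (scale maps original -> model coords).
function getMaskValue(mask, x, y, scale, padW, padH, maskSize = 512) {
    const mx = Math.floor(x * scale);
    const my = Math.floor(y * scale);
    // Coordinates inside the padded border lie outside the original image
    if (mx < 0 || my < 0 || mx >= maskSize - padW || my >= maskSize - padH) return 0;
    return mask[my * maskSize + mx];
}
```

Nearest-neighbor sampling is the simplest choice; bilinear interpolation here would give smoother mask edges at some extra cost.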

Use Cases

This technology enables various applications:

  1. Photography - Remove distracting backgrounds from portraits
  2. E-commerce - Create product images with clean white backgrounds
  3. Social Media - Generate profile pictures and stickers
  4. Presentations - Extract subjects for slides and documents
  5. Creative Projects - Composite images and artistic effects

Challenges & Solutions

Challenge 1: Model Size

Problem: Original model too large for web deployment
Solution: Used RTMDet-Tiny variant (23 MB), acceptable for web

Challenge 2: Inference Speed

Problem: Initial 640x640 resolution too slow
Solution: Optimized to 512x512 with minimal quality loss (6% speedup)

Challenge 3: Mobile Compatibility

Problem: WebGL not universally supported
Solution: Used WASM backend for broader compatibility

Challenge 4: Memory Management

Problem: Large images causing browser crashes
Solution: Resize input images before processing, limit max resolution
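The resize step from Challenge 4 can be sketched as follows. The 2048-pixel cap and the helper names are illustrative assumptions, not values from the project:

```javascript
const MAX_DIM = 2048; // hypothetical cap on the longest side

// Pure helper: compute the downscale factor for an oversized image
function clampScale(width, height, maxDim = MAX_DIM) {
    return Math.min(1, maxDim / Math.max(width, height));
}

// Downscale a large image onto a canvas before it reaches preprocess()
function clampImageSize(img) {
    const s = clampScale(img.width, img.height);
    if (s === 1) return img; // already small enough
    const canvas = document.createElement('canvas');
    canvas.width = Math.round(img.width * s);
    canvas.height = Math.round(img.height * s);
    canvas.getContext('2d').drawImage(img, 0, 0, canvas.width, canvas.height);
    return canvas;
}
```

Clamping before preprocessing bounds both the canvas allocations and the pixel loops, which is what prevents the browser crashes on very large uploads.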

Future Improvements

Potential enhancements for this project:

  1. Multi-resolution Support - Adaptive resolution based on device capabilities
  2. Batch Processing - Process multiple images sequentially
  3. Edge Refinement - Post-processing for smoother edges
  4. Background Replacement - Not just remove, but replace with custom backgrounds
  5. Video Support - Real-time background removal for video streams
  6. Model Selection - Multiple model sizes for speed/quality tradeoff

Deployment Considerations

Web Hosting

This project requires only static file hosting:

  • GitHub Pages (free)
  • Netlify / Vercel (free tier)
  • Cloudflare Pages (free)
  • S3 + CloudFront

CORS Configuration

Ensure proper CORS headers for ONNX model loading:

# Server configuration
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET

Caching Strategy

The ONNX model is large, so aggressive caching is beneficial:

// Service Worker: pre-cache the model and demo script on install
self.addEventListener('install', (event) => {
    event.waitUntil(caches.open('bg-remover-v1').then((cache) => cache.addAll([
        '/demos/background-remover/model.onnx',
        '/demos/background-remover/demo.js'
    ])));
});

Test Results Gallery

The model has been extensively tested with various image types. Below are visual comparisons showing the original image, segmentation mask, and final background-removed result for each test case:

Vehicle Segmentation

Car Segmentation Test

Clean segmentation of vehicle with precise edge detection. The model successfully separates the car from background elements.

Portrait Segmentation

Human 1 Segmentation

Human 2 Segmentation

Accurate person detection with detailed masks including hair and body contours. Works well with different poses and backgrounds.

Animal Segmentation

Dog Segmentation

Precise animal detection capturing fine details like fur texture. The model handles animals as reliably as people and vehicles.

Complex Scene

Complex Scene Segmentation

Multi-object scene demonstrating the model's ability to handle overlapping objects and complex backgrounds.


Conclusion

This project demonstrates that sophisticated computer vision tasks like instance segmentation and background removal can run efficiently in web browsers. By combining optimized models (RTMDet-Tiny), efficient formats (ONNX), and modern web technologies (WASM), we can create powerful applications that respect user privacy, reduce costs, and provide instant results.

The future of AI is increasingly moving to the edge, and browsers are becoming a capable platform for running machine learning models. This project is just one example of what's possible today.

Try the demo above with your own images and experience AI-powered background removal running entirely on your device!