AI Background Remover
Browser-based background removal using MMDetection and ONNX
AI Background Remover - Interactive Demo
This project showcases RTMDet-Tiny, an instance segmentation model deployed with MMDeploy to ONNX format and running entirely in your browser. Upload any image and watch as AI removes the background in real-time!
The model detects objects and produces per-instance segmentation masks that isolate subjects from their backgrounds. All processing happens client-side through ONNX Runtime Web's WebAssembly (WASM) backend.
Background Removal with MMDetection: Instance Segmentation in the Browser
Background removal is a common computer vision task traditionally requiring powerful servers. But with modern web technologies like ONNX Runtime Web and optimized models like RTMDet-Tiny, we can perform high-quality background removal directly in the browser.
Project Overview
Goal: A web application that removes backgrounds from images with professional quality, running entirely in the browser.
Key Features:
- Zero Server Cost - All inference runs client-side
- Instance Segmentation - Precise object detection and masking
- Optimized Performance - ~765ms inference on CPU (512x512 resolution)
- Privacy First - Images never leave your device
- Mobile Compatible - Works on phones and tablets
Technical Architecture
Model: RTMDet-Tiny
RTMDet is a family of real-time object detectors from OpenMMLab that also supports instance segmentation. The "Tiny" variant offers a good balance between accuracy and speed:
Model Specifications:
├── Architecture: RTMDet-Tiny (Instance Segmentation)
├── Backbone: CSPNeXt-Tiny
├── Neck: CSPNeXtPAFPN
├── Head: RTMDetInsSepBNHead (dynamic-kernel mask head)
├── Parameters: ~5.6M (per the MMDetection model zoo)
└── Model Size: ~23 MB (ONNX)

Export Pipeline: MMDeploy
MMDeploy provides a streamlined pipeline to export MMDetection models to various deployment formats. For browser deployment, I used the SDK format with ONNX Runtime:
# MMDeploy export configuration (shell variables)
CONFIG="configs/mmdet/instance-seg/instance-seg_onnxruntime_dynamic.py"
MODEL_CFG="rtmdet-ins_tiny_8xb32-300e_coco.py"
CHECKPOINT="rtmdet-ins_tiny_8xb32-300e_coco_20221130_151727-ec670f7e.pth"

# Export command (run from the MMDeploy repository root)
python tools/deploy.py \
    "$CONFIG" \
    "$MODEL_CFG" \
    "$CHECKPOINT" \
    test_image.jpg \
    --work-dir output \
    --device cpu \
    --dump-info

SDK Format Benefits:
- End-to-end processing (preprocessing + inference + NMS + mask generation)
- Dynamic shapes for variable detection counts
- Optimized for inference (no training components)
- Output format: dets, labels, masks
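As a rough illustration of how these three outputs might be consumed for background removal, the sketch below picks the highest-scoring detection and slices out its mask. It assumes `dets` is a flat buffer of 5 floats per detection (x1, y1, x2, y2, score) and that `masks` is laid out as [N, H, W]. The function names and the 0.3 score threshold are hypothetical and do not come from the project code.

```javascript
// Sketch: pick the highest-scoring detection from the flat `dets` buffer.
// Assumes 5 floats per detection: x1, y1, x2, y2, score.
function selectBestDetection(dets, minScore = 0.3) {
  const count = Math.floor(dets.length / 5);
  let best = null;
  for (let i = 0; i < count; i++) {
    const score = dets[i * 5 + 4];
    if (score >= minScore && (best === null || score > best.score)) {
      best = { index: i, score, box: Array.from(dets.slice(i * 5, i * 5 + 4)) };
    }
  }
  return best; // null when no detection clears the threshold
}

// Sketch: view the mask of detection `index` inside the flat `masks` buffer,
// assuming a [N, H, W] layout stored in a typed array.
function maskForDetection(masks, index, maskWidth, maskHeight) {
  const size = maskWidth * maskHeight;
  return masks.subarray(index * size, (index + 1) * size);
}
```

Keeping only the top-scoring instance is one possible policy. Merging all masks above the threshold would be an alternative for scenes with several subjects.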
Browser Optimization
To achieve smooth performance in the browser, several optimizations were applied:
1. Resolution Optimization
// Reduced from 640x640 to 512x512 for 6% speedup
const TARGET_SIZE = 512;

2. WASM SIMD Acceleration
ort.env.wasm.numThreads = 1;
ort.env.wasm.simd = true; // Enable SIMD for faster computation

3. Preprocessing Pipeline
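The normalization and channels-first conversion steps of this pipeline can be sketched as a standalone function over canvas pixel data. This is a minimal sketch, not the project's actual code. It assumes the mean and std values are in B, G, R order (the largest mean, 123.675, is the usual ImageNet red-channel mean) and that the exported model expects BGR input. The names `normalizeToCHW`, `MEAN_BGR` and `STD_BGR` are hypothetical.

```javascript
// Mean and std as given in the post, assumed to be in B, G, R order.
const MEAN_BGR = [103.53, 116.28, 123.675];
const STD_BGR = [57.375, 57.12, 58.395];

// rgba: pixel data as returned by getImageData (R, G, B, A bytes per pixel).
// Returns a Float32Array in CHW layout with channels ordered B, G, R.
function normalizeToCHW(rgba, width, height) {
  const area = width * height;
  const chw = new Float32Array(3 * area);
  for (let i = 0; i < area; i++) {
    for (let c = 0; c < 3; c++) {
      // Output channel c is B, G, R for c = 0, 1, 2;
      // RGBA stores R, G, B at byte offsets 0, 1, 2.
      const value = rgba[i * 4 + (2 - c)];
      chw[c * area + i] = (value - MEAN_BGR[c]) / STD_BGR[c];
    }
  }
  return chw;
}
```

The result could then be wrapped as `new ort.Tensor('float32', chw, [1, 3, height, width])` before being passed to the session.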
function preprocess(img, targetSize) {
  // Letterbox resizing (maintain aspect ratio)
  const scale = Math.min(targetSize / img.width, targetSize / img.height);
  const newW = Math.round(img.width * scale);
  const newH = Math.round(img.height * scale);
  // Pad to target size with gray (114, 114, 114)
  // Normalize: (pixel - mean) / std (values listed in B, G, R order)
  const mean = [103.53, 116.28, 123.675];
  const std = [57.375, 57.12, 58.395];
  // Convert to CHW format (channels-first)
  // Output shape: [1, 3, 512, 512]
  // Returns { tensor, scale, padW, padH }, as consumed by runInference
}

Performance
Tested with different image types:
Test Results:
- Portrait Photos - Accurate person detection with detailed edges
- Vehicles - Clean car/vehicle segmentation
- Animals - Precise fur texture preservation
- Complex Scenes - Handles multiple objects and overlapping elements
Implementation Details
Model Loading
async function loadModel() {
  const response = await fetch('rtmdet_tiny_sdk.onnx');
  const buffer = await response.arrayBuffer();
  const session = await ort.InferenceSession.create(buffer, {
    executionProviders: ['wasm'],
    graphOptimizationLevel: 'all'
  });
  return session;
}

Inference Pipeline
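The inference code in this section expects `preprocess` to return `scale`, `padW` and `padH` along with the tensor. The post does not say where the padding is placed, so the following is only a sketch of how those values might be derived. The function name `letterboxGeometry` and its `center` switch are hypothetical: `center = true` splits the padding evenly, while `center = false` keeps the image in the top-left corner.

```javascript
// Sketch: compute letterbox scale and padding offsets for a square target.
// padW / padH are the left / top offsets, in model-input pixels.
function letterboxGeometry(srcWidth, srcHeight, targetSize, center = true) {
  const scale = Math.min(targetSize / srcWidth, targetSize / srcHeight);
  const newW = Math.round(srcWidth * scale);
  const newH = Math.round(srcHeight * scale);
  const padW = center ? Math.floor((targetSize - newW) / 2) : 0;
  const padH = center ? Math.floor((targetSize - newH) / 2) : 0;
  return { scale, newW, newH, padW, padH };
}
```

Whichever placement is used, the same offsets have to be applied again when mapping mask pixels back onto the original image.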
async function runInference(session, imageElement) {
  // 1. Preprocess
  const { tensor, scale, padW, padH } = preprocess(imageElement, 512);
  // 2. Run inference
  const feeds = { [session.inputNames[0]]: tensor };
  const results = await session.run(feeds);
  // 3. Parse outputs (SDK format)
  const dets = results.dets.data;     // [N, 5] - bboxes + scores
  const labels = results.labels.data; // [N] - class labels
  const masks = results.masks.data;   // [N, H, W] - segmentation masks
  // 4. Postprocess and visualize
  return { dets, labels, masks, scale, padW, padH };
}

Mask Application
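The `applyMask` function in this section calls a `getMaskValue` helper that the post does not define. A minimal sketch of what it might look like, assuming the mask is a flat row-major array at the model input resolution (512x512), that `scale` is the letterbox scale from preprocessing, and that `padW` / `padH` are the left / top padding offsets in model-input pixels. The extra `maskSize` parameter is hypothetical.

```javascript
// Sketch: look up the mask value for pixel (x, y) of the original image.
// Maps original coordinates into the letterboxed model input, then reads
// the nearest mask pixel (clamped to the mask bounds).
function getMaskValue(mask, x, y, scale, padW, padH, maskSize = 512) {
  const mx = Math.min(maskSize - 1, Math.max(0, Math.floor(x * scale + padW)));
  const my = Math.min(maskSize - 1, Math.max(0, Math.floor(y * scale + padH)));
  return mask[my * maskSize + mx];
}
```

Nearest-neighbor lookup keeps the per-pixel cost low but can leave jagged edges when the original image is much larger than the mask. Bilinear sampling would be a smoother alternative.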
function applyMask(originalImage, mask, scale, padW, padH) {
  const canvas = document.createElement('canvas');
  canvas.width = originalImage.width;
  canvas.height = originalImage.height;
  const ctx = canvas.getContext('2d');
  // Draw original image
  ctx.drawImage(originalImage, 0, 0);
  // Get image data
  const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
  const pixels = imageData.data;
  // Apply mask to alpha channel
  for (let y = 0; y < canvas.height; y++) {
    for (let x = 0; x < canvas.width; x++) {
      const maskValue = getMaskValue(mask, x, y, scale, padW, padH);
      const pixelIndex = (y * canvas.width + x) * 4;
      pixels[pixelIndex + 3] = maskValue > 0.5 ? 255 : 0;
    }
  }
  ctx.putImageData(imageData, 0, 0);
  return canvas;
}

Use Cases
This technology enables various applications:
- Photography - Remove distracting backgrounds from portraits
- E-commerce - Create product images with clean white backgrounds
- Social Media - Generate profile pictures and stickers
- Presentations - Extract subjects for slides and documents
- Creative Projects - Composite images and artistic effects
Challenges & Solutions
Challenge 1: Model Size
Problem: Original model too large for web deployment
Solution: Used RTMDet-Tiny variant (23 MB), acceptable for web
Challenge 2: Inference Speed
Problem: Initial 640x640 resolution too slow
Solution: Optimized to 512x512 with minimal quality loss (6% speedup)
Challenge 3: Mobile Compatibility
Problem: WebGL not universally supported
Solution: Used WASM backend for broader compatibility
Challenge 4: Memory Management
Problem: Large images causing browser crashes
Solution: Resize input images before processing, limit max resolution
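The resolution cap mentioned for Challenge 4 could be sketched as follows. The helper name `fitWithin` and the 2048-pixel default are hypothetical. The post does not state the limit the demo actually uses.

```javascript
// Sketch: shrink (width, height) so the longest side is at most maxSide,
// preserving aspect ratio. Images already within the limit are unchanged.
function fitWithin(width, height, maxSide = 2048) {
  const longest = Math.max(width, height);
  if (longest <= maxSide) {
    return { width, height, ratio: 1 };
  }
  const ratio = maxSide / longest;
  return {
    width: Math.round(width * ratio),
    height: Math.round(height * ratio),
    ratio
  };
}
```

The returned size can be used for the canvas that holds the uploaded image, so that later `getImageData` calls work on a bounded number of pixels.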
Future Improvements
Potential enhancements for this project:
- Multi-resolution Support - Adaptive resolution based on device capabilities
- Batch Processing - Process multiple images sequentially
- Edge Refinement - Post-processing for smoother edges
- Background Replacement - Not just remove, but replace with custom backgrounds
- Video Support - Real-time background removal for video streams
- Model Selection - Multiple model sizes for speed/quality tradeoff
Deployment Considerations
Web Hosting
This project requires only static file hosting:
- GitHub Pages (free)
- Netlify / Vercel (free tier)
- Cloudflare Pages (free)
- S3 + CloudFront
CORS Configuration
If the ONNX model is served from a different origin than the page, the server must send CORS headers so the browser can fetch it:
# Response headers
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET

Caching Strategy
The ONNX model is large, so aggressive caching is beneficial:
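// Service Worker caching (cache name is a placeholder)
self.addEventListener('install', (event) => {
  event.waitUntil(caches.open('background-remover-v1').then((cache) => cache.addAll([
    '/demos/background-remover/model.onnx',
    '/demos/background-remover/demo.js'
  ])));
});

Test Results Gallery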
The model has been extensively tested with various image types. Below are visual comparisons showing the original image, segmentation mask, and final background-removed result for each test case:
Vehicle Segmentation

Clean segmentation of the vehicle with precise edges. The model successfully separates the car from background elements.
Portrait Segmentation


Accurate person detection with detailed masks including hair and body contours. Works well with different poses and backgrounds.
Animal Segmentation

Precise animal detection capturing fine details like fur texture. The model handles animals on par with people and vehicles.
Complex Scene

Multi-object scene demonstrating the model's ability to handle overlapping objects and complex backgrounds.
Resources & References
- MMDetection - Object detection framework
- MMDeploy - Model deployment toolkit
- ONNX Runtime Web - Browser inference
- RTMDet Paper - Model architecture details
- GitHub Repository - Full source code and test notebooks
Conclusion
This project demonstrates that sophisticated computer vision tasks like instance segmentation and background removal can run efficiently in web browsers. By combining optimized models (RTMDet-Tiny), efficient formats (ONNX), and modern web technologies (WASM), we can create powerful applications that respect user privacy, reduce costs, and provide instant results.
The future of AI is increasingly moving to the edge, and browsers are becoming a capable platform for running machine learning models. This project is just one example of what's possible today.
Try the demo above with your own images and experience AI-powered background removal running entirely on your device!