---
license: apache-2.0
tags:
- segmentation
- remove background
- background
- background-removal
- Pytorch
pretty_name: Open Remove Background Model
datasets:
- schirrmacher/humans
---
# Open Remove Background Model (ormbg)

[>>> DEMO <<<](https://huggingface.co/spaces/schirrmacher/ormbg)

This model is a **fully open-source background remover** optimized for images with humans. It is based on [Highly Accurate Dichotomous Image Segmentation research](https://github.com/xuebinqin/DIS). It is similar to [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4), but with an open training dataset and process, and it is free for commercial use.
## Inference

```
python utils/inference.py
```
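The inference script is the supported entry point. Conceptually, background removal comes down to predicting an alpha matte and compositing it with the input image. A minimal NumPy sketch of that compositing step (the `matte` below is a stand-in for a hypothetical model output, not something this snippet computes):

```python
import numpy as np

def apply_matte(rgb: np.ndarray, matte: np.ndarray) -> np.ndarray:
    """Compose an RGBA image from an RGB image (H, W, 3, uint8)
    and a predicted alpha matte (H, W, float values in [0, 1])."""
    alpha = (np.clip(matte, 0.0, 1.0) * 255).astype(np.uint8)
    return np.dstack([rgb, alpha])

# Toy example: a 2x2 image where the matte keeps only the left column.
rgb = np.full((2, 2, 3), 200, dtype=np.uint8)
matte = np.array([[1.0, 0.0], [1.0, 0.0]])
rgba = apply_matte(rgb, matte)
print(rgba.shape)  # (2, 2, 4)
```

Saving `rgba` as a PNG (e.g. via Pillow) yields the background-removed result.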
## Training

The model was trained with the [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans). After 10,000 iterations (8 hours of training on a single NVIDIA GeForce RTX 4090), the model reached:

- Training Loss: 0.1179
- Validation Loss: 0.1284
- Maximum F1 Score: 0.9928
- Mean Absolute Error: 0.005

Output model: `/models/ormbg.pth`.
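For context, F1 and mean absolute error for matting are usually computed between the predicted and ground-truth mattes. A hedged sketch of those two metrics (this is an illustration of the standard definitions, not the training code):

```python
import numpy as np

def mask_metrics(pred: np.ndarray, gt: np.ndarray, threshold: float = 0.5):
    """MAE on the soft matte, F1 on the binarized mask."""
    mae = float(np.mean(np.abs(pred - gt)))
    p, g = pred >= threshold, gt >= threshold
    tp = np.sum(p & g)                      # true positives
    precision = tp / max(np.sum(p), 1)
    recall = tp / max(np.sum(g), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-8)
    return mae, f1

# Toy mattes: three of four pixels are predicted correctly.
pred = np.array([[0.9, 0.1], [0.8, 0.4]])
gt = np.array([[1.0, 0.0], [1.0, 1.0]])
mae, f1 = mask_metrics(pred, gt)
print(mae, f1)  # 0.25 0.8
```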
## Want to train your own model?

Check out the _Highly Accurate Dichotomous Image Segmentation_ code:

```
git clone https://github.com/xuebinqin/DIS.git
cd DIS
```

Follow the [installation instructions](https://github.com/xuebinqin/DIS?tab=readme-ov-file#1-clone-this-repo). Download or create some data ([like this](https://huggingface.co/datasets/schirrmacher/humans)) and place it into the DIS project folder. I am using the following folder structure:

- training/im (images)
- training/gt (ground truth)
- validation/im (images)
- validation/gt (ground truth)
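A quick sanity check, assuming the structure above, that every image has a matching ground-truth file (the file names here are hypothetical examples):

```python
from pathlib import Path

def unpaired_images(root: str) -> list:
    """Return image stems under <root>/im with no counterpart in <root>/gt."""
    im_stems = {p.stem for p in Path(root, "im").glob("*") if p.is_file()}
    gt_stems = {p.stem for p in Path(root, "gt").glob("*") if p.is_file()}
    return sorted(im_stems - gt_stems)

# Example with a temporary layout: one image is missing its ground truth.
import tempfile, os
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "im"))
os.makedirs(os.path.join(root, "gt"))
Path(root, "im", "0001.jpg").touch()
Path(root, "im", "0002.jpg").touch()
Path(root, "gt", "0001.png").touch()
print(unpaired_images(root))  # ['0002']
```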
Apply this git patch to set the right paths and to remove the normalization of images:

```
git apply dis-repo.patch
```
Start training:

```
cd IS-Net
python train_valid_inference_main.py
```

Export to ONNX (modify paths if needed):

```
python utils/pth_to_onnx.py
```
# Research

Synthetic datasets have limitations for achieving great segmentation results, because artificial lighting, occlusion, scale, or backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain"; see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290). However, hybrid training approaches seem promising and can even improve segmentation results.

Currently I am researching how to close this gap with the resources I have. There are promising approaches, such as taking the human pose into account to improve segmentation results; see [Cross-Domain Complementary Learning Using Pose for Multi-Person Part Segmentation (2019)](https://arxiv.org/pdf/1907.05193).
## Support

This is the first iteration of the model, so there will be improvements!

If you identify cases where the model fails, <a href='https://huggingface.co/schirrmacher/ormbg/discussions' target='_blank'>upload your examples</a>!

Known issues (work in progress):

- close-ups: from above, from below, in profile, from the side
- minor issues with hair segmentation when hair forms loops
- more varied backgrounds needed