VOID VLM-Mask-Reasoner — Quadmask Generation
Generate 4-level semantic masks (quadmasks) for interaction-aware video inpainting with VOID.
Pipeline: click points on the object to remove → SAM2 segments it → Gemini VLM reasons about its interactions → SAM3 segments the affected objects → the quadmask is generated
Use the generated quadmask with the VOID inpainting demo.
Quadmask format
| Pixel value | Preview overlay | Meaning |
|---|---|---|
| 0 (black) | Red | Primary object to remove |
| 63 (dark grey) | Yellow | Overlap of the primary object and the affected region |
| 127 (mid grey) | Green | Affected region (shadows, reflections, physics) |
| 255 (white) | None (original pixels) | Background: keep as-is |
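
The encoding in the table can be illustrated with a minimal NumPy sketch. It assumes that the two segmentation stages each yield a boolean mask of the same shape for a frame; the function name `build_quadmask` and the toy masks are hypothetical and are not taken from the VOID code base.

```python
import numpy as np

# Pixel values taken from the quadmask table.
PRIMARY, OVERLAP, AFFECTED, BACKGROUND = 0, 63, 127, 255


def build_quadmask(primary: np.ndarray, affected: np.ndarray) -> np.ndarray:
    """Combine two boolean masks of equal shape into a uint8 quadmask.

    Hypothetical helper: `primary` marks the object to remove and
    `affected` marks the region it influences (shadows, reflections, ...).
    """
    if primary.shape != affected.shape:
        raise ValueError("masks must have the same shape")
    primary = primary.astype(bool)
    affected = affected.astype(bool)

    # Start from background, then paint the more specific classes on top.
    quad = np.full(primary.shape, BACKGROUND, dtype=np.uint8)
    quad[affected] = AFFECTED
    quad[primary] = PRIMARY
    quad[primary & affected] = OVERLAP
    return quad


# Toy 4x6 frame: the primary object and the affected region share column 3.
primary = np.zeros((4, 6), dtype=bool)
primary[1:3, 1:4] = True
affected = np.zeros((4, 6), dtype=bool)
affected[1:3, 3:6] = True

quad = build_quadmask(primary, affected)
print(quad)
```

Painting the overlap last means the assignment order of the other two classes does not matter. For a video, the same function would be applied per frame.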