VOID VLM-Mask-Reasoner — Quadmask Generation

Generate 4-level semantic masks (quadmasks) for interaction-aware video inpainting with VOID.

Pipeline: Click points on object → SAM2 segments it → Gemini VLM reasons about interactions → SAM3 segments affected objects → Quadmask generated
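The last pipeline step, merging the two segmentations into one quadmask, can be sketched roughly as follows. This is a minimal illustration, not the demo's actual code: it assumes the SAM2 and SAM3 results are already available as per-frame boolean arrays, and the function name `compose_quadmask` and the constant names are hypothetical. The four pixel values are the ones listed in the legend table on this page.

```python
import numpy as np

# Pixel values taken from the quadmask legend on this page.
PRIMARY, OVERLAP, AFFECTED, BACKGROUND = 0, 63, 127, 255


def compose_quadmask(primary: np.ndarray, affected: np.ndarray) -> np.ndarray:
    """Merge two per-frame masks into a single uint8 quadmask.

    primary  : True where the clicked object is (assumed to come from SAM2).
    affected : True where interaction effects are (assumed to come from SAM3).
    """
    primary = np.asarray(primary, dtype=bool)
    affected = np.asarray(affected, dtype=bool)
    if primary.shape != affected.shape:
        raise ValueError("primary and affected masks must have the same shape")

    # Start with everything marked as background, then paint each region.
    quad = np.full(primary.shape, BACKGROUND, dtype=np.uint8)
    quad[affected] = AFFECTED
    quad[primary] = PRIMARY
    # Pixels covered by both masks get the dedicated overlap value.
    quad[primary & affected] = OVERLAP
    return quad
```

Applied to every frame, this yields one single-channel mask per frame with exactly four grey levels.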

Use the generated quadmask with the VOID inpainting demo.

Upload Video

Click to select primary object points (click multiple spots on the object)

Edit instruction — describe what to remove

Download lossless quadmask_0.mp4 (use this with VOID)

Quadmask overlay on original video

Pixel value in quadmask	Overlay color in preview	Meaning
0 (black)	Red	Primary object to remove
63 (dark grey)	Yellow	Overlap of primary object and affected region
127 (mid grey)	Green	Affected region (shadows, reflections, physics)
255 (white)	None (original frame shown)	Background: keep as-is
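To make the legend concrete, the sketch below tints a frame according to its quadmask and checks that a mask contains only the four expected levels, which is why the lossless download matters. It is an illustration under stated assumptions rather than the demo's implementation: frames are taken to be RGB `uint8` arrays, the exact overlay colors and the `alpha` blending weight are guesses, and the names `overlay_quadmask` and `has_only_quadmask_levels` are hypothetical.

```python
import numpy as np

# Overlay colors per quadmask value (RGB). Exact shades are an assumption;
# value 255 (background) is deliberately absent so those pixels stay untouched.
OVERLAY_COLORS = {
    0: (255, 0, 0),     # primary object: red
    63: (255, 255, 0),  # overlap: yellow
    127: (0, 255, 0),   # affected region: green
}
QUADMASK_LEVELS = (0, 63, 127, 255)


def overlay_quadmask(frame_rgb: np.ndarray, quadmask: np.ndarray,
                     alpha: float = 0.5) -> np.ndarray:
    """Blend the legend colors onto an RGB frame wherever the quadmask marks a region."""
    if frame_rgb.shape[:2] != quadmask.shape:
        raise ValueError("frame and quadmask must have the same height and width")
    out = frame_rgb.astype(np.float32)
    for value, color in OVERLAY_COLORS.items():
        region = quadmask == value
        out[region] = (1.0 - alpha) * out[region] + alpha * np.array(color, dtype=np.float32)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


def has_only_quadmask_levels(quadmask: np.ndarray) -> bool:
    """Return True if every pixel is exactly 0, 63, 127, or 255."""
    return bool(np.isin(quadmask, QUADMASK_LEVELS).all())
```

A mask that has passed through a lossy encoder will typically fail the second check, because compression shifts grey values away from the four exact levels.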