Sandbox Guide — VideoDB Global Online Hackathon, May 2026

Overview

For the hackathon, VideoDB has unlocked selected models so you can build with video indexing and GenAI workflows. Use the VideoDB SDK to ingest media, index it with AI, and expose video/audio perception to your agent or application.

Pipe in live streams, uploaded files, RTSP feeds, YouTube links, or any continuous media source. Build spoken-word, scene, and visual indexes; search across ingested media; compose clips; trigger events; and wire responses into Slack, web apps, or phone workflows.

A VideoDB sandbox is a dedicated compute pool for model workloads. Create one sandbox, wait for it to become active, pass its sandbox_id into supported generation/indexing APIs, then stop it when you're finished to conserve credits.

Try the notebook first

Recommended: Start with the runnable sandbox notebook before copying snippets from this guide. It walks end-to-end through creating a sandbox, scene indexing, OmniVoice, FLUX, combining assets, and stopping the sandbox so compute billing ends.

Open the sandbox compute notebook in the VideoDB cookbook hackathon branch, or launch it directly in Google Colab.

Prerequisites

Install the hackathon branch of the VideoDB SDK:

shell

# In a notebook cell (Jupyter/Colab), prefix the command with "!".
pip install "git+https://github.com/Video-DB/videodb-python.git@hackathon"

python

from videodb import connect, SandboxModel, SandboxTier, IndexType, SearchType, SceneExtractionType, play_stream

conn = connect()
coll = conn.get_collection()

Your environment should include the normal VideoDB credentials/API key expected by videodb.connect().

Sandbox lifecycle

1. Create a sandbox

python

sandbox = conn.create_sandbox(
    tier=SandboxTier.medium,
    idle_timeout=600,    # stop after 10 minutes of inactivity
)
print(f"Sandbox: {sandbox.id}, Status: {sandbox.status}, Tier: {sandbox.tier}")

Sandbox creation returns immediately, usually with status provisioning. Use the idle_timeout parameter to automatically stop the sandbox after a period of inactivity and conserve credits.

2. Wait until ready

python

sandbox.wait_for_ready(timeout=300, interval=5)
print(f"Sandbox ready: {sandbox.id}, Status: {sandbox.status}")

Submit jobs only once sandbox.status == "active" or sandbox.is_active is True.
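If you want progress output while waiting, a hand-rolled polling loop is one option. The sketch below is an illustration, not the SDK's implementation of wait_for_ready: it assumes only that the sandbox object exposes refresh() and status, as used elsewhere in this guide. poll_until_active is a hypothetical helper name, and FakeSandbox is a stand-in so the example runs without credentials.

```python
import time


def poll_until_active(sandbox, timeout=300, interval=5, sleep=time.sleep):
    """Refresh `sandbox` until its status is "active" or `timeout` seconds pass.

    Hypothetical helper: relies only on `sandbox.refresh()` and `sandbox.status`.
    """
    deadline = time.monotonic() + timeout
    while True:
        sandbox.refresh()
        if sandbox.status == "active":
            return sandbox
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"sandbox not active after {timeout}s (status={sandbox.status})"
            )
        print(f"status={sandbox.status}, retrying in {interval}s")
        sleep(interval)


class FakeSandbox:
    """Stand-in for a real sandbox object so the sketch runs offline."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.status = "provisioning"

    def refresh(self):
        # Advance to the next scripted status; keep the last one when exhausted.
        self.status = next(self._statuses, self.status)


demo = FakeSandbox(["provisioning", "provisioning", "active"])
poll_until_active(demo, timeout=10, interval=0, sleep=lambda s: None)
print(demo.status)
```

With a real sandbox you would call poll_until_active(sandbox) and keep the default sleep.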

3. Reuse the sandbox ID

Pass the same sandbox ID to supported indexing and generation APIs:

python

sandbox_id = sandbox.id

If sandbox_id is omitted, the server may try to auto-resolve a compatible active sandbox, but explicitly passing sandbox_id is recommended for predictable routing.

4. Inspect sandboxes

python

# Refresh one sandbox
sandbox.refresh()
print(sandbox.status, sandbox.is_active)

# List all sandboxes
for sb in conn.list_sandboxes():
    print(f"{sb.id} | {sb.name} | {sb.tier} | {sb.status}")

# Get one sandbox by ID
sb = conn.get_sandbox(sandbox.id)
print(sb.id, sb.status)

5. Stop the sandbox

Stop the sandbox when finished. Billing is based on sandbox runtime.

python

sandbox.stop()
sandbox.wait_for_stop(timeout=120)
print(f"Sandbox {sandbox.id} final status: {sandbox.status}")
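Because billing follows runtime, it can help to guarantee that stop() runs even when a job raises. One way to do that is a small context manager. This is a sketch under assumptions: running_sandbox is a hypothetical helper, not part of the SDK, and it uses only the lifecycle calls shown in this guide (wait_for_ready, stop, wait_for_stop). FakeSandbox and its "stopping"/"stopped" status strings are placeholders so the example runs offline.

```python
from contextlib import contextmanager


@contextmanager
def running_sandbox(sandbox, ready_timeout=300, stop_timeout=120):
    """Yield an active sandbox and always stop it afterwards (hypothetical helper)."""
    try:
        sandbox.wait_for_ready(timeout=ready_timeout, interval=5)
        yield sandbox
    finally:
        # Runs on success, on job errors, and on readiness timeouts.
        sandbox.stop()
        sandbox.wait_for_stop(timeout=stop_timeout)


class FakeSandbox:
    """Stand-in so the sketch runs without credentials; statuses are placeholders."""

    def __init__(self):
        self.status = "provisioning"

    def wait_for_ready(self, timeout, interval):
        self.status = "active"

    def stop(self):
        self.status = "stopping"

    def wait_for_stop(self, timeout):
        self.status = "stopped"


demo = FakeSandbox()
try:
    with running_sandbox(demo) as sb:
        raise RuntimeError("job failed")  # simulate a failing job
except RuntimeError:
    pass
print(demo.status)
```

With the real SDK, the object passed in would be the result of conn.create_sandbox(...), and the indexing or generation calls would go inside the with block.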

Sandbox tiers and supported models

Use the smallest tier that supports your selected model. The notebook uses SandboxModel enum constants so examples stay aligned with the SDK.

Model enum	Use case	Minimum tier
`SandboxModel.GEMMA_4_E2B`	Scene indexing / faster visual understanding	`SandboxTier.small`
`SandboxModel.QWEN_9B`	Scene indexing / smaller VLM option	`SandboxTier.small`
`SandboxModel.GEMMA_4_26B`	Scene indexing / higher quality visual understanding	`SandboxTier.medium`
`SandboxModel.QWEN_27B`	Scene indexing / larger VLM option	`SandboxTier.medium`
`SandboxModel.GEMMA_4_31B`	Scene indexing / best fit for the notebook demo	`SandboxTier.medium`
`SandboxModel.OMNIVOICE`	Text-to-speech, voice design, and voice clone	`SandboxTier.small`
`SandboxModel.FLUX`	FLUX image generation	`SandboxTier.medium`
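When a workflow mixes several models, the tier is set by the heaviest one. The sketch below encodes the table above as a lookup and picks the smallest tier that covers a list of models. It uses plain strings rather than the SDK enums so that it runs without the SDK installed; MIN_TIER, TIER_ORDER, and required_tier are hypothetical names, and mapping the result back to SandboxTier is left to the caller.

```python
# Minimum tiers copied from the table above, keyed by enum member name.
MIN_TIER = {
    "GEMMA_4_E2B": "small",
    "QWEN_9B": "small",
    "GEMMA_4_26B": "medium",
    "QWEN_27B": "medium",
    "GEMMA_4_31B": "medium",
    "OMNIVOICE": "small",
    "FLUX": "medium",
}
TIER_ORDER = ["small", "medium"]  # smallest first


def required_tier(model_names):
    """Return the smallest tier that satisfies every model in `model_names`."""
    if not model_names:
        raise ValueError("at least one model is required")
    return max((MIN_TIER[name] for name in model_names), key=TIER_ORDER.index)


print(required_tier(["OMNIVOICE"]))          # small
print(required_tier(["OMNIVOICE", "FLUX"]))  # medium
```

An unknown model name raises KeyError, which is usually preferable to silently provisioning the wrong tier.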

Supported workloads

Workload	API	Model	Notes
Scene indexing / VLM extraction	`video.index_scenes(...)`	`SandboxModel.GEMMA_4_31B` or another supported VLM enum	Use with `SceneExtractionType` and an extraction prompt. Pick a tier that fits the model.
RTStream visual indexing	`rtstream.index_visuals(...)`	`SandboxModel.GEMMA_4_31B` or another supported VLM enum	For live RTSP / RTMP / capture streams. Pass `sandbox_id=sandbox.id` just like video scene indexing.
Text-to-speech	`coll.generate_voice(...)`	`SandboxModel.OMNIVOICE`	Supports basic TTS, voice design, voice clone, and extra config. Small tier is usually suitable.
Image generation	`coll.generate_image(...)`	`SandboxModel.FLUX`	Supports config such as size, inference steps, guidance scale, negative prompt. Medium tier recommended.

Scene indexing example

python

video = coll.upload("https://www.youtube.com/watch?v=jeA-KBv0b68")

index_id = video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={
        "time": 10,
        "select_frames": ["first"],
        "frame_count": 1,
    },
    model_name=SandboxModel.GEMMA_4_31B,
    prompt="Describe the scene in a clear, concise way.",
    sandbox_id=sandbox.id,
)

idx = video.get_scene_index(index_id)
print(idx)

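# Semantic search over the scene index. Replace "Claude" with a query
# that matches something visible in your own video.
res = video.search("Claude", index_type=IndexType.scene, search_type=SearchType.semantic)

# Compile the matching segments into one stream and play it.
stream_url = res.compile()
play_stream(stream_url)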
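Before indexing a long video, it can be useful to estimate how many scenes a given "time" value will produce, since each scene is described by the model. The arithmetic below is a rough sketch that assumes time-based extraction cuts the video into consecutive windows of "time" seconds, with a shorter final window; the exact boundary handling on the server is not documented here, and estimate_scene_count is a hypothetical helper.

```python
import math


def estimate_scene_count(video_seconds, time_step):
    """Approximate number of time-based scenes for a video of `video_seconds`."""
    if time_step <= 0:
        raise ValueError("time_step must be positive")
    return math.ceil(video_seconds / time_step)


print(estimate_scene_count(95, 10))  # 10 windows: nine full, one partial
```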

RTStream indexing

Sandbox-backed models are also available for RTStream indexing. Same lifecycle: create a sandbox, wait until active, pass sandbox_id=sandbox.id, and stop it when finished.

Visual indexing

python

rtstream = coll.connect_rtstream(
    url="rtsp://your-camera-or-stream-url",
    name="Hackathon Live Stream",
    media_types=["video"],
    store=True,
)

rtstream.start()

visual_index = rtstream.index_visuals(
    prompt="Describe what is happening in the live video. Return concise observations.",
    batch_config={"type": "time", "value": 5, "frame_count": 3},
    model_name=SandboxModel.GEMMA_4_31B,
    sandbox_id=sandbox.id,
    name="live_visual_index",
)

Audio indexing

python

audio_index = rtstream.index_audio(
    prompt="Summarize the important spoken content and events.",
    batch_config={"type": "time", "value": 30},
    model_name=SandboxModel.QWEN_9B,
    sandbox_id=sandbox.id,
    name="live_audio_index",
)

Stop the RTStream and sandbox when you're done:

python

rtstream.stop()
sandbox.stop()

OmniVoice examples

Basic TTS

python

job = coll.generate_voice(
    text="Hello, welcome to VideoDB.",
    model_name=SandboxModel.OMNIVOICE,
    sandbox_id=sandbox.id,
)

audio = job.wait(timeout=900, interval=5)
print(audio.id)

Voice design

python

job = coll.generate_voice(
    text="Breaking news! Scientists discover a new planet.",
    model_name=SandboxModel.OMNIVOICE,
    sandbox_id=sandbox.id,
    config={
        "instructions": "A deep, authoritative male news anchor voice",
    },
)

Voice clone

python

ref_audio = coll.upload(
    url="https://www.youtube.com/shorts/7xOPzBhHKWY",
    media_type="audio",
)

job = coll.generate_voice(
    text="This is a cloned voice powered by OmniVoice.",
    model_name=SandboxModel.OMNIVOICE,
    sandbox_id=sandbox.id,
    config={
        "ref_audio": ref_audio.generate_url(),
        "ref_text": "Sample reference text for the audio clip",
    },
)

Extra TTS config

python

job = coll.generate_voice(
    text="Hola, bienvenidos a VideoDB.",
    model_name=SandboxModel.OMNIVOICE,
    sandbox_id=sandbox.id,
    config={
        "response_format": "wav",
        "speed": 1.2,
        "language": "es",
    },
)

FLUX examples

Basic image generation

python

job = coll.generate_image(
    prompt="A futuristic cityscape at sunset, cyberpunk style",
    model_name=SandboxModel.FLUX,
    sandbox_id=sandbox.id,
)

image = job.wait(timeout=900, interval=5)
print(image.id)

Image generation with config

python

job = coll.generate_image(
    prompt="A photorealistic portrait of a robot reading a book in a cozy library",
    model_name=SandboxModel.FLUX,
    sandbox_id=sandbox.id,
    config={
        "size": "1024x1536",
        "num_inference_steps": 50,
        "guidance_scale": 4.0,
        "negative_prompt": "blurry, low quality, watermark",
    },
)

Combining generated assets

You can generate a FLUX image and OmniVoice narration on the same sandbox, then compose them with videodb.editor:

python

from videodb.editor import Timeline, Track, Clip, ImageAsset, AudioAsset, Fit

image_job = coll.generate_image(
    prompt="A dramatic mountain landscape at dawn",
    model_name=SandboxModel.FLUX,
    sandbox_id=sandbox.id,
    config={"size": "1280x720", "num_inference_steps": 28},
)
image = image_job.wait(timeout=900, interval=5)

audio_job = coll.generate_voice(
    text="Witness the breathtaking beauty of dawn over the mountains.",
    model_name=SandboxModel.OMNIVOICE,
    sandbox_id=sandbox.id,
    config={"instructions": "female, young adult, calm and cinematic"},
)
audio = audio_job.wait(timeout=900, interval=5)

timeline = Timeline(conn)
timeline.resolution = "1280x720"
timeline.background = "#000000"

image_track = Track()
image_track.add_clip(0, Clip(asset=ImageAsset(id=image.id), duration=float(audio.length), fit=Fit.crop))

audio_track = Track()
audio_track.add_clip(0, Clip(asset=AudioAsset(id=audio.id), duration=float(audio.length)))

timeline.add_track(image_track)
timeline.add_track(audio_track)

stream_url = timeline.generate_stream()
player_url = f"https://player.videodb.io/watch?v={stream_url}"
print(player_url)

Pricing and limits

Hackathon sandbox compute is charged against your credits based on runtime.

Pricing

Sandbox tier	Price
`small`	$1 / hour
`medium`	$3.50 / hour
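To budget a session, the rates in the pricing table above can be turned into a quick estimate. This sketch assumes charges scale linearly with runtime; the guide only says billing is based on runtime, so the actual billing granularity may differ. HOURLY_RATE_USD, estimate_cost, and hours_available are hypothetical names.

```python
HOURLY_RATE_USD = {"small": 1.00, "medium": 3.50}  # from the pricing table


def estimate_cost(tier, runtime_seconds):
    """Estimated charge in USD for one sandbox, assuming linear billing."""
    return HOURLY_RATE_USD[tier] * runtime_seconds / 3600


def hours_available(tier, credits_usd):
    """Hours one sandbox of `tier` can run on `credits_usd` of credits."""
    return credits_usd / HOURLY_RATE_USD[tier]


print(f"${estimate_cost('medium', 2 * 3600):.2f}")  # $7.00
print(f"{hours_available('small', 1000):.0f} h")    # 1000 h
```

Running several sandboxes in parallel multiplies the hourly spend accordingly.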

Concurrent sandbox limits

Sandbox tier	Parallel sandbox limit
`small`	4
`medium`	2
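A pre-flight check against the limits above can avoid a failed create call. The sketch below works on plain (tier, status) pairs so that it runs offline. It assumes that sandboxes in "provisioning" or "active" state count toward the limit, which this guide does not confirm, and can_create is a hypothetical helper.

```python
from collections import Counter

PARALLEL_LIMIT = {"small": 4, "medium": 2}  # from the table above
# Assumption: only these states count toward the parallel limit.
COUNTED_STATUSES = {"provisioning", "active"}


def can_create(tier, existing):
    """Return True if one more `tier` sandbox fits under the limit.

    `existing` is an iterable of (tier, status) string pairs.
    """
    in_use = Counter(t for t, status in existing if status in COUNTED_STATUSES)
    return in_use[tier] < PARALLEL_LIMIT[tier]


existing = [("medium", "active"), ("medium", "provisioning"), ("small", "active")]
print(can_create("medium", existing))  # False
print(can_create("small", existing))   # True
```

With the SDK, the pairs could be built from conn.list_sandboxes() using each sandbox's tier and status attributes, converted to strings in whatever form the SDK returns them.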

Need more? If your project needs higher limits, please contact the VideoDB team at team@videodb.io.

Hackathon credits: Every registered team gets $1,000 in sandbox compute credits to spend across the weekend. The unlock link goes live 1 hour before kickoff.


Best practices

Create one sandbox per session/workflow and reuse it for compatible jobs.
Always wait for the sandbox to be active before submitting jobs.
Pass sandbox_id=sandbox.id explicitly for sandbox-backed jobs.
Select a tier based on the heaviest model in your workflow.
Use job.wait(timeout=900, interval=5) for long-running generation jobs.
Stop the sandbox after use to avoid unnecessary runtime billing and conserve your hackathon credits.
Keep the sandbox ID in logs so jobs can be debugged or retried.
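The last two practices, waiting with an explicit timeout and keeping the sandbox ID in logs, can be combined in one wrapper. This is a minimal sketch: wait_logged is a hypothetical helper, it assumes only the job.wait(timeout=..., interval=...) call shown in this guide, and FakeJob is a stand-in so the example runs offline.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("sandbox-jobs")


def wait_logged(job, sandbox_id, label, timeout=900, interval=5):
    """Wait for `job`, logging the sandbox ID before, after, and on failure."""
    log.info("waiting label=%s sandbox_id=%s", label, sandbox_id)
    try:
        result = job.wait(timeout=timeout, interval=interval)
    except Exception:
        # Log with traceback, then re-raise so the caller can retry or abort.
        log.exception("failed label=%s sandbox_id=%s", label, sandbox_id)
        raise
    log.info("done label=%s sandbox_id=%s", label, sandbox_id)
    return result


class FakeJob:
    """Stand-in for a generation job; returns a canned result."""

    def __init__(self, result):
        self._result = result

    def wait(self, timeout, interval):
        return self._result


result = wait_logged(FakeJob("audio-123"), sandbox_id="sb-demo", label="tts")
print(result)
```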

Need help?

If you run into any issue with sandbox setup, model access, indexing, generation, or credits, reach out to the VideoDB team at team@videodb.io or post in the hackathon Discord, which is the fastest way to get unblocked. If you're just getting started, try the sandbox notebook in Colab first; it is the reference implementation.