
Hands-on Workshop: Give your AI Agents Eyes and Ears (Perception Layer 101)

Mar 19, 2026, 6:00 PM – 9:00 PM
Shibuya

LLMs gave us reasoning. RAG gave us retrieval. Tool calling gave us action. What’s missing in the modern agent stack is perception: the ability to see, hear, and remember the world as it happens.

This workshop is a practical walkthrough of building a perception layer for agents using VideoDB. You’ll learn how to convert continuous media (screen, mic, camera, RTSP, files) into structured context your agent can use:

  • Indexes (searchable understanding)
  • Events (real-time triggers)
  • Memory (episodic recall with playable evidence)


We’ll implement the core loop:

Continuous Media → Perception Layer (VideoDB) → Agent (reasoning + action) → Output grounded in evidence
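The core loop can be sketched in plain Python. This is a hypothetical, self-contained illustration: the names (`Moment`, `PerceptionLayer`, `ingest`, `search`, `agent_step`) are invented for this sketch and are not the VideoDB API, which the workshop covers in detail.

```python
from dataclasses import dataclass

@dataclass
class Moment:
    """A timestamped slice of media: the unit of 'playable evidence'."""
    start: float          # seconds into the stream
    end: float
    description: str      # what was seen/heard in this window

class PerceptionLayer:
    """Turns a stream of Moments into indexes, events, and memory."""
    def __init__(self, event_rules: dict[str, str]):
        self.event_rules = event_rules   # keyword -> event name
        self.memory: list[Moment] = []   # episodic memory

    def ingest(self, moment: Moment) -> list[str]:
        """Store the moment (memory/index) and fire any matching events."""
        self.memory.append(moment)
        return [name for kw, name in self.event_rules.items()
                if kw in moment.description.lower()]

    def search(self, query: str) -> list[Moment]:
        """Searchable understanding: recall moments, with evidence attached."""
        return [m for m in self.memory
                if query.lower() in m.description.lower()]

def agent_step(perception: PerceptionLayer, moment: Moment) -> str:
    """Agent reacts to real-time events, grounding output in retrieved moments."""
    events = perception.ingest(moment)
    if "error_on_screen" in events:
        evidence = perception.search("error")[-1]
        return f"Alert: error seen at {evidence.start:.0f}s-{evidence.end:.0f}s"
    return "ok"

# Feed two moments from a (simulated) screen capture:
p = PerceptionLayer(event_rules={"error": "error_on_screen"})
print(agent_step(p, Moment(0, 5, "User opens the dashboard")))     # ok
print(agent_step(p, Moment(5, 10, "Stack trace / error dialog")))  # Alert: error seen at 5s-10s
```

In a real deployment the `ingest` and `search` steps would be backed by a perception service such as VideoDB rather than an in-process list; the point is the shape of the loop, not the storage.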

Who should attend:

  • Engineers building agents that need continuous and temporal awareness (not one-shot screenshots).
  • Research teams building in physical AI, desktop robots, and wearables.
  • Product teams building meeting bots, desktop copilots, monitoring/ops, and QA/compliance.
  • Founders building multimodal apps where “show me the moment” matters.

What You’ll Discover:

  • What “perception” actually means for agents: continuous, temporal, multi-source, searchable, actionable.
  • How to support three input modes with one mental model: files, live streams, and desktop capture.
  • How to build searchable memory so your agent can retrieve results with playable evidence, not vibes.
  • How to move from batch video AI to real-time event streams your agent can react to immediately.

Plus:

  • A starter template you can reuse: “Index + Events + Memory” as the default perception stack
  • Networking with builders working on agents + multimodal infra

Learn more and register here.
