Agentic Visual Reporting
Winner of the IEEE VIS 2025 VISxGenAI Challenge. An agentic system addressing the trade-off between speed and reliability in data analysis. By combining AI for creative tasks with deterministic modules for visualization, it generates both explorable reports for readers and verifiable notebooks for analysts. This hybrid approach explores a more transparent and adaptable model for human-AI partnership in generating insights.
Most data analysis workflows force a difficult trade-off between speed and reliability. Manual analysis produces trustworthy results but doesn’t scale, while automated tools are fast but often yield black-box outputs that are difficult to verify or adapt. This project explores a different approach to resolve this tension.
The system uses a pipeline of eleven specialized AI agents, but its core design principle is a hybrid architecture. It delegates creative and interpretive work, such as planning insights and generating narratives, to AI, while assigning tasks that demand precision and consistency to deterministic components. This separation of concerns is fundamental to producing results that are both insightful and verifiable.
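As a rough illustration of this split, the sketch below pairs a stubbed "creative" step with a deterministic chart-building step. Every name here (`InsightPlan`, `plan_insight_stub`, `build_chart_spec`, `run_pipeline`) is hypothetical and not taken from the project; the stub simply stands in for an LLM-backed agent, and the rule table is a simplified assumption about how a deterministic module might behave.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InsightPlan:
    """Output of the creative step: what to show, not how to draw it."""
    question: str
    x_field: str
    y_field: str


def plan_insight_stub(dataset_summary: str) -> InsightPlan:
    # Hypothetical stand-in for an LLM-backed agent; a real agent would call a model.
    return InsightPlan(
        question=f"How does revenue vary by region in {dataset_summary}?",
        x_field="region",
        y_field="revenue",
    )


def build_chart_spec(plan: InsightPlan, field_types: dict) -> dict:
    """Deterministic step: the same plan and field types always give the same spec."""
    x_type = field_types[plan.x_field]
    y_type = field_types[plan.y_field]
    if x_type == "nominal" and y_type == "quantitative":
        mark = "bar"
    elif x_type == "temporal" and y_type == "quantitative":
        mark = "line"
    else:
        mark = "point"
    return {
        "mark": mark,
        "encoding": {
            "x": {"field": plan.x_field, "type": x_type},
            "y": {"field": plan.y_field, "type": y_type},
        },
        "title": plan.question,
    }


def run_pipeline(dataset_summary: str, field_types: dict, planner=plan_insight_stub) -> dict:
    # The planner is swappable (creative side); the spec builder is fixed (deterministic side).
    plan = planner(dataset_summary)
    return build_chart_spec(plan, field_types)


if __name__ == "__main__":
    types = {"region": "nominal", "revenue": "quantitative"}
    print(run_pipeline("the sales dataset", types))
```

Because the chart specification depends only on the plan and the field types, an analyst can re-run the deterministic half and get an identical result, which is the property that makes the output checkable.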
The result is a set of two complementary outputs that serve different needs. For readers, it generates interactive web reports for exploring data beyond the initial analysis. For analysts, it produces executable
The system is built with the principles of real-world observability and performance in mind. Every AI interaction and data transformation is tracked using tools like
Invited for live presentation at the IEEE VISxGenAI 2025 Workshop Challenge, the project serves as a prototype for how AI can augment rather than replace human analytical capabilities. It’s an exploration of how modular design, AI orchestration, and modern engineering practices can be combined to build tools that are not just powerful, but also transparent and adaptable.
Stack
While the problem is more important than the tools, the tech stack tells a story about the project's architecture and trade-offs. Here's what this project is built on:
Platforms & Runtimes
Runs the 11-agent orchestration, data processing, and report generation pipeline, as well as the provenance-tracking notebooks
Used as the browser-compatible Python runtime for dynamically generated Marimo notebooks
Solves dynamically-constructed logic programs for finding optimal visualizations adhering to human-centric design guidelines expressed as formal constraints
Executes documentation and Observable build tools for report artifacts and site generation
Powers the Observable Notebook builder service as well as the VitePress configuration for generating the docs as a static site
Powers the programmatically-constructed Observable Notebooks, making generated reports interactive
Frontend & Visualization
Canvas for interactive HTML reports that's convenient for analysts to modify and for readers to explore
Enables reader-driven exploration with coordinated views and cross-filtering in generated reports
Synthesizes visualization specifications from constraints to produce principled charts in reports in a matter of milliseconds
Renders charts defined by Draco-derived specifications for final report visuals
Powers interactive components within the documentation site and gallery
Builds the documentation site and gallery as a statically generated web experience
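To give a feel for what synthesizing a chart from constraints means, here is a toy sketch in plain Python. It is not Draco's API and does not use a logic-program solver; the rules, weights, and function names (`violates_hard`, `soft_cost`, `best_mark`) are illustrative assumptions. Hard rules discard invalid candidates, soft rules add a cost, and the cheapest remaining candidate wins.

```python
MARKS = ("bar", "line", "point")


def violates_hard(mark: str, x_type: str, y_type: str) -> bool:
    """Hard rules (illustrative): a candidate breaking any of them is discarded."""
    # A line implies an ordered x axis, so it cannot sit on a nominal field.
    if mark == "line" and x_type == "nominal":
        return True
    # A bar needs at least one quantitative axis to give its length meaning.
    if mark == "bar" and "quantitative" not in (x_type, y_type):
        return True
    return False


def soft_cost(mark: str, x_type: str, y_type: str) -> int:
    """Soft rules (illustrative): each violated preference adds to the cost."""
    cost = 0
    if x_type == "temporal" and mark != "line":
        cost += 2  # trends over time usually read best as lines
    if x_type == "nominal" and y_type == "quantitative" and mark != "bar":
        cost += 2  # category comparisons usually read best as bars
    if x_type == "quantitative" and y_type == "quantitative" and mark != "point":
        cost += 1  # two measures usually read best as a scatter plot
    return cost


def best_mark(x_type: str, y_type: str) -> str:
    """Return the lowest-cost mark among candidates that pass every hard rule."""
    candidates = [m for m in MARKS if not violates_hard(m, x_type, y_type)]
    if not candidates:
        raise ValueError("no mark satisfies the hard constraints")
    # Ties are broken by position in MARKS so the result is deterministic.
    return min(candidates, key=lambda m: (soft_cost(m, x_type, y_type), MARKS.index(m)))


if __name__ == "__main__":
    for pair in [("nominal", "quantitative"), ("temporal", "quantitative")]:
        print(pair, "->", best_mark(*pair))
```

A real constraint solver searches a far larger design space (channels, scales, aggregation), but the filter-then-minimize shape is the same idea in miniature.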
AI & Machine Learning
Orchestrates 11 specialized AI agents with modular prompts and tracing for automated report generation
Powers dataset description, insight planning, and narrative generation steps within the agent pipeline
Supports coding-heavy agent tasks alongside other LLM providers with interchangeable routing
Provides LLM inference with a native Google Search tool for mapping codes to human-readable labels
Provides an additional managed LLM inference backend used selectively for agent runs
Aggregates alternative LLMs for flexible model selection across agent stages via uniform API
Data Engineering
Powers both on-server and in-browser SQL queries for interactive data exploration in generated reports via WASM runtime
Applies transformations to the raw input dataset automatically based on agent-generated metadata
Used for numerical operations when computing dataset statistics to inform insight discovery
Stores dataset artifacts so that they can be efficiently queried and processed even outside the context of the agent pipeline
Used to read remote datasets in Pyodide-backed notebooks to work around networking limitations of in-browser DuckDB and Polars
Used to pass data between different engines without memory copies, e.g. loading data into DuckDB but then processing it with Polars
Backend & APIs
Validates and serializes agent inputs and outputs, and enables attribute-driven prompt design
Used to collect agent traces in an industry-standard format for observability and debugging
Used as the Python API to render Vega-Lite visualizations from Draco specifications
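The idea of validating agent outputs while driving prompts from schema attributes can be sketched with the standard library alone. The example below uses `dataclasses` as a stand-in for whichever schema library the project relies on; the `DatasetDescription` class, its fields, and the helper functions are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import asdict, dataclass, field, fields


@dataclass
class DatasetDescription:
    """Hypothetical agent output schema; field metadata doubles as prompt text."""
    title: str = field(metadata={"description": "Short human-readable dataset title"})
    row_count: int = field(metadata={"description": "Number of rows in the dataset"})

    def __post_init__(self):
        # Minimal validation: every value must match its annotated type.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, got {type(value).__name__}"
                )


def output_instructions(schema) -> str:
    """Build the prompt's output section from the schema's own attributes."""
    lines = [
        f"- {f.name} ({f.type.__name__}): {f.metadata['description']}"
        for f in fields(schema)
    ]
    return "Return a JSON object with these keys:\n" + "\n".join(lines)


def parse_agent_output(raw: str) -> DatasetDescription:
    """Parse and validate a raw JSON string returned by an agent."""
    return DatasetDescription(**json.loads(raw))


if __name__ == "__main__":
    print(output_instructions(DatasetDescription))
    parsed = parse_agent_output('{"title": "Sales", "row_count": 10}')
    print(json.dumps(asdict(parsed)))
```

Keeping the field descriptions on the schema means the prompt and the validator cannot drift apart: changing a field updates both at once.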
External Services
Tracks traces, token usage, and latency across all agents for observability and QA of generations
Manages the API keys that agents use when submitting to the challenge's evaluation server
Provides web analytics capabilities for tracking user interactions and engagement on the documentation site
Cloud & DevOps
Hosts report assets and artifacts using S3-compatible storage with CDN links referenced in outputs
Self-hosts the Langfuse, Infisical, and Umami instances and the Observable Notebook Builder Node.js service
Used for containerized deployment of the Observable Notebook Builder Node.js service on Coolify
Automates building the VitePress documentation site and deploying it to GitHub Pages
Development Tooling
Installs and resolves Python dependencies rapidly for local dev and reproducible runs
Lints Python sources for the agentic system to maintain code quality
Notebook environment that exposes end-to-end pipelines and enables interactive & reactive agent development and debugging
Installs and manages Node.js dependencies for docs and report build tooling
Builds and previews the documentation site and related frontend assets
Formats and lints JavaScript/TypeScript code for the documentation and build tooling