2025

2025

2025

Combining Data Sources Without the Pain: How Milo’s Query Engine Works

Most analytics teams hit the same wall: data lives everywhere. Customer records in Postgres, financials in a spreadsheet, behavioral data in a data warehouse. Joining them means either long engineering cycles or fragile ad-hoc scripts. The result? Analysts spend more time moving and cleaning data than analyzing it.

Phillipp Buschhaus

Co-Founder & CTO

Why Combining Data Sources Is Hard

Traditional BI tools typically assume a single clean source. The moment you need to cross-analyze Postgres tables with spreadsheet data or mix data warehouse facts with operational logs, you end up writing glue code. That often means:

  • Exporting CSVs and manually joining them

  • Building one-off ETL pipelines

  • Writing Python scripts for transformations and statistical cleanup

These approaches are slow, brittle, and distract from the actual business question.

Enter Milo’s Query Engine

To solve this, we built a multi-source query engine directly into Milo. It lets you run a single SQL-like query across heterogeneous data sources, without standing up a separate ETL pipeline.

Here’s how it works under the hood:

  • Adapted SQL dialect: We extended SQL to allow joins and filters across sources (Postgres, spreadsheets, warehouses).

  • DuckDB foundation: The engine is built on DuckDB, a high-performance in-memory analytics database. It parses queries and prunes them to read only the minimum data required.

  • Containerized execution: Queries run inside the same container that executes Milo’s Python code, ensuring consistency, isolation, and performance.

The result is fast, efficient, persistent queries across multiple data sources, without writing Python unless you want to.

Three Ways Milo Gets Data

The new query engine is part of a broader toolkit. Milo dynamically decides the best approach depending on the problem:

  1. Direct SQL queries

  • Runs native SQL against a single database.

  • Fastest option for same-source data without transformations.

  1. Python code generation

  • When data is messy or external APIs are needed, Milo writes Python.

  • This enables data cleaning, enrichment, statistical analysis, and custom transformations.

  1. The Query Engine

  • The sweet spot for cross-source joins.

  • Lets you query Postgres + spreadsheets + warehouse tables together in one statement.

  • No Python boilerplate, no ad-hoc pipelines.

By switching intelligently between these three modes, Milo balances speed, flexibility, and reliability.

Why This Matters

For data teams and business users, the implications are big:

  • Less engineering overhead: Analysts no longer need to stitch sources with custom code.

  • Faster insights: DuckDB’s in-memory execution minimizes latency by scanning only what’s needed.

  • Cleaner workflows: Use Python only when data requires heavy cleaning or statistical modeling.

  • Scalable governance: All queries run in Milo’s controlled environment, making them auditable and repeatable.

The Future of Multi-Source Analytics

Modern businesses rarely have a single source of truth. The ability to query across sources efficiently, without building fragile pipelines, is becoming a baseline requirement.

Milo’s query engine makes multi-source analytics as simple as writing a Prompt. Combine Postgres with spreadsheets, or join CRM data from your warehouse with operational logs, all inside one prompt. When data is clean, Milo uses SQL. When it’s messy, it switches to Python. And when you need cross-source joins, the new query engine takes over.

It’s the shortest path from data everywhere to answers anywhere.

More insights to explore

Discover additional case studies, tips, and strategies to help your business grow and innovate faster.

Minimalist Note Card Composition

2025

The Real Win in Operations: Getting Instant Answers, Not Better Dashboards

"How do I get to the point where it just tells me what I need to know?" That's the question I hear from every operations leader we speak to. Not "give me more reports." Not "help me build a dashboard." Just: tell me what matters, now, without me having to dig for it or ask some one else...

Minimalist Note Card Composition

2025

The Real Win in Operations: Getting Instant Answers, Not Better Dashboards

"How do I get to the point where it just tells me what I need to know?" That's the question I hear from every operations leader we speak to. Not "give me more reports." Not "help me build a dashboard." Just: tell me what matters, now, without me having to dig for it or ask some one else...

Minimalist Note Card Composition

2025

The Real Win in Operations: Getting Instant Answers, Not Better Dashboards

"How do I get to the point where it just tells me what I need to know?" That's the question I hear from every operations leader we speak to. Not "give me more reports." Not "help me build a dashboard." Just: tell me what matters, now, without me having to dig for it or ask some one else...

Green Plant

2025

The CEO Who Stopped Asking for Permission to See Her Own Data

Even with a full data team and tools, CEOs are bottlenecked from accessing basic metrics, stuck waiting days for answers. This access barrier hinders decisions and wastes analyst time. The solution is true self-serve analytics, giving leaders instant, plain-language data access to speed up the business and empower analysts.

Green Plant

2025

The CEO Who Stopped Asking for Permission to See Her Own Data

Even with a full data team and tools, CEOs are bottlenecked from accessing basic metrics, stuck waiting days for answers. This access barrier hinders decisions and wastes analyst time. The solution is true self-serve analytics, giving leaders instant, plain-language data access to speed up the business and empower analysts.

Green Plant

2025

The CEO Who Stopped Asking for Permission to See Her Own Data

Even with a full data team and tools, CEOs are bottlenecked from accessing basic metrics, stuck waiting days for answers. This access barrier hinders decisions and wastes analyst time. The solution is true self-serve analytics, giving leaders instant, plain-language data access to speed up the business and empower analysts.

Create a free website with Framer, the website builder loved by startups, designers and agencies.