Combining Data Sources Without the Pain: How Milo’s Query Engine Works

Most analytics teams hit the same wall: data lives everywhere. Customer records in Postgres, financials in a spreadsheet, behavioral data in a data warehouse. Joining them means either long engineering cycles or fragile ad-hoc scripts. The result? Analysts spend more time moving and cleaning data than analyzing it.

Why Combining Data Sources Is Hard

Traditional BI tools typically assume a single clean source. The moment you need to cross-analyze Postgres tables with spreadsheet data or mix data warehouse facts with operational logs, you end up writing glue code. That often means:

  • Exporting CSVs and manually joining them
  • Building one-off ETL pipelines
  • Writing Python scripts for transformations and statistical cleanup

These approaches are slow, brittle, and distract from the actual business question.

Enter Milo’s Query Engine

To solve this, we built a multi-source query engine directly into Milo. It lets you run a single SQL-like query across heterogeneous data sources, without standing up a separate ETL pipeline.

Here’s how it works under the hood:

  • Adapted SQL dialect: We extended SQL to allow joins and filters across sources (Postgres, spreadsheets, warehouses).
  • DuckDB foundation: The engine is built on DuckDB, a high-performance in-process analytical database. It parses queries and prunes them to read only the minimum data required.
  • Containerized execution: Queries run inside the same container that executes Milo’s Python code, ensuring consistency, isolation, and performance.

The result is fast, efficient, persistent queries across multiple data sources, without writing Python unless you want to.
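
To make the idea concrete, here is a minimal sketch of what a cross-source join looks like when built on DuckDB's Python API. This is not Milo's actual dialect or engine internals; the connection string, table names, and file path are hypothetical, and it assumes DuckDB's postgres extension is available.

```python
# Illustrative sketch only: Milo's engine and SQL dialect are internal, but the
# underlying idea can be reproduced with DuckDB's Python API. Connection
# details, table names, and file paths below are hypothetical.
import duckdb

con = duckdb.connect()  # in-process DuckDB session

# Attach a Postgres database via DuckDB's postgres extension.
con.execute("INSTALL postgres;")
con.execute("LOAD postgres;")
con.execute(
    "ATTACH 'dbname=appdb user=analyst host=localhost' AS pg (TYPE postgres);"
)

# Join operational customer records in Postgres with a finance spreadsheet
# exported to CSV. DuckDB pushes filters down and reads only needed columns.
result = con.execute(
    """
    SELECT c.customer_id,
           c.region,
           SUM(f.invoice_amount) AS total_invoiced
    FROM pg.public.customers AS c
    JOIN read_csv_auto('finance_2024.csv') AS f
      ON f.customer_id = c.customer_id
    WHERE c.region = 'EMEA'
    GROUP BY c.customer_id, c.region
    ORDER BY total_invoiced DESC
    """
).fetchdf()

print(result.head())
```

In Milo, none of this plumbing is written by the analyst: the engine handles source attachment and query execution inside its own environment, so the analyst only describes the question.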

Three Ways Milo Gets Data

The new query engine is part of a broader toolkit. Milo dynamically decides the best approach depending on the problem:

  1. Direct SQL queries
  • Runs native SQL against a single database.
  • Fastest option for same-source data without transformations.
  2. Python code generation
  • When data is messy or external APIs are needed, Milo writes Python.
  • This enables data cleaning, enrichment, statistical analysis, and custom transformations.
  3. The Query Engine
  • The sweet spot for cross-source joins.
  • Lets you query Postgres + spreadsheets + warehouse tables together in one statement.
  • No Python boilerplate, no ad-hoc pipelines.

By switching intelligently between these three modes, Milo balances speed, flexibility, and reliability.
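
For a sense of what the second mode covers, here is a minimal, hypothetical sketch of the kind of cleanup-and-summary code it handles. The file name, columns, and rules are illustrative assumptions, not output generated by Milo.

```python
# Minimal sketch of the kind of work the Python mode handles; the file name,
# columns, and cleanup rules are hypothetical.
import pandas as pd

# Load a messy export: inconsistent casing, missing values, duplicate rows.
orders = pd.read_csv("orders_export.csv")

# Basic cleanup: normalize text columns, drop duplicates, fill gaps.
orders["region"] = orders["region"].str.strip().str.upper()
orders = orders.drop_duplicates(subset=["order_id"])
orders["discount"] = orders["discount"].fillna(0.0)

# Light enrichment and summary statistics before analysis continues.
orders["net_revenue"] = orders["gross_revenue"] - orders["discount"]
summary = orders.groupby("region")["net_revenue"].agg(["count", "mean", "sum"])

print(summary)
```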

Why This Matters

For data teams and business users, the implications are big:

  • Less engineering overhead: Analysts no longer need to stitch sources with custom code.
  • Faster insights: DuckDB’s in-process, vectorized execution minimizes latency by scanning only the data each query needs.
  • Cleaner workflows: Use Python only when data requires heavy cleaning or statistical modeling.
  • Scalable governance: All queries run in Milo’s controlled environment, making them auditable and repeatable.

The Future of Multi-Source Analytics

Modern businesses rarely have a single source of truth. The ability to query across sources efficiently, without building fragile pipelines, is becoming a baseline requirement.

Milo’s query engine makes multi-source analytics as simple as writing a prompt. Combine Postgres with spreadsheets, or join CRM data from your warehouse with operational logs, all in a single request. When data is clean, Milo uses SQL. When it’s messy, it switches to Python. And when you need cross-source joins, the new query engine takes over.

It’s the shortest path from data everywhere to answers anywhere.

Bring Generative BI to Your Team

If you found this article useful, imagine what Milo could do for your business. Our team will walk you through a personalized demo.
