Overview

Open-source data engineering harness.

100+ specialized data engineering tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across platforms, independent of any single warehouse provider.


npm install -g altimate-code

Why Altimate Code?

Every major data platform is building AI agents, but they're all locked to one ecosystem. Your data stack isn't. Altimate Code connects to your entire stack and lets you bring any LLM. No vendor lock-in, no platform tax.

Bring Your Own LLM

Works with Anthropic, OpenAI, Google, AWS Bedrock, Azure, Ollama, and 10+ more providers. Swap models without swapping your harness.

Cross-Platform

Works with Claude Code, Cursor, Windsurf, VS Code, and any MCP-compatible client. Runs in the terminal, IDE, CI, or web. One install, everywhere.

100+ Deterministic Tools

SQL analysis, column-level lineage, dbt integration, FinOps, warehouse connectivity. Purpose-built for data work, not hallucinated by a model.

Validation Layer

SQL, lineage, and equivalence checks run in compiled Rust, not the model. 100% F1 across 1,077 anti-pattern queries, ~2 ms each, zero tokens.
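
To make the F1 figure concrete, here is a minimal sketch of how precision, recall, and F1 are typically computed for anti-pattern detection. It is not the project's evaluation harness: the function name, the query IDs, and the rule labels are all hypothetical, and findings are assumed to be (query id, rule) pairs. An F1 of 100% means the detector reported every labeled anti-pattern and nothing else.

```python
from __future__ import annotations


def precision_recall_f1(expected: set, predicted: set) -> tuple[float, float, float]:
    """Score predicted findings against labeled findings.

    Both arguments are sets of (query_id, rule) pairs (an assumed format).
    """
    true_pos = len(expected & predicted)    # reported and labeled
    false_pos = len(predicted - expected)   # reported but not labeled
    false_neg = len(expected - predicted)   # labeled but missed

    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 1.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 1.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels: three queries, each with one known anti-pattern.
EXPECTED = {("q1", "select_star"), ("q2", "cartesian_join"), ("q3", "missing_limit")}

# A hypothetical imperfect detector: misses q3 and raises a false alarm on q4.
NOISY = {("q1", "select_star"), ("q2", "cartesian_join"), ("q4", "select_star")}

if __name__ == "__main__":
    print(precision_recall_f1(EXPECTED, EXPECTED))  # perfect detector
    print(precision_recall_f1(EXPECTED, NOISY))     # one miss, one false alarm
```

A single false positive or a single miss pulls F1 below 100%, which is why the metric is stricter than counting correct detections alone.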

Token Efficiency

Context compaction trims the schema payload per task. The model gateway routes each call to the cheapest model that clears your accuracy bar.
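
The routing rule described above can be sketched as a small selection function. This only illustrates the idea of "cheapest model that clears the bar" and is not the gateway's actual implementation: the `ModelOption` type, the model names, the prices, and the accuracy numbers are all made up for the example, and accuracy is assumed to come from your own evaluation set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                   # hypothetical model label
    cost_per_1k_tokens: float   # illustrative price, not a real quote
    accuracy: float             # measured on your own eval set, in [0, 1]


def cheapest_model_meeting_bar(options, accuracy_bar):
    """Return the lowest-cost option whose accuracy is at least accuracy_bar."""
    eligible = [m for m in options if m.accuracy >= accuracy_bar]
    if not eligible:
        raise ValueError(f"no model reaches accuracy {accuracy_bar}")
    # Sort by cost first; on a cost tie, prefer the more accurate model.
    return min(eligible, key=lambda m: (m.cost_per_1k_tokens, -m.accuracy))


# Made-up candidates for illustration only.
OPTIONS = [
    ModelOption("small", 0.2, 0.81),
    ModelOption("medium", 1.0, 0.90),
    ModelOption("large", 5.0, 0.96),
]

if __name__ == "__main__":
    for bar in (0.80, 0.85, 0.95):
        print(bar, cheapest_model_meeting_bar(OPTIONS, bar).name)
```

Raising the bar moves the choice toward more expensive models, and a bar that no model meets fails loudly instead of silently falling back.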

Data Governance

Built-in PII detection, policy enforcement, and compliance validation across your data stack. Three agent modes — Builder, Analyst, Plan — with tool-level permissions.

Open source & auditable

Every tool, prompt, and rule is inspectable on GitHub. A requirement for regulated industries, not a nice-to-have.

Customizable to your workflow

Bring your own rules, agents, skills, and tools. Match your company's data conventions and testing patterns.

100+ specialized tools

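Unlike general-purpose coding agents, every tool is purpose-built for data engineering workflows. The tool groups cover SQL anti-pattern detection, live column-level lineage, FinOps and cost analysis, cross-dialect translation, PII detection and safety, and native dbt support.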

See it in action

Build dbt models from Jira tickets, find broken Snowflake views, optimize warehouse costs, migrate PySpark to dbt, debug Airflow DAGs, and more — all from your terminal.

# Analyze a query for anti-patterns and optimization opportunities
> Analyze this query for issues: <query code> or <query id from warehouse>

# Translate SQL across dialects
> /sql-translate this Snowflake query to BigQuery: <query code>

# Get a cost report for your Snowflake or Databricks account
> /cost-report

# Scaffold a new dbt model following your project patterns
> /model-scaffold fct_revenue from stg_orders and stg_payments

# Generate a column-level lineage report for sensitive columns
# in a particular table and identify their owners
> Trace the lineage of the email_id and name columns from the
  customer_data.customer_info table and generate a report
  of where the sensitive data is replicated, including table owners

# Migrate PySpark jobs to dbt models
> Migrate this PySpark ETL to a dbt model: <path to PySpark file>

# Debug a failing Airflow DAG
> Debug this Airflow DAG failure: <DAG id or error log>

Browse more examples

Benchmarks

Precision matters. Here's where we stand.

Benchmark	Result
ADE-Bench (DuckDB Local)	74.4% pass rate (32/43 tasks) — 15.4 points ahead of dbt Fusion+MCP (59%).
SQL Anti-Pattern Detection	100% accuracy across 1,077 queries, 19 categories. Zero false positives.
Column-Level Lineage	100% edge match across 500 queries with complex joins, CTEs, and subqueries.
Snowflake Query Optimization (TPC-H)	16.8% average execution speedup (3.6x vs baseline).
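
The ADE-Bench row can be checked with a few lines of arithmetic. This sketch uses only the numbers in the table and assumes the 59% comparison figure is already rounded as reported. The variable names are illustrative.

```python
# Figures copied from the ADE-Bench row of the table above.
passed_tasks = 32
total_tasks = 43
comparison_pass_rate = 59.0  # dbt Fusion+MCP, as reported (assumed rounded)

# Pass rate as a percentage, rounded to one decimal place.
pass_rate = round(100 * passed_tasks / total_tasks, 1)

# Lead in percentage points, not a relative percentage.
lead_points = round(pass_rate - comparison_pass_rate, 1)

if __name__ == "__main__":
    print(f"pass rate: {pass_rate}%")
    print(f"lead: {lead_points} points")
```

The lead is a difference in percentage points between the two pass rates, which is how the table states it.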

Full benchmark details
