
The Modern Python Data Stack in 2026

Posted on: February 6, 2026

A Complete Guide to Building Fast, Reproducible Data Projects

From Package Management to Publishing — Every Tool You Need


| Tool | Category | Replaces |
| --- | --- | --- |
| uv | Package & Project Management | pip, virtualenv, Poetry, pyenv |
| Ruff | Linting & Formatting | flake8, isort, black |
| ty | Type Checking | mypy, pyright |
| Positron IDE | Development Environment | VS Code, RStudio, JupyterLab |
| Marimo | Reactive Notebooks | Jupyter Notebook |
| Polars | DataFrame Processing | Pandas |
| DuckDB | Embedded SQL Analytics | SQLite (analytics), Spark (local) |
| Quarto | Publishing & Documentation | MkDocs, Jupyter Book, nbconvert |
| Evidence | BI & Data Dashboards | Power BI, Tableau, Metabase |

Introduction: Python’s Tooling Renaissance

Python has dominated data science, machine learning, and analytics for over a decade. But for much of that time, its developer tooling lagged behind the language’s ambitions. Dependency management was fragmented across pip, virtualenv, conda, and Poetry. Notebooks introduced reproducibility nightmares. Type checking felt like an afterthought. And publishing results required stitching together multiple disconnected tools.

In 2026, that story has fundamentally changed. A new generation of tools — many built in Rust for blazing speed, others rethinking entire workflows from scratch — has coalesced into a modern Python data stack that is fast, reproducible, and elegant.

This guide walks through each layer of this modern stack: from project setup with uv, through code quality with Ruff and ty, to reactive notebooks with Marimo, high-performance data processing with Polars and DuckDB, a purpose-built IDE with Positron, reproducible publishing with Quarto, and code-driven analytics with Evidence. Together, these tools form an integrated ecosystem where every piece works with the others.


1. uv — The Universal Python Project Manager

What it is: A Rust-based, all-in-one tool that replaces pip, virtualenv, Poetry, pyenv, and pipx with a single, blazing-fast command. Built by Astral, uv is 10–100x faster than pip and handles package installation, virtual environment creation, Python version management, and project scaffolding in one unified interface.

Why It Matters

Python’s packaging ecosystem has historically been one of its weakest points. The famous XKCD comic about Python environments resonated precisely because managing dependencies, virtual environments, and Python versions required juggling multiple tools with overlapping responsibilities. uv eliminates this entirely.

Key Features

Quick Start

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create a new project
uv init my-data-project
cd my-data-project

# Add dependencies (creates .venv automatically)
uv add polars duckdb marimo

# Run your script in the right environment
uv run python analysis.py

# Pin a Python version
uv python pin 3.12
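After these commands, the generated pyproject.toml looks roughly like this (a sketch: exact metadata and version pins will differ on your machine, and uv also writes a uv.lock file alongside it that pins the full dependency tree for reproducible installs):

```toml
[project]
name = "my-data-project"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "duckdb>=1.1",
    "marimo>=0.9",
    "polars>=1.9",
]
```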

🌐 docs.astral.sh/uv


2. Ruff — Lightning-Fast Linting and Formatting

What it is: An ultra-fast Python linter and code formatter written in Rust. Ruff replaces flake8, isort, and black in a single tool, running 10–100x faster while covering 800+ lint rules with auto-fix capabilities.

Why It Matters

Code quality tools are only effective if developers actually run them. Traditional Python linters were slow enough that developers would skip them during development and only run them in CI. Ruff is so fast it can run on every save without any perceptible delay, making code quality automatic rather than aspirational.

Key Features

Quick Start

# Lint and auto-fix
uvx ruff check --fix .

# Format code
uvx ruff format .
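Ruff reads its configuration from pyproject.toml (or a standalone ruff.toml). A minimal setup might look like the following; the rule selection shown here is just one reasonable starting point, not an official default:

```toml
[tool.ruff]
line-length = 100
target-version = "py312"

[tool.ruff.lint]
# E/W: pycodestyle, F: pyflakes, I: isort-style import sorting
select = ["E", "W", "F", "I"]
```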

🌐 docs.astral.sh/ruff


3. ty — Modern Type Checking at Rust Speed

What it is: A next-generation Python type checker from Astral (the makers of uv and Ruff), built in Rust. ty aims to replace mypy with dramatically faster performance and better developer experience, including a built-in language server for real-time IDE feedback.

Why It Matters

Type checking in Python has always been optional, and slow type checkers made it feel burdensome. ty changes the equation: it’s fast enough to run continuously in watch mode, providing instant feedback as you type. Combined with its built-in language server, type checking becomes part of the editing experience rather than a separate step.

Key Features

Quick Start

# One-time check
uvx ty check

# Continuous feedback
uvx ty check --watch
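To see the kind of mistake a type checker surfaces, here is a small, hedged example. This is ordinary annotated Python, not ty-specific output; any modern checker, ty included, should flag the bad call:

```python
def average(values: list[float]) -> float:
    """Mean of a list of floats."""
    return sum(values) / len(values)

# A type checker flags this call: list[str] is not
# assignable to list[float].
print(average(["10", "20", "30"]))
```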

Note: ty is still in beta. Introduce it gradually — first as an informational check, then as a CI gate once you’re confident in its signal-to-noise ratio.

🌐 docs.astral.sh/ty


4. Positron — The Data Science IDE

What it is: A free, next-generation IDE from Posit (formerly RStudio), built on VS Code’s open-source foundation but purpose-designed for data science. Positron treats Python and R as first-class citizens with native data exploration, variable inspection, plot management, and AI assistance built in — no extensions required.

Why It Matters

Data scientists have long been forced to choose: VS Code offers extensibility but requires plugins for basic data work; JupyterLab excels at exploration but lacks IDE power; RStudio is purpose-built but R-centric. Positron is the first IDE that combines VS Code’s extensibility with RStudio’s data-first design, while treating Python and R equally. Released as stable in 2025, it represents where data science development is heading.

Key Features

Quick Start

# Download from positron.posit.co
# Open your uv project folder
# Positron auto-detects your Python environment
# Start exploring data with the built-in Data Explorer

🌐 positron.posit.co


5. Marimo — Reactive Notebooks Done Right

What it is: A reactive Python notebook that solves Jupyter’s reproducibility problems. Marimo notebooks are stored as pure .py files (not JSON), execute deterministically based on a dependency graph, and can be deployed as interactive web apps or run as scripts.

Why It Matters

Jupyter notebooks are powerful for exploration but carry well-known problems: hidden state from out-of-order execution, JSON files that create merge conflicts in Git, and no built-in way to deploy work as applications. Marimo rethinks the notebook from the ground up. Every cell’s dependencies are tracked automatically. When you update a cell, all dependent cells re-execute or are marked stale. There’s no hidden state, no “run all cells” rituals, and no phantom bugs.
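A saved Marimo notebook is an ordinary Python file in which each cell is a function and the dependency graph is encoded in the function signatures. A minimal sketch (the exact boilerplate Marimo generates varies by version):

```python
import marimo

app = marimo.App()

@app.cell
def _():
    import polars as pl
    return (pl,)

@app.cell
def _(pl):
    # This cell depends on `pl` from the cell above;
    # editing either cell re-runs only what is downstream.
    df = pl.DataFrame({"region": ["north", "south"], "revenue": [1200, 800]})
    df
    return (df,)

if __name__ == "__main__":
    app.run()
```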

Key Features

Quick Start

# Install via uv
uv add marimo

# Create and edit a notebook
uv run marimo edit notebook.py

# Run as a web app
uv run marimo run notebook.py

# Run as a script
uv run python notebook.py

🌐 marimo.io


6. Polars — DataFrames at the Speed of Rust

What it is: A Rust-based DataFrame library that is 5–20x faster than Pandas and 8x more energy efficient. Polars uses columnar storage (Apache Arrow), lazy evaluation with automatic query optimization, and multi-core parallelism by default.

Why It Matters

Pandas was revolutionary when it launched in 2008, but its single-threaded, eager execution model shows its age with modern data volumes. Polars brings a query-engine mindset to DataFrame processing: you describe what you want, and Polars optimizes how to execute it — including filter pushdown, projection pushdown, and parallel execution across all CPU cores.

Key Features

Quick Start

import polars as pl

# Lazy: describe what you want, Polars optimizes how
result = (
    pl.scan_parquet("sales_data/*.parquet")
    .filter(pl.col("revenue") > 1000)
    .group_by("region")
    .agg(pl.col("revenue").sum())
    .sort("revenue", descending=True)
    .collect()  # Execute the optimized plan
)
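Because everything before .collect() is lazy, you can also ask Polars to print the optimized plan before running it, which is a quick way to confirm that the filter was actually pushed down into the Parquet scan. A small sketch using the same query:

```python
import polars as pl

lazy = (
    pl.scan_parquet("sales_data/*.parquet")
    .filter(pl.col("revenue") > 1000)
    .group_by("region")
    .agg(pl.col("revenue").sum())
)

# Show the optimized logical plan without executing it.
print(lazy.explain())
```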

🌐 pola.rs


7. DuckDB — SQLite for Analytics

What it is: An embedded, in-process analytical database that runs SQL queries directly on local files (CSV, Parquet, JSON) and in-memory DataFrames — without a server. DuckDB uses columnar storage and vectorized execution to deliver analytical performance 10–100x faster than SQLite, right inside your Python process.

Why It Matters

Before DuckDB, running analytical SQL queries locally meant either importing data into a full database server (PostgreSQL, MySQL) or accepting the limitations of SQLite, which was designed for transactional workloads. DuckDB eliminates this tradeoff: pip install duckdb and you have a production-grade analytical engine that can query Parquet files, join with Pandas/Polars DataFrames, and handle billions of rows on a laptop.

Key Features

Quick Start

import duckdb

# Query a Parquet file with SQL — no loading step
result = duckdb.sql("""
    SELECT region, SUM(revenue) as total
    FROM 'sales_data/*.parquet'
    GROUP BY region
    ORDER BY total DESC
""").pl()  # Returns a Polars DataFrame

# Query an existing Polars DataFrame
import polars as pl
df = pl.DataFrame({"name": ["Alice", "Bob"], "score": [95, 87]})
duckdb.sql("SELECT * FROM df WHERE score > 90").show()
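DuckDB can also persist tables to a single database file on disk, which makes a convenient hand-off point for the Evidence dashboards described later. A minimal sketch (the analytics.duckdb file name is just an example):

```python
import duckdb

# Open (or create) an on-disk database file.
con = duckdb.connect("analytics.duckdb")

# Materialize the Parquet files into a persistent table.
con.sql("CREATE TABLE IF NOT EXISTS sales AS SELECT * FROM 'sales_data/*.parquet'")
con.sql("SELECT COUNT(*) AS rows FROM sales").show()
con.close()
```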

🌐 duckdb.org


8. Quarto — Reproducible Publishing for Data Science

What it is: An open-source scientific and technical publishing system from Posit that renders markdown with executable code into HTML, PDF, Word, presentations, websites, books, and dashboards. Quarto is language-agnostic (Python, R, Julia, Observable JS) and built on Pandoc, the universal document converter.

Why MkDocs Isn’t Enough for Data Projects

MkDocs is excellent for static documentation sites, but data projects need more than documentation. They need reports where code executes and generates results, publications with cross-references and citations, dashboards that update when data changes, and multi-format output from a single source. Quarto does all of this while also handling documentation sites. It’s a superset of MkDocs’ functionality, specifically designed for code-driven content.

Key Features

Quick Start

# Create a document (analysis.qmd)
---
title: "Sales Analysis Q4 2025"
format: html
---
```{python}
import polars as pl
df = pl.read_parquet("sales.parquet")
df.group_by("region").agg(pl.col("revenue").sum())
```
# Render to HTML
quarto render analysis.qmd

# Render to PDF
quarto render analysis.qmd --to pdf
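For a full website or documentation project rather than a single document, Quarto is driven by a _quarto.yml file at the project root. A minimal sketch (title, navigation, and theme are placeholders):

```yaml
project:
  type: website

website:
  title: "Sales Analytics"
  navbar:
    left:
      - analysis.qmd

format:
  html:
    theme: cosmo
```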

🌐 quarto.org


9. Evidence — Business Intelligence as Code

What it is: An open-source framework for building data products — reports, dashboards, and decision-support tools — using only SQL and Markdown. Evidence generates static websites from markdown files with embedded SQL queries, offering a code-driven alternative to drag-and-drop BI tools like Power BI and Tableau.

Why It Matters

Traditional BI tools create maintenance nightmares: dashboards that can’t be version-controlled, filters that break silently, and customization limits that force workarounds. Evidence applies software engineering principles to analytics — your dashboards live in Git, changes are reviewed in pull requests, and deployments are automated. Combined with DuckDB, it creates a powerful local-first analytics pipeline.

Key Features

Quick Start

# Create a new Evidence project
npx degit evidence-dev/template my-report
cd my-report && npm install
npm run dev
<!-- Edit src/pages/index.md -->

```sql revenue_by_region
SELECT region, SUM(revenue) as total
FROM sales
GROUP BY region
```
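Below a query, the result set can be rendered by referencing the query's name from one of Evidence's built-in components. A sketch using its BarChart component, assuming the revenue_by_region query above:

```html
<BarChart
    data={revenue_by_region}
    x=region
    y=total
/>
```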

Observable Framework: An Alternative Approach

Observable Framework takes a similar philosophy — static sites with embedded data — but uses JavaScript as the primary language with data loaders in any backend language. Where Evidence targets SQL-centric teams, Observable excels for custom visualizations with D3.js and interactive exploration. Both can connect to DuckDB and deploy as static sites. Choose Evidence if your team thinks in SQL; choose Observable if you need custom JavaScript visualizations.

🌐 evidence.dev | observablehq.com


10. How It All Fits Together

The real power of this stack isn’t in any individual tool — it’s in how they integrate. Here’s a typical workflow that touches every layer:

A Complete Data Project Workflow

Step 1 – Project Setup: uv init creates your project with pyproject.toml and a managed .venv. uv add polars duckdb marimo installs your stack in seconds.

Step 2 – IDE: Open the project in Positron. It auto-detects the uv environment, provides the Data Explorer and Variables Pane, and runs Ruff on save.

Step 3 – Exploration: Launch marimo edit to explore data interactively. Use Polars for fast transformations and DuckDB for complex SQL joins across Parquet files.

Step 4 – Code Quality: Ruff auto-formats and lints on every save. ty check --watch catches type errors in real time.

Step 5 – Analysis: Write your final analysis in a Quarto document (.qmd) with executable Python code blocks. Render to HTML for sharing or PDF for publication.

Step 6 – Dashboards: Build an Evidence project that queries your DuckDB database with SQL and generates an interactive BI dashboard for stakeholders.

Step 7 – Deploy: Docker containerizes the environment. Quarto publishes to GitHub Pages. Evidence deploys to Netlify. Everything is Git-versioned and reproducible.
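As a sketch of Step 7, one common pattern is to let uv restore the locked environment inside the container. The base image and entry point below are assumptions; adjust them to your project:

```dockerfile
FROM python:3.12-slim

# Install uv, then restore the exact locked environment.
RUN pip install uv
WORKDIR /app
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev

# Copy the rest of the project and run the analysis.
COPY . .
CMD ["uv", "run", "python", "analysis.py"]
```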

Integration Matrix

| Pair | How They Integrate |
| --- | --- |
| uv + Positron | Positron auto-detects uv environments and uses project templates |
| Polars + DuckDB | Zero-copy data exchange via Apache Arrow; SQL on Polars DataFrames |
| Marimo + Polars | Reactive notebook cells with fast DataFrame operations |
| Quarto + Positron | Render .qmd documents to HTML/PDF directly in the IDE |
| DuckDB + Evidence | Evidence's query engine is built on DuckDB WASM |
| Ruff + Positron | Format-on-save with the official extension |
| uv + Docker | uv sync in a Dockerfile for reproducible container builds |

11. Quick Comparison: Modern vs. Legacy

| Category | Legacy Stack | Modern Stack (2026) |
| --- | --- | --- |
| Package Management | pip + virtualenv + pyenv | uv (all-in-one) |
| Linting | flake8 + isort + black | Ruff (single tool) |
| Type Checking | mypy (slow, separate) | ty (fast, integrated) |
| IDE | VS Code + extensions | Positron (data-first) |
| Notebooks | Jupyter (hidden state, JSON) | Marimo (reactive, .py files) |
| DataFrames | Pandas (single-thread) | Polars (multi-core, lazy) |
| Local SQL | SQLite or full Postgres | DuckDB (embedded OLAP) |
| Documentation | MkDocs + manual reports | Quarto (docs + reports + more) |
| Dashboards | Power BI / Tableau (drag-drop) | Evidence (code-driven, Git) |

12. Conclusion: The Best Time to Modernize Is Now

The modern Python data stack in 2026 isn’t about replacing one tool at a time — it’s about an ecosystem that was designed to work together. uv manages your projects and environments at Rust speed. Ruff and ty keep your code clean and typed. Positron gives you an IDE that understands data. Marimo makes notebooks reproducible and deployable. Polars and DuckDB handle data processing from DataFrames to SQL. Quarto publishes everything from quick reports to full books. And Evidence turns SQL queries into production dashboards.

The beautiful thing is that migration is incremental. You don’t have to adopt everything at once. Start with uv to manage your projects. Add Ruff for automatic formatting. Try Polars on your next analysis. Each tool delivers immediate value on its own and compounds when combined with the others.

For the first time, Python’s tooling matches the language’s ambitions. These tools are fast, polished, and designed with developer experience as a priority — not an afterthought. If you’ve been waiting for the right moment to modernize your Python workflow, that moment is now.


Resources

| Tool | URL |
| --- | --- |
| uv | docs.astral.sh/uv |
| Ruff | docs.astral.sh/ruff |
| ty | docs.astral.sh/ty |
| Positron | positron.posit.co |
| Marimo | marimo.io |
| Polars | docs.pola.rs |
| DuckDB | duckdb.org |
| Quarto | quarto.org |
| Evidence | evidence.dev |
| Observable | observablehq.com |

About the author

Stephane Busso

Software builder and engineering manager based in New Zealand 🇳🇿. HTDOCS.dev is a medium for sharing about AI, technologies, and engineering.