SQLite and DuckDB share something striking: both are embedded databases — a library living in your process, no separate server. And yet they solve different problems. SQLite is the king of small transactions and local persistence; DuckDB is the king of fast columnar analytics without infra. Choosing wrong is a costly mistake.
This article is an honest comparison, with criteria to decide and scenarios where each wins — and when they can (should) be used together.
The Common Ground: Embedded
Both discard the client-server model. Your application imports the library and runs SQL queries on a file. No systemd, no ports, no users, no replication as a central concern. This brutally reduces operational complexity.
- SQLite: a `.db` file (plus some temporary WAL files). Dumps, copies, and migrations are just `cp`.
- DuckDB: a `.duckdb` file (or in memory, or Parquet files directly).
This embedded pattern is perfect for: mobile apps, desktop, single-instance services, analytic notebooks, local ETL pipelines. It’s the anti-pattern for: systems with dozens of concurrent services against the same DB.
Structural Difference: Rows vs Columns
The separation isn’t cosmetic. It’s architectural:
- SQLite stores data by rows. Each row is a contiguous block. Perfect for `SELECT * FROM users WHERE id = 42` — read a whole row fast.
- DuckDB stores data by columns. Each column is a contiguous block. Perfect for `SELECT AVG(amount) FROM transactions WHERE year = 2023` — read a few fields from many rows.
OLTP vs OLAP, in embedded form.
When to Choose SQLite
Scenarios where SQLite is clearly right:
- Mobile app with local DB: iOS, Android, desktop. SQLite is the de facto standard.
- App configuration and persistence: Firefox, Chrome, dozens of apps use it.
- Single-node web service with mixed moderate read/write.
- Prototype or MVP before migrating to PostgreSQL.
- Durable transactional local backup with full ACID.
- WASM in the browser via sql.js or similar.
SQLite does exactly what it should: small transactions, high integrity, low latency per operation.
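A minimal sketch of that transactional strength, using an in-memory database and an invented `accounts` schema: a transfer either commits fully or rolls back, and a CHECK constraint guards integrity.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # a file path in a real app
con.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
con.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
con.commit()

# A transfer as one atomic transaction: both updates commit or neither does.
try:
    with con:  # the context manager commits on success, rolls back on error
        con.execute("UPDATE accounts SET balance = balance - 150 WHERE id = 1")
        con.execute("UPDATE accounts SET balance = balance + 150 WHERE id = 2")
except sqlite3.IntegrityError:
    pass  # CHECK fired (balance would go negative); the transfer was rolled back

balances = dict(con.execute("SELECT id, balance FROM accounts"))
```

After the failed transfer, both balances are untouched: that is the ACID guarantee doing its job.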
When to Choose DuckDB
Scenarios where DuckDB shines:
- Log analysis: millions of rows, aggregate queries.
- Query Parquet/CSV files directly without importing: `SELECT * FROM 'data.parquet'` just works.
- Replace pandas in mid-sized pipelines (memory efficient, native SQL).
- Ad-hoc notebook analysis: Jupyter + DuckDB is powerful.
- Consolidate OLTP databases for reports without touching production (CDC + DuckDB).
- Vectorized processing over tabular data without distributed infra.
DuckDB is what Pandas-with-native-SQL should have been.
Orientation Benchmarks
With the usual asterisks attached (DB benchmarks lie without context), the orders of magnitude:
- Row-by-row insert of 1M rows: SQLite ~20s (batched inside transactions). DuckDB takes longer — single-row inserts are not its use case.
- Aggregate over 1M rows with GROUP BY: SQLite ~15s. DuckDB ~1s.
- Scan 100M columnar rows with selective filter: SQLite slow. DuckDB seconds.
- Concurrent transactions: SQLite (with WAL) ~1000 tx/s. DuckDB is not designed for this.
Each is 10x-100x faster than the other on its home terrain.
Python Integration
Both are trivial from Python:
```python
# SQLite
import sqlite3

con = sqlite3.connect('app.db')
con.execute("CREATE TABLE users (id INT, name TEXT)")
```

```python
# DuckDB
import duckdb

con = duckdb.connect('analytics.duckdb')
# Query Parquet directly
df = con.execute("SELECT * FROM 'events.parquet' WHERE type='signup'").df()
```
DuckDB has especially polished Python integration: pandas and Polars DataFrames are consumed/produced without serious overhead.
Using Them Together
Combining them is a common pattern:
- OLTP in SQLite, OLAP in DuckDB. The app writes to SQLite; a periodic job dumps changes to DuckDB/Parquet for analysis.
- DuckDB reads SQLite directly via the `sqlite_scanner` extension: `SELECT * FROM sqlite_scan('app.db', 'users')`. No ETL.
- Archive old SQLite data to Parquet. DuckDB queries historical data without bloating the production DB.
The “SQLite for hot + DuckDB for cold” pattern is very productive.
Real Limitations
Let’s be honest:
SQLite is not for:
- High concurrent writes — even with WAL, the write lock is global.
- Multiple processes writing — subtle race conditions exist.
- Giant DBs (>100GB) — technically possible, operationally awkward.
- Built-in replication — it has none; external tools like Litestream fill the gap.
DuckDB is not for:
- Real OLTP — no fine-grained locking; its concurrency model is built for analysis, not transactional writes.
- Sub-millisecond latency — it is fast, but optimized for analytic scans, not point lookups.
- DBs with many concurrent writes — wrong case.
Forcing each to do the other’s job hurts.
The Verdict
How to decide in 10 seconds:
- Short transactions, high integrity, one active process? SQLite.
- Analytic queries over large volumes, few writes? DuckDB.
- Mobile / desktop / simple service app? SQLite.
- Notebooks, local ETL, replace pandas? DuckDB.
- Both? Use them together.
Any honest doubt is resolved by looking at the dominant query: if it's almost always `WHERE id = ?`, SQLite. If it's `GROUP BY` over many rows, DuckDB.
Conclusion
SQLite and DuckDB are complementary tools, not competitors. Both represent the “embedded matters” philosophy applied to different problems. For local OLTP, SQLite is unbeatable for simplicity and robustness. For infra-less OLAP, DuckDB has made many complex solutions obsolete. Knowing both and applying the right one is a real technical advantage — and combining them when it makes sense multiplies the value of both.
Follow us on jacar.es for more on databases, data architectures, and embedded tools.