jq for datasets
Fast, flexible data manipulation across multiple file formats using familiar jq-like syntax
Key Features
High Performance
Built on Polars DataFrames with lazy evaluation and columnar operations for lightning-fast data processing
Format Flexibility
Supports Parquet, Avro, CSV, JSON Lines, Arrow, and more with automatic format detection
User-Friendly
Intuitive jq-inspired syntax with interactive REPL mode and clear error messages
Supported Formats
CSV/TSV, Parquet, JSON/JSON Lines, Arrow, Avro, ASCII Delimited Text
Perfect For
Data Analysts
Quick data exploration and transformation
Developers
Pipeline integration and data processing
Data Engineers
ETL workflows and format conversion
Researchers
Dataset analysis and manipulation
Get Started Today
Install via cargo:
$ cargo install dsq
Or download pre-compiled binaries from GitHub
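To try the first example below without hunting for data, the following sketch writes a small employees.csv containing only the two rows this page shows, then runs the same transform if dsq is on your PATH. The `command -v` guard and the fallback message are illustrative additions, not part of dsq itself.

```shell
# Create a sample file from the header and the two rows shown in the
# "Transform data" example; the remaining rows are not listed on this page.
cat > employees.csv <<'EOF'
id,name,age,city,salary,department
1,Alice Johnson,28,New York,75000,Engineering
2,Bob Smith,34,Los Angeles,82000,Sales
EOF

# Run the transform only when dsq is installed; otherwise print a hint.
if command -v dsq >/dev/null 2>&1; then
  dsq 'map(.salary += 5000) | map({name, new_salary: .salary, department})' employees.csv
else
  echo "dsq not found on PATH; install it first (cargo install dsq)" >&2
fi
```

With just these two rows, the result should correspond to the first two records of the output listed under "Transform data".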
Transform data
# Input: employees.csv
# id,name,age,city,salary,department
# 1,Alice Johnson,28,New York,75000,Engineering
# 2,Bob Smith,34,Los Angeles,82000,Sales
# ...
$ dsq 'map(.salary += 5000) | map({name, new_salary: .salary, department})' employees.csv
[
{"department": "Engineering", "name": "Alice Johnson", "new_salary": 80000},
{"department": "Sales", "name": "Bob Smith", "new_salary": 87000},
{"department": "Marketing", "name": "Carol Williams", "new_salary": 73000},
...
]
Group and aggregate
# Input: employees.csv
$ dsq 'group_by(.department) | map({dept: .[0].department, count: length, avg_salary: (map(.salary) | add / length)})' employees.csv
[
{"avg_salary": 90666.67, "count": 3, "dept": "Engineering"},
{"avg_salary": 63500.0, "count": 2, "dept": "HR"},
{"avg_salary": 83000.0, "count": 3, "dept": "Sales"},
{"avg_salary": 69500.0, "count": 2, "dept": "Marketing"}
]
Group with statistics
# Input: books.csv
# title,author,year,genre,price
# "The Great Gatsby",F. Scott Fitzgerald,1925,Fiction,10.99
# "1984",George Orwell,1949,Dystopian,9.99
# ...
$ dsq 'group_by(.genre) | map({genre: .[0].genre, count: length, avg_price: (map(.price) | add / length)})' books.csv
[
{"avg_price": 11.58, "count": 3, "genre": "Fiction"},
{"avg_price": 9.75, "count": 2, "genre": "Dystopian"},
{"avg_price": 14.99, "count": 1, "genre": "Fantasy"},
...
]
Convert formats
# CSV to Parquet
$ dsq '.' data.csv -o output.parquet
# JSON Lines to CSV
$ dsq '.' data.jsonl -o output.csv
# Parquet to JSON
$ dsq '.' data.parquet -o output.json
Cross-platform support for Linux, macOS, and Windows