Running Locally

The Laboratory for your CSVs.

Clean, format, and tokenize massive datasets directly in your browser. Secure by design. No server uploads.

CSV to JSONL

Convert massive CSV files to JSONL format for LLM fine-tuning. Streaming supported.
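As a rough illustration of the conversion itself, here is a minimal sketch that treats the first CSV row as the header and emits one JSON object per later row. The names `parseCsv` and `csvToJsonl` are hypothetical, not the tool's actual API, and the sketch parses one in-memory string rather than streaming, purely to stay short.

```typescript
// Hypothetical helper: parse CSV text into rows of fields.
// Handles quoted fields, doubled quotes (""), and commas or newlines inside quotes.
function parseCsv(text: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // doubled quote is a literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one line break
      row.push(field);
      field = "";
      rows.push(row);
      row = [];
    } else {
      field += ch;
    }
  }
  // Flush the last record when the text does not end with a line break.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Hypothetical helper: the first row is the header; each later row becomes one JSON line.
// Missing trailing fields are filled with empty strings (an assumption of this sketch).
function csvToJsonl(text: string): string {
  const [header, ...records] = parseCsv(text);
  if (!header) return "";
  return records
    .map((rec) => {
      const obj: Record<string, string> = {};
      header.forEach((name, idx) => {
        obj[name] = rec[idx] ?? "";
      });
      return JSON.stringify(obj);
    })
    .join("\n");
}
```

A streaming version would feed chunks through the same state machine and carry `field`, `row`, and `inQuotes` across chunk boundaries.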

Redact PII

Sanitize datasets by automatically removing email addresses, phone numbers, and Social Security numbers (SSNs).
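One common way to do this kind of redaction is pattern matching. The sketch below is an assumption about the general technique, not the tool's actual rules: `PII_PATTERNS` and `redactPii` are hypothetical names, the phone and SSN patterns cover US-style formats only, and each match is replaced with a placeholder label rather than deleted so that redactions stay visible.

```typescript
// Hypothetical pattern table: [pattern, replacement label].
// SSNs are listed before phone numbers so the more specific format is handled first.
const PII_PATTERNS: Array<[RegExp, string]> = [
  [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[EMAIL]"],
  [/\b\d{3}-\d{2}-\d{4}\b/g, "[SSN]"],
  [/(?:\+?1[ .-]?)?(?:\(\d{3}\)|\d{3})[ .-]?\d{3}[ .-]?\d{4}\b/g, "[PHONE]"],
];

// Apply every pattern in order and return the redacted text.
function redactPii(text: string): string {
  return PII_PATTERNS.reduce((acc, [re, label]) => acc.replace(re, label), text);
}
```

For example, `redactPii("Mail ada@example.com or call (555) 123-4567")` would yield `"Mail [EMAIL] or call [PHONE]"`. Regular expressions like these will miss unusual formats and can flag look-alike numbers, so the output is worth spot-checking.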

Split Data

Deterministically shuffle and split files into Train/Validation/Test sets.
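"Deterministic" here can be read as: the same input and the same seed always produce the same split. A minimal sketch of that idea, assuming a small seeded generator (mulberry32) and a Fisher-Yates shuffle, follows. The function names, the seed parameter, and the 80/10/10 default ratios are illustrative assumptions, not the tool's documented behavior.

```typescript
// Small seeded pseudo-random generator (mulberry32); returns values in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: shuffle a copy of the rows with a seeded Fisher-Yates pass,
// then cut it into train / validation / test slices. Whatever remains after the
// train and validation slices goes to test.
function splitDataset<T>(
  rows: T[],
  seed: number,
  trainFrac: number = 0.8, // assumed default
  valFrac: number = 0.1 // assumed default
): { train: T[]; validation: T[]; test: T[] } {
  const shuffled = rows.slice();
  const rand = mulberry32(seed);
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = shuffled[i];
    shuffled[i] = shuffled[j];
    shuffled[j] = tmp;
  }
  const trainEnd = Math.floor(shuffled.length * trainFrac);
  const valEnd = trainEnd + Math.floor(shuffled.length * valFrac);
  return {
    train: shuffled.slice(0, trainEnd),
    validation: shuffled.slice(trainEnd, valEnd),
    test: shuffled.slice(valEnd),
  };
}
```

Because the generator is seeded, rerunning `splitDataset(rows, 42)` on the same rows reproduces the same three sets, which keeps experiments comparable across runs.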

Token Cost

Estimate LLM API costs (OpenAI/Anthropic) using accurate tokenizers.
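Once a tokenizer has produced a token count, the cost estimate is simple arithmetic. The sketch below assumes per-million-token pricing; `estimateCostUsd` and its parameters are hypothetical, and both the token count and the price are supplied by the caller rather than hard-coded, since provider prices change.

```typescript
// Hypothetical helper: cost = (tokens / 1,000,000) * price per million tokens.
// tokenCount comes from whichever tokenizer matches the target model.
function estimateCostUsd(tokenCount: number, pricePerMillionTokensUsd: number): number {
  if (tokenCount < 0 || pricePerMillionTokensUsd < 0) {
    throw new Error("tokenCount and price must be non-negative");
  }
  return (tokenCount / 1_000_000) * pricePerMillionTokensUsd;
}
```

With a placeholder price of 2 USD per million tokens, a 250,000-token dataset would be estimated at 0.50 USD.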

Unlimited File Size

No server limits.

100% Private

Zero data egress.

Instant Processing

No queue times.