How to contribute

ATLAS lets you contribute datasets in three ways. Pick the path that matches your situation.

The three paths

Paste a HuggingFace URL

Best for: datasets already public on HuggingFace

Paste your HuggingFace dataset URL and we'll auto-extract the basic metadata (name, description, tags, version, citation). You fill in the rest: languages, license, modality, task group and domain group, plus the Responsible AI assessment.
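If you want to sanity-check a URL before pasting it, the sketch below shows how a dataset URL maps to a repository id. It is an illustration only, not ATLAS's own extraction code: the function name `hf_dataset_repo_id` is hypothetical, and it assumes the public URL shape `https://huggingface.co/datasets/<owner>/<name>`.

```python
from urllib.parse import urlparse

# Path segments that follow the repo id on HuggingFace pages (assumed list,
# used only to trim URLs copied from a sub-page of the dataset).
_PAGE_SEGMENTS = {"tree", "blob", "viewer", "resolve", "discussions", "commits"}


def hf_dataset_repo_id(url: str) -> str:
    """Return the repo id ("owner/name" or "name") from a HuggingFace dataset URL.

    Hypothetical helper for illustration; not part of ATLAS.
    """
    parsed = urlparse(url.strip())
    if parsed.netloc.lower() not in {"huggingface.co", "www.huggingface.co"}:
        raise ValueError(f"not a HuggingFace URL: {url!r}")

    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2 or parts[0] != "datasets":
        raise ValueError(f"not a dataset URL: {url!r}")

    # Keep the segments after "datasets" up to the first page segment.
    repo_parts = []
    for segment in parts[1:]:
        if segment in _PAGE_SEGMENTS:
            break
        repo_parts.append(segment)

    if not 1 <= len(repo_parts) <= 2:
        raise ValueError(f"cannot read a repo id from: {url!r}")
    return "/".join(repo_parts)
```

The resulting id is the handle a metadata lookup would need, for example `HfApi().dataset_info(repo_id)` from the `huggingface_hub` package. Whether ATLAS uses that call internally is not stated here.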

Upload a file

Best for: fresh data you're publishing for the first time

Upload a single file of up to 5 GB. Accepted formats for text data: zip, tar, gz, csv, json, txt, pdf. To submit multiple files, zip them into one archive first. After the upload, you fill in all the metadata yourself.
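The upload rules above can be checked locally before you start. This is a minimal sketch under stated assumptions: the helper names `check_upload` and `bundle` are hypothetical, the extension list is copied from the text above, and "5 GB" is taken to mean 5 × 1024³ bytes because the exact byte limit is not stated.

```python
import zipfile
from pathlib import Path

# Assumption: extensions taken from the accepted-formats list above.
ACCEPTED_EXTENSIONS = {".zip", ".tar", ".gz", ".csv", ".json", ".txt", ".pdf"}
# Assumption: "5 GB" interpreted as binary gigabytes.
MAX_UPLOAD_BYTES = 5 * 1024**3


def check_upload(path: Path) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    if path.suffix.lower() not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported extension: {path.suffix or '(none)'}")
    size = path.stat().st_size
    if size > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_UPLOAD_BYTES} byte limit")
    return problems


def bundle(files: list[Path], archive: Path) -> Path:
    """Zip several files into one archive, storing each under its file name."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            zf.write(f, arcname=f.name)
    return archive
```

A typical flow would be to call `bundle([...], Path("dataset.zip"))` and then `check_upload(Path("dataset.zip"))`. If the list of problems mentions the size limit, the support-form path is the one to use.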

Need help or have a large dataset?

Best for: datasets >5 GB, special formats, or anything tricky

Reach out via our support form and our team will help you onboard your dataset. Use this path if your file is too big or in a format we don't accept directly, or if you'd like guidance.

What you'll fill in

Whichever path you pick, you'll be guided through a wizard covering:

Core metadata — dataset name, description, languages, license, modality, task group, domain group
Responsible AI assessment — data provenance, intended use, known limitations, biases, PII handling
Final review — confirm everything, accept the contributor terms, submit
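If you prefer to gather your answers before opening the wizard, the fields listed above can be drafted as a simple record. This is a sketch only: the class and field names are hypothetical, derived from the list above, and do not describe the ATLAS schema or any ATLAS API.

```python
from dataclasses import dataclass, field, fields


@dataclass
class CoreMetadata:
    # Field names mirror the "Core metadata" items above (hypothetical schema).
    name: str = ""
    description: str = ""
    languages: list[str] = field(default_factory=list)
    license: str = ""
    modality: str = ""
    task_group: str = ""
    domain_group: str = ""


@dataclass
class ResponsibleAIAssessment:
    # Field names mirror the "Responsible AI assessment" items above.
    data_provenance: str = ""
    intended_use: str = ""
    known_limitations: str = ""
    biases: str = ""
    pii_handling: str = ""


def missing_fields(section) -> list[str]:
    """Names of fields that are still empty (empty string or empty list)."""
    return [f.name for f in fields(section) if not getattr(section, f.name)]
```

Calling `missing_fields` on a partly filled `CoreMetadata` or `ResponsibleAIAssessment` gives a checklist of what is left to write before the final review step.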

What happens after you submit

Your submission enters the ATLAS contribution pipeline. The team reviews it and, once all checks pass, publishes your dataset to the catalogue. You can track its status at any time from your profile.
