How to contribute
ATLAS lets you contribute datasets three ways. Pick the path that matches your situation.
The three paths
Paste a HuggingFace URL
Best for: datasets already public on HuggingFace
Paste your HuggingFace dataset URL and we'll auto-extract basic metadata (name, description, tags, version, citation). You fill in the rest: languages, license, modality, task group, and domain group, plus the Responsible AI assessment.
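If you want to confirm that a link points at a dataset page before pasting it, a small check like the one below may help. It is only a sketch: it assumes the common https://huggingface.co/datasets/<owner>/<name> URL shape, the function name dataset_repo_id is hypothetical, and it does not describe how ATLAS itself parses URLs.

```python
from urllib.parse import urlparse


def dataset_repo_id(url: str) -> str:
    """Return the '<owner>/<name>' part of a HuggingFace dataset URL.

    Hypothetical helper, not part of ATLAS. Raises ValueError when the
    URL does not match the assumed /datasets/<owner>/<name> shape.
    """
    parsed = urlparse(url.strip())
    if parsed.hostname != "huggingface.co":
        raise ValueError(f"not a HuggingFace URL: {url!r}")
    # Drop empty segments produced by leading or trailing slashes.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a dataset URL: {url!r}")
    return f"{parts[1]}/{parts[2]}"
```

A model or Space URL raises ValueError here, which is a quick sign that the link is not a dataset page. Dataset URLs without an owner segment are also rejected by this sketch.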
Upload a file
Best for: fresh data you're publishing for the first time
Upload a single file of up to 5 GB. Accepted formats for text data: zip, tar, gz, csv, json, txt, pdf. To submit multiple files, zip them into a single archive first. After upload, you fill in all the metadata yourself.
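The limits above can be checked locally before you start an upload. The following is a minimal sketch, not ATLAS's own validation: the helper names are hypothetical, "5 GB" is read conservatively as 5 × 10^9 bytes (an assumption), and the check looks only at the final file extension.

```python
import shutil
from pathlib import Path

# Assumption: "5 GB" read as 5 * 10**9 bytes (the stricter interpretation).
MAX_BYTES = 5 * 10**9
# Extensions taken from the accepted-formats list above.
ACCEPTED = {".zip", ".tar", ".gz", ".csv", ".json", ".txt", ".pdf"}


def upload_problems(path: Path, max_bytes: int = MAX_BYTES) -> list[str]:
    """Return human-readable problems; an empty list means the file looks fine."""
    problems = []
    suffix = path.suffix.lower()  # for "data.tar.gz" this is ".gz"
    if suffix not in ACCEPTED:
        problems.append(f"extension {suffix or '(none)'} is not in the accepted list")
    size = path.stat().st_size
    if size > max_bytes:
        problems.append(f"file is {size} bytes, over the {max_bytes}-byte limit")
    return problems


def zip_folder(folder: Path) -> Path:
    """Bundle every file under `folder` into '<folder>.zip' next to it."""
    archive = shutil.make_archive(str(folder), "zip", root_dir=str(folder))
    return Path(archive)
```

For several files, one option is to put them in a single folder, call zip_folder on it, and then run upload_problems on the resulting archive. If the archive is still over the limit, the support-form path is the better fit.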
Need help or have a large dataset?
Best for: datasets >5 GB, special formats, or anything tricky
Reach out via our support form and our team will help you onboard your dataset. Use this path if your file is too big or in a format we don't accept directly, or if you'd like guidance.
What you'll fill in
Whichever path you pick, you'll be guided through a wizard covering:
- Core metadata — dataset name, description, languages, license, modality, task group, domain group
- Responsible AI assessment — data provenance, intended use, known limitations, biases, PII handling
- Final review — confirm everything, accept the contributor terms, submit
What happens after you submit
Your submission enters the ATLAS contribution pipeline. The team reviews it and, when all checks pass, your dataset gets published to the catalogue. You can track its status anytime from your profile.