
Knowledge Base

Documentation

Everything you need to navigate the ATLAS ecosystem, from contributing high-quality Southeast Asian (SEA) language data to understanding our governance framework.

This document reflects ATLAS's current governance model — terms may evolve as the programme matures.

Data Governance & Responsible AI

Our commitment to building a transparent, unbiased, and secure data ecosystem for the Southeast Asian AI landscape.

Core Principles

Regional Representation

Ensuring linguistic data reflects the true cultural nuances of Southeast Asia, not just direct translations.

Privacy by Design

Automated and manual PII (Personally Identifiable Information) scrubbing before any data enters our public repository.

Bias Mitigation

Multi-layered validation steps specifically designed to identify and remove harmful societal biases.

Anonymisation Protocol

Every contribution undergoes a strict three-stage PII scrubbing process before it is integrated into the Project SEALD datasets.

  • Rule-based pattern matching (regex) for IDs and other numbers
  • Named Entity Recognition (NER) for sensitive locations
  • Community validation through double-blind review
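
As a rough illustration of the first stage only, rule-based scrubbing can be sketched in Python as below. This is a minimal sketch under stated assumptions, not ATLAS's actual implementation: the pattern set, the placeholder labels, and the `scrub` function are all hypothetical, and real ID and phone formats vary by country.

```python
import re

# Hypothetical patterns for illustration only; they are not the rules ATLAS uses.
PATTERNS = {
    # Simple email shape: local part, "@", then a dotted domain.
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    # Phone-like runs: optional "+", at least 8 digits with spaces or hyphens allowed.
    "PHONE": re.compile(r"(?<!\w)\+?\d[\d\s-]{6,}\d(?!\w)"),
    # Assumed ID shape: one capital letter, seven digits, one capital letter.
    "ID_NUMBER": re.compile(r"\b[A-Z]\d{7}[A-Z]\b"),
}


def scrub(text: str) -> str:
    """Replace every match of each pattern with a bracketed placeholder label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(scrub("Call +65 9123 4567 or mail ana@example.org"))
```

Replacing matches with a labelled placeholder, rather than deleting them, keeps the sentence structure intact for the later NER and review stages. Text that matches none of the patterns passes through unchanged.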
