Manage labeling workflows, datasets, and content classification for governance compliance
Data labeling in OpenRails provides tools for classifying content in your data lakes. The governance pipeline uses these labels to enforce security policies, apply de-identification rules, and control access based on content sensitivity.
From the sidebar, go to Governance > Data Labeling.
Choose the dataset (data lake or document collection) you want to label. The labeling interface shows documents with their current labels and classification status.
Create or select label categories for classification:
Label documents individually or in bulk:
Review auto-labeled documents for accuracy. Approve or correct labels before they are used by the governance pipeline.
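
The review step can be pictured as a gate between auto-labeling and the governance pipeline. The sketch below is a hypothetical model of that gate, not the OpenRails data model: the `ReviewStatus` values, the `LabeledDocument` fields, and the helper functions are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewStatus(Enum):
    """Assumed review states for an auto-labeled document."""
    PENDING = "pending"
    APPROVED = "approved"
    CORRECTED = "corrected"


@dataclass
class LabeledDocument:
    """Hypothetical record pairing a document with its label and review state."""
    document_id: str
    label: str
    status: ReviewStatus = ReviewStatus.PENDING


def approve(doc):
    """Accept the auto-assigned label as-is."""
    doc.status = ReviewStatus.APPROVED


def correct(doc, new_label):
    """Replace an inaccurate label and mark the document as reviewed."""
    doc.label = new_label
    doc.status = ReviewStatus.CORRECTED


def ready_for_pipeline(docs):
    """Keep only documents whose labels have been reviewed."""
    return [d for d in docs if d.status is not ReviewStatus.PENDING]


docs = [
    LabeledDocument("doc-001", "PII"),
    LabeledDocument("doc-002", "Public"),
    LabeledDocument("doc-003", "Public"),
]
approve(docs[0])
correct(docs[1], "Confidential")

reviewed = ready_for_pipeline(docs)
print([(d.document_id, d.label) for d in reviewed])
```

In this sketch, `doc-003` stays pending and is therefore held back until a reviewer approves or corrects it.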
Manage labeled datasets from the governance dashboard:
| Action | Description |
|---|---|
| Create Dataset | Define a new dataset from a data lake or document subset |
| Export Labels | Export labels as CSV for external analysis or compliance reporting |
| Label Statistics | View distribution of labels across the dataset |
| Re-label | Re-run auto-labeling rules after updating patterns |
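
An exported CSV can also be summarized outside the dashboard, for example to cross-check the label distribution in a compliance report. The sketch below is a minimal illustration that assumes a hypothetical export layout with `document_id` and `label` columns; the column names in an actual OpenRails export may differ.

```python
import csv
import io
from collections import Counter


def label_distribution(csv_text, label_column="label"):
    """Count how many rows carry each label in an exported CSV.

    `label_column` is an assumed column name, not a documented one.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)


# Hypothetical export contents, for illustration only.
sample_export = """document_id,label
doc-001,PII
doc-002,Public
doc-003,PII
"""

distribution = label_distribution(sample_export)
for label, count in distribution.most_common():
    print(f"{label}: {count}")
```

For a file on disk, the same function can be fed the result of `Path(...).read_text()`, with `label_column` adjusted to match the real header.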
Configure automatic classification rules based on content patterns:
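
As a rough illustration of pattern-based classification, the sketch below applies a list of regular-expression rules to a document's text and returns every label whose pattern matches. It is not OpenRails rule syntax: the `LabelRule` type, the label names, and the patterns are all assumptions chosen for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelRule:
    """A hypothetical auto-labeling rule: a label plus a content pattern."""
    label: str
    pattern: re.Pattern


# Illustrative rules only; the labels and patterns are assumptions.
RULES = [
    LabelRule("email-address", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
    LabelRule("us-ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
]


def auto_label(text, rules=RULES):
    """Return the labels of all rules whose pattern occurs in the text."""
    return [rule.label for rule in rules if rule.pattern.search(text)]


print(auto_label("Contact jane@example.com, SSN 123-45-6789"))
print(auto_label("Quarterly revenue summary"))
```

Simple patterns like these tend to produce false positives and misses, which is one reason to review auto-labeled documents before the labels reach the governance pipeline.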