Structured Data Accuracy
Data Quality
Overview
The Structured Data Accuracy evaluation measures how accurately the attributes in a structured (tabular) dataset represent the true values of the intended concept or event. Each row is inspected across three independent accuracy dimensions, and the results are aggregated into per-dimension and overall metrics.
The three accuracy dimensions are:
- Syntactic accuracy: whether attribute values conform to their domain definition - correct encoding, no character corruption, no spelling or grammar errors in text fields.
- Data type accuracy: whether attribute values are represented with the correct type - correct numeric formats, correct date representations, correct null encodings.
- Semantic accuracy: whether attribute values correctly represent the true value of the intended concept - correct field-level values, no transposed or placeholder entries.
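As a rough illustration of how the three dimensions differ, the sketch below applies one toy check per dimension to a single hypothetical row. The column names, the sample values, the `KNOWN_COUNTRIES` set, and the helper functions are all assumptions made for this example; the evaluation's actual scorers are not described at this level of detail and may work quite differently.

```python
from datetime import datetime

# Hypothetical row: column names and values are invented for illustration.
row = {
    "name": "JosÃ© Garcia",       # UTF-8 text mis-decoded as Latin-1
    "signup_date": "2023-02-30",  # looks like a date, but is not a valid one
    "country": "Paris",           # city entered in the country field
}

# Assumed reference set for the semantic check (illustrative only).
KNOWN_COUNTRIES = {"France", "Spain", "Japan"}


def syntactic_ok(value: str) -> bool:
    """Toy syntactic check: flag common mojibake and replacement characters."""
    return not any(marker in value for marker in ("Ã", "â€", "\ufffd"))


def type_ok_date(value: str) -> bool:
    """Toy type check: the value must parse as a real YYYY-MM-DD date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def semantic_ok_country(value: str) -> bool:
    """Toy semantic check: the value must be a known country name."""
    return value in KNOWN_COUNTRIES


checks = {
    "syntactic": syntactic_ok(row["name"]),
    "data_type": type_ok_date(row["signup_date"]),
    "semantic": semantic_ok_country(row["country"]),
}
print(checks)
```

Each value here fails a different check for a different reason, which is why the dimensions are reported separately: the name is well-typed but corrupted, the date is cleanly encoded but not a valid date, and the country is a clean, correctly typed string that is simply the wrong value for the field.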
Metrics
Accuracy
The aggregate accuracy score combining all three dimensions (range: 0.0 to 1.0).
Syntactic Accuracy
The fraction of attribute values that conform to the domain definition - correct encoding, no character corruption, no spelling or grammar errors in text fields (range: 0.0 to 1.0).
Data Type Accuracy
The fraction of attribute values that are represented with the correct type - correct numeric formats, correct date representations, correct null encodings (range: 0.0 to 1.0).
Semantic Accuracy
The fraction of attribute values that correctly represent the true value of the intended concept - correct field-level values, no transposed or placeholder entries (range: 0.0 to 1.0).
Motivation
Inaccurate data propagates silently through AI pipelines. Encoding corruption, type mismatches, and wrong field values can each lead to incorrect model outputs - yet they often raise no obvious error at ingestion time. A model trained on inaccurate tabular data learns the wrong patterns and produces wrong predictions; a model evaluated on it reports metrics that do not reflect real-world performance.
Accuracy failures fall into three distinct categories, each requiring a different remediation strategy. Measuring them independently makes each failure mode visible, so that remediation can target the specific problem instead of chasing a single opaque aggregate score.
Methodology
- Samples: Each row in the dataset is scored independently across three accuracy dimensions.
- Scoring: Each row is scored across three dimensions:
  - Syntactic Scorer: checks whether values conform to their domain definition (encoding, formatting, spelling).
  - Type Scorer: checks whether values use the correct type (numeric formats, date representations, null encodings).
  - Semantic Scorer: checks whether values correctly represent the intended concept (field-level correctness, no transposed or placeholder entries).
Each scorer produces an independent score from 0.0 to 1.0. Per-dimension and aggregate metrics are computed by averaging these scores across all rows.
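The averaging step can be sketched as follows. The per-row scores are made-up numbers, and the names `row_scores` and `aggregate` are hypothetical. Treating the overall score as an unweighted mean of the three per-dimension scores is also an assumption: the description above says only that the aggregate combines all three dimensions.

```python
from statistics import fmean

DIMENSIONS = ("syntactic", "data_type", "semantic")

# Hypothetical per-row scorer outputs, each in the range 0.0 to 1.0.
row_scores = [
    {"syntactic": 1.0, "data_type": 1.0, "semantic": 1.0},
    {"syntactic": 0.5, "data_type": 0.0, "semantic": 0.5},
    {"syntactic": 1.0, "data_type": 0.5, "semantic": 1.0},
]


def aggregate(rows):
    """Average each dimension across rows, then combine the dimensions."""
    if not rows:
        raise ValueError("at least one scored row is required")
    metrics = {dim: fmean(r[dim] for r in rows) for dim in DIMENSIONS}
    # Assumption: the overall score is the unweighted mean of the dimensions.
    metrics["accuracy"] = fmean(metrics[dim] for dim in DIMENSIONS)
    return metrics


metrics = aggregate(row_scores)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these sample rows the data type score comes out well below the other two, which the single overall number would hide. A weighted combination would be an equally plausible design if one dimension matters more for a given dataset.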
Scoring
Syntactic Accuracy Scorer
Data Type Accuracy Scorer
Semantic Accuracy Scorer
Examples
Accurate row - all three scorers pass (passing)
Multiple errors - formatting issue, type mismatch, and wrong value (failing)