Structured Data Accuracy
Data Quality
Overview
The Structured Data Accuracy evaluation measures how accurately the attributes in a structured (tabular) dataset represent the true values of the intended concept or event. Each row is inspected across three independent accuracy dimensions, and the results are aggregated into per-dimension and overall metrics.
The three accuracy dimensions are:
- Syntactic accuracy: whether attribute values conform to their domain definition - correct encoding, no character corruption, no spelling or grammar errors in text fields.
- Data type accuracy: whether attribute values are represented with the correct type - correct numeric formats, correct date representations, correct null encodings.
- Semantic accuracy: whether attribute values correctly represent the true value of the intended concept - correct field-level values, no transposed or placeholder entries.
Metrics
Accuracy
The aggregate accuracy score combining all three dimensions (range: 0.0 to 1.0).
Syntactic Accuracy
The fraction of attribute values that conform to the domain definition - correct encoding, no character corruption, no spelling or grammar errors in text fields (range: 0.0 to 1.0).
Data Type Accuracy
The fraction of attribute values that are represented with the correct type - correct numeric formats, correct date representations, correct null encodings (range: 0.0 to 1.0).
Semantic Accuracy
The fraction of attribute values that correctly represent the true value of the intended concept - correct field-level values, no transposed or placeholder entries (range: 0.0 to 1.0).
Motivation
Inaccurate data propagates silently through AI pipelines. Encoding corruption, type mismatches, and wrong field values each produce incorrect model outputs - yet none of them raises an obvious error at ingestion time. A model trained or evaluated on inaccurate tabular data learns the wrong patterns, produces wrong predictions, and reports metrics that do not reflect real-world performance.
Accuracy failures fall into three distinct categories, each requiring a different remediation strategy. Measuring them independently makes each failure mode visible, so that remediation can target the specific problem instead of chasing a single opaque aggregate score.
Methodology
- Samples: Each row in the dataset is scored independently across three accuracy dimensions.
- Scoring: Each row is scored across three dimensions:
  - Syntactic Scorer: checks whether values conform to their domain definition (encoding, formatting, spelling).
  - Type Scorer: checks whether values use the correct type (numeric formats, date representations, null encodings).
  - Semantic Scorer: checks whether values correctly represent the intended concept (field-level correctness, no transposed or placeholder entries).
Each scorer produces an independent score from 0.0 to 1.0. Per-dimension and aggregate metrics are computed by averaging these scores across all rows.
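The averaging step described above can be sketched as follows. This is a minimal illustration, not the evaluation's actual implementation: the `row_scores` values are hypothetical, the key names are chosen for this example, and the overall score is assumed to be the unweighted mean of the three per-dimension averages, since the source does not state how the dimensions are combined.

```python
from statistics import mean

DIMENSIONS = ("syntactic", "data_type", "semantic")

# Hypothetical per-row scores; each scorer returns a value in [0.0, 1.0].
row_scores = [
    {"syntactic": 1.0, "data_type": 1.0, "semantic": 1.0},
    {"syntactic": 0.5, "data_type": 0.75, "semantic": 1.0},
]


def aggregate(rows):
    """Average each dimension across all rows, then combine the dimensions."""
    per_dimension = {d: mean(r[d] for r in rows) for d in DIMENSIONS}
    # Assumption: the overall score is the unweighted mean of the dimensions.
    overall = mean(per_dimension.values())
    return per_dimension, overall


per_dimension, overall = aggregate(row_scores)
print(per_dimension, overall)
```

Keeping the per-dimension averages alongside the overall score preserves the visibility into individual failure modes that the Motivation section argues for.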
Scoring
Syntactic Accuracy Scorer
Data Type Accuracy Scorer
Semantic Accuracy Scorer
Examples
Accurate row - all three scorers pass (passing)
- Syntactic Scorer: All fields are correctly formatted. client_id follows the Swiss UID format, transaction_date is ISO 8601, and asset_class has no spelling errors.
- Type Scorer: amount_chf is a float as expected, transaction_date is a valid date string, and currency is a three-letter string code.
- Semantic Scorer: All values correctly represent the intended concepts. The transaction date, amount, and asset class match the source record.
Multiple errors - formatting issue, type mismatch, and wrong value (failing)
- Syntactic Scorer: client_id is missing the hyphen and dot separators required by the Swiss UID format ('CHE123456789' instead of 'CHE-123.456.789'). transaction_date uses a non-standard format rather than ISO 8601. The remaining fields are syntactically correct.
- Type Scorer: amount_chf is expected to be a numeric value but contains a natural-language string. There is no reliable programmatic conversion, so the value is unrecoverable without manual intervention. The remaining fields have the correct type.
- Semantic Scorer: Despite the formatting and type issues, the intended values are identifiable and match the source record. No semantic errors detected.
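The syntactic and type checks from the two examples could be approximated with simple rule-based validators, as in the sketch below. The function names and rule set are hypothetical and do not reflect how the scorers are actually implemented. Only the two client_id values come from the examples above. The dates and amounts are illustrative stand-ins for the original rows.

```python
import re
from datetime import date

# Swiss UID layout as shown in the example: CHE-123.456.789
UID_PATTERN = re.compile(r"CHE-\d{3}\.\d{3}\.\d{3}")


def is_valid_uid(value):
    """Syntactic check: the value matches the CHE-###.###.### layout."""
    return isinstance(value, str) and UID_PATTERN.fullmatch(value) is not None


def is_iso_date(value):
    """Syntactic check: the value parses as an ISO 8601 calendar date."""
    try:
        date.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


def is_float(value):
    """Type check: the value is stored as a float, not as text."""
    return isinstance(value, float)


def check_row(row):
    """Run each hypothetical rule against its field and report pass/fail."""
    return {
        "client_id": is_valid_uid(row["client_id"]),
        "transaction_date": is_iso_date(row["transaction_date"]),
        "amount_chf": is_float(row["amount_chf"]),
    }


# Illustrative rows; only the client_id values are taken from the examples.
passing_row = {
    "client_id": "CHE-123.456.789",
    "transaction_date": "2024-03-15",
    "amount_chf": 12500.0,
}
failing_row = {
    "client_id": "CHE123456789",
    "transaction_date": "15/03/2024",
    "amount_chf": "about twelve thousand francs",
}

passing_result = check_row(passing_row)
failing_result = check_row(failing_row)
print(passing_result)
print(failing_result)
```

Rules like these cover only the syntactic and data type dimensions. Semantic accuracy requires comparison against the source record, which a format check cannot provide.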