Tags
eval

Data Drift

Measures the degree to which each sample in a new dataset has drifted from the reference (training) distribution, detecting covariate shift at both the dataset and per-sample level.
eval

Data Poisoning

Detects poisoned samples in a dataset that could elicit backdoor behaviour when used for LLM training — including trigger-payload pairs, sleeper agent patterns, and adversarial trigger phrases.
eval

Data Representativeness

Measures the degree to which a training dataset covers the production distribution, detecting regions of input space where no sufficiently close training sample exists.
eval

Data Uniqueness

Measures the degree to which each real-world object occurs only once in a dataset, detecting exact, transformation, and semantic duplicates at both the dataset and per-sample level.
eval

RAG Knowledge Base Poisoning

Detects poisoned documents in a RAG knowledge base that contain embedded adversarial instructions designed to hijack model behaviour at retrieval time - including prompt injection patterns, instruction overrides, and hidden directives embedded within otherwise legitimate-looking content.
eval

Structured Data Accuracy

Measures the degree to which attributes in a structured (tabular) dataset correctly represent the true value of the intended concept, across syntactic, type, and semantic accuracy dimensions.
eval

Structured Data Bias

Measures the degree to which a structured dataset contains systematic skews that cause an AI model to produce unfair or discriminatory outputs across subgroups.
eval

Structured Data Completeness

Measures the degree to which all required attributes and time periods are present in a structured dataset, across attribute completeness and time completeness dimensions.
eval

Structured Data Representativeness

Measures how well a structured dataset reflects the distribution of the target population or deployment environment for an AI application.
eval

System Prompt Safety Instructions

Audits system prompts for three structural safety properties: whether they define the model's role, scope what it will and will not respond to, and explicitly instruct the model to ignore attempts to override its instructions.
eval

System Prompt Sensitive Data

Detects system prompts that contain sensitive information - such as API keys, authentication credentials, database names, user roles, or permission structures - that should be externalised rather than embedded directly in the prompt.
eval

Training Data Sanitisation

Detects samples in a training dataset that contain unsanitised sensitive content - including personal identifiers, national identifiers, financial account data, authentication secrets, health data, online identifiers, sensitive personal attributes, and confidential business data - that should have been scrubbed or masked before use in model training.
eval

Unstructured Data Accuracy

Measures the degree to which attributes in an unstructured dataset correctly represent the true value of the intended concept, across syntactic, type, and semantic accuracy dimensions.
eval

Candidate Screening Accuracy

Evaluates whether an AI candidate screening system correctly classifies job applicants for individual job requirements, measuring both overall accuracy and the direction of misclassifications.
eval

Candidate Screening Bias

Evaluates whether an AI candidate screening system produces consistent outcomes when protected attributes such as age, gender, or national origin are varied in an applicant's profile while all qualifying information remains unchanged.
eval

Candidate Screening Cyber Security

Evaluates whether an AI candidate screening system resists prompt injection attacks embedded in applicant documents that attempt to manipulate the screening outcome.
eval

Candidate Screening Resilience

Evaluates whether an AI candidate screening system handles malformed inputs gracefully by returning an ERROR response instead of failing silently.
eval

Candidate Screening Robustness

Evaluates whether an AI candidate screening system produces consistent predictions when the same applicant data is presented with meaning-preserving surface variations.
eval

Content Policy Compliance

Evaluates whether an AI application enforces its content policy by generating inputs designed to trigger violations and judging whether responses comply.
eval

Financial Reasoning

Evaluates whether AI models can correctly apply financial concepts, formulas, and quantitative reasoning without producing errors that could mislead users or result in financial harm if acted upon.
eval

Financial Safety

Evaluates whether AI models behave safely in financial advisory contexts by resisting sycophantic validation of unsound financial decisions and refusing to facilitate compliance violations such as insider trading, market manipulation, and money laundering.
eval

Harmful Content

Evaluates whether AI models appropriately refuse to generate harmful or unsafe content across categories such as violent crimes, hate speech, privacy violations, and cybercrime.
eval

Harmful Knowledge

Evaluates whether AI models retain dangerous dual-use knowledge in biosecurity, chemical security, and cybersecurity domains that could be misused for harmful purposes.
eval

Input Size Limits

Verifies that the model endpoint enforces input size limits by sending a request that exceeds the declared token limit and checking whether the oversized input is rejected.