Tags
eval
Data Drift
Measures the degree to which each sample in a new dataset has drifted from the reference (training) distribution, detecting covariate shift at both the dataset and per-sample level.
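A per-sample covariate-shift score can be sketched as follows. This is an illustrative proxy, not the eval's actual method: each new sample is scored by its mean absolute z-score against the reference distribution's feature-wise mean and standard deviation, and the dataset score is the mean over samples.

```python
import numpy as np

def drift_scores(reference: np.ndarray, new: np.ndarray):
    """Score how far each new sample sits from the reference distribution.

    Per-sample score: mean absolute z-score of the sample's features,
    standardised against the reference mean/std (a simple covariate-shift
    proxy). Dataset-level score: mean of the per-sample scores.
    """
    mu = reference.mean(axis=0)
    sigma = reference.std(axis=0) + 1e-9  # avoid division by zero
    per_sample = np.abs((new - mu) / sigma).mean(axis=1)
    return per_sample, float(per_sample.mean())
```

A shifted batch (e.g. drawn from N(3, 1) against an N(0, 1) reference) scores well above an in-distribution batch, which is the signal a drift eval thresholds on.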
eval
Data Poisoning
Detects poisoned samples in a dataset that could elicit backdoor behaviour when used for LLM training — including trigger-payload pairs, sleeper agent patterns, and adversarial trigger phrases.
eval
Data Representativeness
Measures the degree to which a training dataset covers the production distribution, detecting regions of input space where no sufficiently close training sample exists.
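One simple way to flag "regions with no sufficiently close training sample" is a nearest-neighbour coverage check — a minimal sketch, assuming Euclidean distance on numeric features stands in for whatever similarity the real eval uses:

```python
import numpy as np

def coverage_gaps(train: np.ndarray, production: np.ndarray,
                  threshold: float) -> np.ndarray:
    """Flag production samples with no training sample within `threshold`.

    Distance to the nearest training point is a simple proxy for whether
    the training set covers that region of input space.
    """
    # Pairwise Euclidean distances, shape (n_production, n_train)
    d = np.linalg.norm(production[:, None, :] - train[None, :, :], axis=2)
    nearest = d.min(axis=1)
    return nearest > threshold  # True = uncovered region
```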
eval
Data Uniqueness
Measures the degree to which each real-world object occurs only once in a dataset, detecting exact, transformation, and semantic duplicates at both the dataset and per-sample level.
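Exact and simple transformation duplicates can be found by hashing normalised text — a sketch of the idea, not the eval's implementation (semantic duplicates would need embeddings, which this omits):

```python
import hashlib

def duplicate_groups(samples: list[str]) -> dict[str, list[int]]:
    """Group sample indices by a hash of their normalised text.

    Normalisation (lowercase, collapsed whitespace) catches exact and
    simple transformation duplicates.
    """
    groups: dict[str, list[int]] = {}
    for i, s in enumerate(samples):
        key = hashlib.sha256(" ".join(s.lower().split()).encode()).hexdigest()
        groups.setdefault(key, []).append(i)
    # Keep only groups that actually contain duplicates
    return {k: v for k, v in groups.items() if len(v) > 1}
```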
eval
RAG Knowledge Base Poisoning
Detects poisoned documents in a RAG knowledge base that contain embedded adversarial instructions designed to hijack model behaviour at retrieval time — including prompt injection patterns, instruction overrides, and hidden directives embedded within otherwise legitimate-looking content.
eval
Structured Data Accuracy
Measures the degree to which attributes in a structured (tabular) dataset correctly represent the true value of the intended concept, across syntactic, type, and semantic accuracy dimensions.
eval
Structured Data Bias
Measures the degree to which a structured dataset contains systematic skews that cause an AI model to produce unfair or discriminatory outputs across subgroups.
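One common indicator of subgroup skew is the demographic parity difference — the largest gap in positive-outcome rate between subgroups. A minimal sketch (one indicator among several a bias eval would report):

```python
def parity_gap(labels: list[int], groups: list[str]) -> float:
    """Largest gap in positive-outcome rate between subgroups
    (demographic parity difference)."""
    counts: dict[str, tuple[int, int]] = {}
    for y, g in zip(labels, groups):
        n, pos = counts.get(g, (0, 0))
        counts[g] = (n + 1, pos + y)
    rates = [pos / n for n, pos in counts.values()]
    return max(rates) - min(rates)
```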
eval
Structured Data Completeness
Measures the degree to which all required attributes and time periods are present in a structured dataset, across attribute completeness and time completeness dimensions.
eval
Structured Data Representativeness
Measures how well a structured dataset reflects the distribution of the target population or deployment environment for an AI application.
eval
System Prompt Safety Instructions
Audits system prompts for three structural safety properties: whether they define the model's role, scope what it will and will not respond to, and explicitly instruct the model to ignore attempts to override its instructions.
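The three properties can be illustrated with crude keyword heuristics — purely a sketch with made-up patterns; a real audit would use richer patterns or an LLM judge:

```python
import re

# Illustrative (hypothetical) patterns for the three structural properties.
CHECKS = {
    "role_defined": r"\byou are\b|\byour role\b|\bact as\b",
    "scope_limited": r"\bonly\b|\bdo not (answer|respond|discuss)\b|\brefuse\b",
    "override_resistant": r"\bignore (any )?attempts\b|\boverride\b",
}

def audit_system_prompt(prompt: str) -> dict[str, bool]:
    """Return a pass/fail flag for each structural safety property."""
    text = prompt.lower()
    return {name: bool(re.search(pat, text)) for name, pat in CHECKS.items()}
```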
eval
System Prompt Sensitive Data
Detects system prompts that contain sensitive information — such as API keys, authentication credentials, database names, user roles, or permission structures — that should be externalised rather than embedded directly in the prompt.
eval
Training Data Sanitisation
Detects samples in a training dataset that contain unsanitised sensitive content — including personal identifiers, national identifiers, financial account data, authentication secrets, health data, online identifiers, sensitive personal attributes, and confidential business data — that should have been scrubbed or masked before use in model training.
eval
Unstructured Data Accuracy
Measures the degree to which attributes in an unstructured dataset correctly represent the true value of the intended concept, across syntactic, type, and semantic accuracy dimensions.
eval
Candidate Screening Accuracy
Evaluates whether an AI candidate screening system correctly classifies job applicants for individual job requirements, measuring both overall accuracy and the direction of misclassifications.
eval
Candidate Screening Bias
Evaluates whether an AI candidate screening system produces consistent outcomes when protected attributes such as age, gender, or national origin are varied in an applicant's profile while all qualifying information remains unchanged.
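The underlying counterfactual test can be sketched as follows: vary only the protected attribute while holding everything else fixed, and require the outcome to be identical. `screen` here is any callable standing in for the system under test:

```python
def counterfactual_consistent(screen, profile: dict,
                              attribute: str, values: list) -> bool:
    """True if the screening outcome is identical across all values of the
    protected attribute, with all other profile fields unchanged."""
    outcomes = {screen({**profile, attribute: v}) for v in values}
    return len(outcomes) == 1
```

A screener that keys only on qualifications passes; one that keys on the protected attribute does not.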
eval
Candidate Screening Cyber Security
Evaluates whether an AI candidate screening system resists prompt injection attacks embedded in applicant documents that attempt to manipulate the screening outcome.
eval
Candidate Screening Resilience
Evaluates whether an AI candidate screening system handles malformed inputs gracefully by returning an ERROR response instead of failing silently.
eval
Candidate Screening Robustness
Evaluates whether an AI candidate screening system produces consistent predictions when the same applicant data is presented with meaning-preserving surface variations.
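A minimal robustness harness generates meaning-preserving surface variants and checks that predictions agree across them — a sketch using only case and whitespace changes; a fuller eval would add paraphrases and reorderings:

```python
def surface_variants(text: str) -> list[str]:
    """Meaning-preserving surface variations of an input (illustrative)."""
    return [text, text.upper(), text.lower(), "  ".join(text.split())]

def is_robust(predict, text: str) -> bool:
    """True if `predict` gives one prediction across all variants."""
    return len({predict(v) for v in surface_variants(text)}) == 1
```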
eval
Content Policy Compliance
Evaluates whether an AI application enforces its content policy by generating inputs designed to trigger violations and judging whether responses comply.
eval
Financial Reasoning
Evaluates whether AI models can correctly apply financial concepts, formulas, and quantitative reasoning without producing errors that could mislead users or result in financial harm if acted upon.
eval
Financial Safety
Evaluates whether AI models behave safely in financial advisory contexts by resisting sycophantic validation of unsound financial decisions and refusing to facilitate compliance violations such as insider trading, market manipulation, and money laundering.
eval
Harmful Content
Evaluates whether AI models appropriately refuse to generate harmful or unsafe content across categories such as violent crimes, hate speech, privacy violations, and cybercrime.
eval
Harmful Knowledge
Evaluates whether AI models retain dangerous dual-use knowledge in biosecurity, chemical security, and cybersecurity domains that could be misused for harmful purposes.
eval
Input Size Limits
Verifies that the model endpoint enforces input size limits by sending a request that exceeds the declared token limit and checking whether the oversized input is rejected.
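The probe can be sketched as follows. `endpoint` is any callable returning a status code and body — a real harness would wrap an HTTP call — and tokens are approximated by words; both are assumptions of this sketch:

```python
def check_input_size_limit(endpoint, declared_limit: int) -> bool:
    """Send an input above the declared token limit and report whether
    the endpoint rejects it with an expected client-error status."""
    oversized = "token " * (declared_limit + 100)  # words approximate tokens
    status, _ = endpoint(oversized)
    return status in (400, 413, 422)  # typical rejection codes
```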