Get Started
- 1. AI Risk Assessment 101
- 2. Write a Clear AI Safety Report
- 3. How to Build Your Reputation as an AI Safety Researcher on AIRTA Systems
- 4. Don't Cause Harm
- 5. DVAIA - Damn Vulnerable AI Application
- 6. How Invitations and Team Access Work
- 7. Understanding Program Safety Tiers
- 8. Risk CategoriesCurrent
- 9. Safe Harbour on AIRTA Systems
- 10. Black Box Testing
Risk Categories
Feb 24, 2026
When you test AI systems on AIRTA Systems, your findings feed into compliance, remediation, and-where applicable-safety certification. This article summarizes the risk categories the platform and program owners care about. Use it to prioritise what you look for and to describe your findings in terms that align with regulatory and safety language.
Core safety characteristics
These are the high-level properties that define whether an AI product is safe and fit for purpose:
- Validity and reliability: Does the AI function as specified and behave consistently over time and in different conditions?
- Robustness: Does it keep working correctly when faced with unexpected inputs, edge cases, or adversarial attempts?
- Fairness and bias mitigation: Does it produce inequitable, discriminatory, or unjust outcomes? Critical for consumer protection.
- Accountability: Is there clear ownership and responsibility for the AI system, its impacts, and its outputs across the lifecycle?
Real-world consumer harms and fundamental rights
Risks that can cause direct harm to people or violate their rights:
- Physical harm or danger to life.
- Psychological harm.
- Fraud and financial exploitation.
- Illegal discrimination and fundamental rights violations.
- Harm to vulnerable groups.
- Manipulative techniques.
- Violations of privacy, dignity, and autonomy.
- Unfair treatment and bias in critical decisions (e.g. hiring, credit, access to services).
When you find a bug or misbehaviour, ask: "Could this lead to one of these harms?" If yes, frame that in your report.
Product-level vulnerabilities
Technical and behavioural issues commonly assessed in AI products:
- Prompt injection attacks-inputs that override instructions or extract data.
- Hallucinations and misinformation-incorrect, fabricated, or misleading outputs presented as fact.
- Data leakage and privacy violations-sensitive or personal data exposed through the AI or its APIs.
- API security vulnerabilities-misuse of APIs, auth bypass, or abuse of capabilities.
- Autonomous agent behaviour failures-agents taking unsafe actions, ignoring safeguards, or misusing tools.
- Model manipulation and data poisoning-adversarial inputs that corrupt behaviour or outputs.
- Autonomy and control issues-who is in charge when the AI acts, and can humans intervene?
Using this when you report
In your report, use clear language and-where it fits-map your finding to one or more of these categories. For example: "This is a robustness issue: the model fails under edge-case inputs." Or: "This could lead to discrimination because the output differs unfairly by group." That helps program owners and compliance teams triage and document findings for regulatory and litigation defence. For report structure, see Write a Clear AI Safety Report.
Next Article
Continue reading in this category