Understanding Program Safety Tiers

    Feb 24, 2026

    Programs on AIRTA Systems are classified into safety tiers based on the role AI plays in the product and the compliance obligations that follow. Knowing the tier helps you understand what kind of risks to look for and how your findings support the program owner's goals. When you're testing, the program brief may indicate or imply the tier-use it to prioritize what to look for.

    Why tiers matter for testers

    Each tier has a different focus-look for the risks that matter most for that type of product. The program brief may indicate or imply the tier.

    Tier 1: Code-Level

    What it is: Applications built with AI-generated code but no runtime AI-the app doesn't call an AI model while the user is using it. AI was used to build the product; the product itself isn't an AI system in operation.

    What to focus on:

    • Product liability risks and defects from generated code.
    • Consumer safety and fundamental rights protection.
    • Compliance documentation and evidence of due diligence.

    Testing is more like traditional software/product testing: correctness, safety of behaviour, and whether the built artefact is fit for purpose.

    Tier 2: AI-Enhanced

    What it is: Applications that use AI for specific features-e.g. a chatbot, a summariser, or a recommendation engine inside an otherwise non-AI product.

    What to focus on:

    • Prompt injection-can inputs steer the model to misbehave or leak data?
    • Data leakage and privacy-does the AI or the feature expose sensitive data?
    • Hallucinated or misleading outputs-wrong facts, fake citations, harmful advice.
    • Consumer protection-fairness, accuracy, and safety of the AI-driven feature.

    This is where many "classic" LLM risks show up: injection, hallucinations, and misuse of the AI feature.

    Tier 3: AI-Native

    What it is: Applications where AI is the core product-e.g. an autonomous agent, a primary conversational AI, or a decision-making system. The main value and risk come from the AI.

    What to focus on:

    • Alignment failures-does the system pursue goals or behaviours that diverge from intended use?
    • Unsafe autonomy-can the AI take harmful or high-impact actions without appropriate guardrails?
    • Fundamental rights violations and disproportionate impact on individuals or groups.
    • Comprehensive compliance requirements (e.g. EU AI Act) that apply to high-risk or general-purpose AI.

    Testing here often involves agent behaviour, tool use, multi-step reasoning, and system-level safety, not just single prompts.

    Quick reference

    • Tier 1: No runtime AI → product liability, consumer safety, compliance evidence.
    • Tier 2: AI for features → prompt injection, data leakage, hallucinations, consumer protection.
    • Tier 3: AI as core product → alignment, autonomy, fundamental rights, full compliance.

    When in doubt, check the program brief and scope; they should guide you on what's in bounds and what the program owner cares about most.

    Next Article

    Continue reading in this category

    Understanding Program Safety Tiers | AIRTA Systems AI Safety Academy