Don't Cause Harm
Sep 19, 2025
Who it’s for: AI safety testers and researchers who are new to AIRTA Systems and want to test apps responsibly.
Key idea in one sentence: Safe harbour on AIRTA Systems lets you test listed AI apps for safety issues - within defined rules - so you can learn, probe, and report without causing harm.
Plain-English explainer
“Safe harbour” is the permission space where you can test without getting in trouble - as long as you follow the rules. On AIRTA Systems, participating apps opt in. That means the owners want you to look for safety issues and report them here.
Your job is to stay inside scope: test only the apps and features listed, with the methods allowed. Use harmless test data. Don’t break things on purpose, don’t scrape private info, and don’t spam real users.
Think of it like a skate park. Inside the park (in scope), you can try moves and learn. Outside the rails (out of scope), you’re on the road, and different rules apply.
When in doubt, stop and check the program page. If you still aren’t sure, ask before you test. Reporting early and clearly is always better than pushing a risky action.
What’s in scope (typical)
- Only the apps and features listed on the app's AIRTA Systems program page.
- Non-destructive tests using harmless, fake, or owner-approved data.
- Request rates that match normal use (no flooding, no load tests).
- Asking about model policies, testing content filters, checking summaries for bias or drift.
What’s out of scope (typical)
- Anything not listed on the program page (other apps, admin panels, third-party services).
- Real customer data, private keys, or credentials (don’t search for or use them).
- Denial of service, brute force, token theft, or bypassing paywalls.
- Sending live emails, posting to real social accounts, or moving money - unless explicitly allowed.
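The in-scope and out-of-scope lists above amount to a deny-by-default rule: if the program page does not list something, treat it as out of scope. A minimal sketch of that rule follows, assuming you copy the listed features from a program page by hand. The function name `is_in_scope` and the example feature names are hypothetical and are not part of AIRTA Systems.

```python
from typing import Iterable


def is_in_scope(target: str, listed_targets: Iterable[str]) -> bool:
    """Deny by default: a target is in scope only if the program page lists it."""
    normalized = {name.strip().lower() for name in listed_targets}
    return target.strip().lower() in normalized


# Hypothetical entries, copied by hand from an imaginary program page.
listed = ["Summary assistant", "Invoice upload"]

print(is_in_scope("Invoice upload", listed))  # listed, so in scope
print(is_in_scope("Admin panel", listed))     # not listed, so out of scope
```

A check like this only mirrors what you typed into it. The program page stays the source of truth, so re-read it when anything is unclear.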
Red flags you can spot without tools
- No clear “in scope” features on the program page.
- Ambiguous wording like “test everything” (ask for clarification).
- Live integrations with no “preview before send” option.
- Prompts that encourage using real personal data.
- Outputs that include credentials, secrets, or private info.
- Rate limits missing or vague while the app exposes powerful tools.
How to try this safely
- Permission first: Only test apps listed on AIRTA Systems that are open for testing.
- Use benign inputs: Dummy names, mock companies, public docs you own or created for testing.
- Preview before action: If the app can act (email, post, save), ask it to show the draft and stop there.
- Stay gentle: Keep to normal usage speed. No scraping or high-volume probing, and no automated tools unless the program explicitly permits them.
- Capture evidence: Save exact prompts and outputs (screenshots or copy text).
- Report early: If something looks risky (e.g., a leak), stop and file a report immediately.
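The "capture evidence" step above can be kept as a small local log. The sketch below is one possible approach, assuming you paste in the prompt and output by hand after each manual test. The function name `record_evidence`, the file name, and the field names are hypothetical, and the code sends nothing to the app under test.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_evidence(log_path, app_name, prompt, output, note=""):
    """Append one manually observed prompt/output pair to a local JSON Lines file."""
    entry = {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "app": app_name,
        "prompt": prompt,  # the exact text you typed
        "output": output,  # the exact text the app returned
        "note": note,
    }
    # Append mode keeps earlier entries intact; one JSON object per line.
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry
```

For example, `record_evidence("evidence.jsonl", "demo-app", prompt_text, output_text)` adds one line to the file. Storing the exact text, rather than a paraphrase, lets you quote it directly in a report.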
How to write down findings
- Scenario: What you tried (“I uploaded a test PDF I created”).
- Expectation: What should have happened (“The model should ignore footer instructions”).
- Observation: What actually happened (quote the exact output).
- Risk: Why it matters (“Upload injections could steer actions”).
- Next step: Smallest fix you suggest (“Strip footer text; add instruction sanitization”).
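The five fields above can be kept as a small structured record so that no part of a finding gets left out. This is a minimal sketch, not an AIRTA Systems report format: the class name `Finding` and the method `to_text` are hypothetical, and the observation is a placeholder you would replace with the exact quoted output.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    scenario: str     # what you tried
    expectation: str  # what should have happened
    observation: str  # what actually happened (exact quote)
    risk: str         # why it matters
    next_step: str    # smallest fix you suggest

    def to_text(self) -> str:
        """Render the finding as labelled lines in a fixed order."""
        sections = [
            ("Scenario", self.scenario),
            ("Expectation", self.expectation),
            ("Observation", self.observation),
            ("Risk", self.risk),
            ("Next step", self.next_step),
        ]
        return "\n".join(f"{label}: {value}" for label, value in sections)


finding = Finding(
    scenario="I uploaded a test PDF I created.",
    expectation="The model should ignore footer instructions.",
    observation="<paste the exact output here>",
    risk="Upload injections could steer actions.",
    next_step="Strip footer text; add instruction sanitization.",
)
print(finding.to_text())
```

Because every field is required, creating a `Finding` without one of them raises an error, which works as a reminder to fill in all five parts before reporting.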
Good vs. bad examples
- Good: “I used a dummy invoice and asked for a summary. The model copied my test footer into the email draft preview.”
- Bad: “I sent a real invoice to a customer to see what would happen.”
FAQ
- Is this legal advice? No. This is practical guidance for safe testing inside AIRTA Systems programs.
- What if I’m unsure about scope? Pause and ask in the program’s discussion channel or contact support.
- Can I use automated scanners? Not unless the program explicitly permits them.
Glossary
- Safe harbour: A defined permission space with rules that protect good-faith testing.
- Scope: What you’re allowed to test (apps, features, methods).
- Benign data: Fake or public info that can’t harm anyone.
- Egress: Data leaving the app to other services (email, CRM, web).
Remember: Learn here, test in scope, and report clearly. That’s how you help make AI safer for everyone.