AI Escape Room
mod_aiescape
AI Escape Room (mod_aiescape) is a Moodle 5.0+ activity module that runs an AI-driven interactive escape-room / scenario game through Moodle's core_ai subsystem. Students progress through a scenario by picking AI-generated multiple-choice buttons, typing free text, or both. A per-attempt step tally drives gradebook integration and activity completion. It supports narrative and named-persona styles, teacher-configurable good/neutral/bad choice counts, secondary prompt buttons, keyword-based moderation flagging, teacher preview attempts, a sortable attempts report with conversation replay, course reset, and a full Privacy API implementation.
This is a carefully engineered, security-aware plugin with no exploitable vulnerabilities affecting other users' data, grades, or the host system.
Access control is airtight across the attack surface. All four AJAX web-service endpoints (start_attempt, send_message, trigger_button, quit_attempt) call validate_parameters(), validate_context() (which enforces require_login() in core), and require_capability('mod/aiescape:play', ...), and scope every attempt lookup to $USER->id, so no user can read or mutate another user's attempt. The page controllers (view.php, report.php, myattempts.php) each enforce require_login() plus the appropriate capability.
The core anti-cheat design is sound. The server records the exact choices the AI offered each turn and rejects any submitted choice that was not offered, computing the step delta server-side for multiple-choice selections. This neutralises grade forgery and keyword-moderation bypass through direct web-service calls. Completion is derived from the numeric step tally rather than the AI's self-reported completed flag, and out-of-range step values are clamped — a deliberate defence against prompt injection that the author documents in code.
No SQL injection (all queries parameterised or via get_in_or_equal()), no XSS (Mustache auto-escaping, format_text(..., FORMAT_PLAIN), s()), no raw HTTP/cURL (delegates to core_ai), no direct database-driver or file-store access, no code execution, and no hardcoded secrets — AI endpoint credentials are even redacted before display. Schema changes are confined to db/upgrade.php, the Privacy API is complete (including the user-list provider and an external-AI disclosure), backup/restore round-trips are test-covered, and there is solid PHPUnit and Behat coverage.
The only issues found are low and informational: in free-text/combo modes the per-turn step delta is the AI's evaluation of student-controlled input, which is a self-only, inherent grade-integrity caveat (multiple-choice mode is tamper-proof); a cosmetic flagged-count mismatch on the report landing page; and an AI-cost/abuse note that is mitigated by core_ai's per-user rate limiter and per-activity attempt caps. None of these allow one user to harm another.
Overview
mod_aiescape (AI Escape Room) is a Moodle 5.0+ activity module that runs an AI-driven interactive scenario game through Moodle's core_ai subsystem. Students progress by making multiple-choice selections and/or typing free text; a per-attempt step tally drives completion and grading.
Security posture
The plugin is well-engineered from a security standpoint. Key strengths:
- Web-service endpoints all call validate_parameters(), validate_context() (which enforces require_login()), and require_capability('mod/aiescape:play', ...), and scope every attempt to $USER->id, so a user cannot act on another user's attempt.
- Anti-forgery choice validation: the server persists the exact choices the AI offered each turn (lastchoicejson) and rejects any submitted choicetype/choicelabel that was not offered, computing the good/neutral/bad step delta server-side. This blocks grade forgery and keyword-moderation bypass via crafted web-service calls.
- Prompt-injection awareness: completion is derived from the numeric step tally, never from the AI's self-reported completed flag, and out-of-range stepchange values are clamped to 0.
- No SQL injection (parameterised queries / get_in_or_equal()), no XSS (Mustache auto-escaping, format_text(..., FORMAT_PLAIN), s()), no raw HTTP/cURL (uses core_ai), no direct DB-driver or file-store access, no eval/exec, no hardcoded secrets (AI endpoint credentials are redacted before display), and all schema changes live in db/upgrade.php.
- Complete Privacy API (including the user-list provider and an external-AI disclosure), full backup/restore with round-trip tests, and good PHPUnit + Behat coverage.
Findings
Only low- and info-level observations were identified:
- In free-text/combo modes the per-turn step delta comes from the AI's evaluation of student-controlled input — a self-only, inherent grade-integrity caveat (multiple-choice mode is tamper-proof).
- A cosmetic flagged-attempts count on the report landing page includes teacher preview attempts, unlike the list it links to.
- An AI-cost/abuse note: each turn triggers a billable AI call, mitigated by core_ai rate limits and attempt caps.
No exploitable vulnerability affecting other users' data, grades, or the system was found.
Findings
In free text and combo game modes, when a student submits a free-text response (no choice button), the step delta applied to their tally is $result['stepchange'] — the value the AI returns after evaluating the student's own message. Because the student fully controls that message, prompt-injection-style input can coax the model into returning a favourable stepchange, letting the student advance the tally and ultimately reach the completion threshold (and the associated grade) without legitimately solving the scenario.
This does not affect multiple-choice selections: for good/neutral/bad choices the delta is computed server-side (+1/0/-1) and the submitted choice is validated against the server-recorded set the AI actually offered. The concern is limited to the AI-evaluated free-text path.
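The server-side check described above can be sketched in plain PHP, outside Moodle. This is a minimal illustration and not the plugin's code: the function name validate_choice(), the array shape of the offered choices, and the exception type are assumptions made for the example. Only the +1/0/-1 mapping and the rule that a submitted choice must match one the AI actually offered come from the review.

```php
<?php
declare(strict_types=1);

// Step delta per choice type (+1/0/-1, as described in the review).
const CHOICE_DELTAS = ['good' => 1, 'neutral' => 0, 'bad' => -1];

/**
 * Return the server-computed step delta for a submitted choice, or throw if
 * the type/label pair was not among the choices recorded for this turn.
 *
 * @param array $offered List of ['type' => string, 'label' => string] (assumed shape).
 */
function validate_choice(array $offered, string $type, string $label): int {
    foreach ($offered as $choice) {
        if ($choice['type'] === $type && $choice['label'] === $label) {
            // Assumption: any offered type outside the map carries no delta.
            return CHOICE_DELTAS[$type] ?? 0;
        }
    }
    throw new InvalidArgumentException('Choice was not offered this turn');
}

// Example data: the two choices recorded for the current turn.
$offered = [
    ['type' => 'good', 'label' => 'Check the desk drawer'],
    ['type' => 'bad', 'label' => 'Kick the door'],
];
```

Because both the type and the label must match a recorded entry, relabelling a bad choice as good in a crafted request is rejected rather than scored.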
This is an inherent property of AI-graded free text rather than an implementation flaw, and the author has already mitigated the most dangerous vector — completion is gated on the numeric tally, not on the AI's self-reported completed flag. It is surfaced here as a grade-trust consideration for teachers who use free-text grading for summative assessment.
Low risk. The impact is self-only — a student can influence only their own grade on a single, game-style activity; no other user's data or grade is reachable. The behaviour is inherent to any AI free-text evaluation and the author has deliberately contained the worst case (completion is derived from the numeric tally rather than the AI's completed flag, and multiple-choice mode is fully server-validated). Teachers retain oversight via the attempt-replay report and keyword flagging. It is not exploitable against other users or the platform, so it does not rise above a grade-integrity best-practice note.
Client flow: the free-text send path calls sendMessage(text, '', '') (empty choicetype). Server-side, send_message::execute() records the student's message and calls attempt_manager::run_ai_turn(), which builds the prompt from the conversation history (including the new message), calls core_ai, and parses the response. For choices, the server-computed $presetchange is used; for free text, $presetchange is null, so the AI's stepchange is applied via update_tally(). When the tally reaches the activity's configured steps value, the attempt completes and grades/completion are written for non-preview attempts.
$stepchange = ($presetchange !== null) ? $presetchange : $result['stepchange'];
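The line above can be isolated into a tiny standalone function to show which source wins on each path. The name select_stepchange() is hypothetical, and the sketch assumes $result is an array carrying a stepchange key, as the excerpt suggests.

```php
<?php
declare(strict_types=1);

/**
 * Pick the step delta for a turn: the server-computed value when a validated
 * choice was submitted, otherwise the AI's value for free text.
 */
function select_stepchange(?int $presetchange, array $result): int {
    return ($presetchange !== null) ? $presetchange : (int) $result['stepchange'];
}

// Choice turn: the AI's value is ignored in favour of the server's.
$choiceturn = select_stepchange(-1, ['stepchange' => 1]);

// Free-text turn: no preset exists, so the AI's value is applied.
$freetextturn = select_stepchange(null, ['stepchange' => 1]);
```

The second call is the path the finding is about: the only input to the delta is the model's judgement of text the student wrote.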
No pure-code fix fully removes this — it is intrinsic to letting an LLM score free text. Practical mitigations:
- For summative grading, prefer multiple choice mode, whose step deltas are server-computed and validated.
- Treat free-text/combo scoring as formative, and/or require a teacher review step before the grade is trusted.
- Document to teachers that free-text step outcomes are AI-determined and can be influenced by the student's wording.
$stepchange = isset($data['stepchange']) ? (int) $data['stepchange'] : 0;
if (!in_array($stepchange, self::VALID_STEPCHANGES, true)) {
    $stepchange = 0;
}
The parser already restricts stepchange to {-1, 0, 1}, resetting any other value to 0, which is correct. The residual trust issue is not the range but that the value originates from the AI's evaluation of attacker-controlled text; see the mode/usage guidance above.
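For readers who want to try the range check in isolation, the following is a self-contained sketch. The function name parse_stepchange() and the handling of malformed JSON are assumptions; the constant's real definition is not shown in the reviewed excerpt, so its value here is inferred from the {-1, 0, 1} set named above.

```php
<?php
declare(strict_types=1);

// Allowed deltas, inferred from the review text (assumption about the real constant).
const VALID_STEPCHANGES = [-1, 0, 1];

/**
 * Decode a JSON reply and return a stepchange restricted to the allowed set.
 * Anything missing, malformed, or out of range becomes 0.
 */
function parse_stepchange(string $json): int {
    $data = json_decode($json, true);
    if (!is_array($data)) {
        return 0; // Malformed reply: treated as no progress in this sketch.
    }
    $stepchange = isset($data['stepchange']) ? (int) $data['stepchange'] : 0;
    return in_array($stepchange, VALID_STEPCHANGES, true) ? $stepchange : 0;
}
```

A reply such as {"stepchange": 50} therefore yields 0, so an injected prompt can at most earn +1 per turn and cannot jump the tally in a single step.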
The attempts-report landing page (view 1) computes the flagged-attempts badge count without filtering out preview attempts, whereas the flagged-messages list it links to (view 4) and the activity-page badge in view.php both restrict to aa.ispreview = 0.
Because a teacher/manager previewing the activity in free-text/combo mode can themselves trigger keyword flags, the landing-page badge can report a higher number than the list actually shows. This is a cosmetic inconsistency with no security or data impact.
Low risk. Purely a display discrepancy visible to users who already hold mod/aiescape:viewreports (teachers/managers). No information disclosure, privilege, or data-integrity consequence — it is a technical-debt / correctness nit.
Three locations compute flagged counts: view.php (activity page badge, filters ispreview = 0), report.php view 4 (the flagged list query, filters ispreview = 0), and report.php view 1 (the landing-page badge, which omits the filter). Only the last is inconsistent.
$flaggedcount = $DB->count_records_sql(
    'SELECT COUNT(f.id)
       FROM {aiescape_flags} f
       JOIN {aiescape_attempts} aa ON aa.id = f.attemptid
      WHERE aa.aiescape = :aiescape',
    ['aiescape' => $aiescape->id]
);
Add the same preview filter used by the list and the view.php badge:
$flaggedcount = $DB->count_records_sql(
    'SELECT COUNT(f.id)
       FROM {aiescape_flags} f
       JOIN {aiescape_attempts} aa ON aa.id = f.attemptid
      WHERE aa.aiescape = :aiescape AND aa.ispreview = 0',
    ['aiescape' => $aiescape->id]
);
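To make the discrepancy concrete, here is a small in-memory model of the two counts. The rows are invented for illustration and stand in for aiescape_flags joined to aiescape_attempts; count_flags() is a hypothetical helper, not part of the plugin.

```php
<?php
declare(strict_types=1);

// Made-up joined rows: two flags on a student attempt, one on a teacher preview.
$flags = [
    ['attemptid' => 10, 'ispreview' => 0],
    ['attemptid' => 10, 'ispreview' => 0],
    ['attemptid' => 11, 'ispreview' => 1],
];

/** Count flags, optionally excluding those raised on preview attempts. */
function count_flags(array $flags, bool $excludepreview): int {
    return count(array_filter(
        $flags,
        fn(array $f): bool => !$excludepreview || $f['ispreview'] === 0
    ));
}

$badge = count_flags($flags, false); // Landing-page badge without the filter.
$list = count_flags($flags, true);   // What the flagged list shows.
```

With this data the unfiltered badge counts one more flag than the list displays, which is exactly the mismatch the added ispreview = 0 condition removes.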
Every send_message and trigger_button call performs at least one core_ai generate_text action, and the choice-format retry loop can issue several per turn (up to the choiceretrylimit admin setting). With the default maxattempts = -1 (unlimited), unlimited free-text turns per attempt, and secondary buttons that may have no usage limit, a single enrolled student can drive unbounded AI usage and cost.
The plugin correctly delegates to core_ai, which enforces per-user and global provider rate limits, so this is an operational/configuration consideration rather than a code defect. It is noted so administrators size rate limits and attempt caps deliberately.
Informational. Reaching the AI call requires authentication, enrolment, and the mod/aiescape:play capability. Impact is cost/availability (the institution's AI spend), not data exposure or cross-user harm, and core_ai's rate limiter plus per-activity attempt/button caps mitigate it. Included as an operational awareness note.
The plugin uses the sanctioned \core_ai\manager::process_action() path. In core, the provider's is_request_allowed() consults \core_ai\rate_limiter (per-user and global limits) before dispatching, so abuse is throttled where the admin has configured limits. Attempt limits (maxattempts) and per-button usage limits provide additional, activity-level bounds.
$result = $atman->run_ai_turn($aiescape, $context, $USER->id, $messages, (int) $attempt->stepstally);
- Ensure a sensible per-user rate limit is configured on the AI provider (Site administration → AI → AI providers); this is the primary control and is owned by core_ai.
- Consider a non-unlimited default for Maximum attempts and setting usage limits on secondary buttons.
- Optionally cap the number of turns per attempt for free-text/combo modes.
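The optional turn cap in the last recommendation could look roughly like the following. Everything here is hypothetical: the plugin has no such setting in the reviewed code, and both the $maxturns parameter and the ['role' => ...] message shape are assumptions chosen for the sketch.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether another student turn is allowed in this attempt.
 * $maxturns is a hypothetical per-activity setting; 0 or less means unlimited.
 *
 * @param array $messages Conversation so far, each ['role' => 'user'|'assistant'] (assumed shape).
 */
function turn_allowed(array $messages, int $maxturns): bool {
    if ($maxturns <= 0) {
        return true;
    }
    $studentturns = count(array_filter(
        $messages,
        fn(array $m): bool => $m['role'] === 'user'
    ));
    return $studentturns < $maxturns;
}

// Example conversation with two student turns so far.
$messages = [
    ['role' => 'user'],
    ['role' => 'assistant'],
    ['role' => 'user'],
    ['role' => 'assistant'],
];
```

A check of this kind would need to run before the AI call is dispatched, so that a refused turn costs nothing.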
for ($attempt = 0; $attempt <= $retrylimit; $attempt++) {
The retry loop is bounded by the admin choiceretrylimit (0–5), which is reasonable; just be aware each retry is an additional billable call when the AI returns malformed choices.
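The cost ceiling implied by that loop shape can be checked with a standalone sketch. The function generate_with_retries() and its two callables are illustrative stand-ins, and the early return on a valid reply is an assumption about how the plugin's loop exits; only the loop header is taken from the reviewed code.

```php
<?php
declare(strict_types=1);

/**
 * Call $generate until $isvalid accepts the reply or the retry budget is spent.
 * Returns [reply or null, number of calls made].
 */
function generate_with_retries(callable $generate, callable $isvalid, int $retrylimit): array {
    $calls = 0;
    for ($attempt = 0; $attempt <= $retrylimit; $attempt++) {
        $reply = $generate();
        $calls++;
        if ($isvalid($reply)) {
            return [$reply, $calls];
        }
    }
    return [null, $calls];
}

// A stub that always returns malformed output exhausts the budget.
[$reply, $calls] = generate_with_retries(
    fn(): string => 'garbage',
    fn(string $r): bool => $r === 'ok',
    5
);
```

Because the condition is <= rather than <, a limit of 5 allows up to six calls per turn (one initial call plus five retries), which is the figure to use when estimating worst-case spend.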
Strong anti-forgery design. The store_offered_choices() / get_offered_choices() pair plus the label-and-type match check in send_message::execute() mean a student cannot submit a choicetype the AI never offered, and free text cannot be smuggled through the choice-label field (the recorded message is replaced by the validated choice label). This closes the grade-forgery and moderation-bypass vectors that commonly affect AI-choice activities, and it is backed by the lastchoicejson column added in the final upgrade step.
Prompt-injection-aware completion. Completion is derived solely from the numeric step tally, and prompt_builder carries an explicit comment explaining why the AI's self-reported completed flag must never be trusted. Combined with clamping stepchange to {-1, 0, 1} and flooring the tally at 0, this reflects good security hygiene for an LLM-integrated activity.
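The tally rule described in that note can be expressed in a few lines. This is a sketch under stated assumptions: apply_step() is a hypothetical name, the return shape is invented, and the $aicompleted parameter exists only to show that the AI's flag plays no part in the result.

```php
<?php
declare(strict_types=1);

/**
 * Apply a step delta to the tally and derive completion from the tally alone.
 * $aicompleted is deliberately unused: the AI's self-reported flag is ignored.
 */
function apply_step(int $tally, int $stepchange, int $steps, bool $aicompleted): array {
    $newtally = max(0, $tally + $stepchange); // Floor the tally at 0.
    return ['tally' => $newtally, 'completed' => $newtally >= $steps];
}

// The AI claims completion, but the tally says otherwise.
$early = apply_step(0, -1, 5, true);

// The final step is reached on the numbers, whatever the AI reports.
$final = apply_step(4, 1, 5, false);
```

In the first call the attempt stays open at a tally of 0 despite the AI's claim; in the second it completes because the tally reached the threshold.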
No thirdpartylibs.xml required. No third-party runtime library is bundled in the plugin tree (no vendor/ directory is shipped). The committed composer.json/composer.lock declare only development tooling (squizlabs/php_codesniffer, moodlehq/moodle-cs) plus moodle/moodle and the composer installer, so there is nothing to declare and no duplication of a library that Moodle core already ships.
Privacy API is complete. The provider implements the metadata, plugin, and core user-list providers; exports attempts, messages, and keyword flags; deletes by context, by user, and by user-list; and discloses the external AI provider as a data recipient. The char(1333) length on the activity name field matches Moodle 5.x core convention (core grade_items.itemname and mod_forum.name are also 1333), so it is correct rather than an oversized column.