MDL Shield — Security Review

AI Escape Room

mod_aiescape

Table of Contents

Low (2)

#1 Free-text step progression is decided by the AI from student-controlled input (grade-integrity caveat)
#2 Flagged-attempts count on the report landing page includes teacher preview attempts, unlike the list it links to

Info (1)

#3 Each game turn triggers a billable AI request; abuse control relies on core_ai rate limits and attempt caps

Plugin Information

AI Escape Room (mod_aiescape) is a Moodle 5.0+ activity module that runs an AI-driven interactive escape-room / scenario game through Moodle's core_ai subsystem. Students progress through a scenario by picking AI-generated multiple-choice buttons, typing free text, or both. A per-attempt step tally drives gradebook integration and activity completion. It supports narrative and named-persona styles, teacher-configurable good/neutral/bad choice counts, secondary prompt buttons, keyword-based moderation flagging, teacher preview attempts, a sortable attempts report with conversation replay, course reset, and a full Privacy API implementation.

moodle.org/plugins/mod_aiescape

adamjenkins/moodle-mod_aiescape v1.0.5

Version: 2026062700

Release: 1.0.5

Reviewed for: 5.2

Privacy API

Unit Tests

Behat Tests

Reviewed: 2026-07-05

43 files · 7,356 lines

Grade Justification

This is a carefully engineered, security-aware plugin with no exploitable vulnerabilities affecting other users' data, grades, or the host system.

Access control is enforced consistently across the reviewed entry points. All four AJAX web-service endpoints (start_attempt, send_message, trigger_button, quit_attempt) call validate_parameters(), validate_context() (which enforces require_login() in core), and require_capability('mod/aiescape:play', ...), and scope every attempt lookup to $USER->id, so no user can read or mutate another user's attempt. The page controllers (view.php, report.php, myattempts.php) each enforce require_login() plus the appropriate capability.

The core anti-cheat design is sound. The server records the exact choices the AI offered each turn and rejects any submitted choice that was not offered, computing the step delta server-side for multiple-choice selections. This neutralises grade forgery and keyword-moderation bypass through direct web-service calls. Completion is derived from the numeric step tally rather than the AI's self-reported completed flag, and stepchange values outside {-1, 0, 1} are reset to 0, a deliberate defence against prompt injection that the author documents in code.

No SQL injection (all queries parameterised or via get_in_or_equal()), no XSS (Mustache auto-escaping, format_text(..., FORMAT_PLAIN), s()), no raw HTTP/cURL (delegates to core_ai), no direct database-driver or file-store access, no code execution, and no hardcoded secrets — AI endpoint credentials are even redacted before display. Schema changes are confined to db/upgrade.php, the Privacy API is complete (including the user-list provider and an external-AI disclosure), backup/restore round-trips are test-covered, and there is solid PHPUnit and Behat coverage.

The only issues found are low and informational: in free-text/combo modes the per-turn step delta is the AI's evaluation of student-controlled input, which is a self-only, inherent grade-integrity caveat (multiple-choice mode is tamper-proof); a cosmetic flagged-count mismatch on the report landing page; and an AI-cost/abuse note that is mitigated by core_ai's per-user rate limiter and per-activity attempt caps. None of these allow one user to harm another.

AI Summary

Overview

mod_aiescape (AI Escape Room) is a Moodle 5.0+ activity module that runs an AI-driven interactive scenario game through Moodle's core_ai subsystem. Students progress by making multiple-choice selections and/or typing free text; a per-attempt step tally drives completion and grading.

Security posture

The plugin is well-engineered from a security standpoint. Key strengths:

Web-service endpoints all call validate_parameters(), validate_context() (which enforces require_login()), and require_capability('mod/aiescape:play', ...), and scope every attempt to $USER->id, so a user cannot act on another user's attempt.
Anti-forgery choice validation — the server persists the exact choices the AI offered each turn (lastchoicejson) and rejects any submitted choicetype/choicelabel that was not offered, computing the good/neutral/bad step delta server-side. This blocks grade forgery and keyword-moderation bypass via crafted web-service calls.
Prompt-injection awareness — completion is derived from the numeric step tally, never from the AI's self-reported completed flag, and out-of-range stepchange values are clamped to 0.
No SQL injection (parameterised queries / get_in_or_equal()), no XSS (Mustache auto-escaping, format_text(..., FORMAT_PLAIN), s()), no raw HTTP/cURL (uses core_ai), no direct DB-driver or file-store access, no eval/exec, no hardcoded secrets (AI endpoint credentials are redacted before display), and all schema changes live in db/upgrade.php.
Complete Privacy API (including the user-list provider and an external-AI disclosure), full backup/restore with round-trip tests, and good PHPUnit + Behat coverage.

Findings

Only low- and info-level observations were identified:

In free-text/combo modes the per-turn step delta comes from the AI's evaluation of student-controlled input — a self-only, inherent grade-integrity caveat (multiple-choice mode is tamper-proof).
A cosmetic flagged-attempts count on the report landing page includes teacher preview attempts, unlike the list it links to.
An AI-cost/abuse note: each turn triggers a billable AI call, mitigated by core_ai rate limits and attempt caps.

No exploitable vulnerability affecting other users' data, grades, or the system was found.

Findings

best practice · Low

Free-text step progression is decided by the AI from student-controlled input (grade-integrity caveat)

In free-text and combo game modes, when a student submits a free-text response (no choice button), the step delta applied to their tally is $result['stepchange'], the value the AI returns after evaluating the student's own message. Because the student fully controls that message, prompt-injection-style input can coax the model into returning a favourable stepchange, letting the student advance the tally and ultimately reach the completion threshold (and the associated grade) without legitimately solving the scenario.

This does not affect multiple-choice selections: for good/neutral/bad choices the delta is computed server-side (+1/0/-1) and the submitted choice is validated against the server-recorded set the AI actually offered. The concern is limited to the AI-evaluated free-text path.

This is an inherent property of AI-graded free text rather than an implementation flaw, and the author has already mitigated the most dangerous vector — completion is gated on the numeric tally, not on the AI's self-reported completed flag. It is surfaced here as a grade-trust consideration for teachers who use free-text grading for summative assessment.

Risk Assessment

Low risk. The impact is self-only — a student can influence only their own grade on a single, game-style activity; no other user's data or grade is reachable. The behaviour is inherent to any AI free-text evaluation and the author has deliberately contained the worst case (completion is derived from the numeric tally rather than the AI's completed flag, and multiple-choice mode is fully server-validated). Teachers retain oversight via the attempt-replay report and keyword flagging. It is not exploitable against other users or the platform, so it does not rise above a grade-integrity best-practice note.

Context

Client flow: the free-text send path calls sendMessage(text, '', '') (empty choicetype). Server-side, send_message::execute() records the student's message and calls attempt_manager::run_ai_turn(), which builds the prompt from conversation history (including the new message), calls core_ai, and parses the response. For choices, the server-computed $presetchange is used; for free text, $presetchange is null, so the AI's stepchange is applied via update_tally(). When the tally reaches the steps threshold, the attempt completes and grades/completion are written for non-preview attempts.
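The delta selection and completion rule described above can be sketched as follows. This is a minimal standalone illustration, not the plugin's code: pick_step_delta() and apply_step_delta() are hypothetical names, and the floor at 0 follows the behaviour noted elsewhere in this review.

```php
<?php
// Hypothetical sketch; these function names are illustrative, not the plugin's API.

/**
 * Pick the step delta for a turn: the server-computed preset for a validated
 * choice, otherwise the AI's (already normalised) stepchange.
 */
function pick_step_delta(?int $presetchange, int $aistepchange): int {
    return $presetchange ?? $aistepchange;
}

/**
 * Apply a delta to the tally, flooring at 0, and derive completion purely
 * from the numeric tally (never from an AI-reported flag).
 *
 * @return array{tally: int, completed: bool}
 */
function apply_step_delta(int $tally, int $delta, int $steps): array {
    $newtally = max(0, $tally + $delta);
    return ['tally' => $newtally, 'completed' => $newtally >= $steps];
}

// Free-text turn: no preset, so the AI's value decides the delta.
$freetext = apply_step_delta(4, pick_step_delta(null, 1), 5);

// Multiple-choice turn: the preset overrides whatever the AI returned.
$choice = apply_step_delta(0, pick_step_delta(-1, 1), 5);
```

In the free-text case the only input to the delta is the AI's evaluation, which is exactly the trust boundary this finding describes.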

classes/external/send_message.php:156

Identified Code

$stepchange = ($presetchange !== null) ? $presetchange : $result['stepchange'];

Suggested Fix

No pure-code fix fully removes this — it is intrinsic to letting an LLM score free text. Practical mitigations:

For summative grading, prefer multiple-choice mode, whose step deltas are server-computed and validated.
Treat free-text/combo scoring as formative, and/or require a teacher review step before the grade is trusted.
Document to teachers that free-text step outcomes are AI-determined and can be influenced by the student's wording.

classes/ai/response_parser.php:72

Identified Code

$stepchange = isset($data['stepchange']) ? (int) $data['stepchange'] : 0;

if (!in_array($stepchange, self::VALID_STEPCHANGES, true)) {
    $stepchange = 0;
}

Suggested Fix

The parser already restricts stepchange to {-1, 0, 1}, resetting any other value to 0, which is correct. The residual trust issue is not the range but that the value originates from the AI's evaluation of attacker-controlled text; see the mode/usage guidance above.
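To make the normalisation concrete, here is a standalone sketch of the same check applied to a few sample inputs. The function name normalise_stepchange() and the top-level constant are assumptions for illustration; in the plugin the list lives in self::VALID_STEPCHANGES.

```php
<?php
// Hypothetical standalone version of the check quoted above.
const VALID_STEPCHANGES = [-1, 0, 1];

/**
 * Return the stepchange from decoded AI output, or 0 when it is missing
 * or falls outside the allowed set.
 */
function normalise_stepchange(array $data): int {
    $stepchange = isset($data['stepchange']) ? (int) $data['stepchange'] : 0;
    return in_array($stepchange, VALID_STEPCHANGES, true) ? $stepchange : 0;
}

$examples = [
    'in range'     => normalise_stepchange(['stepchange' => 1]),
    'numeric text' => normalise_stepchange(['stepchange' => '-1']),
    'too large'    => normalise_stepchange(['stepchange' => 5]),
    'missing'      => normalise_stepchange([]),
];
```

Note that an out-of-range value such as 5 becomes 0 rather than 1, so an injected large value cannot advance the tally by even a single step.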

code quality · Low

Flagged-attempts count on the report landing page includes teacher preview attempts, unlike the list it links to

The attempts-report landing page (view 1) computes the flagged-attempts badge count without filtering out preview attempts, whereas the flagged-messages list it links to (view 4) and the activity-page badge in view.php both restrict to aa.ispreview = 0.

Because a teacher/manager previewing the activity in free-text/combo mode can themselves trigger keyword flags, the landing-page badge can report a higher number than the list actually shows. This is a cosmetic inconsistency with no security or data impact.
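A small worked example of the mismatch, using hypothetical in-memory rows in place of the joined flags/attempts query (the row shape is an assumption made for illustration):

```php
<?php
// Hypothetical rows standing in for flags joined to their attempts.
$flags = [
    ['attemptid' => 1, 'ispreview' => 0], // Flag on a student attempt.
    ['attemptid' => 2, 'ispreview' => 1], // Flag on a teacher preview attempt.
    ['attemptid' => 2, 'ispreview' => 1], // Second flag on the same preview attempt.
];

// Landing-page badge: counts every flag, previews included.
$badgecount = count($flags);

// Linked list: counts only flags on non-preview attempts.
$listcount = count(array_filter(
    $flags,
    fn(array $flag): bool => $flag['ispreview'] === 0
));
```

With these rows the badge would read 3 while the linked list shows a single entry.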

Risk Assessment

Low risk. Purely a display discrepancy visible to users who already hold mod/aiescape:viewreports (teachers/managers). There is no information-disclosure, privilege-escalation, or data-integrity consequence; it is a technical-debt / correctness nit.

Context

Three locations compute flagged counts: view.php (activity page badge, filters ispreview = 0), report.php view 4 (the flagged list query, filters ispreview = 0), and report.php view 1 (the landing-page badge, which omits the filter). Only the last is inconsistent.

report.php:212

Identified Code

$flaggedcount = $DB->count_records_sql(
    'SELECT COUNT(f.id)
       FROM {aiescape_flags} f
       JOIN {aiescape_attempts} aa ON aa.id = f.attemptid
      WHERE aa.aiescape = :aiescape',
    ['aiescape' => $aiescape->id]
);

Suggested Fix

Add the same preview filter used by the list and the view.php badge:

$flaggedcount = $DB->count_records_sql(
    'SELECT COUNT(f.id)
       FROM {aiescape_flags} f
       JOIN {aiescape_attempts} aa ON aa.id = f.attemptid
      WHERE aa.aiescape = :aiescape AND aa.ispreview = 0',
    ['aiescape' => $aiescape->id]
);

best practice · Info

Each game turn triggers a billable AI request; abuse control relies on core_ai rate limits and attempt caps

Every send_message and trigger_button call performs at least one core_ai generate_text action, and the choice-format retry loop can issue several per turn (one initial call plus up to choiceretrylimit retries, as set by the admin). With the default maxattempts = -1 (unlimited), unlimited free-text turns per attempt, and secondary buttons that may have no usage limit, a single enrolled student can drive unbounded AI usage and cost.

The plugin correctly delegates to core_ai, which enforces per-user and global provider rate limits where the administrator has configured them, so this is an operational/configuration consideration rather than a code defect. It is noted so that administrators size rate limits and attempt caps deliberately.

Risk Assessment

Informational. Reaching the AI call requires authentication, enrolment, and the mod/aiescape:play capability. Impact is cost/availability (the institution's AI spend), not data exposure or cross-user harm, and core_ai's rate limiter plus per-activity attempt/button caps mitigate it. Included as an operational awareness note.

Context

The plugin uses the sanctioned \core_ai\manager::process_action() path. In core, the provider's is_request_allowed() consults \core_ai\rate_limiter (per-user and global limits) before dispatching, so abuse is throttled where the admin has configured limits. Attempt limits (maxattempts) and per-button usage limits provide additional, activity-level bounds.

classes/external/send_message.php:153

Identified Code

$result   = $atman->run_ai_turn($aiescape, $context, $USER->id, $messages, (int) $attempt->stepstally);

Suggested Fix

Ensure a sensible per-user rate limit is configured on the AI provider (Site administration → AI → AI providers) — this is the primary control and is owned by core_ai.
Consider a non-unlimited default for Maximum attempts and setting usage limits on secondary buttons.
Optionally cap the number of turns per attempt for free-text/combo modes.
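The optional per-attempt turn cap could look roughly like the sketch below. Everything here is hypothetical: the plugin has no such setting in the reviewed version, and the names can_take_turn() and $maxturns are invented for illustration.

```php
<?php
// Hypothetical sketch: $maxturns is an assumed setting, not an existing plugin option.

/**
 * Decide whether another AI turn may run for this attempt.
 * In this sketch a $maxturns of 0 or less means "no cap".
 */
function can_take_turn(int $turnsused, int $maxturns): bool {
    if ($maxturns <= 0) {
        return true;
    }
    return $turnsused < $maxturns;
}
```

A check of this kind would need to run server-side before the AI call is dispatched, so that a crafted web-service request cannot bypass it.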

classes/attempt_manager.php:243

Identified Code

for ($attempt = 0; $attempt <= $retrylimit; $attempt++) {

Suggested Fix

The retry loop is bounded by the admin choiceretrylimit (0–5), which is reasonable; just be aware each retry is an additional billable call when the AI returns malformed choices.
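As a rough sizing aid, the upper bound on billable calls follows from the loop condition quoted above ($attempt <= $retrylimit gives at most $retrylimit + 1 iterations). The helper below is a hypothetical sketch that assumes every turn exhausts its retries, which real usage will rarely do.

```php
<?php
// Hypothetical helper: upper bound on billable AI calls for one attempt.

/**
 * Worst case: every turn makes one initial call plus $retrylimit retries.
 */
function worst_case_calls(int $turns, int $retrylimit): int {
    return $turns * ($retrylimit + 1);
}

// Example: 30 turns with the retry limit at its maximum of 5.
$calls = worst_case_calls(30, 5);
```

Under these assumptions a 30-turn attempt could cost up to 180 calls, compared with 30 when the retry limit is 0.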

Additional AI Notes

Strong anti-forgery design. The store_offered_choices() / get_offered_choices() pair plus the label-and-type match check in send_message::execute() mean a student cannot submit a choicetype the AI never offered, and free text cannot be smuggled through the choice-label field (the recorded message is replaced by the validated choice label). This closes the grade-forgery and moderation-bypass vectors that commonly affect AI-choice activities, and it is backed by the lastchoicejson column added in the final upgrade step.
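The offered-choice check can be illustrated with a minimal standalone sketch. It assumes the offered choices are stored as a JSON list of type/label pairs; the function name validate_choice(), the field names, and the sample labels are hypothetical and do not reproduce the plugin's store_offered_choices() / get_offered_choices() implementation.

```php
<?php
// Hypothetical sketch; function and field names are illustrative only.
const STEP_DELTAS = ['good' => 1, 'neutral' => 0, 'bad' => -1];

/**
 * Return the server-computed step delta for a submitted choice, or null
 * when the type/label pair was not among the choices recorded as offered.
 *
 * @param string $offeredjson JSON list of {type, label} objects stored server-side.
 */
function validate_choice(string $offeredjson, string $type, string $label): ?int {
    $offered = json_decode($offeredjson, true);
    if (!is_array($offered) || !array_key_exists($type, STEP_DELTAS)) {
        return null;
    }
    foreach ($offered as $choice) {
        if (is_array($choice)
                && ($choice['type'] ?? null) === $type
                && ($choice['label'] ?? null) === $label) {
            return STEP_DELTAS[$type];
        }
    }
    return null;
}

$offeredjson = json_encode([
    ['type' => 'good', 'label' => 'Check the logbook'],
    ['type' => 'bad', 'label' => 'Force the door'],
]);

// Offered pair: the delta comes from the server-side map.
$ok = validate_choice($offeredjson, 'bad', 'Force the door');
// Forged pair: this label was never offered as a "good" choice.
$forged = validate_choice($offeredjson, 'good', 'Force the door');
```

Matching on both type and label is what prevents a student from relabelling a bad choice as good, or from passing arbitrary text through the label field.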

Prompt-injection-aware completion. Completion is derived solely from the numeric step tally, and prompt_builder carries an explicit comment explaining why the AI's self-reported completed flag must never be trusted. Combined with restricting stepchange to {-1, 0, 1} and flooring the tally at 0, this reflects good security hygiene for an LLM-integrated activity.

No thirdpartylibs.xml required. No third-party runtime library is bundled in the plugin tree (no vendor/ directory is shipped). The committed composer.json/composer.lock declare only development tooling (squizlabs/php_codesniffer, moodlehq/moodle-cs) plus moodle/moodle and the composer installer, so there is nothing to declare and no duplication of a library that Moodle core already ships.

Privacy API is complete. The provider implements the metadata, plugin, and core user-list providers; exports attempts, messages, and keyword flags; deletes by context, by user, and by user-list; and discloses the external AI provider as a data recipient. The char(1333) length on the activity name field matches Moodle 5.x core convention (core grade_items.itemname and mod_forum.name are also 1333), so it is correct rather than an oversized column.

This review was generated by an AI system and may contain inaccuracies. Findings should be verified by a human reviewer before acting on them.