Patrick Glatz | System Investigations

AI Authorship Evaluator

Failure Mode: AI-assisted workflows shift control of decisions and reasoning away from the human without detection

Intervention: Rule-based comparison of expected vs observed authorship

Outcome: Undetected transfer of authorship control becomes impossible

- Task conditions define expected authorship range
- Evaluation measures agency, cognitive engagement, evidentiary control, and authorship integrity
- Observed authorship is computed using structured criteria
- System compares expected vs observed authorship
- Output quality and correctness are excluded from evaluation
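
The comparison described above can be sketched in a few lines. This is a minimal illustration, not the evaluator's actual code: the 0 to 1 scale, the equal-weight mean, and the names `ExpectedRange`, `observed_authorship`, and `compare` are all assumptions made for the example.

```python
from dataclasses import dataclass

# The four dimensions named in the list above; the 0-1 scale is an assumption.
DIMENSIONS = (
    "agency",
    "cognitive_engagement",
    "evidentiary_control",
    "authorship_integrity",
)


@dataclass(frozen=True)
class ExpectedRange:
    """Expected authorship range derived from task conditions (hypothetical)."""
    low: float
    high: float


def observed_authorship(scores: dict[str, float]) -> float:
    """Combine per-dimension scores; the equal-weight mean is an assumption."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)


def compare(expected: ExpectedRange, scores: dict[str, float]) -> str:
    """Report where observed authorship falls relative to the expected range."""
    observed = observed_authorship(scores)
    if observed < expected.low:
        return "below_expected"
    if observed > expected.high:
        return "above_expected"
    return "within_expected"
```

Note that nothing in the sketch inspects the output itself, which mirrors the rule that quality and correctness are excluded from evaluation.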

→ GitHub Link

Advisory Drift Study

Failure Mode: Mid-process context injection replaces initial recommendations with new classifications and constraints

Intervention: Fixed interaction sequence with delayed context injection and post-hoc scoring

Outcome: Recommendation replacement without trace becomes impossible

- Initial recommendation is required before additional context
- Sessions are isolated with no iteration or mid-run edits
- Context injection occurs only after recommendation is established
- Outputs are captured before any scoring occurs
- Evaluation scores classification shift, constraint durability, and urgency tracking
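
One way to picture the fixed sequence is a small state machine that rejects any step taken out of order. The sketch below is an assumption-laden illustration, not the study's tooling: the stage names, the `Session` class, and the `score` helper are hypothetical.

```python
from enum import Enum, auto
from typing import Callable, Optional


class Stage(Enum):
    AWAITING_RECOMMENDATION = auto()
    AWAITING_CONTEXT = auto()
    AWAITING_RESPONSE = auto()
    AWAITING_CAPTURE = auto()
    CAPTURED = auto()


class Session:
    """One isolated run; each step is allowed exactly once, in order."""

    def __init__(self) -> None:
        self.stage = Stage.AWAITING_RECOMMENDATION
        self._entries: list = []
        self.captured: Optional[tuple] = None

    def _step(self, required: Stage, nxt: Stage, kind: str, text: str) -> None:
        if self.stage is not required:
            raise RuntimeError(f"{kind} not allowed during {self.stage.name}")
        self._entries.append((kind, text))
        self.stage = nxt

    def record_recommendation(self, text: str) -> None:
        self._step(Stage.AWAITING_RECOMMENDATION, Stage.AWAITING_CONTEXT,
                   "recommendation", text)

    def inject_context(self, text: str) -> None:
        # Only reachable after the initial recommendation exists.
        self._step(Stage.AWAITING_CONTEXT, Stage.AWAITING_RESPONSE,
                   "context", text)

    def record_response(self, text: str) -> None:
        self._step(Stage.AWAITING_RESPONSE, Stage.AWAITING_CAPTURE,
                   "response", text)

    def capture(self) -> tuple:
        if self.stage is not Stage.AWAITING_CAPTURE:
            raise RuntimeError(f"capture not allowed during {self.stage.name}")
        self.captured = tuple(self._entries)  # frozen copy for scoring
        self.stage = Stage.CAPTURED
        return self.captured


def score(session: Session, scorer: Callable[[tuple], object]) -> object:
    """Post-hoc scoring: refuses to run until outputs have been captured."""
    if session.captured is None:
        raise RuntimeError("outputs must be captured before scoring")
    return scorer(session.captured)
```

Because every method moves the stage forward and none moves it back, the sketch has no path for iteration or mid-run edits.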

→ View Case Study (PDF)

Classroom System Study

Failure Mode: Instructional escalation is driven by staff discretion under delayed behavioral data

Intervention: Rule-governed instructional lanes with interval-based data capture

Outcome: Discretionary escalation outside defined pressure rules becomes impossible

- Instruction is limited to three lanes with rule-based transitions
- Escalation decisions are governed by predefined pressure rules
- Data is captured at fixed intervals with append-only entry
- Graph generation occurs automatically without manual transfer
- Classroom zones are mapped to instructional pressure states
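
As a rough illustration of rule-governed lane transitions over interval data, consider the sketch below. The case study's actual pressure rules are not given here, so the lane names, the threshold `ESCALATE_AT`, the look-back `WINDOW`, and the rule itself are all hypothetical stand-ins.

```python
LANES = ("lane_1", "lane_2", "lane_3")  # hypothetical lane names
ESCALATE_AT = 3  # hypothetical: events per interval that count as pressure
WINDOW = 2       # hypothetical: consecutive intervals a rule looks at


class IntervalLog:
    """Append-only record of per-interval behavior counts."""

    def __init__(self) -> None:
        self._counts: list = []

    def append(self, count: int) -> None:
        if count < 0:
            raise ValueError("count must be non-negative")
        self._counts.append(count)  # no method edits or removes entries

    def recent(self, n: int) -> tuple:
        return tuple(self._counts[-n:])


def next_lane(current: str, log: IntervalLog) -> str:
    """Pick the next lane from logged data only; no discretionary input."""
    window = log.recent(WINDOW)
    i = LANES.index(current)
    if len(window) < WINDOW:
        return current  # not enough data yet to apply a rule
    if all(c >= ESCALATE_AT for c in window):
        return LANES[min(i + 1, len(LANES) - 1)]
    if all(c == 0 for c in window):
        return LANES[max(i - 1, 0)]
    return current
```

The point of the shape is that `next_lane` takes only the current lane and the log, so a staff judgment call has no parameter through which to enter.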

→ View Case Study (PDF)

Failure Without Rescue

Failure Mode: Silent correction replaces rule-based outcomes during simulation runs

Intervention: Irreversible decision locking with delayed resolution sequencing

Outcome: Outcome alteration after submission becomes impossible

- State display is read-only and refreshes only on manual update
- Decision capture is append-only with no edit or retry paths
- Resolution executes only after decision lock with no external input
- Data flow is one-directional from decision → resolution → reveal
- Instructor cannot modify outcomes at any stage
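
The lock-then-resolve flow might look something like the following. It is a sketch under stated assumptions: `DecisionRecord`, `resolve`, and the `rules` table mapping a choice to an outcome are hypothetical names, not the simulation's real interface.

```python
class DecisionRecord:
    """Append-only decision capture with a one-way lock."""

    def __init__(self) -> None:
        self._decisions: list = []
        self._locked = False

    def submit(self, participant: str, choice: str) -> None:
        if self._locked:
            raise RuntimeError("decisions are locked")
        if any(p == participant for p, _ in self._decisions):
            raise RuntimeError("no retry path: decision already submitted")
        self._decisions.append((participant, choice))

    def lock(self) -> None:
        self._locked = True  # there is deliberately no unlock method

    @property
    def locked(self) -> bool:
        return self._locked

    def snapshot(self) -> tuple:
        return tuple(self._decisions)


def resolve(record: DecisionRecord, rules: dict) -> dict:
    """Resolution reads only the locked decisions and the rule table."""
    if not record.locked:
        raise RuntimeError("resolution requires a decision lock")
    return {participant: rules[choice] for participant, choice in record.snapshot()}
```

Since `resolve` accepts no argument other than the locked record and the rules, an instructor override would have to change the function signature to exist at all.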

→ View Case Study (PDF)

Governed Writing System

Failure Mode: AI systems generate and modify text without exposing student decisions or authorship

Intervention: Rule-enforced interaction flow with immutable checkpoints

Outcome: System-driven rewriting or inference of student work becomes impossible

- Progression requires explicit acknowledgment at each stage
- Drafts are generated only from student-provided ideas
- Drafts must be frozen before critique begins
- Critique is limited to question-based prompts only
- All interactions produce immutable logs
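
A compact way to illustrate the freeze-before-critique rule and tamper-evident logging is shown below. Hash chaining is a technique swapped in for the example; the source does not say how its logs are made immutable. `WritingSession`, `verify`, and the trailing question mark check are likewise assumptions.

```python
import hashlib


def _digest(prev: str, kind: str, text: str) -> str:
    """Hash an entry together with the previous entry's hash."""
    return hashlib.sha256("\n".join((prev, kind, text)).encode("utf-8")).hexdigest()


class WritingSession:
    def __init__(self) -> None:
        self._log: list = []
        self._draft_frozen = False

    def _record(self, kind: str, text: str) -> None:
        prev = self._log[-1]["hash"] if self._log else ""
        self._log.append({"kind": kind, "text": text,
                          "hash": _digest(prev, kind, text)})

    def freeze_draft(self, draft: str) -> None:
        if self._draft_frozen:
            raise RuntimeError("draft is already frozen")
        self._record("draft_frozen", draft)
        self._draft_frozen = True

    def add_critique(self, prompt: str) -> None:
        if not self._draft_frozen:
            raise RuntimeError("draft must be frozen before critique begins")
        # Assumption: a question-based prompt is one that ends with "?".
        if not prompt.strip().endswith("?"):
            raise ValueError("critique must be phrased as a question")
        self._record("critique", prompt)

    def log(self) -> list:
        return [dict(entry) for entry in self._log]  # copies, not live entries


def verify(log: list) -> bool:
    """Recompute the chain; any edited entry breaks every later hash."""
    prev = ""
    for entry in log:
        if entry["hash"] != _digest(prev, entry["kind"], entry["text"]):
            return False
        prev = entry["hash"]
    return True
```

A chained log does not prevent edits on its own; it makes them detectable, which is the weaker guarantee this sketch can honestly offer.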

→ View Case Study (PDF)

Intent Before Movement — Training in Dungeon

Failure Mode: Execution proceeds on incomplete plans, producing mismatched outcomes

Intervention: Full instruction commitment before execution with non-intervening system behavior

Outcome: Mid-execution correction or system inference becomes impossible

- Execution begins only after full instruction submission
- System executes instructions exactly with no correction
- No hints, previews, or feedback are provided during execution
- Environment is constrained to fixed grid and command set
- Revision occurs only after full execution completes
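
To make the non-intervening behavior concrete, here is a small sketch of an executor that runs a committed plan exactly as written. The command set `N/S/E/W`, the 5x5 grid, and the choice to end a run when the plan leaves the grid are assumptions for illustration; the project's real grid and commands may differ.

```python
MOVES = {"N": (0, -1), "S": (0, 1), "E": (1, 0), "W": (-1, 0)}  # hypothetical commands
GRID_SIZE = 5  # hypothetical fixed grid, cells (0, 0) through (4, 4)


def execute(plan, start, goal):
    """Run a fully submitted plan with no hints, previews, or corrections."""
    plan = tuple(plan)  # the whole plan is committed before the first move
    unknown = [c for c in plan if c not in MOVES]
    if unknown:
        raise ValueError(f"commands outside the fixed set: {unknown}")

    x, y = start
    path = [start]
    for command in plan:
        dx, dy = MOVES[command]
        x, y = x + dx, y + dy
        path.append((x, y))
        if not (0 <= x < GRID_SIZE and 0 <= y < GRID_SIZE):
            break  # assumption: leaving the grid ends the run; nothing is fixed up

    # Results are reported only once execution has finished.
    return {"path": path, "reached_goal": path[-1] == goal}
```

For example, `execute(["E", "E"], (0, 0), (2, 1))` stops one cell short of the goal and says so only in the final result, leaving the revision to the person who wrote the plan.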

→ GitHub Link

Making Intent Legible — DALL·E

Failure Mode: Image generation resolves underspecified prompts through untracked assumptions

Intervention: Pre-generation intent gating with schema validation

Outcome: Generation from underspecified or unvalidated intent becomes impossible

- Generation is blocked until required fields are validated
- Interaction isolates decisions one question at a time
- Output is structured into JSON representations
- Tiered schemas enforce increasing constraint depth
- Guardrails constrain topic scope during prompt construction
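
The gating idea can be sketched as a validator that refuses to produce a structured intent until every field required by the chosen tier is filled. The field names in `TIER_FIELDS` and the `gate` function are hypothetical, and the sketch stops at the JSON intent; it makes no call to any image generation API.

```python
import json

# Hypothetical tiers: each higher tier requires more of the intent to be stated.
TIER_FIELDS = {
    1: ("subject", "style"),
    2: ("subject", "style", "composition", "lighting"),
}


def gate(intent: dict, tier: int) -> str:
    """Return the intent as JSON, or block if required fields are missing."""
    required = TIER_FIELDS[tier]
    missing = [f for f in required if not str(intent.get(f, "")).strip()]
    if missing:
        raise ValueError(f"generation blocked, missing fields: {missing}")
    # Only validated fields pass through; extra keys are dropped, not guessed at.
    return json.dumps({f: intent[f] for f in required}, sort_keys=True)
```

The same intent can pass tier 1 and fail tier 2, which is one way to read "increasing constraint depth": the deeper tier simply leaves fewer decisions unstated.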

→ View Case Study (PDF)