Topic 2.4

Writing Good Assessment Items Is Tedious. Writing Bad Ones Is Easy.

AI generates the options. You ensure they actually test learning.

⏱️ 12 minutes 📋 Prompt Templates ✓ Quality Checklist

The Grind

Good assessments are hard to write. MCQs need plausible distractors. Scenario questions need realistic situations. Rubrics need clear criteria. Performance tasks need observable behaviors.

Time to write 20 solid MCQs: 2-3 hours.
Time to create scenario questions with rubrics: 3-4 hours.

💡 The shift

AI generates assessment items in minutes. You verify alignment with the objectives and cut any distractor so obviously wrong that no one would choose it.

The Basic Prompt

📋 Copy this template
Create [number] assessment items for [topic] at [cognitive level].
Format: [MCQ / scenario-based / performance task / rubric]
Target audience: [describe learners]
Aligned with: [specific learning objective]
For MCQs:
- Include 4 answer options
- Make distractors plausible
- Avoid "all of the above" or "none of the above"

Example:

Create 10 multiple-choice questions for customer service de-escalation at the Apply level.
Format: MCQ with 4 answer options
Target audience: Retail managers with 0-6 months experience
Aligned with: "Demonstrate three de-escalation techniques in customer conflict scenarios"
For MCQs:
- Include 4 answer options
- Make distractors plausible (common mistakes managers make)
- Avoid "all of the above" or "none of the above"
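
If you generate items programmatically rather than in a chat window, the same template drops straight into an API call. A minimal sketch, assuming the OpenAI Python client and a placeholder model name; adapt it to whatever tool your team actually uses.

```python
# Minimal sketch: send the filled-in template to a model programmatically.
# Assumes the OpenAI Python client; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = """Create 10 multiple-choice questions for customer service de-escalation at the Apply level.
Format: MCQ with 4 answer options
Target audience: Retail managers with 0-6 months experience
Aligned with: "Demonstrate three de-escalation techniques in customer conflict scenarios"
For MCQs:
- Include 4 answer options
- Make distractors plausible (common mistakes managers make)
- Avoid "all of the above" or "none of the above"
"""

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```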

What AI Does Well vs. What It Gets Wrong

✓ AI Strengths

  • Generates many options quickly
  • Creates plausible wrong answers
  • Follows format specifications
  • Aligns to specified cognitive levels

✗ AI Limitations

  • Misses subtle distinctions that matter in your field
  • Doesn't know which mistakes learners actually make
  • Can't judge whether distractors are truly plausible
  • Has no knowledge of organization-specific policies or procedures

💡 The Division of Labor

AI generates structure and options. You verify technical accuracy and ensure distractors reflect real misunderstandings.

Creating MCQs

The most common assessment type. Also the easiest to get wrong.

📋 MCQ generation prompt
Create 10 multiple-choice questions testing [specific knowledge or skill].
Cognitive level: [Remember / Understand / Apply / Analyze / Evaluate]
Format: 4 answer options (A, B, C, D)
Distractors should represent: [common errors or misconceptions]
✓ MCQ Quality Checklist (6 checks)
  • Clear stem: Is the question complete and unambiguous?
  • One answer: Is there exactly one correct option?
  • Plausible distractors: Could a learner reasonably choose each wrong answer?
  • No giveaways: Are options similar in length and detail?
  • No tricks: Does it test knowledge, not reading comprehension?
  • Right level: Does it match the cognitive level of the objective?

🚩 Red Flag

If the correct answer is obviously longer or more detailed than the others, AI just cued learners to the right answer.
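
Length cues are easy to check mechanically before you do the full review. Below is a rough sketch of an automated spot-check; the 25% threshold and the item structure are illustrative assumptions, not a testing standard.

```python
# Rough heuristic: flag MCQs whose correct answer is noticeably longer than the
# average distractor, since length alone can cue test-wise learners.
def flag_length_cues(items, threshold=1.25):
    """items: list of dicts like {"stem": str, "options": {"A": str, ...}, "answer": "B"}"""
    flagged = []
    for item in items:
        correct = item["options"][item["answer"]]
        distractors = [text for key, text in item["options"].items() if key != item["answer"]]
        avg_distractor_len = sum(len(d) for d in distractors) / len(distractors)
        if len(correct) > threshold * avg_distractor_len:
            flagged.append(item["stem"])
    return flagged

# Example: this item gets flagged because option B is far longer than the others.
sample = [{
    "stem": "A customer raises their voice about a late refund. What should you do first?",
    "options": {
        "A": "Explain the refund policy.",
        "B": "Acknowledge the customer's frustration, reflect what you heard, and lower your own tone before problem-solving.",
        "C": "Offer a discount.",
        "D": "Call your supervisor.",
    },
    "answer": "B",
}]
print(flag_length_cues(sample))
```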

Creating Scenario-Based Questions

Scenarios test application and decision-making—not recall.

📋 Scenario assessment prompt
Create a workplace scenario testing [specific skill].
The scenario should:
- Present a realistic workplace situation
- Include relevant context and constraints
- Lead to a decision point
Follow with 3-4 questions that assess:
- Problem identification
- Solution selection
- Justification of approach
Format: [MCQ / short answer / performance rubric]

The key difference: MCQs test "Do you know this?" Scenarios test "Can you use this?"

Creating Rubrics

For performance assessments, AI drafts criteria fast:

📋 Rubric generation prompt
Create a rubric for assessing [specific performance task].
Performance task: [describe what learners will do]
Criteria to evaluate: [list 4-6 key dimensions]
For each criterion, define:
- Exemplary performance (4 points)
- Proficient performance (3 points)
- Developing performance (2 points)
- Needs improvement (1 point)
Make criteria observable and measurable.
📊 Example rubric output
De-escalation technique
  • Exemplary (4): Uses all three techniques appropriately; adapts based on customer response
  • Proficient (3): Uses 2-3 techniques appropriately; some adaptation
  • Developing (2): Uses 1-2 techniques; limited adaptation
  • Needs Improvement (1): Does not use learned techniques

Active listening
  • Exemplary (4): Reflects content and emotion accurately; customer feels heard
  • Proficient (3): Reflects content; some emotional acknowledgment
  • Developing (2): Minimal reflection; customer repeats concerns
  • Needs Improvement (1): No evidence of listening behaviors
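
If the rubric will feed a scoring sheet or an LMS export, it can help to hold it as structured data instead of prose. A minimal sketch; the field names and the scoring helper are illustrative, not part of any particular platform.

```python
# Minimal sketch: the rubric as structured data so it can be reused for scoring
# sheets or exports. Field names and the scoring helper are illustrative.
from dataclasses import dataclass

@dataclass
class Criterion:
    name: str
    levels: dict[int, str]  # points -> observable description

rubric = [
    Criterion("De-escalation technique", {
        4: "Uses all three techniques appropriately; adapts based on customer response",
        3: "Uses 2-3 techniques appropriately; some adaptation",
        2: "Uses 1-2 techniques; limited adaptation",
        1: "Does not use learned techniques",
    }),
    Criterion("Active listening", {
        4: "Reflects content and emotion accurately; customer feels heard",
        3: "Reflects content; some emotional acknowledgment",
        2: "Minimal reflection; customer repeats concerns",
        1: "No evidence of listening behaviors",
    }),
]

def score(ratings: dict[str, int]) -> int:
    """Sum the points awarded per criterion, e.g. {"Active listening": 3, ...}."""
    return sum(ratings[c.name] for c in rubric)

print(score({"De-escalation technique": 4, "Active listening": 3}))  # 7
```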

The Alignment Problem

Every assessment must test what the objective says. Not close. Exactly.

🎯 Objective-Assessment alignment guide
When the objective says... the question must require...
  • Apply: using knowledge in a situation, not recalling it
  • Evaluate: making a judgment, not following a procedure
  • Analyze: breaking down information, not describing it
  • Remember: recalling facts, exactly what was taught

The test: If the objective says "Evaluate," but the question just asks for a definition, they're misaligned. Fix it.
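
You can rough-check alignment in bulk with a simple keyword pass before the real read-through. This is only a first-pass filter; the signal phrases below are assumptions, not an authoritative Bloom's mapping.

```python
# Rough heuristic for spotting recall-style stems under higher-level objectives.
# The signal phrases are illustrative; a human review still makes the final call.
RECALL_SIGNALS = ("define", "what is the definition", "which term", "list the", "identify the name")

def looks_misaligned(objective_level: str, stem: str) -> bool:
    """Flag when the objective targets Apply/Analyze/Evaluate but the stem reads like recall."""
    higher_order = objective_level.lower() in {"apply", "analyze", "evaluate"}
    recall_style = any(signal in stem.lower() for signal in RECALL_SIGNALS)
    return higher_order and recall_style

print(looks_misaligned("Evaluate", "What is the definition of active listening?"))  # True
```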

Common Mistakes AI Makes

🚩 Five mistakes to watch for
  • Trick questions: multiple "correct" answers or confusing wording. Fix: "Rewrite to have one clear correct answer."
  • Implausible distractors: wrong answers are obviously wrong. Fix: "Make the distractors more plausible—what mistakes would learners actually make?"
  • Wrong cognitive level: the objective says "Apply" but the question tests recall. Fix: "Rewrite at the Apply level—require using knowledge in a scenario."
  • Length cues: the correct answer is much longer or more detailed than the others. Fix: "Make all answer options similar in length and detail."
  • Generic context: examples don't match learners' actual work. Fix: "Rewrite with examples from [specific industry/role]."

The Refinement Loop

Don't restart. Direct specific fixes:

💬 Refinement prompts
Weak distractors? "Make option B more plausible—it should represent a common mistake learners make when [specific error]."
Wrong level? "This tests recall. Rewrite at the Apply level—require using this knowledge in a realistic scenario."
Too obvious? "The correct answer is much longer than the others. Make all options similar in length."
Not aligned? "The objective requires evaluation. Rewrite to require making a judgment, not just selecting a procedure."
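
In practice, the refinement loop is just a continuing conversation: keep the draft in context and append the fix, rather than re-prompting from scratch. A minimal sketch, again assuming the OpenAI Python client and a placeholder model name.

```python
# Minimal sketch: refinement as one ongoing conversation instead of fresh prompts.
# Assumes the OpenAI Python client; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content": "Create 10 MCQs testing de-escalation at the Apply level..."}]

def send(messages):
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    content = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": content})
    return content

draft = send(messages)

# Review the draft against the checklist, then direct a specific fix in the same thread.
messages.append({"role": "user", "content": (
    "The correct answer in question 3 is much longer than the others. "
    "Make all options similar in length and detail."
)})
revised = send(messages)
```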

Key Takeaways

  1. AI generates volume. You ensure quality and alignment.
  2. Check alignment first. Does the question test what the objective says?
  3. Verify distractors. Could learners reasonably pick each wrong answer?
  4. Contextualize examples. Generic scenarios don't test real application.

Try It Now

🎯 Your task:

Pick an objective from your current project. Generate 5 MCQs. Run each through the quality checklist. Fix at least 2 issues AI created.

The test: Could a learner pass by guessing, or do they actually need to know the material?

📥 Download: Assessment templates and quality checklists (PDF)

Ready-to-use templates for MCQs, scenarios, and rubrics with alignment guides.

Download PDF