Multiple choice questionnaires can be a useful tool in teaching, in particular under the current lockdown measures, where lots of teachers are asked to move their examinations online in a short time frame. This blog post summarizes the results of my research in designing and using such questionnaires: what are they good (or bad) for, how to design them well, and how to get them in the hands of your students.

# Pros and Cons of Multiple Choice Questions

Multiple Choice Questionnaires are often (fairly) criticized:

• There is no possibility for the student to be creative, or get creativity recognized
• They reinforce pure memorization, rather than the ability to creatively apply the knowledge. In a sense, they are “hackable”: one can learn “just enough” to pass the test, without any actual long-term benefit. Even worse, one can learn patterns to guess the best answer without actual knowledge (more on this later)
• When given on paper, it might be relatively easy for participants to get an idea of what their neighbor answered (even involuntarily)
• Designing them well is difficult, and requires quite a lot of time on the part of the teachers.

Of course, their widespread use suggests that they also have positive aspects, including:

• The grading is clear and consistent, instead of depending on the identity or even the mood of the corrector. If you ever graded knowledge questions, you probably know what I mean: grading fairly is much trickier than most care to admit.
• They can be graded automatically. This is good for big classes, mass exams (e.g. university entrance tests) or self-tests

In addition, they are a good format to quickly assess understanding of the material in a live quiz in class. Why is this important? As anyone who has ever corrected exams probably knows, it regularly happens that most students share the same misunderstanding of a concept. This might be due to a lack of clarity on the part of the lecturer; or maybe it is just a difficult, counter-intuitive concept. If the only interaction with the students’ knowledge is in end-of-term exams, those misunderstandings cannot be identified early enough. It then seems unfair to penalize the students for the lack of clarity of the lecturer. Regular testing of the students’ knowledge makes it possible to identify and correct those misunderstandings in time. Multiple choice questionnaires can be used both by the student during self-study, and in an interactive setting at the beginning of a class using something like Mentimeter.

# How to Design Good Multiple Choice Questionnaires

Now let’s assume that, based on the previous section, we decide that we want to use multiple choice questions. How do we design an effective questionnaire?

One of the most surprising facts when designing a multiple choice questionnaire for the first time is how hard it can be. The main difficulty, compared to other types of examination, is that you also need to put thought into the choices which are not correct, called “distractors”. Their aim is to distract the candidate, so that the correct option is obvious only to someone mastering the topic.

Fortunately, this issue was faced by all questionnaire designers in the past, and there are good guidelines available, such as the excellent guidelines from the University of Waterloo (Canada).

Here are a few principles to help design good distractors for Multiple Choice Questionnaires:

• limit the number of options: the more options there are, the more difficult it is to come up with plausible distractors. A distractor that is obviously wrong is not better than no distractor at all. Research suggests that 3 items is a good number
• the good answer should be obvious for an expert: the distractors should be clearly wrong to someone mastering the topic, and the good answer clearly right. This can be harder than it sounds.
• the distractors should be plausible and attractive: students who do not master the material should be tempted to pick a distractor. It is a good idea to base them on common mistakes made by students, which can be identified by continuous testing. A technique I find effective is to slightly misuse jargon, in a way that an expert would consider obviously wrong: technical jargon tends to sound “right”1.
• the options should be grammatically consistent with the stem (the “question”): just try to make it painless to read
• all options should have the same level of detail: if one option is more specifically worded than the others, it will likely look like the right answer
• randomly distribute the correct response: students should not be able to infer a pattern in the location of the right answer.
• avoid words such as “always”, “never”, “all”…: most concepts are more subtle than that, and students know it. Those words tend to identify the option as a distractor.
• make sure the distractors are orthogonal: the truth of one distractor should not depend on the truth of another one. This is one of the restrictions that make coming up with more than 3 choices tricky.
• make sure the truth value of a distractor does not depend on the answer of another question: This is the trickiest. It should not be possible to deduce the truth value of an option based on the answers to other questions.

The underlying principle is that someone not familiar with the topic should find all options as likely, and someone mastering the topic should find the right option obvious.
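
The “randomly distribute the correct response” principle is the easiest one to automate. Here is a minimal sketch in Python, assuming each question is stored as a stem, a correct answer and a list of distractors. The question bank and the `shuffle_options` helper are made up for illustration.

```python
import random
from collections import Counter


def shuffle_options(correct, distractors, rng):
    """Return (options, index_of_correct) with the options in random order.

    Assumes the text of the correct answer differs from every distractor.
    """
    options = [correct] + list(distractors)
    rng.shuffle(options)
    return options, options.index(correct)


# Hypothetical question bank: (stem, correct answer, distractors).
bank = [
    ("2 + 2 = ?", "4", ["3", "5"]),
    ("Capital of Canada?", "Ottawa", ["Toronto", "Montreal"]),
    ("Square root of 81?", "9", ["8", "18"]),
]

rng = random.Random(42)  # fixed seed so that the handout can be regenerated
positions = Counter()
for stem, correct, distractors in bank:
    options, index = shuffle_options(correct, distractors, rng)
    positions[index] += 1

# How often the right answer landed in each position.
print(dict(positions))
```

On a real questionnaire, a quick look at `positions` helps check that no position is clearly favored.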

This difficulty in designing good questionnaires has implications for how the time spent on an exam is allocated: compared to a typical exam, most of the effort moves to before the exam, while the effort after the exam (grading) is minimal.

The time issue is amplified by the fact that a multiple choice questionnaire can be answered very fast by a student mastering the topic, and thus should ideally contain a lot of questions, with a lot of variety.

This makes it even more important to start designing the test early. In the case of a final exam in a university class, for instance, you might want to write a few questions each week just after lecturing or the exercise session, when both the material and the potential errors of the students are fresh in your mind.

One of the biggest problems in multiple choice questions is that one cannot differentiate a student mastering a topic from a student good at guessing the answer.

To avoid guessing, multiple choice questions are sometimes graded using negative points for wrong answers.

This is generally not a good idea, as risk-averse students who know the answer might hesitate to answer if they are not absolutely sure. Risk-taking students, on the other hand, would gamble on questions they do not know the answer to, but for which they are confident they guessed it from the context (for instance because of bad distractors). The test should only test the knowledge of the students and not depend on their personality.
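
For reference, the arithmetic behind negative marking can be sketched as follows. With $$n$$ options, one point for a right answer and a penalty $$p$$ for a wrong one, a blind guess earns $$1/n - p (n - 1)/n$$ points on average. The penalty $$p = 1/(n - 1)$$ used below is the value that brings this average to zero. It is a common choice and an assumption on my part, not something prescribed in this post.

```python
from fractions import Fraction


def expected_guess_score(n_options, penalty):
    """Expected points from a uniformly random guess on one question.

    A right answer is worth 1 point, a wrong answer costs `penalty` points.
    """
    p_correct = Fraction(1, n_options)
    return p_correct - (1 - p_correct) * Fraction(penalty)


for n in (2, 3, 4):
    without_penalty = expected_guess_score(n, 0)
    with_penalty = expected_guess_score(n, Fraction(1, n - 1))
    print(n, without_penalty, with_penalty)
```

The average hides exactly the problem described above: two students with the same knowledge and a different appetite for risk can end up with different grades.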

Standard setting2 is an approach that accounts for guessing in multiple choice questions without resorting to negative marking. The idea is to raise the “pass threshold” to compensate for the points that can be obtained by guessing.

There are two broad categories for setting the threshold:

• norm-reference bases the threshold on the grades of the best students (e.g. 60% of the 95th percentile grade)3.
• criterion-reference uses a formula computed a priori
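
As an illustration of the norm-reference example above (60% of the 95th percentile grade), here is a small sketch using only the Python standard library. The grades are invented, and the choice of the “inclusive” interpolation method for the percentile is an assumption: other conventions give slightly different values.

```python
import statistics


def norm_referenced_threshold(grades, percentile=95, fraction=0.6):
    """Pass threshold as a fraction of a high percentile of the cohort's grades."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles;
    # index `percentile - 1` is the requested percentile.
    cut_points = statistics.quantiles(grades, n=100, method="inclusive")
    return fraction * cut_points[percentile - 1]


# Hypothetical grades out of 20, for illustration only.
full_cohort = [4, 6, 9, 11, 12, 13, 14, 15, 17, 19]
after_dropouts = full_cohort[2:]  # the two weakest students left the class

print(norm_referenced_threshold(full_cohort))
print(norm_referenced_threshold(after_dropouts))
```

Comparing the two printed values shows that the threshold moves with the composition of the cohort, even though nobody's individual performance changed.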

A popular method to choose the passing grade is the Angoff Method (see e.g. that paper for a discussion). It works by letting experts estimate, for every question in the questionnaire, the probability that a borderline student would answer it correctly. Given the effort required, it is typically used for high-stakes standardized questionnaires, where the investment in being as fair as possible is worth it.
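
The basic computation can be sketched as follows, assuming the simplest variant of the method: each expert gives, for each question, the probability that a borderline student gets it right, the estimates are averaged per question, and the averages are summed. The ratings below are invented and the function name is mine.

```python
from statistics import mean


def angoff_cut_score(ratings, weights=None):
    """Cut score from expert ratings.

    ratings[j][i] is expert j's estimated probability that a borderline
    student answers question i correctly. `weights` holds the points per
    question (1 point each if omitted).
    """
    n_questions = len(ratings[0])
    if weights is None:
        weights = [1.0] * n_questions
    per_question = [
        mean(expert[i] for expert in ratings) for i in range(n_questions)
    ]
    return sum(p * w for p, w in zip(per_question, weights))


# Two hypothetical experts rating three questions.
ratings = [
    [0.5, 0.75, 1.0],
    [0.5, 0.25, 1.0],
]
print(angoff_cut_score(ratings))
```

In practice the method usually involves several rounds of discussion between experts, which this sketch leaves out entirely.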

Fortunately, there are also simpler approaches. For instance, Ghent University published a document explaining that, from academic year 2014-2015 onwards, they will only grade multiple choice questions using standard setting, where the passing grade is computed as:

$c = \sum_i \frac{n_i + 1}{2 n_i} \, w_i$

where $$n_i$$ is the number of options and $$w_i$$ the weight for question $$i$$. Note that I could not find this “standard formula” anywhere else, and in particular could not find its derivation. My feeling is that it is an approximation for the score that a student could get by pure guessing. If anyone has more details, I would love to hear about them!
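
One way to read the formula is to notice that $$(n_i + 1) / (2 n_i) = (1 + 1/n_i) / 2$$: for each question, this is the midpoint between the expected score of a blind guess, $$w_i / n_i$$, and the full score $$w_i$$. This is only an algebraic observation, not a derivation taken from the Ghent document. A short sketch to compute the passing grade, with a made-up questionnaire:

```python
from fractions import Fraction


def passing_grade(questions):
    """Passing grade c = sum_i (n_i + 1) / (2 n_i) * w_i.

    `questions` is a list of (n_options, weight) pairs.
    """
    return sum(Fraction(n + 1, 2 * n) * w for n, w in questions)


def guess_midpoint(n_options, weight):
    """Midpoint between the expected blind-guess score and the full score."""
    return (Fraction(weight, n_options) + weight) / 2


# Hypothetical exam: 20 questions with 4 options, 1 point each.
exam = [(4, 1)] * 20
print(passing_grade(exam))  # 25/2, i.e. 12.5 points out of 20
```

With 3 options instead of 4, the same exam would require 40/3 (about 13.3) points to pass: fewer options make guessing easier, so the threshold goes up.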

# Implementation

Before closing, let me share two tools I found helpful to deliver such questionnaires.

The first one is the exam LaTeX document class.

If you are using LaTeX, the exam class streamlines the creation of questionnaires, including grading tables and other niceties. With some additional scripting, it is quite easy to generate a set of handouts where the order of the options differs.
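
The “additional scripting” can be as simple as the following sketch, which writes one LaTeX source per variant using the `questions` and `choices` environments of the exam class. The question list, the function name and the file layout are assumptions for illustration, and the stems are assumed to be valid LaTeX already.

```python
import random

# Hypothetical questions: (stem, correct answer, distractors).
questions = [
    ("What is $2 + 2$?", "4", ["3", "5"]),
    ("What is $3 \\times 3$?", "9", ["6", "12"]),
]


def render_variant(questions, seed):
    """Return the LaTeX source of one handout, options shuffled per seed."""
    rng = random.Random(seed)
    lines = [r"\documentclass{exam}", r"\begin{document}", r"\begin{questions}"]
    for stem, correct, distractors in questions:
        options = [(correct, True)] + [(d, False) for d in distractors]
        rng.shuffle(options)
        lines.append(rf"\question {stem}")
        lines.append(r"\begin{choices}")
        for text, is_correct in options:
            # \CorrectChoice marks the right answer for the answer key.
            macro = r"\CorrectChoice" if is_correct else r"\choice"
            lines.append(f"  {macro} {text}")
        lines.append(r"\end{choices}")
    lines += [r"\end{questions}", r"\end{document}"]
    return "\n".join(lines)


# One source per variant; the seed doubles as the variant number.
variants = {seed: render_variant(questions, seed) for seed in range(3)}
print(variants[0])
```

Since the seed fully determines the order, the answer key of any variant can be regenerated later, for instance by compiling the same source with the class's `answers` option.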

When including multiple choice quizzes as a form of self-assessment, I was happy to learn that one can embed such questionnaires in a static website, needing only HTML and CSS (not even JavaScript!). See here for the details. My first experiments were satisfying, and I plan to make this a part of the next iterations of my lecture.

# To Conclude

Multiple choice questions can be a useful tool to design engaging lectures. In particular, by allowing continuous testing of the students’ understanding, they can help correct misunderstandings before it is too late. They can also be useful as a part of the evaluation, as a complement to other forms of evaluation, such as projects.

1. I noticed students are also very good at using that fact the other way around in free-form knowledge questions. I was fooled more than once by an answer using the proper technical words, indicating a supposedly perfect understanding… followed by a trivial error in a subsequent exercise, revealing a fundamental misunderstanding of the same question.

2. In case you wonder where that strange name comes from, it refers to the fact of setting a standard against which students are evaluated. What it is not is a setting that would be standard for some reason. That took me some time.

3. To be honest, I find this to be a terrible idea in an academic setting, as the grade of a student should be a function of how much he or she fulfills the requirements, rather than how good the other students are. In particular, bad students dropping out mid-semester would lead to a general drop in grades, even if the remaining students continue to perform well!
