The optimal number of choices in a multiple-choice test

Some researchers contend that the key to multiple-choice tests is the quality of distractor items rather than their number. More recent evidence questions the standard use of four or five options, finding three options sufficient in most cases. Researchers have also found that it is very difficult to write three or four plausible distractors for each item. They recommend decreasing the number of choices per item but making the test longer so that it can cover more content area and better discriminate among high-ability students.

Michael C. Rodriguez of the University of Minnesota carried out a meta-analysis to synthesize these results and estimate the effect that the number of options in multiple-choice items has on test-score reliability and validity. Meta-analysis is a method of combining summary statistics from many studies to approximate a single, much larger study.

Rodriguez used data from 27 studies conducted in K-12 classes, including both classroom-based tests and standardized instruments, to determine the optimal number of multiple-choice options.

Rodriguez writes that performance and authentic assessments have profound importance for demonstrating real-life activities and are important tools for teachers. At the same time, multiple-choice items play an important role in assessing broad ranges of knowledge and comprehension and, although this is more difficult, in assessing higher-order thinking skills as well.

In this study, he reviewed the existing empirical research as well as narrative and theoretical reviews regarding the optimal number of multiple-choice options, and then synthesized the empirical findings using meta-analytic techniques.

Effectiveness of options

Empirical research on adequate options in multiple-choice questions covers a wide range of conditions, subject areas, ages and testing stakes. The effectiveness of options was judged by how plausible the options were. An option was judged to function as an effective distractor if at least 5 percent of the participants selected it. Previous research indicates that most test makers can only produce three effective options per item: one correct answer and two plausible distractors.
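The 5 percent rule above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the studies themselves: the function name, the response data and the answer key are all hypothetical.

```python
from collections import Counter


def functional_distractors(responses, correct, threshold=0.05):
    """Return the wrong options chosen by at least `threshold` of respondents.

    `responses` is a list of the option each participant selected;
    `correct` is the keyed answer. Both are hypothetical inputs.
    """
    counts = Counter(responses)
    total = len(responses)
    return sorted(
        option
        for option, n in counts.items()
        if option != correct and n / total >= threshold
    )


# Made-up responses from 100 test takers to one four-option item (key: "B").
responses = ["B"] * 60 + ["A"] * 25 + ["C"] * 12 + ["D"] * 3
print(functional_distractors(responses, correct="B"))  # ['A', 'C']
```

In this made-up item, option D attracts only 3 percent of test takers, so it would count as non-functional and the item effectively behaves like a three-option item.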

Many four- and five-option items had one or more non-functional options. Rodriguez's goal was to formally synthesize the empirical results of these 27 research studies to estimate the effect that the number of options had on item difficulty, item discrimination, test-score reliability and test validity. A total of 12,591 participants were involved in these studies. In some of the tests, distracting options were eliminated randomly; in others, the most ineffective options were deleted.

Validity and number of options

Across nearly all the studies, reducing the number of options from four to three resulted in only a small decrease in item difficulty. Differences in reliability varied significantly across studies. In most cases, reduction in the number of options decreased reliability except when options were reduced from four to three, which slightly increased the reliability of test scores. Only two of the 27 studies provided test-validity data. Both revealed a statistically negligible change in validity when options were reduced from four or five to three. This evidence, therefore, provides no support for the argument that increasing the number of options in items raises the validity of the test.

Comparing the differences in methods of deleting item options revealed differences in test score reliability. Random deletion of options reduced reliability, but when ineffective options were deleted, there was no change in reliability.

Reducing options from five to four reduced item difficulty, item discrimination and reliability. Moving from five to three options reduced the difficulty of items but did not affect discrimination or reliability.

Three options optimal

Rodriguez concludes that, for most test developers, only three options per item are feasible. Providing more options does little to improve item and test-score statistics and typically results in implausible distractors. There are also practical considerations: less time is needed to prepare two plausible distractors than three or more; more three-option items can be administered in the same time period, allowing for more content coverage; and the inclusion of additional high-quality items should improve test-score reliability and validity, making scores more consistent, meaningful and usable.
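The claim that adding items should improve reliability can be illustrated with the Spearman-Brown prophecy formula, a standard psychometric result. The article does not say that this formula was used in the meta-analysis; it is brought in here only as a rough sketch, and the test lengths and starting reliability are invented.

```python
def spearman_brown(reliability, length_factor):
    """Predict reliability after changing test length by `length_factor`.

    Assumes the added items are comparable in quality to the existing ones,
    which is an idealization.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical case: a 40-item test with reliability 0.80 grows to 50 items
# because three-option items take less time to answer.
predicted = spearman_brown(0.80, 50 / 40)
print(round(predicted, 3))  # 0.833
```

Under these assumed numbers, the longer test would be expected to have somewhat higher reliability, provided the extra items are as good as the originals.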

The idea that it is easier to guess correctly when given three options rather than four was not supported by this analysis. Rodriguez believes it is more efficient to use no more choices than are needed to suppress guessing. High-quality choices guard against guessing better than a larger number of less plausible choices.
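A small arithmetic sketch shows why distractor quality can matter more than the number of options. It assumes, purely for illustration, that a test taker can spot and discard implausible distractors and then guesses uniformly among the options that remain; the function name is hypothetical.

```python
from fractions import Fraction


def guess_probability(n_options, n_implausible=0):
    """Chance of a correct blind guess after discarding implausible options."""
    remaining = n_options - n_implausible
    if remaining < 1:
        raise ValueError("at least the correct answer must remain")
    return Fraction(1, remaining)


print(guess_probability(4))      # 1/4: four options, all plausible
print(guess_probability(3))      # 1/3: three options, all plausible
print(guess_probability(4, 1))   # 1/3: four options, one implausible
print(guess_probability(5, 2))   # 1/3: five options, two implausible
```

Under this assumption, a four- or five-option item with weak distractors offers no more protection against guessing than a well-written three-option item.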

However, the real benefit from fewer choices is the potential to provide more items and cover more content in the same amount of time. The method of reducing the number of options does affect the results, and only implausible options should be deleted. The results of this meta-analysis indicate that three is the optimal, effective number of options for multiple-choice questions.

“Three Options Are Optimal for Multiple-Choice Items: A Meta-Analysis of 80 Years of Research”, Educational Measurement: Issues and Practice, Volume 24, Number 2, June 2005, pp. 3-13.

Published in ERN September 2005 Volume 18 Number 6
