Ferris State University

Center for Teaching, Learning & Faculty Development
Top Ten Test Writing Tips
  Based on the book Developing and Using Tests Effectively by Jacobs and Chase 1992.

Tip One - The extent to which students engage in deep or surface learning is tied directly to their perceptions of what they will be tested on.

  1. The first test sets the tone and expectation for the student.
  2. The earlier in the semester the first test is given, the better.

Tip Two - Test length is crucial to the reliability of the test.

Factors to consider:

  1. Time available—fifty minutes or seventy-five minutes
  2. Type of questions being asked
    True and false --- 30 sec.
    Multiple choice --- one min.
    Completion --- one min.
    Short answer --- two min.
    Multiple choice with higher level thinking --- 90 sec.
    Matching items --- 30 sec.
    Short essays --- 10-15 min.
    Extended essay --- 30 min.
  3. Other Time Considerations
    A fifty-minute test can typically hold 40-45 multiple-choice items or 60-80 true-false items.
    The fastest student will typically finish a test in about half the time the slowest student needs.
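
The per-item times above can be turned into a rough length check. The following is a minimal sketch, assuming the listed times are averages; the function name `estimate_minutes` and the item-type keys are illustrative and do not come from Jacobs and Chase. Essay times are given as a range, so the function returns a low and a high estimate.

```python
# Approximate seconds per item, taken from the list above.
# Each entry is (low, high); most item types have a single value.
SECONDS_PER_ITEM = {
    "true_false": (30, 30),
    "multiple_choice": (60, 60),
    "multiple_choice_higher_level": (90, 90),
    "completion": (60, 60),
    "short_answer": (120, 120),
    "matching": (30, 30),
    "short_essay": (600, 900),        # 10-15 minutes
    "extended_essay": (1800, 1800),   # 30 minutes
}


def estimate_minutes(item_counts):
    """Return (low, high) estimated minutes for a dict of item type -> count."""
    low = sum(SECONDS_PER_ITEM[kind][0] * n for kind, n in item_counts.items())
    high = sum(SECONDS_PER_ITEM[kind][1] * n for kind, n in item_counts.items())
    return low / 60, high / 60


# Hypothetical draft test: 20 multiple-choice, 10 true-false, 1 short essay.
draft = {"multiple_choice": 20, "true_false": 10, "short_essay": 1}
low, high = estimate_minutes(draft)
print(f"Estimated time: {low:.0f}-{high:.0f} minutes")
```

Since these are rough averages, an estimate close to the full class period probably leaves slower students short of time.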

Tip Three - Develop Tests that have a High Level of Reliability and Validity

Factors that Influence Reliability

  • The length of the test
  • The time limits for the tests
  • The nature of the student group (if the group is quite homogeneous, the reliability will be lower than if the group is fairly heterogeneous)
  • The difficulty of the test items (50-70% of the students should be able to get the answer correct)
  • Common set of instructions
  • A common environment in which to attempt the test
  • The scoring procedure of the test
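
The 50-70% guideline for item difficulty can be checked after a test is scored. This is a minimal sketch, assuming each item's results are available as a list of right/wrong values; the function names and sample data are hypothetical.

```python
def item_difficulty(item_scores):
    """Proportion of students who answered one item correctly.

    item_scores is a list of booleans, one per student.
    """
    return sum(item_scores) / len(item_scores)


def outside_target(item_scores, low=0.50, high=0.70):
    """True if the item's difficulty falls outside the 50-70% range."""
    p = item_difficulty(item_scores)
    return p < low or p > high


# Hypothetical results for one item across ten students.
scores = [True, True, True, False, True, False, True, True, False, True]
print(item_difficulty(scores))   # proportion of students answering correctly
print(outside_target(scores))    # whether the item needs a second look
```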

How to Improve Test Reliability

  • Tests should be long enough to sample the content well
  • Time limit should allow most students to finish
  • Score range should be wide, items at mid-difficulty range
  • Items should be free of ambiguity and tricks
  • Directions should be clear and concise
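
The source does not give a formula for how test length affects reliability. One standard result from classical test theory, the Spearman-Brown formula, is sketched here purely as an illustration; it assumes the added items are comparable in quality to the existing ones, and the sample reliability value is made up.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor.

    reliability: reliability of the current test (between 0 and 1).
    length_factor: new length divided by current length (2 means doubled).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a test with reliability 0.60 is doubled in length.
print(spearman_brown(0.60, 2))
```

Under these assumptions, doubling a test with reliability 0.60 gives a predicted reliability of about 0.75, which is one way to see why longer tests tend to be more reliable.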

What about Essay Tests?

  • The biggest issue is the consistency of the reader of the test answers.
  • Write questions that are not too wide in scope
  • Develop a prescribed scoring method.
    - A check list of things that need to be in the answer and their point value
    - A written response to each question by the instructor that is reviewed before reading the student answers.
  • Read all of the answers to one question across all tests so you focus on one answer at a time.

Factors that Affect Validity

  • Directions are not clear
  • Test requires levels of skill that are not part of the course objectives (teaching at one level, testing at another)
  • Test items are poorly written
  • Test length does not allow for adequate sampling of content
  • Complexity and subjectivity of scoring can rank some students inaccurately.
  • If the scoring process has many steps, there are many opportunities for mistakes.
  • If the scoring is subjective, it is easily influenced by factors that are not part of the teaching objectives.

Improving Validity

  • Have someone else read your test for clarity of directions and questions
  • Use a test matrix to check balance of questions in relationship to what was taught
  • Develop test questions the day you teach the material
  • Ask enough questions to adequately cover the material that was taught
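
A test matrix can be as simple as a comparison between how much each topic was taught and how much of the test it occupies. The sketch below is one possible form, assuming class hours are a reasonable measure of emphasis; the function name, topic names, and numbers are hypothetical.

```python
def blueprint_gaps(class_hours, item_counts):
    """Compare each topic's share of test items with its share of class time.

    Returns topic -> (item share - teaching share), rounded to 3 places.
    Positive values mean the topic is over-represented on the test.
    """
    total_hours = sum(class_hours.values())
    total_items = sum(item_counts.values())
    gaps = {}
    for topic, hours in class_hours.items():
        taught_share = hours / total_hours
        item_share = item_counts.get(topic, 0) / total_items
        gaps[topic] = round(item_share - taught_share, 3)
    return gaps


# Hypothetical unit: topic names and numbers are made up for illustration.
hours = {"reliability": 6, "validity": 4}
items = {"reliability": 10, "validity": 10}
print(blueprint_gaps(hours, items))
```

In this made-up case, the first topic took 60% of class time but only half of the items, so it is under-represented on the test by ten percentage points.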

Tip Four - Tips for Writing Multiple Choice Questions

  1. You need a quality STEM—a quality stem is one in which the students are able to read the stem and formulate a tentative answer even before reading the answer options.
  2. A stem may be an incomplete sentence
  3. The stem should be in the simplest form consistent with precision and clarity.
  4. Always present a verb in the stem
  5. Do not pad the stem with superfluous material; this only adds to students' reading time.
  6. State if you want the students to find the correct answer or the best answer. If it is to be the correct answer it must be correct beyond any question.

Writing Distractors

  1. Write distractors that are plausible enough to attract students who do not know the material very well.
  2. If you can’t develop a sufficient number of plausible answers, then do not use the question.
  3. Using humor in the answers is usually just a giveaway to the students. It is a cue to ignore that answer.

Example

The founder of Ferris State University was

  1. Ferris Bueller
  2. Ferris Wheel
  3. Woodbridge Ferris

Make the distractors fairly homogeneous. This will increase the need for the students to be discriminating in their choices.

Avoid giving irrelevant clues to the students. You want to measure their content knowledge and cognitive skills, not their test-taking skills.

Examples of irrelevant clues

Length clues—the longest answer is often the correct answer.

Verbal association—using a word in the stem that also appears in the answers

Grammatical clues

Example Grammar Clue

The coefficient of correlation…social studies test is called a

  1. Validity coefficient
  2. Index of reliability
  3. Equivalence coefficient

Specific determiners: these are modifying words or phrases that limit the meaning of sentences

Example

all, never, always (associated with the incorrect answer)

usually, typically, maybe, sometimes (associated with the correct answer)

Positives and Negatives

  • Use positive statements if possible.
  • Negative statements can be confusing for students to interpret.
  • If you use negative wording call attention to it by underlining it.

All of the Above/None of the Above

  • Use options like all of the above or none of the above rarely.
  • These distractors are generally too easy.
  • If even one of the answer choices is recognized as being incorrect, then the student also knows that all of the above is incorrect.
  • All of the above may be appropriate if the instructor is trying to determine whether the students have learned all of the relevant characteristics or attributes of a phenomenon.
  • Using none of the above is especially difficult if you are asking students to find the correct answer, as it may be easy to argue that at least one of the answers was correct in some way.

Additional Considerations when Writing Questions

  • Item independence. Getting the correct answer to one item should not be contingent upon getting the correct response to other items.
  • Avoid letting one answer provide a clue to another answer.
  • Arrange the options (answers) in a logical order (alpha order or if numbers in ascending order).
  • The correct response should be distributed about equally among the choice positions (a, b, c, or d).
  • If the items are controversial, cite the authority whose opinion is being used…According to my lecture or In Freud’s opinion…
  • Avoid lifting stems verbatim from the text…this encourages students to memorize rather than fully understand the material.
  • Arrange the answer options in vertical columns. This makes the reading easier and less confusing.
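
Whether the correct answers are spread evenly across positions is easy to check mechanically. The following is a minimal sketch, assuming a four-option answer key stored as a list of letters; the function names and the tolerance of two items are illustrative choices, not a rule from the source.

```python
from collections import Counter


def key_distribution(answer_key, choices="abcd"):
    """Count how often each choice position holds the correct answer."""
    counts = Counter(answer_key)
    return {choice: counts.get(choice, 0) for choice in choices}


def is_roughly_balanced(answer_key, choices="abcd", tolerance=2):
    """True if no position differs from an even split by more than tolerance items."""
    expected = len(answer_key) / len(choices)
    counts = key_distribution(answer_key, choices)
    return all(abs(n - expected) <= tolerance for n in counts.values())


# Hypothetical 12-item answer key.
key = list("abcdabcdaabb")
print(key_distribution(key))
print(is_roughly_balanced(key))
```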

Summary Checklist for Writing Multiple Choice Items

  1. Make sure the item measures significant concepts and principles: do not write items covering trivia.
  2. The stem should present a problem; thus, a verb is necessary.
  3. State the item clearly and concisely; use only relevant material.
  4. Include as much of the item material as possible in the stem; do not repeat words or phrases in each distractor that could be put in the stem.
  5. Write one correct or clearly best answer and three or four plausible distractors.
  6. Avoid giving clues to the right answer; some common clues are grammatical, some involve length of the options, and some use specific determiners.

Tip Five - Writing True-False Test Questions

Positives For Using True-False

  1. True-false can sample many more bits of information in a given time period than any other type of test format.
  2. The greater number of questions increases the reliability of the test.
  3. True-false tests can be less reliable than multiple choice unless the number of questions asked (90 questions in a fifty-minute period) is high.
  4. Research indicates that true-false testing is sufficiently reliable and valid for periodic classroom testing.

Negatives of True-False

  1. It can be difficult to write true-false questions that avoid ambiguous statements without making the items obvious.
  2. Writing true or false statements that have no exceptions is problematic.
  3. Students can guess, with a 50-50 chance of being right on any item.
  4. Students can also make educated guesses increasing their odds beyond 50-50 but still not know the answer outright.
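
The effect of guessing, and the reason a larger number of items helps, can be illustrated with a short calculation. This sketch assumes blind guessing with an independent 50-50 chance on every item, which is a simplification; the function name and the 70% threshold are hypothetical.

```python
from math import comb


def prob_at_least(correct_needed, num_items, p_guess=0.5):
    """Probability of getting at least correct_needed items right by blind guessing."""
    return sum(
        comb(num_items, k) * p_guess**k * (1 - p_guess) ** (num_items - k)
        for k in range(correct_needed, num_items + 1)
    )


# Chance of reaching 70% by guessing alone on a short and a long test.
print(prob_at_least(7, 10))    # 10-item test
print(prob_at_least(63, 90))   # 90-item test
```

On a 10-item test a pure guesser reaches 70% roughly one time in six, while on a 90-item test the chance is far smaller.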

Important Steps in Writing True-False Questions

  • Make it clear where the answers are to be placed and what sign (T or F) or word is to be used. Avoid using a plus (+) and minus (-) sign as the minus can be made into a plus easily.
  • Avoid the use of specific determiners (all, never, always). These are a sign of false answers.
  • Avoid the use of qualifying terms (sometimes, usually, typically). These are signs of true answers.
  • Avoid the use of indefinite terms denoting degree or amount (a long time ago, a very large part). These are ambiguous and thus make the answer into a debate
  • Don't leave questions up to interpretation

Example:

POORLY WRITTEN:  In his study of AIDS, Dr. Wye found that many of those who contracted the HIV virus were exposed through the use of drug needles that had been used by an infected person.

BETTER VERSION:  In his study of AIDS, Dr. Wye found that many (over 20 percent) …infected person.

Assessing students’ knowledge is the goal, not their ability to interpret complex sentences.

Use of Compound Sentences

Compound sentences can be used by stating a condition first, followed by an explanation.

Example

Because the combustion of gasoline creates gases that pollute the air, cars produce more pollutants at fifty miles per hour than at thirty miles per hour

This form of question can test students at a higher level of thinking.

Using True-False to Ask Higher Level Thinking Questions

Use of propositional logic. Using the "if-then" approach.

Example:

Under the current money policy of the Federal Reserve Bank, the prime rate is .09 and the inflation rate is .04. The gross national product is down .03, and the unemployment rate is 7 percent. A slowdown in the economy is taking place.

  1. True-False: If the Federal Reserve reduces the prime rate, the inflation rate is expected to rise.
  2. True-False: If the gross national product goes up and the other indicators stay the same, the Dow Jones average will probably respond by going up.

This type of questioning allows for writing several T-F questions related to the same situation or proposition.

Problem Solving Approach

Example

Last night John bought a used car. This morning it would not start. John begins to search for the possible causes of the car’s failure to start. Decide whether each statement is or is not a plausible reason for the car not starting.

  1. T-F The carburetor may be malfunctioning
  2. T-F The exhaust manifold may be loose
  3. T-F The battery may be discharged
  4. T-F The car may be out of gasoline

Use Multiple True-False Items

Example:

The Boston Tea Party (1773) was

  1. T-F Actually carried out by Indians
  2. T-F Planned as a revolt against taxes
  3. T-F Done because the tea market in America was overstocked and prices were falling

Tip Six - Short Answer or Completion Items

  • This form of testing demands recall rather than recognition of information.
  • This form does not lend itself to testing higher level thinking very well.
  • It is best used to see how well students have collected basic information pertinent to the course

Writing Items

Write completion items that can be answered in a single word (if at all possible). This makes scoring faster and less subjective.

Example:

  • The density of a fluid is measured with an instrument called the _____________? (One word)
  • The hydrometer is used to measure____________.

The second question will likely have a multiple word answer that may require interpretation and will definitely take more time.

Statements should be worded so that they have only one right answer.

Example POORLY WRITTEN:  The battle of Lexington was fought in __________?

Example CLEARLY WRITTEN:  The battle of Lexington was fought in the year _______?

The first example could have many answers (year, state, season)

  • Delete only key words from the statement. No tricks
  • Do not lift statements directly from the textbook as this encourages memorizing, not understanding.
  • Make all the blanks the same length.

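To make scoring easy, use the following model, in which the blanks in the statement are numbered and students write their answers in a separate numbered column:

  1. ________    The quality of a test that deals with consistency is called (1), and the quality that deals
  2. ________    with the extent to which a test relates to a criterion is called (2).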

Tip Seven - Tests Using Matching Items

To reduce random errors, place the stimulus column on the left, with each item numbered, and the response column on the right, with each item lettered.

Provide spaces for students to write their responses to the left of the stimuli.

Each matching exercise should contain only homogeneous material.

If you use heterogeneous material it simply makes the test easier.

Example

_____ 1. George Washington     a. Revolutionary War hero
_____ 2. John Hancock          b. Signer of the Declaration of Independence
_____ 3. Virginia              c. One of the original colonies

Put the stimuli items in alphabetical order. This will make finding stimuli easier and save time for the students.

The best number of items is between 10 and 15.

The entire test should be on one page to keep students from having to flip back and forth to look for stimuli.

The number of items in the response column should be 5 more than the stimulus column to produce better discrimination on the part of the students

Tip Eight - Writing Essay Tests

Advantages of Using Essay Tests

  • Most advantageous when assessing complex learning outcomes.
  • Are relatively easy to construct
  • Emphasize communication skills as a fundamental performance in all areas of complex academic disciplines.
  • Cannot be answered by simply recognizing the correct response.
  • Do not permit guessing (although students will bluff).
  • Essay tests enable instructors to see how students select, organize and evaluate ideas and apply them to answering the question.
  • Essays are not, however, an efficient way to get at factual matter, associative learning and other lower level cognitive objectives.
  • A well-constructed test will sample a wide range of course objectives at varying levels of the cognitive functions taught in class.

Limitations of Essay Tests

  • They are difficult to score.
  • Their scores are less reliable than those of well-written objective tests.
  • They provide a very limited sample of the content in the typical unit of study.
  • The score is influenced by the reader's overall impression of the student.
  • They do not provide a good situation in which to develop good writing skills.

Reliability Concerns

  • They are somewhat less reliable than objective tests.
  • Studies show factors like time of day, number of papers being read, mood of the reader, where the paper is in the stack etc. all can change the grade of the test.
  • The paper read just before a student’s paper can greatly influence the outcome of the grading process (both good and bad).
  • A reader reading the same paper a second time is likely not to give it the same grade.
  • Expectations that an instructor has for a student’s performance influence scoring.
  • Physical elements of the paper (handwriting, erasures, crossing out material, writing style) can impact a reader’s view of the paper.
  • In a study at Arizona University comparing essays that were first handwritten and then retyped, the typed essays scored almost one full grade higher.
  • The use of only a few selected topics increases the possibility that students may get very high, or very low, scores by the luck of the topic draw.
  • There is no data to support the claim that students do better on essay tests than on objective tests.

Making Essay Tests Better

  • Restrict essays to assessing outcomes that require complex higher level cognitive functions.

Examples

  • Compare and contrast X and Y with regard to given qualities.
  • Present arguments for and against a given issue.
  • Illustrate how a principle explains facts.
  • Illustrate cause and effect.
  • Describe an application of a rule or principle.
  • Evaluate the adequacy, relevance, or implication of an arrangement, or materials and so on.
  • Form new inferences from data.
  • Organize the parts of a situation, event, or mechanism and show how they interrelate into a whole.
  • Sort out the relevant parts as distinct entities from a total situation, event, or mechanism.

  • Limit the breadth of the essay question.
  • It should be tied to a single objective.
  • If the question is too broad, it cannot be answered in a short time period and grading it becomes very difficult.

Example POORLY WRITTEN: What were the conditions that led up to the Civil War?

  • All writers should be asked to respond to the same set of test items.
  • Giving students choices, although it appears fairer, actually creates dozens of different tests, makes comparisons impossible, and does not allow for a common grading scale.
  • Grammar and spelling should only be taken into account in the grading if they are being taught as an objective in the course.
  • Directions need to be crystal clear and should include what type of writing is being sought (outlines, complete prose, lists).
  • The question should lead the student toward the answer that the instructor wants.

Example POORLY WRITTEN: Why does an internal combustion engine work?

Example WELL WRITTEN:  Explain the function of fuel, distributor, and the operation of the cylinder’s components in making the internal combustion engine run.

  • List the number of points that each question is worth so that students can grasp the importance of each question.

Scoring Essay Tests

  • Conceal students' names.
  • Use a computer lab if available and have students all use the same font and double spacing.
  • Before reading the papers, skim through a few to get an overall feel for them: what a typical response looks like, how extensive the responses are, and which questions students may have had difficulty with.
  • Read only one item across all papers before going on to the next item. This will help instructors apply the same criteria across all papers. Also, the reader has only one criterion (one answer) to keep in mind.
  • Reshuffle the stack of papers after reading through each item. This ensures that no one paper will suffer from always following a good paper or reap the benefits of following a bad paper.
  • Use a prescribed reading procedure, either the "key procedure" or the "ranking procedure".

Grading Procedures

  • In the key procedure the reader lays out the ideas that the student should have developed in a complete answer, along with the number of points the student will get for each component of the answer. Research has shown that this is a more reliable scoring process than having no prescribed procedure.
  • The reader writes their own correct and complete answer and reviews it constantly while reading the papers, scoring each paper on how well it reflects what the reader (teacher) wrote.
  • In the ranking procedure the reader goes through the pile on the first question and sorts the papers into 5-7 piles depending on their quality. Grades are assigned relative to the order of the piles (best to least).
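
The key procedure amounts to a checklist with point values. The sketch below shows one possible way to record it, assuming the reader decides which expected ideas are present in an answer; the function name, the ideas in the key, and the point values are all hypothetical.

```python
def score_with_key(scoring_key, components_found):
    """Score one essay answer against a key of expected ideas.

    scoring_key: dict mapping each expected idea to its point value.
    components_found: set of ideas the reader judged to be present.
    """
    return sum(points for idea, points in scoring_key.items()
               if idea in components_found)


# Hypothetical key for one essay question worth 10 points.
key = {"states the principle": 4, "gives an example": 3, "explains the cause": 3}
found = {"states the principle", "explains the cause"}
print(score_with_key(key, found), "of", sum(key.values()))
```

Writing the key down before reading any papers keeps the same point values in force for every student, which is the consistency the procedure is meant to provide.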

Student Bluffing Characteristics

  • Answering every question even though they do not know the answer.
  • Restating the question as a declarative statement and elaborating on the statement usually does this.
  • Blatant agreement. If the issue is important to the instructor this sometimes can earn a few points.
  • A broad generalization without elaboration.
  • Dropping names with no details. "According to Senator…"
  • Emphasizing the importance of the question without really answering it. "This is a vital question in our overpopulated world today…"
  • Writing on a related topic in hopes that there will be some cross over earning some points. " The situation between the

Tip Nine - Test on a Regular Basis

  • Testing every two to three weeks increases the reliability of the final grade being a true reflection of what the students learned
  • Frequent testing has been shown to produce higher final exam grades in several studies
  • Feedback is one of the most important aspects of student learning—the more tests the more feedback
  • Students are better able to handle smaller amounts of information over shorter periods of time—the use of tests as a means of structuring and motivating students is an effective way of enhancing student learning.

Tip Ten - The Feedback of Test Results is Crucial to Student Learning

  • The usefulness and value of the feedback tests can give students diminishes with every passing hour—you cannot get the tests back too soon.
  • Students need to see what they got wrong and understand why it was wrong in order to fully benefit from the learning the test could provide.
  • Faculty need to do a post-test analysis to determine which items were effective in measuring what they intended them to measure and which were too easy or too difficult for students—this is the best way to develop quality tests.

Faculty wanting further information about any of these topics are encouraged to contact Terry Doyle at doylet@ferris.edu
