Get Into the Exciting World of High-Stakes Educational Assessment Testing!
Score the New SAT!
by Timothy Horrigan
Copyright © 2005-2008, Timothy Horrigan
This article used to be part of my page entitled "How to Get a Job as a Measured Progress Test Scorer." The original URL was:
How & Where to Apply:
In March 2005, there was a lot of publicity about the new SAT, which is more curriculum-oriented than in the past, and which now includes a writing section. The writing section features a fairly short essay along with some multiple-choice grammar questions. The highly entertaining analogy section of the old test has been eliminated.
The essay has to be written in 25 minutes (in longhand with a Number 2 pencil) and is expected to follow the classic five-paragraph model (i.e., introduction, thesis, antithesis, synthesis, and conclusion.) It is graded on a 0 to 6 scale, with 6 being the ideal score, much like the scale Measured Progress uses for its writing tests. The main difference is that a piece of writing which totally fails to answer the SAT prompt while still saying something intelligible gets a zero. At Measured Progress, such responses are scored on their own merits, although it is impossible to get a full 6 points. Also, Measured Progress doesn't mark you down for using a pen instead of a #2 pencil, whereas an SAT essay written in anything other than #2 pencil gets a zero.
Typically, two readers score each essay. The two readers' scores are simply added together to get a total score (always an integer) from 0 to 12. The writing score is broken out separately on a student's SAT score report, as well as being mixed into the secret formula which produces a score between 200 and 800.
Slight digression: The essay makes up only 25% of the overall writing test, the other 75% being multiple-choice questions. This demonstrates one of the recurring problems with test scoring, i.e., that the supposed precision of the reported scores is much finer than the precision of the scoring instrument itself. There are 13 score points on the essay (the integers 0 through 12.) The 0 is only used if the student doesn't answer the question at all, which means he or she is totally blowing off the test. The 1 (=0+1) should in theory never be used, since a non-answer is objectively quite different from a minimal answer. So, effectively there are 11 score points: 2 through 12. (The odd numbers, it is worth noting, turn up only when the two scorers disagree on the score.) According to Barbara Whitaker's April 17, 2005 New York Times article, most good students get a 9 or a 10. Even though there are only on the order of ten possible essay scores, the overall writing score is an integer between 200 and 800. Ostensibly, there are 601 bins that the students are being sorted into. In practice the scores are only calculated to multiples of ten, but that still means the students are being sorted into 61 bins, which is more than five times the number of usable score points on the essay question. Even when you consider that the students also answer several dozen multiple-choice questions, you can see that the test is inherently much less precise than the scoring system implies. (The SAT is by no means the worst offender in this area. The College Board uses a half-day test to produce 200-800 scores on only three broadly-defined skills: reading, writing, and math. Some states' K-12 assessment programs report 3-digit scores on one or even two dozen different skills, where each score is based on responses to only a few out of many questions asked during one day or a few days of testing.)
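The bin counts in the digression above can be checked with a few lines of Python. This is only a sketch of the arithmetic as described in the text: it assumes each of two readers gives an integer from 0 to 6 and that the reported scale runs from 200 to 800. The variable names are my own and have nothing to do with how the College Board or Pearson actually computes anything.

```python
# Each of two readers gives an integer score from 0 to 6 (0 = no answer at all).
reader_scores = range(0, 7)

# Every total that can result from adding the two readers' scores.
totals = sorted({a + b for a in reader_scores for b in reader_scores})

# The article argues that 0 and 1 are not real score points, leaving 2 through 12.
effective_totals = [t for t in totals if t >= 2]

# The reported writing score runs from 200 to 800.
bins_by_one = len(range(200, 801))      # every integer from 200 to 800
bins_by_ten = len(range(200, 801, 10))  # multiples of ten only

print(len(totals), len(effective_totals))  # 13 11
print(bins_by_one, bins_by_ten)            # 601 61
```

Even on the coarser multiples-of-ten scale, there are 61 reported bins against 11 usable essay score points.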
Each school year, approximately two million students take the SAT. Each essay is read and scored at least twice, occasionally more than twice. This means that the process of reading and scoring an essay will be repeated a little more than four million times. (Yes, I know that calculation was Math and not English/Language Arts.)
The College Board has contracted with Pearson NCS to do the scoring. The scorers are supposed to be experienced high school or college teachers. However, the scoring takes place during brief "scoring windows" right after the tests are administered, i.e., during the school year. This makes it hard for active teachers to moonlight as essay readers. Non-teachers with extensive writing experience may be hired if Pearson can't recruit enough teachers. The scoring is done at home, using scorers' own Windows computers (400 MHz or faster, with at least 128 MB of RAM.) The pay scale starts at $17/hr.
Some General Info:
April 3, 2005 LA Times op-ed by Karin Klein entitled "How I Gamed the SAT"
This is an interesting article by a Los Angeles Times editorial writer who moonlighted as an SAT test reader. Her main criticism of the test is that students are rewarded based on superficial features of their essays, regardless of whether or not their arguments make any sense. For example, students are rewarded for giving multiple examples to support their case, even if the examples don't actually support the case. She also amusingly points out that professional writers like herself normally take more than 25 minutes to write persuasive essays, and don't have to write them out in longhand with a Number 2 pencil.
The College Board's sample Essay prompt. Ironically, even though you get zero points for an off-topic essay, the topics are vague to the point of utter pointlessness! (I would probably get scored down for overusing the word "point" just now, by the way.) Here is an example:
The essay measures your ability to:
develop a point of view on an issue presented in an excerpt
support your point of view using reasoning and examples from your reading, studies, experience, or observations
follow the conventions of standard written English
The essay will be scored by trained high school and college teachers. Each reader will give the essay a score from 1 to 6 (6 is the highest score) based on the overall quality of the essay and your demonstration of writing competence. For more information, see How the Essay is Scored.
The essay gives you an opportunity to show how effectively you can develop and express ideas. You should, therefore, take care to develop your point of view, present your ideas logically and clearly, and use language precisely.
Your essay must be written on the lines provided on your answer sheet-you will receive no other paper on which to write. You will have enough space if you write on every line, avoid wide margins, and keep your handwriting to a reasonable size. Remember that people who are not familiar with your handwriting will read what you write. Try to write or print so that what you are writing is legible to those readers.
Important Reminders:
A pencil is required for the essay. An essay written in ink will receive a score of zero.
Do not write your essay in your test book. You will receive credit only for what you write on your answer sheet.
An off-topic essay will receive a score of zero.
If your essay does not reflect your original and individual work, your test scores may be canceled.
You have twenty-five minutes to write an essay on the topic assigned below.
Think carefully about the issue presented in the following excerpt and the assignment below.
Many persons believe that to move up the ladder of success and achievement, they must forget the past, repress it, and relinquish it. But others have just the opposite view. They see old memories as a chance to reckon with the past and integrate past and present.
—Adapted from Sara Lawrence-Lightfoot, I've Known Rivers: Lives of Loss and Liberation
Assignment: Do memories hinder or help people in their effort to learn from the past and succeed in the present? Plan and write an essay in which you develop your point of view on this issue. Support your position with reasoning and examples taken from your reading, studies, experience, or observations.
A scathing September 20, 2007 Boston Globe article by Linda E. Wertheimer reported that (to quote the headline) "Many colleges ignore SAT writing test." Few if any selective colleges pay much attention to the score, "[frustrating] students who spend hours and sometimes thousands of dollars preparing for it." The basic objection to the test is that it's a dumb test, period: students have just 25 minutes to write an essay, which is graded without taking into account whether or not the facts are correct. The formula for the essay reflects junior high work more than college-level work, and Les Perelman of MIT's writing program says "we have to spend a year in freshman composition deprogramming" the students.
URL for Globe article (may change, break and/or require payment in the future):
The Globe provided a couple more examples of SAT writing prompts:
Prompt 1
Think carefully about the issue presented in the following excerpt and the assignment below.
People are happy only when they have their minds fixed on some goal other than their own happiness. Happiness comes when people focus instead on the happiness of others, on the improvement of humanity, on some course of action that is followed not as a means to anything else but as an end in itself. Aiming at something other than their own happiness, they find happiness along the way. The only way to be happy is to pursue some goal external to your own happiness.
Adapted from John Stuart Mill, Autobiography
Assignment:
Are people more likely to be happy if they focus on goals other than their own happiness? Plan and write an essay in which you develop your point of view on this issue. Support your position with reasoning and examples taken from your reading, studies, experience, or observations.
Prompt 2
Think carefully about the issue presented in the following excerpt and the assignment below.
Heroes may seem old-fashioned today. Many people are cynical and seem to enjoy discrediting role models more than creating new ones or cherishing those they already have. Some people, moreover, object to the very idea of heroes, arguing that we should not exalt individuals who, after all, are only flesh and blood, just like the rest of us. But we desperately need heroes to teach us, to captivate us through their words and deeds, to inspire us to greatness.
Adapted from Psychology Today, "How To Be Great! What Does It Take To Be A Hero?"
Assignment:
Is there a value in celebrating certain individuals as heroes? Plan and write an essay in which you develop your point of view on this issue. Support your position with reasoning and examples taken from your reading, studies, experience, or observations.
SOURCE: The College Board
This is by no means a comprehensive list.
The New SAT is just one of Pearson NCS's testing contracts. The company is headquartered in Minneapolis-Saint Paul, with K-12 test scoring centers in several other states, such as Virginia, Texas, Arizona, Florida and Michigan. They are part of the same multinational conglomerate which publishes Penguin Books. They like to post their entries on monster.com. (I would be very happy if you used one of my monster.com links to apply, hint hint, nudge nudge.) They also like to use temp agencies such as Kelly Services.
Measured Progress has its headquarters in Dover, NH and satellite scoring centers in several states, most notably Colorado. (Click here for more info on working for Measured Progress.)
A company with a confusingly similar name, Measurement, Inc., maintains several scoring facilities across the country, in such states as North Carolina, Ohio, Tennessee, Illinois (and others.) To apply, you (ostensibly) fill out a PDF form and mail it to the appropriate location. (Click here for more info on Measurement, Inc.'s Reader/Evaluator openings.)
Data Recognition: also based in Minnesota. They do various forms of marketing research as well as educational assessment testing.
Harcourt Assessment: Harcourt's educational-testing division, headquartered in San Antonio.
CTB/McGraw-Hill: McGraw-Hill's educational-testing division, headquartered in the Boston area, with openings in many states.
Educational Testing Service: the GRE people, who also do some K-12 testing. They used to do the SAT, but the College Board recently gave that contract to Pearson NCS. But I bet you knew that already!
Aside from the open-response writing prompt, the rest of the new SAT works just like the old SAT. It consists of several dozen multiple-choice questions. In March 2006, there was a bit of a furore after the College Board admitted that a few of the scores from the fall 2005 tests might be just a wee bit off. Initially, these errors supposedly just affected a few dozen tests, and the scores would be off by no more than 100 points, and typically by just 10 to 40 points. Then, we were told that exactly 0.8% of the test papers (1 in 125) were affected. There have been reports of scores being as much as 200 points off or even more. The College Board is adjusting overly low scores upwards but the smaller number of overly high scores will be left the way they are.
I don't know anything more about this than what I have read in the papers. The various news reports agree that, sometime in the fall of 2005, somehow something mysteriously went wrong while the multiple-choice questions were being scored at a facility somewhere in Texas. Supposedly, the problem was of a highly technical nature.
The first wave of news stories went into no technical detail whatsoever about what happened or how. One of the better news stories was:
After a few days of controversy, the College Board felt the need to offer an explanation. It was a rather lame explanation, using everyone's favorite excuse— unusual weather. The October 8 test session coincided with a week of record rainfalls in the northeast US, especially New Jersey. Papers from those areas absorbed abnormally large amounts of moisture, which caused the papers to be marked in an "unacceptable manner" and/or caused the marks to be lined up incorrectly.
It is worth mentioning that the papers were actually scanned in Austin, Texas, not in the Northeast, and they were scanned quite some time after the test was administered. This incident raises some disturbing questions about quality control in the testing industry. Aside from the fact that it took them four or five months to address the erroneous scores, it is shocking that Pearson and the College Board didn't design the scanning system so it wouldn't choke on damp test papers. Yes, this is a technical issue, but it is also a technical issue related to a technology (optical mark sensing) which has been around for decades.
My personal pet theory about this snafu is as follows: I think there was probably a problem with exactly one of the many answer keys corresponding to the many versions of the test. If there are five different versions of each of the three sections (Reading, Writing and Math) then you would get 5^3=125 different test forms (and 125 corresponding answer keys.) 1 wrong answer key out of 125 is exactly 0.8%! It makes sense to me (though perhaps not to anyone else.)
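For anyone who wants to check the arithmetic behind this pet theory, here is a minimal sketch. The five-versions-per-section figure is purely the supposition made above, not a documented fact about how the SAT is assembled, and the variable names are just illustrative.

```python
from fractions import Fraction

# Supposition from the text: five versions of each of three sections.
versions_per_section = 5
sections = 3  # Reading, Writing and Math

# Every combination of section versions is a distinct test form with its own key.
forms = versions_per_section ** sections

# Share of papers affected if exactly one key is bad and forms are used equally.
percent_affected = Fraction(100, forms)

print(forms)                    # 125
print(float(percent_affected))  # 0.8
```

The 0.8% figure only falls out this neatly if each form was given to about the same number of students, which is another assumption on my part.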
The news stories indicate that most of the mistakes led to lower scores— but not all. This is consistent with my answer-key theory. SAT questions are constructed with 5 choices: 1 correct answer, 1 "distractor" (which is plausible but wrong), and 3 blatantly wrong answers. The right answer is usually the most popular choice, and the right answer and the distractor are virtually always the two most popular answers. So if you apply a random answer key to a question, you are much more likely to score the real right answer as being "wrong" than to score a wrong one as "correct."
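The lopsidedness can be illustrated with made-up numbers. In the sketch below, the response shares (60% choosing the right answer, 25% the distractor, 5% each of the other three) are hypothetical figures chosen for illustration, not real SAT data, and the "random" bad key is assumed to point at each of the five choices with equal probability.

```python
from fractions import Fraction

# Hypothetical response shares for one question (illustrative, not real data).
shares = {
    "correct": Fraction(60, 100),
    "distractor": Fraction(25, 100),
    "wrong_1": Fraction(5, 100),
    "wrong_2": Fraction(5, 100),
    "wrong_3": Fraction(5, 100),
}

# Assumption: a random bad key lands on each of the five choices equally often.
p_key = Fraction(1, len(shares))

# Students who chose the real answer lose credit whenever the key points elsewhere.
lose_credit = shares["correct"] * (1 - p_key)

# Students who chose a wrong option gain credit only when the key lands on it.
gain_credit = sum(p_key * s for name, s in shares.items() if name != "correct")

print(float(lose_credit), float(gain_credit))  # 0.48 0.08
```

Under these assumed shares, a response is six times as likely to lose credit it deserved as to gain credit it didn't deserve.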
The official "humidity" explanation simply doesn't make much sense— not to me, at least (though what do I know?)