Every school and every classroom is different. The variables include the teacher, the students, the curriculum, the degree of difficulty and the grading, among other factors.
Because of these wide global and local variations, it is essential to have an assessment that shows how well students can use the English language, whether they live or went to school in Beijing or Jakarta. Such an assessment should be an integral part of the teaching and learning process.
“A classroom teacher has a lot of different goals for their students beyond just their language ability. The test questions on the TOEFL iBT test reflect the target language use, such as language tasks in an academic setting. That’s what guides us,” explained ETS Assessment Director Mitch Ginsburgh.
When it comes to a high-stakes assessment of English-language proficiency for entrance to a university, it’s important to design the best test possible. “If a student applies to a university of English-language instruction, it’s important that he or she can function linguistically in the academic environment. A test such as the TOEFL iBT needs to show that a student has the language skills to follow a lecture or write a term paper in English,” says ETS Research Scientist Veronika Laughlin.
Not every assessment company can design, develop, administer, and score a high-stakes English proficiency test like the TOEFL iBT. “Basically, developing a large-scale test takes an army of people,” said Laughlin.
The TOEFL iBT test requires several highly trained professionals to follow a series of steps and reviews to ensure fairness, reliability, rigor and accuracy when evaluating the test takers’ English language proficiency level. Both sides of the process are essential, as universities must be assured that they are getting students who will succeed academically after they are admitted, while test takers have to feel confident that, no matter what their circumstances or background are, they will receive an equal and fair shot to showcase their English proficiency.
“The public would be really surprised at how many people are involved in designing one test,” says Managing Research Scientist Larry Davis. “People think it’s an easy thing to do. But it is kind of like an aircraft carrier. It takes 5,000 people to make it work.” Well, maybe not quite that many for the test.
The past four decades have been spent crafting a process for test development that requires great care and precision. “It’s done so rigorously because we have to be able to replicate the design. If it’s not done carefully, students at a university might not be able to understand lectures and read their textbooks,” said ETS Senior Director Susan Nissan. “There’s so much at stake for these students. Studying abroad without the requisite language skills can dramatically change someone’s life,” she said. “In the most extreme case, not being able to read, listen, write and speak English on a test can be the difference between staying in the country in which they want to study or being sent back to their native country.”
New employees in assessment development at ETS sometimes don’t realize how long it takes to get fully up to speed. “When we get a new person, it takes about a year for them to get fully trained. Most of our new staff come with a graduate degree and years of relevant experience. Many English-learning test developers lived and taught overseas for an extended period of time,” says Nissan.
They are first trained to write questions for the reading and listening passage sections. Once they master the basics, they are trained to review other people’s questions. As many as nine ETS experts will review a question before a test taker sees it.
Another way that the TOEFL iBT test distinguishes itself from others is the design of its framework, which includes the purpose, the points of measurement, the tasks needed to assess the abilities and the domain of the language being measured. In the case of the TOEFL iBT test, that domain is a university or other higher education academic setting. The volume of research ETS conducts is unusual compared to other test companies, according to Davis, who added that ETS also funds outside researchers to provide more objectivity.
Nissan explained that it’s a “collaborative process” between research and assessment development. “In general, research provides a framework for measurement and then assessment development designs the specifications for question and form development based on that framework.”
Another point of differentiation between the TOEFL iBT test and its competitors is the “care that goes into selecting the reading passages,” says Nissan. External consultants are trained to identify excerpts of published texts. These “passage finders” submit passages, which are then reviewed by assessment development staff, who determine whether the content, complexity, length, density and abstractness of each passage are appropriate and whether it has the type of academic content students will see in real life. “It’s often a challenge to succeed at university, especially when you’re learning all the new concepts in a different language from the one in which you grew up.”
The TOEFL iBT test is also distinct among its peers because ETS sponsors an advisory board known as the Committee of Examiners, an elite group of 12 professors from around the world who advise, consult on technical aspects, and write reports on what they hear and see in terms of the latest research. “They learn so much about the rigors of ETS products and services. They learn about the quality from the inside that’s useful for the long-term success of the program,” says Nissan. “They become ambassadors for TOEFL in various academic communities.”
But maybe what the average person would be most surprised about in regard to the TOEFL iBT test is that it is constantly changing. “ETS is different from other testing organizations because it does ongoing research,” Davis says. “Once the test goes out, the research doesn’t end. There are three different strands – foundational research looking into the knowledge base, research on operational assessments and how they’re performing and validation research to make sure we have backups to every single claim.”
Research obviously takes a lot of time, but it is a major component of building a test that is fair to everyone. It is also part of the reason test takers may not appreciate the effort, work and care that go into the process. “One question in a four-hour test might take a test taker 10 seconds to answer, but 10 (ETS) staff spent four hours to make sure it was right,” said Ginsburgh.