Conference 2000 week 3 w. james popham



Week 3 - Outcomes and Standards


 Principals in Peril: Judging Quality With the Wrong Yardsticks  



DR W. JAMES POPHAM
Hawaii, USA



UNITED STATES principals are currently being clobbered because of the widespread but nonetheless deplorable use of educational tests. Increasingly, US principals and their teaching staffs are being evaluated on the basis of students' scores on standardized achievement tests. As a consequence, all sorts of harmful educational mischief is now taking place in the US. If Australian principals can forestall such silliness in their own nation, they had best do so.

In this brief paper, I'd like to describe this American assessment absurdity, indicate a few of the educationally unsound practices flowing from it, and suggest about the only solution strategy I've been able to come up with. I've spent only three weeks in Australia, and that was in Western Australia, so my perceptions of your country are undoubtedly warped.

Yet I believe that Australian principals may already be wrestling with the same kinds of difficulties now experienced by their American counterparts. Let's turn, then, to the problem now facing US school administrators.

Accountability-Induced Conduct

For the past two decades or so, American educators have been buffeted by increasingly frequent demands for genuine educational accountability. Taxpayers want evidence that the dollars they spend on our public schools are paying off, hence are calling for evidence that schools are helping students achieve worthwhile outcomes. Accordingly, state lawmakers, responsive to taxpayers' (that is, voters') preferences, have installed a variety of accountability schemes, most of which are based centrally on students' standardized achievement test performances.

In the US there are five nationally standardized achievement tests, for example, the "Comprehensive Tests of Basic Skills". These five tests are developed and marketed by three commercial test publishers. In addition, officials in some of the 50 states have authorized the creation of state-specific standardized achievement tests, most of which ended up being built by the same three publishers that previously created the five national tests.

Generalised Naïve Assumption

Most Americans naively think that the best way to judge a school's quality is to see how well its students perform on standardized achievement tests. So, when the results of a state's spring administration of a standardized achievement test are available, it has been common, for more than a decade, to see schools ranked in local newspapers on the basis of students' scores. Schools in which students earn high scores are regarded as effective schools. Schools with low-scoring students are thought to be ineffective. Clearly, these are high-stakes achievement tests.

And if newspapers' score-based rankings weren't bad enough, more and more states are now actually grading schools - and those grades are dished out chiefly on the basis of students' test scores. Schools with high scores not only get As, but are also given financial rewards. Schools that earn middling grades are usually given instructional support to help them improve. Schools that earn Fs for at least two years can either be taken over by the state or simply shut down. And all of this stems, in the main, from students' scores on standardized achievement tests.

The Wrong Measuring Tools

The tragedy of all this accountability frenzy is that the wrong tests are being used to judge educational quality. As a result, most schools are being inaccurately evaluated. A principal of an effective school may be viewed as ineffective, and vice versa. Let me sketch, ever so briefly, why standardized achievement tests are the wrong tools for this task.

The primary mission of American standardized achievement tests is to differentiate sufficiently among test-takers so that a student's performance can be contrasted with that of a norm group. This is a useful mission, and both parents and educators can profit from knowing, for example, that a fifth-grade child scores at the 85th percentile in reading, but only at the 29th percentile in mathematics.

However, to create tests that differentiate effectively, test-makers must include items that yield a reasonable degree of score-spread, that is, different scores for different students. Two kinds of items do this very well. First, an item strongly related to a student's socioeconomic status (SES) does a great job in contributing to score-spread. If such an SES-linked item is, because of its content, more apt to be answered correctly by children from affluent families than from disadvantaged families, such an item will help spread out students' scores. A student's SES, of course, doesn't change all that rapidly. SES-linked items contribute well to a test's score-spread.

A second kind of item that helps differentiate test-takers is the sort of item better suited to aptitude than achievement tests. Such items are closely related to the inherited academic aptitudes with which children are born. Children, from birth, differ in their academic aptitudes such as their verbal, quantitative, and spatial capacities. An aptitude-linked item is one more apt to be answered correctly by a child who, at birth, was genetically lucky. Again, aptitude-linked items do a super job in creating score-spread because inherited aptitudes are spread out all over the lot, and aren't all that readily amenable to alteration.

Measuring What Children Bring to School

So, to the extent that standardized achievement tests contain many SES-linked or aptitude-linked items, such tests measure what children bring to school, not what they learn there.

Testing-Teaching Mismatches


There's another problem with standardized achievement tests that proves more than a little vexing to American principals. These tests are built to be sold throughout an entire nation - a nation in which curricular preferences vary from locality to locality. Therefore, the best market-driven test-development strategy for a publisher of nationally standardized tests is to base its test's items on a most-common-denominator curriculum in order to create a "one size fits all" test that has a better chance of being widely adopted.

The only problem with that approach, for a specific school, is that the test's content may be a miserable match to what's supposed to be taught (by district or state mandate) in that school. What happens is that the test-makers must sample from their common-denominator content or the tests would be intolerably long for children. But the content sample that's tested may be quite different from the content that's taught - or at least from the curricular content a principal's teachers were supposed to be teaching.

One US landmark study, conducted at Michigan State University in the early 1980s, suggests that on the basis of test-versus-textbook mismatches, there are many instances in which 50 per cent or more of what's tested by standardized achievement tests is simply not taught in a particular school. How can the caliber of a principal's staff be accurately assessed via students' test scores if much of the tested content wasn't even supposed to be taught?

The Consequences of Mismeasurement

Because of these measurement realities, the use of students' standardized achievement test scores to judge the quality of a school's staff is patently unsound. And yet, throughout the US, it's being done again and again. Even though principals, at some intuitive level, realize that they're being appraised with the wrong yardstick, they are still under enormous pressure to improve their school's test scores and to raise "standards!" And that relentless pressure has led to a series of practices that every upright educator ought to bemoan.

For one thing, given the "importance" of these test scores, rampant curricular reductionism is common - wherein instructional attention is devoted only to what's being tested. If there's no standardized test being used in music or art, then any instructional attention to those subjects simply evaporates. And prior to a high-stakes test's administration, a good many schools will for weeks, and even months, focus instruction only on the types of items used in the test.

Even worse, because standardized achievement tests are rarely accompanied by sufficiently detailed descriptions of the skills or bodies of knowledge being tested - sufficiently detailed, that is, for a teacher's instructional planning - many teachers simply coach their students to answer the actual items on the test. Such teachers seem unaware that this sort of item-focused instruction obliterates the validity of any test-based inferences about students' actual skills and knowledge.

And, finally, worst of all, a good many US teachers, and more than a few US principals, have simply cheated on these high-stakes tests - in some cases actually erasing students' incorrect responses, then replacing them with correct responses. Those who have been caught in such misconduct have typically been fired.

Genuine Educational Calamities

The litany of educationally inappropriate practices doesn't stop there, but there are space limitations associated with this electronic give-and-take, so let me conclude by saying that in the US, as a consequence of using the wrong tests as the chief determiner of a school's educational quality, some genuine educational calamities are taking place.

I am not opposed to using students' accomplishments as the chief determiner of a school's educational success. That's only proper. But if the wrong assessment tools are used to gauge a school's success, then bad things happen. They are happening daily in the US.

A Solution Strategy

It is difficult to see the negative consequences of this accountability-induced assessment circus without becoming discouraged. What, then, can be done?

I've devoted considerable thought to that question, and here's the only response that I think has a chance of working. Most fundamentally, we need to make sure that the right tests are being employed to evaluate educators. Standardized achievement tests aren't the only assessment options in the cupboard. We must make sure that more suitable tests are used.

But how will that happen? In my view, only if we can substantially enhance the assessment literacy of educators, parents and educational policymakers will the situation ever change. If these people truly understood why certain tests yield invalid pictures of a principal's effectiveness, I don't think such tests would be used. As long as the US public and the education profession remain ignorant about the assessment issues involved, the current situation will persist.

In America, unfortunately, few administrators, and even fewer teachers, are ever obliged to complete coursework in educational measurement. As a result, there is an alarming amount of misinformation about assessment encountered in teachers' classrooms and principals' offices. That has to change.

But the public itself needs to become more knowledgeable about what sorts of tests should and shouldn't be used to judge educational quality. Promoting parental assessment literacy is a good way to begin. In recent years my own writing has been largely targeted toward the promotion of educators' and parents' assessment literacy. (Information about the three references cited below can be obtained at http://vig.abacon.com/professional.)

Assessment Literacy's Role

If principals in Australia haven't run into the kind of educational landmines I've recounted here, then Australian principals are a lucky lot. If this sort of difficulty is either present or on the horizon, I urge you to promote widespread assessment literacy. Although there are some exceptions, informed people usually don't set policies or carry out activities that will do harm to children.


_____________________________________________________________



ABOUT THE AUTHOR
Dr W. James Popham, a University of California (Los Angeles) emeritus professor, taught measurement and instruction courses at that university for 29 years. He is the author of about 20 books and nearly 200 journal articles, many of which deal specifically with educational measurement. He now resides on the island of Kauai in the US state of Hawaii.

James Popham can be contacted by email at:
wpopham@ucla.edu


_____________________________________________________________



REFERENCES
Popham, W. James. (1999). Classroom Assessment: What Teachers Need to Know (2nd ed.). Boston: Allyn and Bacon.

Popham, W. James. (2000). Modern Educational Measurement: Practical Guidelines for Educational Leaders (3rd ed.). Boston: Allyn and Bacon.

Popham, W. James. (2000). Testing! Testing! What Every Parent Should Know About School Tests. Boston: Allyn and Bacon.

Week 3: 29 May-4 June 2000 - Theme: Outcomes and Standards

APAPDC National Online Conference 2000
Online Conference Management by CyberText
Copyright © APAPDC 2000
