
A game of ‘Good Test, Bad Test’



Teachers in Seattle are refusing to participate in a set of assessments designed to measure student growth over the course of a school year.  The MAP, or Measures of Academic Progress, is used by the Seattle public schools as a supplement to their end of the year summative exams in order to track student progress and calculate teacher value-added, the specific contribution an individual teacher made to a student’s learning.  The teachers don’t believe the tests are an accurate measure of student performance and object to them being used as a part of their evaluations.

The teachers’ efforts have won the support of some of education reform’s staunchest critics. Impassioned leaders of the boycott have gone so far as to compare their struggle with that of Martin Luther King and the civil rights movement.

While the opportunity to take that hyperbole to task is almost too good to pass up, I’d prefer to address some of the specific concerns that teachers have with this particular test, and why I think they’re wrong.

For what it’s worth, I think teachers should absolutely push back against bad tests or the use of tests to do bad things.  But it doesn’t look like the MAP falls into either of those categories.

Let’s take a look:

Bad tests take up days or weeks of valuable instruction time

Teachers often complain, and rightly so, that testing takes away from already-limited instructional time.  When all of the required tests are added together, schools often wind up administering 2-3 weeks of testing in a 36-week school year.  While assessing students is important, actually teaching them is far more so.

The MAP tests take about an hour apiece

The MAP tests in Reading and Math each take about an hour to complete.  That means that even taking the tests 2-3 times a year (as Seattle does) only means sacrificing at most 6 hours of instruction, less than one day.
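
To make that comparison concrete, here is a back-of-the-envelope sketch of the arithmetic. The test counts and the 36-week year come from the figures above; the 5-day week and the 6.5-hour school day are my own assumptions and are not Seattle-specific data.

```python
SCHOOL_WEEKS = 36      # school-year length cited above
DAYS_PER_WEEK = 5      # assumption: standard school week
HOURS_PER_DAY = 6.5    # assumption: length of a school day, not stated in the post


def map_testing_hours(subjects=2, hours_per_test=1.0, administrations=3):
    """Hours spent on MAP per year: Reading and Math, about an hour each."""
    return subjects * hours_per_test * administrations


def share_of_year(hours):
    """Fraction of the school year's hours that a block of testing consumes."""
    total_hours = SCHOOL_WEEKS * DAYS_PER_WEEK * HOURS_PER_DAY
    return hours / total_hours


map_hours = map_testing_hours()          # 2 subjects x 1 hour x 3 sittings
map_days = map_hours / HOURS_PER_DAY     # expressed in school days
two_weeks_hours = 2 * DAYS_PER_WEEK * HOURS_PER_DAY  # low end of "2-3 weeks"

print(f"MAP: {map_hours:.0f} hours, {share_of_year(map_hours):.2%} of the year")
print(f"Two weeks of testing: {share_of_year(two_weeks_hours):.2%} of the year")
```

Under those assumptions, the MAP costs a fraction of a percent of the year, while two weeks of testing costs more than five percent.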

Bad tests offer a crude look at how many students clear some imaginary bar

The much maligned No Child Left Behind-mandated proficiency tests establish a performance bar for a given grade and subject and simply measure how many students clear it.  They are designed to be most sensitive around the “cut point” of proficiency so as to have the best ability to clearly distinguish whether or not a student has demonstrated what he or she needs to know for that year.  As a result, they are terribly insensitive in the tails of the distribution (in measuring the knowledge of the highest and lowest performing students) and are ill-equipped to measure growth throughout a year (which is what we really want the test to tell us).

The MAP test is computer adaptive to offer a more fine-grained assessment of student performance

The MAP test is computer adaptive, meaning that the test tailors questions to each student as they take it, to get the most accurate measure of what he or she knows.  Rather than measure whether or not the student cleared some proficiency bar, it assesses the relative position of that student as compared to a nationally representative sample of students.  The result is more fine-grained, and is a much clearer picture of what the student does and does not know.  Also, because it is administered several times over the year, teachers and school leaders can track progress and intervene when it becomes clear that a student is falling behind.
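
For readers curious about what "adaptive" means in practice, here is a toy sketch of the general idea. It is not the MAP's actual algorithm, which I have not seen; real adaptive tests rely on statistical models of how students respond. This version assumes a made-up 0-100 difficulty scale and an idealized student who gets a question right exactly when it is at or below his or her level. The names `adaptive_estimate` and `make_student` are hypothetical.

```python
from typing import Callable


def adaptive_estimate(answers_correctly: Callable[[float], bool],
                      low: float = 0.0,
                      high: float = 100.0,
                      num_questions: int = 10) -> float:
    """Home in on a student's level by adjusting question difficulty."""
    for _ in range(num_questions):
        difficulty = (low + high) / 2      # ask a question in the middle
        if answers_correctly(difficulty):
            low = difficulty               # right answer: next one is harder
        else:
            high = difficulty              # wrong answer: next one is easier
    return (low + high) / 2


def make_student(true_level: float) -> Callable[[float], bool]:
    """Idealized student (assumption): correct iff difficulty <= true level."""
    return lambda difficulty: difficulty <= true_level


strong = adaptive_estimate(make_student(73.0))
struggling = adaptive_estimate(make_student(20.0))
print(round(strong, 1), round(struggling, 1))
```

Each answer halves the range of levels still in play, so a handful of questions pins down both a strong student and a struggling one. A fixed test built around a single cut point spends most of its questions near that bar and tells you far less about either.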

Bad tests are used as the sole measure of a teacher’s performance

No one test can possibly capture a teacher’s total contribution to a child’s learning.  If tests are going to be used as an evaluative tool for teachers, they should only comprise some part of that evaluation, because they can only measure one facet of what we want from our teachers.

The MAP test is used as only one part of a teacher’s evaluation, and a small part at that

Teachers in Seattle are primarily evaluated based on classroom observations, not tests.  In fact, test scores only serve as a warning sign to principals to place teachers on performance improvement plans. It appears that MAP is only being used as a diagnostic tool, not an evaluative one.

Bad tests don’t give actionable information for months

It is not uncommon for teachers and schools to get the results of their standardized tests months after they have been administered.  These tests are still taken with paper and pencil and thus have to be collected, sorted, and graded, all of which takes time.  As a result, teachers get little up-to-date feedback that can inform their instruction.

The MAP test gets results back in days

Because of the computerized nature of the tests, teachers and school leaders can get results back in just a few days.  This provides them with information that they can use immediately, allowing them to provide remediation to those that are behind and enrichment to those that are ahead.

If MAP had any of the problems of bad tests listed above, I’d be on the side of those teachers right now.  We have a lot of bad tests out there, and we have a lot of good tests being used in dumb ways.  However, that does not appear to be the case in Seattle.  It looks like the MAP is being used the way that it should be.

No test, and thus no teacher evaluation system, is perfect.  We know this.  But what tends to get lost in this discussion is the fact that before tests like MAP came around, 99% of teachers were rated effective and schools had little or no information about where students were during the year.   MAP is clearly an improvement on that, and one that should be built upon, not boycotted.

7 thoughts on “A game of ‘Good Test, Bad Test’”

  1. Virginia uses something called PALS – Phonological Awareness Literacy Screening

    but it’s primarily an assessment tool to determine a particular child’s strengths and weaknesses, and the primary teacher of the child for a range of subjects may not be the child’s only teacher because schools are specializing now in dealing with kids’ deficits. There are a number of “specialist” positions for reading, math, even learning issues.

    so how fair would it be to just judge one teacher for all these different issues that may well be handled by other teachers – specialists?

    This is why teachers are reactive to the idea of narrowing down these assessments to just one teacher.

    A child that shows up with material weaknesses will likely need to be assessed then a plan developed to catch them up and basically a team approach to dealing with his/her needs.

    how do you judge the performance of a team because that’s what is going on in schools today.

    • You could do it like they do in sports– fire the coach. Give the principal broad powers and judge the overall results. Or, you could have lots of schools–call them charter schools–who compete for students by their reputation for doing the job.

      No test is perfect. But the incessant whining and foot-dragging make it seem the teachers resent the very notion that their performance should be judged.

      • I’m not opposed to the testing. I support the testing but what I’m saying is that teaching is team teaching in many schools these days.

        Further – subjects like elementary reading and math are far more important than say 10th grade photo journalism.

        we should be testing for core academic subjects primarily in my view but how do we fairly test for a kid who has different teachers at different times? A kid in 3rd grade may spend 3 months getting special help for reading then return to his home teacher.

        who gets credit or blame and how do you decide?

        Part of our problem is that we have a sound-bite appreciation of how schools operate nowadays.

        It’s not one teacher in one classroom teaching all subjects all the time.

        the PALs system in Va actually assesses the kid on a per grade level basis with granularity to the tenth so when you test a kid in the middle of the year in first grade -if the kid is on grade level – he’ll test out at 1.5.

        but that kid may have a home teacher and a special reading teacher.. that he sees one or two days a week for an hour or two depending on his needs.

        how do you calculate that – fairly?

  2. The MAP test is only used as one part of a teacher evaluation. If there are low growth scores a teacher will be more closely observed and could eventually be placed on probation. If we look at the way observations are handled in urban public school districts now (Philadelphia’s red light/green light system) we can translate that “more closely observed” to “beginning the removal process.” If good tests don’t take much time, measure student growth and get results back quickly…let the MAP test do just that. Do not tie teacher evaluation in with the test. The majority of a teacher’s success is said to be contingent on his/her observations…let that fully be the case because in today’s world that likes to judge schools on sound-bites…the last thing we need is to feed into these desires with more sound-bite measurement like a quick test score that labels: good teacher/bad teacher.

    • re: ” If there are low growth scores a teacher will be more closely observed and could eventually be placed on probation.”

      again, there is a sound-bite perception here that only one teacher is involved especially when the child has deficits.

      how is one teacher supposed to keep 3/4 of the class on grade level while 1/4 are not only behind but behind in different areas?

      this is a misunderstanding of “observation” also.

      For instance, how do you “observe” a teacher where 3/4 of the class are on grade level and the teacher is performing per the specs for on-grade level kids and 1/4 of the kids are behind?

      how do you handle that?

      how do you handle one class where all kids start out on grade level and another class where 1/3 of them are behind in some subjects?

      the sound-bite approach to this basically ignores the reality of the real world.

      In urban and rural schools – you have parents who themselves are close to being illiterate and are of little help to their kids and these kids often are behind the kids whose parents are well educated.

      this causes big issues for new teachers straight out of college whose background more or less assumes homogeneous on-grade level classes.

      • suppose you have a veteran teacher and she/he is not up to the task – do you really think you can replace that teacher with one straight out of school?

        that’s wishful thinking in a lot of schools and classrooms where there are significant numbers of kids who are NOT on grade level and have significant deficits in some areas – and need not only an experienced teacher but reading and math specialists to send the kids to – to deal with their deficits.

        this is not the sound-bite world people think it is.

        it’s a complex world that works best with experienced teachers and reading and math specialists.. and a logistical plan to ferry kids back and forth between their home classroom and their special help.

        As I said before, I am IN FAVOR of testing. I’m a big believer in the PALs assessment system in Va (that is not yet in widespread use) – so that we can know on a per kid basis – where they are strong and where they are weak – and then to get them help in the areas where they are behind.

        My problem here is how do you set up a fair system of evaluating teachers in such a multi-teacher setting?

        the idea that there is one teacher, teaching all kids, all the time – is a Model T misunderstanding of modern education.

  3. re: ” the PALs system in Va actually assesses the kid on a per grade level basis with granularity to the tenth so when you test a kid in the middle of the year in first grade -if the kid is on grade level – he’ll test out at 1.5.”

    not just for the grade and not even for the major subject like reading and math but the sub-parts of reading – more than a dozen different areas are assessed to determine if the kid is “on grade level” for THAT particular area.

    And if the kid is NOT on grade level – he/she gets sent to spend some time with a specialist to concentrate on that specific deficit – while the home teacher continues the lessons for the kids that are on grade level.

    it’s not only a logistical dance, there are many different teachers involved depending on whether the kid is fully on grade level for all subjects – or not and if not what areas do they need help in – which may be given by one of several different types of concentration specialists.

    One of the problems that we have is a sound-bite perspective of what teaching is these days – and it’s a totally wrong and distorted perspective that has little to do with the actual realities.
