Last week, the Court essentially
laughed away a challenge to the law. The law's reforms addressed a number of
different facets of education, and the suit was based upon that very
diversity. Having already
warned an obstinate lower court judge that the law should not be taken to
violate the single object constitutional standard for legislation, only to have
him stubbornly insist that it did, the Court conclusively and unanimously
decided otherwise. The legal tactic by the plaintiffs, a motley collection of
American Federation of Teachers state and local units, was a cover to try to
cancel policy they didn’t like: most controversially for them, making teacher
tenure more difficult to attain and basing that decision, for about 35 percent
of all teachers, on a value-added measure (VAM) anchored on standardized testing.
In the short term, this changes
nothing. Given the wrenching changes from implementing the Common Core State
Standards (CCSS) and the ongoing debate about them, the use of a VAM, defined
as changes in student scores from one year to another on standardized tests,
as an input into tenure and retention decisions was delayed for last school
year and this one. Even as CCSS is expected to raise the profile of
standardized testing and expand it to cover more teachers (assuming state
policy-makers don’t pull the plug on it, as some will attempt next year),
education leaders, including Louisiana’s, are wondering whether testing needs
to be applied more selectively.
There is considerable truth to
that sentiment, for the more testing there is, the less time is available for
learning. Standardized tests feed into both teacher and school ratings. For
teachers, where a test exists in the subject area, its results account for
half of the evaluation, with the other half being observation. While the logic
for this use is compelling – research demonstrates that change observed over
time in a student’s academic progress is significantly attributable to the
effort of a teacher – the system can be tweaked during the next legislative
session, in the few months before the rule suspension period ends, to improve
the quality of the evaluation.
Some changes can address the
inherent strengths and weaknesses of the VAM. Best practices note some
imprecision in measuring different kinds of students, perhaps most notoriously
that growth potential is far higher among the lowest performing students than
among the highest, simply because the latter already are at a level where
incremental gains become less likely and more difficult to produce. Some
versions of a VAM factor in all sorts of presumably intervening conditions,
such as gender and poverty level, making assessment by VAM a multivariate
exercise rather than a simple one in which teaching quality is the cause and
change in scores the effect. But whether demographic factors such as those are
needed is debatable. Grading teachers more leniently for those kinds of
students carries the assumption that less growth should be expected from them,
in essence thrusting a soft bigotry of lower expectations on them.
Also problematic is that there
are some subjects that simply do not lend themselves to standardized
testing, such as art and music. The current practice is to work out a set of
goals for a teacher and then evaluate whether they have been reached. That
creates one set of teachers whose evaluations are half objective and half
subjective and another whose evaluations are almost entirely subjective, which
raises questions of fairness. The weight the objective part receives for those
in eligible disciplines also may be an issue, as research shows the VAM
contributes but that its moderate degree of validity and reliability means it
should be used only to a moderate degree in an overall evaluation.
Finally, research has revealed that a pretest/posttest regime measuring at
the beginning of the year and again at its end has greater validity than using
the previous year’s posttest as the current year’s pretest, as long absences
from the classroom, such as those experienced during summer, have idiosyncratic
effects on children’s retention.
Thus, to produce an evaluation
process maximizing validity and reliability, some changes should be made in
Louisiana’s system starting academic year 2016. For every grade level for every
testable subject (the American College Test requirement for seniors aside),
there should be just three standardized exams given a year – at the very
beginning of the school year as a pretest and so that teachers understand what
they have and where they need to go with it, at the beginning of the calendar
year to measure how far along the class as a whole has gotten and what needs to
be done for successful completion, and then at the end of the year. Further, in
making comparisons of scores from the beginning to the end of the academic
year, an algorithm needs to be worked out in which change from the pretest among
higher performers is weighted more heavily than change among lower performers.
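As a rough illustration of what such an algorithm might look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration and not anything found in Louisiana's rules: the names StudentScores, growth_weight, and mean_adjusted_growth, the 0-to-100 score scale, and the linear weighting with its boost parameter are all hypothetical. It simply scales each student's gain by a factor that rises with the pretest score, so that a gain by a high performer counts for more than the same gain by a low performer.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StudentScores:
    pretest: float   # score at the very beginning of the school year
    posttest: float  # score at the end of the school year


def growth_weight(pretest: float, max_score: float = 100.0, boost: float = 1.0) -> float:
    """Hypothetical weight that rises linearly with the pretest score.

    A student at 0 gets weight 1.0; a student at max_score gets 1.0 + boost.
    """
    return 1.0 + boost * (pretest / max_score)


def mean_adjusted_growth(
    students: Sequence[StudentScores],
    max_score: float = 100.0,
    boost: float = 1.0,
) -> float:
    """Average of each student's gain multiplied by that student's weight."""
    if not students:
        return 0.0
    adjusted = [
        growth_weight(s.pretest, max_score, boost) * (s.posttest - s.pretest)
        for s in students
    ]
    return sum(adjusted) / len(adjusted)
```

With boost set to 0.0 the function reduces to the plain average gain. How large the boost ought to be, and whether it should be linear at all, would have to be settled from actual score data.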
Some grades may have children too
young for standardized testing to be meaningful, so for teachers of those the
goals system should be employed instead, as well as for those disciplines where
standardized testing can’t be done. Where possible, the tests used should be the
same ones being used nationally so that additional tests beyond three a year are
minimized.
Also, instead of counting for
half of the evaluation score each, the VAM (or goals system) and
observation should be cut to one quarter each, in keeping both with the
relative power of the VAM to validly measure student growth and with the
subjectivity of observation. To take up the remaining half, two other
components, widely agreed upon by researchers as indicative of quality in
teaching, should be introduced. One quarter should be devoted to a subject
knowledge test taken each year by teachers, as done already in many states,
because teachers with a poor grasp of the subject material they teach can only
retard student progress, while knowledgeable teachers have a far greater
opportunity to help students excel. The other quarter should be based on a
best practices rubric assessing such items as class policies, syllabus
construction, assessment methods, and exam construction; in other words, not
the issues of class management that observation would cover.
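To make the arithmetic of the four quarters concrete, here is a short Python sketch. It assumes, purely for illustration, that each component has already been put on a common 0-to-100 scale; the names WEIGHTS and composite_score and the component labels are hypothetical and do not come from any state rule.

```python
# Each of the four proposed components counts for one quarter.
WEIGHTS = {
    "growth": 0.25,             # VAM, or the goals system where no test exists
    "observation": 0.25,        # classroom observation
    "subject_knowledge": 0.25,  # yearly subject knowledge test
    "best_practices": 0.25,     # best practices rubric
}


def composite_score(components: dict) -> float:
    """Combine the four component scores into one evaluation score.

    Assumes every component is on the same 0-to-100 scale.
    """
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


example = {
    "growth": 80.0,
    "observation": 60.0,
    "subject_knowledge": 90.0,
    "best_practices": 70.0,
}
print(composite_score(example))  # prints 75.0
```

Because the weights are equal, the composite here is just the average of the four scores; keeping the weights in one table makes it easy to see what a different split would do.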
The end result produces more
fairness in the evaluation of teachers across all disciplines relative to each
other, brings greater objectivity to measuring teacher capacity, and minimizes
potential validity and reliability concerns surrounding the measurement of
teacher performance. These alterations might take more than a few months of
transition and planning after passage of the law enabling them, so they could
be delayed until AY 2017 with another year’s suspension of the current law
thrown in, giving teachers another year of a trial run to adjust to a system
that, unlike the one prior to the reforms, asks for genuine accountability
through objective measures.
The system now in place, if not yet
implemented as far as consequences are concerned, is a vast improvement over
the previous one, under which more tenured teachers statewide died each year
than were fired for incompetence. But it can be made even better by these kinds
of improvements.