Evaluating Teachers, and Lessons for Evaluating Clinicians

My kids are in college and beyond now, but I read articles about public school teacher evaluations with interest.  The debate about how to rank and pay doctors seems a lot like the arguments about how to rank and pay teachers.

The Obama Administration’s Race to the Top has encouraged states to make a portion of teacher pay dependent upon “value added,” or how much more students improve on standardized tests than would have been expected.   This sounds familiar to physicians, whose patient care is reviewed through claims analysis and who are assigned to higher or lower cost tiers or granted higher or lower pay-for-performance payments.

The Los Angeles Times broke ground late this summer by publishing that city’s school district’s teacher evaluations, which were based on student test results.

Yesterday’s NY Times had an article pointing out that while teacher evaluations might be accurate on average, there is a very high likelihood that they are inaccurate for individual teachers.

In the educational world, a teacher in the top quartile had about a one in three chance of being in the top quartile the following year.   A single year of data would misclassify 35% of all teachers!  

In health care, we have much the same problem.  Researchers report in last week’s New England Journal of Medicine that they sent Massachusetts hospital claims and mortality data to four different software vendors and asked them to rank hospitals by risk-adjusted mortality.  The results?   Enormous scatter: hospitals that appeared to have a low risk-adjusted mortality rate with one vendor had a high risk-adjusted mortality rate with another. The Annals of Internal Medicine reported earlier this year that using different rules to attribute patients to physicians led to as many as 61% of individual physicians being misclassified.

What does this mean? 

It’s not easy to distill the quality of a teacher (or a physician) into a single number.  The validity of evaluation seems to increase with more data; this is a big problem on the physician side, where no single insurer has all claims data.  Risk adjustment of some sort is important, but risk adjustment is contentious and will sometimes just sow more confusion.  The validity of an evaluation is highest when you look at a large group rather than an individual physician (or teacher).   But parents want to know the evaluation of their kids’ teachers, just as patients want to know how their doctor scored.  There is a valid argument in both settings that physicians (and teachers) work as teams, so perhaps we should be satisfied with team ratings.
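
For readers who like to see the "more data" point in numbers, here is a toy simulation in Python. It is only a sketch under made-up assumptions, not a reanalysis of any study mentioned above: true quality is drawn from a standard normal distribution, each measurement adds random noise with a standard deviation of 2.0, and the function name and parameters are hypothetical. It counts how often a person lands in a different quartile than their true quality deserves.

```python
import random

def quartiles_by_score(scores):
    """Assign each person a quartile (0 = lowest, 3 = highest) from their score."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    result = [0] * n
    for rank, person in enumerate(order):
        result[person] = rank * 4 // n
    return result

def misclassification_rate(n_obs, noise_sd=2.0, n_people=2000, seed=0):
    """Share of people whose observed quartile differs from their true quartile.

    All parameters are illustrative assumptions, not estimates from real data.
    """
    rng = random.Random(seed)
    true_quality = [rng.gauss(0, 1) for _ in range(n_people)]
    # Averaging n_obs independent noisy measurements shrinks the noise
    # standard deviation by a factor of sqrt(n_obs).
    observed = [q + rng.gauss(0, noise_sd / n_obs ** 0.5) for q in true_quality]
    true_q = quartiles_by_score(true_quality)
    obs_q = quartiles_by_score(observed)
    wrong = sum(1 for t, o in zip(true_q, obs_q) if t != o)
    return wrong / n_people

if __name__ == "__main__":
    for n_obs in (1, 4, 25):
        rate = misclassification_rate(n_obs)
        print(f"{n_obs:>2} measurement(s): {rate:.0%} in the wrong quartile")
```

Under these assumptions the share of people placed in the wrong quartile falls as measurements are pooled, which is the same logic behind preferring several years of data, or ratings of whole teams, over a single year for a single person.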

Should we pay for performance?   A Finnish educator, basking in the glory of a recent OECD report extolling the success of the Finnish public schools, says that the real answer is to treat teachers with respect and honor, rather than grading them on a curve.  He advocates equity and cooperation, rather than choice and competition.   In education, as in health care, not everyone will be able to have the best teacher (or doctor).  So raising the bar for education and health care for all is exceptionally important.

We in health care should be carefully watching the debate on teacher evaluations and teacher pay.