Robert Harrell on Standards Based Assessment

Over the past ten years Robert has posted 1,076 times. Those of us who have been on the PLC for those ten years are 1,076 times richer for Robert’s sharing. Here is the latest from Robert, on grading theory:

Standards-Based Grading and “Power Grading” are something that Robert Marzano has popularized. Scott Benedict has done work with both of these concepts and has information on his website Teach for June. Look under the Articles tab.

Moving to Standards-Based Assessment requires some significant changes in thinking. Most students and teachers – and just about all grading programs – begin with the idea of either accumulating points for a grade or getting a percentage correct. The “standard” percentages step down in increments of 10 until you get to F, which then occupies the bottom 59% of the scale.

SBA looks at the matter differently and assigns numeric values to performance relative to a standard. Ben’s rubric uses a four-point scale. I use a five-point scale. When I explain it to my students, I compare it to the state tests with which they are familiar. Here’s my scale and a short explanation:
5 = Exceeds the standard (in at least some aspects while meeting the standard in all aspects)
4 = Meets the standard (in all aspects)
3 = Approaches the standard (also called “Basic”, i.e. may partially meet the standard but has areas that do not meet the standard or comes close in all areas)
2 = Falls below the standard (does not meet or approach the standard)
1 = Falls far below the standard (e.g. gets every question wrong; turns in a paper with nothing but a name on it; physical presence in class will get a student a 1)
0 = Presents zero material for evaluation (e.g. does not come to class; does not even hand in a paper or quiz)

The teacher’s problem is how to translate this to percentages. If I take the “standard” percentages, it utterly skews the results against students. On a performance, I may evaluate the student at “basic”, but the grading scale rates 3 out of 5 as 60%, a D minus. That doesn’t reflect the student’s ability at all. Even the 4 out of 5 is 80%, a B minus. B minus indicates lower proficiency than a 4 does on the scale.

The correlation to percentages is even more out of whack with the four-point scale.
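To make the mismatch concrete, here is a minimal Python sketch of the straight-division conversion described above. The letter cutoffs (90/80/70/60, with a minus for the bottom three points of each band and a plus for the top three) are an assumption about a typical “standard” scale, not something the post specifies, and the function names are hypothetical.

```python
def naive_percent(score, scale_max):
    """Convert a rubric score to a percentage by straight division."""
    return 100 * score / scale_max


def standard_letter(percent):
    """Letter grade on an ASSUMED traditional 90/80/70/60 scale with +/-."""
    if percent < 60:
        return "F"
    for floor, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percent >= floor:
            offset = percent - floor
            if offset < 3:
                return letter + "-"
            if offset >= 7 and letter != "A":
                return letter + "+"
            return letter


# Show what each rubric score becomes on a five-point and a four-point scale.
for scale_max in (5, 4):
    for score in range(scale_max, -1, -1):
        pct = naive_percent(score, scale_max)
        print(f"{score}/{scale_max} -> {pct:.0f}% -> {standard_letter(pct)}")
```

Under these assumed cutoffs, a 3 out of 5 comes out as a D minus and a 4 out of 5 as a B minus, which is the skew described above; on a four-point scale a 2 out of 4 is already an F.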

The real problem is with the percentage scale commonly used in schools. If you study the history of grades and grading in the United States, this becomes quite clear. Grades were introduced at Yale University about 1795, and a four-point scale was used. (This is possibly the origin of the 4.0 scale in the US.) Until the middle of the twentieth century, standard scales ranged between three and nine divisions with the most common being three: superior, adequate, poor (or some other set of names that reflect this division). Until the second half of the nineteenth century, even these scales were not in widespread use. “Narrative grades” were given instead, in which the instructor described the students’ abilities and proficiencies.

The desire to have a system that was “transferable” between institutions led to the introduction of letter grades, and the system of A-B-C-D-F was adopted. E existed at one time, but it was dropped because of fear that it would be interpreted as “Excellent”. During the first half of the nineteenth century, the 100-point scale was popularized. With the widespread use of computers and computer programs in the 1950s this scale became dominant. It was introduced and disseminated with no pedagogical justification whatsoever. It was placed into computers because it is an easy scale for computer programmers to manipulate and program. Schools and universities adopted the 100-point scale because it was readily available, not because it was pedagogically sound. In fact, it is highly inaccurate.

Unless every quiz and test consists of 100 questions, the margin for error in accurate grading increases dramatically. On a ten-question quiz, the margin for error in reflecting what students actually know is as great as two letter grades. The scale also introduces a false perception of objectivity and precision. Do we really believe that a grader can distinguish between, say, a work that deserves an 89% and a work that deserves a 90%? Not unless you have a test of 100 discrete items, each worth one point. How many teachers do that on a regular basis? Not many. Instead, teachers generally work in increments of five or ten points, thus introducing both imprecision and subjectivity. Those in themselves aren’t necessarily bad, but when they are disguised as precision and objectivity, they are dishonest and ethically questionable. A three- or five-point scale is actually more accurate in categorizing students than a 100-point scale. It would be much more accurate for teachers simply to enter A, B, C, D or F in their grade books rather than a percentage or number of points. BTW, the A, B, C, D, F scale was introduced by Mt Holyoke College in 1897; in 1898 they briefly experimented with A, B, C, D, E, F.

The other problem with “standard” grading is the 60% for a D. Can we truly distinguish 60 degrees of failure? (I’m counting zero as a degree of failure.) Is failure truly so much more important than success that we weight the grading system toward failure? In what other endeavor do we do this? Baseball players who bat 300 (out of 1000) are stars. What percentage of shots does a star basketball or soccer player make? Except for grading in schools, 50% is average. Perhaps the skewing of our current system arises from the fact that in early applications of the 100-point scale, no one received lower than 50%, so the bottom of the scale was omitted. Thus, modern adoption of the “innovation” of giving an F 50% is really simply a return to earlier practice.
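The weighting toward failure can be counted directly. The sketch below assumes the common cutoffs of A 90-100, B 80-89, C 70-79, D 60-69 and F 0-59 (an assumption, since schools vary) and tallies how many whole-number percentages land in each band.

```python
# ASSUMED "standard" cutoffs; individual schools may differ.
BANDS = {"A": (90, 100), "B": (80, 89), "C": (70, 79), "D": (60, 69), "F": (0, 59)}


def band_widths(bands):
    """Count the whole-number percentages (inclusive) that fall in each band."""
    return {letter: high - low + 1 for letter, (low, high) in bands.items()}


widths = band_widths(BANDS)
total = sum(widths.values())  # 101 whole-number scores, from 0 through 100

for letter, width in widths.items():
    print(f"{letter}: {width} scores ({width / total:.0%} of the scale)")
```

With these cutoffs, F covers 60 of the 101 possible whole-number scores, while each passing letter gets only 10 or 11.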

There is a great deal more that can be said about grades and grading, but I hope this helps.

In my own practice, I simply change the scale in the grade book. (Our program allows me to do this.)
100% = A+
99.99 – 80.01 = A
80.00 – 60.01 = B
60.00 – 40.01 = C
40.00 – 20.01 = D
20.00 – 0.00 = F
So far no one has complained.
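For anyone whose grade book does not allow a custom scale, the bands above are simple to reproduce by hand or in a spreadsheet. Here is a minimal Python sketch; the function name is hypothetical, and it assumes that a value falling between two listed endpoints (say 80.005) belongs to the higher band.

```python
def adjusted_letter(percent):
    """Map a percentage to a letter using the equal-width bands listed above."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent == 100:
        return "A+"
    if percent > 80:   # 80.01 up to 99.99
        return "A"
    if percent > 60:   # 60.01 up to 80.00
        return "B"
    if percent > 40:   # 40.01 up to 60.00
        return "C"
    if percent > 20:   # 20.01 up to 40.00
        return "D"
    return "F"         # 0 up to 20.00


for pct in (100, 90, 80, 60, 40, 20, 0):
    print(f"{pct}% -> {adjusted_letter(pct)}")
```

Each letter from A through F now covers roughly the same 20-point span, so failure no longer takes up more than half the scale.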





15 thoughts on “Robert Harrell on Standards Based Assessment”

  1. Along this same line I believe, Robert, is something someone here suggested ages ago and I have always done since I cut my French teacher teeth on this blog.

    When grading quick quizzes (5 questions), always start with 50% and give 10 points for each correct answer. I give 5 points for partial answers if the question calls for something other than “oui” or “non.”
    1 right = 60%
    2 right = 70%
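The arithmetic in this comment can be sketched in a few lines of Python. The function and parameter names are hypothetical, and the check that correct plus partial answers cannot exceed the number of questions is an added assumption.

```python
def quick_quiz_percent(correct, partial=0, questions=5):
    """Start at 50%, add 10 points per correct answer and 5 per partial answer."""
    if correct < 0 or partial < 0 or correct + partial > questions:
        raise ValueError("correct + partial must be between 0 and the number of questions")
    return 50 + 10 * correct + 5 * partial


print(quick_quiz_percent(1))             # 60
print(quick_quiz_percent(2))             # 70
print(quick_quiz_percent(3, partial=1))  # 85
```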

  2. Thank you Ruth. I remember that. Give the kids 50% for just showing up. They are so tired of being beaten over the head w numbers. We must learn to grade in a way that gives them hope.

  3. We use a 5-point scale in my dept, but the truth is that is only because people are so attached to more traditional grades. We really have a 3-point scale because students can only get a 1, a 3 or a 5. Basically we look at what they do and ask ourselves: can the student complete the task on their own (5), do they need help to complete the task (3), or can they not do it yet (1)? Our 3 = C. Basically we know that students aren’t really an 89% in writing or whatever because that’s not how language acquisition (or production) works. I wanted a 1-2-3 scale but no one else really was on board, so we compromised. I now have kids illustrate, summarize or give 3 details and I grade it on that scale.

  4. Well, well, well. Hello all. I haven’t been able to read much here or on FB and this topic has come back up thanks to Robert’s genius. I’m still struggling with grading.
    Why do we even have to give a grade? To keep our jobs. Pass/Fail would be much easier for me and all, and in a CI/NT classroom it would be much more fair. But it is not an option.
    I work in a very, very grade-oriented school system. Kids are stressed out about that number. I would love to find a way to take away that stress and just enjoy the joint creative process without that sword hanging over all of us. Grades are a domination tool. They divide the haves and have-nots, the “abled” and the “disabled”; they can make students feel very incompetent when in reality they just need more time to let the language sink in.
    So I’ve been thinking of trying something to get this problem out of the way and help reduce all our stress. Maybe my idea has been already attempted by others and not proved to be feasible. In any case, I would appreciate any feedback before I “launch” into the pool and there’s no water (as we say in Spanish).

    Thanks to Tina’s marvelous cycles of instruction and “Spa week”, I have assessed kids on reading, listening and writing and I call it summative assessment week. My principal (who knows Spanish and walks into my room all-the-time) is impressed with where the kids are at this time of year. There are some students–those that need more time but will get there–that feel very anxious from not having done “that well” on the 4-1 scale.
    I truly don’t care if they are at a 1-2-3-4 on some rubric. I like spa week for the spa and because it makes me look good with admin. because I am “assessing” for the standards.
    So, what if I offer grading “a la carte”. Students/parents could have two grading choices to pick from:
    a- An average of the assessments of the four skills (4-1 scale). I have a conversion chart for this that I got somewhere, sometime from someone many years ago. For speaking they could choose a speaking assessment with me (very few would take this) or a grade for jgr.
    b- Your grade is based on the effort you make in class: jgr or whatever variation of ISR.

    I have been under attack since last year from a parent that considers jgr a behavior grade and I can’t keep waging that battle all alone any more. Ironically the spoiled apple that is her son hasn’t done well on any of the skills assessments. So if I offer a choice, it is not much extra work for me–it comes straight out of spa week. The complaining parent wouldn’t have much to complain about anymore, since they can take the “skills based” option, and about 99% of students would likely choose jgr. If they don’t, and they don’t do well on the skills, then it’s not my problem anymore. I could always “catch” the spoiled apples with the writing and speaking part of the “skills” assessment.
    What are the holes that I can’t see in this ‘menu grading’?
    Thanks for any responses.

  5. Laura I don’t see any weak spots. You gave the kid a choice. It seems fine.

    Are you still using jGR or the new one?

    I always appreciate your comments here. So many of us know why we are doing this and it ain’t about any grades….

    One more thing – we notice how a parent who has gauged their child’s success in terms of ability to memorize has affected your own professional decision making. Your grading approach, which we who have been here for years know, is compassionate, accurate and aligned with the standards. So you’ve been bullied by this parent into thinking there are “holes” in your approach. There are none.

  6. Thanks Ben for your positive support. This is such an important place for me.

    I thought about using a version of ISR that Grant Boulanger has on his blog. It has categories: prepare, interact, engage and looks very “standardy”.
    Which is the new one? Would it be an improvement?
    Thanks much.

  7. Hi Laura, is this a Spanish 1 class? I only ask cuz if so, the output skills (writing & speaking) shouldn’t be weighed evenly with the input skills (comprehending via listening & reading). Not at any HS level, prolly, for that matter!
    I would love to see the skill conversion chart you refer to in your comment above.

    I guess I’d just be wary that by allowing the skills assessment to be a choice, you could be sending a message that you don’t want to, and it could get misinterpreted and outta hand quickly…
    So I’d wanna make sure it was airtight to protect what you are doing, and from spoiled apples.

  8. Alisa I don’t think that the mom cares much about anything but making sure her robotic memorizing kid gets an A. That is her real purpose in complaining. I do agree with your point, however. What is best to protect Laura? It’s a tough one.

  9. I am confused: “I’d just be wary that by allowing the skills assessment to be a choice, you could be sending a message that you don’t want to, and it could get misinterpreted and outta hand quickly…”
    I don’t understand this statement. Maybe it’s my limited English skills. I REALLY would like to know what could get out of hand before I put this out into the open.

    Are you saying that it could seem that I don’t want to grade students by the skills, and that this could get me into trouble?
    It is true that I don’t want to give a grade based on skills assessment. I’m hoping all will just choose jgr (or the new one???). How would this get out of hand?

    I know about not “assessing” evenly the output skills. In fact, I’ve never cared much about writing itself, only as a time eater to give myself a break. I’ve never assessed for speaking either. But if a parent wants their child to be graded “objectively” why wouldn’t I include the output skills in the package? What right would the parent have to question the weight the output skills have? The output skills are part of the “standards”. We do “practice” writing, and I am giving an option in lieu of speaking. Where is the weak link here? I think that if they want a grade to be about “objectivity”, about what a child can really do with the language, then the assessment should include all 4 skills. I guess they’d like to have their cake and eat it too.
    My non-CI French colleague grades the output skills and has HW at 40!!!

    I hope I don’t come across as defiant or something. I am candidly saying what I think here so the group can see what I can’t and I won’t get knocked down again. Honestly the only grading tool that I think is helpful is some version of jgr. I think it guides children in what they need to do in order for acquisition to occur. I already know all I need to know when I look into their eyes or ask questions.

    Thanks so much for answering Ben and Alisa.

  10. Oh, the conversion chart I have.
    I will send it to Ben via e-mail. It is a very, very old photocopy that is not digital. I took a photo and have it as a jpeg and don’t have the time to re-type, but you’ll be able to use it if it suits you.

  11. This week I gave quarter tests. I designed them on each class’s OWIs that had been created. Their OWIs are in the hallway along with the student-generated vocabulary (the vocabulary they chose as the most important from their story). There are also the stories for each character and in some cases a story that a student wrote and illustrated on their own because they wanted to.

    I gave them a paper with a small picture of the OWI, a place to take notes and a place to write a story. They had 3 minutes per character to read and take notes. Everything is in the TL in the hallway. There is no translation. They had 3 OWIs. After 9 minutes they had 1 minute to steal ideas from any other OWI in the hallway.

    Then we went into the classroom and spent 5 minutes discussing the 3 OWIs in the TL. Then they had about 10 minutes to write about 1 OWI. With three OWIs that became 30 minutes total.

    It worked great. I got some who just described the OWI while others rewrote the OWI story in their own words in the TL and others who were very creative and made new stories and added in other OWIs from other classes.

    This is quarter 1 of Spanish 1. They did awesomely! My biggest problem was assigning a grade. It took me all day because as I read each test I could see how much each student had grown. I was so proud of all of them. Some wrote at a Novice-Mid level while most were at a Novice-High and a few were at Intermediate-Low (These had had me for a semester last year and are really taking off).

    It is getting harder and harder to assign grades when I SEE these brilliant students and the growth they are making.

    1. Cameron said:

      …it is getting harder and harder to assign grades when I SEE these brilliant students and the growth they are making….

      The key word is “SEE” there. We are able to observe gains but unable to truly quantify them. One reason is that in CI students there is such a rich subsoil of vocabulary collected during all the input that it literally can’t be measured, no more than we can count sown seeds in the ground. (And of course we don’t even know which seeds will sprout when and we’re not supposed to….).

      The way you gave them the visual prompts of the one word images represents a brilliant way to assess, bc it sparks the deeper mind.

  12. Does anyone else “grade” by completion of task? This is how I end-around our system that is both “competency based” AND still has % numbers.

    Because of SLA and the individual variation expected, I cannot “grade” ethically by linking the grade to the proficiency level. I have seen some rubrics with things like A=Novice High B=Novice Mid, etc. but I don’t see how a certain proficiency level can be targeted over a specific time period.

    Assessment, on the other hand, is feedback to inform my teaching and for the students to note their own growth.

    For assessment, I look at what the student can do. Student looks at what they can do. Student notes on an ACTFL chart where they are, by checking off various skills and writing the date. “Evidence” is in their folder.

    Then, for grade, it’s 50% interpersonal skills (specific skills needed to negotiate meaning in Spanish–not “behavior”) and 50% interpretive assessments (reading and listening). The kicker is that unless they do not complete the assessment, they get 100% on interpretive. Rationale for this is that they interpreted the language they were able to. I can’t penalize them for individual variation.

    So the grades tend to be pretty high, unless they do not come to class, do not turn in their assessments, and/or they do not demonstrate competency in interpersonal skills. I find that even though the students are not accustomed to the honesty required here (conditioned to copy and paste, copy off their friend, etc), they adapt quickly. I remind them each time we do a listening or reading assessment “If you understood two sentences, write two sentences. If you understood two paragraphs, write 2 paragraphs. You both get the same credit because you completed the task at your current level.”

    Would something like this work for you Laura? Or is it too loosey-goosey? My school is eventually going to get rid of the percentages. That will make everything easier (I assume?). I am assessing totally by competency and not by “grade.” When I put “grades” in I put numbers in to stand for the competency levels. I did not use a conversion chart. I put 4=100; 3=85; 2=64; 1=50.
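One possible reading of this scheme, sketched in Python. The level-to-number mapping and the 50/50 split come from the comment itself; everything else is an assumption made for illustration: that the interpersonal half is scored with the 4-1 competency levels, that an interpretive assessment that was not completed counts as 0, and the function and variable names.

```python
# Numbers entered in the grade book to stand for each competency level (from the comment).
LEVEL_TO_PERCENT = {4: 100, 3: 85, 2: 64, 1: 50}


def quarter_grade(interpersonal_level, interpretive_completed):
    """Combine the two halves of the grade, weighted 50/50.

    ASSUMPTIONS: the interpersonal half uses the 4-1 competency levels, and an
    interpretive assessment that was not completed counts as 0.
    """
    interpersonal = LEVEL_TO_PERCENT[interpersonal_level]
    interpretive = 100 if interpretive_completed else 0
    return 0.5 * interpersonal + 0.5 * interpretive


print(quarter_grade(4, True))   # 100.0
print(quarter_grade(3, True))   # 92.5
print(quarter_grade(2, False))  # 32.0
```

Under this reading, a student who completes every interpretive assessment cannot fall below 75%, which fits the observation that the grades tend to be pretty high.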

    This coming quarter I hope to “graduate” to the Cameron style. I just did not get many OWIs going due to groups not being able to interact in that way. Disappointing and also that is just what happened. I hope that we can get it together! I may try showing a series of characters / stories from you all, to see if that will be enticing. We’ll see!
