Unlike the wine and coffee industries, the tea community lacks a widely recognized 100-point rating system that conveys a tea’s quality on a numerical scale. Sure, every day, tea industry professionals around the world rigorously evaluate teas, make judgments, and share highly informed opinions with each other and with consumers. However, there are no agreed-upon evaluation criteria, standards, or scoring methodologies that allow tea professionals and consumers to reliably associate sensory experiences with numerical scores and vice versa. As a result, there are no credible, objective 100-point wine-style reviews to help tea lovers make informed purchase decisions that will increase their appreciation, enjoyment, and consumption of tea. Tea Review seeks to change that.
Tea Review’s mission is to help consumers identify and purchase superior quality teas, thereby increasing demand and prices, and ultimately rewarding farmers and tea companies that invest their time, passion, and capital in producing high quality teas. Our unique contribution to the tea community is to conduct objective, expert sensory evaluations of teas and report those results in the form of 100-point reviews.
Anyone can sip tea and spout opinions. It would be more expeditious for Tea Review to simply start cupping teas, assigning scores, and publishing 100-point reviews. However, that approach would lack the credibility, transparency, and professional consensus needed to be helpful to the tea community and consumers. We’re taking a more rigorous approach. We have begun the process of developing a tea evaluation and scoring methodology that is thorough, rigorous, disciplined, and informed by some of the most highly respected tea industry professionals.
Sound ambitious? It is. Countless tea producers, processors, exporters, importers, wholesalers, retailers, and consumers from all over the world, speaking different languages, have been evaluating teas in various capacities for hundreds of years. It shouldn’t be terribly surprising that such a subjective, constantly changing, globe-spanning product can’t be easily summarized by a single score on a 100-point scale.
But, if it can be done for wine and specialty coffee, it can be done for tea. And, that’s where we begin.
We convened the Tea Review Tasting Panel — a diverse group of highly respected tea cuppers and industry professionals — to help us establish consensus on quality standards and scoring methodologies. Our Tasting Panel will participate in a series of group cuppings, with the goal of developing a substantial degree of consensus on assigning a numerical score to a traditionally subjective sensory experience. The first cupping is underway and will be completed by mid-October. Each member of the Tasting Panel was sent 16 tea samples — four white, four green, four oolong, and four black — to cup and score on a blind basis. We used the Tea Review Cupping Form to capture our sensory observations and score the teas.
Our process includes three related efforts that are occurring in parallel:
- Establish consensus on quality standards and scoring methodologies
- Calibrate cuppers’ sensory evaluations and scoring to standards
- Align sensory evaluations and scores among cuppers
Establish Consensus on Quality Standards and Scoring Methods
Our first step is to build a reasonable degree of consensus around the criteria for evaluation, that is, the quality standards and scoring methodologies that will be applied. As a starting point, Tea Review has created a cupping form and scoring methodology that defines the categories for evaluation that are often used by tea professionals. The categories or criteria for evaluation are aroma, flavor, body, astringency/structure, and aftertaste. In each category, we use a ten-point scale to assess quality, intensity, and compatibility for the origin, type, or style of tea.
This assessment includes both objective and at times measurable “what is it?” assessments, such as degree of acidity and discernible flavor and aroma characteristics like lavender or eucalyptus, as well as subjective, hedonistic assessments that consider “do I like it?” Cuppers combine their assessment of “what is it?” with “do I like it?” to reach an individual conclusion on “is it good?” and, if so, how good on a 100-point scale.
Ultimately, the group goal is to establish a degree of consensus on associating these sensory evaluations with category scores and resulting summary scores. Cuppers should feel comfortable that the score they ascribe to a tea accurately represents a numerical summary of the tea they cupped.
But isn’t all of this just a forced attempt to make a judgment objective and numerical when it is inherently subjective? No. In The 100-point Coffee Rating Paradox, an insightful 2015 article, Kenneth Davids, co-founder and editor-in-chief of Coffee Review, makes a compelling argument that a dominant global community of interpretation and shared assumptions exists around coffee (and, I would argue, around tea) that allows a community of expert tasters to largely agree on the criteria for excellence. In the case of coffee, Kenneth offers the following examples:
- Acidity is fundamentally good, so long as it is not harsh, overbearing or excessively astringent.
- Smoothly viscous or lightly syrupy/silky mouthfeel is better than thin, watery, or silty mouthfeel.
- Aromatic and flavor notes that are complex and intense are better than those that are simple or faded.
- Given that coffee is an inherently bitter beverage, natural sweetness is good, whereas too much bitterness is bad.
- Aromas and flavors that develop naturally from the coffee bean itself, like flowers, fruit, citrus, honey, molasses and chocolate are better than flavors that come from mistakes made during fruit removal and drying, like fermented fruit, mustiness or moldiness, or rotten or medicinal flavors.
- A long, sweet, flavor-saturated aftertaste is better than a short, fast-fading, astringent or aromatically empty aftertaste.
He goes on to note that this set of assumptions and the interpretive community around them existed long before anyone proposed 100-point rating systems. In other words, these are criteria or standards for quality against which cuppers can make a rather objective assessment. This methodology allows us to establish what sensory experiences translate to on a 10-point scale for the key evaluation categories of aroma, flavor, body, astringency/structure, and aftertaste, which ultimately translate into a summary numerical score on a 100-point scale.
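To make the arithmetic concrete, here is a minimal sketch of how five category scores might roll up into a summary score. The article does not specify the combining rule, so everything here is an assumption for illustration: the 1-to-10 bounds on each category, the fixed base of 50 points added to the category total, and the names CATEGORIES, BASE_POINTS, and summary_score are all hypothetical and should not be read as Tea Review's actual method.

```python
# Hypothetical roll-up of five 10-point category scores into a 100-point score.
# ASSUMPTION: summary = fixed base of 50 + sum of the five category scores.
# The source does not state the real formula.

CATEGORIES = ("aroma", "flavor", "body", "astringency_structure", "aftertaste")
BASE_POINTS = 50  # assumed base, chosen so that five 10s give 100


def summary_score(category_scores):
    """Return an illustrative 100-point score from five category scores."""
    missing = [name for name in CATEGORIES if name not in category_scores]
    if missing:
        raise ValueError(f"missing category scores: {missing}")
    for name in CATEGORIES:
        # ASSUMPTION: each category is scored from 1 to 10 inclusive.
        if not 1 <= category_scores[name] <= 10:
            raise ValueError(f"{name} must be between 1 and 10")
    return BASE_POINTS + sum(category_scores[name] for name in CATEGORIES)


# Made-up scores for a single tea, for illustration only.
example = {
    "aroma": 8,
    "flavor": 9,
    "body": 8,
    "astringency_structure": 7,
    "aftertaste": 8,
}
print(summary_score(example))
```

Under these assumptions, a tea scoring 8, 9, 8, 7, and 8 across the five categories would land at 90 points. A different base or a weighted sum would shift that number, which is exactly why the combining rule needs panel consensus.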
Calibrate Cuppers’ Sensory Evaluations and Scoring to Standards
Of course, cuppers don’t have identical sensory experiences when they cup identical teas. There is a good degree of subjectivity in “do I like it” and even to some degree “what is it” considerations. But tea cuppers are not usually called upon to agree with each other on something as measurable as a single score on a 100-point scale. As it currently stands, it’s just two opinions, which often differ. No problem. Just agree to disagree and move on.
However, for us, comparing scores is a way to synthesize the assessments of a group of individuals such as the members of the Tea Review Tasting Panel. We test the consistency and reliability of those responses by comparing category scores, tasting notes, overall scores, averages, ranges, highs and lows, medians, standard deviations, etc. So, at the same time we begin to develop consensus around scoring, we can offer feedback that helps cuppers calibrate their evaluations and scores so that they are more in line with the group consensus. Through practice, self-reflection, and re-evaluation, cuppers can begin to recognize evaluation criteria and calibrate their scores accordingly. For example, cuppers may be able to refine their ability to identify astringency in a tea that has high acidity, or to discern flavor or aroma characteristics such as jasmine or eucalyptus when these characteristics are thought (or known) to be present. Cuppers need to be able to assign a meaningful numerical value to standards on an agreed-upon sensory scale. We move from a set of disparate opinions, to more harmonized opinions, to informed, calibrated assessments that are consciously reported relative to known standards.
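The comparisons described above can be sketched in a few lines of Python. This is only an illustration of the kind of summary statistics and per-cupper feedback involved, not the panel's actual tooling: the cupper names and scores are made up, and the helper names summarize_scores and deviations_from_panel are hypothetical.

```python
from statistics import mean, median, pstdev


def summarize_scores(scores):
    """Return basic consistency statistics for one tea's panel scores."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "low": min(scores),
        "high": max(scores),
        "range": max(scores) - min(scores),
        # Population standard deviation: the panel is treated as the whole group.
        "stdev": pstdev(scores),
    }


def deviations_from_panel(scores_by_cupper):
    """Map each cupper to the gap between their score and the panel mean."""
    panel_mean = mean(list(scores_by_cupper.values()))
    return {name: score - panel_mean for name, score in scores_by_cupper.items()}


# Hypothetical overall scores for one tea from five hypothetical cuppers.
panel = {
    "cupper_a": 88,
    "cupper_b": 90,
    "cupper_c": 91,
    "cupper_d": 89,
    "cupper_e": 92,
}

summary = summarize_scores(list(panel.values()))
feedback = deviations_from_panel(panel)
print(summary)
print(feedback)
```

A cupper whose deviation is consistently large in one direction across many teas is a candidate for recalibration. The same statistics could be computed per category (aroma, flavor, and so on) to see where disagreement is concentrated.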
Align Sensory Evaluations and Scores Among Cuppers
Arguably, if all cuppers in a group are well calibrated to the scoring criteria and standards, they should be well aligned with each other. While that may be true with a group that cups together frequently and has great familiarity and experience with a scoring methodology and a particular set of teas, it’s highly unlikely to occur naturally at the beginning of a fundamentally new evaluation process. It’s imperative, especially early in our process, that we try to align the sensory evaluations and scores among our Tasting Panel members.
We expect cuppers to have somewhat differing sensory assessments and scores for the same tea. Modestly different scores are perfectly acceptable and may even offer perspective on a reasonable range rather than a single precise score. However, dramatic differences in cuppers’ scores for the same tea suggest that cuppers are not on the same page. For example, you don’t want one cupper to identify high acidity and another low acidity. You don’t want one cupper to identify a predominant blueberry aroma and another a predominant vanilla aroma. In contrast, it would be perfectly fine if two cuppers both identify an attractive vanilla aroma but one scores it 8/10 points and the other 9/10 points.
Reasonably aligned scoring increases confidence that a consumer will have a similar positive sensory experience when they purchase and enjoy a highly rated tea. We would be more confident in publishing a 90-point rating if, for example, ten cuppers each scored the same tea between 88 and 92 points to arrive at a 90-point score, as opposed to ten cuppers’ scores ranging from 75 to 98 that somehow average 90 points.
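The contrast between those two panels is easy to see numerically. The sketch below uses two made-up sets of ten scores that both average 90, one spanning 88 to 92 and one spanning 75 to 98. The helper names and the 4-point agreement threshold are assumptions chosen for illustration, not a Tea Review rule.

```python
from statistics import mean


def panel_spread(scores):
    """Return (average, high-minus-low range) for one tea's scores."""
    return mean(scores), max(scores) - min(scores)


def panel_agrees(scores, max_range=4):
    """Check whether scores fall within max_range points of each other.

    ASSUMPTION: max_range=4 is a hypothetical threshold for illustration.
    """
    return max(scores) - min(scores) <= max_range


# Hypothetical scores: both panels average 90 points.
tight_panel = [88, 89, 89, 90, 90, 90, 90, 91, 91, 92]
wide_panel = [75, 82, 86, 90, 91, 92, 94, 95, 97, 98]

for label, scores in (("tight", tight_panel), ("wide", wide_panel)):
    average, spread = panel_spread(scores)
    print(label, average, spread, panel_agrees(scores))
```

Both panels report the same average, but only the first shows a spread narrow enough to suggest the cuppers are describing the same experience. Reporting the range alongside the score would let readers see that difference for themselves.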
Our first group cupping will be complete by mid-October. By November, we will be able to share the results with readers. Be sure to follow us on Twitter or sign up for our free email newsletter to track our progress.