Cwyn's Death By Tea: Tastes Like Bacon?

Friday, September 19, 2014

Tastes Like Bacon?


Midwesterners call a spade a spade. Very often I think of the speech patterns of rural Wisconsin as just a hair shy of a language disability, because of the value placed on spare speech, of saying only what needs to be said in the fewest number of words possible. Too much about too little, such that when the need arises for people to really say something, they can't find the words. Whole worlds exist inside the mind that emerge only with difficulty.

Some folks here have diagnosed the spare speech trait as a throwback to Scandinavian ethnicity burrowing into the current culture, or more specifically, cold weather ethnicity, of what happens to people who are stuck indoors for nine months of the year, who can't simply go outdoors to escape the annoyances of seemingly endless life in a room with other people. Perhaps in order to survive harsh winters with families intact, people learn to keep their mouths shut, to say as little as possible, to speak only when spoken to, or only when required to speak, so as not to annoy a housemate beyond the level of endurance already required to get through the darkness and cold. Oh, and let's not forget the "be humble" part, that too goes with it.

Along with such a culture apparently goes a higher suicide rate and lower murder rate. In contrast to warmer climes perhaps, where better weather allows the luxury of heading outside to get away from words, from heaven or hellfire poetry the size of Dante. Instead of doing your own self in, you can just kill the other guy. Personally, I'd rather go kill the other guy than suck it up, but in Wisconsin overall, a "put up and shut up" mentality reigns supreme. If you need to say anything, try and be as passive and vague as possible, and hope your annoying roommate gets the hint. 

Apparently the same can happen with tea blogs. A poster on a tea forum complained that tea blogs eventually progress to the point of vagueness, of saying nothing at all. I highly suspect that if such a tendency exists in tea writing, the reason is that when we speak of qualities of tea, or Aesthetics, if you will, many descriptors of tea are what we can call Qualitative. In scientific research, we use Qualitative Methods to study and understand phenomena that are complex, multidimensional, highly subject to personal opinion, "Relative to the Individual," political, or ethnographic. Ethnographic means relative to the culture.

I'm asking an Ethnographic question if I inquire whether Wisconsin spare speech is really a Scandinavian or cold weather cultural trait. In tea, I'm similarly asking a Qualitative question if I inquire whether or not a puerh is "complex" or has "qi." Such questions imply a whole host of multidimensional variables, many of which might be hard to pin down, and perhaps even vary by the individual doing the tasting. All of a sudden tea drinkers who have a lot to say might experience difficulties in finding the words. Exact words. Words that, as another tea poster put it, might actually help a new puerh drinker as opposed to leaving the newbie "confused."

Separating out Quantitative (Objective) variables from Qualitative (Relativist) variables is not that hard. Quantitative variables are straightforward, and can be measured with numbers. The real question is, why isn't anyone in aesthetics interested in doing any quantitative work with variables we actually can measure? All it takes is a pencil, or computer, and a little bit of high school algebra. But first we have to understand what kinds of variables can be quantified and studied with numbers, and which variables require qualitative methods.

I started out talking about Ethnography, and this is a Qualitative aspect of tea we must separate out if we want to look at more objective variables. For example, if I say, "this tea tastes like bacon," what does this mean? Probably a third of the folks sitting around the tea table are going to get up and leave the discussion because they have not tasted bacon. They have not tasted pork in their lifetime, and have no intention of doing so. Saying that a tea tastes like bacon is culturally-based, Relative to the Culture. If I'm the only person at the table who has tasted bacon, then I have said something highly Relative to only myself as an individual.

The Post-Objectivist researcher looking to study the larger Population (p) of tea drinkers worldwide needs to find quantifiable variables. An easy way of doing this is to identify binomial variables. These are either/or traits. Bacon could be an example of a binomial variable. Either a tea tastes like bacon or it doesn't. That's assuming we can agree on what a bacon taste is, and we don't have a wise guy in the room who says something like "I get the pork reference, but to me it's more of a prosciutto." Pork or bacon traits could be binomial variables, but unfortunately they don't have the wider cultural meaning for a worldwide Population (p) we are looking for. Perhaps as a Relativist, Qualitative researcher, I might otherwise write an interesting "Ethnography of Bacon Characteristics of Puerh Tea in Certain Wisconsin-Dwelling Individuals (Sample=P) with Internet Access." An amusing read maybe, but not helpful to non-bacon eaters and not for generalizations about aesthetics. Thus we must say "tastes like bacon" is not an aesthetic of tea for the general population (Population = p) of tea drinkers worldwide, and not worth bothering to study as a binomial variable by a Post-Objectivist quantitative researcher.

So, let us consider a more viable example, using Jane the Tea Vendor.

        Jane the Tea Vendor runs an online tea business selling a variety of teas and teaware. She also offers puerh teas, but these aren't selling as well as she had hoped. Jane sees some of her customers on tea forums, where she goes to promote her tea (smart lady), but these customers have primarily been buying her teaware, bamboo charcoal, tea pets and non-puerh teas. She knows many of her customers drink puerh tea, but why aren't they buying hers? Jane decides to investigate her customers' taste preferences in tea, so that she can invest her money in teas that will sell, and stop wasting money buying teas that don't sell. How can Jane confidently survey her customer preferences?

One of Jane's teas that is not selling is a Menghai tuo, rather like this one.
2005 Menghai tuo from Yunnan Sourcing
Jane's Menghai tuo isn't exactly the same, but hers is very close. Brewing up 8 grams in her gaiwan, she notices the tea has a dark and significant smoky quality, and visible char in the strainer. She doesn't feel this processing trait impairs the tea in any way, and possibly a humid storage method, or enough aging into a tea will work this flavor out to some extent. But perhaps her customers don't agree, so she decides to check customer preferences for Smoky Tea. Fire and smoke are scents which are not culturally based, but a common human experience.
2005 Menghai
By selling tea worldwide, Jane knows that Smoky can have a positive or negative connotation with regard to cigarette smoking in some parts of the world. If customers new to puerh tea are asked directly about Smoky Tea, they might automatically respond negatively, when in fact they may already be drinking Smoky Tea, but have not recognized the smoke trait yet. Other customers might be turned off from buying a tea that anyone calls smoky, even before trying it for themselves. She also knows of customers who really hate smoky teas, and they won't drink a smoky tea if they know about it in advance.

So Jane plans to give two blind, unnamed tea samples to a number of her customers asking them to taste the teas, and respond to two survey questions online afterward. She offers a 10% discount coupon to her customers who taste the teas and answer the questions. Jane knows she can deduct these expenses on her taxes, and will recoup the costs in the future by investing her capital in teas her customers want.

Jane decides on a sample of her Menghai tuo along with another tuo of Jianshen Lancang tea, rather like this one.
2004 Jianshen tuo from white2tea
Again, Jane's is not exactly the same as Cwyn's, but close. The tuo is supposed to have "tobacco" notes. She brews up 8 grams of this tea too in 125 ml water, just to make sure she hasn't missed any smoky trait the tea might be hiding. But nope, the Jianshen tuo doesn't taste very smoky, because the processing exhibits almost no char in the strainer. She feels that her tuo differs enough from the Menghai that customers will have a preference for one or the other.
2004 Jianshen
An either/or preference situation can be represented by a binomial confidence interval, which will give Jane an idea whether her customers prefer a smoky tea or a non-smoky tea. Jane doesn't want a "no preference" situation, so she will ask customers to definitely pick one of the two teas.

Our Positive will be Smoky Tea, and the Negative will be Non-smoky tea. Because she doesn't really know about her customer preferences in truth, she will arbitrarily hypothesize that 50% will prefer the Smoky Tea and 50% will prefer the Non-Smoky Tea. Thus:

Positive=Prefers the Smoky Trait. (Menghai tuo=A)
Negative=Prefers Non-Smoky Trait. (Jianshen tuo =B)

The two survey questions will be:

1. Which of the two tea samples tastes smoky? (A or B)
2. Which of the two tea samples would you prefer to buy? (A or B)

Jane can use these two questions to sort out whether or not the customer detects the Smoky Tea, and then which of the two teas the customer would buy. If the customer correctly identifies the Smoky tea, and prefers to buy the Smoky, then Jane records a Positive for that customer. If the customer correctly identifies the Smoky tea, but instead prefers to buy the Non-smoky, Jane records a Negative for that answer. If the customer incorrectly picks the Non-smoky tea as the smoky one, but prefers to buy the Smoky, Jane records a Negative and assumes the customer didn't recognize the trait. If the customer incorrectly picks the Non-smoky tea as the smoky one and prefers to buy the Non-smoky, Jane also records a Negative. The possible answers are succinctly summarized like this:

A,A=Positive
A,B=Negative
B,A=Negative
B,B=Negative
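
The scoring rule can be sketched in a few lines of Python. This is only an illustration, assuming the rule as described in the paragraph above: a customer counts as a Positive only when they both pick A as the smoky sample and prefer to buy A. The names `score_response` and `tally`, and the five example answers, are made up for the sketch.

```python
from collections import Counter

SMOKY = "A"      # Menghai tuo, the smoky sample
NON_SMOKY = "B"  # Jianshen tuo, the non-smoky sample


def score_response(tastes_smoky, would_buy):
    """Score one customer's answers to question 1 and question 2.

    Assumption: Positive requires detecting the smoky tea AND
    preferring to buy it; every other combination is a Negative.
    """
    detected = tastes_smoky == SMOKY
    prefers_smoky = would_buy == SMOKY
    return "Positive" if detected and prefers_smoky else "Negative"


def tally(responses):
    """Count Positives and Negatives over (question 1, question 2) pairs."""
    return Counter(score_response(q1, q2) for q1, q2 in responses)


# Hypothetical answers from five customers, not real survey data.
responses = [("A", "A"), ("A", "B"), ("B", "A"), ("B", "B"), ("A", "A")]
counts = tally(responses)
print(counts["Positive"], counts["Negative"])
```

Keeping the raw answer pairs, rather than only the Positive and Negative totals, lets Jane go back later and look at each of the four combinations separately.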

By using two questions like this, rather than the one question "which do you prefer," Jane can learn whether or not her customers detect the Smoky trait and have buying preferences based on that, or whether they don't detect it, and their buying preferences aren't affected by this trait. Or, that the customer might correctly identify the Smoky tea, but for unknown reasons, other than just Smoke, the customer wants to buy the Non-smoky (leaf quality, or number of steeps, complexity or some other trait).

Jane would like to have a 90% confidence interval with no more than 10% margin of error. She mails out the two sample teas along with the survey instructions and continues mailing only those two tea samples in all customer orders until she has mailed out 100 sample sets. She gets 68 people who try the teas and answer the survey questions to get their coupon.

Then Jane counts up the number of Positives and Negatives as described above. Jane gets 48 customers (71%, or .71) with Positives for Smoky, and 20 customers (29%, or .29) with Negatives for Non-smoky. Remember, she started out hypothesizing an equal 50/50 split in the Smoky vs. Non-smoky. To find her confidence interval, she calculates the following:

$$\frac{N}{N+z^2}\left[P + \frac{z^2}{2N} \pm z\sqrt{\frac{PQ}{N} + \frac{z^2}{4N^2}}\right]$$
or
$$\frac{68}{68+1.645^2}\left[.71 + \frac{1.645^2}{2(68)} \pm 1.645\sqrt{\frac{(.71)(.29)}{68} + \frac{1.645^2}{4(68)^2}}\right]$$

where N = number of customers, P = %Positives as a decimal, Q = %Negatives as a decimal, and z = critical test value, which on a z chart is 1.645 for a two-sided 90% confidence interval.
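
The interval Jane is computing by hand is commonly known as the Wilson score interval. Here is a minimal Python sketch of it, assuming a two-sided interval whose critical value z is looked up from the standard normal distribution (about 1.645 at 90% confidence). The function name `wilson_interval` is made up for the example.

```python
import math
from statistics import NormalDist


def wilson_interval(positives, n, confidence=0.90):
    """Wilson score interval for a binomial proportion."""
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = positives / n          # proportion of Positives
    q = 1 - p                  # proportion of Negatives
    center = p + z**2 / (2 * n)
    spread = z * math.sqrt(p * q / n + z**2 / (4 * n**2))
    scale = n / (n + z**2)
    return scale * (center - spread), scale * (center + spread)


low, high = wilson_interval(48, 68, confidence=0.90)
print(f"{low:.2f} to {high:.2f}")  # prints "0.61 to 0.79"
```

The lower bound sits well above .50, which is what lets Jane set aside her 50/50 starting guess.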

"Can this get any easier?"

Try using this online calculator. Enter the Positives 48 into the "Passed" box, and for "Total Tested" enter 68. Use 90% for Confidence. The results will be a chart on the right side of the page. Most of the results will be very, very similar. Differences have to do with corrections based on the sample size and for situations where fewer than 5 cases fall into Positives or Negatives.

"I can't handle math..."

Try this simple online calculator. Enter 48 for the number preferring the first option, and 20 for the number preferring the second option, and 90% for Confidence. The quick result will tell you whether or not a significant preference exists, without any numbers.
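
For readers who would rather see what a yes/no preference check might do under the hood, here is a small Python sketch. It assumes an exact two-sided binomial test against a 50/50 split. Which test the calculator actually runs isn't stated, so treat that choice, and the name `two_sided_binomial_p`, as assumptions.

```python
from math import comb


def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for the hypothesis of a 50/50 split.

    k is the number preferring the first option out of n respondents.
    """
    tail = min(k, n - k)
    # Chance of a split at least this lopsided in one direction.
    one_side = sum(comb(n, i) for i in range(tail + 1)) / 2**n
    # Double it because a 50/50 split is symmetric; cap at 1.
    return min(1.0, 2 * one_side)


p_value = two_sided_binomial_p(48, 68)
significant = p_value < 0.10  # 90% confidence means a 10% cutoff
print("significant preference" if significant else "no clear preference")
```

A p-value below the cutoff says a split as lopsided as 48 to 20 would be unusual if customers truly had no preference.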

Originally Jane had hypothesized that 50% (or .50) of her customers preferred the Smoky Trait. But her experiment shows that the actual proportion likely falls between about 61% (.61) and 79% (.79). It's a pretty wide interval. But she had set her confidence interval for a big margin of error in order to reduce the number of customers she needed to mail teas to, and still get enough information to make a decision. Jane rejects her original hypothesis of a 50/50 split. Her customers are somewhat more likely to prefer the Smoky Menghai over the Non-Smoky Jianshen.

Plus she can make one Incredibly Important conclusion.

Jane can make generalizations that her results approach the entire population of tea drinkers, not just her own 68 customers. You might wonder, why go through all this rigmarole? Why not just use one survey question and take that 71% and go with it? This might be okay if all Jane wants to know is about her 68 customers who answered her survey question. However, you cannot generalize sample data from 68 people to the larger population using only a survey percent!! (Even though people do so erroneously all the time.) But because Jane went through all the above math and blind taste tests, she can make generalizations about the entire range of the tea community beyond her own sample with a certain degree of confidence.

Let me say that again: a confidence interval of a binomial distribution with sufficient power, and a sufficient sample of people can be generalized to the larger Population (p). But a % mean average of a survey by itself, without the above methods, won't give you a range beyond Jane's customers. Jane went through all this work for a reason. She spent her money in the right way so that she knows something more about a Trait in the larger population of tea buyers, not just about those 68 customers alone.

Remember one more thing: Jane knew that if she simply asked customers if they like a Smoky Tea, a far larger majority probably would see the word Smoky and say "no," just because of negative cultural connotations about Smoke. She would not have got a true picture of tea drinker preferences just by asking a question alone, without the blind samples. She might have been tempted to dump all her Menghai tuos, or sell them for far less than she might otherwise. In fact, after all her samples, she is likely to actually sell more of her Menghai tea now than she would have if she had simply described it as "Smoky" in a survey or in her online sales description without blind sampling beforehand. She now can use this data to guide her tea descriptions, pricing and marketing.

Some other considerations...

Speaking of money, Tea Vendors might say "Hey wait a minute. If I send 2 samples of 10 grams each to 100 people, that's 2 kilos of tea. I can't afford that!" There are two ways of getting around this problem. If you want to use a smaller sample size, you'll need to adjust the Margin of Error you're willing to accept, and the Confidence Level. For decisions about capital investment, and for social science research, a higher Margin of Error (say, 10%) and a lower Confidence Level (say, 80-90%) might be perfectly acceptable. This will lower your required Customer N by quite a bit. You can use a smaller N of 50 people, and get that hopefully by mailing out the samples to 75 customers or so. But you can't use fewer than 50 people, or else the procedure isn't valid.
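
The trade-off between Margin of Error, Confidence Level and Customer N can be roughed out with the usual planning formula N = z²p(1-p)/E². The Python sketch below is a simplification, not part of Jane's procedure: it assumes the plain normal approximation and the worst case p = .5 (an even split). The name `required_sample_size` is made up for the example.

```python
import math
from statistics import NormalDist


def required_sample_size(margin_of_error, confidence, p=0.5):
    """Approximate respondents needed for a given margin and confidence.

    Assumes the normal approximation; p=0.5 is the most cautious guess.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)


print(required_sample_size(0.10, 0.90))  # -> 68
print(required_sample_size(0.05, 0.95))  # -> 385
```

Under these assumptions, a 10% margin at 90% confidence needs about as many completed surveys as Jane collected, while tightening to 5% at 95% needs several hundred.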

There are a couple of downsides to using fewer people, one being you may not get much information with a wider confidence interval, not to mention skewing your survey. As it is, a 90% confidence level with a 10% margin of error is pretty wide. For exploratory work on people's preferences, however, this can be okay. But in fields like medicine, we don't want to be giving a medication to people unless we have a very high degree of confidence (99%) that we won't be causing adverse effects or death. In the "hard" sciences we need confidence levels extremely high, and error extremely low, compared to trait preference situations like Jane's. Does this make sense?

Another way around the expense problem is by replicating a study. If enough people conduct the same study, we can pool together the results and create a new Sample N > 300 and apply our more stringent margin of error (5%) and higher confidence level (95%). This is a post-hoc analysis. Researchers do this regularly in reviews of available research. Jane could get together with other tea vendors who agree to similarly sample their customers. They will all have different teas, but the main requirement will be to sample a definitely Smoky tea with a Non-smoky tea and ask the same exact two online survey questions.

Tea vendors can pool their results with Jane's, and apply the confidence interval calculator on their new aggregate group sample N. All they need to do is convert their percentages of answers into decimals for the formula, or plug the Positives into the online calculator. Using different teas is perfectly fine for tea vendors, and for social scientists studying Aesthetics or Traits. Obviously different teas won't be good enough to provide information for tea factories or people interested in leaf differences, for example. But to merely understand a trait, as long as the samples conform to the trait, the teas can certainly differ. And we know that teas often differ even among batches.
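
Pooling is just a matter of adding up the Positives and the respondents before computing the interval. A small Python sketch follows; Jane's 48 of 68 comes from the example above, while the other two vendors and their counts are invented purely for illustration. The names `wilson_interval` and `vendor_results` are likewise made up.

```python
import math
from statistics import NormalDist


def wilson_interval(positives, n, confidence):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = positives / n
    center = p + z**2 / (2 * n)
    spread = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    scale = n / (n + z**2)
    return scale * (center - spread), scale * (center + spread)


# (Positives, respondents) per vendor. Only Jane's row is from the text;
# the other rows are hypothetical placeholders.
vendor_results = {
    "Jane": (48, 68),
    "Vendor 2": (75, 120),
    "Vendor 3": (80, 130),
}

pooled_positives = sum(pos for pos, _ in vendor_results.values())
pooled_n = sum(n for _, n in vendor_results.values())

low, high = wilson_interval(pooled_positives, pooled_n, confidence=0.95)
print(pooled_positives, pooled_n, f"{low:.2f} to {high:.2f}")
```

Adding raw counts weights each vendor by the number of customers who answered, which averaging the vendors' percentages would not do.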

Jane can also get more data by looking at single answers. For example, if she gets an unusually large number of people with B, B answers (>10%), these are people who didn't correctly identify the Smoky tea, and who prefer to buy the Non-smoky. She might want to follow up with buyers who have actually purchased the Jianshen and get information from them about their experience with it, and look for any reviews of the tea. These buyers might have additional variables for her to consider as she makes decisions on buying tea for her business in the future. She might also question whether the Smoky tea she picked for her study really provided enough of a contrast for reliable customer data.

Obviously I made up Jane's sample results, and they are not actual values. Nevertheless, binomial variables are a straightforward, either/or way to study a preference or Trait and determine with a degree of confidence whether that trait is likely to exist in the general population, given an appropriate sample size. We can make conclusions about tea traits with a reasonable understanding of statistical power and unexplained error.

Can you think of any other traits about tea that can be explored in this way?







2 comments:

  1. Cloud did something almost as systematic (no maths) when he did a major sample distribution of the 2003 Silver Dayi in 2007.

  2. Yes, I believe his was a contest for writing a 200 word review, and a few got a prize of some sort. Your comment reminded me that a single sample is fairly small for packaging, he had that photo of all the samples packed up with postage affixed. I don't recall that he had any particular queries, just the review requirement with no format.

    There are qualitative (relativist) methods that systematically analyze what qualitative researchers refer to as "narratives." However, some structure is usually sought in advance, questions that are answered by everyone. Even without the structure, data like that can be used, doing something like "counts" of particular words, such as "smoky" or "tobacco." One could try retrieving some of that data, but using a review site like Steepster on a cake with a fairly decent review set is a way to get available data that has a little more structure, brewing parameters for example. A sample N can be gleaned from cakes like the Wild Monk 2012 by Mandala, which has a lot of reviews. If one wants to use the much-maligned ratings data, all that involves is reversing the algorithm, which appears to be an additive numerical weight based on number of reviews where the null = 70, or some such.
