Before looking at money sentiment, we first need to understand sentiment in its general meaning, as it appears in a social context.
What’s Sentiment Analysis Good For (in social media monitoring)?
The fundamental flaw in the number-based positive/negative approach to
sentiment analysis is not in the maths, technology or practicality. It
is in the fact that it starts from an assumption that people are
something they’re not.
Every person’s life tends to happen at the same basic levels. We are
each a person with an idea of a fixed being, which we call “me”. Then we
go about our lives experiencing things, which we call our first kiss or
“ouch, I hurt my knee”. Sometimes we feel the need to express these
experiences. That is what I’m doing right here: expressing myself.
Each of these levels is a diluted version of the previous one. As a
person we feel fixed and we feel like ourselves, and within that we have
an experience. The way we experience events depends entirely on who we
are. For example, when someone dents your car, it is entirely up to
you how you react in that situation. If you’re indifferent about it,
then there is no significant experience. You just take their details and
get it fixed. Or you get angry and talk for days about how someone
dented your car.
When you take your experiences and put them into words, they’re
further diluted from the actual substance, the richness of human
experience. The idea of being able to take human experience and fit it
on a scale of 0-100 in terms of positive or negative is ridiculous.
When experiences are verbalized, a natural distortion happens. In a
way, the experience itself is corrupted by the attempt to limit its
richness to words. What sentiment analysis tries to do is claim
that it can capture the essence of the expression (the experience and
the person behind it) and record it as a single numeric value.
As a consumer I may be someone who gets pissed off and expressive
about bad experiences, but I’ll be the first to praise you when you
redeem yourself. Or I could be someone who never says anything, good or
bad. How is this accounted for in the current state and direction
of text analytics? Brands are not looking for instances, but
relationships.
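To make the point about instances versus relationships concrete, here is a minimal sketch. The two consumers, their scores and the helper names `net_sentiment` and `trajectory` are all made up for illustration; no real monitoring tool is being described. It only shows that two very different histories can collapse into the same single number.

```python
from statistics import mean

# Hypothetical per-post sentiment scores on a -1..1 scale, in time order.
# These numbers are invented purely for illustration.
complains_then_praises = [-0.9, -0.8, 0.7, 0.9, 0.6]
steadily_lukewarm = [0.1, 0.1, 0.1, 0.1, 0.1]


def net_sentiment(scores):
    """Collapse a whole history of posts into one averaged number."""
    return round(mean(scores), 2)


def trajectory(scores):
    """Crude view of the relationship: last score minus first score."""
    return round(scores[-1] - scores[0], 2)


for name, scores in [
    ("complains, then praises", complains_then_praises),
    ("steadily lukewarm", steadily_lukewarm),
]:
    print(name, "| net:", net_sentiment(scores), "| trajectory:", trajectory(scores))
```

Both consumers end up with the same net score, yet one of them went from furious to delighted and the other never moved. Even the crude `trajectory` number says more about the relationship than the average does.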
While I understand the usefulness of text analytics to answer yes/no
questions in a closed domain with good preparation and proper
customization, this is a very limited approach. I’m always more
interested to know why people preferred that someone guided them
personally instead of just giving directions, or how the ones who didn’t
get personal guidance felt when they just got directions. The current
approach to sentiment analysis offers, at best, limited answers to such
questions.
The bottom line is that you can’t classify people, experiences or expressions on a scale of positive or negative. We are not that type of creature. No situation is totally positive or totally negative. Our relationships with brands are no different from the way we interact with life at large. Those relationships hold all the complexities and richness of our personalities, experiences and expressions.
The Human Factor
The fact that people don’t see things similarly in terms of positive
or negative is no surprise at all. Classical philosophers knew this
thousands of years ago, and it is one of the underlying concepts in
virtually every religion, philosophy or other system of thought.
We can be affected by so many different things: weather, economics,
relationships, time of day, medication. Attributes like these are
widely used in econometrics to model the actual situations in which
commerce happens.
To further complicate things, there is the whole dimension of our
relationship with ourselves, the way in which we understand and don’t
understand our own personas, experiences and expressions.
We’re left with that other approach. If I show 10 different
people pictures of 10 angry people and 10 happy people, or of 10
passionate people and 10 passive people, the situation becomes much more
human. We’re that kind of being: we get angry and happy, then we’re
sad. That is the level at which we relate to each other, to brands
and to the world around us.
I’m a big fan of automation and have always believed that we should
strive to automate everything we believe machines can do better than us.
The rest we leave for ourselves to do. The way net sentiment is utilized
in social media monitoring is something I think should be left completely
alone. At the level of net sentiment scoring, it is not worth the time
of either human or machine.
There is a better solution for both man and the machine in this
situation. The fact that something was started 15 years ago in a certain
way doesn’t necessarily mean it’s the best way. Our job is to make
sure that we’re all open to whatever ways may be out there.
We all eventually want the same thing, so defending one’s convictions
becomes a slippery slope. In Zen there is a saying: “In the beginner’s
mind there are many possibilities, but in the expert’s there are few”.
After doing one thing for a really long time, I find this to be the
most valuable guideline.
So instead of using our time defending the ivory towers of the text
analytics industry and where it’s at now, let’s figure out where we can
take it together!
In A True Spirit of Debate
Below are my responses to some of the arguments made in the post "Is Sentiment Analysis an 80% Solution?"
Test data about people agreeing on things with 80% accuracy has
little to do with how and why a single system (social media monitoring
technology) has a 20% error margin. It’s like comparing pears to
bananas. These language systems work from a set of rules as the base
for everything, and there is plenty of secret sauce in all of this.
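A small sketch of why these two numbers are different animals. The labels below are invented for illustration, and `agreement` is a hypothetical helper, not part of any real tool: it simply counts the share of items on which two label lists match.

```python
def agreement(labels_a, labels_b):
    """Share of items on which two label lists give the same answer."""
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Invented labels for ten posts: "P" = positive, "N" = negative.
human_a = ["P", "P", "N", "N", "P", "N", "P", "P", "N", "N"]
human_b = ["P", "P", "N", "N", "P", "N", "P", "P", "P", "P"]
system = ["N", "N", "N", "N", "P", "N", "P", "P", "N", "N"]

print("human A vs human B:", agreement(human_a, human_b))
print("system vs human A: ", agreement(system, human_a))
print("system vs human B: ", agreement(system, human_b))
```

In this toy data the two people agree on 80% of the posts, and the system also matches human A on 80%, but it matches human B on only 60%. The "accuracy" of the system depends on whose reading is treated as the truth, which is a separate question from how often people agree with each other.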
Nor does the example about InfoGlutton seem relevant. When it comes
to language-based systems, success is all about teaching the system to
work in a given environment (defining the rules). When you have a
domain-specific system (restaurants) with a limited number of entities
(below 100k), continuously optimizing the system is an option. But when
you work in an open generic domain (the internet) and you have a
virtually unlimited number of entities which produce an unlimited amount
of unique content, tweaking the system becomes very problematic. Think
of the difference between learning the 300 most common words in Spanish
and internalizing all the great philosophies in their original languages.
All this being said, often when you start looking at things from two
extremes, you’ll eventually find that the golden middle way suits best.
My hope is that we can do that by working together on directions that
make the most sense for everyone.
Thanks so much for the chance to have this discussion, Seth, and
thanks to everyone for taking the time to read this through.