The Sentiment Tool
Previously, when I discussed onemilliontweetmap.com, I left one feature out of my discussion. Today, I’m going to show you why.
Time for The Sentiment Tool.
To get an idea of how people feel about a certain keyword or hashtag, you can look at the connotations of the other words used in the same tweets. The idea is that words within a language carry connotations that convey how someone feels about a subject. The above map is the baseline, before any keywords or hashtags are applied to narrow it down.
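To make that idea concrete, here is a minimal sketch of word-list scoring in Python. I don’t know what onemilliontweetmap.com actually runs under the hood, so treat this as an assumption about the general approach: the word list, the scores, and the function name `score_tweet` are all made up for illustration.

```python
import re

# Hypothetical connotation scores; NOT the tool's real word list.
CONNOTATIONS = {
    "love": 1, "great": 1, "fun": 1, "happy": 1,
    "hate": -1, "awful": -1, "boring": -1, "sad": -1,
}

def score_tweet(text):
    """Sum the connotation scores of known words and label the result."""
    words = re.findall(r"[a-z]+", text.lower())
    total = sum(CONNOTATIONS.get(word, 0) for word in words)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"

# Made-up example tweets, one per label.
print(score_tweet("Great party last night, so much fun"))
print(score_tweet("I hate this party, so boring"))
print(score_tweet("The party starts at eight"))
```

Notice that the keyword itself (“party”) carries no score in this sketch. The label comes entirely from the words around it.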
For these I decided to include the Daytime/Nighttime layer, since this is a 24-hour visualization. I’m not doing the top 5 countries for these images. Instead, I’m focusing on the overall sentiment patterns, and I’m going to explain why I’m not the biggest fan of this tool.
1. Keyword: Party
This one is fun because it is so ambiguous. “Party” can be related to a birthday party or a political party. Some would claim the word “party” has a preexisting positive connotation. Some regions of the world are more negative than others. Africa and the Americas are the most negative based on the tweets from the past 24 hours.
What’s fun about this tool is that you can zoom in on specific tweets to see what some of the positive vs. negative examples are. What I did find was that there’s a lot of inconsistency as to what is counted as “positive.”
Here’s a false positive via Hawaii:
Let’s try another one! This tweet turned out to be a true positive and is way more wholesome:
What about the negatives?
2. Keyword: Election
Well, on a global scale, more people feel negative or neutral than feel positive. That’s… not good? Maybe the world hates politics! “Election” itself has a fairly neutral connotation, so I would have expected a neutral sentiment overall.
Based on example 1, we can tell that a tweet containing the string “Hitler” can still be scored as positive by onemilliontweetmap.com’s sentiment algorithm if the words around it carry positive connotations. So there’s some work to be done in regard to reliability.
In summary, many of the positive, negative, and neutral tweets in the US have to do with the suggestion that Election Day be moved. Since they’re all so similar, I’ll spare you the examples.
Let’s do something more fun!
3. Keyword: Chocolate
The assumption often goes that everyone likes chocolate, right? Maybe not! I assumed chocolate would have a neutral or positive connotation because it’s food, and not everyone likes the same food. But 70% of the world is tweeting “positive” things about chocolate. Delving into the connotation question more, it seems that this is a major barrier for ESL speakers and authors: the words they select, though direct translations, may fit by denotation but not by connotation.
4. Keyword: Sadness
To get an idea of how untrustworthy the sentiment tool is I decided it was time to do a test.
Okay. Hold up.
This was meant to be my “100% of people don’t think this is happy” test. Something isn’t right here. Let’s look into this.
False Positive #1:
False Positive #2:
Most of these false positives are celebrations of life – cases where positive language is used in combination with language of grief. Clearly this confuses the heck out of the sentiment tool.
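Here’s a rough sketch of how that confusion could happen, assuming the tool simply adds up word scores (an assumption on my part, not something the site documents). The tweet below is one I invented in the style of a celebration of life, not one from the map, and the scores in `WORD_SCORES` are made up too.

```python
import re

# Invented scores for illustration only.
WORD_SCORES = {
    "celebrate": 1, "beautiful": 1, "love": 1, "joy": 1,
    "sadness": -1, "grief": -1, "loss": -1,
}

def net_score(text):
    """Add up the scores of every known word in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(WORD_SCORES.get(word, 0) for word in words)

# A made-up tweet: one grief word, three positive words.
tweet = "So much sadness today, but we celebrate her beautiful life with love"
print(net_score(tweet))
```

One negative word gets outvoted by three positive ones, so a simple tally like this would file a grieving tweet under “positive.”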
5. Keyword: Fantastic
So what about false negatives? Well, those can happen too! Using the keyword “fantastic” with the assumption of 100% positive results, I was able to find an example.
In this case it looks like the algorithm may have been confused by the word “missed”? Otherwise, I am uncertain as to why this tweet was counted as a negative sentiment.
The Sentiment Tool In Summary
It’s important to remember how flawed these tools are when it comes to judging human emotion. While it’s powerful to be able to look at large populations to gain an understanding of their overall attitudes, everything falls apart as you begin to break it down. There’s too much nuance to trust an algorithm to determine what is objectively positive, negative, or neutral without additional data. Each connotation is determined by social research that is flawed and fails to capture the diversity present within language, instead focusing on a standardization model that homogenizes word sentiment. This is done by some set of people deciding the connotations for the words their algorithms scan for.
Connotations around language are based in culture and regional dialects, rather than the denotations found in dictionaries. Here are some examples of positive things to say where I grew up that would not be interpreted that way elsewhere:
- Well, she/he/they ain’t ugly.
- This food is just terrible – I’ll do everyone a favor and finish it.
- I’d hate to meet you under better circumstances.
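To see how a standardized word list might handle those three sayings, here is a small Python sketch. The scores in `STANDARD_SCORES` are my own guesses at “dictionary” connotations, not any real tool’s data, and the function name is hypothetical.

```python
import re

# Invented "standard" scores; a real tool's list would be far larger.
STANDARD_SCORES = {"ugly": -1, "terrible": -1, "hate": -1, "favor": 1, "better": 1}

def standard_score(text):
    """Tally word scores with no knowledge of regional meaning."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(STANDARD_SCORES.get(word, 0) for word in words)

phrases = [
    "Well, she ain't ugly.",
    "This food is just terrible - I'll do everyone a favor and finish it.",
    "I'd hate to meet you under better circumstances.",
]

# Print each phrase's tally next to the phrase itself.
for phrase in phrases:
    print(standard_score(phrase), phrase)
```

With these made-up scores, none of the three comes out positive, even though all three are compliments where I grew up.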
What are some regionalisms from where you grew up that don’t match up with the assumed connotations of words? Do you think these would be confusing to someone not from there?
What is your opinion of the sentiment tool? Would you find it helpful in writing? Do you think it’s helpful or does it introduce more confusion?
Let me know what you think in the comments!
Thank you so much for reading this and I hope you have a fantastic rest of your day. If you like what you read, please consider liking, commenting, or sharing. This helps me know which posts my readers enjoy the most. And as always – thank you so much for taking some time out of your day to spend with me.