Twitter Is A Strange Tool For Geospatial Analysis Of Emotion

The Sentiment Tool

Previously, when I discussed onemilliontweetmap.com, I left one feature out of the discussion. Today, I’m going to show you why.

Source: onemilliontweetmap.com 2:38PM 30 July 2020 US MT – baseline sentiment

Time for The Sentiment Tool.

To get an idea of how people feel about a certain keyword or hashtag, you can look at the connotations of the other words used in the same tweets. The idea is that words within a language carry connotations that convey how someone feels about a subject. The above map is the baseline, prior to entering any keywords or hashtags to narrow things down.

For these images I decided to include the Daytime/Nighttime layer, since this is a 24-hour visualization. I’m not doing the top 5 countries this time; instead, I’m focusing on the overall sentiment patterns and explaining why I’m not the biggest fan of this tool.
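For anyone curious what’s going on under the hood: I don’t know which algorithm onemilliontweetmap.com actually uses, but a bare-bones lexicon scorer built on the “word connotation” idea described above might look something like this Python sketch. The word lists and example tweets are completely made up for illustration.

POSITIVE = {"love", "great", "fun", "wholesome", "happy", "fantastic", "best"}
NEGATIVE = {"hate", "sad", "terrible", "worst", "awful", "cancelled"}

def score_tweet(text: str) -> str:
    """Count positive and negative connotation words and return an overall label."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Hypothetical example tweets, not real data from the map:
print(score_tweet("Had the best time at her birthday party, love you all!"))  # positive
print(score_tweet("Another party cancelled. This is awful."))                 # negative
print(score_tweet("The party starts at 8."))                                  # neutral

The real tool is surely more sophisticated than this, but anything built on counting connotations shares the same core weakness: it tallies words without understanding context, which is exactly what the examples below run into.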

1. Keyword: Party

Source: onemilliontweetmap.com 1:45PM US MT 31 July 2020 keyword party

This one is fun because it is so ambiguous. “Party” can be related to a birthday party or a political party. Some would claim the word “party” has a preexisting positive connotation. Some regions of the world are more negative than others. Africa and the Americas are the most negative based on the tweets from the past 24 hours.

What’s fun about this tool is that you can zoom in on specific tweets to see what some of the positive vs. negative examples are. What I did find was that there’s a lot of inconsistency as to what is counted as “positive.”

Here’s a false positive via Hawaii:

Well… that doesn’t seem very positive.

Let’s try another one! This tweet turned out to be a true positive and is way more wholesome:

What about the negatives?

Oh, well that is sad. A Back To The Future V watch party would be fun.

2. Keyword: Election

Source: onemilliontweetmap.com 1:56PM US MT 31 July 2020 keyword election

Well, on a global scale, more people feel negative or neutral than positive. That’s… not good? Maybe the world hates politics! “Election” itself has a fairly neutral connotation, so I would have expected a neutral sentiment overall.

Based on example 1, we can tell that a tweet containing the string “Hitler” can still be flagged as positive by onemilliontweetmap.com’s sentiment algorithm when it is paired with enough words that carry a positive connotation, so there’s some work to be done in regard to reliability.

In summary, many of the positive, negative, and neutral tweets in the US have to do with the suggestion that Election Day be moved. Since they’re all so similar, I’ll spare you the examples.

Let’s do something more fun!

3. Keyword: Chocolate

Source: onemilliontweetmap.com 2:05PM US MT 31 July 2020 keyword chocolate

The assumption often goes that everyone likes chocolate, right? Maybe not! I assumed chocolate would have a neutral or only mildly positive connotation because it’s food, and not everyone likes the same food. But 70% of the world is tweeting “positive” things about chocolate. Delving into the connotation question more, it seems this is a major barrier for ESL speakers and authors: the words they select, though direct translations, may not be appropriate when judged on connotation rather than denotation.

4. Keyword: Sadness

To get an idea of how untrustworthy the sentiment tool is I decided it was time to do a test.

Source: onemilliontweetmap.com 2:09PM US MT 31 July 2020 keyword: sadness

Okay. Hold up.

This was meant to be my “100% of people don’t think this is happy” example. Something isn’t right here. Let’s look into this.

False Positive #1:

False Positive #2:

Most of these false positives are celebrations of life – cases where positive language is used in combination with language of grief. Clearly this confuses the heck out of the sentiment tool.

5. Keyword: Fantastic

Source: onemilliontweetmap.com 2:54PM US MT 31 July 2020 keyword: fantastic

So what about false negatives? Well, those can happen too! Using the keyword “fantastic” with the assumption of 100% positive results, I was able to find an example.

In this case it looks like the algorithm may have been confused by the word “missed”? Otherwise, I am uncertain as to why this tweet was counted as a negative sentiment.
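To make those failure modes concrete, here’s the same toy bag-of-words idea again (an invented lexicon, not the site’s actual algorithm) reproducing a false positive and a false negative like the ones above. The example text is paraphrased nonsense, not quoted from the real tweets.

POSITIVE = {"celebrate", "beautiful", "love", "fantastic", "wonderful"}
NEGATIVE = {"sad", "loss", "missed", "grief", "terrible"}

def score(text: str) -> int:
    """Net count of positive minus negative connotation words."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# False positive: a celebration of life mixes grief with positive language,
# and the positive words simply outnumber the negative ones.
print(score("We celebrate a beautiful life after her sad loss, so much love"))  # +1 -> "positive"

# False negative: a couple of negative-connotation words ("sad", "missed")
# outweigh the one positive word in a tweet a human would read as warm.
print(score("So sad I missed you, the show was fantastic"))                     # -1 -> "negative"

A bigger lexicon or fancier weighting would soften these particular cases, but the blind spot remains: the scorer never knows what the tweet is actually about.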

The Sentiment Tool In Summary

It’s important to remember how flawed these tools are when it comes to judging human emotion. While it’s powerful to be able to look at large populations and gain an understanding of their overall attitudes, as you begin to break it down everything falls apart. There’s too much nuance to trust an algorithm to determine what is objectively positive, negative, or neutral without additional data. Each connotation is determined by social research that is flawed and fails to capture the diversity present within language, instead relying on a standardization model that homogenizes word sentiment. In the end, some set of people decides the connotations for the words their algorithms scan for.

Connotations around language are based in culture and regional dialects, rather than the denotations found in dictionaries. Here are some examples of positive things to say where I grew up that would not be interpreted that way elsewhere:

  • Well, she/he/they ain’t ugly.
  • This food is just terrible – I’ll do everyone a favor and finish it.
  • I’d hate to meet you under better circumstances.

What are some regionalisms from where you grew up that don’t match up with the assumed connotations of words? Do you think these would be confusing to someone not from there?

What is your opinion of the sentiment tool? Would you find it helpful in writing? Do you think it’s helpful or does it introduce more confusion?

Let me know what you think in the comments!

Thank you so much for reading this and I hope you have a fantastic rest of your day. If you like what you read, please consider liking, commenting, or sharing. This helps me know which posts my readers enjoy the most. And as always – thank you so much for taking some time out of your day to spend with me.

Things That Influence My Writing: Linguistics And My Thoughts On Linear A

So I’m watching Trey The Explainer and he mentions this written language used by the Minoans called Linear A. It is related to and pre-dates Linear B. I love historical mysteries like this. Why? Because I love linguistics. I’ve talked about this previously a little when talking about my accent.

The Evolution Of Language

Languages have evolved over time and are still evolving. As an editor, I am in favor of embracing newly defined versions of languages specific to geographic locations and unique communities. By defining language systems and fighting to preserve them rather than erase them, we fight language extinction, if only in written form.

I recognize that this gets to be a tricky area because depictions of dialects within racial communities have had a strong tendency to be inappropriate. Period. This is why I think it’s important for the rules of any language to be set by native speakers; that way, if someone is attempting to write in a specific way to represent a dialect, they can do so in a consistent manner that was defined accurately. Sadly, this is a very slow process. Linguistics research, in general, is slow. You can help by participating in the Accent Tag!

Since languages are constantly evolving, written languages, where they exist, are evolving alongside them. One of the most fascinating cases of language evolution, to me, is the evolution of alphabets representing sounds, whether complete syllables or individual vowels and consonants.

Linear A

Linear A is the “undeciphered” ancient Minoan script that is similar to the later script known as Linear B.

Why Can’t We Use The Rules Of Linear B To Read Linear A?

Current interpretations of Linear A tend to assume that Linear B preserved the same rules. One of the issues I have with this assumption is that language evolution doesn’t guarantee that shared phonetic symbols keep the same sounds.

In Europe, that’s not how the mapping between phonemes and written language worked; there are letters that have fallen out of use in English and German since the rise of the printing press and the Northern Renaissance. Then, during the Enlightenment, some European dudes in really fancy pants decided that lots of European languages needed defined grammar, structure, and spelling.

In China, this started much earlier with printing presses (everyone knows Bi Sheng invented the first movable-type printing press, right?) and was a constant battle between isolated populations with varying languages and the various national governments.

Standardizing languages doesn’t really work. Sadly, even La Francophonie often chooses to ignore basic human rights regarding autonomy and language preservation in favor of pretending that it does.

We can assume that as the language evolved into Linear B, vowel and consonant shifts happened, changing the assumed pronunciations of these symbols. Therefore, it’s incorrect to apply pronunciation and meaning backward, since Linear B is the younger script. That’s like trying to read and understand Old English using the rules of Middle English.

How Would We Translate Linear A?

So what should we do instead? As with any problem, we have to establish what we know.

Well, we can’t look at anything Greek, because genetic evidence suggests that the Minoans weren’t from Greece; rather, they came from the East, meaning that the older language to look at would not be Indo-European.

After the fall of the Minoan Civilization, their culture, language, and genetics did merge with the Greek people.

They were seafaring traders who interacted with people living as far away as the Iberian Peninsula and Egypt. That gives us some starting information to work with.

Another clue may be the unique behavior of the mathematical system they used, which included a built-in logarithmic scale. Mathematical systems can be a great indicator of the history and culture of a people because they develop out of necessity for explanation.

So what could Linear A’s parent be? Peter Z. Revesz seems to think it’s a shared ancestor of the Hattic and Hungarian languages, making it a member of the Uralic language family. This also means we can use the algorithm developed in his paper to get more consistent translations as our set of written examples grows. Its consistency and applicability are compelling. What’s unique about the paper is that it examines only the linguistics, using a computational approach, with nothing to do with DNA sequencing or the anthropological history of human remains. That said, more recent studies looking at the oldest remains of individuals come to similar conclusions. In fact, a number of researchers have now reached the same conclusions using vastly different methods, but there’s still a significant amount of disagreement. […I do recommend reading that thread if you want an interesting source of entertainment and great laughs. There’s some incredible pettiness between researchers who can’t agree on anything and have absolutely no sense of what they look like to everyone watching. Have your popcorn ready.]

Why Is Linear A So Difficult To Crack?

If the Minoans were such prolific seafaring traders, why is it so hard to decipher their language when we are able to decipher Egyptian? Well, we don’t have a ton of writing samples.

We can also assume that Linear A was a language used by seafarers. These languages, such as English once the British decided they just had to colonize all the things, pick up words from other languages and evolve into variants at a much faster pace by forming pidgins with other languages along trade routes.

This means that whatever Linear A originated from could be unrecognizable without seeing the in-between pidgin forms that might have been present at their trade locations and then building a linguistic timeline. We don’t have a large enough sample size for that. Hopefully, as we find more remnants of Minoan trading, most recently in places like France and Serbia, a timeline can develop to support or debunk the computational relationship to the Hattic language that would establish it as a member of the Uralic language family.

Why Does Any Of This Matter?

The evolution of language matters because it’s one of the ways we can understand the evolution of the human brain and we can understand the things some cultures found important enough to write down. The written component of a language provides us with one small piece of the puzzle that is a dead language. The more we understand about how languages evolved, the more we can understand about contact between different human groups as well as how their civilizations were structured. Language provides huge insights into details about general views, such as the self and how one relates to past, present, and future.

All languages require the transmission of information from a source to a receiver. That information must be encoded in some way, then passed to the recipient via a channel, such as a physical medium or sensory perception. This is the tricky part: if the recipient can’t decode the information effectively, things become problematic. Because we can’t decode Linear A, we can’t read anything written in Linear A. To talk about this, I’m going to focus on language encoding, because decoding is the reason it all matters.
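Here’s a toy Python sketch of that encode/decode loop, just to make the framing concrete. The sign table and the “word” being encoded are completely invented; they only stand in for the kind of symbol-to-sound mapping a reader needs, and which we no longer have for Linear A.

SIGNS = {"k": "▲", "a": "●", "t": "□", "o": "○", "d": "■"}  # invented, not real Linear A signs

def encode(message: str, table: dict) -> str:
    """Sender side: turn sounds into written symbols."""
    return "".join(table.get(ch, "?") for ch in message)

def decode(text: str, table: dict) -> str:
    """Receiver side: this only works if the mapping survives."""
    reverse = {symbol: sound for sound, symbol in table.items()}
    return "".join(reverse.get(sym, "?") for sym in text)

inscription = encode("katoda", SIGNS)  # "katoda" is a nonsense word used only for this demo
print(inscription)                     # ▲●□○■● -- what survives on the "tablet"
print(decode(inscription, SIGNS))      # "katoda" -- readable with the table
print(decode(inscription, {}))         # "??????" -- the mapping died with its speakers

That last line is where we sit with Linear A: the symbols and even some statistical patterns are visible, but the table connecting them to sounds and meanings went missing along with the people who used it.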

We have 5 primary ways that we’ve communicated (encoded information) over time:

  • Pictographic
  • Phonetic
  • Written Phonetic
  • Signed Languages (Phonetic / Symbolic)
  • Interpretive or Body Language

Pictographic

Okay, see? You’re not crazy for thinking emojis are a natural part of English. Illuminated manuscripts contained illustrations to complement written words, or occasionally to work in place of them, in line. Pictographs are some of the oldest forms of language we have, if you don’t count the early stone stippling forms that often predate full pictographs on some continents. We have absolutely no way of knowing how the cultures that used these symbolic systems pronounced their words, and the false assumption is often made that they had simplistic language structures. Modern symbolic languages and an improved understanding of language evolution provide evidence that this is not the case. One of the problems with pictographic languages is that they rely on subjective interpretation to convey messages. That leaves room for miscommunication, and that’s how arguments get started over who sent whom the wrong emoji and what it meant.

Phonetic/Spoken

Purely spoken languages with no preservation disappear when their people do. This is why many indigenous languages are endangered, and why one of the only good things that came out of the messed-up stuff those evangelical missionaries did during the colonization of the world was coming up with ways to translate the Bible into lots of indigenous languages. That didn’t save all indigenous languages, and many still need help: by supporting the sovereignty of these nations to teach in public schools, print road signs, write government documents, and run local media stations dedicated to indigenous languages. These methods have been used to save Welsh, Maori, and some indigenous languages here in the United States. I first learned about these efforts in 2006, when Congress passed the Esther Martinez Native American Language Preservation Act. That program runs out in 2024 and has not been enough, with many native speakers threatened now more than ever by SARS-CoV-2. When all native speakers of a language die out, the reality is that the language is gone forever, even if recordings and written documentation remain. This is because a spoken language is more than just words. As previously mentioned in my post on accents, how a person sounds when they speak communicates information to the listener. With dead languages (no original speakers remaining), that information is lost. It could be compared to epigenetics, in the sense that histone modification of gene expression is passed down through generations and can be traced through relatives based on their environmental exposures and life experiences. It’s weird.

Written Phonetic

Written phonetic languages are those that tell you exactly what they sound like. Or at least they try to. Vietnamese and Arabic are both great examples of written phonetic languages. It’s important to note that not all languages fit neatly into either the written phonetic or the pictographic category; Modern English and Japanese both do a great job of demonstrating this, and Linear A is an additional example.

Spoken languages rely on body language, subjective implication, and contextual interpretation for the full communication of information. Written phonetic languages, while preserving a bit more structural information about a language, provide only the bare minimum of the information being conveyed once those other components are stripped away. Subjective interpretation of written phonetic languages is something that requires the reader to supply the context that would otherwise come from those missing components.

Signed Languages

An accidental study and total violation of human rights led to the discovery that humans will always develop natural language when in a sufficiently socialized group, even if that language is non-auditory. Natural signed languages do not inherently possess a phonetic linkage, except in communities where there is full integration between auditory and non-auditory communicators.

Some phonetic signed languages are constructed languages. When I was a kid, I learned Cued Speech with a childhood friend. We didn’t really talk using words, instead relying on our own sign language, which I wish I remembered today. She used Cued Speech to learn phonetic pronunciation in a speaking world, and I used it for speech therapy.

Signed languages are distinguished from body language by the use of specific gestures with unique meanings and by the defining unit features a language requires to establish grammatical rules. When a sign language dies with no record, much like a spoken language, it is considered a dead language, and even the signed languages that American Sign Language was based on have now died.

Body Language

Different elements (forms) of non-verbal communication [14].  
Source: https://www.researchgate.net/figure/Different-elements-forms-of-non-verbal-communication-14_fig1_221217376 <- really cool paper on non-verbal communication in video game avatars.

Body language is the major component of language communication that is most often left out. It’s the most subjective and the easiest to misinterpret, and it complements spoken/phonetic and signed languages. Body language can often be broken down into “universal” and “non-universal”.

Body language studies are fairly controversial and some of the best studies focus on the behaviors of non-human primate species. “Universal” body language seems to have some level of genetic predisposition and in neurotypical infants is among the first forms of language to be understood. This “universal” body language is considered easier for neurodiverse individuals to learn thanks to clear definitions.

“Non-universal” body language includes gestures that may fall into a scenario where gestures can represent a number of things depending on context related to a local region, spoken language, and a specific culture. This kind of body language is highly subjective and evolves (or becomes extinct) quickly. This kind of body language is the kind often related to miscommunication, misinterpretation, and culture shock. What may be polite to one person may be like slapping another person’s grandmother.

Linguistics Is More Complicated Than This

I don’t want anyone to be under the impression that this is all there is to linguistics. I have barely scratched the surface, using imprecise terms that are a bit friendlier to a non-jargon-seeking audience. My interests tend to focus on documenting unique variants of languages and cracking written languages that are not yet understood so they can be decoded. If you’re interested in linguistics, you should read the primary literature on these topics and consider getting involved in citizen-science efforts such as the Accent Tag.

Language extinction hurts everyone. Because language gives structure to abstract experiences, loanwords between languages allow for the expression and adoption of human experiences that had not previously been given concrete language or attempts at understanding. That means that, in some regard, by preserving global language diversity, we have the ability to control our intellectual evolution.

TL;DR

I advocate for people to:

  • Learn about threatened and endangered languages
  • Learn the basics of linguistics
  • Learn about language evolution, so it can be embraced and accepted

If you enjoyed this, please like, comment, and/or share – it helps me know which types of content my readers want to see.

Thank you so much for taking the time to read my rambling about Linear A and linguistics today. Without you these would be bits of data floating around waiting to be accessed in that 1.2 petabytes we call the internet.

Thoughts On YouTube, Podcasts, And Accents

I was 18, sitting on a dock over Lion’s Creek.

The Article

Today there is a repost of an article by Jessica Love in The American Scholar titled The Disappearing Accent. In it, the author discusses how certain age groups have more difficulty distinguishing English accents than others; in particular, younger age groups tend to focus only on familiar accents and will tune out unfamiliar ones.

Accents and dialects play an important social function by helping individuals distinguish locals from non-locals. This gives an immediate sensory cue of “in-group” vs. “out-group”, and based on the associations with that group, a person will have a response. These responses contribute to the global issue of systemic racism, and sometimes they aren’t so friendly (see: almost every anti-immigrant accent joke ever; even Disney is guilty of a long history of these).

Accents do help individuals rapidly determine where someone is from, geographically, without conscious thought. Interestingly, accents can also tell us a lot about the history of human migration.

Expanding on this, even English accents and dialects demonstrate this history of human migration. The accents found throughout the former British Empire depend on the timing of colonization relative to the Great Vowel Shift, on when and where the colonists originated, and on whether their English dialect descended from Victorian or Elizabethan English.

As someone from the Chesapeake Bay, I have an accent that originates from Elizabethan English prior to the Great Vowel Shift. This is unique and part of what makes accents from this area special and different sounding from all other Southern accents. Tangier, Hog, and Smith Island are the famous Chesapeake Bay islands, but there are so many others no longer occupied by more than one or two houses, if any. The watermen lived along the shorelines and worked the bay.

My grandfather was born north of the Bay, and we came into the area later. My parents lived most of their lives elsewhere, then raised us in towns on the Shore, as opposed to on the Islands. This is an important distinction. My accent is not multi-generational, and therefore not as thick as others.

The Accent Tag

If you haven’t been exposed to it by now, there’s an incredible thing called the Accent Tag. It has been used extensively to document the way people speak through YouTube videos, and it’s a wonderful resource for authors who want to research how someone from a particular area would sound. I decided to read off the words from the word list after several hours of silence and white noise as my only auditory input, to provide a baseline of my accent.

Here’s a recording of me saying the Accent Tag words

What About YouTube Videos And Podcasts?

I would love to! Based on my pronunciations above, do you think people could understand me if I slip into that? Do you think I’ll need subtitles? I’ve had students accuse me of needing subtitles before, during classes while I was teaching, and that’s been embarrassing. In the past my accent has made it difficult for people to understand me.

In past relationships it meant I was lectured on correct pronunciation, and it may have played a role in why they never introduced me to their family. I have been told that my accent makes me sound “low class” and “uneducated”. I’ve had to explain to my own husband that he needed to back off with the “you’re pronouncing it wrong” bull crap.

Long story short, people experience accent discrimination by losing job opportunities and by experiencing people being dicks to them, sometimes their own spouses and friends. The moment this is combined with any other factor their lives get way worse. To be blunt: it’s a lot of effort to keep constantly worrying about how I’m pronouncing things. You can hear me trip up in the word list with “Spitting Image” because… That’s not how I would even begin to say that phrase because it’s not even spelled that way in my head.

For these reasons, I’m nervous about being public with my voice. I know my accent that slips out is not as thick as a Tangier Islander accent:

That said, my accent is something I think is special and unique. It is one of the most beautiful things about where I am from and about the history of the United States. And it’s disappearing. I may be part of the last generation of Americans to have a Chesapeake Bay accent.

The Delmarva peninsula and the Chesapeake Bay are the settings of many of my stories. I look forward to sharing these with everyone so you too can know the joy of stories of Accomack, Onancock, Harborton, Onley, Wallops Island, and more.

Concluding Thoughts

Accents are complicated. They are used to make judgments that are often unfair and completely uncalled for. They are used as a deciding factor in job interviews, and by random people we meet in passing forming first impressions from an introduction.

“An accent comes with a connotation. You think you know if someone is smart or stupid because of their accent. And yet the truth is an accent is not a measure of intelligence, it’s just someone speaking your language with the rules of theirs.”

Trevor Noah, Afraid of the Dark

In Trevor Noah’s quote, which I love, I think dialect comes into play. A dialect is a particular form of a language specific to a region. Think of an accent as a language being spoken with the rules of a dialect or a language other than the one the listener thinks is “normal”. That’s it.

So… Next time you want to correct someone for pronouncing something “wrong”, pay attention. Is that how they always pronounce it? Are they consistent? Is that how everyone pronounces that word where they’re from? Maybe it’s okay to not correct accents that are different from your own. Besides, it’s on both of you to adjust during the conversation to improve communication.

So what do you think? If I slip up and say a word (or a lot of words) with my rounded, dropped vowels and soft starting consonants, will it bother you too much for me to make videos or podcasts? Should I do both formats and put subtitles on the videos? Let me know in the comments!