A Day with Bangladesh on Twitter

May 10, 2018

A report published by The Daily Star on December 1, 2017 stated that, according to Google’s records, 40 million people actively use the internet in Bangladesh. Of those 40 million users, 35% go online every day. The majority of these users spend a significant portion of their time online on social networking sites, which makes Dhaka the second most active city on Facebook, as reported last year in Global Digital Statshot, a report published by Hootsuite and We Are Social. It is already evident that we Bangladeshis have accepted social media as a part of our lives. Social media platforms have also evolved: once merely personal networking tools, they are now global opinion-sharing platforms that even shape people’s opinions.

For Bangladesh, this large and mostly young online population could be a great way to represent the country to the world. But how do we collectively represent our country on social media? What do we care about most, and how do we feel when we talk about Bangladesh? Studying social media content with the help of advanced data analytics techniques and machine learning algorithms may give us some idea. This article offers a glimpse of the findings from such a study.

According to a report published in 2015 by CNN, Facebook is the most widely used social media platform in Bangladesh, followed by Twitter. Facebook would therefore be the natural first choice, but it does not offer integration with data analytics tools for collecting data at the scale such a study requires. Twitter’s publicly accessible API (application programming interface), on the other hand, makes it possible to fetch a large number of publicly available tweets, which makes Twitter a good candidate for this analysis. Since the aim of the study was to learn what impression we give to a global audience, only messages written in English were considered. For the study, a total of 20,000 English tweets containing ‘Bangladesh’, all posted on a single day in January 2018, were collected. These tweets were collected and analyzed using R, an open-source data analysis tool, and different machine learning models were used to derive the insights, which will be briefly discussed along the way.
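The selection rule described above (English tweets that mention ‘Bangladesh’) can be illustrated with a minimal sketch. The study itself used R; the Python below is only an illustration, and the record fields `text` and `lang`, the helper `select_tweets`, and the sample tweets are all hypothetical, not the study’s actual data or code.

```python
import re

# Hypothetical sample records; the field names "text" and "lang" are
# assumptions for illustration, not the study's actual data format.
sample_tweets = [
    {"text": "Bangladesh won the match today!", "lang": "en"},
    {"text": "বাংলাদেশ আজ জিতেছে", "lang": "bn"},
    {"text": "Weather is nice in Dhaka", "lang": "en"},
    {"text": "News from BANGLADESH this morning", "lang": "en"},
]

# Case-insensitive match on the keyword used in the study.
KEYWORD = re.compile(r"bangladesh", re.IGNORECASE)

def select_tweets(tweets, lang="en"):
    """Keep tweets in the given language whose text mentions the keyword."""
    return [
        t for t in tweets
        if t.get("lang") == lang and KEYWORD.search(t.get("text", ""))
    ]

selected = select_tweets(sample_tweets)
for tweet in selected:
    print(tweet["text"])
```

Filtering on both language and keyword keeps the sample focused on what an English-speaking global audience would actually read about Bangladesh.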

To put our discussion in context, let’s think of this study as a random walk down a street called ‘Twitter’ with a celebrity named ‘Bangladesh’. During our walk, we recorded all the comments about Bangladesh. Now we will analyze these records (tweets) and try to understand what people talked about most and what kinds of emotion they expressed.

To unfold the story, we will start our analysis with the words that caught our attention in the random comments about Bangladesh, which in this context are tweets. A word cloud built from the most frequently used words in the tweets gives us a first idea: Rohingya, News, Refuge, and Cricket are some of the words that people used most often. Apparently, there is a lot of talk about Rohingya, Cricket, and News. But what about the other words that we heard? How do they relate to each other, or do they even relate at all? In other words, can we say that people talked about news on Rohingya, or news on something else? And what about the country names that we see in the word cloud? Why do people care about India and Pakistan?
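A word cloud is driven by word frequencies, so the counting step behind it can be sketched as follows. This is a minimal Python illustration rather than the R code used in the study; the stopword list, the helper `word_frequencies`, and the sample texts are assumptions made up for the example.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption; real analyses
# typically use a much longer list).
STOPWORDS = {"the", "a", "an", "in", "on", "of", "and", "to", "is",
             "are", "for", "from", "with", "about", "rt"}

def word_frequencies(texts, stopwords=STOPWORDS, exclude=("bangladesh",)):
    """Count words across texts, skipping URLs, stopwords and the search term."""
    counts = Counter()
    for text in texts:
        # Lowercase and drop URLs before splitting into alphabetic words.
        cleaned = re.sub(r"https?://\S+", " ", text.lower())
        words = re.findall(r"[a-z]+", cleaned)
        counts.update(
            w for w in words
            if w not in stopwords and w not in exclude and len(w) > 2
        )
    return counts

# Hypothetical sample texts, not tweets from the study.
sample_texts = [
    "Rohingya news from Bangladesh",
    "Bangladesh cricket news today",
    "Cricket fans in Bangladesh",
]

freqs = word_frequencies(sample_texts)
print(freqs.most_common(3))
```

The search term itself is excluded because it appears in every collected tweet and would otherwise dominate the cloud without telling us anything new.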