Digital Humanities 150:
Social Media Data Analytics
About the Data
D A T A S E L E C T I O N
Our dataset consists of tweets, user information, locations, dates, and times scraped from Twitter. All of the tweets include the hashtag "#NotDying4WallStreet". The data specifically consists of tweets from March 24, 2020, which was the day the hashtag started trending.
​
W O R K P L A N
​
We scraped two hours of Twitter data from March 24, 2020, using TAGS. We then checked the rows and columns for NULL values and deleted the affected entries. After cleaning the data, we used several tools to analyze it. To answer our first research question, how the content of a tweet potentially relates to the location it came from, we used Tableau. We then used Voyant Tools to observe the relationships among frequently used words and phrases. To support our second finding, we used R to determine which hashtags were used most frequently.
To go deeper, we performed VADER sentiment analysis in Python and a separate sentiment analysis in R. We used the R analysis to identify which emotions were detected most often in the tweets, and the VADER analysis to measure the percentage of positive and negative sentiment detected in the tweets. Lastly, we used Tableau to identify the 20 accounts with the most followers so that we could research their occupations, and we used Gephi to conduct a social network analysis of retweets and mentions.
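The cleaning and hashtag-counting steps can be sketched in a few lines. This is an illustrative Python version, not the R script behind our second finding. The column names `from_user` and `text` are assumptions about the TAGS export, the helper names are hypothetical, and the sample rows are made up.

```python
import re
from collections import Counter

# A hashtag is "#" followed by letters, digits, or underscores.
HASHTAG_RE = re.compile(r"#\w+")


def drop_incomplete_rows(rows, required=("from_user", "text")):
    """Keep only rows where every required column has a non-empty value.

    The column names are assumed; adjust them to match the actual export.
    """
    return [
        row for row in rows
        if all((row.get(col) or "").strip() for col in required)
    ]


def hashtag_counts(rows, text_column="text"):
    """Count hashtags across all tweet texts, ignoring letter case."""
    counts = Counter()
    for row in rows:
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(row[text_column]))
    return counts


# Made-up rows standing in for the scraped data.
sample_rows = [
    {"from_user": "user_a", "text": "Stay home #NotDying4WallStreet #COVID19"},
    {"from_user": "user_b", "text": ""},                      # missing text
    {"from_user": "", "text": "#NotDying4WallStreet"},        # missing user
    {"from_user": "user_c", "text": "People over profit #notdying4wallstreet"},
]

clean_rows = drop_incomplete_rows(sample_rows)
top_hashtags = hashtag_counts(clean_rows).most_common(10)
```

With a real export, the rows could be loaded with `csv.DictReader` from the standard library before being passed to `drop_incomplete_rows`.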
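A retweet and mention network of this kind is usually built from an edge list. The sketch below shows one way such a list could be derived from tweet text. It assumes retweets begin with "RT @user" and that mentions appear as "@user". The helper names, the `Kind` column, and the sample tweets are hypothetical. The `Source` and `Target` headers follow the convention Gephi's spreadsheet importer uses for edge tables.

```python
import csv
import io
import re

RETWEET_RE = re.compile(r"^RT @(\w+)")   # assumed retweet prefix
MENTION_RE = re.compile(r"@(\w+)")


def tweet_edges(author, text):
    """Return (source, target, kind) edges for one tweet."""
    retweet = RETWEET_RE.match(text)
    if retweet:
        # Treat a retweet as a single edge to the original author.
        return [(author, retweet.group(1), "retweet")]
    return [(author, name, "mention") for name in MENTION_RE.findall(text)]


def edges_to_csv(edges):
    """Serialize edges as CSV text with Source and Target columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Kind"])
    writer.writerows(edges)
    return buffer.getvalue()


# Made-up (author, text) pairs standing in for the scraped tweets.
sample = [
    ("user_a", "RT @user_b: We are #NotDying4WallStreet"),
    ("user_c", "@user_a @user_b thoughts? #NotDying4WallStreet"),
    ("user_d", "No mentions here"),
]

edges = [edge for author, text in sample for edge in tweet_edges(author, text)]
edge_csv = edges_to_csv(edges)
```

Keeping the retweet or mention label in its own column means the two kinds of tie can be filtered or colored separately once the file is loaded.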
T E C H N I C A L S P E C I F I C A T I O N S
​
​
Since we are analyzing tweets, we performed a sentiment analysis to better understand the emotions behind them. The sentiment analysis let us unpack whether people had overwhelmingly positive or negative reactions toward going to work amidst the ramp-up of COVID-19 cases. We used additional visualizations to further unpack Twitter users' attitudes, including which other hashtags were popular alongside #NotDying4WallStreet. We also explored celebrity tweets and their potential impact on other users' engagement with and attitudes toward the hashtag, and we analyzed a geographic map to see whether the emotional response varied across different regions of the U.S. and the world. Tying these investigations together helped us gain a better understanding of our data and of how #NotDying4WallStreet contributed to not only a national, but global, conversation surrounding the COVID-19 pandemic, politics, and economics.
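The step from per-tweet sentiment scores to overall positive and negative percentages can be sketched as follows. This is a minimal illustration, not our exact script. It assumes each tweet has a VADER compound score between -1 and 1 and uses the cutoffs suggested in the VADER documentation (0.05 and above is positive, -0.05 and below is negative). The function names and example scores are hypothetical.

```python
def classify(compound, threshold=0.05):
    """Label one compound score as positive, negative, or neutral."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def sentiment_shares(compound_scores):
    """Return the percentage of scores that fall under each label."""
    names = ("positive", "negative", "neutral")
    labels = [classify(score) for score in compound_scores]
    total = len(labels)
    if total == 0:
        return {name: 0.0 for name in names}
    return {name: 100.0 * labels.count(name) / total for name in names}


def vader_compound_scores(texts):
    """Score texts with VADER; requires the third-party vaderSentiment package."""
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return [analyzer.polarity_scores(text)["compound"] for text in texts]


# Made-up compound scores, used here so the example runs without VADER installed.
example_scores = [0.6, -0.4, 0.0, -0.7]
shares = sentiment_shares(example_scores)
```

In practice, the list passed to `sentiment_shares` would come from `vader_compound_scores` applied to the cleaned tweet texts.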
R
A programming language and environment for statistical computing and graphics. It is widely used among statisticians and data miners.
Tableau
An interactive data visualization and analytics tool that helps people see and understand data through clear visualizations.
Voyant Tools
A web-based text reading and analysis environment. We used it to observe the relationships among frequently used words and phrases.
Python
A general-purpose programming language with a wide range of applications. We used it for VADER sentiment analysis and for a number of our visualizations.
Gephi
An open-source network analysis and visualization software that allowed us to study social media networks.