
Global Sentiment and Geography Index Dataset


Promoting well-being is one of the key targets of the United Nations Sustainable Development Goals. Many governments worldwide are incorporating subjective well-being (SWB) indicators to complement traditional objective and economic metrics. Our Global Sentiment and Geography Index Dataset (GSGD) provides high-granularity monitoring of well-being worldwide.

Twitter Sentiment Geographical Index

Our Twitter Sentiment Geographical Index (TSGI) provides high-granularity monitoring of well-being worldwide.

164 Countries

Our dataset provides multiple administrative levels of sentiment, including world, country, state/province, and county/city.

10-Year Coverage

Thanks to the original geotagged Twitter data collection, our sentiment data dates back to 2012.

83% Accuracy

Using a state-of-the-art neural network model, we achieve high sentiment classification accuracy on the test dataset.

 

Data playground

 
Data Technology

Data Source

The raw tweet data used to produce the Global Sentiment and Geography Index Dataset (GSGD) comes from the Harvard CGA Geotweet Archive v2.0, a global collection of geotagged tweets spanning time, geography, and language, maintained by the Harvard Center for Geographic Analysis. The Archive extends from 2010 to the present and is updated daily. It contains approximately 10 billion tweets.

Methodology

 
[Figure 1: Schematic]

The sentiment index for global geotagged tweets is produced in the following steps:

  • First, we vectorize the text of each tweet into a 768-dimensional vector.

  • Then, we feed the vector into a trained neural classifier to obtain a single sentiment score.

  • Finally, we aggregate the scores within each administrative area to represent local subjective well-being.
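The three steps above can be sketched roughly as follows. This is only an illustrative sketch, not the project's actual pipeline: `embed` is a hypothetical stand-in that produces a deterministic pseudo-embedding instead of a real text encoder, `sentiment_score` is a toy logistic classifier with made-up weights, and averaging the scores per area is an assumed aggregation rule. Only the 768-dimensional vector size comes from the description above; the area codes and example texts are invented.

```python
import hashlib
import math
from collections import defaultdict

EMBED_DIM = 768  # vector size stated in the methodology


def embed(text):
    """Hypothetical stand-in for the text encoder.

    Builds a deterministic 768-value pseudo-embedding from SHA-256 digests.
    A real pipeline would use a trained language model here.
    """
    vec = []
    counter = 0
    while len(vec) < EMBED_DIM:
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        vec.extend(b / 255.0 - 0.5 for b in digest)  # values in [-0.5, 0.5]
        counter += 1
    return vec[:EMBED_DIM]


def sentiment_score(vector, weights, bias=0.0):
    """Toy logistic classifier mapping a vector to a score in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, vector)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def aggregate_by_area(records, weights):
    """Average the sentiment scores per area (assumed aggregation rule).

    records: iterable of (area_id, tweet_text) pairs.
    Returns a dict mapping area_id to the mean score of its tweets.
    """
    totals = defaultdict(lambda: [0.0, 0])  # area_id -> [sum, count]
    for area, text in records:
        totals[area][0] += sentiment_score(embed(text), weights)
        totals[area][1] += 1
    return {area: total / n for area, (total, n) in totals.items()}


# Made-up weights and tweets, purely for illustration.
weights = [0.01] * EMBED_DIM
tweets = [
    ("US-MA", "great day at the park"),
    ("US-MA", "stuck in traffic again"),
    ("FR-75", "bonjour tout le monde"),
]
index = aggregate_by_area(tweets, weights)
```

The same `aggregate_by_area` call could be repeated with country, state/province, or county/city identifiers as the area key to obtain an index at each administrative level.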


Applications


Acknowledgment

The TSGI is a collaborative project between the MIT Sustainable Urbanization Lab and the Harvard Center for Geographic Analysis.


Yuchen Chai

Graduate Researcher

MIT


Devika Kakkar

Data Science Project Manager

Harvard


Juan Palacios

Post-doctoral Researcher

MIT


Siqi Zheng

Faculty Director and PI

MIT


Contact Us
