Project Details
Client:
Academic / Personal NLP Project
Tools:
Python, pandas, spaCy, NLTK, TextBlob, Gensim, Hugging Face Transformers, pyLDAvis, seaborn
Twitter Discussion Analysis – SDG 5 (Gender Equality)
Uncovering Public Sentiment and Topics Around Gender Equality Through Twitter NLP
This project applies Natural Language Processing techniques to analyze over 10,000 tweets related to the United Nations Sustainable Development Goal 5 (Gender Equality). The goal was to build an end-to-end pipeline that extracts, processes, and structures online conversations into usable insights for researchers, policymakers, and the public.
Using Python libraries such as pandas, spaCy, and NLTK, the pipeline cleans and tokenizes the raw data. Sentiment analysis was performed with TextBlob, and temporal trends were visualized with seaborn. Topic modeling was conducted with Gensim's LDA and explored with pyLDAvis for interpretability. To convert thematic clusters into stakeholder-ready summaries, a BART transformer model from Hugging Face was integrated for text generation.
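The cleaning and tokenizing step described above can be illustrated with a minimal sketch. It is not the project's actual code: it uses only the standard library as a stand-in for spaCy and NLTK, and the function name `clean_tweet`, the regular expressions, and the small stop-word set are assumptions chosen for illustration.

```python
import re

# Patterns for common tweet noise (illustrative assumptions, not the project's rules).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9']+")

# Tiny illustrative stop-word set; NLTK and spaCy ship much fuller lists.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "to", "of",
              "in", "for", "on", "rt", "we", "it"}


def clean_tweet(text: str) -> list[str]:
    """Strip URLs and mentions, keep hashtag words, lowercase, and tokenize."""
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @user mentions
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    tokens = TOKEN_RE.findall(text.lower())
    # Remove stop words and single-character leftovers.
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


if __name__ == "__main__":
    sample = "RT @UN_Women: Equal pay is a right! #SDG5 https://t.co/abc123"
    print(clean_tweet(sample))
```

Token lists like these are the kind of input a bag-of-words topic model such as Gensim's LDA expects. A real pipeline would likely add lemmatization, which spaCy provides, before modeling.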
This project also demonstrates agile project delivery with sprint tracking in Trello, versioned notebooks in Jupyter, and final insights compiled into a written report and executive slides. It highlights not only technical NLP workflows but also the importance of transforming unstructured social data into actionable narratives for real-world advocacy.
Explore the project in full: https://github.com/dangquii/Twitter-Discussion-Analysis