Graduation Year


Document Type




Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Computer Science and Engineering

Major Professor

Sriram Chellappan, Ph.D.

Committee Member

Srinivas Katkoori, Ph.D.

Committee Member

Mehran Mozaffari Kermani, Ph.D.

Committee Member

Nasir Ghani, Ph.D.

Committee Member

Nathan Fisk, Ph.D.


Keywords

Classification, Crisis, Event Detection, Ranking, Text Mining


The use of social media is expanding significantly and can serve a variety of purposes. Over the last few years, users of social media have played an increasing role in the dissemination of emergency and disaster information. It is becoming more common for affected populations and other stakeholders to turn to Twitter to gather information about a crisis when decisions must be made and action taken. However, social media platforms, especially Twitter, present some drawbacks when it comes to gathering information during disasters. These drawbacks include information overload, informally written messages, and the presence of noise and irrelevant information. These factors make gathering accurate information online challenging and confusing, which in turn may affect the ability of the public, communities, and organizations to prepare for, respond to, and recover from disasters. To address these challenges, we present an integrated three-part (clustering-classification-ranking) framework that helps users sift through the mass of Twitter data to find useful information. In the first part, we build standard machine learning models to automatically extract and identify the topics present in a text and to derive hidden patterns exhibited by a dataset. In the second part, we develop binary and multi-class classification models for Twitter data that categorize each tweet as relevant or irrelevant and further classify relevant tweets into four types of community engagement: reporting information, expressing negative engagement, expressing positive engagement, and asking for information. In the third part, we propose a binary classification model that categorizes the collected tweets as high or low priority. We present an evaluation of the effectiveness of detecting events using a variety of features derived from Twitter posts, namely: textual content, term frequency-inverse document frequency, linguistic, sentiment, psychometric, temporal, and spatial.
Our framework also provides insights for researchers and developers to build more robust socio-technical systems for identifying types of online community engagement and ranking high-priority tweets in disaster situations.
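
As an illustration of one feature family named in the abstract, term frequency-inverse document frequency (TF-IDF), the following is a minimal sketch in Python. It is not the dissertation's implementation: the tokenizer, the unsmoothed weighting formula (tf = count / tweet length, idf = log(N / df)), the helper names `tokenize` and `tfidf`, and the sample tweets are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and keep alphabetic tokens only (an assumed, simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = log(N / df),
    where df is the number of documents containing the term.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: count each term once per document.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty tweets
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Hypothetical sample tweets, invented for this example.
tweets = [
    "flood water rising near the bridge",
    "need water and food at the shelter",
    "the concert was great",
]
weights = tfidf(tweets)
```

Under this weighting, a term that occurs in every tweet (here "the") receives a weight of zero, while a term confined to a single tweet (such as "flood") is weighted more heavily than one shared across tweets (such as "water"). Vectors of this kind could then be passed to any standard classifier for the relevance or priority tasks described above.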