Date of Award


Document Type


Degree Name

Doctor of Philosophy (PhD)


Computer Engineering and Sciences

First Advisor

Ronaldo Menezes

Second Advisor

Fernando Buarque de Lima Neto

Third Advisor

Nezamoddin Nezamoddini-Kachouie

Fourth Advisor

Heather Crawford


We live in a digital era in which everyday activities are increasingly replaced by online interactions. At the same time, technological advances and data availability are changing how we expand our knowledge about ourselves, society, and the environment. The growing availability of data, especially social media data, has drawn the attention of researchers, and we have witnessed a surge in studies relying on this rich source of information. However, most social media research is tuned to improve the outcomes of specific problems, so the reuse of techniques across different areas is limited to data specialists. We propose a straightforward data-driven methodology for exploratory analysis of social media data that processes the unstructured stream of social data into user characterizations. Emergent collective behaviors are obtained by aggregating the individual characterizations, and the resulting structured representations are analyzed using statistics and data science techniques. The results highlight the methodology's capacity to generalize, since we apply it in three different domains: (i) sports, characterizing football supporters; (ii) culture, characterizing languages; and (iii) health, characterizing organ donation awareness. Finally, the knowledge extracted from these applications (experience) serves as input to further research: we propose a measure of social disorganization based on the diversity of supporters in a region, and we show that language network centralities can serve as a proxy for quality of life.