Date of Award
5-2020
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Engineering and Sciences
First Advisor
Eraldo Ribeiro
Second Advisor
Theodore Petersen
Third Advisor
Ronaldo Menezes
Fourth Advisor
Philip Bernhard
Abstract
The success of humans cannot be attributed to language, but it is certainly true that language and humans are inseparable. Since the first language appeared, language has continually evolved across space and social groups to form around 7,000 languages today. The origin and evolution of languages remain unclear, and the state of the art in language evolution still lacks a comprehensive characterization. In general, this problem is tackled by statistically measuring changes in parts of a language (e.g., words and sounds). Given the current availability of data and computational power, this dissertation proposes a comprehensive data-driven characterization of language evolution using vocabulary, in two main areas. First, we extracted and classified the structural and chronological relations between languages using their vocabularies. Second, we studied the spatiotemporal effect on language vocabulary and its relation to socio-economic factors (i.e., educational attainment). The results demonstrated that the proposed method is capable of uncovering the relations between languages from both structural and chronological aspects. We also found that vocabulary levels can reveal the educational attainment of the resident population for specific areas and times.
Recommended Citation
Hamoodat, Harith A. Hamdon, "Characterization of Written Text Using Data and Network Science" (2020). Theses and Dissertations. 752.
https://repository.fit.edu/etd/752
Comments
Copyright held by author