Computational linguist advancing writing system analysis
About Me
I am a computational linguist specializing in writing systems. I analyze corpora to understand how humans encode and transmit language across different writing systems. My research bridges traditional linguistics with computational and corpus approaches to uncover the fundamental principles of written communication.
This website showcases my academic work, projects, and services.
Research Assistance Services
I provide technical support for research projects, including but not limited to:
- Data Scraping: Extracting structured and unstructured data from online sources.
- Data Analysis: Utilizing Python or R for advanced data processing.
- Data Visualization: Designing clear, visual representations of complex data.
If you need technical support for your research, I offer a free consultation. Visit my Services page for details.
Education
- M.A. Linguistic Theory & Typology, University of Kentucky (2024)
- B.A. Philosophy, Western Kentucky University (2018)
Research Interests
My research centers on understanding how writing systems encode linguistic information and uncovering universal principles in human communication. I approach this from a computational perspective, integrating concepts from information theory, computational linguistics, and historical linguistics, and building on the foundational work of scholars such as Coulmas (1989, 2003) and Sproat (2000).
1. Writing Systems
I study the structural and functional principles of writing systems, focusing on questions such as:
- How do writing systems balance the representation of phonology, morphology, and syntax?
- What universal constraints influence the design of writing systems across cultures and time periods?
- How do written forms adapt to encode linguistic information differently than spoken language?
Recent projects include:
- Analyzing the trade-offs between efficiency and redundancy in alphabetic and syllabic systems.
2. Corpus Linguistics
By leveraging corpus analysis, I uncover patterns in orthographic representation and linguistic variation across languages. Key contributions include:
- Developing tools to analyze multilingual corpora, accommodating diverse scripts and encoding systems.
- Quantifying linguistic regularities using entropy, redundancy, and other statistical measures.
- Studying diachronic text corpora to understand how writing systems evolve over time.
Applications of this research range from exploring phonological transparency in modern alphabets to tracing historical linguistic change in under-documented languages.
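As a rough illustration of the entropy and redundancy measures mentioned above, here is a minimal Python sketch that works on character unigrams. The function names `char_entropy` and `redundancy` are hypothetical and chosen for this example. Treating redundancy as one minus the ratio of observed entropy to the maximum entropy for the observed alphabet is one common convention, not necessarily the one used in the projects described here.

```python
import math
from collections import Counter


def char_entropy(text):
    """Shannon entropy (bits per character) of the unigram character distribution."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def redundancy(text):
    """Relative redundancy: 1 - H / H_max, with H_max = log2(observed alphabet size)."""
    alphabet_size = len(set(text))
    if alphabet_size < 2:
        # H_max is zero (or undefined) here; returning 0.0 is a convention
        # assumed for this sketch.
        return 0.0
    return 1.0 - char_entropy(text) / math.log2(alphabet_size)
```

A text whose characters are all equally frequent has zero redundancy under this definition, and the value rises as the frequency distribution becomes more skewed.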
3. Computational Linguistics
I use computational methods to study sub-word-level patterns in writing systems. Key areas include:
- Using information-theoretic methods to measure predictability and linguistic efficiency in text.
- Modeling orthographic redundancy to understand its role in readability and error correction.
- Building machine learning models to analyze how writing systems encode phonological, morphological, and syntactic features.
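One simple way to make "predictability" concrete at the sub-word level is the conditional entropy of a character given the character before it. The sketch below estimates it from raw bigram counts. The function name `conditional_entropy` is hypothetical, and the bigram model with no smoothing is an assumption made for brevity, not a description of the models used in this research.

```python
import math
from collections import Counter


def conditional_entropy(text):
    """Estimate H(next char | previous char) in bits from bigram counts."""
    bigrams = Counter(zip(text, text[1:]))   # (previous, next) pairs
    contexts = Counter(text[:-1])            # how often each char acts as context
    total = sum(bigrams.values())
    h = 0.0
    for (prev, _nxt), n in bigrams.items():
        # p(prev, next) * log2 p(next | prev)
        h -= (n / total) * math.log2(n / contexts[prev])
    return h
```

Lower values mean the next character is easier to predict from its predecessor. A strictly alternating string such as "abababab" scores zero because each character fully determines the next.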
Notable achievements include:
- Developing algorithms to classify writing system directionality using entropy and Gini coefficients.
- Designing machine learning models to predict linguistic properties from script features.
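For readers unfamiliar with the Gini coefficient, the sketch below shows how it can be computed over character frequency counts. The names `gini` and `char_frequency_gini` are hypothetical, and using raw character frequencies as the input is an assumption made for illustration. How such a score feeds into a directionality classifier is not specified here.

```python
from collections import Counter


def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # convention assumed for this sketch
    # Standard rank-weighted formula for values sorted in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def char_frequency_gini(text):
    """Gini coefficient of the character frequency counts in a text."""
    return gini(Counter(text).values())
```

A score near zero indicates that characters occur about equally often, while higher scores indicate that a few characters dominate the text.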
Selected Presentations
Year | Title | Venue |
---|---|---|
2024 | Predicting the Direction of Writing Using Character Gram Sequences | 16th Annual Meeting of the Illinois Language and Linguistics Society (ILLS16) |
2024 | Data Degradation in the Linguistic Atlas of the Gulf States | American Dialect Society |
2023 | Investigating the Etymology of ‘Bogle’ – A Dialectological Approach | American Dialect Society |
2023 | Deciphering Historical Texts Using Word Embeddings | Central Kentucky Linguistics Conference |
2022 | Logography? More like NOgography: Reconsidering Writing System Typologies | Central Kentucky Linguistics Conference |
2015 | Critique of Zimmerman’s Philosophy of Time | Kentucky Philosophical Association |
2014 | Syntactic Analysis of Elision: ‘Want to/Wanna’ | Western Kentucky University Undergraduate Conference |
Affiliations
Linguistic Atlas Project
University of Kentucky
A comprehensive study documenting linguistic variation in American English dialects since 1929. The project includes digitization of over 90 years of linguistic data to make it accessible for modern research.
DECRYPT Project
Stockholm University
An interdisciplinary initiative focused on decrypting historical manuscripts. Combining computational linguistics, cryptology, and philology, the project develops tools for analyzing historical ciphers and making them accessible.