
Research & Initiatives

We conduct research and development in several program areas. Learn more about each program below.

  • Cultural Analytics

This research program applies natural language processing and AI techniques to analyze and understand cultural phenomena as reflected in language. This interdisciplinary field combines computational linguistics, cultural studies, sociology, and data science to study societal patterns, values, and identities in texts, traditional and social media, and multilingual datasets.

Key areas of cultural analytics include sentiment and opinion analysis to gauge public attitudes, discourse analysis to study societal narratives, and topic modeling to identify themes in cultural texts. It also encompasses studying multilingual or cross-cultural datasets to identify linguistic diversity and cultural nuances. Applications include tracking cultural shifts over time, studying how different communities interact linguistically, detecting rises in hate speech or radicalization, and exploring the representation of marginalized groups in digital or historical texts.

By examining how language shapes and reflects culture, cultural analytics fosters insights into societal dynamics and promotes the development of inclusive, context-aware technologies.

  • NLP Resources and Evaluations

Natural Language Processing (NLP) plays a crucial role in the development of resources for underrepresented languages, enabling their integration into digital technologies and fostering linguistic diversity. This involves creating and curating essential data assets such as annotated corpora, lexicons, and knowledge graphs that capture the linguistic structure, vocabulary, and semantic relationships unique to these languages. Techniques like data augmentation, transfer learning, and unsupervised learning are often employed to maximize the utility of limited resources.

Additionally, this research program is designed to conduct in-depth evaluations of existing systems for low-resource languages, assessing their performance in tasks like machine translation, text classification, and speech recognition. These evaluations identify gaps, biases, and areas for improvement, helping ensure that the systems are both technically sound and culturally relevant.
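
As a rough illustration of how such an evaluation can surface gaps, the sketch below computes text-classification accuracy per language and flags languages that trail a reference language by more than a chosen margin. It is a minimal sketch, not the program's actual methodology: the function names, the `max_gap` threshold, the placeholder language labels (`lang_a`, `lang_b`), and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict


def accuracy_by_language(records):
    """Return {language: accuracy} for (language, gold, predicted) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, gold, predicted in records:
        total[language] += 1
        correct[language] += int(gold == predicted)
    return {lang: correct[lang] / total[lang] for lang in total}


def flag_gaps(scores, reference_language, max_gap=0.1):
    """List languages whose accuracy trails the reference by more than max_gap."""
    reference = scores[reference_language]
    return sorted(
        lang for lang, acc in scores.items() if reference - acc > max_gap
    )


# Toy records with placeholder language labels (hypothetical data, not results).
RECORDS = [
    ("lang_a", "pos", "pos"),
    ("lang_a", "neg", "neg"),
    ("lang_a", "pos", "pos"),
    ("lang_a", "neg", "neg"),
    ("lang_b", "pos", "pos"),
    ("lang_b", "neg", "pos"),
    ("lang_b", "pos", "neg"),
    ("lang_b", "neg", "neg"),
]

scores = accuracy_by_language(RECORDS)
gaps = flag_gaps(scores, reference_language="lang_a")
```

Accuracy alone says nothing about cultural relevance or bias; in practice a breakdown like this would be one input alongside qualitative review by speakers of the language.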

  • Pedagogical Linguistics

Our pedagogical linguistics program advocates for the integration of linguistic theory, formal language acquisition research, and AI-driven approaches within the language teaching curriculum, with emphasis on Persian. By encouraging collaboration between linguists, NLP practitioners, and language teachers, we can form a robust foundation for a pedagogical approach that bridges theory, technology, and practice. Linguists bring expertise in language patterns and structures; NLP practitioners contribute technical skills to develop tools such as language models, interactive platforms, and automated feedback systems that enhance language learning and analysis; and language teachers ensure the pedagogical relevance of these efforts, tailoring content to learners' needs and integrating technology into practical, engaging curricula.

By working together, we can create innovative programs that incorporate cutting-edge NLP technologies like adaptive learning platforms, corpus-driven language exercises, and automated assessments into traditional teaching methods. This interdisciplinary approach not only improves the accessibility and effectiveness of language education but also fosters a deeper understanding of linguistic principles, empowering learners with tools that adapt to their unique linguistic and cultural contexts.

  • Formal Linguistics

This research program aims to document, analyze, and preserve the languages and the linguistic diversity of the historically and culturally rich regions in the Middle East and the Caucasus. The program emphasizes the development of formal models of phonology, morphology, syntax, and semantics that capture the structural and functional properties of these languages. By leveraging theoretical frameworks and data-driven methods, the research seeks to address the unique linguistic phenomena of these languages, such as complex morphological systems, diverse syntactic structures, and rich oral traditions.

  • Linguistic Modeling

Formal linguistic analysis and theoretical insights enable the development of computational models that are not only powerful but also linguistically informed and interpretable. Computational grammars grounded in linguistic theory create a valuable feedback loop between theory and application, allowing linguists to test and refine their ideas. By encoding grammatical rules and linguistic structures (such as syntax, morphology, and semantics) into computational models, these systems can parse and generate language data, validating theoretical predictions against real-world linguistic behavior. When discrepancies occur, they highlight gaps in the theory, prompting refinement and improvement. These grammars also make it possible to apply theoretical frameworks to diverse languages, including low-resource ones, testing how well the frameworks adapt to linguistic features like rich morphology or free word order. This iterative process not only enhances computational tools but also deepens our understanding of language typology and universals, bridging the gap between linguistic theory and practical technology.
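
The feedback loop described above can be sketched in a few lines. The example below encodes a toy verb-final grammar in Chomsky normal form, runs a standard CYK recognizer over a handful of sentences, and reports every sentence where the grammar's prediction disagrees with an acceptability judgment. Everything here is an assumption made for illustration: the rules, the English placeholder vocabulary, and the judgments are hypothetical and do not describe any particular language or the institute's own grammars.

```python
from itertools import product

# Toy verb-final (SOV) grammar in Chomsky normal form (hypothetical rules).
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("NP", "V"): {"VP"},   # object precedes the verb
}
LEXICON = {
    "the": {"Det"},
    "child": {"N"},
    "book": {"N"},
    "reads": {"V"},
    "sleeps": {"VP"},      # intransitive verb forms a VP by itself
}


def recognizes(tokens, start="S"):
    """CYK recognition: True if the grammar derives the token sequence."""
    n = len(tokens)
    if n == 0:
        return False
    # chart[i][j] holds the categories that span tokens[i..j] inclusive.
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, token in enumerate(tokens):
        chart[i][i] = set(LEXICON.get(token, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for left, right in product(chart[i][k], chart[k + 1][j]):
                    chart[i][j] |= BINARY_RULES.get((left, right), set())
    return start in chart[0][n - 1]


def find_discrepancies(judgments):
    """Return sentences where the grammar and the judgment disagree."""
    return [
        sentence
        for sentence, acceptable in judgments
        if recognizes(sentence.split()) != acceptable
    ]


# Hypothetical acceptability judgments: (sentence, judged acceptable?).
JUDGMENTS = [
    ("the child the book reads", True),
    ("the child sleeps", True),
    ("the child reads the book", False),
    ("the child reads", True),   # object omitted; the toy grammar has no rule for this
]

discrepancies = find_discrepancies(JUDGMENTS)
```

Here the only mismatch is the sentence with the omitted object, which the toy grammar rejects although it is judged acceptable. A mismatch of this kind is the signal that sends the linguist back to revise the rules, for example by adding an analysis of optional objects.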


Miami, Florida, United States

(561) 592-8272

© 2024 by The Zoorna Institute.
