Computational morphology and phonology

Language is a pattern, and patterns must be computable. But different patterns need different amounts of computational power. I study the computability of morphology and phonology from the perspective of Formal Language Theory and finite-state calculus.

Reduplication

Computation of reduplication and its typology

Reduplication is a typologically common yet computationally difficult process to model. Under the supervision of Dr. Jeffrey Heinz, I have explored the use of a rarely used finite-state technology, 2-way finite-state transducers (2-way FSTs), to handle the high computational complexity of reduplication. Using 2-way FSTs, we have:

  1. looked at their ability to generate the typology of reduplication
  2. discovered subclasses of 2-way FSTs that match the linguistic typology
  3. determined the learnability of these subclasses
  4. shown how they are an insightful computational model of reduplication
  5. developed the RedTyp database on reduplication using 2-way FSTs

RedTyp can be accessed on our GitHub. A synthesis of all this work is in our paper in the Journal of Language Modeling: paper (link). To learn more about these points, check out:

  • [1] CLS53 paper and slides
    Dolatian, Hossep and Jeffrey Heinz (2017). “Reduplication with finite-state technology.” Proceedings from the Annual Meeting of the Chicago Linguistic Society, vol. 53, no. 1, pp. 55-69. Chicago Linguistic Society, 2017
  • [2] NAPhCX slides
  • [2&3] ICGI paper (link) and slides
    Dolatian, Hossep and Jeffrey Heinz (2019) “Learning reduplication with 2-way finite-state transducers.” Proceedings of The 14th International Conference on Grammatical Inference 2018, PMLR 93:67-80, 2019.
  • [4] SIGMorPhon paper (link) and poster
    Dolatian, Hossep and Jeffrey Heinz (2018) “Modeling reduplication with 2-way finite-state transducers.” Proceedings of the Fifteenth Workshop on Computational Research in Phonetics, Phonology, and Morphology (pp. 66-77). https://doi.org/10.18653/v1/W18-5807
  • [5] SCiL paper (link) and slides
    Dolatian, Hossep and Jeffrey Heinz (2019) “RedTyp: A Database of Reduplication with Computational Models,” Proceedings of the Society for Computation in Linguistics: Vol. 2, Article 3. https://doi.org/10.7275/ckx7-s770
  • [1-5] JLM paper (link)
    Dolatian, Hossep and Jeffrey Heinz (2020). “Computing and classifying reduplication with 2-way finite-state transducers.” Journal of Language Modelling, 8(1), 179–250. https://doi.org/10.15398/jlm.v8i1.245
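
To make the idea concrete, here is a minimal Python sketch of how a 2-way FST can compute total reduplication: the read head copies the input left to right, rewinds to the left edge without producing output, and then copies the input a second time. The state names, the boundary symbols `<` and `>`, and the `~` separator are assumptions made for this illustration and are not taken from the papers or from RedTyp.

```python
# Illustrative sketch only: a hand-written simulation of a deterministic
# 2-way FST for total reduplication. State names and boundary symbols
# are hypothetical choices for this example.
LEFT, RIGHT = "<", ">"  # assumed end markers; the word must not contain them


def total_reduplication(word: str, sep: str = "~") -> str:
    tape = LEFT + word + RIGHT
    state, pos, out = "copy1", 0, []
    while state != "done":
        sym = tape[pos]
        if state == "copy1":
            # First pass: move right, copying each input symbol.
            if sym == RIGHT:
                out.append(sep)
                state, pos = "rewind", pos - 1
            else:
                if sym != LEFT:
                    out.append(sym)
                pos += 1
        elif state == "rewind":
            # Move the head back to the left edge; output nothing.
            if sym == LEFT:
                state, pos = "copy2", pos + 1
            else:
                pos -= 1
        elif state == "copy2":
            # Second pass: move right again, copying each input symbol.
            if sym == RIGHT:
                state = "done"
            else:
                out.append(sym)
                pos += 1
    return "".join(out)


print(total_reduplication("wanita"))  # wanita~wanita
```

The machine needs only three working states no matter how long the word is, because the ability to move leftward lets it reread the input instead of memorizing it. A 1-way transducer has no such option for unbounded copying.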

Generative capacity of reduplication and reduplicative theories

Reduplication has a well-defined computational expressivity, but the same can’t be said for most theories of reduplication. In collaboration with a lot of folks, we look at how specific models of reduplication fare with respect to the computation of reduplication. Most models are computationally much more powerful than the attested typology requires.

  • Paper in review (paper) on precedence-based phonology with Eric Raimy
    Dolatian, Hossep and Eric Raimy (in review) “Evaluating Precedence-Based Phonology: Logical structure of reduplication and linearization”
  • Glossa paper (link)
    Rawski, Jon, Hossep Dolatian, Jeff Heinz, and Eric Raimy (2023) “Regular and polyregular theories of reduplication”, Glossa: a journal of general linguistics 8(1). doi: https://doi.org/10.16995/glossa.8885

Formal language theory and neural networks

Formal language theory provides abstract but computable descriptions of natural language processes. In collaboration with Max Nelson, Brandon Prickett, and Jonathan Rawski, we test how well neural networks can match the behavior of subregular functions and their finite-state transducers.

Our main case study so far is reduplication. Reduplication can be computed by different types of 1-way and 2-way finite-state transducers, and by different types of neural networks. When learning reduplication, we show that different parameters for a neural network correspond to different classes of finite-state transducers.

  • SCiL paper (link) and slides
    Nelson, Max, Hossep Dolatian, Jonathan Rawski, and Brandon Prickett (2020) “Probing RNN Encoder-Decoder Generalization of Subregular Functions using Reduplication,” Proceedings of the Society for Computation in Linguistics: Vol. 3, Article 5.

Phonology of reduplication and pseudo-reduplication

Although it is known that reduplication is more complex than the rest of phonology, there is little work on the computation of reduplication-sensitive phonology. As a response to Hayes & Jo (ms.), we argue that these processes are regular. Regularity is ensured because phonology has access to the morph boundaries created by reduplication and by pseudo-reduplication. We also show the differences in the computation of copying across linguistic modules.

  • NecPhon slides and paper [TBA]

Multi-input processes

With Jonathan Rawski, I’ve recently used multi-tape transducers to look at the computational power needed to model certain types of phonological processes that take multiple inputs.

Templatic morphology in Semitic

Semitic templatic morphology or root-and-pattern morphology is classically studied as a process of combining multiple input morphemes. In collaboration with Jonathan Rawski, we discovered how the subregular hierarchy for single-tape finite-state transducers can be extended to multi-input functions like template formation in Arabic verbs. We formulated the class of local multi-input functions and multi-tape transducers.

  • SCiL slides and PLC paper (link)
    Dolatian, Hossep and Jonathan Rawski (2020) “Finite-State Locality in Semitic Root-and-Pattern Morphology,” University of Pennsylvania Working Papers in Linguistics: Vol. 26: Iss. 1, Article 10.
  • A more detailed SCiL paper (link)
    Dolatian, Hossep and Jonathan Rawski (2020) “Multi-Input Strictly Local Functions for Templatic Morphology,” Proceedings of the Society for Computation in Linguistics: Vol. 3, Article 28.
  • Synthesis paper with tone (link) that’s very deep in review and re-writing.
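
As a rough illustration of what a multi-input function looks like, the Python sketch below treats the consonantal root, the vocalism, and the CV template as three separate inputs and builds the output slot by slot. The function name `interdigitate` and the treatment of vowel spreading (reusing the last vowel once the vocalism runs out) are assumptions made for this example, not the formal definitions from the papers.

```python
# Illustrative sketch only: root-and-pattern combination as a function
# over three inputs. The function name and the spreading rule are
# hypothetical simplifications.
def interdigitate(root: str, vocalism: str, template: str) -> str:
    r = v = 0  # read positions on the root tape and the vocalism tape
    out = []
    for slot in template:
        if slot == "C":
            if r >= len(root):
                raise ValueError("template has more C slots than the root has consonants")
            out.append(root[r])
            r += 1
        elif slot == "V":
            if not vocalism:
                raise ValueError("empty vocalism")
            # Assumed spreading: once the vocalism is used up, repeat its last vowel.
            out.append(vocalism[min(v, len(vocalism) - 1)])
            v += 1
        else:
            raise ValueError(f"unknown template slot: {slot!r}")
    return "".join(out)


print(interdigitate("ktb", "a", "CVCVC"))   # katab
print(interdigitate("ktb", "ui", "CVCVC"))  # kutib
```

Each output symbol is decided by the current template slot plus a bounded amount of information on the other two inputs, which is the intuition behind calling such functions local.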

Tone and Tonal phonology

As an outgrowth of our work on Semitic morphology, Jonathan Rawski and I have extended the use of multi-tape transducers to tone, tonal phonology, and autosegmental structure. As with templatic morphology, a large chunk of tonal phonology can be computed with a restricted type of local multi-input functions and multi-tape transducers.

  • SCiL poster and paper (link)
    Rawski, Jonathan and Hossep Dolatian (2020) “Multi-Input Strict Local Functions for Tonal Phonology,” Proceedings of the Society for Computation in Linguistics: Vol. 3, Article 25.
  • Synthesis paper with root-and-pattern morphology (link) that’s very deep in review and re-writing.

Formal definition of n-regular functions

Locality in multi-input functions is a significant computational generalization in templatic morphology and tone. In collaboration with Adam Jardine and Tadjou-N’Dine Mamadou Y., Jon Rawski and I are expanding and refining our initial formal definitions of locality in multi-input functions.

  • Paper (TBA)

Finite-state formalisms beyond reduplication

Besides reduplication, I’ve also explored the use of finite-state devices in other morphophonological areas.

Tier-based computation in vowel harmony

In collaboration with Yiding Hao and Samuel Andersson, we’ve looked at how vowel harmony patterns can be computed over tiers. We show that harmony patterns are locally computed over a projected vowel tier. The composition of multiple harmony rules is however non-local.

  • AMP poster and proceedings (link)
    Andersson, Samuel, Hossep Dolatian, and Yiding Hao (2020) “Computing vowel harmony: The generative capacity of search & copy.” In Proceedings of the Annual Meetings on Phonology. (11–22)
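
The sketch below gives a toy Python version of tier-local harmony, assuming a simplified Turkish-like backness system. Consonants are ignored, and each underspecified suffix vowel (written `A`) copies the backness of the vowel that immediately precedes it on the vowel tier. The vowel inventory, the `A` notation, and the function name are assumptions made for this example. It is not the search & copy formalization from the paper.

```python
# Illustrative sketch only: progressive backness harmony computed over a
# vowel tier. The toy inventory and the "A" archiphoneme are assumptions.
FRONT = set("ei")
BACK = set("aou")


def harmonize(word: str) -> str:
    out = []
    last_backness = None  # backness of the previous vowel on the tier
    for seg in word:
        if seg == "A":
            if last_backness is None:
                raise ValueError("no preceding vowel to harmonize with")
            seg = "a" if last_backness == "back" else "e"
        # Only vowels update the tier memory; consonants are skipped.
        if seg in FRONT:
            last_backness = "front"
        elif seg in BACK:
            last_backness = "back"
        out.append(seg)
    return "".join(out)


print(harmonize("evlAr"))     # evler
print(harmonize("kitaplAr"))  # kitaplar
```

Because the function remembers only the most recent tier element, any number of consonants can intervene between trigger and target without increasing the memory it needs.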

Computation of prosodic phonology

Finite state isn’t enough for recursive prosody

Although finite-state devices are useful, they can’t do everything. In collaboration with Aniello De Santo and Thomas Graf, we found that unbounded recursive prosodic structures are not finite-state definable. Such structures are often argued for in prosodic phonology.

  • SIGMORPHON proceedings (link)
    Dolatian, Hossep, Aniello De Santo, and Thomas Graf (2021) “Recursive prosody is not finite-state.” In Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, vol. 8. https://doi.org/10.18653/v1/2021.sigmorphon-1.2

Prosodic mappings from syntactic trees

With Mai Ha Vu and Aniello De Santo, we looked at how syntactic trees can be mapped to prosodic trees using logical transductions. As a case study, we focused on ditransitive sentences because they have a rich typology.

  • SIGMORPHON paper (pdf, link) and slides
    Mai Ha Vu, Aniello De Santo, and Hossep Dolatian (2022) “Logical Transductions for the Typology of Ditransitive Prosody”. In Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 29–38. https://aclanthology.org/2022.sigmorphon-1.4

Formal logic for phonology & morphology

Besides using finite-state formalisms, I also work on applying formal logic to describe phonological and morphological phenomena. I specifically use monadic second-order graph-to-graph transductions. By doing so, my dissertation extends One-Level work in Declarative Phonology to “Two-Level Declarative Phonology”. Check out Kristina Strother-Garcia’s work too!

Logical structure of cyclicity in morphophonology

For my dissertation, I used formal logic to formalize the phonology-morphology interface. I formalized the generation and modification of prosodic and morphological trees. Cyclicity is a common theoretical tool but has few computational implementations. I showed how we can define domains for phonological rules (cophonologies) and apply these domains in a cyclic framework.

  • Dissertation (here)
    Dolatian, Hossep (2020) “Computational locality of cyclic phonology in Armenian.” PhD diss., State University of New York at Stony Brook
  • Paper (draft) that’s very much in the re-writing phase

Locality of iterative prosody and epenthesis

With Nate Koser, Jonathan Rawski, and Kristina Strother-Garcia, we showed that iterative prosodic processes are computationally local over recursive formulas. We focused on iterative stress, syllabification, and epenthesis.

  • AMP poster and proceedings (link, paper)
    Dolatian, Hossep, Nate Koser, Jonathan Rawski, and Kristina Strother-Garcia (2021) “Computational Restrictions on Iterative Prosodic Processes.” In Proceedings of the Annual Meetings on Phonology, vol. 9. https://doi.org/10.3765/amp.v9i0.4920
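
To give a flavor of what "local over recursive formulas" can mean, here is a toy Python sketch of an alternating stress pattern. Whether syllable i is stressed is defined recursively in terms of the output for syllable i-1 alone, so the definition refers to its own earlier output but only within a window of one syllable. The specific pattern (stress the first syllable, then every other one) and the function name are assumptions made for this illustration and are not an analysis from the paper.

```python
# Illustrative sketch only: iterative stress as a recursive, output-local
# definition. The stress pattern itself is a hypothetical toy example.
def iterative_stress(n_syllables: int) -> list[bool]:
    stressed: list[bool] = []
    for i in range(n_syllables):
        # stressed(i) holds iff i is initial or syllable i-1 is unstressed.
        stressed.append(i == 0 or not stressed[i - 1])
    return stressed


print(iterative_stress(5))  # [True, False, True, False, True]
```

The stress on a late syllable depends indirectly on the left edge, which can be arbitrarily far away, yet each individual step consults only the immediately preceding output.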

Strong generative capacity of morphology

With Jonathan Rawski and Jeffrey Heinz, we looked at the strong generative capacity of different morphological theories using formal tools like origin semantics and order-preservation. Focusing on complex phenomena like infixation and others, we found that some morphological theories require more expressivity than others in order to directly capture certain generalizations.

  • SCiL paper (pdf, link) and slides
    Dolatian, Hossep, Jonathan Rawski, and Jeffrey Heinz (2021) “Strong Generative Capacity of Morphological Processes,” Proceedings of the Society for Computation in Linguistics: Vol. 4, Article 22. https://doi.org/10.7275/sckf-8f46
  • For applications to reduplication, see the previous reduplication section.

Tree-based representations for allomorphy

With Shiori Ikawa and Thomas Graf, we looked at how morphological allomorphy can be computed over strings vs. over trees. Local cases can be handled easily using local finite-state transducers (over strings) and tree transducers. But for non-local allomorphy, more refined types of tree transducers are needed, depending on whether the allomorphy is inwardly-sensitive or outwardly-sensitive.

  • SIGMORPHON paper (pdf, link) and slides
    Dolatian, Hossep, Shiori Ikawa, and Thomas Graf (2022) “Trees probe deeper than strings: an argument from allomorphy”. In Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 51–60. https://aclanthology.org/2022.sigmorphon-1.6

 
