Explaining Morphological Irregularities

Michal Starke’s invited talk at NELS 51, entitled “Universal Morphology”, addresses the following question: should morphological “irregularities” be analyzed as exceptions, or should we try to find some sort of regularity in them? His answer is that we should look for regularity, and that it can be found with the appropriate theory. He illustrates his point by providing derivations for irregular morphology in French finite verbs, focusing on suppletive roots and portmanteau suffixes in the paradigms of the present, subjunctive and past tense, see (1) below.

principles, indicating that the nano algorithm successfully parses a larger set of French irregular verbs than the ones discussed in his presentation.
Nanosyntax provides interesting hypotheses on the architecture of grammar: the atoms, rules and principles interacting in the derivation of linguistic expressions. In the following paragraphs I ask: In what sense does it provide an explanation of apparent morphological irregularities? In what sense does parsing large sets of data add to the understanding of I-language, the language internal to the mind?

Explanation in linguistic theory
The question whether we should analyze morphological irregularities as exceptions or try to find some sort of regularities in them has been addressed in Generative Grammar since its beginnings.
Different hypotheses have been put forward on whether morphological irregularities are located in the lexicon (Chomsky 1970), and on whether morphological derivations are located in a dedicated morphological component (Halle 1973), distributed across different components of the grammar (Halle and Marantz 1993), or derived in syntax (e.g., Starke 2009), to name a few. The question is to what extent these hypotheses contribute to our understanding of I-language.
According to recent minimalist thinking (Chomsky, Gallego and Ott 2019), an explanatory theory of I-language reduces its central operation MERGE to its simplest form. Simplicity is a basic methodological principle of science. In addition, a truly explanatory theory of I-language should also meet the criteria of evolvability and acquirability (Chomsky 2019). Children rapidly develop the language to which they are exposed, notwithstanding the poverty of the stimulus. Archeological records indicate that I-language evolved rapidly in Homo sapiens, while the externalization of linguistic expressions may be part of an ancillary system predating the emergence of I-language.
It is interesting to consider nanosyntax in this perspective and ask how different it is from alternative theories and how explanatory it is.

Explanation in a theory
Nanosyntax incorporates insights from Distributed Morphology (Halle & Marantz 1993, et seq.), Asymmetry Morphology (Di Sciullo 2005, et seq.), Syntactic Cartography (Cinque 1999, Rizzi and Cinque 2017, et seq.), and the Minimalist Program (Chomsky 1995, et seq.). For example, the idea that MERGE applies across the board is a common assumption in these frameworks. The hypothesis that morphological elements project their own structure, which takes the form of "minimal trees", is investigated in Asymmetry Morphology at the morphemic level (Di Sciullo 2005, et seq.). Yet nanosyntax differentiates itself from these frameworks. For example, as is the case in Asymmetry Morphology, submorphemic elements spell out morphosyntactic trees. In nanosyntax, however, only one privative feature heads each morphosyntactic tree, and the lexicalization of the latter is subject to universal principles of the grammar (UG). These hypotheses lead to questions open for discussion. I briefly detail a few of them below.

Features and trees
According to nanosyntax, mismatches between surface chaos (morphological irregularities) and underlying basic order are attributable to the assumption that syntactic heads host bundles of features, see (2). Once these bundles are eliminated, apparent morphological irregularities can be derived from uniform principles of UG. This hypothesis further extends functional projections, see (3), and adds complexity, defined in terms of the length of derivations and projections (compare (2) and (3), from Michal Starke's presentation).
(2) (3) (4) [examples from Michal Starke's presentation]
Furthermore, it is unclear how morpho-conceptual features can be elegantly accommodated. In nanosyntax, pieces of inflectional morphology include Gricean categories analyzed as aspectual features (e.g., Speaker, Participant (Spk)), as well as aspectual and tense features (e.g., Aspect, Past), see (4) and (5) from Michal Starke's presentation. Additional features are needed for finer-grained semantic distinctions, such as the set of eventualities (state, activity, achievement, and accomplishment), decomposed into sub-features, i.e., terminus and subinterval. How would these privative aspect sub-features account for simple contrasts such as courir (to run)/accourir (to flee), construire (to construct)/*aconstruire, naître (to be born)/*anaître, ressembler (to resemble)/*aressembler (see  for discussion)?
Furthermore, languages vary with respect to the conceptual features associated with a functional head. For example, contrary to English, in Italian the preposition a is [locative] with certain stative verbs, e.g., stare a scuola, 'stay at school', and [directional] with certain activity verbs, e.g., andare a scuola, 'go to/*at school', see Di Sciullo (2019-20). Here again it is unclear how the facts would follow without adding complexity to the derivations. More broadly, the question arises whether such complexity could be tractable by the human brain, given its limited resources. Recent work in neuroscience indicates that the brain eliminates part of the complexity brought about by the sensorimotor system. It might be the case that a morphemic, instead of a sub-morphemic, spell-out of morphosyntactic trees could contribute to reducing sensorimotor complexity.

Universal principles
Lexicalization is taken to be a (partial) matching relation (<->) between a lexical tree and its spell-out, as depicted in (4) and (5) above. According to nanosyntax, Lexicalization is subject to universal principles, see also Starke (2009) for discussion. The general principle for Lexicalization is that an XP can lexicalize if the same XP, or part thereof, exists in the lexicon. This, however, requires a complex search in the lexicon, comparing trees and sub-trees in order to select the best matching candidate. Furthermore, Merge has three options for lexicalization, see (6).
(6) A. Do not move at all and lexicalize.
B. Move (two ways of moving) and lexicalize (might make the wrong choice).
C. Backtrack to the previous cycle, undo last-resort movement, and take the second choice.
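To make the cost of the lexicon search concrete, the matching relation described above can be sketched as follows. This is a toy illustration under stated assumptions, not Starke's nano algorithm: trees are reduced to a single spine of privative features without specifiers, the feature labels F1-F3 and the exponents "alpha" and "beta" are hypothetical, and "best matching candidate" is taken to mean the smallest stored tree containing the XP, which is an assumption made for the sketch. Movement and backtracking are not modeled.

```python
from typing import Dict, Iterator, Optional, Tuple

# A tree is a spine of privative features: (feature, complement),
# where the complement is another tree or None at the bottom.
Tree = Tuple[str, Optional[tuple]]


def subtrees(tree: Optional[Tree]) -> Iterator[Tree]:
    """Yield the tree itself and every constituent further down its spine."""
    while tree is not None:
        yield tree
        tree = tree[1]


def size(tree: Tree) -> int:
    """Number of feature heads in the tree."""
    return sum(1 for _ in subtrees(tree))


def lexicalize(xp: Tree, lexicon: Dict[str, Tree]) -> Optional[str]:
    """Return the exponent of the smallest stored tree that contains xp.

    Assumption: an XP matches a lexical entry if it is identical to the
    stored tree or to one of its sub-trees; ties are broken alphabetically.
    """
    candidates = [
        (size(stored), exponent)
        for exponent, stored in lexicon.items()
        if any(sub == xp for sub in subtrees(stored))
    ]
    return min(candidates)[1] if candidates else None


# Hypothetical lexicon: "alpha" stores [F3 [F2 [F1]]], "beta" stores [F1].
F1P: Tree = ("F1", None)
F2P: Tree = ("F2", F1P)
F3P: Tree = ("F3", F2P)
LEXICON = {"alpha": F3P, "beta": F1P}

print(lexicalize(F1P, LEXICON))          # both entries match; the smaller one wins
print(lexicalize(F2P, LEXICON))          # only the larger entry contains F2P
print(lexicalize(("F4", F3P), LEXICON))  # no stored tree contains this XP
```

Even in this reduced form, every call compares the XP against every sub-tree of every lexical entry, which is the kind of search the paragraph above refers to.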
One surprising aspect of (6) is that it includes backtracking, the option to countercyclically return to previous steps of a derivation. This contravenes the Markovian, deterministic nature of syntactic derivations assumed in generative grammar (Chomsky 1965, Yang 2016, Chomsky 2017). Why would morpho-syntactic derivations differ from syntactic derivations, in a framework where morpho-syntax is syntax? Furthermore, the application of this operation differs from current theorizing in the Minimalist Program (Chomsky, Gallego and Ott 2019), according to which MERGE applies freely: it is not subject to principles internal to UG/I-language, but rather to principles of efficient computation external to UG/I-language. It might be the case that options such as the ones in (6) fall into the realm of E-language, which is assumed to be subject to variation and performance factors of different sorts, including hesitations and false starts. This question is not addressed in Michal Starke's presentation.

Language acquisition and variation
Questions also arise with respect to the acquisition of morpho-syntax by the child endowed with the rules and Principles of nanosyntax. How would (s)he learn apparently irregular verbal inflection in French, or in other languages? What would make nanosyntax better in explaining the child's language acquisition path?
Language acquisition is the primary source of variation (Chomsky 2005). In the Minimalist Program, variation follows from AGREE and feature valuation. In nanosyntax, however, variation reduces to the size of lexically stored trees. Why would comparing tree sizes be a more elegant account of the variation resulting from language acquisition or language contact than one that derives variation from simplest MERGE and third-factor principles of efficient computation, such as AGREE and feature valuation? This question is not addressed in Michal Starke's presentation either, but see Starke (2011) for discussion.

Parsing
The fact that a grammatical system can be implemented in a parser, a technological system, does not provide any explanation of the properties of a biological system, such as I-language. The parts of the human brain dedicated to language have access to limited resources. The brain does not have the computational capacity to parse very large quantities of data in a very limited time frame. Nevertheless, it would have been interesting to hear more details on the nano algorithm. This might have led to questions about its computational efficiency when compared with morpho-syntactic parsers implementing morphemic feature agreement in minimal trees, such as the one discussed in Di Sciullo and Fong (2005).

The road ahead
The understanding of morphological irregularities has been part of Generative Grammar's research agenda since its beginnings. Several working hypotheses have been identified, along with interesting insights on their derivations, such as the ones brought forward by nanosyntax. The latter invites further discussion in view of a truly explanatory theory of I-language.