Reply to comments on Universal Morphology

One perk of academic life is the gratification of high-calibre intellectual exchanges which emerge haphazardly in personal discussions between colleagues, within and across disciplines. The hope of experiencing more such exchanges led me early on to be an avid reader of journals such as Behavioral and Brain Sciences and others which published comments on their articles and replies to those comments. Great was my surprise as a young researcher to read senior professors calling into question each other's basic comprehension skills and using rhetoric of the type that has now come to be associated with the comment sections of social media. Perhaps in an effort to preserve intellectual joy, I have come to believe in a 'let a thousand flowers bloom' approach to science. Let researchers develop the best science they can, be delighted by each other's creativity, and let posterity decide which approach led closer to truth. The field of the human capacity for language is vast; it has more than enough space for the various approaches currently under consideration. This stance has kept me away from public cross-theory comparisons and comments, while I enjoy the ideas and theories carefully expounded in talks and articles that I end up disagreeing with. I'll therefore try to keep to clarifications and to pointing out trade-offs – or at least that is the intended reading of the following remarks.


The creation of morphemes
The essence of nanosyntax, and of the first few minutes of the talk under discussion, is that morphemes are created by syntax: recursive grouping (merging) of single features creates articulated constituents corresponding to a single morpheme. To put it perhaps too plainly: there is a regular syntactic structure inside the belly of each morpheme. A happy discovery is that this allows us to entirely bypass some traditional problems of morpho-syntax: portmanteaus just happen to have a bigger constituent in their belly, and allomorphy and suppletion happen when the same or similar enough concepts happen to have slightly different sizes of syntactic constituents in their belly. A corollary is that arranging morphemes into words, and words into sentences, is the wrong starting place from which to understand language and morphology – arranging features into morphemes (and words, sentences and paragraphs) is where the light shines.
There is by now a relatively traditional idea that syntax is responsible for the arrangement of morphemes into words. Within modern generative syntax, influential proposals along these lines include Baker's Mirror Principle (Baker 1985) and Pollock's proposal about verb movement as interpreted by Belletti (Pollock 1989, Belletti 1990), but also work by many other scholars of that time, such as Brody, Cinque, Rizzi, Roberts and others. I will refer to this syntactic school of thought as Arranging Morphemes in Syntax, or les amis for short.
Perhaps the deepest and most frequent misunderstanding around nanosyntax is that it does just the same as les amis: it merely arranges morphemes into words in syntax, repeating what was done 30 years ago. This is on display in a couple of comments; for instance, Manzini states that "Nanosyntax is meant to [...] entirely reabsorb morphology under syntax. But it is neither alone nor first in advancing this idea. Two decades ago, Manzini and Savoia (2002) suggested that phenomena generally deemed to be morphological, such as those involving the internal organisation of the clitic string, are in reality purely syntactic."
Interesting as it is (which is very), the work by Manzini and Savoia just mentioned belongs to what I here call les amis: it limits itself to arranging morphemes (here clitics) in syntax, and does not take the step that makes nanosyntax nano: creating individual morphemes in syntax by mergers of single features, and thereby resolving issues such as portmanteaus, suppletion, *ABA or allomorphy. In fact, nanosyntax goes further and entails that les amis are wrong: for les amis, morphemes are syntactic terminals, i.e. syntactic primitives, while nanosyntax claims that having morphemes (or words) as syntactic primitives is what has led linguistics astray.
The same is true of Thornton's claim that nanosyntax is "strikingly parallel to just items (lexicon) and arrangement (syntax)", and this "since [in both,] items (lexical items / morphemes) are put together (i.e., "arranged", in the American structuralist terminology, or "merged", in the recent syntactically oriented parlance) in syntax". Again: nanosyntax claims that this view of Items & Arrangement provides the wrong starting point, obscuring the underlying regularity of morphology and of language in general. Having morphemes arranged in syntax is a secondary side-effect of opening a morpheme's belly and discovering regular syntax inside – a claim shared by none of the above (yet, one hopes!).

Where is morphology, again?
The tenet that a morpheme is created by recursive groupings of single features into syntactic phrases has another corollary: the operations creating morphemes (and, as a side-effect, arranging morphemes into words) are regular syntactic operations, nothing else. Take for instance Thornton's "it is not clear to me [...] what the evidence for the existence of last-resort movement is". Last-resort movement is a tool introduced by Chomsky (1995) into syntactic theory, used since in deriving a multitude of syntactic phenomena. The fact that this same tool is used by nanosyntax in the derivation of apparently irregular morphology is a good illustration of how regular syntactic operations are used in the derivation of surface morpho-syntactic irregularity. This is the polar opposite of approaches such as autonomous morphology, morphomics, and post-syntactic morphological components (or parallel morphological components, for that matter). Many of us agree that there is hidden regularity underlying morphology; we however locate that regularity in entirely different components of the grammar. Nano is firm in locating it entirely inside the syntax.²
In doing so, nanosyntax also constitutes a research program about the nature of syntax itself. Take for instance the fact that syntactic terminals are single privative features. This eliminates assumptions about valued/unvalued features, core to many versions of minimalism. Implicit assumptions such as Di Sciullo's are thus a misunderstanding: "Why would morpho-syntactic derivations differ from syntactic derivations [in Chomsky 1965, Yang 2016, Chomsky 2017], in a framework where morpho-syntax is syntax?". This implicitly assumes that all syntax is of the "Chomsky 1965, Yang 2016, Chomsky 2017" type – which is clearly not the case.³
In fact nanosyntax aims to simplify the rich technological apparatus of Minimalism: no valued/unvalued features, no EPP features and accompanying postulates about movement triggering, no phases (and hence no stipulations about edges of phases), no Agree operation, etc.

² There seems to have been some misunderstanding based on the remark in the talk that [some facts] "have led people to give up on (irregular) morphology". This is an invited talk for a syntax slot, given at one of the leading conferences in generative grammar, by a generative syntactician. The intended reading of that remark is thus that syntacticians who have tried their hand at morphology have indeed typically given up on phenomena such as English irregular verbs, French suppletive roots, etc. – and not the unlikely (and unintended) reading that morphologists have given up on (regularity in) morphology.

³ Di Sciullo also makes the claim that merging features in binary mergers (as in nanosyntax) is more complex than merging them in a flat structure (as in bundles): "This hypothesis further extended functional projections, see (3), and add complexity, defined in terms of length of the derivation, to the derivations". Putting aside the factual accuracy of that claim, two considerations need to be kept in mind in discussions of computational complexity. First, cherry-picking where one looks for complexity (e.g. "length of derivation") leads one to overlook complexity induced elsewhere (e.g. the introduction of a different module and the concomitant cross-module coordination costs, the complexity costs of accessing features hidden inside bundles or of handling bundles to begin with, etc.) – resulting in inaccurate complexity assessments. More importantly perhaps, it is well understood in computational discussions of complexity that results are often counterintuitive, and hence informal reasonings are of limited use. Only a thorough formalisation, or thorough measurements within limited bounds, can move the discussion towards reliable conclusions.
In place of that, nanosyntax seeks to squeeze juice out of two assumptions needed by everyone. First, any approach needs some technology to express that the order of mergers is highly constrained. Complementisers are merged above/after the verb of the corresponding clause, never below: C > V. Some tense distinctions are always merged before complementisers, never after: C > T > V. Aspectual features are merged before tense features: C > T > Asp > V. The same is true of agentive, mood, evidential and many other features. It is traditional to ask 'why that particular order', and I have seen no plausible answer so far – but what is usually overlooked is that the existence of this ordering allows us to drive the building of derivations and the triggering of movements, making it unnecessary to postulate technology such as the EPP, uninterpretable features, and their concomitant assumptions about movement triggers – greatly simplifying minimalism and adjacent theories.
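The first point can be caricatured in a few lines of Python. This is a toy illustration only: the four-member sequence, the feature labels, and the tuple encoding of trees are simplifications invented here, not the actual formalism. The point it makes is that a fixed functional sequence by itself suffices to drive and constrain the order of mergers, with no extra triggering technology.

```python
# Toy sketch (not the actual nanosyntax formalism): a fixed functional
# sequence drives the order of mergers, so no separate triggering
# features (EPP, uninterpretable features) need to be postulated.

FSEQ = ["C", "T", "Asp", "V"]  # C > T > Asp > V, merged bottom-up from V

def build(features):
    """Merge privative features bottom-up; reject any out-of-fseq order."""
    tree = None
    last_rank = len(FSEQ)  # rank of the previously merged feature
    for f in features:
        rank = FSEQ.index(f)
        if rank >= last_rank:
            raise ValueError(f"cannot merge {f} here: fseq violation")
        tree = (f, tree)  # binary merge: [f tree]
        last_rank = rank
    return tree

print(build(["V", "Asp", "T", "C"]))  # ('C', ('T', ('Asp', ('V', None))))
# build(["V", "C", "T"]) would raise: T cannot be merged above C
```

The derivation either follows the fseq or crashes; nothing else decides when a feature may merge.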
Second, any approach needs some way of mapping syntactic representations onto gestures (ultimately yielding sounds or other physical artefacts) and concepts. Again, nanosyntax pursues the conjecture that a lot more juice can be squeezed out of that shared assumption than is commonly understood – through what we call the lexicalisation algorithm – hopefully simplifying the theoretical apparatus of other approaches.
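To give a flavour of what such lexicalisation involves, here is a deliberately simplified Python sketch: syntactic trees are flattened to bottom-up feature tuples, the exponents and their stored structures are invented for illustration, and the real algorithm operates on full trees. A stored entry can spell out a constituent when it contains that constituent's structure as its bottom layers, and competition between entries is resolved by an Elsewhere-style preference for the closest fit.

```python
# Toy sketch of lexicalisation: an entry matches a constituent when it
# contains the constituent as its bottom layers; the Elsewhere-style
# winner is the entry with the fewest unused layers. All entries are
# invented placeholders, and trees are flattened to feature tuples.

LEXICON = {
    # exponent: stored syntactic structure (bottom-up feature tuple)
    "-ed":  ("V", "Asp", "T"),
    "-ing": ("V", "Asp"),
    "went": ("V", "Asp", "T", "Agr"),  # a 'big' suppletive entry
}

def matches(constituent, entry):
    """The entry contains the constituent as its bottom layers."""
    return entry[: len(constituent)] == constituent

def lexicalise(constituent):
    """Spell out the current constituent; among competing entries,
    prefer the closest fit (fewest unused layers)."""
    candidates = [(exp, entry) for exp, entry in LEXICON.items()
                  if matches(constituent, entry)]
    if not candidates:
        return None  # spellout failure: would trigger e.g. movement
    return min(candidates, key=lambda c: len(c[1]) - len(constituent))[0]

print(lexicalise(("V", "Asp")))       # -ing  (exact fit beats bigger entries)
print(lexicalise(("V", "Asp", "T")))  # -ed   (beats the bigger 'went')
```

A spellout failure (`None`) is where the derivation would crash and rescue operations such as last-resort movement become relevant.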
Nanosyntax thus puts all of morpho-syntax firmly inside syntax, but also aims for that syntax to be simpler and more elegant than traditionally assumed.
As is usual in research, different starting points lead to a focus on different parts of the data. Thornton for instance notes that viewing morphology as a separate domain is conducive to emphasising cross-paradigmatic patterns, as is beautifully done in Aronoff's work, or in work by Martin Maiden, Edwin Williams and others. Can that data be handled while unifying all of morpho-syntax inside a single principled generative component? Some of it can, and will e.g. be attributed to a functional element (an affix) recurring across paradigms and triggering the same effect in each one of them. Other parts can't, and will typically be attributed to diachrony (perhaps the "diachronic moulding force" Maiden discusses) rather than to the synchronic state we're investigating.
The synchronic aspects of these data are indeed an area of high interest to nanosyntacticians. For instance, some of us are involved in work on Latin paradigms, with an eye towards Aronoff's generalisations. Similarly for the other facts and work cited by Thornton – with more hours in a day, or more people participating, we would love to work on each and every one of those patterns: we think they will illuminate the theory of our syntactic competence, and I have tried to illustrate in the talk under discussion why I think this is a promising bet, despite a long tradition of skepticism. Our priorities however lie in other areas, ones that have been comparatively neglected by other morpho-syntacticians.
Which approach is correct, only time will tell.

The "Universal" in Universal Morphology
There is a lot to be appreciated in Anderson's frankness that his world-view and experience don't let him "get inside Starke's Weltanschauung" (I am less sure what to think of his confession that he hasn't watched the entire talk he is commenting on). There is one misunderstanding though that may be useful to the reader. Anderson informs us that nanosyntax "reminds me of" using "rules [which] happen to be formulated in such a way that they couldn't ever apply to anything except forms [under discussion]".⁴ Nanosyntax, and the talk, repeatedly insist on the fact that the same small universal algorithm is used in all of nanosyntactic work, whether it is on Ossetic case (Caha 2020) [...].
Putting these points together, the research programme under discussion is that syntax operates on individual features, recursively creating constituents large enough to be lexicalised by a morpheme, and does so using a small set of universal/invariant syntactic operations. The goal of the talk was to illustrate that this is not only feasible, but that it illuminates areas of morphology that have been traditionally difficult to handle in syntax, or in any regular way, such as root suppletion or portmanteaus.

Do you really?
A result that is dear to our hearts is that apparent irregularities are handled without adding any mechanism dedicated to them, or justified on their basis. Rather, we only use a general and universal syntax, a lexicon containing only well-formed syntax, and a small set of invariant principles relating the two (essentially the Elsewhere Principle and cyclic lexicalisation). Surprisingly, De Belder challenges this: "Do not be mistaken, the irregularity is there in the Nanosyntactic approach as well".
This assertion rests on the idea that irregularity in nanosyntax "is masked as a lexical item". What hides behind this is, I think, a common misunderstanding; let's get into it in some detail.⁵
First, there is a reading of the above quote under which it is correct, but not relevant: nanosyntax can handle surface irregularities, hence it has a way to represent those irregularities (as lexical items). This is for instance the reading of "irregularity" that De Clercq uses in her comment: "Irregularity (in roots or affixes) is cashed out in the system as the presence of lexically related lexical items whose structural size differs". In that sense the irregularities 'are there' – obviously, if one is explaining them – but without affecting the relevant point: there are no mechanisms dedicated to them, or justified on their basis.

⁴ Similarly for Anderson's other reminiscence that "when there are two rules, and the results are different depending on the order in which they are applied, apply the rules in such a way as to produce the correct form". The charge here is one of circularity – presupposing the result and choosing the rule as a function of the presupposed result. It is easy to imagine that this may be the way it looks when one didn't "get inside" the talk enough to follow the mechanics of the proposal. But even without understanding, watching the end will reveal a computer implementation of the algorithm, a fact clearly incompatible with Anderson's aspersion. Indeed, nothing could be further from the truth – we strive to have not only a principled but also a formalised and testable approach to morpho-syntax.

⁵ De Belder adds a number of other surprising assertions, from the idea that a talk about morpho-syntax (slide 3, minute 3:00) should rather handle morpho-phonology, to mischaracterisations of this research as handling the regular, inflectional part of morphology. Handling regular morphology is for instance what les amis do, for the most part, but it bears little resemblance to this talk, to nanosyntax, or to research on root suppletion and similar topics.
Seeing what it looks like to have some mechanism dedicated to irregularities (or justified by them) would perhaps help. Let's take the example of Distributed Morphology (DM). In DM, morphemes (qua syntactic bundles) "are put together ... in syntax", to reuse Thornton's description. Typically DM adopts a so-called "minimalist" flavour of contemporary syntax, and hence uses tools such as Agree, phases, Merge, etc. The relevant question is then: does DM handle portmanteaus, suppletions, etc. using nothing but those syntactic tools? Or does it add something to its grammatical toolset in order to deal with them? As is well known and well commented upon, it adds a large toolset⁶ – what De Belder calls "the elephant in the room", and which she describes as: "Distributed Morphology has this post-syntactic module called Morphology which hosts a number of operations, including Fission (Noyer 1997), Impoverishment (Bonet 1991), and several Merge operations, including Morphological Merger (Marantz 1984), Syntactic Lowering (Bobaljik 1994) and Local Dislocation (Embick and Noyer 2001)."
In addition to those tools, DM brings back language-specific, context-sensitive rules (in its Vocabulary Insertion mechanism), a device widely understood to be so powerful as to prohibit an explanatory theory.⁷ The justification for this reintroduction is to handle irregularities.
The substantive fact, then, is that nanosyntax has none of that: it adds no tool dedicated to or justified by irregular morphology (and no independent morphological system), using only regular syntax and a restrictive format of lexical entries, while still deriving the apparent irregularities of morpho-syntax. I take that to be a significant breakthrough, and we will keep it that way – or as close as possible to that way.

Where do lexemes come from?
If lexemes contain syntactic trees, where do these trees come from? This is a frequent question, brought up by Borer in her comment: "the nature of lexical entries and how they are constructed is critical to the assessment of the system". We think of them as a 'garage of interesting syntactic structures', and language acquisition (of syntax) is the act of populating that garage. When the learner encounters an "interesting" structure built by syntax, she may decide to keep a (frozen, static) copy of that syntactic structure in her garage. Syntax is the dynamic component building structures; the lexicon is a static repository of such structures.

⁶ See Caha (2019) for article-length comparisons between DM and nanosyntax.

⁷ Whence the efforts to at least partially rein it in, e.g., Bobaljik (2000), Embick (2010), Choi & Harley (2019).
Which structures count as interesting? Here we are back on familiar territory: 'interesting' is mostly determined by predictability. If the meaning or pronunciation/gesturing of a structure is not predictable, it needs to be stored in the garage, associated with its meaning or pronunciation/gesturing. Other factors may make a structure interesting: the learner may decide to store compositional but very frequent structures – an issue more often discussed in psycholinguistics or neurolinguistics.

Where is the discussion of the data?
Almost none of the commentators engaged with the empirical topic of the talk: irregular verbs (in French), their suppletion patterns, their portmanteau patterns, their syncretisms. One exception is in Thornton's comment, which asks about the derivation of present tense structures: At minute 17:34 it is said that no lexical item has the sequence [#[T …]] in it ("none of the lexical items down here has number and T as their lower layers"); therefore, a derivation containing this sequence crashes, and must be rescued by "Last-resort movement", that moves the T head to the left of the # head. However, the lexical item ɛ introduced at minute 19:44 does contain exactly the sequence [#[T …]] in the lowest layers of its tree. It is unclear to me why the derivation would crash at minute 17:34 for lack of a lexical item matching it, if this lexical item does indeed exist, since it comes up a few minutes later. I can understand that this item ɛ was not introduced earlier in the talk for expository reasons, but it should be present in the mental lexicon of a mature French speaker at the same time in which õ, e, ə and i are presentso the derivation shown at minute 17:34 should not crash, in my view. I am very interested in knowing what I have missed: that might explain why the crash would happen.
The answer lies in the full representation of ɛ, which is given at minute 24 of the talk.
Beyond French irregular verbs, Di Sciullo asks how "privative aspect subfeatures [would] account for simple contrasts such as courir ('to run') / accourir", and how such an approach would account for the different expression of directional meanings across languages: "For example, contrary to English, in Italian, the preposition a is [locative] with certain stative verbs, e.g., stare a scuola, 'stay at school', and [directional] with certain activity verbs, e.g., andare a scuola, 'go to/*at school'". One simple type of answer is that the directional feature situated between the verb and the noun is lexicalised differently in different languages: together with the verb in some languages, by a preposition in other languages, yielding the core of Talmy's distinction between satellite-framed and verb-framed languages (see Fábregas 2007, Son & Svenonius 2008, Son 2009 for discussion). Similar considerations apply in the aspectual domain brought up by Di Sciullo.
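That type of answer can be caricatured in code. The sketch below is a toy under invented assumptions – the feature labels, the greedy covering procedure, and the two miniature lexicons are illustrative placeholders, not the actual analysis – but it shows the core idea: the same structure is carved up by one exponent in a verb-framed lexicon and by two in a satellite-framed one.

```python
# Toy caricature of the satellite-framed vs verb-framed split: the same
# V > Dir structure is covered differently by different lexicons.
# All entries are invented placeholders; trees are feature tuples.

LEXICONS = {
    # satellite-framed: the directional feature gets its own exponent
    "English": {("V",): "go", ("Dir",): "in"},
    # verb-framed: the directional feature is lexicalised with the verb
    "Spanish": {("V", "Dir"): "entrar"},
}

def spell_out(structure, lexicon):
    """Greedily cover the structure with the largest spans the lexicon stores."""
    out, i = [], 0
    while i < len(structure):
        for size in range(len(structure) - i, 0, -1):
            span = structure[i : i + size]
            if span in lexicon:
                out.append(lexicon[span])
                i += size
                break
        else:
            raise ValueError(f"no exponent covers {structure[i]}")
    return " ".join(out)

print(spell_out(("V", "Dir"), LEXICONS["English"]))  # go in
print(spell_out(("V", "Dir"), LEXICONS["Spanish"]))  # entrar
```

The structure is constant across the two languages; only the sizes of the stored lexical entries differ.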
Borer brings up the interesting case of templatic verbal systems. Part of her discussion revolves around the idea that each cycle of lexical access must "constitute a well-formed phonological word". This may be so in other approaches, but it does not follow from the nanosyntax setup so far, and hence "the existence of intermediate (syntactic) derivational stages with a partial structure which does not, and indeed cannot constitute a well-formed phonological word" is prima facie not an issue. Similarly, Borer takes the "discontinuous representations" involved in templatic phenomena to be a challenge to "bottom-up lexical representations". Given the well-known tier-based proposal that such "discontinuous representations" are underlyingly concatenative, it is not clear that this fact is a particular challenge. There are in fact at least three people working on deriving the details of templatic verbal paradigms from nothing-but-fine-grained-(nano)syntax – Nico Baier on Kabyle, Gioia Cacchioli on Tigrinya, and Abdullah Nabil Almahaish on Hasawi Arabic. Similarly for the Italian gemination facts mentioned by Thornton: a phonologist and two syntacticians are currently looking into related Italian issues of root suppletion – Edo Cavirani, Maria Cortiula and myself.
So, as in the talk, I'll end with an invitation to join: if the prospect of a fully principled explanation integrated with syntax appeals to you, feel free to contact me and join our online working groups. Our activities involve not only theory, but also the building of a comprehensive database of root suppletions and minor ("irregular") paradigms across a range of languages, as well as computer modelling. You'll find my email at the beginning of this note.