How We Do It


Bottom-Up… All The Way!

Biolinguistics is, by definition, a biological investigation of linguistic issues. A proper appreciation of this definition requires clarity on two points: what is meant by linguistics, and how that meaning is amenable to biological investigation. In 1976 the Nobel laureate biologist Salvador Luria suggested that the theory of linguistics developed by Chomsky and colleagues is particularly amenable to confrontation with biological evidence. The theory Luria refers to, however, differs substantially from what is often called linguistics in other circles (sociolinguistics, applied linguistics etc.), in that a Generative view of linguistics is solely concerned with understanding Language as an organ – Language in the singular sense of a biological phenomenon restricted to a single species (cf. Cutler, 1988, Why Not Abolish Psycholinguistics, on frequent misconceptions about differing disciplinary objectives in interdisciplinary research). Early work in this tradition largely set the biological part aside as a promissory note and focused on explicating the formal properties that characterize this ability. The note concerns the fact that whatever formal computational properties a Generative Grammar arrives at must eventually be accommodated within a biological framework compatible with our understanding of the human brain and human genetics. A theory of Generative Grammar, then, is a theory of an organ of the human brain – the faculty of Language (FoL).


We investigate the FoL in three broad sub-domains: (a) its origin within the Homo sapiens species, (b) the developmental trajectory it triggers within an individual, and (c) the ways in which it constrains the actual use of languages in communication. Traditionally, data has been gathered mostly from the “actual use of language”, severely limiting the scope of the resulting theory. In reality, though, data for linguistics comes from all three levels. The grammar, a biological ability innate to the species, determines what is (or is not) computable. The developmental pathway of the brain speaks to the gradual growth of our linguistic processing/parsing abilities, and also informs the neural anatomy required to support them. Finally, a proper appreciation of the synchronic and diachronic nature of typological patterns helps determine, and delineate, the effects of grammar from the forces of historical change.

With the advancement of neuro-imaging techniques, as well as breakthroughs in bio-informatics, genetics and the computational sciences, we are now in a better position to build a converging theory of Language that addresses it convincingly at all three levels. Here a convergence of formal and experimental methodologies appears not just plausible but very fruitful, provided the experimental models are formally motivated. This is what we attempt in our research, with the goal of formulating a purely bio-computational account of grammar that makes no reference to grammar-external factors (e.g. well-formedness, historical language change, language contact etc.).


Formal investigations are useful for understanding the precise computational nature of linguistic processes. The processes themselves have been studied descriptively over the last fifty or so years of Generative Grammar, but a formalist inquiry seeks to go beyond typological attestability and tries to uncover the causal forces that make linguistic processes the way they are. Consider, for example, the requirement that all clauses must have a subject. In Linguistic Theory this requirement is formalized as the Extended Projection Principle (EPP). Why Universal Grammar (UG) should contain such a stipulation remains mysterious. The issue is compounded further in light of the minimalist program, which sets forth a strict requirement of computational economy – perform an operation only if you must, and then do so in the most optimal/cost-effective way possible.

MERGE, a major hypothesis in this regard, is widely attested in neurobiological research into online structure building in the brain (Friederici and Kotz, 2003; Zaccarella and Friederici, 2015), and is taken to be a simple set-building combinatoric mechanism. Simply put, MERGE takes two units, say A and B, and combines them into an unordered set containing both (and only) those units.

1. MERGE(A, B) → {A, B} (= {B, A})
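As a toy illustration (our own sketch, not an implementation from the literature), MERGE can be modeled as a function returning an unordered two-element set, which makes its two defining properties concrete: the order of arguments is irrelevant, and its output can feed back into the operation recursively.

```python
# Toy model of MERGE: combine exactly two syntactic objects into an
# unordered set. frozenset makes the result hashable, so merged
# objects can themselves be merged again (recursion).

def merge(a, b):
    """MERGE(A, B) -> {A, B}; binary and order-insensitive."""
    return frozenset({a, b})

# Order of arguments is irrelevant: {A, B} = {B, A}
assert merge("A", "B") == merge("B", "A")

# The output of MERGE can be re-merged, yielding nested,
# hierarchically structured sets.
nested = merge(merge("the", "book"), "read")
assert merge("the", "book") in nested
```

The lexical items here are placeholders; the point is only the set-theoretic behavior of the operation.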

Repeated merging, however, does not proceed randomly. Constituents are not simply merged onto one another as they are fed into the computational device; they are put together according to certain structural schemas, with restrictions on when constituents bearing specific types of labels (e.g. specifiers, heads etc.) can be merged (Chomsky, 1965). One of the most widely accepted schemas for generating phrase structures is x-bar: simply put, it proposes a basic template, PHRASE → [PHRASE [TERMINAL PHRASE]], as the pattern of growth from MERGE.

But if MERGE proceeds along x-bar lines, the question arises: “x-bar, what is it good for?”

Suppose we allow MERGE to generate a tree from the numeration {A,B,C,D,E,F,G,H,I,J,K,L}. We will get an idealized structure like the one in the figure below.

Fibonacci Sequence in X-Bar


With the standard generative assumption that a label immediately dominated by the projection is an xP (where x is a variable over heads; x = H | xP → HP), as are xP/x0-ambiguous labels, we can count the number of xP nodes (red headers in figure 1 above) at each level and obtain a Fibonacci sequence – (1, 1, 2, 3, 5). Fibonacci numbers show up in many guises across the natural world. A common example involves the fractal qualities of the stripes on zebras, which are often interrupted by the growth of the ears, or even birth-marks/scars. The anatomical arrangements of plants are another example.
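The level-by-level count can be checked with a small simulation. This is a sketch under one reading of the maximally expanded x-bar schema (our own reconstruction): every XP expands into a specifier XP and an X′, and every X′ into a head X0 and a complement XP, with heads terminating.

```python
# Count maximal projections (xP) per tree level under a fully
# expanded x-bar schema:
#   XP -> XP (specifier)  X'
#   X' -> X0 (head)       XP (complement)
# X0 is a terminal and does not expand further.

def xp_counts_per_level(depth):
    counts = []
    level = {"XP": 1, "Xbar": 0}  # the root of the tree is a single XP
    for _ in range(depth):
        counts.append(level["XP"])
        level = {
            "XP": level["XP"] + level["Xbar"],  # specifiers + complements
            "Xbar": level["XP"],                # each XP spawns one X'
        }
    return counts

print(xp_counts_per_level(5))  # [1, 1, 2, 3, 5] — the Fibonacci sequence
```

The recurrence XP(n+1) = XP(n) + X′(n) with X′(n+1) = XP(n) is exactly the Fibonacci recurrence, which is why the counts match.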

As it turns out, this is not a mere coincidence. Alan Turing famously observed that certain phenomena are of a very general nature, and so must confer some very general advantages. For instance, when realized as auditory sequences, Fibonacci systems elicit a special behavioral response compared to other combinatoric systems (Krivochen and Saddy, 2018), which may explain why all human languages converge on specific types of rules: simply put, certain structures are more salient in terms of the neural processing of information.

With regard to MERGE specifically, we again observe the relevance of Fibonacci numbers. MERGE is generally hypothesized to have three functional subroutines: (a) merge only two terms, (b) combine them into an unordered set, and (c) assign a label. Considering these individually, (b) is easy to justify from a semantic standpoint in terms of compositionality. The other two, however, appear to be stipulations. Building on the Fibonacci idea, though, it is possible to derive overt pressures from natural architectural patterns that would demand conditions like (a) and (c). For instance, if MERGE were allowed the option of selecting three terms and merging them into a ternary structure, the Fibonacci fractal observed above would disappear (e.g. allowing A to be merged with both DP and another maximal projection in line 3 of fig. 1), and because binary vs. ternary merging would be optional, it would be impossible to predict such breaks. This, in turn, would make search-probe operations during online parsing far less predictable. Such predictability is of core importance to the natural productivity of languages, and is a hallmark of natural combinatoric systems (e.g. the natural numbers). Likewise, labeling is a necessary mechanism for determining whether a node is, say, an XP. Since such a mechanism is necessary for the Fibonacci pattern to be satisfied, tree maximization in the x-bar format illustrated above arguably provides an external motivation for the third subroutine of MERGE as well.
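The ternary-MERGE argument can also be made concrete with a toy calculation (our own construction, under the assumption that a ternary MERGE would give each phrase an extra specifier): counting maximal projections per level under the modified schema no longer yields Fibonacci numbers.

```python
# Toy check of the ternary-MERGE argument: give every XP two
# specifiers instead of one and count maximal projections per level.
#   XP -> XP XP X'   (ternary expansion)
#   X' -> X0 XP      (head + complement, as before)

def ternary_xp_counts(depth):
    seq, xp, xbar = [], 1, 0
    for _ in range(depth):
        seq.append(xp)
        xp, xbar = 2 * xp + xbar, xp  # two specifier XPs + complements
    return seq

print(ternary_xp_counts(5))  # [1, 2, 5, 12, 29] — no longer Fibonacci
```

Since a system mixing binary and ternary merges freely would alternate between these two growth regimes, the level counts would become unpredictable, which is the point made above.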

This is but one example of the algorithmic beauty of Language, as Juan Uriagereka puts it, that can be unravelled through formal investigations. Such insights allow us to better understand the causal forces that shape computational processes in the brain and make them the way they are. Beyond typology, they tell us why the observed typologies appear in certain specific patterns, and which patterns we should never come across. For instance, David Medeiros has done some very interesting work on the causal computational underpinnings of Cinque’s Universal 20 in word-order preferences.

Of interest, too, are the works of Charles Reiss, Bill Idsardi, Mark Hale, Veno Volenec, Pedro T. Martins, Cedric Boeckx, Tobias Scheer, Sylvia Blaho, Thomas Graf, Bridget D. Samuels, Iris Berent, Martin Krämer et al. in the domain of phonology. Charles Reiss, in particular, has argued for the distinction between the computable, processable, attestable and attested sets of data referred to in the Venn diagram above. Stressing the importance of a proper explication of the syntax of phono-LOGICAL rules, Reiss also argues that without a proper theory of rules and equivalence classes it is impossible to evaluate whether there exist rules that don't refer to equivalence classes. Following these lines of reasoning leads one to some surprising conclusions, but perhaps the most important concerns what Cedric Boeckx calls the biological plausibility of biolinguistics. Linguistics, as a scientific study (not of languages, but rather) of the Faculty of Language, must remain amenable to biological investigation. As such, there is no place within Linguistic Theory (of a biological, generative type) for notions and constructs that have no biological interpretation (e.g. markedness, well-formedness etc.). Either these notions must go, or linguistic theory as “natural science” is moonshine.


Our brains are uniquely capable of these types of computations, and we use them not just to speak or communicate, but to do algebra, write symphonies and think about our place in the Universe! The brain is the hardware that runs a very special software, and that software creates the infinite possibilities that have made us the dominant species on this planet. Both deserve and demand a proper understanding in the finest possible detail: a formal inquiry into the algorithm of this software is necessary both to understand what kind of hardware could run it, and to rethink whether our understanding is correct – whether the kind of brain-software we propose could really be run by the hardware we are stuck with!

A related line of inquiry, informed by formal methods but often involving sophisticated neuro-imaging, eye-tracking and other instrumental methods, studies how the brain enables and supports Language. Children learn to use it within a handful of months, beginning to babble by themselves as they approach the end of the first year of their lives. Moreover, while adults are often more proficient in their use of a language once it is picked up, they are remarkably bad compared to babies at picking up new ones! How do babies manage to pick up the rules of Language, and produce only rule-abiding sentences and sounds, before they even have a notion of rules or languages or what they mean? Chomsky, along with legendary biologists and geneticists like the Nobel laureate Luria, Pollack and Jacob (Wexler, 2013), has argued that many of the restrictions on the architecture of Language are hardwired into the human genome. Babies are born with an expectation of the rules an information-encoding system must adhere to if it is to be productive in this sense, and any (and all) systems that meet the criteria are readily picked up by the developing brain. In other words, children do not learn or construct rules at all; they merely confirm them. Or rather, their brains do, without the children consciously knowing what they are doing while they are doing it! But what is special about human neonate brains that allows for such phenomena? What changes after the critical period to limit such abilities so drastically in adults? Is there a continuity between adult and neonate brains? Are there specific parts of the brain dedicated to specific types of linguistic operations? Do they reflect some specialization that enables the super-specialized (linguistic) system they work with?

Basic Illustration of a Computational System


Implementing A Computational System in the Human Brain


Taken together, these two lines of inquiry dovetail into a concentrated effort to understand the biology of the uniquely human mind/brain that equips us with an ability without parallel in the known biological world. What constitutes an impossible language? What boundaries of human-ness would it violate? What determines the limits of human-ness?

For detailed discussions, see:

Carnie, A., Medeiros, D., & Boeckx, C. (2005). Some consequences of natural law in syntactic structure. Ms., University of Arizona / Harvard University.

Chomsky, N. (1959). A review of BF Skinner’s Verbal Behavior. Language, 35(1), 26–58.

Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge: M.I.T. Press.

Chomsky, N. (1975). The logical structure of linguistic theory. New York: Plenum Press.

Chomsky, N. (2002). Syntactic structures. Berlin; New York: Mouton de Gruyter.

Chomsky, N. (2005). Three Factors in Language Design. Linguistic Inquiry, 36(1), 1–22.

Chomsky, N. (2007). Biolinguistic explorations: Design, development, evolution. International Journal of Philosophical Studies, 15(1), 1–21.

Dawkins, R. (2006). The selfish gene: With a new introduction by the author. Oxford, UK: Oxford University Press. (Originally published 1976).

Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298(5598), 1569–1579.

Kuhn, T. S. (1970). The structure of scientific revolutions (2nd ed., enlarged). Chicago: University of Chicago Press.

Lewontin, R. C. (1974). The genetic basis of evolutionary change. New York: Columbia University Press.

Uriagereka, J. (2000). Rhyme and Reason: An Introduction to Minimalist Syntax. MIT Press.

Wexler, K. (2013, February). Luria’s biolinguistic suggestion and the growth of language.