“As soon as human beings began to make systematic observations about one another's languages, they were probably impressed by the paradox that all languages are in some fundamental sense one and the same, and yet they are also strikingly different from one another” (Ferguson 1978, p. 9).
Language and linguistics are studied through a wide range of tools and perspectives. Over the past few years, a proliferation of mathematical methods (notably probability and information theory), newly available datasets, and computational modeling has led to increased interest in the efficiency of information transmission across human languages. An emerging body of literature examines how natural language structure is shaped by principles of efficiency, from the lexicon (Bentz, 2018) and phonology (Priva and Jaeger, 2018) to morphosyntax (Futrell, Mahowald and Gibson, 2015), and, consequently, how typologically diverse languages may be optimized for communication and inference. In particular, the universal properties of human languages have been examined across the language sciences, and these studies indicate that efficiency is a universal feature of human language.
Human language is extremely diverse, with over 6,000 languages in use around the world according to the World Atlas of Language Structures (WALS). Each language has its own grammar and vocabulary, and languages vary in how many syllables they use, whether they employ tones to convey meaning, their syntax, transmission media (e.g., speech, signing), writing systems, the order in which they express information, and more. Further, the rate of speech, that is, how fast an individual speaks a language, varies widely across languages. It is no surprise that the way people express themselves differs between countries: a language spoken in a sparsely populated region might be spoken at a slower rate than a popular language in a densely populated area. By some estimates, English is spoken at a rate of approximately 177 words per minute, while Russian is spoken at a rate of only 38 words per minute. However, while the rate of speech varies, it has been documented that languages do not differ in their ability to convey a similar amount of information, with recurring “universal” patterns across languages. Japanese may seem to be spoken faster than Thai, but that does not make it more “efficient.”
Generally, the term ‘information’ in the context of communication is somewhat elusive and inconclusive. Here, however, borrowing from the field of information theory, ‘information’ is used as Claude Shannon first introduced it in a 1948 paper: in terms of the correlation between each signal produced by the sender and the sender’s intended utterance, or how much a given signal reduces the receiver’s uncertainty about the intended utterance. Further, according to Gibson et al., ‘efficiency’ in relation to information can be defined as follows: “...communication means that successful communication can be achieved with minimal effort on average by the sender and receiver...effort is quantified using the length of messages, so efficient communication means that signals are short on average while maximizing the rate of communicative success” (Gibson et al., 2019). Thus, one may argue that communicative efficiency is manifested in the structural ability of language to resolve its complexity and ambiguity. ‘Informativity’ in language is measured by the relative amount of content words to non-content words, typically in the context of a given text. In human language, informativity is highly variable over time; it is defined as “the weighted average of the negative log predictability of all the occurrences of a segment” (Priva, 2015). In other words, rather than measuring how probable a communicative segment is in one particular context, it measures how predictable that segment is whenever it occurs, averaged over all of its contexts. As receivers comprehend language, they can expect that the sender’s message will be unpredictable in some way. Ultimately, language should be efficient so that a speaker can transmit many different messages successfully with minimal effort.
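Priva’s (2015) definition can be made concrete with a small sketch. The snippet below is an illustrative toy, not the authors’ code: it estimates the informativity of a segment as the weighted average of its negative log predictability across the contexts (here, simply the preceding segment) in which it occurs. The corpus and segmentation are made-up assumptions.

```python
import math
from collections import Counter

corpus = list("abababcabcab")  # hypothetical stream of segments
bigrams = Counter(zip(corpus, corpus[1:]))   # (context, segment) counts
context_totals = Counter(corpus[:-1])        # how often each context occurs

def informativity(segment):
    # Weighted average of -log2 P(segment | context) over all occurrences:
    # weights are P(context | segment), i.e., how often each context
    # precedes this particular segment.
    occ = {c: n for (c, s), n in bigrams.items() if s == segment}
    total = sum(occ.values())
    return sum((n / total) * -math.log2(n / context_totals[c])
               for c, n in occ.items())

print(round(informativity("a"), 3))  # → 0.5
```

A segment that is fully predictable in every context it appears in would score 0 bits; a segment that keeps occurring in contexts where it is unlikely scores high, matching the intuition that informativity tracks surprise across all contexts rather than probability in one.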
In linguistics, speech rate is typically calculated as the number of syllabic segments per second in each utterance; combined with the amount of information each syllable carries, this yields an information rate measured in bits per second.
Following the definitions above, and with reference to the Language Log post “Speed vs. efficiency in speech production and reception” (Mair, 2019), this paper focuses on a 2019 cross-linguistic study published in the journal Science Advances, in which researchers examined the relationship between language complexity, speech rate (SR), and information transmission, using information theory (conditional entropy) as a framework (Coupé, Oh, Dediu, and Pellegrino, 2019). The researchers showed that human languages may differ widely in their encoding strategies, such as complexity and speech rate, but not in the rate of effective information transmission, even though the speeds at which they are spoken vary. This relationship is universal, holding across languages’ capacities to encode, generate and decode speech: languages that pack more information into each syllable are spoken more slowly, while information-light languages are spoken faster, so that the effective rate of transmission remains roughly constant.
The researchers calculated the information density (ID) of 17 languages from 9 different language families (Vietnamese, Basque, Catalan, German, English, French, Italian, Spanish, Serbian, Japanese, Korean, Mandarin Chinese, Yue Chinese/Cantonese, Thai, Turkish, Finnish and Hungarian) by comparing recordings of 15 brief texts describing daily events, read out loud by 10 native speakers (five men and five women) per language. For each language, the speech rate, in syllables per second, and the average information density of the syllables uttered were measured. (The more easily the utterance of a particular syllable can be predicted by conditioning on the preceding syllable, the less information that syllable is deemed to provide.) According to their findings, each language has a different information density in bits per syllable, and higher speech rates correlate with lower information densities, while slower speech rates correlate with higher densities, as is often the case with tonal Asian languages like Chinese and Vietnamese. Japanese, for example, with only 643 distinct syllables, has an information density of about 5 bits per syllable, whereas English, with 6,949 distinct syllables, has a density of just over 7 bits per syllable. Vietnamese, with its complex system of six tones (each of which can further differentiate a syllable), had the highest density, at 8 bits per syllable. Finally, by multiplying speech rate by information density, the researchers showed that the information rate (IR) of all these languages, however different, converges to approximately 39 bits per second. The explanation is a trade-off between speech rate and the average amount of information carried by linguistic units.
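The two quantities the study multiplies can be sketched as follows. This is an illustrative toy, not the authors’ pipeline: information density is estimated as the conditional entropy of a syllable given the preceding syllable, and the information rate is that density times an assumed speech rate. The tiny syllable sequence and the speech rate of 6 syllables per second are made-up stand-ins for real transcriptions and measurements.

```python
import math
from collections import Counter

syllables = ["ka", "to", "ka", "mi", "to", "ka", "to", "mi", "ka", "to"]
pairs = Counter(zip(syllables, syllables[1:]))  # (previous, current) counts
n_pairs = sum(pairs.values())
context = Counter(syllables[:-1])               # counts of each preceding syllable

# Conditional entropy H(S_t | S_{t-1}) = -sum_{c,s} P(c,s) * log2 P(s|c),
# interpreted here as information density in bits per syllable.
ID = -sum((n / n_pairs) * math.log2(n / context[c])
          for (c, s), n in pairs.items())

SR = 6.0       # assumed speech rate, syllables per second
IR = SR * ID   # information rate, bits per second
print(round(ID, 3), round(IR, 3))
```

Conditioning on the preceding syllable is what makes a highly predictable syllable contribute little: if P(s|c) is close to 1, its log term is close to 0, mirroring the parenthetical note above.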
In summary, these findings confirm a supposition previously raised in the literature: information-dense languages, those that pack more information about tense, gender, and speaker into smaller linguistic units (e.g., German, which delivers 5-6 syllables per second when spoken), move more slowly to compensate for their density of information, whereas information-light languages (e.g., Italian, which delivers about 9 syllables per second) move at a much faster speech rate. For example, the Mandarin sentence ‘qǐng bāng máng dìng gè zǎo shang de chū zū chē’ (请帮忙订个早上的出租车; ‘Please help book a taxi for the early morning’) is assembled from denser syllables and produced more slowly on average than the equivalent Spanish sentence ‘Por favor, quisiera pedir un taxi para mañana a primera hora’. However, a notable limitation of the study, one that weakens the universality claim, lies in its sample: it did not include any languages from the Niger-Congo family (e.g., Swahili) or the Afro-Asiatic family (e.g., Arabic), which represent the third- and fourth-largest language families, respectively.
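The trade-off admits a simple back-of-envelope check: if the information rate really converges near 39 bits per second, each language’s implied information density is just IR divided by its speech rate. The syllable rates below come from the text; the resulting per-syllable densities are illustrative implications of the convergence claim, not figures reported in the study’s tables.

```python
# If IR ≈ 39 bits/s for every language, then ID = IR / SR.
IR = 39.0  # bits per second (approximate convergence value from the study)
rates = {"German": 6.0, "Italian": 9.0}  # syllables per second (from the text)
for lang, sr in rates.items():
    print(f"{lang}: ~{IR / sr:.1f} bits per syllable")
```

On these numbers, German syllables would need to carry roughly half again as much information as Italian ones (about 6.5 vs. 4.3 bits), which is exactly the compensation pattern the paragraph describes.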
More broadly, in the context of speech perception and processing, these findings can be framed in an evolutionary perspective, as they suggest an optimal rate at which the human brain processes language, one that makes optimal use of information regardless of a language’s complexity. Despite significant differences between languages and the environments in which they are spoken, different languages share a common construction pattern. Specifically, the findings indicate a fundamental (cognitive) constraint on the information-processing capability of the human brain, with an upper bound reached despite differences in speech speed and redundancy. An underlying process appears to be the interconnection between patterns of cortical activity and the informational bandwidth of the human communication system (Bosker and Ghitza, 2018). In the context of technology, it may be argued that this work paves the way toward future reference benchmarks for artificial communication devices such as prosthetics or brain–computer interfaces (BCI) for communication and rehabilitation. For example, rather than designing devices around words-per-minute performance, which inherently varies across languages, future engineers and designers could devise communication interfaces targeting a transmission rate of roughly 39 bits per second. Moreover, further study of communicative efficiency may guide research on natural language processing in artificial intelligence and machine learning, marrying linguistics, cognitive science and mathematical theories of communication.