Similar Information Rates Across Languages, Regardless of Varying Speech Rates


“As soon as human beings began to make systematic observations about one another's languages, they were probably impressed by the paradox that all languages are in some fundamental sense one and the same, and yet they are also strikingly different from one another” (Ferguson 1978, p. 9).


Language and linguistics are studied through a wide range of tools and perspectives. Over the past few years, a proliferation of mathematical methods (notably probability and information theory), newly available datasets, and computational modeling has led to increased interest in the efficiency of information transmission across human languages. An emerging body of literature examines how principles of efficiency shape natural language structure, from the lexicon (Bentz, 2018) and phonology (Priva and Jaeger, 2018) through morphosyntax (Futrell, Mahowald and Gibson, 2015), and, consequently, how typologically diverse languages could be optimized for communication and inference. In particular, the universal properties of human languages have been examined across the language sciences. These studies indicate that efficiency is a universal feature of human language.

Human language is extremely diverse, with over 6,000 languages in use around the world according to the World Atlas of Language Structures (WALS). Each language has its own grammar and vocabulary, and languages vary in how many distinct syllables they use, whether they employ tones to convey meaning, their syntax, their transmission media (e.g., speech, signing), their writing systems, the order in which they express information, and more. Further, the rate of speech, or how fast a language is spoken, varies widely across languages. It is no surprise that the way people express themselves differs between countries. A language spoken in a sparsely populated region might, for instance, be spoken at a slower rate than a widely used language in a densely populated area. The English language, for example, has been estimated to be spoken at a rate of approximately 177 words per minute, while spoken Russian has been estimated at only 38 words per minute. However, while the rate of speech can vary, it has been documented that languages do not differ in their ability to convey a similar amount of information, with recurring "universal" patterns across languages. Japanese may seem to be spoken faster than Thai, but that does not mean it is more "efficient."

Generally, the term 'information' in the context of communication is somewhat elusive. Here, borrowing from the field of information theory, 'information' is used as Claude Shannon first introduced it in his 1948 paper: in terms of the correlation between each signal produced by the sender and the sender's intended utterance, or how much a given signal reduces the receiver's uncertainty about the intended utterance. Further, according to Gibson et al., 'efficiency' in relation to information can be defined as follows: "...communication means that successful communication can be achieved with minimal effort on average by the sender and receiver...effort is quantified using the length of messages, so efficient communication means that signals are short on average while maximizing the rate of communicative success" (Gibson et al., 2019). Thus, one may argue that communicative efficiency is manifested in the structural ability of a language to resolve its complexity and ambiguity. In one loose sense, 'informativity' in language is measured by the proportion of content words to non-content words in a given text. In the information-theoretic sense used here, informativity is highly variable over time and is defined as "the weighted average of the negative log predictability of all the occurrences of a segment" (Priva, 2015). In other words, rather than measuring how probable a communicative segment is in one particular context, it measures how predictable that segment is, on average, across all the contexts in which it occurs. As a receiver comprehends language, they can expect that the sender's message will be unpredictable in some way. Ultimately, language should be efficient so that a speaker is able to transmit many different messages successfully with minimal effort.
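Priva's definition can be made concrete with a short sketch. Assuming, purely for illustration, that a segment's "context" is the immediately preceding segment (a simplification, not the exact estimator used in the cited work), informativity over a toy corpus of syllables looks like this:

```python
from collections import Counter
from math import log2

def informativity(corpus, segment):
    """Weighted average of the negative log2 predictability of `segment`
    across all contexts in which it occurs (context = preceding segment,
    a simplifying assumption for illustration)."""
    bigrams = list(zip(corpus, corpus[1:]))
    context_counts = Counter(c for c, _ in bigrams)   # how often each context occurs
    pair_counts = Counter(bigrams)                    # how often (context, segment) occurs
    # occurrences of `segment`, broken down by the context that preceded it
    occurrences = {c: n for (c, s), n in pair_counts.items() if s == segment}
    total = sum(occurrences.values())
    if total == 0:
        return 0.0
    info = 0.0
    for c, n in occurrences.items():
        p_seg_given_c = pair_counts[(c, segment)] / context_counts[c]  # P(segment | context)
        weight = n / total                                             # P(context | segment)
        info += weight * -log2(p_seg_given_c)
    return info
```

A segment that is always fully predictable from its context scores 0 bits; the harder it is to predict, on average, the higher its informativity.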
In linguistics, the information rate of speech is typically calculated by segmenting each utterance into syllables, measuring the speech rate in syllables per second, and weighting each syllable by its information content, yielding a measure in bits of information per second.

Following the definitions above, and with reference to the Language Log post "Speed vs. efficiency in speech production and reception" (Mair, 2019), the focus of this paper is a 2019 cross-linguistic study, published in the journal Science Advances, in which researchers used information theory (specifically conditional entropy) as a framework to examine the relationship between language complexity and speech rate (SR), and how the two affect information transmission (Coupé, Oh, Dediu, and Pellegrino, 2019). The researchers showed that human languages may differ widely in their encoding strategies, such as information density and speech rate, but not in the rate of effective information transmission, even when the speeds at which they are spoken vary. This trade-off appears to hold universally across languages' capacities to encode, generate, and decode speech: languages that pack more information into each linguistic unit are spoken more slowly, presumably because a greater amount of effort is required to produce and process them.

The researchers calculated the information density (ID) of 17 languages, from 9 different language families—Vietnamese, Basque, Catalan, German, English, French, Italian, Spanish, Serbian, Japanese, Korean, Mandarin Chinese, Yue Chinese/Cantonese, Thai, Turkish, Finnish and Hungarian—by comparing utterance recordings of 15 brief texts describing daily events, read out loud by 10 native speakers (five men and five women) per language. For each language, the speech rate, in syllables per second, and the average information density of the syllables uttered were measured. (The more easily the utterance of a particular syllable may be predicted by conditioning on the preceding syllable, the less information it is deemed to provide.) According to the findings, each language has a different information density in bits per syllable. Higher speech rates correlate with lower information densities, as in Japanese or Spanish, and slower speech rates with higher information densities, as is often the case with tonal Asian languages like Chinese and Vietnamese. Japanese, for example, with only 643 distinct syllables, has an information density of about 5 bits per syllable, whereas English, with 6,949 distinct syllables, has a density of just over 7 bits per syllable. Vietnamese, with a complex system of six tones (each of which can further differentiate a syllable), had the highest density, at 8 bits per syllable. Finally, by multiplying speech rate by information density, the researchers showed that the information rate (IR) of all these languages, however different, converges to approximately 39 bits per second. The explanation is a trade-off between speech rate and the average amount of information carried by each linguistic unit.
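The trade-off can be illustrated numerically. The figures below are rough illustrative values of the kind discussed above, not the exact per-language measurements, which appear in Coupé et al. (2019):

```python
# Illustrative, approximate figures only; exact values are in Coupé et al. (2019).
speech_rate  = {'Japanese': 8.0, 'English': 6.2, 'Vietnamese': 5.3}  # syllables per second
info_density = {'Japanese': 5.0, 'English': 7.1, 'Vietnamese': 8.0}  # bits per syllable

# Information rate (bits/s) = speech rate (syl/s) * information density (bits/syl)
info_rate = {lang: speech_rate[lang] * info_density[lang] for lang in speech_rate}
```

Despite very different encoding strategies, the products all land in the same neighborhood of roughly 40 bits per second, which is the study's central convergence result.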

In summary, these findings confirm a supposition previously raised in the literature: information-dense languages, those that pack more information about tense, gender, and speaker into smaller linguistic units (e.g., German, which delivers 5-6 syllables per second), are spoken more slowly to compensate for their density of information, whereas information-light languages (e.g., Italian, which delivers about 9 syllables per second) are spoken at a much faster rate. For example, the Mandarin sentence 'qǐng bāng máng dìng gè zǎo shang de chū zū chē' (请帮忙订个早上的出租车) ('Please help book an early-morning taxi') is assembled from denser syllables and produced more slowly on average than the equivalent Spanish sentence 'Por favor, quisiera pedir un taxi para mañana a primera hora.' A notable limitation of the study, one that weakens the universality claim, lies in its sample: it did not include any languages from the Niger-Congo family (e.g., Swahili) or the Afro-Asiatic family (e.g., Arabic), which represent the third- and fourth-largest language families, respectively.

More broadly, in the context of speech perception and processing, these findings can be framed in an evolutionary perspective: they suggest a potentially optimal rate of language processing by the human brain, one that optimizes the use of information regardless of a language's complexity. Despite significant differences between languages and the geographical settings of their speakers, different languages share a common construction pattern. Specifically, the findings indicate a fundamental cognitive constraint on the information-processing capability of the human brain, an upper bound reached despite differences in speech speed and redundancy. The underlying process appears to rest on an interconnection between patterns of cortical activity and the informational bandwidth of the human communication system (Bosker and Ghitza, 2018). In the context of technology applications, it may be argued that this work paves the way toward reference benchmarks for artificial communication devices such as prosthetic devices or brain–computer interfaces (BCIs) for communication and rehabilitation. For example, rather than designing devices around words-per-minute performance, which inherently varies across languages, future engineers and designers could target a transmission rate of roughly 39 bits per second. Moreover, further study of communicative efficiency may guide natural language processing research in artificial intelligence and machine learning, marrying linguistics, cognitive science, and mathematical theories of communication.


Intuitive Physics and Domain-Specific Perceptual Causality in Infants and AI


More recently, cognitive psychology and artificial intelligence (AI) researchers have been motivated to explore the concept of intuitive physics in infants' object perception skills and to ask whether further theoretical and practical applications could be developed in the field of artificial intelligence by linking intuitive-physics approaches to AI research, that is, by building autonomous systems that learn and think like humans. The particular context of intuitive physics explored herein is infants' innate understanding, arising soon after birth via domain-specific perceptual causality, of how inanimate objects persist in time and space and otherwise follow principles of persistence, inertia, and gravity (the spatio-temporal configuration of physical concepts) (Caramazza & Shelton, 1998). The overview is structured around intuitive-physics techniques using cognitive (neural) networks, with the objective of harnessing our understanding of how artificial agents may emulate aspects of human (infant) cognition in a general-purpose physics simulator for a wide range of everyday judgments and tasks.

Such neural networks (deep learning networks in particular) are generally characterized by neural-network-style models organized into a number of layers of representation, whose connection strengths are gradually refined as more data is introduced. By loosely mimicking the brain's biological neural networks, computational models that rapidly learn, improve, and apply their learning to new tasks in unstructured real-world environments can play a major role in enabling future software and hardware (robotic) systems to make better inferences from smaller amounts of training data.
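The phrase "gradually refining their connection strengths" can be made concrete with a deliberately tiny example: a single artificial neuron adjusting one weight by gradient descent on toy data. This is a minimal sketch of the learning rule, not a deep network:

```python
# A single neuron "refining its connection strength" by gradient descent,
# learning the mapping y = 2*x from toy (input, target) pairs.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0            # the connection strength, initially uninformed
lr = 0.02          # learning rate: how much each example adjusts w

for _ in range(500):                 # repeated exposure to the data
    for x, y in data:
        pred = w * x                 # the neuron's current prediction
        grad = 2 * (pred - y) * x    # gradient of squared error w.r.t. w
        w -= lr * grad               # refine the connection strength
```

After repeated exposure, `w` converges to 2.0: the weight has absorbed the regularity in the data, which is the same process, scaled up by many layers and millions of weights, that deep networks perform.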

On the general level, intuitive physics, naïve physics or folk physics (terms used here synonymously) is the universally similar human perception of fundamental physical phenomena, or an intuitive (innate) understanding all humans have about objects in the physical world. Further, intuitive physics is defined as "...the knowledge underlying the human ability to understand the physical environment and interact with objects and substances that undergo dynamic state changes, making at least approximate predictions about how observed events will unfold" (Kubricht, Holyoak & Lu, 2017).

During the past few decades, motivated by technological advances (brain imaging, eye-gaze detection, and reaction-time measurement in particular), several researchers have established guiding principles on how innate core concepts constrain the knowledge systems that emerge in the infant brain, namely principles of gravity, inertia, and persistence (with its corollaries of solidity, continuity, cohesion, boundedness, and unchangeableness), by capturing empirical physiological data. To quantify infants' innate reaction to a particular stimulus, researchers have relied on the concept of habituation: a decrease in responsiveness to a stimulus after repeated exposure to it, shown as a diminished total looking time during visual face, object, or image recognition. Habituation is thus operationalized as the amount of time an infant allocates to stimuli, with less familiar stimuli receiving more attention; when a new stimulus is introduced and perceived as different, the infant responds to it for longer (Eimas, Siqueland, Jusczyk, & Vigorito, 1971). In the context of intuitive physics, in order to understand how ubiquitous infants' intuitive understanding is, developmental researchers rely on violations of expectations about physical phenomena. If infants understand the implicit rules, then the more a newly introduced stimulus violates their expectations, the longer they will attend to it in an unexpected situation (suggesting that preference reflects the infant's ability to discriminate between the two events).
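The habituation logic described above is often operationalized with an infant-controlled criterion. One common convention, used here purely as an illustrative assumption rather than any particular lab's protocol, declares habituation once looking time over a window of consecutive trials falls below half the total of the first such window:

```python
def habituated(looking_times, window=3, ratio=0.5):
    """Infant-controlled habituation criterion (illustrative convention):
    the infant counts as habituated once total looking time over `window`
    consecutive trials drops below `ratio` times the total of the first
    `window` trials. Returns the trial count at which this happens."""
    baseline = sum(looking_times[:window])
    for i in range(window, len(looking_times) - window + 1):
        if sum(looking_times[i:i + window]) < ratio * baseline:
            return i + window
    return None  # criterion never reached

# Looking times (seconds) declining across repeated presentations; a novel,
# expectation-violating event would then produce recovery (dishabituation).
trials = [20.0, 18.0, 15.0, 9.0, 6.0, 5.0]
```

With these declining looking times, the criterion is met at trial 6; a flat sequence of looking times would never meet it.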

Core Principles

A variety of studies and theoretical work have defined what physical principles are and explored how they are represented during human infancy. In the context of inertia, the principle invokes infants' expectation that objects in motion follow an uninterrupted path without sporadic changes in velocity or direction (Kochukhova & Gredeback, 2007; Luo, Kaufman & Baillargeon, 2009). In the context of gravity, the principle refers to infants' expectation of how objects fall after being released (Needham & Baillargeon, 1993; Premack & Premack, 2003). Lastly, in the context of persistence, the principle guides infants' expectation that objects will obey continuity (objects cannot spontaneously appear or disappear into thin air), solidity (two solid objects cannot occupy the same space at the same time), and cohesion (objects cannot spontaneously break apart as they move), and will not fuse with another object (boundedness) or change shape, pattern, size, or color (unchangeableness) (Spelke et al., 1992; Spelke, Phillips & Woodward, 1995; Baillargeon, 2008). Extensive evidence from research on cognitive development in infancy shows that, across a wide range of situations, infants as young as two months old can predict the outcomes of physical interactions involving gravity, object permanence, and conservation of shape and number (Spelke, 1990; Spelke, Phillips & Woodward, 1995).

The concept of continuity was originally proposed and described by Elizabeth Spelke, one of the cognitive psychologists who established the intuitive physics movement. Spelke defined and formalized various object perception experimental frameworks, such as occlusion and containment, both hinging on the continuity principle: infants' innate recognition that objects exist continuously in time and space. Building on this existing knowledge, research in the domain of early development could yield further insights into how humans attain physical knowledge across childhood, adolescence, and adulthood. For example, in one of their early containment event tests, Hespos and Baillargeon demonstrated that infants shown a tall cylinder fitting into a tall container were unfazed by the expected physical outcome; in contrast, when infants were shown the tall cylinder placed into a much shorter cylindrical container, the unexpected outcome confounded them. These findings demonstrated that infants as young as two months expect that containers cannot hold objects that physically exceed them in height (Hespos & Baillargeon, 2001). In an occlusion event test, infants' object-tracking mechanism was demonstrated with a moving toy mouse and a screen. The infants were first habituated to the toy moving back and forth behind the screen; then part of the screen was removed, so that the moving toy should have come into view as it passed behind it. Three-month-old infants were surprised when the mouse nonetheless failed to appear in the opening.

In a test of the concept of solidity, Baillargeon demonstrated that infants as young as three months of age, habituated to a screen rotating back and forth through a 180° arc, looked longer at the unexpected test event in which, with a box placed in the screen's path, the screen rotated up and continued through the physical space where the box was positioned, rather than being blocked by the box (causing it to reverse direction before completing its full range of motion), as in the expected event (Baillargeon, 1987).
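The geometry of this "drawbridge" event is easy to sketch. Assuming, hypothetically, a screen hinged at the floor and a box whose near top edge sits at some height and distance behind the hinge (these dimensions are illustrative, not the ones Baillargeon used), the angle at which a solid screen should stop is:

```python
from math import atan2, degrees

def stopping_angle(box_distance_cm, box_height_cm):
    """Angle (degrees, measured from the screen's flat starting position)
    at which a rotating screen should stop when a solid box of the given
    height sits at the given distance behind the hinge. Illustrative
    geometry only, not the apparatus dimensions of Baillargeon (1987)."""
    # Elevation of the box's near top edge as seen from the hinge:
    alpha = degrees(atan2(box_height_cm, box_distance_cm))
    return 180.0 - alpha
```

For a 10 cm box placed 10 cm behind the hinge, the screen should stop at 135°; a screen that sweeps the full 180° has therefore passed through space the box occupies, which is exactly the "impossible" event infants look longer at.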

Analogously to the findings demonstrating that infants are sensitive to violations of object solidity, the concept of cohesion captures infants' understanding that objects are cohesive and bounded. Kestenbaum demonstrated that infants successfully perceive the boundaries of adjacent or partially overlapping objects, dishabituating when objects' boundaries fail to correspond to their actual physical limits (Kestenbaum, Termine, & Spelke, 1987).

Lastly, there has been converging evidence that infants at the age of two months, and possibly earlier, have already developed appearance-based expectations about objects, such as that an object does not spontaneously change its color, texture, shape, or size. When infants at the age of six months were presented with an Elmo face, they successfully discriminated a change in the area size of the face (Brannon, Lutz, & Cordes, 2006).

Innateness

Evidently, infants possess, seemingly early on, the sophisticated cognitive ability to discriminate between expected and unexpected object behavior and interaction. This innate knowledge of physical concepts has been argued to allow infants to track objects over time and discount physically implausible trajectories or states, contributing to flexible generalization of knowledge to new tasks, surroundings, and scenarios, which, one may assume in an evolutionary context, iterates toward a more adaptive mechanism that allows them to survive in new environments (Leslie & Keeble, 1987).

In this regard, the notion of innateness, first introduced by Plato, has long been the subject of debate in the psychology of intuitive physics. Previous studies have debated whether the human brain comes prewired with a network that precedes the development of cortical regions (or domain-specific connections) specialized for specific cognitive functions and inputs (e.g., ones that control face recognition, scene processing, or spatial depth inference), i.e., connectivity precedes function (Kamps, Hendrix, Brennan & Dilks, 2019), versus whether specific cognitive functions arise collectively from accumulating visual inputs and experiences, i.e., function precedes connectivity (Arcaro & Livingstone, 2017). In one recent study, researchers used resting-state functional magnetic resonance imaging (rs-fMRI), which measures the blood-oxygenation-level-dependent signal to evaluate spontaneous brain activity at rest, to assess brain region connections in infants as young as 27 days of age. The researchers reported that the face-recognition and scene-processing cortical regions were already interconnected, suggesting that innateness drives the formation of domain-specific functional modules in the developing brain. Additional supporting studies, using auditory and tactile stimuli, have shown discriminatory responses in congenitally blind adults, presenting evidence that face- and scene-sensitive regions can develop in visual cortex without any visual input and thus may be innate (Büchel, Price, Frackowiak, & Friston, 1998). Contrary to the notion that connectivity precedes function, previous empirical work on infant monkeys has shown a discrepancy between the apparent innateness of visual maps and prewired domain-specific connections, suggesting that experience drives the formation of domain-specific functional modules in the infant monkeys' temporal lobe (Arcaro & Livingstone, 2017).
Thus, the framework of intuitive physics is not restricted merely to humans; similar cognitive expectations are often invoked in other living species and even, subject to training, in computational models.

Intuitive Physics and Artificial Intelligence

Despite recent progress in the field of artificial intelligence, humans are still arguably better than computational systems at general-purpose reasoning and various broad object perception tasks, making inferences from limited or no experience in areas such as spatial layout understanding, concept learning, and concept prediction. The notion of intuitive physics has been a significant focus in artificial intelligence research as part of the effort to extend the cognitive concepts of human knowledge to algorithm-driven reasoning, decision-making, and problem-solving. A fundamental challenge in robotics and artificial intelligence today is building robots that can imitate human spatial or object inference and adapt to an everyday environment as successfully as an infant. Specifically, building on recent advances in machine learning and deep learning, researchers have begun to explore how to build neural "intuitive physics" models that can make predictions about stability, collisions, forces, and velocities from static and dynamic visual inputs, or from interactions with a real or simulated environment. Such knowledge-based, probabilistic simulation models could therefore be used both to understand the cognitive and neural underpinnings of naive physics in humans and to provide artificial intelligence systems (e.g., autonomous vehicles) with higher levels of perception, inference, and reasoning capability.
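As a toy illustration of the kind of regularity such models must extract from raw observations, consider recovering a physical parameter (gravitational acceleration) from a handful of observed positions of a dropped object, then predicting beyond the observed interval. This is a hand-rolled least-squares sketch of the idea, not a neural network:

```python
# Positions of a dropped object follow y = y0 - 0.5 * g * t^2.
# We observe a few (t, y) pairs and recover g by least squares.
ts = [0.0, 0.1, 0.2, 0.3, 0.4]                 # observation times (s)
g_true, y0 = 9.81, 2.0                          # ground truth for the simulation
ys = [y0 - 0.5 * g_true * t * t for t in ts]   # noiseless observed heights (m)

# Least-squares fit of g from (0.5*t^2, drop) pairs, where drop = y0 - y:
num = sum((y0 - y) * 0.5 * t * t for t, y in zip(ts, ys))
den = sum((0.5 * t * t) ** 2 for t in ts)
g_est = num / den

def predict(t):
    """Generalize beyond the training interval using the recovered parameter."""
    return y0 - 0.5 * g_est * t * t
```

A neural intuitive-physics model faces the same problem implicitly and at far greater scale: it must internalize quantities like `g` from visual data well enough to predict unseen trajectories.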

Intuitive physics, the spatio-temporal configuration of physical concepts of objects (arrangements of objects, material classification of objects, motions of objects and substances or their lack thereof), comprises fundamental building blocks of complex cognitive frameworks, inviting further investigation, analysis, and understanding. In the field of artificial intelligence specifically, there has been growing interest in the origins and development of such frameworks, an attempt originally described by Hayes: "I propose the construction of a formalization of a sizable portion of common-sense knowledge about the everyday physical world: about objects, shape, space, movement, substances (solids and liquids), time..." (Hayes, 1985).

However, in the context of practically emulating intuitive physics concepts to solve physics-related tasks, and despite their potential benefits, neural "intuitive physics" models in computational settings remain underdeveloped: they focus mainly on controlled physics-engine reconstruction and, in contrast to the process of infant learning, require a vast amount of training data as input. Given existing computational models' narrow problem-solving ability (completing the same tasks precisely over and over again), emulating infants' intuitive-physics cognitive abilities could give technology researchers and developers the opportunity to design physical solutions for a broader set of conditions with less training data, fewer resources, and less time (as is currently required in self-driving technology development). For deep networks trained on physics-related data, it has yet to be shown whether models can correctly integrate object concepts and generalize acquired knowledge, such as general physical properties, forces, and Newtonian dynamics, beyond training contexts in an unstructured environment.

Future Directions

It is desirable to continue attempts to integrate intuitive physics and deep learning models, specifically in the domain of object perception. By drawing a distinction between infants' knowledge acquisition abilities via an "intuitive physics engine" and those of artificial agents, such an engine could one day be adapted into existing and future deep learning networks. Even at a very young age, human infants seem to possess a remarkable (innate) set of skills for learning rich conceptual models. Whether such models can be successfully built into artificial systems with the type and quantity of data accessible to infants is not yet clear. However, the combination of intuitive physics and machine (deep) learning could be a significant step toward more human-like learning in computational models.


Virtual Reality as a Tool for Stress Inoculation Training in Special Operations Forces


The maximal adaptability model of stress and performance. Adapted from Hancock and Warm (1989).

Stress is a common factor in tactical fast-paced scenarios such as in firefighting, law enforcement, and military—especially among Special Operations Forces (SOF) units who are routinely required to operate outside the wire (i.e., in hostile enemy territory) in isolated, confined, and extreme (ICE) environments (albeit seldom such environment is long-duration by choice). 

Human performance is inherently subject to the adverse effects of several types of stressors, such as fatigue, noise, temperature (e.g., extreme heat or cold), and high task load or acute time pressure, which degrade cognitive processes and may subsequently affect the quality of attention, effective decision-making, information processing, situation awareness, physical or mental well-being, and overall mission success. In general, the underlying factors of decreased performance in ICE environments (e.g., for astronauts or Antarctic expeditioners) include a diverse range of stressors such as fatigue, sleep deprivation, acquired or inherent ability to cope with stress, perception of the risks associated with the physical environment, disruption of circadian rhythms, and separation from a known social environment. Further, external medical help is usually unavailable in long-duration exploration missions, when communication may be disrupted or message transmission may take an extended period of time (as in space missions). This added isolation requires that crew members be able to adapt to new and developing issues in all aspects of their mission, including mental health.

ICE environments in military settings are characterized primarily by intensity in terms of life-threatening conditions (a high-risk, violent environment), mission complexity, isolation (which may occur through unplanned enemy action, retrograde, terrain disorientation, or other environmental conditions), confinement (military captivity), and pace of operation (speed of performance and the tasks that need to be performed). Previous empirical work has shown that individuals who successfully develop the cognitive and situational skills that help manage anxiety in a high-stress environment have an ability to withstand stress. Though cognitive skills and specific personality traits allow some to sustain higher levels of performance under stress more naturally than others, there is also sufficient evidence that resilience competencies can be developed and changed to mitigate the adverse effects of stress on performance, thereby reducing the likelihood of negative outcomes. Incorporating both physical and psychological competencies (adaptability, concentration, perseverance, and overall tolerance to stress) via specialized training would be expected to positively affect the mission readiness of special forces personnel in several ways, including enhanced situational and behavioral performance under stress, reduced attrition during basic and advanced training, and increased trainee retention. The organic development of such competencies within special forces falls under stress inoculation training (SIT) or stress exposure training (SET).

Despite the existence of various standards for pre-combat training under stress, considerably less attention has been paid to developing the competencies (i.e., behavioral and cognitive skills) that facilitate successful performance in ICE environments. Technology such as virtual reality (VR) or virtual simulation shows promise as an emerging health safeguard tool, providing an effective alternative platform for additional pre-combat stress inoculation training in special forces, specifically focused on ICE environments.

Stress Inoculation Training and Stress Exposure Training

Stress inoculation training, or SIT, is one of various stress-interference cognitive-behavioral therapies in current use by organizations, both civilian and military, as a comprehensive approach to improving performance under a wide range of stressful settings. Originating in several clinical psychology research disciplines, general stress inoculation training is designed to establish effective tolerance to stress through physical and cognitive skill training, providing appropriate levels of exposure to stressful stimuli in intense yet controlled environments. Empirical work has shown that individuals who are put through carefully designed, realistic stressor frameworks in order to develop personal ways of dealing with such situations acquire the confidence (or perception of confidence) to overcome increased physical and psychological loads in the future.

Generally, as proposed by Donald Meichenbaum, known for his role in the development of cognitive behavioral therapy (CBT) and for his contributions to the treatment of post-traumatic stress, stress inoculation training comprises three phases: conceptual education; skills acquisition and consolidation (physical capacities, motor skills, and cognitive abilities); and application and follow-through.

In the conceptual education phase, the goal is two-fold: building a relationship between the trainer and trainee, and guiding individuals toward a better understanding and perception of their stress responses and existing coping skills. Various models of coping have been proposed and used to help individuals understand how maladaptive coping behaviors, like cognitive distortion, can negatively influence their stress levels. Clinical methodologies such as self-monitoring and modeling are used to help the patient adapt to and overcome their stressors while building self-control and confidence. The person might be asked to build a list that differentiates between their stressors and their stress-induced responses so that coping models can be adjusted accordingly. This stage is key in showing individuals that it is possible to respond to their psychological triggers. This can include control of autonomic arousal, confidence-building, and basic mental skills such as understanding the link between performance and psychological states, goal-setting, attention control, visualization, self-talk, and compartmentalization.

In the skills acquisition and consolidation phase, the goal is to establish the coping techniques so they can be implemented in the next phase to regulate negative reactions and increase control over physiological responses. General skills in this phase can include relaxation training, cognitive restructuring, emotional self-regulation, problem-solving, and communication skills. Overall, the individual develops a wide spectrum of personal techniques from which they can draw when coping with a stressful situation.

In the application and follow-through phase, the goal is to subject the individual to increasing levels of a particular stressor while they practice applying the techniques they have developed to mitigate their stress response. Incremental exposure to stress, or systematic desensitization, subsequently builds the individual's resilience toward stress. This can be achieved by modifying the levels of motor pattern complexity, program complexity, and physiological stress in the form of increased intensity, volume, and density.

Military organizations, SOF included, began to adapt the general structure of SIT and demonstrated considerable improvements in personnel performance. Originally designed by Driskell and Johnston (1998), stress exposure training, or SET, is a comprehensive approach to developing stress-resilient skills and performance in high-demand training applications. However, instead of cognitive-behavioral pathological therapy, SET provides an integrative and preemptive structure for normal training populations. Similarly to SIT, SET comprises three analogous phases: information provision, skills acquisition, and application and practice.

In the information phase, the goal is to provide initial information on the human stress response and on the nature of the stressors participants should expect to encounter. In the skills acquisition phase, the goal is to develop and refine physical, behavioral, technical, and cognitive skills. Along with specific skills training, successful tactical training and operational effectiveness require physical fitness training. Physical fitness not only creates a foundation for task performance, it also builds two key qualities: resilience and toughness. Resilience is the ability to successfully tolerate and recover from traumatic or stressful events; it includes a range of physical, behavioral, social, and psychological factors. In the application and practice phase, the goal is to put the previous preparatory phases into practice by testing skills under conditions that approximate the operational environment and gradually reach the expected level of stress.

A study of twenty-four Marines who had a diagnosis of PTSD pre- and post-deployment examined PRESIT, or pre-deployment stress inoculation training, a preventive program to help deploying military personnel cope with combat-related stressors. The findings showed that the Marines in the PRESIT group were able to reduce their physiological arousal through breathing exercises. Moreover, the study found that those who went through PRESIT benefited from the training in terms of PTSD symptoms and their ability to cope with stressors, in comparison to those who did not go through PRESIT (Lee et al., 2002).

One of the common non-clinical examples of SIT used in pre-combat training is a basic swimming exercise designed to increase water confidence, commonly known as “drown-proofing”. In this exercise, trainees must learn to swim with both their hands and their feet bound and complete a variety of swimming maneuvers. This exercise is a SIT example that “…build[s] the student’s strength and endurance; ability to follow critical instructions with emphasis on attention to details and situational awareness; ability to work through crisis and high levels of stress in the water” (Robson and Manacapilli, 2014).

In a manner similar to the clinical interventions designed to treat pathological psychiatric conditions, military personnel in SET are exposed prophylactically, without developed psychiatric pathology, to the stressors that could be part of a given situation, such as the mental and physical impacts of extreme fatigue or cold-water conditioning, and to the potential stressors and scenarios they are likely to encounter. Those stressors are progressive and cumulative (challenging enough, but not completely debilitating), with a gradual build-up of anxiety. Each training activity is designed to establish the required technical skills (such as movement quality and positioning or control of stress responses), rather than hinder the development of those skills.

The aforementioned studies have shown that SIT can be implemented effectively in military settings. However, it should be noted that SIT is not one-size-fits-all; the multitool nature of special operations units engaged in reconnaissance, search-and-rescue, and direct-action missions, often under increased time pressure, draws a clear distinction between their physical and cognitive performance readiness and that of large-scale operations (requiring significant logistical planning) performed by air, ground, or naval forces. Depending on the type of stressor (ongoing or time-limited), the resources and coping mechanisms will differ from person to person. An ongoing stressor is a traumatic experience that can be expected to occur on a regular basis, such as being a first responder or a soldier in combat; a time-limited, or acute, stressor is a singular experience, like surgery, that occurs quickly and is not likely to recur. According to Meichenbaum, “SIT provides a set of general principles and clinical guidelines for treating distressed individuals, rather than a specific treatment formula or a set of “canned” interventions” (Meichenbaum, 2007). Yet the implementation of SIT in ICE environments, specifically for special operations forces training, is only at its inception today. As an early emerging area of practice, many psychological ramifications and benefits are yet to be fully examined and addressed, particularly around novel technology platforms involving virtual reality or mixed reality technologies (Riva, 2005).

Isolated, Confined and Extreme Environments

Generally, ICEs comprise a wide variety of geographical places that present hostile and harsh physical and psychological conditions, posing risks to human health and life. A myriad of physical environments and medical specialties can be included under ICEs, for example long-duration space missions, as well as expedition, wilderness, diving, jungle, desert, and cave settings, among others. In these missions, small groups of scientists, astronauts, and explorers choose to participate, willingly exposing themselves to such environments. A substantial body of research has discussed coping mechanisms supported by emerging technology tools, specifically focusing on the cognitive performance and stress resilience development that can be linked to or affected by ICE environments. Specifically, through the use of VR, researchers have reported that astronauts were able to gain access to continuous psychiatric monitoring, cognitive exercise, timely training, and sensory stimulation to mitigate the monotony of the working environment. Moreover, such technology tools can provide practical support for psychosocial adaptation by enabling cooperative and leisure activities for team members to play together, keeping up internal morale and collaboration while relieving stress and tension between members.

Virtual Reality and Virtual Simulation Tools

The advancements in computer and display technologies powered by graphics processing units (GPUs) have facilitated the emergence of systems capable of isolating a user from the real surrounding environment to simulate a computer-generated one, known as “virtual reality” experiences. More specifically, displays and environmental sensors create the illusion of being in a digitally rendered environment, either through a head-mounted display device or a computer-automated room where images are presented all around; accessory outputs like spatial audio and handheld feedback controllers (or any other visual, auditory, tactile, vibratory, vestibular, and olfactory stimuli) can further contribute to the user’s immersion or presence in a non-physical world. Presence, in the context of virtual reality applications, is defined by Steuer (1992) as the “sense of being there” (as cited by Riva, 2008), or the sense of being physically present in a different world that surrounds the individual. Originally a niche tool within the digital toolbox competing with a myriad of attention-economy products in the entertainment space, VR’s digital simulations have become realistic enough to enable use cases where dangerous or complex scenarios can be safely reenacted at low cost in a virtual environment, such as digital therapeutics, training, planning, and design.

Virtual Reality Exposure Therapy (VRET)

Virtual reality, going beyond practical commercial tools, has also found many applications in the area of psychology, assisting both researchers in studying human behavior and patients in coping with phobias, post-traumatic stress disorder (PTSD), and substance use disorders. Computer-generated 3D VR environments have been used experimentally in new fields of endeavor, including experimental systems and methods for helping users overcome their phobias via virtual reality exposure therapy (VRET). The fundamental work of Barbara Rothbaum et al., in which automated psychological intervention delivered by immersive virtual reality was found to be highly effective in reducing fear of heights, was followed by a substantial body of research, including VR systems developed to help people overcome a fear of flying by having them participate in a controlled virtual flying environment, and systems that help patients, such as burn victims, reduce their experience of pain by refocusing their attention away from the pain and into a 3D VR environment, such as a virtual snow world. The virtual environment created in such therapies is perceived as real enough by the user to generate measurable physiological responses (increased heart rate, breathing, or perspiration) to the feared stimuli in a controlled setup, offering clinical assessment, treatment, and research options that are not available via traditional methods. By confronting a scenario that closely maps onto the phobia, subjects are able to diminish avoidance behavior through the processes of habituation and extinction (Riva, 2008). Beyond helping patients with fear of heights (acrophobia) or fear of flying, VRET has to date been successfully used to address a myriad of specific phobias, including claustrophobia, fear of driving, arachnophobia (fear of spiders), and social anxiety, and to treat PTSD in Vietnam War combat veterans.
More recently, the U.S. Army has developed Full Spectrum Warrior, a real-time tactics and combat simulation video game used as a VR treatment aid for PTSD in Operation Iraqi Freedom/Operation Enduring Freedom (OIF/OEF) combat service men and women, as well as those who have served in Afghanistan.

As with phobias, virtual reality has emerged as a powerful new tool to help individuals with substance use disorders. Virtual experiences have been shown to present several opportunities to improve patient treatment for substance use disorders involving tobacco, alcohol, or illicit drugs. Through VR, patients are able to practice recovery techniques and cope with triggers in a safe and protected environment, allowing them to maintain sobriety and avoid relapse. Beyond serving as a treatment platform, VR has also been found to assist in studying and measuring human behavior and cognition, helping researchers explore human nature in controlled or custom-designed settings.

In a similar manner, virtual reality is emerging as a promising tool to complement SIT in military ICE settings. “VR can enhance the effect of SIT by providing vivid and customizable stimuli” (Wiederhold and Wiederhold, 2008), stimuli that can be adapted to each particular special forces pre-combat ICE environment training scenario, or even individually tailored to SOF personnel.

VR in Stress Inoculation Training

Today’s military organizations across the world, in all three services (army, navy, and air force), have a long history of employing combat simulations for training exercises, which play an essential role in preparing soldiers and pilots for modern combat. VR is often used in air forces to train personnel, both aircrew and combat service support. The most well-known use originated in flight simulators, which were designed for training in dangerous situations (e.g., coordination with ground operations, emergency evacuation, aircraft control whilst under fire) without actually putting the individual or aircraft at risk, and at substantially lower cost. More recently, the US Air Force (USAF) has taken steps to implement a VR training scenario that includes a visual simulation of an airfield, enabling airmen to practice their roles as if they were operational.

The goal of integrating VR in SIT is to enable repetitively practiced skills to become automated over time, thereby requiring less attention under stress and becoming more resistant to disruption in a subsequent real environment (Wiederhold and Wiederhold, 2008). It facilitates knowledge of and familiarity with a stressful environment, the practice of task-specific and psychological skills, and confidence in an individual’s capabilities. The U.S. Department of Defense spends an estimated $14 billion per year on the Synthetic Training Environment (STE), a training program that deploys digital environments to “provide a cognitive, collective, multi-echelon training and mission rehearsal capability for the operational, institutional and self-development training domains” (USAASC, 2019). This suggests that existing commercial tools could enable the SOF to move beyond traditional training simulators while improving the quality of SIT itself, specifically designed for ICE environments.

Today, in post-combat use, VR is already being implemented to aid recovery from psychological trauma for people with post-traumatic stress disorder and to help researchers create more objective measures of PTSD, such as with Virtual Iraq, later renamed Bravemind. Bravemind is a virtual reality environment that provides prolonged exposure (PE) therapy to veterans suffering from post-traumatic stress. In this cognitive-behavioral intervention, the subject is virtually and incrementally exposed to a variety of stimuli (i.e., visual, auditory, kinesthetic, and olfactory) tied to the stressful triggers specific to them, until adaptation to the traumatic experiences occurs. Moreover, preliminary findings suggest that, in pre-deployment use, such tools could be used to evaluate individuals who might be more susceptible than others to the effects of PTSD before combat. By teaching these coping skills preemptively, researchers hope to clinically identify and evaluate physiological reactions during the VR exposure to determine whether the individual would require continued or prescribed care. Initial outcomes from open clinical trials using virtual reality have been promising, giving the therapist flexibility to expose the user only to environments he or she would be capable of confronting and processing (Wiederhold and Wiederhold, 2008). Observations in open clinical trials showed that those who were exposed to emotionally evocative scenarios and acquired coping mechanisms exhibited lower levels of anxiety than those in the control group.

It is argued that in a similar manner VR could be modified for SIT in ICE environments for SOF.

. . . such a VR tool initially developed for exposure therapy purposes, offers the potential to be “recycled” for use both in the areas of combat readiness assessment and for stress inoculation. Both of these approaches could provide measures of who might be better prepared for the emotional stress of combat. For example, novice soldiers could be pre-exposed to challenging VR combat stress scenarios delivered via hybrid VR/Real World stress inoculation training protocols as has been reported by Wiederhold & Wiederhold (2005) with combat medics. (Rizzo et al., 2006)

Researchers from a broad range of disciplines have argued that combining VR with SIT can be more effective than real-world training systems, without incurring the costs of staging rare or dangerous experiences, excessive time expenditure, or unique scenario adaptation.

Given its ability to present immersive, realistic situations over and over again, the technology can give SOF trainers and recruiters the opportunity to build expertise with conditions before soldiers encounter them for the first time in the field. Moreover, VR can also offer the ability to design individually tailored scenarios to accommodate the “long tail” of tactical challenges in ICE environments: psychosocial adaptation to military captivity; dealing with the civilian population in the area of operation; addressing performance breakdowns that might occur due to improper case-by-case cooperation, coordination, communication, and/or psychosocial adaptation within a tactical team; and mitigating the risk of adverse cognitive or behavioral conditions and psychiatric disorders pre- and post-deployment. Such challenges are more often encountered in the type of operations SOF units perform.

Future Directions

Despite its potential benefits, the implementation and understanding of VR in military settings is still not fully developed and focuses mainly on general applications of SIT. This writing argues that, by evaluating its performance in stressful environments, we might be able to make progress in reliably identifying physiological and psychological reactions during VR exposure in ICE environments, and thereby determine whether an individual can enhance his or her ability to cope with severe stress enough to ensure mission success or survival.
