Stopping Rule Selection (SRS) Theory
The critical step facing every decision maker is determining when it is appropriate to stop collecting evidence and make a decision. The criterion used to make this determination is known as the stopping rule. Over the years, several unconnected explanations have been proposed, each suggesting that nonoptimal approaches can account for some of the observable violations of the optimal stopping rule. Current research proposes a unifying explanation for these violations based on a new stopping rule selection (SRS) theory. The main innovation is the assumption that a decision maker draws from a large set of different kinds of stopping rules and is not limited to using only one. The SRS theory hypothesizes that there is a storage area for stopping rules, the so-called decision operative space (DOS), and a retrieval mechanism that is used to select stopping rules from the DOS. The SRS theory has provided a good fit to challenging data published in the relevant literature.
Figure 1: (A) The hypothesized decision operative space (DOS) for three stopping rules. Each point represents a single stopping rule with a stopping value. A straight line connects the same stopping rule with different stopping values. (B) A cast-net retrieval from the DOS. Dotted circles represent three different cast nets. A decision maker decides to throw a cast net, of a specified diameter, at a certain location of the DOS. In a single trial the cast net can capture several stopping rules.
Systems Factorial Technology (SFT) - a magnifying glass on human cognition
SFT is a suite of methodologies based on analysis of reaction time (RT) distributions (Townsend & Nozawa, 1995). The core idea is analogous to a reverse-engineering problem: An experimenter manipulates the speed of certain processes and measures the resulting RT distributions. It turns out that different underlying cognitive systems generate specific diagnostic signatures in RT distributions. SFT is designed to act as a magnifying glass, allowing us a closer look at properties of cognitive systems that are not directly observable. These properties are (a) mental architecture, which defines the way processes are organized in serial, parallel, or more complex networks; (b) stopping rules, which define when a system should stop accumulating evidence and proceed with the decision (e.g., the exhaustive stopping rule: do not decide until all processes are completed; the self-terminating stopping rule: decide at the first process that provides critical evidence); (c) interdependency between processes, which defines whether there is facilitation or inhibition between processes; and (d) overall capacity of the system, which relates to the efficiency with which a cognitive system uses processing resources.
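To make the four combinations of architecture and stopping rule concrete, the sketch below simulates completion times for a system with two processes. It is an illustration under stated assumptions, not the SFT procedure itself: the names `simulate_rt` and `mean_rt`, the exponential process durations, and the rate values are all hypothetical choices, and the serial self-terminating case assumes that whichever process happens to run first already provides the critical evidence.

```python
import random
from statistics import mean


def simulate_rt(architecture, stopping_rule, rate1, rate2, rng):
    """Return one simulated completion time for a two-process system.

    Assumption: each process duration is exponential with the given rate.
    Any stopping_rule other than "exhaustive" is treated as self-terminating.
    """
    t1 = rng.expovariate(rate1)
    t2 = rng.expovariate(rate2)
    if architecture == "serial":
        if stopping_rule == "exhaustive":
            return t1 + t2  # both processes finish, one after the other
        # Self-terminating: stop after the first process, chosen at random.
        return t1 if rng.random() < 0.5 else t2
    if architecture == "parallel":
        if stopping_rule == "exhaustive":
            return max(t1, t2)  # wait for the slower process
        return min(t1, t2)  # stop as soon as the faster process finishes
    raise ValueError(f"unknown architecture: {architecture!r}")


def mean_rt(architecture, stopping_rule, n=20000, seed=1):
    """Average simulated completion time over n trials (both rates set to 1.0)."""
    rng = random.Random(seed)
    return mean(
        simulate_rt(architecture, stopping_rule, 1.0, 1.0, rng) for _ in range(n)
    )


for arch in ("serial", "parallel"):
    for rule in ("self-terminating", "exhaustive"):
        print(arch, rule, round(mean_rt(arch, rule), 2))
```

Under these assumptions the four systems produce different mean completion times, but mean differences of this kind are not diagnostic by themselves, which is why SFT relies on factorial manipulations and distributional signatures.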
Identifying the above process properties, through the application of SFT, can help us better understand cognitive and perceptual behavior. For example, how do we recognize faces, and is there any accurate way to tell a friendly face from a terrorist face? When a plane's engines are on fire, how should the pilot retrieve emergency procedures from memory? How do we decide which car to buy, or which presidential candidate should get our vote? Consider a security guard trying to spot a suspect in a crowded hall. Using a serial mental architecture, the guard would sequentially compare memorized terrorists' faces to all the faces in that guard's visual field. After each comparison a separate decision is made. In this example the mental architecture refers to the set of serial comparisons. Alternatively, the search process could be conducted in parallel: The security guard could simultaneously compare the template face to all faces in his/her visual field. The deployment of SFT tools provides a means for determining which architecture is engaged by the security guard.
To acquire answers to these and similar questions, I am developing and testing quantitative models that provide a full account of both the time course of processing and the associated choice probabilities. Within the broader theme of information processing, my research focuses on several interlacing lines of work.
Development of Diagnostic Signatures for Testing Cognitive Models
To infer the processing order (serial, parallel) and stopping rule (limited, exhaustive), SFT uses RT signatures based on two nonparametric statistics: the mean interaction contrast (MIC; Sternberg, 1969; Schweickert, 1978; Ashby & Townsend, 1980) and the survivor interaction contrast (SIC). The latter makes use of data at the distributional level rather than at the level of means and therefore permits a more powerful and detailed analysis (Townsend, 1990; Townsend & Nozawa, 1995; for an extension to complex networks see Schweickert, Giorgini & Dzhafarov, 2000).
To obtain the MIC signature from the observed data, one computes the mean RT in each of the four factorial conditions (RT_{HH}, RT_{LH}, RT_{HL}, RT_{LL}) and combines them into a double difference:
MIC = [RT_{LL} - RT_{LH}] - [RT_{HL} - RT_{HH}]
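As a minimal sketch, the mean interaction contrast can be computed directly as the double difference of the four condition means. The function name and the RT values below are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def mean_interaction_contrast(rt_ll, rt_lh, rt_hl, rt_hh):
    """MIC = [mean(LL) - mean(LH)] - [mean(HL) - mean(HH)]."""
    return (mean(rt_ll) - mean(rt_lh)) - (mean(rt_hl) - mean(rt_hh))


# Made-up RTs (in ms) for the four saliency conditions.
rt_ll = [700, 720, 680]
rt_lh = [600, 610, 590]
rt_hl = [590, 600, 610]
rt_hh = [500, 510, 490]

mic = mean_interaction_contrast(rt_ll, rt_lh, rt_hl, rt_hh)
print(mic)
```

In these made-up data the two saliency effects are exactly additive (condition means of 700, 600, 600, and 500 ms), so the contrast is zero. In the SFT literature, additivity of this kind is typically associated with serial processing, whereas positive and negative contrasts are associated with parallel self-terminating (or coactive) and parallel exhaustive processing, respectively.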
Graphically, the signature is obtained by plotting the observed mean RTs as a function of the factorial conditions. The dependent variable, the observed mean RT, is on the y-axis. The first factor, the saliency level of process 1 (L, H), is on the x-axis. The second factor, the saliency level of process 2 (L, H), is represented by separate lines. The four mean RTs are calculated from the experimental conditions and shown in a line plot, in which the means that share a level of the second factor and differ only on the first factor are connected by a line (LH and HH on one line, LL and HL on the other) (Figure 2).
The MIC signatures reveal considerably more about the underlying cognitive processes than methods that rely on simple mean RTs. SFT also provides an even stronger tool than the MIC for distinguishing between mental architectures: the survivor interaction contrast (SIC) function. The SIC function operates on survivor functions estimated from the experimental data. A survivor function S(t) gives the probability that a process of interest has not yet finished by time t. Instead of reducing a data set to a single mean RT, each data set is converted into a survivor function, which can easily be done using standard statistical software packages. The additional information contained in the shape of the survivor function provides more diagnostic power than a single mean RT or the MIC (Figure 2).
When two processes are involved, factorially combined with saliency levels, the SIC function is defined as follows:
SIC(t) = [S_{LL}(t) - S_{LH}(t)] - [S_{HL}(t) - S_{HH}(t)].
Here S_{HL} indicates the survivor function for the condition in which the first process was of high salience (thus H) and the second process was of low salience (thus L). One can compute the survivor functions associated with each of the four factorial combinations of cues and saliency levels, which we denote S_{LL}(t), S_{LH}(t), S_{HL}(t), and S_{HH}(t). Because processing is presumably slower for low-saliency (L) than for high-saliency (H) processes, the following orderings should hold: S_{LL}(t) ≥ S_{LH}(t) and S_{HL}(t) ≥ S_{HH}(t), for all times t (Townsend & Nozawa, 1995).
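The computation can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the survivor function is estimated as the simple proportion of RTs exceeding t (without smoothing), and the four small data sets are made up.

```python
def survivor(rts, t):
    """Empirical survivor function: proportion of RTs strictly greater than t."""
    return sum(1 for rt in rts if rt > t) / len(rts)


def sic(rt_ll, rt_lh, rt_hl, rt_hh, t):
    """Survivor interaction contrast at time t."""
    return (survivor(rt_ll, t) - survivor(rt_lh, t)) - (
        survivor(rt_hl, t) - survivor(rt_hh, t)
    )


def ordering_holds(rt_slow, rt_fast, grid):
    """Check that the slower condition's survivor function is never below the faster one's."""
    return all(survivor(rt_slow, t) >= survivor(rt_fast, t) for t in grid)


# Made-up RTs (arbitrary time units) for the four factorial conditions.
rt_ll = [5, 6, 7, 8]
rt_lh = [4, 5, 6, 7]
rt_hl = [4, 5, 6, 7]
rt_hh = [3, 4, 5, 6]

grid = [2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]
sic_curve = [sic(rt_ll, rt_lh, rt_hl, rt_hh, t) for t in grid]
print(sic_curve)
print(ordering_holds(rt_ll, rt_lh, grid), ordering_holds(rt_hl, rt_hh, grid))
```

Evaluating the contrast over a grid of time points yields the SIC curve whose shape serves as the diagnostic signature. With real data one would use many more observations per condition and a finer grid.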
Figure 2: The SFT signatures, MIC (on the left) and SIC (in the middle), that are used to detect the various mental architectures (on the right) involving different organization of processing order (serial, parallel), stopping rule (self-terminating, exhaustive), or process independence (independent, dependent). The mental architectures show possible organizations of the processing of two facial features in a face recognition task.
Theoretical Advancements: A Synthesis of Mental-Architecture, Random-Walk, and Decision-Bound Approaches
Much of the lab's ongoing work concerns the development of a general model of information processing (Fific, Little, & Nosofsky, 2010). This innovative approach, published in Psychological Review, represents the formal synthesis of more traditional views based on (a) identification of the cognitive system's architecture (Sternberg, 1969; Treisman & Gelade, 1980), (b) random-walk or diffusion processes (Ratcliff, 1978; Ratcliff, Van Zandt, & McKoon, 1999), and (c) models of classification of stimuli based on using decision bounds (Ashby & Townsend, 1986; Maddox & Ashby, 1993). The first is the traditional information-processing approach, which aims to identify an information-processing system in terms of its architecture (serial or parallel), stopping rule (self-terminating or exhaustive), and memory capacity. The second is an approach that characterizes cognitive systems using random-walk or diffusion-based processes. A cognitive system using a random walk combines a mechanism for sequential evidence collection (sequential sampling) with a mechanism that determines whether the accumulated evidence has reached the criterion (decision bound) needed to proceed with the decision. This approach provides a full account of RT distributions for both correct and incorrect responses. The third approach assumes that decision making relies on using decision bounds to divide a perceptual stimulus space into response regions.
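The random-walk component described above can be sketched as follows. This is a generic textbook-style random walk with two symmetric bounds, not the logical-rule model of Fific, Little, and Nosofsky (2010); the function name, the unit step size, and the parameter values are assumptions made for illustration.

```python
import random


def random_walk_trial(p_up, bound, rng, max_steps=10_000):
    """Accumulate +1/-1 evidence steps until a decision bound is reached.

    Returns (response, number_of_steps). The response is "A" at the upper
    bound, "B" at the lower bound, and None if no bound is reached in time.
    """
    evidence = 0
    for step in range(1, max_steps + 1):
        # Sequential sampling: each sample nudges the evidence up or down.
        evidence += 1 if rng.random() < p_up else -1
        if evidence >= bound:
            return "A", step
        if evidence <= -bound:
            return "B", step
    return None, max_steps


rng = random.Random(7)
trials = [random_walk_trial(p_up=0.6, bound=5, rng=rng) for _ in range(1000)]
p_a = sum(1 for response, _ in trials if response == "A") / len(trials)
mean_steps = sum(steps for _, steps in trials) / len(trials)
print(p_a, mean_steps)
```

A single mechanism of this kind yields both a choice and a decision time on every trial, which is why this class of models can account for choice probabilities and RT distributions together.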
Remarkably, there has not been much overlap among the three approaches. In collaboration with Dr. Robert Nosofsky and Dr. Little, Dr. Fific designed a unified framework combining information-processing architectures with random-walk processes, thereby remedying the deficits that each approach exhibits in isolation. Under the umbrella of SFT, we formalized the logical-rule models of classification RTs and tested a set of logical-rule models for predicting perceptual classification RTs and choice probabilities (Fific et al., 2010). The new computational logical-rule models can be used to assess different information-processing decision strategies employed by human operators. For example, we can examine which information source was processed first or second, whether processing of separate sources of information was conducted serially or in parallel, and whether the perceptual sources of information were treated independently or pooled together into a single source. The logical-rule approach that combines mental-architecture and random-walk approaches provides stronger assessments of these properties than when the information-processing approach is used in isolation. Another important breakthrough achieved by using the logical-rule approach is a rigorous delineation of the major classes of computational cognitive models involving rule-based strategies and the models based on global familiarity/exemplar properties (e.g. Nosofsky & Palmeri, 1997).
Validation of SFT
One of my major accomplishments is the validation of the SFT methodology in diverse cognitive tasks. The main thrust of this work has been to extend this methodology from simple detection tasks to more complex tasks, including memory and visual search, face categorization (Fific, 2006; Fific, Nosofsky, & Townsend, 2008; Fific, Townsend, & Eidels, 2008; Townsend & Fific, 2004), and stimulus classification, as well as to real-life complex decision making (Fific & Rieskamp, 2010). The SFT methods (Townsend & Nozawa, 1995) are used to nonparametrically distinguish between alternative information-processing architectures. Unlike traditional approaches that use only mean RTs (Sternberg, 1969), SFT provides more inferential power by examining diagnostic signatures of RT distributions. The SFT methodology is a major breakthrough in the field of information processing because it overcomes the problem of serial-parallel model mimicking (Townsend, 1971, 1990). Within SFT it is possible not only to differentiate serial and parallel processing, but also to determine whether the processing is self-terminating or exhaustive. Additional tools in SFT address the issue of possible interdependencies among processing units.
Figure 4: Stimulus modalities employed by SFT.
Visual & Memory Search
In a visual-search study, we discovered that participants showed evidence of serial processing when search items were of minimal visual complexity, a result in line with that of several models of visual perception. When the visual complexity of the search items increased, SFT revealed a departure from pure serial and pure parallel processing. The most plausible explanation at present relies on a parallel architecture with interacting processes (Fific, Townsend, & Eidels, 2008). In a study on short-term-memory search, we found that, depending on the duration of presentation time, the exhibited SFT signatures changed from pure serial to pure parallel processing (Figure 5). We also observed significant individual differences: Some participants preferred serial processing and some preferred parallel processing. A purely 'parallel' participant was a pilot working for a commercial airline. The knowledge gained from this study on short-term memory promises to have important implications for improving means of evaluating people's skills or aptitudes. It is plausible to adapt the SFT methodology to evaluate the cognitive skills of candidates for jobs that require the ability to process information in a certain order. One would expect that pilots should be able to multitask, that is, to process information in parallel, when necessary. However, depending on the situation and the task, an operator could be required to process information strictly serially. In both cases SFT is able to clearly delineate the two systems, serial and parallel, based on RT data.
The ability of SFT to capture striking individual differences makes it a valuable tool for modeling clinical populations (e.g., McFall & Townsend, 1998; Neufeld & McCarty, 1994; Vollik, 1994). In collaboration with Dr. Richard Neufeld, we outlined our research strategies for the application of SFT in a clinical population (Townsend, Fific, & Neufeld, 2007). We hypothesized that evidence for the presence (or absence) of the serial and parallel mental architectures in different cognitive tasks could be a highly diagnostic tool for distinguishing between normal and symptomatic participants.
Figure 5: Short-term memory data. (A) Data obtained for each participant are presented across rows. Survivor interaction contrasts (SICs) are presented in the first two columns, one for each level of ISI (700 and 2000 ms). Each obtained SIC (dotted line) is presented together with its corresponding model fit (solid line). The r² statistics are shown in the lower right corner for each model. (B) Corresponding mean interaction contrasts (MICs) are presented in the last two columns, one for each level of ISI (700 and 2000 ms).
Integral-Separable Psychological Dimensions
In another set of validation tests, we hypothesized (Fific, Nosofsky, & Townsend, 2008) that SFT would reveal evidence of distinct mental processing architectures for the classification of separable-dimension stimuli and integral-dimension stimuli (Garner, 1974). Integral dimensions are those that combine into relatively unanalyzable, unitary wholes (e.g., brightness, saturation, and hue). By contrast, separable dimensions are those that remain psychologically distinct when combined (e.g., shape and color). We had expected that integral dimensions would give rise to coactive processing, while separable dimensions would call for either serial or parallel independent processing. Our reasoning was that in the case of integral-dimension stimuli, the perceptual system apparently "glues" the individual dimensions into whole objects at an early stage of processing. For the separable-dimension stimuli in an exhaustive categorization task, the SFT methodology revealed a serial or parallel architecture with an exhaustive stopping rule. As hypothesized for the integral-dimension stimuli, the SFT methodology provided clear evidence of coactivation. This research adds to the converging evidence for distinguishing separable-dimension and integral-dimension interactions (Figure 8).
Figure 8: Examples of separable and integral psychological dimension stimuli used in a set of experiments employing SFT analysis.
Face Perception
Dr. Fific's most recent accomplishment has been the extension of the SFT application into the domain of holistic face perception. The established view is that faces are special in perception. They seem to be processed very rapidly, very accurately, and, most importantly, as a 'Gestalt' or whole. A face's wholeness obstructs us from seeing its detail. When we perceive a friend's face we are not usually aware of that person's lips or chin in isolation. A simple serial or parallel information-processing architecture cannot account for these unique face properties. We therefore developed a model of a coactive architecture that gives rise to holistic face processing (Fific, 2006; Fific & Townsend, 2010; Wenger & Townsend, 2001). Coactivation occurs when some form of cross-talk exists between individual information-processing channels. Cross-talk results in separate processing channels summing or pooling their information into a common channel prior to any decision. For example, two information channels could be two facial feature detectors: one for detecting Joe's eyes and the other for detecting Joe's lips. When processing is independent, each facial feature is processed separately. But in coactive processing, both channels are pooled into one decision channel, forming a higher-order unit: Joe's face.
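The contrast between independent and coactive processing can be illustrated with a simple two-channel counter model. This is a deliberately simplified sketch and not the computational face-perception model cited above: the function name, the discrete time steps, the per-step detection probabilities, and the shared threshold are all hypothetical.

```python
import random


def detection_time(mode, p1, p2, threshold, rng, max_steps=100_000):
    """Time steps until a decision in a two-channel counter model.

    mode="independent": each channel keeps its own counter, and an OR gate
        fires as soon as either counter reaches the threshold.
    mode="coactive": both channels feed one pooled counter, which is
        compared to the same threshold.
    """
    if mode not in ("independent", "coactive"):
        raise ValueError(f"unknown mode: {mode!r}")
    count1 = count2 = 0
    for step in range(1, max_steps + 1):
        # Each channel gains one unit of evidence with its own probability.
        count1 += 1 if rng.random() < p1 else 0
        count2 += 1 if rng.random() < p2 else 0
        if mode == "coactive":
            if count1 + count2 >= threshold:  # pooled evidence
                return step
        elif count1 >= threshold or count2 >= threshold:  # OR gate
            return step
    return max_steps


rng = random.Random(3)
for mode in ("independent", "coactive"):
    times = [detection_time(mode, 0.3, 0.3, 10, rng) for _ in range(2000)]
    print(mode, sum(times) / len(times))
```

Because the pooled counter receives evidence from both feature channels, it reaches the threshold sooner than either channel alone, which is the intuition behind treating coactivation as a higher-order unit.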
In a set of experiments, I validated the SFT predictions that participants would exhibit the coactive architecture when identifying well-learned faces (Fific, 2006; Fific & Townsend, 2010). In a speeded classification task, the participants learned to classify briefly displayed faces into two face groups (Jets and Sharks). Participants exhibited a parallel architecture at the beginning of the lengthy training period and a coactive architecture toward the end. When we manipulated both the featural and configural face properties, participants switched from coactive to parallel processing. To account for the observed data patterns, I designed a computational model of face perception (Fific, 2006; Fific & Townsend, 2010). The heart of the model is a device that modifies the level of analytic/holistic processing (Figures 6 and 7), which embodies the idea that face perception can be cognitively controlled over a range of processing types, from analytic, feature-based processing to holistic processing.
Figure 6: The Gestalt-O-Meter, a metaphoric cognitive device capable of changing the level of perceptual holism. If more holism is added, the face percept becomes more integral.
Figure 7: A cognitive mechanism for detecting two facial features (u1 and u2). Evidence for each feature is accumulated separately. When a feature is detected the signal is sent to the OR (or AND) gate. The red lines correspond to a cross-talk mechanism that is postulated to underlie the holistic unitization.
Reading
One area in which psychologists still disagree widely with regard to processing architecture is reading. Some computational models of human eye movements during reading rest on the assumption that people visually attend to one word at a time, processing adjacent words serially. The E-Z Reader model is based on this assumption and is arguably the most prominent model in its field at the moment. The SWIFT model, however, assumes that attention is distributed across the visual span as we read.
A member of Dr. Fific's lab, Kyle Zimmer, has an interest in psycholinguistics that has led him to design and conduct a pilot experiment using the SFT methodology to test these assumptions with diagnostic individual-subject analyses. Preliminary results indicate that people are indeed capable of visually processing at least adjacent words in parallel in a self-paced reading task, and perhaps even coactively. More testing is warranted and underway.
Judgment and Decision Making - testing different decision making strategies under different environments
During the course of Dr. Fific's work with the Max Planck Institute for Human Development's Adaptive Behavior and Cognition (ABC) group, he has developed a research program with several topics. The topics are centered on the application of the SFT methodology to uncover the processes in different decision strategies.
The current approaches to analysis of RT trends have provided a strong line of converging evidence toward the identification of the decision strategy being used (e.g., Bröder & Gaissmaier, 2007). A major scientific debate (e.g., Gigerenzer & Goldstein, 1996) is going on between two approaches. The so-called fully rational approach advocates that, for the best results, a decision maker should use all available information to make a decision. In the second approach, the best decisions are made using only part of the available information, cleverly prioritized, usually employing some lexicographic reasoning. This is called the boundedly rational approach (Simon, 1957). The class of bounded lexicographic strategies assumes that an object's attributes are compared in a sequential manner, a comparison process that stops when it finds a critical distinguishing attribute, with possible termination on each attribute. One typical example is the take-the-best heuristic. In contrast, the class of fully rational decision strategies involves many candidate models, such as Franklin's rule and Dawes's rule. The rational models naturally assume that decision strategies should conform to comparison of all attributes, but they are silent with respect to the processing order of the underlying cognitive architecture, that is, whether attributes are inspected in serial or parallel fashion.
However, strong tests of the processing properties of the fully rational and boundedly rational approaches are rare. The standard tests of decision strategies focus not on revealing underlying processes, but rather on predicting a choice probability output. The SFT methodology allows for comparison between the boundedly rational and fully rational decision strategies using RTs in addition to choice probability.
To address the issues of determining the order of processing (serial vs. parallel) and amount of processing (restricted vs. exhaustive information search), I applied a powerful theory-driven SFT methodology for testing RT predictions. SFT revealed quite distinctive patterns of RT results, clearly delineating between rational and boundedly rational decision strategies. Overall, subjects showed different RT response patterns in the different learning environments. Strong support for take-the-best and lexicographic strategies was found in the learning environment in which a single attribute could be used to make a correct decision. Parallel, exhaustive processing was observed in the compensatory environment, in which all attributes were needed to make the correct decision (Fific & Rieskamp, 2010; Gaissmaier, Fific, & Rieskamp, 2010).
Figure 9: Which bug is more poisonous? (A) Fangs, legs, and body as the bug's poisonous cues. The left bug's cues were manipulated to be poisonous, and the right bug's cues were non-poisonous. (B) Examples of the masking using transparent leaves to produce the saliency effect (L and H).
Rosetta Stone for Decision Making Strategies
This project aims at establishing an important bridge between the cognitive methods designed to trace elementary cognitive processes and the theoretical concepts developed in the domain of judgment and decision making (JDM). First, using the analogy of the Rosetta stone, we show the theoretical link which allows a more precise mapping between the related concepts across the two domains. Then, we introduce the reaction time technology developed to reveal an organization of mental processes in the cognitive domain. Finally, we show the application of the reaction time technology in a typical JDM task, such as a probabilistic inference task.
Traditionally, empirical studies in judgment and decision making (JDM) have been dominated by examining choice outcomes with only a weak motivation to measure underlying cognitive processes. More recently, however, JDM theories have become increasingly process-oriented; yet, these theories are difficult to test based on traditional choice outcome data alone. One method of examining cognitive processes is to analyze response times (RT). The significance of RT analysis in process tracing has been well documented in the area of cognitive psychology, where the central aim is to isolate plausible cognitive mechanisms by employing tests on RTs as well as choice outcomes (e.g., Ashby, 2000; Heath, 1992; Link, 1992; Nosofsky & Palmeri, 1997; Ratcliff, 1978; Fific, Little & Nosofsky, 2010).
The recent use of RT trend analyses within JDM provides a strong line of converging evidence toward the identification of decision strategies (e.g., Bergert & Nosofsky, 2007; Bröder & Gaissmaier, 2007; Persson & Rieskamp, 2009). These studies compare compensatory (additive) or non-compensatory (lexicographic) decision strategies. Lexicographic strategies (e.g., Take-The-Best, TTB) assume that matching attributes are compared sequentially and that this comparison process stops on a critical attribute that allows for distinction. Thus, the decision process can terminate on any attribute. In contrast, compensatory decision strategies involve many candidate models, such as Franklin's rule (FR), Dawes's rule (DR), and weighted additive (WADD) strategies. Such models assume that decision makers compare choice options on all attributes, but do not formally specify processing order, that is, whether attributes are inspected serially (one cue at a time) or in parallel (multiple cues simultaneously). We thus propose a methodology that can differentiate between serial and parallel processing, as well as between exhaustive (all attributes are used) vs. restricted (only a subset are used) information search.
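The difference between the two strategy classes can be sketched for a single pair comparison with binary cues. The function names, the cue coding (1 = cue present, 0 = cue absent), and the example weights are hypothetical; unit weights in the additive rule correspond to a Dawes-style tally, while validity-based weights correspond to a Franklin-style or WADD-style rule.

```python
def take_the_best(cues_a, cues_b, validity_order):
    """Lexicographic choice: inspect cues in order of validity and stop at
    the first cue that discriminates between the two options.

    Returns (choice, number_of_cues_inspected); choice is None for a tie.
    """
    for inspected, idx in enumerate(validity_order, start=1):
        if cues_a[idx] != cues_b[idx]:
            return ("A" if cues_a[idx] > cues_b[idx] else "B"), inspected
    return None, len(validity_order)


def weighted_additive(cues_a, cues_b, weights):
    """Compensatory choice: use every cue, weight it, and compare the sums.

    Returns (choice, number_of_cues_inspected); choice is None for a tie.
    """
    score_a = sum(w * c for w, c in zip(weights, cues_a))
    score_b = sum(w * c for w, c in zip(weights, cues_b))
    if score_a == score_b:
        return None, len(weights)
    return ("A" if score_a > score_b else "B"), len(weights)


# Option A wins on the most valid cue; option B wins on the two weaker cues.
cues_a = (1, 0, 0)
cues_b = (0, 1, 1)

print(take_the_best(cues_a, cues_b, validity_order=(0, 1, 2)))
print(weighted_additive(cues_a, cues_b, weights=(0.9, 0.6, 0.6)))
print(weighted_additive(cues_a, cues_b, weights=(1, 1, 1)))
```

In this example the lexicographic strategy stops after one cue and picks option A, whereas both additive rules inspect all three cues and pick option B. Note that the sketch fixes the amount of information used but, like the models themselves, says nothing about whether cues are inspected serially or in parallel.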
Rather than relying solely on RT means or medians as in previous JDM work, we introduce a strong theory-driven methodology for testing RT predictions on entire distributions. Systems factorial technology (SFT) can diagnose the type of processing architecture that underlies performance in different cognitive tasks, for example, whether information processing is serial or parallel, and exhaustive or restricted (e.g., Schweickert, 1985; Townsend & Ashby, 1983; Townsend & Nozawa, 1995; Townsend & Wenger, 2004). SFT is based on a powerful combination of experimental design features (typically involving selective influence) and analytic techniques (e.g., analyzing specific patterns in interaction contrasts). The aim of our particular study is to determine: (a) how strategy selection is affected by different environments; and (b) the processing structure of the decision making strategies TTB, WADD, FR, and DR.
We applied SFT to a pair-comparison inference task in which a subject had to decide which of two objects scored higher on a criterion. The objects differed on three independent cues. Two groups of subjects participated in either a compensatory or a non-compensatory environment. In each environment, the subjects went through two phases: a learning phase in which they received probabilistic feedback after each pair comparison, and a critical test phase consisting of a two-cue pair-comparison task without feedback. In order to apply the SFT test, the cues' saliency levels were manipulated using transparent natural leaf masks (for the design details see Gaissmaier, Fific & Rieskamp, 2010). SFT uncovered distinctive patterns of RT results that clearly delineated between compensatory and non-compensatory decision strategies. We found convincing support for TTB and lexicographic decision making in the non-compensatory environment. Subjects performed the task in strictly serial fashion, processing the most valid cue first and processing the second cue only if the first cue did not discriminate. In contrast, the analysis revealed parallel, exhaustive cue processing in the compensatory environment. Subjects processed the cues concurrently and waited to make a choice until they had processed all the cues. The SFT test and RT patterns allowed for fine-grained insights into the processing structure of decision strategies, which could not have been achieved by solely analyzing choice outcomes or by using cruder measures of RT.
Figure 10: The proposed Rosetta Stone for cognitive strategies. On the left-hand side, several cognitive strategies, in terms of the basic mental architectures, are described. On the right-hand side, several decision strategies are described and characterized with respect to their postulated mental processes (information search and processing order). The mid part is used to connect the cognitive strategies and decision strategies by way of SFT diagnostic mean RT signatures.
Mathematical and Computational Modeling
Central to Dr. Fific's work is the use of mathematical and computational models as tools for integrating and directing research. Dr. Fific's approach strongly supports quantification and rigorous hypothesis testing. Modeling provides a powerful scientific framework for evaluating theories of cognitive function. For example, models can help organize diverse sets of findings that might otherwise seem unrelated, and predictions derived from competing models can be used to guide empirical research (theory-driven methodology). Dr. Fific will continue to develop computational models based on sequential sampling models, such as the random walk or diffusion process, that provide a full account of both RT distributions and choice probabilities (Fific et al., 2010). Dr. Fific's achievements and interests are in the area of advanced mathematical and statistical analysis, including model selection, nonlinear model fitting, optimization problems, bootstrapping, time-series analysis, and Bayesian inference.