Friday, 26 June 2015

The Perturbation Experiment as a Way to Study Perception

When you study perception, your goal is to control the flow of information going into the system so that you can measure the resulting behaviour and evaluate how that information is being used. There are two ways to do this, one (sometimes) used by me, one used by, well, everyone else. In this post I'm going to compare and contrast the methods and describe why the perturbation method is what we should all be doing.

The standard method is to present experimentally isolated cues and test whether people can detect those cues. The perturbation experiment presents a 'full cue' environment but selectively interferes with the link between a single variable and the property it might be information about. These two different methods lead to very different ways of thinking and talking about perceptual abilities. 

Standard Psychophysics

The standard method involves presenting potential sources of information (or 'cues') in isolation, one at a time, and measuring a person's ability to detect each one by asking them to respond when they can see it. A standard experiment might present stimuli that vary in magnitude, adjusting that magnitude over time to find the level at which performance falls to chance; the threshold is the point at which people move, on average, from being able to detect a signal to not being able to detect it. The main thing you are interested in is whether people can detect the cue. There are also variations where you ask people whether they can discriminate the cue from alternatives under various conditions.
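As an illustrative sketch of how such a threshold is found in practice, here is a hypothetical 1-up/1-down staircase in Python (the function and parameter names are mine, not from any particular study):

```python
def staircase(respond, start=10.0, step=2.0, reversals_needed=8):
    """Estimate a detection threshold with a simple 1-up/1-down staircase.

    `respond(level)` returns True if the (real or simulated) observer
    reports detecting a stimulus of the given magnitude. The level goes
    down after a detection and up after a miss; the step size halves at
    every reversal, and the threshold estimate is the mean level over
    the last few reversals.
    """
    level, last, reversal_levels = start, None, []
    while len(reversal_levels) < reversals_needed:
        seen = respond(level)
        if last is not None and seen != last:   # direction change = reversal
            reversal_levels.append(level)
            step /= 2.0
        last = seen
        level = max(level - step, 0.0) if seen else level + step
    return sum(reversal_levels[-4:]) / 4.0

# A deterministic 'observer' who detects anything at magnitude 3.0 or above:
estimate = staircase(lambda level: level >= 3.0)  # converges near 3.0
```

With a noisy human observer, a 1-up/1-down rule converges on the 50% point of the psychometric function; variants like 2-down/1-up target higher performance levels, but the converging logic is the same.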

Presenting cues in isolation provides a great deal of experimental control over the physical properties of the stimulus. You can then relate variation in those properties to psychological responses; the mapping is a psychophysical function. Over time and experiments you can identify a set of cues for something like distance that people can detect, and then you figure out how people combine these cues under various conditions in order to derive estimates of distance in a 'full cue environment'. 

A Problem

Presenting cues in isolation is problematic. Simply presenting some isolated signal to an organism and not asking the organism to use that information to coordinate a relevant behaviour tells you nothing about whether the organism can do that coordination. Sure, you might be able to detect a cue that co-varies a little with the thing you are trying to perceive, but do you ever rely on that cue in the real world?

This problem was identified and tested by Mon-Williams & Bingham (2008). They investigated whether people could use height-in-the-visual-field (HVF) as a cue to distance. HVF is a 2D cue to distance readily observed in a picture; things farther away are depicted higher up (see Figure 1).

Figure 1. Height in the visual field as a cue to distance. The cows at the top of the picture are farther away.
As is clear from Figure 1, something else also changes, namely the apparent size of the cows. Mon-Williams & Bingham controlled this in an experiment where they put people in a dark room and presented them with objects painted with glow-in-the-dark paint. The objects were presented to people at different distances and heights in such a way that apparent size did not change. This isolates HVF as the only source of information about distance. Under these conditions, people's judgments of distance co-varied with HVF and not actual distance. The authors then repeated the experiment but this time presented the objects on a visible surface (a glow-in-the-dark painted grid). People immediately stopped using HVF and instead accurately judged the distance using the position in the grid.

The key here is that HVF was only used when it was literally the only thing that was available. It was not combined with the surface information in the second experiment. This result directly contradicts claims that we integrate cues in order to perceive the world (the kind of thing you have to do when interacting with impoverished, probabilistic information). Instead, people flexibly switch along a hierarchy and simply use the information they have learned is the best (i.e. the information that produces the most stable and functional behaviour). Studying cues one at a time leads us to the mistaken belief that we have to integrate cues, rather than detect information, in order to perceive our environments.

An alternative: the perturbation experiment

Being able to detect something is one thing; using it to do something is another, and it's actually the latter we psychologists are interested in. What we really want to know is what information people use in the performance of a task to achieve success in that task. 

Tasks are best described dynamically. Armed with this description, you can identify all the kinematic patterns that the task can create; these are the potential sources of perceptual information about the task. These kinematic patterns are informative about the dynamics because they can be specific to those dynamics, and thus detecting the pattern is functionally equivalent to detecting the dynamic.

It is possible to break this link, however, using artificially created kinematic patterns (virtual reality). I can show you a dot on a screen that is moving according to the equation of a pendulum, and it will look like a pendulum swinging. The pattern you detect that makes it look that way was not, in this instance, created by an actual swinging pendulum. It was made by me and some software, but people respond to information, not the thing that created that information. Because there's no actual pendulum involved, I can now sneakily mess with the motion of the dot in ways I can't mess with a real pendulum. 
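To make that concrete, here is a minimal sketch (the names and numbers are illustrative, not the actual display code) of a dot driven by small-angle pendulum kinematics, plus a perturbed version whose extra wobble no real pendulum could produce:

```python
import math

def pendulum_x(t, length=1.0, amp=0.2, g=9.81):
    """Horizontal screen position of a dot moving like a small-angle pendulum."""
    omega = math.sqrt(g / length)        # natural frequency of the pendulum
    theta = amp * math.sin(omega * t)    # angular displacement (radians)
    return length * math.sin(theta)      # project the bob onto the x axis

def perturbed_x(t, jitter=0.05):
    """Same underlying motion plus a high-frequency wobble: a kinematic
    pattern a real pendulum could never generate (hypothetical example)."""
    return pendulum_x(t) + jitter * math.sin(7.3 * t)
```

Because the dot's motion is just numbers fed to a screen, the perturbation is one extra line of code; there is no physical pendulum to constrain what the display can do.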

This approach leads to the perturbation experiment. I'm interested in identifying which information variables created by a task dynamic people are using to perceive elements of that task. I can now preserve all the 'cues' but break the link between the variable of interest and the dynamic property of the task. All the information remains (it's a 'full cue' set-up) but that information is no longer informative about the property of interest.

An example: perturbing phase perception

Coordinated rhythmic movement is governed by the perception of relative phase; to maintain a specific coordination, you must perceive that coordination and detect errors in order to correct them. Relative phase is typically perceived via the detection of the relative direction of motion; this is the information for the task property 'coordination'. We know this because of a perturbation experiment I ran in my dissertation (Wilson & Bingham, 2008). 

I created displays of rhythmically moving dots that moved at some relative phase. I then systematically perturbed all the kinematic consequences of the task dynamic described by the Bingham model that are available to be used by a person to perceive relative phase. These were relative speed, relative frequency, relative position and relative direction. At all times, all these variables were present in the display; in a given display type, though, the variable of interest was perturbed so as to be uninformative about relative phase.
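As a toy illustration of one of these variables (not the actual stimulus code; the names and parameters are mine): two dots oscillating at a fixed relative phase, and a 'relative direction' measure computed from their trajectories. At 0° the dots always move in the same direction; at 180° they always move in opposite directions:

```python
import math

def trajectories(rel_phase_deg, freq=1.0, n=200, dt=0.01):
    """Positions of two dots oscillating at a fixed relative phase."""
    phi = math.radians(rel_phase_deg)
    w = 2 * math.pi * freq
    x1 = [math.sin(w * i * dt) for i in range(n)]
    x2 = [math.sin(w * i * dt + phi) for i in range(n)]
    return x1, x2

def same_direction_fraction(x1, x2):
    """Fraction of samples in which the dots move in the same direction:
    about 1.0 at 0 degrees relative phase, about 0.0 at 180 degrees."""
    v1 = [b - a for a, b in zip(x1, x1[1:])]   # frame-to-frame velocities
    v2 = [b - a for a, b in zip(x2, x2[1:])]
    same = sum(1 for a, b in zip(v1, v2) if a * b > 0)
    return same / len(v1)
```

A perturbation in this spirit would leave the dots' oscillations intact while decoupling one such measure (e.g. relative position) from the relative phase it normally tracks.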

The logic of this experiment is that if people are using a given variable to perceive relative phase, making that variable uninformative about relative phase will completely break their ability to perceive relative phase. If they aren't using it, there will be no effect. We found that at 0° and 180°, perturbations of relative speed, frequency and position had no effect on 7/10 people, leaving relative direction as the only remaining source of information. (Perturbing relative direction proved impossible without removing the coordination aspect of the task, another hint that relative direction was the critical information.) The remaining 3 people had their performance utterly disrupted by the position perturbation, suggesting they were using that variable as the information. We also tested observers trained at 90° and found that the position perturbation disrupted everybody. Relative direction is the typical information source, and this explains the basic pattern of movement stability, but relative position is a viable option: some people use it at 0° and 180°, and everyone uses it at 90° after training (this switch is what makes the training work).


Perception is for action; its job is to support the creation and execution of functional behaviour. The functionality of behaviour is defined by the task demands, and the task also defines the information available to support functional behaviour. Studying perception therefore means understanding tasks and behaviours, and assessing which task-created variables are used to support which task-required behaviours. None of this is possible in the 'isolated cue' design, and worse, that design leads us into making critical errors about how information is used. We do not combine cues: we learn to detect the best behaviour-relevant information variable. We rely on it when it is present, and we only switch when it's not available or when experience reveals it is not the best information. People can use 'sub-optimal' information variables; there is individual variation, but this variation is simply the symptom of an as-yet-incomplete search through the information space for the most stable variable specifying the required property.


Mon-Williams, M. & Bingham, G.P. (2008). Ontological issues in distance perception: Cue use under full cue conditions cannot be inferred from use under controlled conditions. Perception & Psychophysics, 70(3), 551-561.

Wilson, A. D., & Bingham, G. P. (2008). Identifying the information for the visual perception of relative phase. Perception & Psychophysics, 70(3), 465-476.


  1. I'm wondering how you consider the so-called light-from-above prior in shape-from-shading experiments. In my view, it's a bit like the HVF you mentioned: some kind of fall-back solution one uses when ambiguity in information is high.

    1. IIRC, Gibson talks about shading and how the light source can be specified which reduces ambiguity in shape. But yes, there are no 'cues' per se; there's information that varies in stability and people use the one they've learned works best out of the currently available options.

  2. Regarding the perturbation procedure you're mentioning, it seems to me that it's only able to tell you what cue is necessary, but it does not tell you whether it's sufficient.
    I have found this perturbation approach used in the Computer Vision literature, notably in Portilla and Simoncelli's model of visual texture.
    Having said that, I am not convinced that it makes sense to look for "necessary cues" in Human Vision: what is used might entirely depend on the task...

    What is your opinion about that?

    1. What counts as information does depend on the task, but a good task analysis reveals all the information options. In the coordinated rhythmic movement task, we know the only kinematics that can even in principle specify relative phase are relative frequency, relative speed, relative position and relative direction.

      "Necessary and sufficient" isn't quite the right framing. Relative direction is sufficient, but it's not necessary because relative position works too. Relative direction is probably more stable than relative position (at least at 0 and 180) but that's a separate issue.