A whole range of rules can govern the contingency between responses
and reinforcement - these different types of rules are referred to
as schedules of reinforcement. Most of these schedules can be divided into those in which the contingency depends on the number of responses and those in which it depends on the passage of time.
Ratio Schedules.
If the contingency between responses and reinforcement depends on the
number of responses, the schedule is called a ratio schedule. The “classic” schedule, where one reinforcer is delivered for each response, is called a continuous reinforcement schedule. The continuous reinforcement schedule has a ratio of 1. A schedule where two responses must be made for each reinforcer has a ratio of 2, and so on.
A distinction is also made between schedules where exactly the same number of responses must be made for each reinforcer (fixed-ratio schedules), and those where the number of responses required varies around an average (variable-ratio schedules). Two examples:
1. A schedule where exactly 20 responses are required for each reinforcer is called a fixed-ratio 20 or FR20 schedule.
2. A schedule where on average 30 responses are required for each
reinforcer is called a variable-ratio 30 or VR30 schedule.
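To make the difference concrete, here is a minimal Python sketch (the function names and session length are mine, purely illustrative; drawing each VR requirement uniformly around the target mean is just one simple modeling choice):

```python
import random

def run_fixed_ratio(n, responses):
    """FR n: a reinforcer is delivered after every nth response."""
    return responses // n  # FR20 pays exactly once per 20 responses

def run_variable_ratio(mean, responses):
    """VR mean: the requirement changes after every reinforcer; here
    each one is drawn uniformly from 1..(2*mean - 1), which averages
    out to `mean` over the long run."""
    reinforcers, count = 0, 0
    required = random.randint(1, 2 * mean - 1)
    for _ in range(responses):
        count += 1
        if count >= required:        # requirement met: deliver reinforcer
            reinforcers += 1
            count = 0
            required = random.randint(1, 2 * mean - 1)  # draw a new requirement
    return reinforcers

print(run_fixed_ratio(20, 1000))     # FR20 over 1000 responses -> exactly 50
print(run_variable_ratio(30, 1000))  # VR30 -> about 33, varies from run to run
```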
Interval Schedules.
If the contingency between responses and reinforcement depends on time,
the schedule is called an interval schedule. Simply stated, the first response made after a specific period of time has elapsed is reinforced.
A distinction is also made between schedules where exactly the same
amount of time must pass before the response produces reinforcement
(fixed-interval schedules), and those where the amount of time that must pass before the next response produces reinforcement varies around an average (variable-interval schedules). Two examples:
1. A schedule where the first response an animal makes after a light has been on for 20 seconds is reinforced is called a fixed-interval 20 or FI20 schedule. This means any response made during the first 20 seconds is ignored.
2. A schedule where the first response an animal makes after a light has been on for an average of 25 seconds is reinforced is called a variable-interval 25 or VI25 schedule.
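Here is the same kind of sketch for interval schedules (again, names and numbers are mine): responses made before the interval has elapsed earn nothing, and the first response afterwards collects the reinforcer and restarts the clock.

```python
import random

def simulate_interval(intervals, response_times):
    """Count reinforcers on an interval schedule: the first response
    at or after each required interval is reinforced and restarts the
    timer; earlier responses are simply ignored."""
    reinforcers, idx = 0, 0
    available_at = intervals[0]
    for t in sorted(response_times):
        if t >= available_at:          # reinforcer available: deliver it
            reinforcers += 1
            idx += 1
            if idx == len(intervals):
                break
            available_at = t + intervals[idx]  # restart the clock
    return reinforcers

responses = [i * 2.0 for i in range(1, 300)]       # one response every 2 s

fi20 = [20.0] * 10                                 # FI20: always exactly 20 s
vi25 = [random.uniform(5, 45) for _ in range(10)]  # VI25: 25 s on average

print(simulate_interval(fi20, responses))  # 10, collected like clockwork
print(simulate_interval(vi25, responses))  # also 10; what differs is *when*
```

With responding this steady, both schedules pay out the same number of reinforcers; what differs is when reinforcement becomes available, and that timing is what shapes the characteristic response patterns.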
Each of these basic schedules of reinforcement produces its own characteristic, predictable pattern and rate of responding.
Understanding how these schedules work can be very helpful in making
decisions. For example, quitting something “cold turkey” (that is, extinction) can sometimes be the best route, because gradually fading the reinforcement tends to keep the behavior around: we persist when we know a behavior has paid off before. As we know, this property is beneficial for increasing wanted behaviors - it is how we shape new behaviors and eventually fade the continuous reinforcement. But other times, it maintains unwanted behaviors - even contributing to addiction.
As Schneider (2012) pointed out, “Consequences come on a schedule, not whenever we want” (p. 77). And real life tends to present more interval schedules than ratio schedules.
A limited hold can be added to an interval schedule. This means that once enough time has passed for the reinforcer to become available, there is only a finite window in which a response will be reinforced. Essentially, it is a “time limit”. You can imagine that this helps “speed up” a behavior.
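A minimal sketch of the idea on top of a fixed-interval schedule (the function name and numbers are hypothetical):

```python
def reinforced(elapsed, interval, hold):
    """FI with a limited hold: the reinforcer becomes available once
    `interval` seconds have passed, but stays available for only
    `hold` more seconds - respond inside that window or miss out."""
    return interval <= elapsed <= interval + hold

print(reinforced(21, 20, 3))  # True: response lands in the 20-23 s window
print(reinforced(30, 20, 3))  # False: too slow, the hold has lapsed
```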
There are many variations of the intermittent schedules defined above. Differential reinforcement of rates of responding addresses the rate at which people perform certain behaviors. Sometimes we act too slowly; other times we act too hastily. Some of you may answer questions hastily on an exam, making careless mistakes. Other people may exercise, but not at a rate conducive to weight loss. Other times, we need to build “perseverance” - and we can use a progressive schedule for this. Basically, the reinforcement requirement is thinned independent of the learner's performance. This works as long as the requirement does not increase in difficulty too quickly.
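A minimal sketch of a progressive-ratio version of this idea (the start value and step size are hypothetical):

```python
def progressive_ratio(start, step, responses):
    """Progressive-ratio sketch: the requirement starts at `start`
    responses per reinforcer and grows by `step` after each one,
    independent of how the learner is doing."""
    requirement, count, reinforcers = start, 0, 0
    for _ in range(responses):
        count += 1
        if count >= requirement:
            reinforcers += 1
            count = 0
            requirement += step  # thinning: each reinforcer costs more
    return reinforcers

# Requirements run 2, 4, 6, ... - so 100 responses buy only 9 reinforcers,
# versus 50 on a flat FR2. Raise `step` too fast and responding may collapse.
print(progressive_ratio(2, 2, 100))  # -> 9
```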
Compound Schedules of Reinforcement
Like simple building blocks, these basic schedules combine to form compound schedules of reinforcement. We are under many of these schedules, often several at once, most of the time.
Concurrent Schedules.
This is where reinforcement occurs when (a) two or more contingencies
of reinforcement (b) operate independently and simultaneously (c) for
two or more behaviors. Simply demonstrated in the natural environment
- a choice. We are on a concurrent schedule to do right or wrong. We
are on a concurrent schedule to check Facebook or write our discussion
post.
Multiple Schedules.
This is where two or more schedules alternate, usually in a random sequence, and each is signaled by its own discriminative stimulus. Simply
demonstrated in the natural environment - we tend to respond in the presence
of certain stimuli, but not others. Consider swearing. When we are visiting
grandmother, we may refrain from swearing because it is typically not
reinforced in this situation. However, when we are out with friends,
we may find our appropriate language deteriorates.
Chained Schedules.
Much like the multiple schedule, the chained schedule is a sequence that relies on discriminative stimuli; however, the order is not random
- it is always the same. When you learn about behavior chains in the
upcoming course (you may already have a working knowledge of chaining),
it will come full circle. But a common example is tying your shoes.
Matching Law.
Let us go back to concurrent schedules, since there is an infinite number
of these schedules available at any given time in the environment. If
you recall, we mentioned that concurrent schedules could be equated to
“choice”. By paralleling “choice” with “preference”, we take a subjective
construct (preference) and allow it to be represented by an observable
behavior (choice).
How do we do this? By calculating the responses related to the
available choices.
“By simply recording how a client distributes their responses, we
can identify preference” (Reed & Kaplan, 2011, p. 15).
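Herrnstein's finding, discussed next, can be stated compactly. In the simplest form of the matching law (a sketch using conventional symbols, which may differ from the article's notation: $B_1$ and $B_2$ are the response rates on two concurrent options, $R_1$ and $R_2$ the reinforcement rates they produce), the share of behavior allocated to an option matches the share of reinforcement it delivers:

$$\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}$$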
As you have read in Reed and Kaplan (2011), Herrnstein conceptualized the matching law after discovering a near-perfect correlation between relative rates of reinforcement and relative rates of responding. You can see that on page 16 of the article. Of course, the natural environment will
rarely (if ever) offer exactly two concurrent schedules with equal
quality. So, in application, the matching law had to expand. Neef,
Mace, Shea, and Shade (1992) discovered that preference is affected
by more than rate of reinforcement alone. Turns out, there are
several dimensions of a reinforcer: rate, quality, delay, and
effort. Applied in the natural environment, Reed and Kaplan
summarized that practitioners should:
“…make it less effortful for the learner to obtain high rates of
immediately available, high-quality rewards for the desired
behavior, relative to those associated with undesirable behavior”
(p. 18).
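To see the arithmetic behind “relative to”, here is a toy calculation assuming strict matching on reinforcement rate alone (the function name and the numbers are mine, purely illustrative):

```python
def matching_share(r1, r2):
    """Predicted share of responses allocated to option 1 under
    strict (proportional) matching on reinforcement rate alone."""
    return r1 / (r1 + r2)

# Hypothetical rates: the desired behavior pays off 75 times per hour,
# the problem behavior 25 times per hour.
print(matching_share(75, 25))  # -> 0.75: ~75% of responding goes to option 1
```

On this simple model, enriching one option's side of the equation shifts responding toward it without touching the other option at all - which is the intuition behind the recommendation above.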
Understanding the context of contingencies ruled by the matching law
allows for differential reinforcement procedures to be effective -
without ever putting the inappropriate behavior on extinction!
We can see how the matching law, while considering rate, quality,
delay, and effort, affects our everyday lives. Consider companies
that have nailed it - such as Amazon. You may have a big wad of
generalized conditioned reinforcers in your pocket, ready to buy
something advertised online that you have been wanting. Are you
going to go to that company's website, create a username and password,
enter your credit card number, and so forth? Or will you check
quickly to see if Amazon has the same item? Because with Amazon, you
are reinforced immediately (order placed!), the same item is
available, the delay is minimal (especially if you are Prime), and
the effort is as easy (on some occasions) as “one-click”. Sometimes,
we are even willing to pay a bit more for the decreased effort.
Brilliant.