Let's Take A Closer Look

Explaining complicated subject matter simply since 1986

Edward Thorndike is often considered the father of modern Educational Psychology. His Law of Effect posits that when our behavior produces a favorable outcome and the same stimulus is presented again, we are more likely to repeat that behavior because we have learned to expect the same positive result. The other side of the coin is that when our actions lead to unsatisfactory outcomes, we are less likely to repeat the same behavior in the future. Both sides of the Law of Effect work with pets, kids, employees, and students, so most readers can benefit from understanding this phenomenon, how it works, and how to employ it to get the behavioral outcomes they want.

Thorndike worked out his theories with his Puzzle Box

He would place a hungry cat inside a cage with food outside, clearly visible but out of the cat’s reach. Hungry, not caring for confinement, and naturally curious, the cat would search for a way to get out of the box and get at the food. As it moved around inside the box, sooner or later it would accidentally step on the treadle that triggered the release mechanism, opening the cage and allowing it to go straight for the food. Thorndike would then put the cat back in the box and repeat the experiment. After several rewarding “accidents,” the cat would learn to trigger the release within seconds of being placed in the box because it had learned to connect stepping on the treadle with release from the cage and a food reward.

Thorndike’s conclusion

When responses are immediately followed by satisfaction, those responses become more likely to occur, and when responses are followed by dissatisfaction, they become less likely to occur. He described behaviors that produced favorable consequences as “stamped in” and those that produced unfavorable consequences as “stamped out.”

Thorndike’s work was the inspiration for another psychologist, Burrhus Frederic Skinner

B. F. Skinner is considered the father of operant conditioning, a method he devised based on the work of Thorndike, Pavlov, and others. Operant conditioning is a way of learning that changes behavior by applying consequences: behaviors that are rewarded will likely be repeated, and behaviors that are punished will likely occur less often. Sound familiar?

In operant conditioning, positive and negative do not mean good or bad

Positive reinforcement increases the probability of a desired behavior by adding a pleasant stimulus. Negative reinforcement achieves the same effect by removing an unpleasant one.

It gets a little tricky here because reinforcement and punishment can each be positive or negative, a situation best explained by a simple chart:

  • Positive reinforcement: a pleasant stimulus is added, so the behavior becomes more likely.
  • Negative reinforcement: an unpleasant stimulus is removed, so the behavior becomes more likely.
  • Positive punishment: an unpleasant stimulus is added, so the behavior becomes less likely.
  • Negative punishment: a pleasant stimulus is removed, so the behavior becomes less likely.

There you go: Behavior that is followed by a pleasant consequence is likely to be repeated, and behavior followed by an unpleasant consequence is less likely to be repeated. Keep this in mind when dealing with pets, kids, employees, and students.

From his experiments, Skinner concluded there are three types of responses that follow behaviors:

  • Reinforcers. These responses from the environment increase the likelihood of a behavior being repeated. They can be either positive or negative.
  • Punishers. These negative operants decrease the likelihood of a behavior being repeated.
  • Neutral responses. These focus attention but neither increase nor decrease the probability of a behavior being repeated.

The Famous Skinner Box

The Skinner Box was somewhat different from Thorndike’s Puzzle Box, but both were devices used in operant conditioning training. Skinner experimented with cats, rats, ferrets, dogs, pigs, and primates, but his favorite lab animals were pigeons because they lived much longer.

Skinner reinforced the behavior of his pigeons by teaching them that every time they pecked a button, a food pellet would drop into a tray

The consequence of receiving food every time they pecked the button ensured the pigeons would repeat the action again and again until they were no longer hungry. Skinner first rewarded them with a food pellet when all they did was turn in the direction of the button. As soon as the pigeons learned this, Skinner withheld the food until they got close to the button, then gave them another pellet. Once the pigeons figured that out, Skinner only rewarded them when they raised their head to the height of the button. The last step was giving food to the pigeons only when they pecked the button. This process of breaking a new behavior into a series of small steps is called shaping: the subject is gradually guided, step by step, toward the desired outcome.
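For readers who like to see the logic written out, here is a minimal sketch in Python of shaping as successive approximation. It is not Skinner’s actual protocol; the step names, the observe and reward callbacks, and the “three successes in a row” rule are illustrative assumptions.

```python
# Shaping as successive approximation: reward progressively stricter criteria.
# The steps and the "3 in a row" advancement rule are illustrative assumptions.
STEPS = [
    "turns toward the button",
    "moves close to the button",
    "raises head to button height",
    "pecks the button",
]

def shape(observe, reward, successes_needed=3):
    """Work through the steps, rewarding only the current criterion.

    observe() should return a description of what the animal just did;
    reward() delivers the food pellet.
    """
    for step in STEPS:
        streak = 0
        while streak < successes_needed:
            behavior = observe()       # what did the pigeon just do?
            if behavior == step:       # it meets the current criterion
                reward()               # drop a food pellet
                streak += 1
            else:
                streak = 0             # criterion missed; withhold the food
    return "target behavior established"
```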

Skinner used lab rats to study negative reinforcement

He shocked the rats with a constant electric current running through the metal floor of the box, sending them scurrying in an effort to avoid being zapped. The current shut off only when a rat pressed the lever, and the rats soon learned to go straight to it. In this experiment, he demonstrated the power of negative reinforcement: removing a painful stimulus strengthens the behavior that removes it.
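Here is a rough sketch of that contingency in Python. It is not a model of the actual apparatus; the response names, starting strengths, and learning increment are illustrative assumptions. The point is simply that a response which removes the aversive stimulus is strengthened, so it wins out over time.

```python
import random

# Negative reinforcement as escape learning: a response that removes an
# aversive stimulus becomes more likely on later trials.
strength = {"press_lever": 1.0, "scurry": 1.0}   # initial response tendencies

def choose_response():
    # Pick a response with probability proportional to its current strength.
    return random.choices(list(strength), weights=list(strength.values()))[0]

for trial in range(20):
    shock_on = True
    while shock_on:
        response = choose_response()
        if response == "press_lever":
            shock_on = False                 # the lever press removes the shock...
            strength["press_lever"] += 0.5   # ...so that response is strengthened

print(strength)   # "press_lever" ends up far stronger than "scurry"
```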

Schedules of reinforcement

These are the procedures that determine when and how to reinforce behavior. There are several schedules of reinforcement, each having a distinctive effect on the speed of learning; a short sketch after the list shows the rule behind each one.

  • Fixed ratio reinforcement. Rewards depend on the number of times a desired behavior occurs. Anne H, my one-time research assistant, was keying in data by hand. It was tiring, repetitive work. She put a small dish of M&Ms on her desk and rewarded herself with one every time she finished entering the data for 10 records. She was not a psychologist, but she knew that this system gave her something positive to look forward to. It also broke one very large task (entering data from 500 interviews) into 50 small tasks of 10 each. She treated every set of 10 completed surveys as a milepost by which she could judge her progress.
  • Variable ratio reinforcement. Unlike the predictability of fixed ratio reinforcement, variable ratio rewards have no rhythm and can come at any time. This is the way casinos program slot machines. Variable ratio reinforcement creates a steady and high rate of response because gamblers “know” a huge reward could come with the very next spin.
  • Fixed interval reinforcement. Hourly and salaried workers are paid on regular schedules so they know their paycheck arrives every week, every two weeks, or monthly.
  • Variable interval reinforcement. Pop quizzes, DUI checkpoints, and radar speed traps are things we will surely encounter, but we never know when.
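To make the four rules concrete, here is a minimal Python sketch of when each schedule delivers a reward. The specific numbers (10 responses, 14 days, and so on) are illustrative assumptions, not figures from Skinner’s experiments.

```python
import random

def fixed_ratio(responses_since_last_reward, n=10):
    # Reinforce every n-th response, like one M&M per 10 completed records.
    return responses_since_last_reward >= n

def variable_ratio(mean_n=10):
    # Reinforce each response with probability 1/mean_n, like a slot machine:
    # roughly every 10th response on average, but the next win could come now.
    return random.random() < 1 / mean_n

def fixed_interval(days_since_last_reward, interval=14):
    # Reinforce the first response after a fixed wait, like a biweekly paycheck.
    return days_since_last_reward >= interval

def variable_interval(days_since_last_reward, mean_interval=7):
    # Reinforce the first response after an unpredictable wait (re-drawn each
    # time), like a pop quiz you know is coming but never know when.
    return days_since_last_reward >= random.expovariate(1 / mean_interval)
```

In all four cases the reward follows the behavior; what differs is the counting rule (ratio versus interval) and its predictability (fixed versus variable).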

Punishment

The operant conditioning definition of punishment is “any change to the surrounding environment that reduces the probability of the behavior happening again.” Punishment efficiently decreases undesirable behaviors, but it comes with real risks: it is more effective at reducing bad behavior than at encouraging good behavior, and it can easily become abusive, so caution must be taken.

Two of B. F. Skinner’s 21 books drew fire from critics

In 1948, he wrote Walden Two, a fictional account of a utopian society where people were led to be good citizens through a program of rewards and punishments. This frightened some people who saw it as an example of totalitarianism.

In 1971, he wrote Beyond Freedom and Dignity. Critics were angered because some of his views seemed to imply humans had no free will or individual consciousness. Skinner tried to set the record straight regarding any misinterpretations of his works with his 1974 book, About Behaviorism.

How can I use operant conditioning at home and work?

Bonus

Established in 1988, the B. F. Skinner Foundation promotes the science founded by Skinner and supports the practices derived from that science. The Foundation advances the understanding that a more humane world can be achieved by replacing coercive techniques with positive procedures.
