Don’t Shoot The Dog by Karen Pryor

Reinforcement is an event that (a) occurs during or upon completion of a behavior; and (b) increases the likelihood of that behavior occurring in the future. The two events are connected in real time - the behavior engenders the reinforcement - and then the behavior occurs more frequently.

The reinforcer has to occur the very instant the behavior is taking place. In the instant, in real time, you, the learner, need to know that what you’re doing right now has won you a prize.

If you make a chart to keep track of your progress in some self-training program, you will be more likely to maintain new habits if you solidly fill in a little square every day on the chart, rather than just putting a check mark in the square.

Schopenhauer once said that every original idea is first ridiculed, then vigorously attacked, and finally taken for granted.

A reinforcer is anything that, occurring in conjunction with an act, tends to increase the probability that the act will occur again.

A positive reinforcer is something the subject wants, such as food, petting, or praise. A negative reinforcer is something the subject wants to avoid - a blow, a frown, an unpleasant sound.

Simply offering positive reinforcement for a behavior is the most rudimentary part of reinforcement training.

In order to be reinforcing, the item chosen must be something the subject wants.

In our culture a man who has become observant about positive reinforcement has a great advantage over other men.

While negative reinforcement is a useful process, it’s important to remember that each instance of negative reinforcement also contains a punisher.

A reinforcer must occur in conjunction with the act it is meant to modify.

There is a difference between trying to do something and doing it. Wails of “I can’t” may sometimes be a fact, but they may also be symptoms of being reinforced too often merely for trying. In general, giving gifts, promises, compliments, or whatever for behavior that hasn’t occurred yet does not reinforce that behavior in the slightest. What it does reinforce is whatever was occurring at the time: soliciting reinforcement, most likely.

If the negative reinforcer doesn’t cease the instant the desired result is achieved, it is neither reinforcing nor information. It becomes, both literally and in terms of information theory, “noise”.

A conditioned reinforcer is some initially meaningless signal - a sound, a light, a motion - that is deliberately presented before or during the delivery of a reinforcer.

Similarly one can and should lavish children (and spouses, parents, lovers, and friends) with love and attention, unrelated to any particular behavior; but one should reserve praise, specifically, as a conditioned reinforcer related to something real.

The trick to making “No!” effective is to establish it as a conditioned negative reinforcer.

In order to maintain an already-learned behavior with some degree of reliability, it is not only not necessary to reinforce it every time; it is vital that you do not reinforce it on a regular basis but instead switch to using reinforcement only occasionally, and on a random or unpredictable basis.

Raise criteria in increments small enough that the subject always has a realistic chance of reinforcement.

How fast you can raise the criteria is a function of how well you are communicating through your shaping procedure what your rules are for gaining reinforcement.

Every time you raise a criterion, you are changing the rules. The subject has to be given the opportunity to discover that though the rules have changed, reinforcers can easily continue to be earned by an increase in exertion. This can be learned only by experiencing reinforcement at the new level.

Train one aspect of any particular behavior at a time; don’t try to shape for two criteria simultaneously.

If the task can be broken down into separate components, which are then shaped separately, the learning will go much faster.

Often when we seem to show no progress in a skill, no matter how much we practice, it is because we are trying to improve two or more things at once. Practice is not shaping. Repetition, by itself, may ingrain mistakes just as easily as improvements. One needs to think: Does this behavior have more than one attribute? Is there some way to break it down and work on different criteria separately? When you address both of these questions, many problems solve themselves.

During shaping, put the current level of response onto a variable schedule of reinforcement before adding or raising the criteria.

A variable schedule of reinforcement simply means that sometimes you reinforce a behavior and sometimes you don’t.

With positive reinforcement, on the other hand, not only is it not necessary to reinforce every correct response for a lifetime but it is crucial to the learning process to skip an occasional reinforcer.

The heart of the shaping procedure consists of selectively reinforcing some responses rather than others, so that the response improves, little by little, until it reaches a new goal. All behavior is variable; when you skip an expected reinforcer, the next behavior is likely to be somewhat different. Thus the skipped reinforcer enables you to select stronger or better responses.

Learning to tolerate an intermittent schedule makes the behavior - and other subsequent behaviors - more resistant to extinction.

When introducing a new criterion or aspect of the behavioral skill, temporarily relax the old ones.

What is once learned is not forgotten, but under the pressure of assimilating new skill levels, old well-learned behavior sometimes falls apart temporarily. Getting used to new requirements temporarily interferes with previously learned behavior.

Stay ahead of your subject.

Don’t change trainers in midstream.

If one shaping procedure is not eliciting progress, try another.

It’s important to understand the principles and not just learn recipes. Everyone has a “method.” The principles govern what truly works.

Don’t interrupt a training session gratuitously; that constitutes a punishment.

If a learned behavior deteriorates, review the shaping.

Quit while you’re ahead.

The Clever Hans phenomenon has now become the name for any circumstance in which apparently amazing behavior, ranging from animal intelligence to psychic phenomena, is actually unconsciously cued by some often-minute or faded behavior of the experimenter that has become a discriminative stimulus for the subject’s behavior.

There are two training problems in this game: The first is that the distance the dog goes after the Frisbee must be shaped. The second is that the game is a behavior chain: First the dog chases the Frisbee, then the dog catches the Frisbee, then the dog brings the Frisbee back for another throw. So each behavior must be trained separately, and the last behavior in the chain, retrieving, must be trained first.

You can teach retrieving over very short distances - indoors, even - with something easy to hold: an old sock, maybe. Hunting dogs almost do it spontaneously. Other breeds, such as bulldogs and boxers, may have to be carefully shaped to drop or give back the item, since they tend to prefer playing tug-of-war.

When the dog will carry things to you on cue and give them up, you shape catching the Frisbee. First you get the dog all excited about the Frisbee, waving it around his face. You let him take it and have him give it back a few times, praising him madly for returning it, of course. Then you hold it in the air, let him have it when he leaps for it, and make him give it back. Then you toss it momentarily into the air and make a big fuss when he catches it. When he has the idea, you can start shaping the first behavior of the chain, the chase, by tossing the Frisbee up and out from you a few feet so the dog has to move off after it, to catch it. And now you are on your way to having a great Frisbee dog.

Humans learn to use punishment primarily to gain for ourselves the reward of being dominant.