I think the most common reason people abandon positive reinforcement and clicker-based training is that they do not understand how to shape behavior. Because shaping does not work in their practice, they revert to what does, which might be corrections, luring, or giving up. That makes perfect sense, even though shaping could take them where they want to go. They won’t understand that until they have succeeded with shaping for themselves.
Shaping is defined as the training of new behaviors through differential reinforcement, by methodically rewarding successive approximations toward the desired behavior. (The desired behavior is called the target behavior.) To shape a behavior, first define the end behavior you wish to teach, and then examine the dog’s existing behavior. What does he already do that is even somewhat similar to the final behavior?
A note on the terminology debate. The right term is shaping. Some people refer to the preceding definition as free shaping. The contrast they draw is that free shaping means you do not begin with any previously learned or lured behavior but shape the whole thing, while shaping means you begin with some previously learned behavior, a lure, a prompt, or some other way of starting the behavior. In fact, shaping is the term for the process I defined above. What is often referred to as free shaping is just shaping. Shaping that begins with previously learned, lured, or targeted behavior is just shaping with something else going on in front of it. There is just one term to memorize: shaping.
Suppose you want to teach your dog to roll over, but he seldom does it on his own, so capturing the behavior would take days of observation. I recently did this with Pan, my Chinese Crested mix, so I’ll use that as an example. I decided that the target behavior was for Pan to swiftly lie down on his side on the floor, roll over onto his back, finish rolling onto the other side, and then get up onto his feet.
However, Pan did not do this on a regular basis. So I observed him for a bit and noticed that he often lies down on his chest. So that’s what I began with. Pan received a click and a treat as soon as he lay down on his chest. He did it again, and I clicked that a few more times. But what now? Lying on his chest is a successive approximation. It is a behavior that can get me a little closer to my goal of rolling all the way over.
After I had clicked that a few times, Pan dropped onto his chest and stared at me. I did nothing. I was waiting for him to offer me a little more. What I was hoping for at this point was a bit of an extinction burst from him. An extinction burst occurs when reinforcement that the learner has come to expect does not arrive. We use this to our advantage in shaping, since the animal’s behavior becomes more varied and intense as he tries to show you, “SEE? Look at what I’m doing! Don’t you want to click?” Usually, something in that varied behavior can be clicked to get you closer to the target behavior. Once you click it, you stop clicking the behavior that was previously earning clicks and allow the old behavior to fade away.
Pan lay on his chest and stared at me for a few seconds. He got up and lay straight back down, clearly thinking I hadn’t noticed that he’d done this lovely thing I had liked so much before. That did not work. So he wiggled a little to the side. CLICK! That worked. That was a new successive approximation. I let him enjoy that success only a few times before withholding the click again. He rolled over onto his side. This was a new approximation. As each new approximation took hold, the preceding one stopped earning clicks and treats. The prior approximation was put on extinction. The new approximation was reinforced with clicks and treats. This is known as differential reinforcement.
At this point Pan flopped over onto his side and lay there. And as soon as he was doing it easily, it no longer earned a click… But he knew that trying new things was the way to treat-land, so he looked back over his shoulder… CLICK!
To be entirely honest, I was expecting him to roll onto his back at this point. This is part of the shaping process. You must be willing to accept whatever he offers, which takes practice. It is simple to practice. Just do it. Remember that your behavior is being shaped as he learns. You will learn what to look for and what to click, and you will become a better clicker trainer. It’s okay if you miss that small movement of the head over the shoulder. Catch it next time. It’s not a big deal if you mistakenly click the wrong thing. Give the treat and continue training. You’ll give him a LOT more opportunities to get it right.
If Pan had stopped succeeding and become stuck, I would simply have gone back to the last approximation with which he had been successful and trained from there.
It wasn’t long until he was sliding onto his back, and once there, he quickly… and possibly by accident… turned onto his other side. Click and treat! I had to be prepared to accept a larger approximation as well as to look for small approximations that could be built into the target behavior.
So now I had a dog that nearly rolled over. And here is where many trainers run into difficulty. Should I continue training since things are going well, or should I stop? I decided to quit. I had other commitments, and he had been quite successful. So I ended the session.
Aside from delivering reinforcers, I had not touched Pan at any point during that session.
Is it always necessary to quit on a success like that? It can’t hurt, right? I’m not sure the data is strong enough to conclude, “Yes, you must always end on a high note!” However, I believe it makes the trainer more eager to get started the following day, and it may have the same effect on La Pooch. As a result, I like to stop my sessions while everyone is happy.
So we picked up where we left off the following day, but Pan didn’t immediately start rolling nearly all the way over. So be it. No big deal. I backed up and clicked flopping onto his side. Then I clicked tipping onto his back. Then I clicked rolling all the way over. It didn’t take him long to come back up to speed. We were back where we had left off the day before in around 5-6 clicks.
So, what now? He rolled over to the other side and lay there, staring at me. He jerked his head. He wiggled. Finally, he raised his head… CLICK! Within only a few clicks he was standing up on his feet at the end of the roll over… that was my target behavior! Yeah!!! I gave him a massive jackpot. (Do jackpots truly work better than regular-sized treats in strengthening behavior? I’m not sure. They make me happy. Because I am a member of the training team, my view of the process is also important.)
Stimulus Control: Getting It Right on Time
However, the training was not finished. I still had to add the cue. I didn’t want Pan to roll over at random. I wanted him to roll over only when I said “roll over.” So, just as Pan was about to finish rolling over, I said, “Roll over,” and followed up with the click and treat for the completed behavior. Next I gave the cue when he was halfway through the roll over. Gradually, I said the cue earlier and earlier until I could say it before he started the behavior, and he dropped and rolled. Yes. Now I had a dog that would roll over whenever I said the cue.
Or did I? In front of the fireplace, I had a dog that would roll over when I said “roll over!” When I told him to roll over in front of the TV in the family room, he stood on his hind legs and danced, sat, lay down on his chest, and spun around in a circle. He tried everything except rolling over.
I was not surprised! Dogs can be very context-specific at times. As far as he was concerned, the entire “roll over” idea was tied to the fireplace area in the library. I had been training there because the lighting was ideal for videotaping. So my poochikins had to learn that “roll over” called for the same behavior in the family room as it did in the library. This required backing up and starting over once more. Of course, just as it had after I stopped training for the night, this retraining went quickly.
It will most likely take numerous short shaping sessions in various parts of the house before Pan learns that the words “roll over” always call for the same behavior regardless of where we are. We will say his behavior has generalized when he responds to the cue “roll over” with the target behavior I taught him in new settings where it has not been specifically trained.
Shaping is at the core of effective clicker training, and of clicker-free positive reinforcement training as well. It requires practice, but practice is inexpensive.