A famous dog trainer once said, “The only dog that doesn’t behave is a dead dog.” That’s not a threat. It’s a simple statement of fact. Our dogs are always making choices and acting on them – whether or not we choose to acknowledge that. Even when our dogs are not doing anything, perhaps waiting in a “Stay” or a “Down”, they are still actively refraining from other behaviours. They could be doing something else, but they choose not to. You might wonder how our dogs choose those behaviours.
Behavioural science defines two general categories of behaviour. Respondent behaviours develop because of an association with a particular situation or stimulus. The fact that our dogs know to “Sit” or “Come” in response to those specific words is an example of Respondent behaviour. Operant behaviours develop because of “cause and effect.” Literally, the dog learns to repeat or not repeat a behaviour based on what response that behaviour produces. We teach our puppies to sit by giving them a food treat when they put their butt on the floor. Since the food is something the puppy wants, it will make them more likely to sit again in hopes of getting another food treat.
Respondent or Classical Conditioning is great for putting names to behaviours with our dogs. When we want our dog to lie down or come to us, we need some kind of unique signal that we want a specific behaviour. When we’ve said “Sit!” enough times while our dog sits, an association forms and we have a cue or command.
But in order to have a behaviour to put a name to, we have to teach the dog what “sitting” would look like. Operant Conditioning works great for showing a dog that when they do a particular thing, they will either be rewarded (encouraged) or punished (discouraged) as a result. Rewarding the behaviours we want can help to teach a dog remarkably quickly.
Not all rewards are the same
To encourage a behaviour, we want to provide rewards. But what exactly is a “reward” to our dogs? We might think we know but it’s really up to my dog. Does she like the reward I’m offering? Do I need to reward her every single time she performs the behaviour? Once she knows what the behaviour is, can I stop rewarding? Can I use different rewards and get the same results? Does it matter how quickly I provide the rewards? There seem to be many questions about rewards.
Since rewards can have different effects on behaviour, science has created some detailed terminology to help us understand the mechanics of providing rewards. While we don’t need to get into all of the technical details, it is useful to understand some basics of reinforcement mechanics when working with our dogs.
Reinforcement Rate schedules are concerned with when we reward our dog. Do we reward them every time they give us the desired behaviour correctly or do we skip some behaviours before rewarding again?
Differential schedules are concerned with why we reward our dog. Did they do the behaviour as required to earn the reward? Did they do it fast enough? Did they do it long enough? We reward or not based on the criteria we set for a particular behaviour.
Using that process of “did they do it right” (Differential) and “do I reward it this time” (Reinforcement Rate) we can encourage or train a behaviour. How well that behaviour is learned or how quickly will depend on how we apply those two factors.
Slot machines and soda pop machines
People seem to want to get away from consistently rewarding their dog. It is a common debate in dog training whether it is more effective to provide a Continuous rate of reinforcement (rewarding the behaviour every time it is performed) or an Intermittent rate of reinforcement (rewarding the behaviour only some of the times it is performed). One analogy dog trainers use to describe this compares a slot machine (which pays off only at random intervals) with a soda pop machine (which delivers every time money is deposited). Some trainers suggest that Intermittent schedules will create more durable or persistent behaviours, in the way that slot machines can be addictive. But that’s not necessarily true.
Although results from psychology labs have shown Intermittent reinforcement to produce faster and more persistent behaviour with some pigeons and rats in various studies, these results don’t necessarily translate to our work with dogs. Often the animals used in these studies have strictly controlled diets which would make them much more committed to getting food rewards than the average dog. What the studies don’t consider when looking at Intermittent versus Continuous reinforcement is the dependability of the person delivering the rewards.
Do we consider whether humans “trust” soda pop machines more than they do slot machines? The fact is not everyone will use a slot machine. I don’t play slot machines because I don’t find the potential rewards worth taking the risk of spending my money. I am certain that there are dogs out there making the same choice. If your Intermittent reinforcement is too intermittent, it becomes too risky and your dog may stop wanting to respond.
Is it really “Intermittent”?
By definition, “Intermittent” reinforcement means that sometimes we will provide no reward at all in response to our dog giving us the right behaviour. That would mean no praise, no petting, no acknowledgement for having done the behaviour. But is this really what dog owners do? I don’t think so.
World-renowned animal trainer Bob Bailey is fond of saying “Pavlov is always on our shoulder” when talking about animal training. And this is a case where Respondent or Classical Conditioning (pioneered by Pavlov) is working in the background whether we realize it or not. Our dogs are always forming associations, both favourable and unfavourable, to various things in their environment.
One example of this is the Premack Principle. In 1959, psychologist David Premack showed that performing a more familiar (and more often rewarded) behaviour can act as a reinforcer for a less familiar behaviour. The opportunity to perform the familiar behaviour had some value to the subjects in the experiments.
The reinforcing properties of those familiar behaviours came about because of the associations the subjects had made with being rewarded for the behaviours in the past. The same kinds of associations can be created with things like saying “Good Dog!” or throwing a toy for our dog or physical affection. If we pair those things with regular rewards (food), they can take on rewarding values of their own.
If Rate schedules are when we reward and Differential schedules are why we reward, then the relative values of the rewards we use affect how much we reinforce with each reward we deliver. This is where things can get tricky. Not every dog values the same things in the same way. Some dogs may prefer the taste of chicken to the taste of liver, while others prefer the reverse. Similarly, some dogs may really enjoy chasing a ball where others just don’t care.
To complicate matters, the number of times we use neutral things like saying “Good Dog” or petting our dog in conjunction with food can cause these actions to take on value themselves. How much value they take on can only be determined by watching your dog’s reaction to them. Generally, the more you pair the neutral thing with a high value reward, the greater value that previously neutral thing will take on.
The reason “Good Dog!” works so well for most people is that it is almost a reflex for us to praise our dog as we deliver a reward. We do it so often that we are mostly unaware of it. The danger is that we mistakenly think that the words themselves have some intrinsic value. They don’t. It is only because they have been associated with other high value rewards in the past that they have any value at all. If we stop pairing “Good Dog!” with a high value treat, over time the words by themselves will start to lose their value and be less effective.
Formulas for success
In working with our dogs, we have two things to do. First we have to teach them the behaviours we want and then we have to maintain their understanding and willingness to perform those behaviours. Using rewards of various kinds can be tremendously useful in doing both of these things. But using rewards effectively can be tricky. There are practical aspects like not always having food with us and also understanding when to use which kind of reinforcement for the job at hand.
We have some simple guidelines in our training to help us make good training choices:
Reinforcement rate – The consistency of our rewards depends on how well our dogs know the behaviour. New behaviours or behaviours that are being refreshed get consistent reinforcement each time the behaviour is offered. More practiced and well known behaviours can get more intermittent rewards but always with an eye toward making sure that we are maintaining the level of performance we want. Decreasing performance means we need to increase our reinforcement rate.
Reward values – Similar to what we do with our rate of rewards, we use the highest value rewards (food treats) for the least familiar behaviours. When teaching something new or refreshing a slacking behaviour we use the rewards that our dogs love the most. Coupling the high value with a high rate of reinforcement gives us the best results. We reserve those associated or conditioned rewards for well practiced, well known behaviours.
The combination of our rate of rewards and the value of rewards is adjusted based on what we are doing with our dogs. A lower rate of reward with higher value rewards can produce good results. A higher rate of reward with a lower value reward can also produce good results. Of course, a high rate of reward used with a high value reward produces the strongest results. And this is where the “art” of behavioural science comes into play. No two dogs will respond to reward schedules and values in exactly the same way.
The best trainers know their dogs. They understand what their dog values and what rates of reinforcement are needed for different levels of behaviour. A good understanding of the mechanics of reinforcement can help you to be a more effective trainer but only if you are willing to really see how your dog is responding to your rewards. Many people mistakenly believe that their dog should work a certain way (like training without food or just for praise). Our dogs work the way they work and it’s up to us to discover how to work with our own dogs. Training each dog is an adventure in itself. Use the science and be a better trainer!
Until next time, have fun with your dogs!