I frequently run into trainers that tell me that they “use Positive Reinforcement.” They don’t. They think they do, but they don’t. They can’t. It’s actually not possible to use Positive Reinforcement or any of the other quadrants of the Operant Conditioning model defined in the behavioural research of B.F. Skinner, Keller Breland, and many others. It would be like saying that a criminal used “murder” to kill his victim.
Perhaps we should start with a brief introduction for those who aren’t familiar with the Operant Conditioning (OC) model. Operant Conditioning is a behaviour modification technique and it defines behaviour using two variables – whether the behaviour increases or decreases as a result of the training and whether something was added to the trainee’s environment or was removed. It seems to be an unfortunate accident of history that some confusing terms were used to define these two variables.
In the OC model, any behaviour that becomes more intense, more frequent, or more likely as a result of the training is said to be “Reinforced” and any behaviour that becomes less intense, less frequent, or less likely as a result of the training is said to have been “Punished.” So behaviour increases indicate Reinforcement and behaviour decreases indicate that Punishment of the behaviour has occurred. To make matters even more fuzzy, if you added something to the trainee’s environment (like food or petting), that is called “Positive” in the OC model but it really means “additive.” And if you remove something from the environment, that is called “Negative” in the OC model but it really means “subtractive.”
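For readers who think better in code, here is a minimal sketch of how those two variables combine into the four quadrants. The names used here (Change, Outcome, quadrant) are my own invention for illustration and do not come from any behavioural-science library.

```python
from enum import Enum


class Change(Enum):
    """What happened to the trainee's environment (hypothetical helper type)."""
    ADDED = "added"      # "Positive" in OC terms, meaning additive
    REMOVED = "removed"  # "Negative" in OC terms, meaning subtractive


class Outcome(Enum):
    """What the behaviour did as a result (hypothetical helper type)."""
    INCREASED = "increased"  # behaviour was Reinforced
    DECREASED = "decreased"  # behaviour was Punished


def quadrant(change: Change, outcome: Outcome) -> str:
    """Combine the two OC variables into the name of a quadrant."""
    prefix = "Positive" if change is Change.ADDED else "Negative"
    suffix = "Reinforcement" if outcome is Outcome.INCREASED else "Punishment"
    return f"{prefix} {suffix}"


# Something was removed and the behaviour became less frequent.
print(quadrant(Change.REMOVED, Outcome.DECREASED))
```

Notice that neither variable says anything about whether the thing added or removed was pleasant or unpleasant.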
Confusion right from the start
It’s unfortunate that the initial researchers of Operant Conditioning chose to use words that already had common meanings slightly different from the way they used them. “Positive” usually means something good or pleasant and “negative” means kind of the opposite. “Reinforcement” is a pretty good term for describing what it does but “Punishment” is full of all kinds of meaning that doesn’t really line up with what OC means to communicate.
Here’s an example. If I say that I had trained my dog with “Positive Punishment”, someone unfamiliar with the OC model might think that I had reprimanded my dog in a cheerful or upbeat way. And they would not be surprised if the behaviour I “punished” didn’t actually decrease. That’s because the words “positive” and “punishment”, as commonly understood, don’t carry precise definitions.
It’s this linguistic fogginess that I believe is causing all kinds of problems in the dog world these days. The confusion goes beyond just the definition of the terms; there is also the problem of how the terms are used. The OC model can only work if you use it AFTER your dog’s behaviour has changed in some way. In other words, you can’t determine what aspect of Operant Conditioning has had an effect until after you have some change in your dog’s behaviour to examine.
Getting the horse in front of the cart
The basis of Operant learning is dictated by results. Whatever happens as a result of a dog’s behaviour will determine if they will be more or less likely to repeat the behaviour. If the result was the addition of something the dog wanted, it is likely (but not certain!) that the behaviour they just did will become more likely in the future. And that is an important distinction. We might intend to reinforce a behaviour but the results do not necessarily follow the intentions of the trainer.
Some of the confusion comes from the fact that the rewards we provide to try to reinforce or increase a behaviour are often successful. So it becomes a kind of shorthand to say that we “use Positive Reinforcement” when we mean that we gave our dog a reward in response to a behaviour. The same is true if we remove our attention from our dog in order to decrease a behaviour; we may fall into shorthand by saying we “used Negative Punishment” to decrease the behaviour.
But here’s the rub – what if what we did in our training doesn’t actually have the effect on our dog that we intended? What if I give my dog a yummy treat every time she comes to me when I call her name but sometimes she responds to the call and other times she doesn’t? If her recall never gets better, if she doesn’t come more frequently or more consistently, have I “used Positive Reinforcement” or did I just give my dog a treat? According to the strict definitions of science, if the behaviour didn’t increase, it could not be Positive Reinforcement! The same is true if I try to stop my dog from jumping up by yelling at her and pushing her down. I could say that I have used Positive Punishment (because I added the yelling and pushing to try to reduce the jumping behaviour) but if my dog continues to jump up with the same frequency, then I haven’t “punished” the jumping up in behavioural terms although it was my intention to “punish” her for jumping up! Do you see the confusion here?
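To make the point concrete, here is a rough sketch of labelling a training result only after the fact. Everything in it is an assumption made for illustration: the function name label_result is hypothetical, and the before and after counts (how many times out of ten calls my dog came to me) are made-up numbers, not real data.

```python
from typing import Optional


def label_result(stimulus_added: bool, count_before: int, count_after: int) -> Optional[str]:
    """Name the OC quadrant from the observed change in behaviour.

    Returns None when the behaviour did not change, because then neither
    Reinforcement nor Punishment has occurred, whatever the trainer intended.
    """
    if count_after == count_before:
        return None
    prefix = "Positive" if stimulus_added else "Negative"
    suffix = "Reinforcement" if count_after > count_before else "Punishment"
    return f"{prefix} {suffix}"


# I gave a treat on every successful recall, but she still comes 5 times in 10.
print(label_result(stimulus_added=True, count_before=5, count_after=5))

# Same treats, and now she comes 8 times in 10.
print(label_result(stimulus_added=True, count_before=5, count_after=8))
```

The trainer's intention never appears as an input. Only what was added or removed and what the behaviour then did are considered.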
Professional miscommunication
The last 15 years or so have seen a tremendous push to change how we think about dogs and dog training. A large part of that effort has been the introduction of animal and behavioural sciences to the dog training and dog owning communities. In our haste to advance, we may just have caught ourselves in a trap of our own making.
Perhaps we have simplified things too far. So far, in fact, that discussions and debates are arising about the usefulness and validity of Operant Conditioning. Many times these discussions can seem like arguments over religion – the equivalent of “how many angels can dance on the head of a pin?” In all of it, one very important point can get overlooked.
The usefulness of Operant Conditioning, at least as I learned it, for dog training is that it is a remarkably effective forensic tool. That is, it is a great system for helping us analyze how behaviours come about or diminish. Once I choose to begin referring to the OC terms and quadrants as if they are my intentions, that usefulness begins to break down. The reason is simple. Just like the examples I provided above, sometimes our intentions in training do not match the results. If I make the claim that I will use Positive Reinforcement to train a behaviour and that behaviour does not increase, then I have created a paradox. I may have indeed added rewards to our training session (the additive property of Positive in OC) but the behaviour did not increase so it cannot be Reinforcement!
Trainers often use the shorthand of saying that they will employ Positive Reinforcement for a simple and logical reason. They have used certain training techniques in the past and these have produced behaviours by Positive Reinforcement. Even so, that past experience does not guarantee that future uses for different behaviours or with different dogs will produce the same results.
Am I just splitting hairs?
Well, yes and no. I have a number of dog training colleagues with whom I can have conversations where we use this kind of shorthand without confusion. We are all on the same page and all have the same basic understanding of behavioural science and animal learning. We all know what we mean.
But here is that trap that I referred to earlier. Not everyone in the dog world is on the same page regarding behavioural science and animal learning. So when someone like myself, who has been using this kind of training for 13 years, has a conversation with someone new to the science and methods, things can get confusing. Unfortunately, most if not all of the disagreements I see between trainers can be traced back to different levels of understanding of the science or the incorrect use of the terms for the concepts they are trying to discuss.
The sciences of Operant Conditioning, Classical Conditioning, Ethology, Animal Learning, and Biology are complex and all of them should inform how we live and work with our dogs to a greater or lesser degree. But we do ourselves no favors by using shortcuts and shorthand in our efforts to make our point quickly. Sometimes it just takes time and effort to explain what we mean.
A counter-production of experts
I have talked in this column before about the need for the use of scientific terms in their proper context and using their proper definition. The positive training community is growing rapidly and, like a game of “Telephone”, as these concepts and techniques get passed from one dog trainer to another, it seems that the meanings are getting jumbled up. And nothing good will come of many people using the same words to mean different things.
The most obvious example of this problem that I see is Operant Conditioning. It is a framework designed to be used to determine how and why behaviour happened. That’s past tense! So all of this talk about why you can’t use Positive Reinforcement to teach good performance in agility or that we should never use Negative Reinforcement to address behaviour issues should just STOP.
Operant Conditioning is neutral. It tells us what happened so we can adjust how we are training and be more effective. I may intend to use Positive Reinforcement to teach my dog but if I do my training in an area she finds scary, I may actually end up with less of the behaviour than when I started – and that’s Positive Punishment.
As good trainers, we should be using and talking about Operant Conditioning in the proper context. One that helps us to define what happened from the observed results of our training session. Not what we intended or wanted to happen. Misrepresenting the science doesn’t do anything good. We confuse each other, we get frustrated when we think the science doesn’t work, and the critics of this kind of training have more to criticize.
Until next time, have fun with your dogs!
Photo credits –
Training with science – Anne-Marie Visser copyright 2005 from Flickr
Teaching trainers – Andrea Arden copyright 2011 from Flickr
Diagram attribution –
Operant Conditioning Simplified – Eric Brad
Operant Conditioning Expanded – William S. Altman from this website
“The most obvious example of this problem that I see is Operant Conditioning. It is a framework designed to be used to determine how and why behaviour happened. That’s past tense!”
The reason you are getting flak on this article is that you ARE wrong, whether you intended to be or not. The Operant Conditioning model is based on “Conditioning”, as in “to condition”, a verb. You are splitting hairs and trying to pick apart the perception of what this is, according to people who don’t know what it is.
Conditioning a behavior is “conditioning a behavior” through controlling the consequence of the behavior. It is not a “study” of what happened after the fact. That is observation, which is a completely different thing.
You CAN use “positive reinforcement” to attempt to reinforce a behavior. Whether or not you are successful in that particular attempt is a separate matter, as is whether or not you reinforced the desired behavior, or in fact inadvertently reinforced an undesirable behavior, but that does not make the model invalid, only that it wasn’t applied correctly.
Thanks for your comment Jason.
I think this line from your comment illustrates my point very well –
“You CAN use “positive reinforcement” to attempt to reinforce a behavior.” See that word “attempt”? That’s the problem. If you “attempt” to modify a behaviour and you make no change, you have not reinforced it. Period. End of story.
Having said that, you are certainly entitled to your views on the matter.
Also, please note that I am not in any way saying that the Operant Conditioning model is not valid. Quite the contrary! Not only is it valid, but, used correctly, it operates whether the trainer chooses to acknowledge its validity or not! It’s kind of like the laws of motion or the law of gravity. It is just THERE. You can choose to believe in it or not. But, as the popular meme says, science doesn’t give a shit what you believe!
Thanks for your thoughts!
Eric
Thank you for a very interesting read. I agree that most who use the term ‘positive reinforcement’ do not fully understand what it means, and are often trying to distinguish themselves from more traditional corrective methods. They tend to use ‘positive’ in the sense of it’s nice, kind, good and not painful or cruel, as opposed to the additive sense. I also think some use it to make it sound like they’re more knowledgeable than they are. I think the use of technical jargon can put the public off, so I simply use the term ‘reward based training’, which they do understand (although I have studied Learning Theory myself).
It’s very good to read this useful post on dog training.
I have a concern however. How do you train a younger dog?
There are lots of great books to get you started with your dog. I would recommend THIS book as a great start. There are lots of others.
I enjoy your work
This is a very convoluted load of dog poop. If you use positive reinforcement to get a certain behavior and you don’t get it you have not created a paradox except in a sense of linguistic gymnastics. You have simply failed in your reading of and reaction to the dog. Negative reinforcement is a made up concept. A dog is corrected not negatively reinforced, for not doing a behavior it has been taught. The article shows a tremendous lack of understanding of the nature of dog training.
Thank you, Dr. Payne, for your insightful, respectful, well-researched, and detailed rebuttal on a topic you obviously understand better than I do.
Seriously, anyone who still thinks you “correct” a dog is woefully behind in their reading of contemporary scientific research on canine cognitive abilities.
Thanks for taking the time to tell us about your thoughts.
Eric
Hi Eric,
I fear this article could further confound the understanding of how anyone can plan a training session, evaluate and then plan the very next session (tracking progress..). When training/teaching in such a way, we are purposefully *applying* Skinnerian quadrants.
To say trainers can’t “use positive reinforcement” (target a behavior to be positively reinforced) because we can’t tell that the session will go as planned assumes that trainers can neither observe and analyze behavior nor adjust (and abort) training sessions. This is not true.
We use (present tense), will use (future tense) and have used positive reinforcement.
Hi Chad –
Thanks for reading and thanks for your comments.
I believe that you actually said it best in your comment when you say you, “target a behavior to be positively reinforced.” We can be certain during the planning phases and even execution that we will use the “Positive” part of Positive Reinforcement. What remains in doubt until after the training is whether or not “Reinforcement” of that targeted behaviour has taken place.
You, like many others, will use the terms and language of science and training as you see fit. Each of us will operate within our particular communities more or less effectively. If this article has caused people to look more closely at the meaning and mechanics of Operant Conditioning and its application in dog training, I am satisfied.
It’s not really about policing the language but keeping us mindful of how the science and the processes work. Given your demonstrated understanding of modern dog training, I have no fear that your students or those you work with will have any trouble using Operant Conditioning effectively. Good trainers will communicate the ideas and the process. That’s more important than syntax.
Thanks again,
Eric
Um, not really. In psychology, we say we use positive reinforcement because we plan carefully, and call both the method and the result by the same name. We plan by knowing what is likely to be a pleasurable stimulus and timing it in such a way that it reinforces a behavior. This is different from giving a reward; we are using the known stimulus/response link to modify behavior. Skinner was decades ago; his terms have been co-opted and his theories built upon. Positive reinforcement is the core of applied behavior analysis and other effective treatments for autism and related syndromes.
Plus, even if this were not the case, it would be a semantic argument that has little purpose. As long as you are using the term correctly (positive=presented stimulus, reinforcement=increase frequency), trainer or psychologist, people will know what you mean. Isn’t that the essence of language? Language depends only on understanding, not on anyone’s approval of the words used. When words like this are used frequently enough to mean something, they mean that thing whether or not they started out that way (look up “begging the question” or “enormity” in current dictionaries vs. 20 year old dictionaries).
Thanks for reading and thanks for the comments!
I guess we’d all better make sure we all know what we mean when we use the same words then, eh? With any luck, this article shed a little light on that and made it that much easier to assume we all know what we’re talking about.
All the best,
Eric
Nicely written reminder about the proper use of terminology and the unintended consequences of confusion. Do you have recommendations for appropriate language to use when describing training methodologies?
This article is the kind of argumentative nit-picky competitive ugliness that makes people afraid to dip their toes in the waters of science-based training. Someone is going to tell them they are wrong and everyone they know is wrong and then leave them in utter confusion. You absolutely can use the technique of positive reinforcement as a predictable process to change behavior. If you equip yourself with appetitive reinforcers for a training session, you just might be using positive reinforcement to modify behavior. But this article is about punishing people for not being as semantically “scientific” as the author thinks they should be. I am sure that the dogwhisperers and horsewhisperers will appreciate you driving people away from science-based training.
Hi Patricia –
Thank you for reading and considering.
I’m sorry you feel this way about this article. I think you might find that nearly all of the other 150+ articles I’ve written here at Life As A Human do, in fact, advocate and promote the use of science-based training.
I find it somewhat ironic that you suggest my motives in writing this are “about punishing people” and you open your comments by labelling it “argumentative”, “nit-picky”, and “competitive.” That said, we are all entitled to our interpretation. I guess I’m just curious how you knew what my intentions were in writing it.
Thanks for taking the time to read and to offer your thoughts.
Eric
This is a great article on a topic that’s bothered me for some time now.
However,
You say that this is “past tense” and that you don’t “use” behaviorism to train. It’s more of a framework for understanding.
But then you go on to make comments like “critics of this type of training” and “So when someone like myself, who has been using this kind of training for 13 years” which seem to indicate the opposite of the assertion you are making.
If this science is not a training method “to be used” but rather a framework of understanding…. then what exactly are you saying you’ve been “using” for 13 years and what are the “critics” being critical of?
I actually very much agree with you. I have been saying for a while now that there is this odd sort of dogmatic allegiance to the “behaviorists” when it seems like no one actually understands what Behaviorism is. And we all have to back away from the argumentative, one side of the fence or the other stance, and realize that there are these different schools of thought in the science of behavior, learning, and psychology which we should use as tools for understanding. But they are not tools for training. They can simply inform our training.
But like I said… it seems that you are also falling into the linguistic trap that you so well pointed out. Either that or I’m misunderstanding you somewhere.
Either, or, some clarification would be helpful.
Stuart Says… I look forward to the response to your comments, I am a Behavior Analyst.
Great article, thank you! I agree with many of the points you made; they have been “disturbing” me for a long time now, in my reading, in dealing with reality, and in arguing in the “trainers” world. For me there is in addition a great gap, being French-speaking (Switzerland), which adds a culturally driven appreciation of a paradigm.
In fact, I do not think we are working, or should work, in the spirit of changing a behaviour. We should aim to give the dog the possibility of experiencing other EMOTIONS in a situation that is difficult or challenging for that dog, in that moment, whatever the overwhelming stimulus is (over-reacting with dog behaviours that our society accepts or does not accept). To that extent, I totally agree that labelling any kind of training, especially on the basis of quadrants that look retrospectively at a complex network of interactions, however science-based it is, is unfortunate and misleading.
Excellent Eric,
And these quadrants are also misused to demonize methods and the people who use them. Thorndike (Law of Effect) and Skinner would be spinning in their graves seeing how these concepts have been moralised. They were never conceived to have anything to do with morals or ethics. That’s not to say that one cannot apply one’s own moral and ethical standards to training, but not with something that was never conceived to be a measure of that.
The next problem is, that while one can filter out all contributing possible unwanted influences in the lab and keep research very tidy, real life just isn’t that tidy. So you may have overlapping consequences going on, in other words -R and +R at the same time.
Nice job, Eric!
What an interesting article. I would also add that while, in scientific terms, it isn’t ‘positive reinforcement’ if we don’t see the behaviour we desire, this is not because we didn’t intend to offer positive reinforcement but rather because we’ve reinforced something other than the thing we intended to reinforce! Learning the theory parrot fashion is one thing; learning to apply it and be fully aware of all the variables in the individual animal, environment and your own ability to reinforce the right thing, at the right moment, with the right intention is entirely another!
I think you make some good points here. I’m not involved in agility training so I am not familiar with talk about why you can’t use positive reinforcement to teach performance in agility (?) I do training classes for pet dogs and I use positive reinforcement and force-free methods. My students are not trainers and usually think of “positive” as “good”. I explain that pertaining to the quadrants of operant conditioning, positive means adding something (good or bad) and negative means removing something (good or bad) to increase or decrease a behavior. I point out that it is only reinforcement if the behavior increases and only punishment if the behavior decreases. Therefore, if the behavior we want doesn’t increase, we haven’t reinforced it. The problem is not with the theory, it is with the application of it. The reward is only a reinforcer if the dog thinks it is valuable and responds by repeating the behavior.
Maybe it is unfortunate that positive and negative have so many different definitions (Webster’s has about eight for positive depending on the context). I have named my blog “My Positive Dog Training Blog”, which is actually using the adjective “positive” defined as “having a good effect, favorable”. Victoria Stilwell’s “Positively” Training also uses the adverb for “favorably”.
Hi Linda –
Thanks for your comments!
I agree with you that it is the application or use of the terms that is often at fault and not the science of learning. It works just the way it is described and it’s the confusion in the language that sometimes makes people think it doesn’t work. I’m glad to hear that you make those distinctions for your students and repeat them as necessary to help them learn!
And yes, you can be a “positive” trainer in the classic definition of the word in that you are providing a “positive, upbeat, and fun” place to work with dogs! Nothing wrong with that at all!
Thanks for your comments.
Eric