Topics covered: Second derivative test; boundaries and infinity
Instructor: Prof. Denis Auroux
Lecture Notes - Week 4 Summary (PDF)
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu. So, today we are going to continue looking at critical points, and we'll learn how to actually decide whether a critical point is a minimum, a maximum, or a saddle point. So, that's the main topic for today. So, remember yesterday, we looked at critical points of functions of several variables. And so, a critical point of a function of two variables, x and y, that's a point where the partial derivatives are both zero. And, we've seen that there are various kinds of critical points. There are local minima. So, maybe I should show the function on this contour plot. There are local maxima, which are like that. And, there are saddle points, which are neither minima nor maxima. And, of course, if you have a real function, then it would be more complicated. It will have several critical points. So, this example here, well, you see on the plot that there are two maxima. And, there is in the middle, between them, a saddle point. And, actually, you can see them on the contour plot. On the contour plot, you see the maxima because the level curves become circles that narrow down and shrink to the maximum. And, you can see the saddle point because here you have this level curve that makes a figure eight. It crosses itself. And, if you move up or down here, so along the y direction, the values of the function will decrease. Along the x direction, the values will increase. So, you can usually see quite easily where the critical points are just by looking either at the graph or at the contour plot. So, the only thing with the contour plots is you need to read the values to tell a minimum from a maximum because the contour plots look the same.
Just, of course, in one case, the values increase, and in the other they decrease. So, the question is, how do we decide between the various possibilities? So, local minimum, local maximum, or saddle point. And, in fact, why do we care? Well, the other question is how do we find the global minimum or maximum of a function? So, here what I should point out, well, first of all, to decide where the function is the largest, in general you'll actually have to compare the values. For example, here, if you want to know, what is the maximum of this function? Well, we have two obvious candidates. We have this local maximum and that local maximum. And, the question is, which one is the higher of the two? Well, in this case, there is actually a tie for maximum. But, in general, you would have to compute the function at both points and compare the values. If you know that it's three at one of them and four at the other, well, four wins. The other thing that you see here is if you are looking for the minimum of this function, well, the minimum is not going to be at any of the critical points. So, where's the minimum? Well, it looks like the minimum is actually out there on the boundary or at infinity. So, that's another feature. The global minimum or maximum doesn't have to be at a critical point. It could also be, somehow, on the side, in some limiting situation where one variable stops being in the allowed range of values or goes to infinity. So, we have to actually check the boundary and the behavior at infinity of our function to know where, actually, the minimum and maximum will be. So, in general, I should point out, these should occur either at a critical point or on the boundary or at infinity. So, by that, I mean on the boundary of the domain of definition that we are considering. And so, we have to try both. OK, but we'll get back to that. For now, let's try to focus on the question of, you know, what's the type of the critical point?
So, we'll use something that's known as the second derivative test. And, in principle, well, the idea is kind of similar to what you do with a function of one variable. Namely, for a function of one variable, if the derivative is zero, then you know that you should look at the second derivative. And, that will tell you whether it's curving up or down, whether you have a local max or a local min. And, the main problem here is, of course, we have more possible situations, and we have several derivatives. So, we have to think a bit harder about how we'll decide. But, it will again involve the second derivatives. OK, so let's start with just an easy example that will be useful to us because actually it will provide the basis for the general method. OK, so we are first going to consider a case where we have a function that's actually just quadratic. So, let's say I have a function, w of (x, y), that's of the form ax^2 + bxy + cy^2. OK, so this guy has a critical point at the origin because if you take the derivative with respect to x, and if you plug in x equals y equals zero, you'll get zero, and same with respect to y. You can also see, if you try to do a linear approximation of this, well, all these terms are much smaller than x and y when x and y are small. So, the linear approximation, the tangent plane to the graph, is really just w = 0. OK, so, how do we do it? Well, yesterday we actually did an example. It was a bit more complicated than that, but remember, we were looking at something that started with x^2 + 2xy + 3y^2. And, there were other terms. But, let's forget them now. And, what we did is we said, well, we can rewrite this as (x + y)^2 + 2y^2. And now, this is a sum of two squares. So, each of these guys has to be nonnegative. And so, the origin will be a minimum. Well, it turns out we can do something similar in general, no matter what the values of a, b, and c are. We'll just try to first complete things to a square. OK, so let's do that.
So, in general, well, let me be slightly less general, and let me assume that a is not zero because otherwise I can't do what I'm going to do. So, I'm going to write this as a times (x^2 plus b over a times xy). And then I have my cy^2. And now this looks like the beginning of the square of something, OK, just like what we did over there. So, what is it the square of? Well, you'd start with x plus, I claim, if I put b over 2a times y and I square it, then see, the cross term, two times x times b over 2a times y, will become b over a times xy. Of course, now I also get some y squares out of this. How many y squares do I get? Well, I get b^2 over 4a^2, times a. So, I get b^2 over 4a times y^2. And I want, in fact, c times y^2. So, the number of y^2 that I should add is c minus b^2 over 4a. OK, let's see that again. If I expand this thing, I will get ax^2 plus a times b over 2a times 2xy. That's going to be my bxy. But, I also get b^2 over 4a^2 times y^2, times a. That's b^2 over 4a times y^2. And, that cancels out with this guy here. And then, I will be left with cy^2. OK, do you see it kind of? OK, if not, well, try expanding this square again. OK, maybe I'll do it just to convince you. So, if I expand this, I will get a times, let me put that in a different color because you shouldn't write that down. It's just to convince you again. So, if you don't see it yet, let's expand this thing. We'll get a times x^2, plus a times 2x times b over 2a times y. Well, the two a's cancel out. We get bxy, plus a times the square of that, that's going to be b^2 over 4a^2 times y^2, plus cy^2 minus b^2 over 4a times y^2. Here, the a and the a simplify, and now these two terms cancel and give me just cy^2 in the end. OK, and that's kind of unreadable after I've canceled everything, but if you follow it, you see that basically I've just rewritten my initial function. OK, is that kind of OK? I mean, otherwise there's just no substitute. You'll have to do it yourself, I'm afraid. OK, so, let me continue to play with this.
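Written out in symbols, the completing-the-square step described above looks as follows. This is a reconstruction from the spoken algebra, assuming a is not zero as in the lecture, and is not a copy of the board.

```latex
\begin{align*}
w &= ax^2 + bxy + cy^2 \\
  &= a\left(x^2 + \frac{b}{a}\,xy\right) + cy^2 \\
  &= a\left(x + \frac{b}{2a}\,y\right)^2 + \left(c - \frac{b^2}{4a}\right)y^2 .
\end{align*}
```

Expanding the square in the last line gives ax^2 + bxy + (b^2/(4a))y^2, and the extra (b^2/(4a))y^2 is exactly what the second term subtracts.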
So, I'm just going to put this in a slightly different form just to clear the denominators. OK, so, I will instead write this as one over 4a times the big thing. So, I'm going to just put 4a^2 times (x plus b over 2a times y) squared. OK, so far I have the same thing as here. I just introduced the 4a that cancels out. Plus, for the other one, I'm just clearing the denominator. I end up with (4ac - b^2)y^2. OK, so that's a lot of terms. But, what does it look like? Well, it looks like, so we have some constant factors, and here we have a square, and here we have a square. So, basically, we've written this as a sum of two squares, well, a sum or a difference of two squares. And, maybe that's what we need to figure out to know what kind of point it is because, see, if you take a sum of two squares, then you know that each square takes nonnegative values. And the function will always take nonnegative values. So, the origin will be a minimum. Well, if you have a difference of two squares, then typically you'll have a saddle point because depending on whether one or the other is larger, you will have a positive or a negative quantity. OK, so I claim there are various cases to look at. So, let's see. So, in fact, I claim there will be three cases. And, that's good news for us because after all, we want to distinguish between three possibilities. So, let's first do away with the most complicated one. What if 4ac minus b^2 is negative? Well, if it's negative, then it means what I have between the brackets is, so the first guy is obviously a positive quantity, while the second one will be something negative times y^2. So, it will be a negative quantity. OK, so one term is positive. The other is negative. That tells us we actually have a saddle point. We have, in fact, written our function as a difference of two squares. OK, is that convincing?
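For reference, the form with the denominators cleared can be written as below. It is a sketch reconstructed from the spoken description, still assuming a is not zero.

```latex
\[
w = \frac{1}{4a}\left[\,4a^2\left(x + \frac{b}{2a}\,y\right)^2 + \left(4ac - b^2\right)y^2\,\right]
\]
```

When 4ac - b^2 is negative, the bracket is a nonnegative square plus a nonpositive multiple of y^2, that is, a difference of two squares.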
So, if you want, what I could do is actually I could change my coordinates, have new coordinates called u equals x plus b over 2a times y, and v, actually, well, I could keep y, and then it would look like a difference of squares directly. OK, so that's the first case. The second case is where 4ac - b^2 = 0. Well, what happens if that's zero? Then it means that this term over there goes away. So, what we have is just one square. OK, so what that means is actually that our function depends only on one direction of things. In the other direction, it's going to actually be degenerate. So, for example, forget all the clutter in there. Say I give you just the function of two variables, w equals just x^2. So, that means it doesn't depend on y at all. And, if I try to plot the graph, it will look like, well, x is here. So, it will depend on x in that way, but it doesn't depend on y at all. So, what the graph looks like is something like that. OK, basically it's a valley whose bottom is completely flat. So, that means, actually, we have a degenerate critical point. It's called degenerate because there is a direction in which nothing happens. And, in fact, you have critical points everywhere along the y axis. Now, whether the square that we have is x or something else, namely, x plus b over 2a times y, it doesn't matter. I mean, we will still get this degenerate behavior. There's a direction in which nothing happens because we just have the square of one quantity. I'm sure that 300 students means 300 different ring tones, but I'm not eager to hear all of them, thanks. [LAUGHTER] OK, so, this is what's called a degenerate critical point, and [LAUGHTER]. OK, so basically we'll leave it here. We won't actually try to figure out further what happens, and the reason for that is that when you have an actual function, a general function, not just one that's quadratic like this, then there will actually be other terms maybe involving higher powers, maybe x^3 or y^3 or things like that.
And then, they will mess up what happens in this valley. And, it's a situation where we won't be able, actually, to tell automatically just by looking at second derivatives what happens. See, for example, for a function of one variable, say, f of x equals x to the five, well, if you try to decide what type of point the origin is, you're going to take the second derivative. It will be zero, and then you can't conclude. Those things depend on higher order derivatives. So, we just won't like that case. We just won't try to figure out what's going on here. Now, the last situation is if 4ac - b^2 is positive. So, then, that means that the big bracket up there is a sum of two squares. So, that means that we've written w as one over 4a times (something squared plus something else squared), OK? So, these guys have the same sign, and that means that this term here will always be greater than or equal to zero. And that means that we should either have a maximum or a minimum. How do we find out which one it is? Well, we look at the sign of a, exactly. OK? So, there are two sub-cases. One is if a is positive, then this quantity overall will always be nonnegative. And that means we have a minimum, OK? And, if a is negative on the other hand, so that means that we multiply this nonnegative quantity by a negative number, we get something that's always negative or zero. So, zero is actually the maximum. OK, is that clear for everyone? Yes? Sorry, yeah, so I said in the example w equals x^2, it doesn't depend on y. So, the more general situation is w equals some constant, well, I guess it's a times (x plus b over 2a times y)^2. So, it does depend on x and y, but it only depends on this combination. OK, so if I choose to move in some other direction, in the direction where this remains constant, so maybe if I set x equals minus b over 2a times y, then this remains zero all the time.
So, there's a degenerate direction in which I stay at the minimum or maximum, or whatever it is that I have. OK, so that's why it's called degenerate. There is a direction in which nothing happens. OK, yes? Yes, yeah, so that's a very good question. So, this is going to be the second derivative test. Why do we not have derivatives yet? Well, that's because I've been looking at this special example where we have a function like this. And, so I don't actually need to take derivatives yet. But, secretly, that's because a, b, and c will be the second derivatives of the function, actually, 2a, b, and 2c. So now, we are going to go to a general function. And there, instead of having these coefficients a, b, and c given to us, we'll have to compute them as second derivatives. OK, so here, I'm basically setting the stage for what will be the actual criterion we'll use using second derivatives. Yes? So, yeah, so when you have a degenerate critical point, it could be a degenerate minimum or a degenerate maximum depending on the sign of a. But, in general, once you start having functions, you don't really know what will happen anymore. It could also be a degenerate saddle, and so on. So, we won't really be able to tell. Yes? It is possible to have a degenerate saddle point. For example, if I gave you x^3 + y^3, you can convince yourself that if you take x and y to be negative, it will be negative. If x and y are positive, it's positive. And, it has a very degenerate critical point at the origin. So, that's a degenerate saddle point. We don't see it here because that doesn't happen if you have only quadratic terms like that. You need to have higher-order terms to see it happen. OK. OK, so let's continue. Before we continue, I wanted to point out one small thing. So, here, we have the magic quantity, 4ac minus b^2. You've probably seen that before in your life. Yeah, it looks like the quadratic formula, except that one involves b^2 - 4ac. But that's really the same thing.
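The three cases found so far for w = ax^2 + bxy + cy^2 can be summarized in a few lines of code. This is only an illustrative sketch: the function name `classify_quadratic` and the returned labels are made up for this example and do not come from the lecture.

```python
def classify_quadratic(a, b, c):
    """Classify the critical point at the origin of w = a*x**2 + b*x*y + c*y**2.

    Hypothetical helper: it only encodes the sign rules for 4ac - b^2.
    """
    disc = 4 * a * c - b * b  # the "magic quantity" 4ac - b^2
    if disc < 0:
        return "saddle"       # difference of two squares
    if disc == 0:
        return "degenerate"   # a single square: one direction where nothing happens
    # disc > 0 forces a != 0, so the sign of a decides between min and max
    return "minimum" if a > 0 else "maximum"


print(classify_quadratic(1, 2, 3))    # x^2 + 2xy + 3y^2, the example from the lecture
print(classify_quadratic(1, 4, 1))    # 4ac - b^2 = -12
print(classify_quadratic(1, 2, 1))    # (x + y)^2, a single square
print(classify_quadratic(-1, 0, -1))  # -(x^2 + y^2)
```

The derivation in the lecture assumes a is not zero; the sign rules still give a sensible label when a = 0, since then 4ac - b^2 = -b^2 can never be positive.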
OK, so let's see, where does the quadratic formula come in here? Well, let me write things differently. OK, so we've manipulated things, and got to a conclusion. But, let me just do a different manipulation, and write this now instead as y^2 times (a times (x over y) squared, plus b times (x over y), plus c). OK, see, that's the same thing that I had before. Well, so now this quantity here, y^2, is always nonnegative. What about this one? Well, of course, this one depends on x over y. It means it depends on which direction you're going to move away from the origin, which ratio between x and y you will consider. But, I claim there are two situations. So, let's try to reformulate things. So, if the discriminant here, b^2 - 4ac, is positive, then it means that this quadratic has roots, this equation has solutions. And, that means that this quantity can be both positive and negative. This quantity takes positive and negative values. One way to convince yourself is just to, you know, plot at^2 + bt + c. You know that there are two roots. So, it might look like this, or it might look like that, depending on the sign of a. But, in either case, it will take values of both signs. So, that means that your function will take values of both signs. It takes both positive and negative values. And, so that means we have a saddle point, while the other situation, when b^2 - 4ac is negative, means that this quadratic never takes the value zero. So, it's always positive or it's always negative, depending on the sign of a. So, the other case is if b^2 - 4ac is negative, then the quadratic equation doesn't have a solution. And it could look like this or like that depending on whether a is positive or a is negative. So, in particular, that means that a times (x over y) squared, plus b times (x over y), plus c is always positive or always negative depending on the sign of a. And then, that tells us that our function, w, will be always positive or always negative. And then we'll get a minimum or maximum.
OK, we'll have a min or a max depending on which situation we are in. OK, so that's another way to derive the same answer. And now, you see here why the discriminant plays a role. That's because it exactly tells you whether this quadratic quantity always has the same sign, or whether it can actually cross the value zero, when you have a root of the quadratic. OK, so hopefully at this stage you are happy with one of the two explanations, at least. And now, you are willing to believe, I hope, that we have basically a way of deciding what type of critical point we have in the special case of a quadratic function. OK, so, now what do we do with a general function? Well, so in general, we want to look at second derivatives. OK, so now we are getting to the real stuff. So, how many second derivatives do we have? That's maybe the first thing we should figure out. Well, we can take the derivative first with respect to x, and then again with respect to x. OK, that gives us something we denote by partial squared f over partial x squared, or fxx. Then, there's another one which is fxy, which means you take the derivative with respect to x, and then with respect to y. Another thing you can do is first take the derivative with respect to y, and then with respect to x. That would be fyx. Well, good news. These are actually always equal to each other. OK, so it's a fact that we will admit; it's actually not very hard to check. So these are always the same. We don't need to worry about which one we do. That's one computation that we won't need to do. We can save a bit of effort. And then, we have the last one, namely, the second partial with respect to y and y, fyy. OK, so we have three of them. So, what does the second derivative test say? It says, say that you have a critical point (x0, y0) of a function of two variables, f, and then let's compute the partial derivatives. So, let's call capital A the second derivative with respect to x.
Let's call capital B the second derivative with respect to x and y. And C equals fyy at this point, OK? So, these are just numbers because we first compute the second derivative, and then we plug in the values of x and y at the critical point. So, these will just be numbers. And now, what we do is we look at the quantity AC - B^2. I am not forgetting the four. You will see why there isn't one. So, if AC - B^2 is positive, then there are two sub-cases. If A is positive, then it's a local minimum. The second case, so, still, if AC - B^2 is positive, but A is negative, then it's going to be a local maximum. And, if AC - B^2 is negative, then it's a saddle point, and finally, if AC - B^2 is zero, then we actually cannot conclude. We don't know whether it's going to be a minimum, a maximum, or a saddle. We know it's degenerate in some way, but we don't know what type of point it is. OK, so that's actually what you need to remember. If you are formula oriented, that's all you need to remember about today. But, let's try to understand why, how this comes out of what we had there. OK, so, I think maybe I want to keep this middle board because it actually has, you know, the recipe that we found before for the quadratic function. So, let me move directly over there and try to relate our old recipe with the new. OK, you are easily amused. OK, so first, let's check that these two things say the same thing in the special case that we were looking at. OK, so let's verify in the special case where the function was ax^2 + bxy + cy^2. So, well, what is the second derivative with respect to x and x? First, let's take the first partial, wx. That will be 2ax + by, right? So, wxx will be, well, let's take the partial with respect to x again. That's 2a. wxy, I take the partial with respect to y, and we'll get b.
OK, now we need, also, the partial with respect to y. So, wy is bx + 2cy. In case you don't believe what I told you about the mixed partials, wyx, well, you can check. And it's, again, b. So, they are, indeed, the same thing. And, wyy will be 2c. So, if we now look at these quantities, that tells us, well, big A is two little a, big B is little b, big C is two little c. So, AC - B^2 is what we used to call four little ac minus b^2. OK, ooh. [LAUGHTER] So, now you can compare the cases. They are not listed in the same order just to make it harder. So, we said first, so the saddle case is when AC - B^2 in big letters is negative, that's the same as 4ac - b^2 in lower case is negative. The case where capital AC - B^2 is positive, local min and local max, corresponds to this one. And, the case where we can't conclude was what used to be the degenerate one. OK, so at least we don't seem to have messed up when copying the formula. Now, why does that work more generally than that? Well, the answer to that is, again, Taylor approximation. Aww. OK, so let me just do here quadratic approximation. So, quadratic approximation tells me the following thing. It tells me, if I have a function, f of (x, y), and I want to understand the change in f when I change x and y a little bit. Well, there are the first-order terms. There are the linear terms that by now you should know and be comfortable with. That's fx times the change in x. And then, there's fy times the change in y. OK, that's the starting point. But now, of course, if x and y, sorry, if we are at the critical point, then fx is going to be zero at the critical point. So, that term actually goes away, and fy is also zero at the critical point. So, that term also goes away. OK, so linear approximation is really no good. We need more terms. So, what are the next terms?
Well, the next terms are quadratic terms, and so I mean, if you remember the Taylor formula for a function of a single variable, there was the derivative times (x - x0) plus one half of the second derivative times (x - x0)^2. And see, this side here is really Taylor approximation in one variable looking only at x. But of course, we also have terms involving y, and terms involving simultaneously x and y. And, these terms are fxy times the change in x times the change in y, plus one half of fyy times (y - y0)^2. There's no one half in the middle because, in fact, you would have two terms, one for xy, one for yx, but they are the same. And then, if you want to continue, there are actually cubic terms involving the third derivatives, and so on, but we are not actually looking at them. And so, now, when we do this approximation, well, the type of critical point remains the same when we replace the function by this approximation. And so, we can apply the argument that we used to deduce things in the quadratic case. In fact, it still works in the general case using this approximation formula. So, the general case reduces to the quadratic case. And now, you see actually why, well, here you see, again, how this coefficient which we used to call little a is also one half of capital A. And same here: this coefficient is what we call capital B or little b, and this coefficient here is what we called little c, or one half of capital C. And then, when you substitute these into the various cases that we had here, you end up with the second derivative test. So, what about the degenerate case? Why can't we just say, well, it's going to be a degenerate critical point? So, the reason is that this approximation formula is reasonable only if the higher order terms are negligible. OK, so in fact, secretly, there are more terms. This is only an approximation. There would be terms involving third derivatives, and maybe even beyond that.
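In symbols, the quadratic approximation just described reads as follows. This is a reconstruction from the spoken description; the shorthand Δx = x - x0 and Δy = y - y0 is introduced here for brevity and is not from the lecture.

```latex
\[
\Delta f \approx f_x\,\Delta x + f_y\,\Delta y
  + \tfrac{1}{2} f_{xx}\,(\Delta x)^2 + f_{xy}\,\Delta x\,\Delta y + \tfrac{1}{2} f_{yy}\,(\Delta y)^2 .
\]
At a critical point $f_x = f_y = 0$, so with $A = f_{xx}$, $B = f_{xy}$, $C = f_{yy}$ evaluated at $(x_0, y_0)$:
\[
\Delta f \approx \tfrac{1}{2} A\,(\Delta x)^2 + B\,\Delta x\,\Delta y + \tfrac{1}{2} C\,(\Delta y)^2 ,
\qquad a = \tfrac{A}{2},\quad b = B,\quad c = \tfrac{C}{2},\quad 4ac - b^2 = AC - B^2 .
\]
```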
And so, in the nondegenerate case, they don't actually matter because the shape of the function, the shape of the graph, is actually determined by the quadratic terms. But, in the degenerate case, see, if I start with this and I add something even very, very small along the y axis, then that can be enough to bend this very slightly up or slightly down, and turn my degenerate point into either a minimum or a saddle point. And, I won't be able to tell until I go further in the list of derivatives. So, in the degenerate case, what actually happens depends on the higher order derivatives. So, we would need to analyze things more carefully. Well, we're not going to bother with that in this class. So, we'll just say, well, we cannot conclude, OK? I mean, you have to realize that in real life, you have to be extremely unlucky for this quantity to end up being exactly 0. [LAUGHTER] Well, if that happens, then what you should do is maybe try by inspection. See if there's a good reason why the function should always be positive or always be negative, or something. Or, you know, plot it on a computer and see what happens. But, otherwise we can't conclude. OK, so let's do an example. So, probably I should leave this on so that we still have the test with us. And, instead, OK, so I'll do my example here. OK, so just an example. Let's look at f of (x, y) = x + y + 1/(xy), where x and y are positive. So, I'm looking only at the first quadrant. OK, I mean, I'm doing this because I don't want the denominator to become zero. So, I'm just looking at this situation. So, the question will be, what are the minimum and maximum of this function? So, the first thing we should do to answer this question is look for critical points, OK? So, for that, we have to compute the first derivatives. OK, so fx is one minus one over (x^2 y), OK? Take the derivative of one over x, that's negative one over x^2. And, we'll want to set that equal to zero. And fy is one minus one over (x y^2).
And, we want to set that equal to zero. So, what are the equations we have to solve? Well, I guess x^2 y equals one, I mean, if I move this guy over here I get one over (x^2 y) equals one. That's x^2 y equals one, and x y^2 equals one. What do you get by comparing these two? Well, x and y should both be, OK, so yeah, I agree with you that one and one is a solution. Why is it the only one? So, first, if I divide this one by that one, I get x over y equals one. So, it tells me x equals y. And then, if x equals y, then if I put that into here, it will give me y^3 equals one, which tells me y equals one, and therefore, x equals one as well. OK, so, there's only one solution. There's only one critical point, which is going to be (1, 1). OK, so, now here's where you do a bit of work. What do you think of that critical point? OK, I see some valid votes. I see some, OK, I see a lot of people answering four. [LAUGHTER] That seems to suggest that maybe you haven't completed the second derivative yet. Yes, I see someone giving the correct answer. I see some people not giving quite the correct answer. I see more and more correct answers. OK, so let's see. To figure out what type of point it is, we should compute the second partial derivatives. So, fxx is, what do we get when we take the derivative of this with respect to x? Two over (x^3 y), OK? So, at our point, A will be 2. fxy will be one over (x^2 y^2). So, B will be one. And, fyy is going to be two over (x y^3). So, C will be two. And so that tells us, well, AC - B^2 is four minus one. Sorry, I should probably use a different blackboard for that. AC - B^2 is two times two minus 1^2, which is three. It's positive. That tells us we have either a local minimum or a local maximum. And, A is positive. So, it's a local minimum. And, in fact, you can check it's the global minimum. What about the maximum? Well, the maximum is not actually at a critical point. It's on the boundary, or at infinity.
See, so we actually have to check what happens when x and y go to zero or to infinity. Well, if that happens, if x or y goes to infinity, then the function goes to infinity. Also, if x or y goes to zero, then one over xy goes to infinity. So, the maximum, well, the function goes to infinity when x goes to infinity or y goes to infinity, or x or y goes to zero. So, it's not at a critical point. OK, so, in general, we have to check both the critical points and the boundaries to decide what happens. OK, the end. Have a nice weekend.
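The worked example can be replayed numerically. The sketch below is only an illustration: it hard-codes the second partials computed in the lecture for f(x, y) = x + y + 1/(xy), and the function names `second_partials` and `second_derivative_test` are made up for this example.

```python
def f(x, y):
    """The example function from the lecture, defined for x > 0 and y > 0."""
    return x + y + 1.0 / (x * y)


def second_partials(x, y):
    """Second partials of f, using the formulas derived in the lecture."""
    fxx = 2.0 / (x**3 * y)
    fxy = 1.0 / (x**2 * y**2)
    fyy = 2.0 / (x * y**3)
    return fxx, fxy, fyy


def second_derivative_test(A, B, C):
    """Apply the AC - B^2 rule to the second partials at a critical point."""
    disc = A * C - B * B
    if disc > 0:
        return "local minimum" if A > 0 else "local maximum"
    if disc < 0:
        return "saddle point"
    return "inconclusive"  # degenerate case: higher-order terms decide


# The only critical point in the first quadrant is (1, 1).
A, B, C = second_partials(1.0, 1.0)
kind = second_derivative_test(A, B, C)
print(A, B, C, A * C - B * B, kind)

# Boundary and infinity behaviour: the values grow without bound,
# so there is no maximum, while f(1, 1) = 3 is the smallest value seen.
print(f(1.0, 1.0), f(100.0, 1.0), f(0.01, 1.0), f(0.01, 0.01))
```

Sampling a few points far out or near the axes does not prove anything about the global minimum, but it matches the observation that the function goes to infinity as x or y goes to zero or to infinity.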