Active Calculus - Multivariable

Section 11.7 Directional Derivatives and the Gradient

Subsection 11.7.1 Introduction

The partial derivatives of a multivariable function tell us the instantaneous rate at which the function’s output changes as we hold all but one input variable constant (allowing the remaining input variable to change). It is natural to wonder how we can measure the rate at which a function changes in a direction other than parallel to a coordinate axis. In this section, we investigate this question and connect the rates of change in other directions to the rates of change given by the partial derivatives. The Preview Activity investigates these concepts in terms of a contour map representing elevation in terms of location.
We can use cardinal directions to specify the direction of displacement vectors. These directions can be described by a compass rose. The compass rose given in Figure 11.7.1 is an example of a sixteen-point compass rose. Directions of the form ESE are read as “east-southeast” and point in the direction halfway between east and southeast.
A sixteen-point compass rose
Figure 11.7.1. An example of a sixteen-point compass rose (by Brosen~commonswiki, own work, CC BY 2.5)

Preview Activity 11.7.1.
Below is a contour plot showing the elevations for a region of a nearby park. We will refer to \(h(x,y)\) as the function of two variables that gives the elevation as a function of the \(x\)-coordinate (location in the east-west or horizontal direction) and the \(y\)-coordinate (location in the north-south or vertical direction).
A contour plot with three locations marked
A contour plot on which three points (\(A\text{,}\) \(B\text{,}\) and \(C\)) are marked.
(a)
Using the contour plot and treating the elevation as a multivariable function \(h(x,y)\text{,}\) state whether each of the following is positive, negative, or zero. Write a sentence to justify your reasoning.
  1. \(\displaystyle h_x(A)\)
  2. \(\displaystyle h_y(A)\)
  3. \(\displaystyle h_x(B)\)
  4. \(\displaystyle h_y(B)\)
  5. \(\displaystyle h_x(C)\)
  6. \(\displaystyle h_y(C)\)
(b)
Suppose you are standing at point \(B\text{.}\) Do you expect the elevation to increase, decrease, or remain constant if you take a step in the northeast direction? Write a sentence to explain your reasoning.
(c)
Suppose you are standing at point \(B\text{.}\) Do you expect the elevation to increase, decrease, or remain constant if you take a step in the southeast direction? Write a sentence to explain your reasoning.
(d)
Suppose you are standing at point \(A\text{.}\) Do you expect the elevation to increase, decrease, or remain constant if you take a step in the southwest direction? Write a sentence to explain your reasoning.
(e)
Suppose you are standing at point \(C\text{.}\) In what direction would you take a step to move in the steepest downhill direction? Write a sentence to explain your reasoning.
(f)
Suppose you are standing at point \(A\text{.}\) In what direction would you take a step to move in the steepest uphill direction? Write a sentence to explain your reasoning.
(g)
Suppose you are standing at point \(B\text{.}\) Rank the following directions in order of steepness (from most steep and uphill to level to most steep downhill): northeast, north, east, west, south
The Preview Activity demonstrates how visual information such as a contour plot or a surface plot can be helpful in determining information about how quickly the output of a function changes as its inputs change in different directions. In this section, we will focus on how to calculate this kind of directional derivative using algebraic or numerical representations of functions and relate this calculation to other ways we have described change for a function.

Subsection 11.7.2 Directional Derivatives

We begin by determining how to measure the rate of change for a function of two variables in a direction that is not parallel to one of the coordinate directions. We use the classic calculus approach to set up this measurement: approximate the measurement, quantify how the approximation changes on a smaller scale, and finally use a limit to find the value. To measure the rate of change in the output of a function, we use a difference quotient:
\begin{equation*} \text{rate of change} = \frac{\text{change in output}}{\text{change in input}}\text{.} \end{equation*}
Let \(f\) be a function of the variables \(x\) and \(y\text{.}\) We wish to measure the rate of change of \(f\) at the base point \((x_0,y_0)\) in the direction of the vector \(\langle a,b\rangle\text{.}\) To do this, we write the difference quotient using the points \((x_0+a,y_0+b)\) and \((x_0,y_0)\):
\begin{equation*} \frac{f(x_0+a,y_0+b)-f(x_0,y_0)}{\sqrt{a^2+b^2}}\text{.} \end{equation*}
The denominator comes from using the distance formula to measure the change in the input as distance in the \(xy\)-plane. Figure 11.7.2 shows the plot of a surface given by \(z=f(x,y)\) with the points used in the difference quotient labeled.
Figure 11.7.2. A plot of \(f(x,y)\) with the measurements of change in input and output labeled for a change of inputs given by \(\langle a,b\rangle\)
The difference quotient above is the first step in the classic calculus approach and provides a good start to approximating the rate of change for the output of \(f\) in the direction \(\langle a,b\rangle\text{.}\) However, it is not clear how this approximation changes when we look at smaller scales. Specifically, we need to look at what happens to this approximation as we shrink the step size \(\sqrt{a^2+b^2}\) to \(0\) while maintaining the same direction of change. In order to maintain our direction and look at smaller step sizes, we must separate the length of the vector \(\langle a,b \rangle \) from its direction. To do this, we use the unit vector in the direction of \(\langle a,b \rangle\text{.}\)
Let \(\vu=\langle u_1,u_2\rangle \) be the unit vector in the same direction as \(\langle a,b \rangle\text{.}\) Vectors in the same direction as \(\langle a,b \rangle\) can be written in the form \(t \langle u_1,u_2\rangle\text{,}\) where \(t\) is the step size and \(\vu=\langle u_1,u_2\rangle \) is the unit vector. Taking a step of length \(t\) in the direction \(\langle u_1,u_2 \rangle\text{,}\) the difference quotient can be written as
\begin{equation*} \frac{f(x_0+t (u_1),y_0+t(u_2))-f(x_0,y_0)}{t}\text{.} \end{equation*}
We will look at how this difference quotient changes at smaller scales, i.e., when the step size \(t\) gets smaller. This accomplishes the second step of the classic calculus approach because it quantifies how the approximation works on smaller scales. We can now take the limit as \(t \to 0\) to define the directional derivative.

Definition 11.7.3.

Let \(f(x,y)\) be a function of two variables. The derivative of \(f\) at the point \((x,y)\) in the direction of the unit vector \(\vu = \langle u_1, u_2 \rangle\) is denoted \(D_{\vu}f(x,y)\) and is given by
\begin{equation} D_{\vu}f(x,y) = \lim_{t \to 0} \frac{f(x+u_1 t, y+u_2 t) - f(x,y)}{t}\tag{11.7.1} \end{equation}
for those values of \(x\) and \(y\) for which the limit exists.
The notation \(D_{\vu}f(x,y)\) can also be read as “the directional derivative of \(f\) in the direction \(\vu\) at the point \((x,y)\text{.}\)” While this may seem like a mouthful, each piece of this phrase is necessary to specify a directional derivative: we must specify the function, the location at which to measure the rate of change, and the direction in which the inputs change. When we evaluate the directional derivative \(D_{\vu} f(x,y)\) at a point \((x_0, y_0)\text{,}\) the result \(D_{\vu} f(x_0,y_0)\) tells us the instantaneous rate at which \(f\) changes at \((x_0, y_0)\) per unit increase in the direction of the vector \(\vu\text{.}\)
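To see the limit in Definition 11.7.3 at work, the following Python sketch evaluates the difference quotient for a sequence of shrinking step sizes \(t\text{.}\) The function \(f(x,y)=x^2y+y^3\text{,}\) the base point \((1,2)\text{,}\) and the direction \(\langle 3,4\rangle\) are assumptions chosen only for illustration; they do not come from the text.

```python
def f(x, y):
    # Hypothetical sample function, chosen only for illustration.
    return x**2 * y + y**3

def difference_quotient(f, x0, y0, u, t):
    """Average rate of change of f over a step of length t in the unit direction u."""
    u1, u2 = u
    return (f(x0 + t * u1, y0 + t * u2) - f(x0, y0)) / t

# Unit vector in the direction of <3, 4> (its length is 5).
u = (3 / 5, 4 / 5)

# Shrink the step size while keeping the direction fixed.
for t in [1.0, 0.1, 0.01, 0.001, 0.0001]:
    q = difference_quotient(f, 1.0, 2.0, u, t)
    print(f"t = {t:<8g} quotient = {q:.6f}")
```

For this sample function the printed quotients should approach \(12.8\) as \(t\) shrinks, which is the value of the directional derivative at \((1,2)\) in this direction.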
We can connect equation (11.7.1) to our work on traces earlier in this chapter. When initially exploring surface plots, we simplified our approach by looking at slices that were parallel to one of the coordinate axes, obtained by holding the other input variable constant. This allowed us to use ideas from single-variable calculus to understand more about the surface and to motivate the definitions of partial derivatives. Similarly, we can cut the surface in the \(\langle u_1,u_2\rangle \) direction through the base point and then think about the directional derivative as a single-variable calculus problem. In particular, we can parameterize the trace along the surface by noticing our inputs correspond to the line in the \(xy\)-plane given by \(\langle x_0+ u_1 t, y_0 + u_2 t\rangle\text{,}\) which is shown in red on Figure 11.7.4.
Figure 11.7.4. A plot of \(f(x,y)\) with a trace shown from the point \((x_0,y_0)\) in the direction \(\langle u_1,u_2 \rangle\)
When \(t=0\text{,}\) the location is \((x_0,y_0)\text{.}\) The trace along the surface is given by \(\vr(t)=\langle x_0+ u_1 t, y_0 + u_2 t , f(x_0+ u_1 t, y_0 + u_2 t) \rangle\text{.}\) This trace is shown in blue in Figure 11.7.4. We can also plot the trace of this surface in the direction of \(\langle u_1,u_2 \rangle\) as a function of \(t\) using a two-dimensional plot. This plot is shown in Figure 11.7.5. We can visualize the directional derivative as the slope of a tangent line at the base point. This tangent line is drawn in black on Figure 11.7.5. The scalar \(D_{\vu} f(x_0,y_0)\) tells us the slope of the line tangent to the surface in the direction of \(\vu\) at the point \((x_0,y_0,f(x_0,y_0))\text{.}\)
A curve with a tangent line
A two-dimensional plot of the trace \(f(x_0+tu_1,y_0+tu_2)\) with the horizontal axis labeled \(t\text{.}\) A tangent line to the trace is also shown.
Figure 11.7.5. A plot of the trace of \(f\) along the direction \(\langle u_1,u_2\rangle\) with tangent line drawn at point corresponding to \((x_0,y_0)\)

Subsection 11.7.3 Efficiently Computing the Directional Derivative

We want to find a way to evaluate directional derivatives without resorting to evaluating the limit definition in equation (11.7.1) every time. We will look at this algebraic approach from two perspectives. First, we will use the chain rule from Section 11.6 to define a simple algebraic rule which will allow us to efficiently calculate directional derivatives. Second, we will use the linearization of a locally linear function to compute directional derivatives.
We are interested in the instantaneous rate of change of \(f\) at a point \((x_0,y_0)\) in the direction of a unit vector \(\vu = \langle u_1, u_2 \rangle\text{.}\) In particular, the input variables \(x\) and \(y\) are changing according to
\begin{equation*} x = x_0 + u_1t \quad \text{ and } \quad y = y_0 + u_2t\text{.} \end{equation*}
A trace along the surface in the direction of \(\langle u_1,u_2\rangle\) is given by \(\vr(t)=\langle x_0+ u_1 t, y_0 + u_2 t , f(x_0+ u_1 t, y_0 + u_2 t) \rangle\text{.}\) Observe that \(\frac{dx}{dt} = u_1\) and \(\frac{dy}{dt} = u_2\) for all values of \(t\text{.}\) Since \(\vu\) is a unit vector in the \(xy\)-plane, a unit change in the parameter \(t\) corresponds to moving one unit in the \(\vu\) direction. This allows us to use the multivariable chain rule to calculate the directional derivative as a measure of the instantaneous rate of change of \(f\) in the direction \(\vu\text{.}\)
The output of \(f\) along the given direction can be written as a composition of functions, \(f(t)=f(\vr(t))=f(x(t), y(t))\text{,}\) which means we can apply the multivariable chain rule:
\begin{align*} D_{\vu}f(x_0,y_0) \amp = \frac{d}{dt}\left[ f(\vr(t)) \right] \\ \amp = f_x(x_0,y_0)\frac{dx}{dt} + f_y(x_0,y_0)\frac{dy}{dt} \\ \amp = f_x(x_0,y_0) u_1 + f_y(x_0,y_0) u_2 \end{align*}
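This allows us to compute the directional derivative at an arbitrary point according to the following formula, stated as Theorem 11.7.6: for a differentiable function \(f\) and a unit vector \(\vu = \langle u_1, u_2 \rangle\text{,}\)
\begin{equation} D_{\vu}f(x_0,y_0) = f_x(x_0,y_0) u_1 + f_y(x_0,y_0) u_2\text{.}\tag{11.7.2} \end{equation}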
To use equation (11.7.2), we must have a unit vector \(\vu = \langle u_1, u_2 \rangle\) in the direction of motion. In the event that we have a direction prescribed by a non-unit vector, we must first scale the vector to have length 1.

Example 11.7.7.

We will look at an example that is algebraically and geometrically simple but allows us to use Theorem 11.7.6. We will consider the rate of change for the function \(f(x,y)=2.5-\frac{(x-1)^2}{2}-\frac{(y+1)^2}{9}\) at the input point \((2,2)\) in several directions. Because \(f(2,2)=2.5-\frac{(2-1)^2}{2}-\frac{(2+1)^2}{9}=2.5-0.5-1=1\text{,}\) this point lies on the level curve with value \(1\) as shown in Figure 11.7.8.
A contour plot
A contour plot that shows several concentric ellipses. The point \((2,2)\) is marked as is a vector pointing left and up from this point.
Figure 11.7.8. A contour plot of \(f(x,y)=2.5-\frac{(x-1)^2}{2}-\frac{(y+1)^2}{9}\) with the point \((2,2)\) highlighted
We will compute the directional derivative of \(f\) at the point \((2,2)\) in the direction of \(\langle-2,1\rangle\text{.}\) Since \(\langle-2,1\rangle\) is not a unit vector, we use \(\vu=\frac{1}{\sqrt{5}}\langle -2,1\rangle\) as the unit vector in this direction. To use equation (11.7.2), we will need the partial derivatives of \(f\text{:}\)
\begin{equation*} f_x= -\frac{2(x-1)}{2} \quad \text{ and } \quad f_y=-\frac{2(y+1)}{9} \end{equation*}
At \((2,2)\text{,}\) we have \(f_x(2,2)=-1\) and \(f_y(2,2)=-\frac{2}{3}\text{.}\) By equation (11.7.2), the directional derivative of \(f\) at \((2,2)\) in the direction \(\vu\) is
\begin{equation*} D_{\vu}f(2,2)=f_x(2,2) u_1 + f_y(2,2) u_2 = (-1)\left(-\frac{2}{\sqrt{5}}\right)+\left(-\frac{2}{3}\right)\left(\frac{1}{\sqrt{5}}\right)=\frac{4}{3\sqrt{5}} \text{.} \end{equation*}
We can interpret this as saying that for a small step in the direction of \(\vu\) at \((2,2)\text{,}\) we would expect the output of \(f\) to increase by \(\frac{4}{3\sqrt{5}}\) times the step size.
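The calculation in this example can be checked numerically. The Python sketch below applies equation (11.7.2) to the function and partial derivatives given above, using the direction \(\langle -2,1\rangle\text{,}\) and compares the result with a difference quotient for a small step. The helper names `unit` and `directional_derivative` and the step size `1e-6` are assumptions made for this illustration.

```python
import math

def f(x, y):
    # Function from Example 11.7.7.
    return 2.5 - (x - 1)**2 / 2 - (y + 1)**2 / 9

def f_x(x, y):
    # Partial derivative of f with respect to x.
    return -(x - 1)

def f_y(x, y):
    # Partial derivative of f with respect to y.
    return -2 * (y + 1) / 9

def unit(v):
    """Scale the vector v to length 1."""
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length)

def directional_derivative(x0, y0, v):
    """Apply equation (11.7.2) after normalizing the direction v."""
    u1, u2 = unit(v)
    return f_x(x0, y0) * u1 + f_y(x0, y0) * u2

exact = directional_derivative(2, 2, (-2, 1))

# Difference quotient with a small step, for comparison.
t = 1e-6
u1, u2 = unit((-2, 1))
approx = (f(2 + t * u1, 2 + t * u2) - f(2, 2)) / t

print(exact, 4 / (3 * math.sqrt(5)), approx)
```

The first two printed values should agree to machine precision, and the third should agree with them to several decimal places.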
This may be a bit difficult to see based on the contour plot in Figure 11.7.8. However, the surface plot combined with a plot of the path of inputs in the direction \(\vu\) gives a better visualization. Figure 11.7.9 shows a surface plot of \(z=f(x,y)\) with the trace in the \(\vu\)-direction shown in blue and the direction vector \(\vu\) shown in red on the \(xy\)-plane. The black line segment shows how \(D_{\vu}f(2,2)\) is positive because the \(z\)-coordinate of this tangent line increases as we move away from \((2,2)\) in the \(\vu\)-direction.
Figure 11.7.9. A plot of \(z=f(x,y)\) with trace in the direction \(\vu\) through \((2,2,1)\) shown including the tangent line
Remember that the directional derivative is a local measurement and only applies in a small neighborhood of a point and only in the direction \(\vu\text{.}\) In other words, \(D_{\vu}f(2,2)\) does not describe the rate of change at other points along the blue curve, only the rate of change along the blue curve for a small step away from our base point in the red direction. With plots like Figure 11.7.8, you may be tempted to use information far away from the point of interest to try to figure out the rate of change. However, like most calculus measurements, we must only use plots to provide information on what is happening near a specific point.
If we compute \(D_{\vj}f(2,2)\text{,}\) the rate of change for the output of \(f\) in the direction \(\vj=\langle 0,1\rangle\) at \((2,2)\text{,}\) using equation (11.7.2), then we have
\begin{equation*} D_{\vj}f(2,2)=f_x(2,2) u_1+f_y(2,2) u_2=(-1)(0)+\left(-\frac{2}{3}\right)(1)=-\frac{2}{3} \text{.} \end{equation*}
This corresponds to the partial derivative of \(f\) with respect to \(y\text{.}\)
Figure 11.7.10 allows you to adjust the direction of the trace shown. By default, it shows the direction vector \(\vj\text{,}\) the corresponding trace, and tangent line to the surface in the \(\vj\) direction. Use the slider at the top of Figure 11.7.10 to change the direction in which you want to evaluate the directional derivative. As you do so, examine how the steepness of the black tangent segment varies depending on the direction.
Figure 11.7.10. An interactive plot of \(z=f(x,y)\) with trace in the direction \(\vu\) through \((2,2,1)\) shown including the tangent line
In the following activity, we use algebraic techniques to calculate and interpret the directional derivative of a function.

Activity 11.7.2.
In this activity, we will use equation (11.7.2) to calculate and interpret directional derivatives for the function \(f(x,y) = 3xy-x^2y^3\text{.}\)
(a)
Calculate \(f_x(x,y)\) and \(f_y(x,y)\text{.}\)
(b)
Use equation (11.7.2) to determine \(D_{\vi} f(x,y)\) and \(D_{\vj} f(x,y)\text{.}\) Write a couple of sentences to describe what familiar functions \(D_{\vi} f\) and \(D_{\vj} f\) are.
Hint.
Remember that \(\vi\) is the unit vector in the positive \(x\)-direction and \(\vj\) is the unit vector in the positive \(y\)-direction.
(c)
Use equation (11.7.2) to find the derivative of \(f\) in the direction of the vector \(\vv = \langle 2, 3 \rangle\) at the point \((1,-1)\text{.}\)
Hint.
Remember that a unit direction vector is needed.
(d)
Find the derivative of \(f\) in the direction of the vector \(\vv = \langle 4, 6 \rangle\) at the point \((1,-1)\text{.}\)
(e)
Use equation (11.7.2) to find the derivative of \(f\) in the direction of the vector \(\vv = \langle -2, -3 \rangle\) at the point \((1,-1)\text{.}\) Write a couple of sentences to explain why this result is different from your answer to the previous two tasks, even though the direction vectors are parallel.
Equation (11.7.2) arose through consideration of the definition for the directional derivative as a composition of functions and application of the chain rule. When considering multivariable functions that are locally linear, recall that we can evaluate the limit of such a function near a point by analyzing the linearization or tangent plane. On a small scale, there is virtually no difference between the locally linear function and the linearization as demonstrated by Figure 11.5.1.
The equation of the plane tangent to the graph of \(f\) at the point \((x_0,y_0,f(x_0,y_0))\) is
\begin{equation*} z = f(x_0,y_0) + f_x(x_0,y_0)(x-x_0) + f_y(x_0,y_0)(y-y_0)\text{.} \end{equation*}
If \(f\) is a locally linear function, then the directional derivative \(D_{\vu}f(x_0,y_0)\) is the same as the rate of change along the tangent plane in the direction of \(\vu\text{.}\) This is convenient because on the tangent plane, the rate of change is the same regardless of step size, a fact that is not true on the original surface \(z=f(x,y)\text{.}\) We now consider the quotient
\begin{align*} \amp\frac{z(x_0+u_1,y_0+u_2)-z(x_0,y_0)}{\sqrt{u_1^2+u_2^2}}\\ \amp=\frac{\left(f(x_0,y_0) + f_x(x_0,y_0)(x_0+u_1-x_0) + f_y(x_0,y_0)(y_0+u_2-y_0)\right)-f(x_0,y_0)}{1}\\ \amp= f_x(x_0,y_0) (u_1) + f_y(x_0,y_0)(u_2)\text{,} \end{align*}
which is exactly the result in equation (11.7.2).
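The claim that the rate of change along the tangent plane does not depend on step size, while the rate along the surface does, can be illustrated numerically. The sketch below reuses the function from Example 11.7.7 with base point \((2,2)\text{;}\) the unit direction \(\langle 0.6, 0.8\rangle\) and the helper names are assumptions chosen for illustration.

```python
def f(x, y):
    # Function from Example 11.7.7.
    return 2.5 - (x - 1)**2 / 2 - (y + 1)**2 / 9

def tangent_plane(x, y, x0=2.0, y0=2.0):
    """Linearization of f at (x0, y0), using f_x = -(x - 1) and f_y = -2(y + 1)/9."""
    fx0 = -(x0 - 1)
    fy0 = -2 * (y0 + 1) / 9
    return f(x0, y0) + fx0 * (x - x0) + fy0 * (y - y0)

def rate(g, x0, y0, u, t):
    """Change in g per unit distance for a step of length t in the unit direction u."""
    return (g(x0 + t * u[0], y0 + t * u[1]) - g(x0, y0)) / t

u = (0.6, 0.8)  # a unit vector, chosen for illustration

# Columns: step size, rate along the tangent plane, rate along the surface.
for t in [1.0, 0.1, 0.001]:
    print(t, rate(tangent_plane, 2.0, 2.0, u, t), rate(f, 2.0, 2.0, u, t))
```

The middle column should stay the same (up to rounding) for every step size, while the last column should approach it only as \(t\) becomes small.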

Subsection 11.7.4 The Gradient

We have used the chain rule and linearization to see that the instantaneous rate of change of a function \(f = f(x,y)\) in the direction of a unit vector \(\vu = \langle u_1, u_2 \rangle\) is given by
\begin{equation} D_{\vu}f(x_0,y_0) = f_x(x_0,y_0)u_1 + f_y(x_0,y_0)u_2\text{.}\tag{11.7.3} \end{equation}
You may recognize the form of (11.7.3) as being similar to a dot product. We can also view this equation as stating that the rate of change along the linearization in a given direction can be expressed as a linear combination of the rates of change in the coordinate directions by using partial derivatives with the weight of each partial derivative specified by the components of the unit vector \(\vu\text{.}\)
We can think about equation (11.7.3) in a way that will have geometric meaning related to the dot product. The directional derivative \(D_{\vu}f(x_0,y_0)\) is the dot product of \(\left\langle f_x(x_0,y_0), f_y(x_0,y_0) \right\rangle\) and \(\vu=\langle u_1,u_2\rangle\text{.}\) Notice that the vector \(\left\langle f_x(x_0,y_0), f_y(x_0,y_0) \right\rangle\) comes from simple calculations that tell us how the function \(f\) is changing near the input \((x_0,y_0)\text{.}\)

Definition 11.7.11.

The gradient of \(f\) is the vector formed by the partial derivatives of \(f\text{.}\) The gradient of \(f\) at the point \((x_0,y_0)\) is denoted
\begin{equation*} \nabla f(x_0,y_0) = \left\langle f_x(x_0,y_0), f_y(x_0,y_0) \right\rangle\text{.} \end{equation*}
We read \(\nabla f\) as “the gradient of \(f\text{,}\)” “grad \(f\)” or “del \(f\)”.
The symbol \(\nabla\) is called nabla, which comes from a Greek word for a certain type of harp that has a similar shape.
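Notice that \(\nabla f\) varies from point to point and also provides an alternate formulation of the directional derivative, stated as Key Idea 11.7.12: for a unit vector \(\vu\text{,}\)
\begin{equation} D_{\vu}f(x_0,y_0) = \nabla f(x_0,y_0)\cdot\vu\text{.}\tag{11.7.4} \end{equation}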
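As a rough illustration of the dot product formulation, the Python sketch below approximates the gradient with central differences and then dots it with a unit direction vector. The sample function \(f(x,y)=\sin(xy)\text{,}\) the helper names, and the difference step `h` are assumptions made for this sketch and are not part of the text.

```python
import math

def gradient(f, x0, y0, h=1e-6):
    """Approximate the gradient <f_x, f_y> at (x0, y0) with central differences."""
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
    return (fx, fy)

def directional_derivative(f, x0, y0, v):
    """Dot the approximate gradient with the unit vector in the direction of v."""
    length = math.hypot(v[0], v[1])
    u = (v[0] / length, v[1] / length)
    g = gradient(f, x0, y0)
    return g[0] * u[0] + g[1] * u[1]

def f(x, y):
    # Hypothetical sample function: f_x = y*cos(x*y) and f_y = x*cos(x*y).
    return math.sin(x * y)

# At (1, 0) the exact gradient is <0, 1>.
print(gradient(f, 1.0, 0.0))
# In the direction <1, 1> the exact directional derivative is 1/sqrt(2).
print(directional_derivative(f, 1.0, 0.0, (1, 1)))
```

Evaluating `gradient` at several points shows how the gradient vector changes from point to point, while the direction vector only enters through the final dot product.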
In the following activity, we investigate some of what the gradient tells us about the behavior of a function \(f\text{.}\)

Activity 11.7.3.
Let \(f(x,y) = \frac{1}{3}(x^2-y^2)\text{.}\) Some contours for this function are shown in Figure 11.7.13.
A contour plot
Figure 11.7.13. A contour plot of \(f(x,y)=\frac{1}{3}(x^2-y^2)\)
(a)
Find the gradient \(\nabla f (x,y)\text{.}\)
(b)
For each of the following points \((x_0,y_0)\text{,}\) evaluate the gradient \(\nabla f(x_0,y_0)\) and sketch the gradient vector with its tail at \((x_0,y_0)\text{.}\) Some of the vectors are too long to fit onto the plot. To draw them to scale, you should scale each vector by a factor of \(1/2\text{.}\)
  • \(\displaystyle (x_0,y_0) = (2,0)\)
  • \(\displaystyle (x_0,y_0) = (0,2)\)
  • \(\displaystyle (x_0,y_0) = (2,2)\)
  • \(\displaystyle (x_0,y_0) = (2,1)\)
  • \(\displaystyle (x_0,y_0) = (-3,2)\)
  • \(\displaystyle (x_0,y_0) = (-2,-4)\)
  • \(\displaystyle (x_0,y_0) = (0,0)\)
(c)
Write a few sentences about how the direction of the gradient at each of these points is related to the contour passing through that point.
(d)
Does the output of \(f\) increase or decrease in the direction of \(\nabla f(x_0,y_0)\text{?}\) Use examples from the points above to write a couple of sentences that justify your answer.
Hint.
You may wish to think about the contour plot as a topographical map and whether your elevation would increase or decrease if you took a step in the direction of the gradient.
Because \(\nabla f(x_0,y_0)\) is a vector, we can consider its direction and length separately. The next subsection discusses the information conveyed by these geometric aspects about the behavior of \(f\) near \((x_0,y_0)\text{.}\)

Subsection 11.7.5 Vector Properties of the Gradient

Key Idea 11.7.12 shows how we can separate the directional derivative of a function of two variables into two separate parts: the gradient vector evaluated at the point of interest and the unit vector in the direction we want to change the inputs of the function. Recall from equation (9.3.1) that the dot product of two vectors depends on the lengths of the vectors and the angle between the vectors. If \(\theta\) is the angle between \(\nabla f(x_0,y_0)\) and \(\vu\) (where \(\vu\) is a unit vector), then combining (11.7.4) and (9.3.1) gives
\begin{align*} D_{\vu}f(x_0,y_0) \amp= \nabla f(x_0,y_0)\cdot\vu \\ \amp= \vecmag{\nabla f(x_0,y_0)} \vecmag{\vu} \cos(\theta) \text{.} \end{align*}
Remember that \(\vecmag{\vu}=1\) because \(\vu\) is a unit vector. Hence, the directional derivative is the length of the gradient vector times the cosine of the angle between \(\nabla f(x_0,y_0)\) and \(\vu\text{:}\)
\begin{equation} D_{\vu}f(x_0,y_0) =\vecmag{\nabla f(x_0,y_0)} \cos(\theta)\tag{11.7.5} \end{equation}
Equation (11.7.5) will be extremely useful in interpreting the gradient geometrically.
Figure 11.7.14. The sign of \(D_{\vu} f(x_0,y_0)\) is determined by \(\theta\)
Because the magnitude of a vector is always non-negative, the sign of a directional derivative depends on \(\cos(\theta)\text{.}\) Figure 11.7.14 graphically shows examples of the following statements (from left to right):
  • If the angle between the gradient and \(\vu\) is a right angle, then the directional derivative is zero.
  • If the angle between the gradient and \(\vu\) is acute, then the directional derivative is positive.
  • If the angle between the gradient and \(\vu\) is obtuse, then the directional derivative is negative.
The first statement explains why the gradient is perpendicular to the level curve through the point of interest. Because a level curve is the set of points for which the function has a particular output value, the output will not change along the level curve. Thus, the directional derivative in a direction tangent to the level curve must be zero. We can expand this explanation to the other statements as well. The output of \(f\) increases in any direction that makes an acute angle with the gradient vector and the output of \(f\) decreases in any direction that makes an obtuse angle with the gradient.
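The perpendicularity between the gradient and the level curve can be checked with a short computation. The sketch below uses the function from Example 11.7.7 at the point \((2,2)\text{,}\) rotates the gradient by a right angle to get a direction tangent to the level curve, and measures how much \(f\) changes for a small step in that direction. The step size `1e-3` and the variable names are assumptions made for illustration.

```python
import math

def f(x, y):
    # Function from Example 11.7.7.
    return 2.5 - (x - 1)**2 / 2 - (y + 1)**2 / 9

def grad_f(x, y):
    # Gradient <f_x, f_y> of the function above.
    return (-(x - 1), -2 * (y + 1) / 9)

gx, gy = grad_f(2, 2)
length = math.hypot(gx, gy)

# Rotate the gradient by 90 degrees and scale to unit length.
tangent = (-gy / length, gx / length)

# Directional derivative in the tangent direction: gradient dotted with tangent.
d_tangent = gx * tangent[0] + gy * tangent[1]

# Actual change in f for a small step along the tangent direction.
t = 1e-3
change = f(2 + t * tangent[0], 2 + t * tangent[1]) - f(2, 2)

print(d_tangent, change)
```

The directional derivative in the tangent direction should be zero up to rounding, and the actual change in \(f\) should be far smaller than the step size itself.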
Here are some questions about how the directional derivative changes at a point that we will address:
  • What direction corresponds to the largest value for the directional derivative?
  • What is the largest value possible for the directional derivative at \((x_0,y_0)\text{?}\)
  • What direction corresponds to the smallest (or most negative) value for the directional derivative?
  • What is the smallest value possible for the directional derivative at \((x_0,y_0)\text{?}\)
All of these questions can be answered by considering equation (11.7.5). Because the length of the gradient vector does not change when changing direction, the \(\cos(\theta)\) factor tells us when we have the maximum and minimum values. Because of how we define the angle between vectors, we only need to consider values of \(\theta\) between 0 and \(\pi\text{.}\) Therefore, the largest value of the directional derivative occurs when \(\theta=0\) and \(\cos(\theta)\) attains its maximum value of \(1\text{.}\) This means that when the direction vector \(\vu\) is in the same direction as the gradient, the directional derivative is maximized. Furthermore, the largest value of the directional derivative is the length of the gradient. In equation form, we can express that if \(\vu\) is in the same direction as \(\nabla f(x_0,y_0)\text{,}\) then
\begin{equation*} D_{\vu}f(x_0,y_0) = \vecmag{\nabla f(x_0,y_0)} \cos(0) =\vecmag{\nabla f(x_0,y_0)}\text{.} \end{equation*}
By a parallel argument, the smallest (or most negative) value the directional derivative can take is when \(\theta=\pi\text{.}\) In this case, the direction vector is in the opposite direction of the gradient vector. Thus if \(\theta=\pi\text{,}\) then
\begin{equation*} D_{\vu}f(x_0,y_0) = \vecmag{\nabla f(x_0,y_0)} \cos(\pi) = -\vecmag{\nabla f(x_0,y_0)} \end{equation*}
because \(\cos(\pi)=-1\text{.}\)
We summarize our most recent work by stating important facts about the gradient.

The Meaning of the Gradient as a Vector.

Let \(f\) be a differentiable function and \((x_0,y_0)\) a point for which \(\nabla f(x_0,y_0) \ne \vzero\text{.}\)
  • The gradient points in a direction perpendicular to the level curve \(f(x,y)=k\) where \(k=f(x_0,y_0)\text{.}\)
  • The gradient \(\nabla f(x_0,y_0)\) points in the direction of greatest rate of increase for \(f\) at \((x_0,y_0)\text{,}\) and the instantaneous rate of change of \(f\) in that direction is the length of the gradient vector.
  • If \(\vu = \frac{1}{\vecmag{\nabla f(x_0,y_0)}} \nabla f(x_0,y_0)\text{,}\) then \(\vu\) is a unit vector in the direction of greatest increase of \(f\) at \((x_0,y_0)\text{,}\) and \(D_{\vu} f(x_0,y_0) = \vecmag{\nabla f(x_0,y_0)}\text{.}\)
  • The gradient \(\nabla f(x_0,y_0)\) points opposite to the direction of greatest rate of decrease for \(f\) at \((x_0,y_0)\text{,}\) and the instantaneous rate of change of \(f\) in that direction of greatest decrease is the length of the gradient vector times \(-1\text{.}\)
  • If \(\vu = -\frac{1}{\vecmag{\nabla f(x_0,y_0)}} \nabla f(x_0,y_0)\text{,}\) then \(\vu\) is a unit vector in the direction of greatest decrease of \(f\) at \((x_0,y_0)\text{,}\) and \(D_{\vu} f(x_0,y_0) = -\vecmag{\nabla f(x_0,y_0)}\text{.}\)
Note that the third and fifth bullets above are algebraic statements of the second and fourth bullets, respectively.
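The facts above can be explored numerically by sweeping the unit vector \(\vu\) around the full circle and recording the directional derivative for each direction. The sketch below does this for the function from Example 11.7.7 at \((2,2)\text{;}\) the number of sampled directions and the variable names are assumptions made for illustration.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = 2.5 - (x-1)^2/2 - (y+1)^2/9 from Example 11.7.7.
    return (-(x - 1), -2 * (y + 1) / 9)

gx, gy = grad_f(2, 2)
grad_length = math.hypot(gx, gy)

best_angle, best_value = None, -math.inf
worst_angle, worst_value = None, math.inf

n = 3600  # number of sampled directions
for k in range(n):
    angle = 2 * math.pi * k / n              # angle of u from the positive x-axis
    u = (math.cos(angle), math.sin(angle))   # unit vector at that angle
    value = gx * u[0] + gy * u[1]            # directional derivative: gradient dot u
    if value > best_value:
        best_angle, best_value = angle, value
    if value < worst_value:
        worst_angle, worst_value = angle, value

print("length of gradient:", grad_length)
print("largest value:", best_value, "at angle", best_angle)
print("smallest value:", worst_value, "at angle", worst_angle)
```

Up to the resolution of the sampled angles, the largest value should match the length of the gradient, the smallest value should match its negative, and the two angles should differ by \(\pi\text{.}\)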
Exercises 11–13 look at how the directional derivative changes as a function of the direction used or of the point where the directional derivative is being evaluated.

Subsection 11.7.6 Applications

The gradient has many natural applications. For example, situations often arise where we are interested in knowing the direction in which a function is increasing or decreasing most rapidly. This type of question is natural when constructing a road through the mountains or planning the flow of water across a landscape. In the next activity, we examine how the gradient can help with navigation to the top of a mountain in foggy conditions.

Activity 11.7.4.
This activity considers directional derivatives and gradients for a function that measures elevation. Suppose you are hiking in a foggy park and can only see a few feet in front of you. There is nothing blocking you from walking in any particular direction, but because of the fog you cannot see where the highest point on the mountain is. You want to try to find the top of the mountain, but you don’t have a map, trail, or line of sight to other landmarks. You do have a compass, which works in the fog. This allows you to identify the cardinal directions.
In order to use calculational tools from multivariable calculus, you overlay an \(xy\)-coordinate system on the park. The \(x\)-coordinate measures east/west location, with eastward movement associated with increasing \(x\)-values. The \(y\)-coordinate measures north/south location, with northward movement associated with increasing \(y\)-values. The elevation in meters above sea level of the park’s terrain at the location \((x,y)\) is given by a function \(h(x,y)\text{.}\)
(a)
Let \(P_1\) be your current location in the foggy park. You use your compass to find the east and north directions. At \(P_1\text{,}\) you find that the ground rises 1 meter per 50 meters traveled to the east and the ground rises 2.5 meters per 50 meters traveled to the north.
Use this information to find \(\nabla h (P_1)\text{.}\)
(b)
Use your answer to the previous part to say what direction is “uphill” at \(P_1\) and state the rate of elevation increase in this direction.
(c)
You decide to walk uphill from your location \(P_1\) in order to try to find the top of the mountain. After walking in the same direction for a while, you are at a point \(P_2\) and notice that you are no longer walking in the steepest direction. You again locate east and north and measure the steepness of the mountain in these directions. You find that the ground rises 1.5 meters per 75 meters traveled to the east and the ground goes down 0.5 meters per 100 meters traveled to the north.
Use this new information to calculate \(\nabla h (P_2)\text{,}\) find the uphill direction, and find how steep the mountain is in the uphill direction at \(P_2\text{.}\)
(d)
Suppose you continue this method of walking in the uphill direction for a while, finding the new uphill direction, and walking in the new uphill direction. Do you think you must eventually find the top of the mountain? How will you know that you have reached the top of the mountain? Remember that you can’t see very far in front of you. Write a few sentences to explain your reasoning.
The technique described in the previous activity has many applications related to maximizing or minimizing functions. For example, consider a two-dimensional version of how a heat-seeking missile might work. (This application is borrowed from United States Air Force Academy Department of Mathematical Sciences.) Suppose that the temperature surrounding a fighter jet can be modeled by the function \(T\) defined by
\begin{equation*} T(x,y) = \frac{100}{1+(x-5)^2 + 4(y-2.5)^2}, \end{equation*}
where \((x,y)\) is a point in the plane of the fighter jet and \(T(x,y)\) is measured in degrees Celsius. Some contours and gradients \(\nabla T\) are shown on the left in Figure 11.7.15.
Figure 11.7.15. Contours and gradient for \(T(x,y)\) and the missile’s path.
A heat-seeking missile will always travel in the direction in which the temperature increases most rapidly; that is, it will always travel in the direction of the gradient \(\nabla T\text{.}\) If a missile is fired from the point \((2,4)\text{,}\) then its path will be that shown on the right in Figure 11.7.15.
This type of strategy is sometimes called gradient ascent and has uses in economics and machine learning.
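The missile's strategy can be sketched numerically. The Python below is a minimal illustration, not the method used to produce Figure 11.7.15: it assumes the missile moves in short straight steps of fixed length, re-computing \(\nabla T\) before each step, and it stops when the gradient is essentially zero or when another step would no longer raise the temperature. The names `grad_T` and `gradient_ascent`, and the values of `step`, `max_steps`, and `tol`, are choices made for this sketch.

```python
import math

def T(x, y):
    """Temperature model from the text, in degrees Celsius."""
    return 100.0 / (1.0 + (x - 5.0) ** 2 + 4.0 * (y - 2.5) ** 2)

def grad_T(x, y):
    """Gradient <T_x, T_y>, found by hand with the chain rule."""
    d = 1.0 + (x - 5.0) ** 2 + 4.0 * (y - 2.5) ** 2
    return (-200.0 * (x - 5.0) / d ** 2, -800.0 * (y - 2.5) / d ** 2)

def gradient_ascent(x, y, step=0.01, max_steps=5000, tol=1e-9):
    """Follow the gradient direction in fixed-length steps (a sketch).

    step, max_steps, and tol are illustrative assumptions.
    """
    path = [(x, y)]
    for _ in range(max_steps):
        gx, gy = grad_T(x, y)
        norm = math.hypot(gx, gy)
        if norm < tol:
            break  # gradient is (numerically) zero: no uphill direction
        # Move a distance `step` in the direction of the unit gradient.
        new_x = x + step * gx / norm
        new_y = y + step * gy / norm
        if T(new_x, new_y) <= T(x, y):
            break  # the step overshoots the hot spot, so stop here
        x, y = new_x, new_y
        path.append((x, y))
    return path

path = gradient_ascent(2.0, 4.0)  # launch point from the text
end_x, end_y = path[-1]
print(f"stopped near ({end_x:.2f}, {end_y:.2f}), T = {T(end_x, end_y):.2f}")
```

Because each step follows the gradient only approximately over a short distance, the computed path is a polygonal approximation of the curve the missile would trace; a smaller `step` gives a closer approximation at the cost of more steps.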

Subsection 11.7.7 Summary

  • The directional derivative of \(f\) at the point \((x,y)\) in the direction of the unit vector \(\vu = \langle u_1, u_2 \rangle\) is
    \begin{equation*} D_{\vu}f(x,y) = \lim_{h \to 0} \frac{f(x+u_1h, y+u_2h) - f(x,y)}{h} \end{equation*}
    for those values of \(x\) and \(y\) for which the limit exists. In addition, \(D_{\vu}f(x,y)\) measures the slope of the graph of \(f\) when we move in the direction \(\vu\text{.}\) Alternatively, \(D_{\vu} f(x_0,y_0)\) measures the instantaneous rate of change of \(f\) in the direction \(\vu\) at \((x_0,y_0)\text{.}\)
  • The gradient of a function \(f=f(x,y)\) at a point \((x_0,y_0)\) is the vector
    \begin{equation*} \nabla f(x_0,y_0) = \left\langle f_x(x_0,y_0), f_y(x_0,y_0)\right\rangle \end{equation*}
  • The directional derivative in the direction \(\vu\) may be computed by
    \begin{equation*} D_{\vu}f(x_0,y_0) = \nabla f(x_0,y_0)\cdot \vu \end{equation*}
  • At any point where the gradient is nonzero, the gradient is orthogonal to the contour through that point and points in the direction in which \(f\) increases most rapidly; moreover, the slope of \(f\) in this direction equals the length of the gradient \(\vecmag{\nabla f(x_0,y_0)}\text{.}\) Similarly, the direction opposite to the gradient is the direction of greatest decrease, and the rate of change in that direction is the negative of the length of the gradient.
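The relationships in this summary can be checked numerically. The following Python sketch uses an illustrative function \(f(x,y) = x^2 + xy\text{,}\) the point \((1,2)\text{,}\) and the direction of \(\langle 3,4 \rangle\text{;}\) none of these come from the text. It compares the difference quotient from the limit definition with \(\nabla f \cdot \vu\text{,}\) and then evaluates the directional derivative along the gradient and along a vector orthogonal to it.

```python
import math

def f(x, y):
    # Illustrative function chosen for this sketch (not from the text).
    return x ** 2 + x * y

def grad_f(x, y):
    # Partial derivatives computed by hand: f_x = 2x + y, f_y = x.
    return (2.0 * x + y, x)

def unit(v):
    """Scale a nonzero vector to length 1."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

def directional_derivative(x, y, u):
    """Gradient dotted with the unit vector u."""
    gx, gy = grad_f(x, y)
    return gx * u[0] + gy * u[1]

def difference_quotient(x, y, u, h=1e-6):
    """The quotient inside the limit definition, for one small h."""
    return (f(x + u[0] * h, y + u[1] * h) - f(x, y)) / h

x0, y0 = 1.0, 2.0
u = unit((3.0, 4.0))

exact = directional_derivative(x0, y0, u)    # gradient formula
approx = difference_quotient(x0, y0, u)      # limit definition, small h

g = grad_f(x0, y0)
steepest = directional_derivative(x0, y0, unit(g))          # along the gradient
along_contour = directional_derivative(x0, y0, unit((-g[1], g[0])))  # orthogonal

print(exact, approx)             # these two should nearly agree
print(steepest, math.hypot(*g))  # slope along the gradient vs. its length
print(along_contour)             # zero up to rounding
```

Here the difference quotient is only an approximation for a single small value of `h`, so it should agree with the gradient formula to several decimal places rather than exactly.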

Exercises 11.7.8 Exercises

1.

Consider the function \(f(x,y,z) = xy + yz^2 + xz^3\text{.}\)
Find the gradient of \(f\text{:}\)
\(\langle\), , \(\rangle\)
Find the gradient of \(f\) at the point (-2, 4, 4).
\(\langle\), , \(\rangle\)
Find the rate of change of the function \(f\) at the point (-2, 4, 4) in the direction \(\mathbf u = \langle 5/\sqrt{54}, 5/\sqrt{54}, 2/\sqrt{54} \rangle\text{.}\)

2.

If \(f \left( x, y \right) = 1 x^{2} + 4 y^{2}\text{,}\) find the value of the directional derivative at the point \(\left( -2, 1 \right)\) in the direction given by the angle \(\theta = \frac{2 \pi}{6}\text{.}\)

3.

Find the directional derivative of \(\displaystyle f(x,y,z) = 4xy+z^{2}\) at the point \((1,2,-2)\) in the direction of the maximum rate of change of \(f\text{.}\)
\(f_{\mathbf{u}} \, (1,2,-2) = D_{\mathbf{u}} \, f(1,2,-2) =\)

4.

The temperature at any point in the plane is given by \(\displaystyle T(x,y) = \frac{190}{x^{2}+y^{2}+2}\text{.}\)
(a) What shape are the level curves of \(T\text{?}\)
(b) At what point on the plane is it hottest?
What is the maximum temperature?
(c) Find the direction of the greatest increase in temperature at the point \((-3,3)\text{.}\)
What is the value of this maximum rate of change, that is, the maximum value of the directional derivative at \((-3,3)\text{?}\)
(d) Find the direction of the greatest decrease in temperature at the point \((-3,3)\text{.}\)
What is the value of this most negative rate of change, that is, the minimum value of the directional derivative at \((-3,3)\text{?}\)

5.

The temperature at a point \((x,y,z)\) is given by \(\displaystyle T(x,y,z) = 200e^{-x^2 -y^2/4 - z^2/9}\text{,}\) where \(T\) is measured in degrees Celsius and \(x\text{,}\) \(y\text{,}\) and \(z\) in meters. There are lots of places to make silly errors in this problem; just try to keep track of what needs to be a unit vector.
Find the rate of change of the temperature at the point (1, 1, 1) in the direction toward the point (4, 5, 5).
In which direction (unit vector) does the temperature increase the fastest at (1, 1, 1)?
\(\langle\), ,\(\rangle\)
What is the maximum rate of increase of \(T\) at (1, 1, 1)?

6.

If \(\displaystyle f(x,y,z) = 2zy^{2}\text{,}\) then the gradient at the point \((4,6,4)\) is
\(\nabla f (4,6,4) =\)

7.

The concentration of salt in a fluid at \((x,y,z)\) is given by \(F(x,y,z) = x^{2}+y^{4}+2x^{2}z^{2}\) mg/cm\({}^3\text{.}\) You are at the point \((-1,1,1)\text{.}\)
(a) In which direction should you move if you want the concentration to increase the fastest?
direction:
(Give your answer as a vector.)
(b) You start to move in the direction you found in part (a) at a speed of \(3\) cm/sec. How fast is the concentration changing?
rate of change =

8.

At a certain point on a heated metal plate, the greatest rate of temperature increase, 3 degrees Celsius per meter, is toward the northeast. If an object at this point moves directly north, at what rate is the temperature increasing?
degrees Celsius per meter

9.

Suppose that you are climbing a hill whose shape is given by \(z = 720 - 0.04 x^2 -0.06 y^2\text{,}\) and that you are at the point (30, 80, 300).
In which direction (unit vector) should you proceed initially in order to reach the top of the hill fastest?
\(\langle\),\(\rangle\)
If you climb in that direction, at what angle above the horizontal will you be climbing initially (radian measure)?

10.

Are the following statements true or false?
  1. If \(\vec{u}\) is a unit vector, then \(f_{\vec{u}} (a,b)\) is a vector.
  2. If \(f(x,y)\) has \(f_x(a,b) = 0\) and \(f_y(a,b) = 0\) at the point \((a,b)\text{,}\) then \(f\) is constant everywhere.
  3. The gradient vector \(\nabla f(a,b)\) is tangent to the contour of \(f\) at \((a,b)\text{.}\)
  4. Suppose \(f_x(a,b)\) and \(f_y(a,b)\) both exist. Then there is always a direction in which the rate of change of \(f\) at \((a,b)\) is zero.
  5. \(\nabla f(a,b)\) is a vector in 3-dimensional space.
  6. If \(\vec{u}\) is perpendicular to \(\nabla f(a,b)\text{,}\) then \(f_{\vec{u}} \, (a,b) = \langle 0, 0 \rangle\text{.}\)
  7. \(f_{\vec{u}} \, (a,b) = || \nabla f(a,b) ||\text{.}\)
  8. \(f_{\vec{u}} \, (a,b)\) is parallel to \(\vec{u}\text{.}\)

Directional Derivative Sense-Making.

The next three exercises consider how the directional derivative changes as a function of the direction used or of the point where the directional derivative is computed. The goal of these exercises is to understand the directional derivative as a geometric measurement using visually based sense-making prompts in terms of the location and directions being used. In particular, Exercise 11 asks students to visually assess whether the directional derivative is positive, negative, or zero at a few locations and directions. Exercise 12 fixes the location on a surface and steps students through how to construct a function that gives the value of the directional derivative as a function of the direction. Exercise 13 examines the relationships between the value of the directional derivative and ideas like the slope of a secant line or dependence on the length of the direction vector. These exercises are adapted from materials developed by Rafael Martinez-Planell.
11.
Figure 11.7.16 shows a graph of \(z=f(x,y)\) that you should use for the following:
  1. On the \(xy\)-plane of the figure below, draw the direction vector \(\langle 1,1\rangle \) starting at the point \((2,-1)\text{.}\)
  2. Draw a piece of the tangent line to the graph of \(f\) at the point \((2,-1,f(2,-1))\) which is in the direction \(\langle 1,1\rangle \text{.}\) The slope of the tangent line in the \(\langle 1,1\rangle \) direction is called the directional derivative of \(f\) at \((2,-1)\) in the \(\langle 1,1\rangle \) direction and is denoted \(D_{\langle 1,1\rangle} f (2,-1)\text{.}\)
  3. Is \(D_{\langle 1,1\rangle} f (2,-1)\) positive, negative, or zero? Justify your answer.
  4. On the \(xy\)-plane of (a new plot of) the figure below, draw the direction vector \(\langle -\frac{1}{2},3\rangle \) starting at the point \((2,-1)\text{.}\)
  5. Draw a piece of the tangent line to the graph of \(f\) at the point \((2,-1,f(2,-1))\) which is in the direction \(\langle -\frac{1}{2},3 \rangle \text{.}\) The slope of the tangent line in the \(\langle -\frac{1}{2},3 \rangle \) direction is called the directional derivative of \(f\) at \((2,-1)\) in the \(\langle -\frac{1}{2},3 \rangle \) direction and is denoted \(D_{\langle -\frac{1}{2},3 \rangle} f (2,-1)\text{.}\)
  6. Is \(D_{\langle -\frac{1}{2},3 \rangle} f (2,-1)\) positive, negative, or zero? Justify your answer.
Figure 11.7.16. A plot of \(f(x,y)\) with the measurements of change in input and output labeled for a change of inputs given by \(\langle a,b\rangle\)
12.
Let \(f\) be the function whose graph appears in Figure 11.7.16. For each value of \(t\) the vector \(\langle \cos(t),\sin(t)\rangle \) is a direction vector. The value of \(D(t)=D_{\langle \cos(t),\sin(t)\rangle} f (1,-1)\) is a scalar.
  1. What direction is the vector \(\langle \cos(t),\sin(t)\rangle \) in for
    1. \(\displaystyle t=0\)
    2. \(\displaystyle t=\frac{\pi}{2}\)
    3. \(\displaystyle t=\frac{3\pi}{2}\)
    4. \(\displaystyle t=\pi\)
  2. Is the value of \(D(t)=D_{\langle \cos(t),\sin(t)\rangle} f (1,-1)\) positive, negative, or zero for:
    1. \(\displaystyle t=0\)
    2. \(\displaystyle t=\frac{\pi}{2}\)
    3. \(\displaystyle t=\frac{3\pi}{2}\)
    4. \(\displaystyle t=\pi\)
  3. Draw the graph of \(y=D(t)\) for values of \(t\) in the interval \([0,2\pi]\text{.}\) You should think about how the value of \(D(t)\) changes between the values you considered above to help you draw the graph of \(D(t)\text{.}\) Your graph doesn’t have to have exact values but should correctly identify where \(D(t)\) is positive, negative, and zero.
  4. Draw the graph of \(y=G(t)=D_{\langle 1,1 \rangle} f (t,-t)\) for values of \(t\) in the interval \((0,2]\text{.}\) As above, first consider what \(G(t)\) will be for a few values of \(t\) between 0 and 2, then think about how \(G(t)\) will vary for values in between. Note that the function \(G(t)\) is different from \(D(t)\text{.}\)
13.
The graph of \(z=f(x,y)\) is as given below. In this problem, use geometric arguments to justify your answers.
  1. Draw the line segment that goes from \((1,-1,f(1,-1))\) to \((2,0,f(2,0))\text{.}\) How does the slope of this line segment in that direction compare (smaller, equal, larger) with the value of \(D_{\langle 1,1 \rangle} f (1,-1)\text{?}\)
  2. How does \(D_{\langle 1,1 \rangle} f (1,-1)\) compare with \(D_{\langle 0.01,0.01 \rangle} f (1,-1)\text{?}\) Justify your answer.
  3. Which is closer to \(D_{\langle 1,1 \rangle} f (1,-1)\text{,}\) the slope of the line segment that goes from \((1,-1,f(1,-1))\) to \((2,0,f(2,0))\) or the slope of the line segment that goes from \((1,-1,f(1,-1))\) to \((1.01,-0.99,f(1.01,-0.99))\text{?}\) Explain your answer.

14.

Let \(E(x,y) = \frac{100}{1+(x-5)^2 + 4(y-2.5)^2}\) represent the elevation on a land mass at location \((x,y)\text{.}\) Suppose that \(E\text{,}\) \(x\text{,}\) and \(y\) are all measured in meters.
  1. Find \(E_x(x,y)\) and \(E_y(x,y)\text{.}\)
  2. Let \(\vu\) be a unit vector in the direction of \(\langle -4,3 \rangle\text{.}\) Determine \(D_{\vu} E(3,4)\text{.}\) What is the practical meaning of \(D_{\vu} E(3,4)\) and what are its units?
  3. Find the direction of greatest increase in \(E\) at the point \((3,4)\text{.}\)
  4. Find the instantaneous rate of change of \(E\) in the direction of greatest decrease at the point \((3,4)\text{.}\) Include units on your answer.
  5. At the point \((3,4)\text{,}\) find a direction \(\vw\) in which the instantaneous rate of change of \(E\) is 0.

15.

Find all directions in which the directional derivative of \(f(x,y) = ye^{-xy}\) is 1 at the point \((0,2)\text{.}\)

16.

Find, if possible, a function \(f\) such that
\begin{equation*} \nabla f = \left\langle \sin(yz), xz\cos(yz)+2y, xy\cos(yz)+\frac{5}{z} \right\rangle\text{.} \end{equation*}
If not possible, explain why.

17.

Let \(f(x,y) = x^2+3y^2\text{.}\)
  1. Find \(\nabla f(x,y)\) and \(\nabla f(1,2)\text{.}\)
  2. Find the direction of greatest increase in \(f\) at the point \((1,2)\text{.}\) Explain. A graph of the surface defined by \(f\) is shown at left in Figure 11.7.17. Illustrate this direction on the surface.
  3. A contour diagram of \(f\) is shown at right in Figure 11.7.17. Illustrate your calculation from (b) on this contour diagram.
    Figure 11.7.17. Left: Graph of \(f(x,y) = x^2+3y^2\text{.}\) Right: Contours.
  4. Find a direction \(\vw\) for which the derivative of \(f\) in the direction of \(\vw\) is zero.

18.

The properties of the gradient that we have observed for functions of two variables also hold for functions of more variables. In this problem, we consider a situation where there are three independent variables. Suppose that the temperature in a region of space is described by
\begin{equation*} T(x,y,z) = 100e^{-x^2-y^2-z^2} \end{equation*}
and that you are standing at the point \((1,2,-1)\text{.}\)
  1. Find the instantaneous rate of change of the temperature in the direction of \(\vv=\langle 0, 1, 2\rangle\) at the point \((1,2,-1)\text{.}\) Remember that you should first find a unit vector in the direction of \(\vv\text{.}\)
  2. In what direction from the point \((1,2,-1)\) would you move to cause the temperature to decrease as quickly as possible?
  3. How fast does the temperature decrease in this direction?
  4. Find a direction in which the temperature does not change at \((1,2,-1)\text{.}\)

19.

Figure 11.7.18 shows a plot of the gradient \(\nabla f\) at several points for some function \(f=f(x,y)\text{.}\)
Figure 11.7.18. The gradient \(\nabla f\text{.}\)
  1. Consider each of the three indicated points, and draw, as best as you can, the contour through that point.
  2. Beginning at each point, draw a curve on which \(f\) is continually decreasing.