The formalism behind integration by substitution












29














When you are doing an integration by substitution you do the following working.
$$\begin{align*}
u&=f(x)\\
\Rightarrow\frac{du}{dx}&=f^{\prime}(x)\\
\Rightarrow du&=f^{\prime}(x)dx&(1)\\
\Rightarrow dx&=\frac{du}{f^{\prime}(x)}
\end{align*}$$




My question is: what on earth is going on at line $(1)$?!?




This has been bugging me for, like, forever! You see, when I was taught this in my undergrad I was told something along the lines of the following:



You just treat $\frac{du}{dx}$ like a fraction. Similarly, when you are doing the chain rule $\frac{dy}{dx}=\frac{dy}{dv}\times\frac{dv}{dx}$ you "cancel" the $dv$ terms. They are just like fractions. However, never, ever say this to a pure mathematician.



Now, I am a pure mathematician. And quite frankly I don't care if people think of these as fractions or not. I know that they are not fractions (but rather the limit of the difference quotients as the difference tends to zero). But I figure I should start caring now... So, more precisely,




$\frac{du}{dx}$ has a meaning, but so far as I know $du$ and $dx$ do not have a meaning. Therefore, why can we treat $\frac{du}{dx}$ as a fraction when we are doing integration by substitution? What is actually going on at line $(1)$?



































  • This should be valid in basic calculus, probably not in the theory of differential forms. I'm not very sure.
    – user122283
    Feb 6 '14 at 1:31






  • 1




    When doing ordinary integration over $\Bbb R$, we are writing $f(g(x))$ as $f(u)$ and $g'(x)dx=du$. This "$dx$" and "$du$" might have more abstract generalized meanings, but $u$-substitution is really just the chain rule and the fundamental theorem of calculus. We know how to find anti-derivatives of things of the form $f(g(x))g'(x)$, because we know how to find derivatives of things of the form $f(g(x))$. You might call into question the notation, but the mathematics behind it is rock solid.
    – PVAL-inactive
    Feb 6 '14 at 1:43






  • 2




    I don't feel qualified to give a full answer, but what's going on is some deep theorems with strong hypotheses, involving pushforward measures for Lebesgue integrals, or more simply a differentiable change of variables if you're just talking about Riemann integrals. The differential notation was constructed in some sense in order to have the "fraction" cancelling property, which is why it's "OK" to think of them like that if all you care about is evaluating an integral. In truth, though, (1) is just an abuse of notation.
    – Joshua Pepper
    Apr 3 '14 at 9:25










  • @JoshuaPepper If you don't feel qualified to give a full answer, could you perhaps suggest a book I could look up? Would it just be in Rudin?...(I forget if his "Principles..." looks at Riemann integrals, but I think it does?)
    – user1729
    Apr 3 '14 at 9:31










  • @JoshuaPepper Also, any suggestions for better tags would be appreciated...
    – user1729
    Apr 3 '14 at 9:45
















calculus integration notation














edited Apr 4 '14 at 11:34

























asked Apr 3 '14 at 9:12









user1729























6 Answers


















7














Consider evaluating $\int (3x^2 + 2x) e^{x^3 + x^2} \, dx$ (as in this Khan Academy video).



Often teachers will say, let $u = x^3 + x^2$, and note that "$du = (3x^2 + 2x) dx$".
Therefore, they say,
\begin{align}
\int (3x^2 + 2x) e^{x^3 + x^2} \, dx &= \int e^u \, du \\
&= e^u + C \\
&= e^{x^3 + x^2} + C.
\end{align}



However, this explanation is confusing because there's no such thing as $du$ or $dx$.



A clearer (in my opinion) and perfectly rigorous explanation is just to notice that our integral has the form $\int f(g(x)) g'(x) \, dx$, and use the rule
\begin{equation}
\int f(g(x)) g'(x) \, dx = F(g(x)) + C
\end{equation}
where $F$ is an antiderivative of $f$. This rule is clearly true, because it's nothing more than the chain rule in reverse. There's no need to use any "infinitesimals" or anything.
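
For anyone who wants to see the rule at work numerically, here is a minimal sketch using only the Python standard library. The interval $[0, 1]$, the midpoint rule, and the helper name `midpoint_integral` are choices made for this illustration, not anything canonical. It compares a direct numerical integral of $f(g(x))g'(x)$ with $F(g(b)) - F(g(a))$ for the example above.

```python
import math

def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def g(x):
    """Inner function g(x) = x^3 + x^2."""
    return x**3 + x**2

def g_prime(x):
    """Derivative of the inner function, g'(x) = 3x^2 + 2x."""
    return 3 * x**2 + 2 * x

f = math.exp  # outer function f(u) = e^u
F = math.exp  # an antiderivative of f (e^u is its own antiderivative)

a, b = 0.0, 1.0  # illustrative interval (an assumption of this sketch)

# Left-hand side: integrate f(g(x)) * g'(x) directly, no substitution.
lhs = midpoint_integral(lambda x: f(g(x)) * g_prime(x), a, b)

# Right-hand side: the rule says the answer is F(g(b)) - F(g(a)).
rhs = F(g(b)) - F(g(a))

print(lhs, rhs)
```

The two printed numbers should agree up to the quadrature error, and no $du$ or $dx$ was manipulated anywhere.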























  • 1




    +1 Right, I actually use the second way you describe; it's completely clear and not confusing.
    – Sawarnik
    Apr 3 '14 at 10:17












  • +1 This is pretty much what I have to say. You could also note that $\int f(g(x))g^\prime(x) dx = \int f(g(x))\frac{dg}{dx}dx = \int f(g)dg$ is where the $dg=\frac{dg}{dx}dx$ notation comes from, and could be considered the definition of the latter.
    – Joshua Pepper
    Apr 3 '14 at 10:40












  • Yes, this explanation is nice. The uncomfortable fact remains, though, that "what the teachers say" is what we (all?) end up doing.
    – leonbloy
    Apr 3 '14 at 15:48





















6














Recall that $u$-substitution is really the inverse rule of the chain rule, just like integration by parts is the inverse rule of the product rule. The essence of the chain rule is that



$$ \frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}y}{\mathrm{d}u}\frac{\mathrm{d}u}{\mathrm{d}x},$$



which is why we like to write derivatives as ratios - often, when they look like they cancel, they really "do cancel," so to speak.



A better way of writing $u$-substitution is to say that $\dfrac{\mathrm{d}u}{\mathrm{d}x} = f'(x)$, though we might as well notate this as $u'(x)$, since that's what we're really doing. Then

$$ \int g(u(x))u'(x) \mathrm{d}x = \color{#F01C2C}{\int g(u(x)) \frac{\mathrm{d}u}{\mathrm{d}x}\mathrm{d}x = \int g(u) \mathrm{d}u} = \int g(u) \mathrm{d}u,$$

where I've notated the important equality in red. The step in red is visibly related to the chain rule: the part that looks like it cancels really does cancel. $\diamondsuit$



The theme here is that this is valid because of the chain rule, and the notation is chosen to support the cancellation effects. The fact that people go around separating this very convenient notation is largely for different reasons, and/or because they are implying a good amount of knowledge of "differentials."



We can even more directly relate this to the chain rule by giving a proof. Consider the function



$$ F(x) = \int_{0}^x g(t)\mathrm{d}t.$$



Consider the function $F(u(x))$ and differentiate it:



$$ \begin{align}
F(u(x))' &= F'(u(x)) u'(x) = \frac{\mathrm{d}F}{\mathrm{d}u}\frac{\mathrm{d}u}{\mathrm{d}x}\\
&=\frac{\mathrm{d}}{\mathrm{d}u}\int_{0}^{u(x)} g(t)\mathrm{d}t \cdot u'(x)\\
&= g(u(x))u'(x).
\end{align}$$

Then the second fundamental theorem of calculus says that



$$\begin{align}
\int_a^b g(u(x))u'(x)\mathrm{d}x &= F(u(b)) - F(u(a)) \\
&= \int_{a}^{b} g(u(t))u'(t)\mathrm{d}t \\
&=\int_{a}^{b}g(u(t))\frac{\mathrm{d}u}{\mathrm{d}t}\mathrm{d}t.
\end{align}$$

Of course, we also know that $\displaystyle F(u(b)) - F(u(a)) = \int_{u(a)}^{u(b)} g(t) \mathrm{d}t = \int_{u(a)}^{u(b)} g(u) \mathrm{d}u$.
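
To see the change of limits in action, here is a small numerical sketch (standard-library Python). The particular choices $g = \cos$, $u(x) = x^2$, $a = 0.5$, $b = 1.5$ and the helper `midpoint_integral` are assumptions made purely for illustration. It integrates once in the variable $x$ over $[a, b]$ and once in the variable $u$ over $[u(a), u(b)]$.

```python
import math

def midpoint_integral(func, lo, hi, n=100_000):
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

g = math.cos  # illustrative integrand g

def u(x):
    """Illustrative substitution u(x) = x^2."""
    return x * x

def u_prime(x):
    """Derivative u'(x) = 2x."""
    return 2 * x

a, b = 0.5, 1.5

# Integral in x: g(u(x)) * u'(x) over [a, b].
in_x = midpoint_integral(lambda x: g(u(x)) * u_prime(x), a, b)

# Integral in u: g(u) over [u(a), u(b)], the limits moved by the substitution.
in_u = midpoint_integral(g, u(a), u(b))

# Closed form F(u(b)) - F(u(a)), where F = sin is an antiderivative of cos.
exact = math.sin(u(b)) - math.sin(u(a))

print(in_x, in_u, exact)
```

All three values should agree up to quadrature error, which is exactly the statement that the two integrals both equal $F(u(b)) - F(u(a))$.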



























  • Why is the date of this answer Feb 6 when the question was asked 8 days ago? ಠ_ಠ
    – Superbus
    Apr 12 '14 at 1:09










  • @Lucius: Magic ;p No, actually this answer was merged into this question from an earlier version.
    – davidlowryduda
    Apr 12 '14 at 1:13






  • 2




    This color is so cute: #F01C2C that I'm gonna steal from you!
    – Lucas Zanella
    May 24 '14 at 4:31





















3














One way to interpret $df$ (for $f \,:\, \mathbb{R} \to \mathbb{R}$ for simplicity) is to view it as a map $$
df \,:\, \mathbb{R}\to \left(\mathbb{R} \to \mathbb{R}\right) \,:\, c \mapsto \left(x \mapsto xF_c\right) \text{.}
$$
In plain English, $df$ is a map which sends each point in $\mathbb{R}$ to a linear function $\mathbb{R} \to \mathbb{R}$. For each $c$, the linear map $(df)(c) = x \mapsto xF_c$ is the best linear approximation of $f$ at the point $c$. We know, of course, that this means nothing other than that $F_c = f'(c)$; after all, that's one way to define the derivative: as the slope of the best linear approximation at the point $c$.



So what is $\frac{du}{dv}$, then? It's a quotient of maps, and if you interpret it simply point-wise, you get $$
\frac{du}{dv} = \frac{(c,x) \mapsto xU_c}{(c,x) \mapsto xV_c}
= (c,x) \mapsto \frac{xU_c}{xV_c} = (c,x) \mapsto \frac{U_c}{V_c} \text{.}
$$
This doesn't depend on $x$ anymore, so we may re-interpret it as a function $\mathbb{R} \to \mathbb{R}$, and if $u=u(v)$ and $v$ is an independent variable, then $U_c = u'(c)$ and $V_c = 1$, so we get $\frac{du}{dv} \,:\, \mathbb{R} \to \mathbb{R} \,:\, c \mapsto u'(c)$, i.e. $\frac{du}{dv} = u'$.
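
The "map of maps" picture translates almost literally into code. Below is a minimal sketch in Python; the names `differential` and `quotient`, and the example $u(v) = v^3$, are hypothetical and chosen only for this illustration. Here `differential` builds $c \mapsto (x \mapsto x F_c)$ from a given derivative, and `quotient` divides two such maps point-wise at a nonzero increment $x$.

```python
import math

def differential(f_prime):
    """Build df : c -> (x -> f'(c) * x), the best linear approximation at c."""
    def df(c):
        slope = f_prime(c)          # this is F_c in the notation above
        return lambda x: slope * x  # the linear map x -> x * F_c
    return df

def quotient(du, dv):
    """Point-wise quotient of two differentials; x must be nonzero."""
    def ratio(c, x=1.0):
        return du(c)(x) / dv(c)(x)
    return ratio

# Example (an assumption of this sketch): u(v) = v^3, with v independent.
du = differential(lambda c: 3 * c**2)  # U_c = u'(c) = 3c^2
dv = differential(lambda c: 1.0)       # V_c = 1

du_dv = quotient(du, dv)

# The quotient does not depend on the increment x, only on the point c.
print(du_dv(2.0), du_dv(2.0, x=5.0))
```

Both calls return the same value, $u'(2) = 12$, because the increment $x$ cancels in the quotient just as in the displayed formula.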

























  • 2




    Aaaah - is the key word "differential"?
    – user1729
    Apr 3 '14 at 10:14






  • 1




    By the way, usually when we notate a function value, we use `\mapsto` instead of `\to`, as in: the function $f:A\to B$ defined by $x\mapsto x^2$, or for a function of functions: $g:A\to(B\to C)$ defined as $g=x\mapsto(y\mapsto x\cdot y)$.
    – Mario Carneiro
    Apr 3 '14 at 17:16










  • @MarioCarneiro Ah, will try to remember that, thanks.
    – fgp
    Apr 6 '14 at 12:13



















0














When applying notions like the chain rule and substitution we treat derivatives just like fractions, but the rules are slightly bent, as the multivariable chain rule shows:

we have $\frac{\partial f(g(t),h(t))}{\partial t}= \frac{\partial f}{\partial g}\frac{\partial g}{\partial t}+\frac{\partial f}{\partial h}\frac{\partial h}{\partial t}$, but if we cancel these down we get $\frac{\partial f(g(t),h(t))}{\partial t}=2\frac{\partial f(g(t),h(t))}{\partial t}$.

But in one variable, just like above, everything runs smoothly, and it is good to note that things like "$dx$" are infinitesimally small changes in $x$, so when we consider $du/dx$, we consider both "$du$" and "$dx$" as they become infinitesimally small, so we can manipulate them like fractions.
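
A quick numerical illustration of why the naive cancellation fails in several variables (a sketch in standard-library Python; the functions $f(g,h) = gh$, $g(t) = t^2$, $h(t) = \sin t$, the point $t_0 = 1.3$ and the finite-difference helper `derivative` are all assumptions picked for this example). The total derivative equals the sum of the two chain-rule terms, and neither term on its own equals it, which is what "cancelling" would wrongly suggest.

```python
import math

def g(t):
    return t * t

def h(t):
    return math.sin(t)

def f(a, b):
    return a * b

def derivative(func, t, eps=1e-6):
    """Central finite-difference approximation of func'(t)."""
    return (func(t + eps) - func(t - eps)) / (2 * eps)

t0 = 1.3

# Total derivative of t -> f(g(t), h(t)).
total = derivative(lambda t: f(g(t), h(t)), t0)

# Chain-rule terms; for f(a, b) = a * b the partials are df/da = b, df/db = a.
term_g = h(t0) * derivative(g, t0)  # (df/dg) * (dg/dt)
term_h = g(t0) * derivative(h, t0)  # (df/dh) * (dh/dt)

print(total, term_g + term_h, term_g, term_h)
```

The first two printed values agree, while `term_g` and `term_h` individually differ from `total`, so each "fraction" cannot simply be cancelled down to $\partial f / \partial t$.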



























  • I know that the notation represents the limit of some fractions, but I suppose my question is "why can we manipulate this limit like it was a fraction?!?" (or rather, the "notation representing this limit...")
    – user1729
    Apr 3 '14 at 9:44












  • We can because when we consider these objects, they are changing, becoming smaller and smaller, so they act like non zero objects, allowing us to manipulate them like fractions. But like I said there are "paradoxical" examples.
    – Ellya
    Apr 3 '14 at 10:06



















0














Consider the geometrical interpretation: you have a right triangle with legs of length $\Delta x$ and $\Delta u$, and $f'$ is actually the slope $f'=k=\tan(\alpha)$, so you get $f'=k=\tan(\alpha)=\frac{\Delta u}{\Delta x}$. Now let $\Delta x \rightarrow 0$ and you get a definition of the derivative... So $du$ and $dx$ have a meaning, and doing something like $dx=\frac{du}{f'(x)}$ does make sense.





























  • I know that the notation represents the limit of some fractions, but I suppose my question is "why can we manipulate this limit like it was a fraction?!?" (or rather, the "notation representing this limit...")
    – user1729
    Apr 3 '14 at 9:44



















0















I know that they are not fractions [...]




Well, by non-standard analysis (following a book I referred to in a comment on a similar answer), that's where you're wrong. And that is the premise supporting the whole question: if you had said the opposite, you wouldn't have asked it.




My question is: what on earth is going on at line (1)?!?




First, the $u$-substitution, while used in integration, is on its own an operation of differentiation. As differentiation is a function (on functions), and both sides are equal, the differentials must be equal. It is by definition of any function that $a=b\Rightarrow f(a)=f(b)$.



So, what is differentiation? It is the infinitesimal variation of the tangent of a function on a point. What you might be thinking is: why make a distinction between the variation on the tangent and on the function itself if they're coincident when zooming in enough? The answer is this allows us to both define the derivative as a (hyper)real fraction (pun intended), and not simply and informally discard the smaller infinitesimals.



An accurate image to illustrate this is the following taken from the book:



http://i.stack.imgur.com/OqYKE.png



As the differential is on the tangent, and to know the tangent one should know the derivative, the former's definition is $dy = f'(x) dx$. Note that everything here is a number, and by the transfer principle, the usual rules of algebra apply.
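
Ordinary floating-point numbers are not hyperreals, so the following is only a finite-increment analogue of the picture, not non-standard analysis itself. It is a minimal Python sketch with the assumed example $f(x) = x^3$ at $x = 2$. It compares the change along the curve, $\Delta y$, with the change along the tangent, $dy = f'(x)\,dx$, and shows that the gap between them, relative to $dx$, shrinks as $dx$ does.

```python
def f(x):
    return x**3

def f_prime(x):
    return 3 * x**2

x = 2.0
gaps = []
for dx in (1e-1, 1e-2, 1e-3):
    delta_y = f(x + dx) - f(x)  # change measured on the curve
    dy = f_prime(x) * dx        # change measured on the tangent line
    gap = (delta_y - dy) / dx   # relative discrepancy; here 3*x*dx + dx**2
    gaps.append(gap)
    print(dx, delta_y, dy, gap)
```

In this finite setting the relative gap is $3x\,dx + dx^2$, which goes to zero with $dx$; this is the part that gets "discarded" when passing from the curve to the tangent.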





























    answered Apr 3 '14 at 10:09









    littleO









    • +1 Right, I actually use the second way you describe; it's completely clear and not confusing.
      – Sawarnik
      Apr 3 '14 at 10:17












    • +1 This is pretty much what I have to say. You could also note that $\int f(g(x))g^\prime(x) \, dx = \int f(g(x))\frac{dg}{dx} \, dx = \int f(g) \, dg$ is where the $dg=\frac{dg}{dx}dx$ notation comes from, and could be considered the definition of the latter.
      – Joshua Pepper
      Apr 3 '14 at 10:40












    • Yes, this explanation is nice. The uncomfortable fact remains, though, that "what the teachers say" is what we (all?) end up doing.
      – leonbloy
      Apr 3 '14 at 15:48






























    Recall that $u$-substitution is really the inverse of the chain rule, just as integration by parts is the inverse of the product rule. The essence of the chain rule is that

    $$ \frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}y}{\mathrm{d}u}\frac{\mathrm{d}u}{\mathrm{d}x},$$

    which is why we like to write derivatives as ratios - often, when they look like they cancel, they really "do cancel," so to speak.

    A better way of writing $u$-substitution is to say that $\dfrac{\mathrm{d}u}{\mathrm{d}x} = f'(x)$, though we might as well notate this as $u'(x)$, since that's what we're really doing. Then

    $$ \int g(u(x))u'(x) \mathrm{d}x = \color{#F01C2C}{\int g(u(x)) \frac{\mathrm{d}u}{\mathrm{d}x}\mathrm{d}x = \int g(u) \mathrm{d}u} = \int g(u) \mathrm{d}u,$$

    where I've marked the important equality in red. The step in red is visibly related to the chain rule: the part that looks like it cancels really does cancel. $\diamondsuit$

    The theme here is that this is valid because of the chain rule, and the notation is chosen to support the cancellation effects. The fact that people go around separating this very convenient notation is largely for different reasons, and/or because they are assuming a good amount of knowledge of "differentials."

    We can even more directly relate this to the chain rule by giving a proof. Consider the function

    $$ F(x) = \int_{0}^x g(t)\mathrm{d}t.$$

    Consider the function $F(u(x))$ and differentiate it:

    $$ \begin{align}
    F(u(x))' &= F'(u(x)) u'(x) = \frac{\mathrm{d}F}{\mathrm{d}u}\frac{\mathrm{d}u}{\mathrm{d}x}\\
    &=\frac{\mathrm{d}}{\mathrm{d}u}\int_{0}^{u(x)} g(t)\mathrm{d}t \cdot u'(x)\\
    &= g(u(x))u'(x).
    \end{align}$$

    Then the second fundamental theorem of calculus says that

    $$\begin{align}
    \int_a^b g(u(x))u'(x)\mathrm{d}x &= F(u(b)) - F(u(a)) \\
    &= \int_{a}^{b} g(u(t))u'(t)\mathrm{d}t \\
    &=\int_{a}^{b}g(u(t))\frac{\mathrm{d}u}{\mathrm{d}t}\mathrm{d}t.
    \end{align}$$

    Of course, we also know that $\displaystyle F(u(b)) - F(u(a)) = \int_{u(a)}^{u(b)} g(t) \mathrm{d}t = \int_{u(a)}^{u(b)} g(u) \mathrm{d}u$.
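The identity $\int_a^b g(u(x))u'(x)\,\mathrm{d}x = \int_{u(a)}^{u(b)} g(u)\,\mathrm{d}u$ can also be checked numerically. The following Python sketch is only an illustration under assumed choices that do not come from the answer: $g = \cos$, $u(x) = x^2$, the interval $[0, 2]$, and a hand-written `simpson` helper.

```python
import math

def simpson(func, a, b, n=2000):
    # Composite Simpson's rule on [a, b]; n is forced to be even
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def g(t):
    return math.cos(t)

def u(x):
    return x * x

def u_prime(x):
    return 2 * x

a, b = 0.0, 2.0  # illustrative limits in the x variable

# Left side: integrate g(u(x)) * u'(x) over [a, b] in the variable x
lhs = simpson(lambda x: g(u(x)) * u_prime(x), a, b)

# Right side: integrate g over [u(a), u(b)] in the variable u
rhs = simpson(g, u(a), u(b))

print(lhs, rhs)  # both should be close to sin(4)
```

Note that the limits change from $a, b$ to $u(a), u(b)$ on the right-hand side, exactly as in the last displayed formula.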
















    answered Feb 6 '14 at 2:20









    davidlowryduda













    • Why is the date of this answer Feb 6 when the question was asked 8 days ago? ಠ_ಠ
      – Superbus
      Apr 12 '14 at 1:09










    • @Lucius: Magic ;p No, actually this answer was merged into this question from an earlier version.
      – davidlowryduda
      Apr 12 '14 at 1:13






    • This color is so cute: #F01C2C that I'm gonna steal from you!
      – Lucas Zanella
      May 24 '14 at 4:31


































    One way to interpret $df$ (for $f \,:\, \mathbb{R} \to \mathbb{R}$, for simplicity) is to view it as a map $$
    df \,:\, \mathbb{R}\to \left(\mathbb{R} \to \mathbb{R}\right) \,:\, c \mapsto \left(x \mapsto xF_c\right) \text{.}
    $$
    In plain English, $df$ is a map which sends each point in $\mathbb{R}$ to a linear function $\mathbb{R} \to \mathbb{R}$. For each $c$, the linear map $(df)(c) = x \mapsto xF_c$ is the best linear approximation of $f$ at the point $c$. We know, of course, that this means nothing other than that $F_c = f'(c)$ - after all, that's one way to define the derivative: as the slope of the best linear approximation at the point $c$.

    So what is $\frac{du}{dv}$, then? It's a quotient of maps, and if you interpret it simply point-wise, you get $$
    \frac{du}{dv} = \frac{(c,x) \mapsto xU_c}{(c,x) \mapsto xV_c}
    = (c,x) \mapsto \frac{xU_c}{xV_c} = (c,x) \mapsto \frac{U_c}{V_c} \text{.}
    $$
    This doesn't depend on $x$ anymore (for $x \neq 0$), so we may re-interpret it as a function $\mathbb{R} \to \mathbb{R}$, and if $u=u(v)$ and $v$ is an independent variable, then $U_c = u'(c)$ and $V_c = 1$, so we get $\frac{du}{dv} \,:\, \mathbb{R} \to \mathbb{R} \,:\, c \mapsto u'(c)$, i.e. $\frac{du}{dv} = u'$.
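To make the "map of maps" picture concrete, here is a small Python sketch. It is an illustration only: the names `d`, `quotient` and `derivative` are hypothetical, the example $u(v) = v^3$ is an assumption, and the slope $F_c$ is approximated by a central difference rather than computed exactly.

```python
def derivative(f, c, h=1e-6):
    # Central difference approximation of f'(c); stands in for exact differentiation
    return (f(c + h) - f(c - h)) / (2 * h)

def d(f):
    # df: c -> (x -> x * F_c), where F_c approximates f'(c)
    def at_point(c):
        slope = derivative(f, c)
        return lambda x: x * slope
    return at_point

def quotient(du, dv):
    # Point-wise quotient (c, x) -> du(c)(x) / dv(c)(x); requires x != 0 and V_c != 0
    return lambda c, x: du(c)(x) / dv(c)(x)

u = lambda v: v**3   # u as a function of v (illustrative choice)
v = lambda v: v      # v is the independent variable, so (dv)(c) is x -> x

du_dv = quotient(d(u), d(v))

# The value does not depend on the nonzero x we plug in, and is close to u'(2) = 12
print(du_dv(2.0, 1.0), du_dv(2.0, 5.0))
```

The design mirrors the formulas above: `d(u)(c)` is the linear map $x \mapsto xU_c$, and the quotient is taken point-wise in $(c, x)$ before observing that $x$ drops out.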














    edited Apr 6 '14 at 12:16

























    answered Apr 3 '14 at 10:06









    fgp









    • Aaaah - is the key word "differential"?
      – user1729
      Apr 3 '14 at 10:14

    • By the way, usually when we notate a function value, we use $\mapsto$ instead of $\to$, as in: the function $f:A\to B$ defined by $x\mapsto x^2$, or for a function of functions: $g:A\to(B\to C)$ defined as $g=x\mapsto(y\mapsto x\cdot y)$.
      – Mario Carneiro
      Apr 3 '14 at 17:16










    • @MarioCarneiro Ah, will try to remember that, thanks.
      – fgp
      Apr 6 '14 at 12:13




























    When applying notions like the chain rule and substitution we treat derivatives just like fractions, but the rules are slightly bent, as the multivariable chain rule shows:

    we have $\frac{\partial f(g(t),h(t))}{\partial t}= \frac{\partial f}{\partial g}\frac{\partial g}{\partial t}+\frac{\partial f}{\partial h}\frac{\partial h}{\partial t}$, but if we cancel these down as fractions we get the absurd $\frac{\partial f(g(t),h(t))}{\partial t}=2\frac{\partial f(g(t),h(t))}{\partial t}$.

    But in one variable, just like above, everything runs smoothly, and it is good to note that things like "$dx$" are infinitesimally small changes in $x$, so when we consider $du/dx$, we consider both "$du$" and "$dx$" as they become infinitesimally small, so we can manipulate them like fractions.
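The multivariable example can be checked numerically. The Python sketch below uses assumed, purely illustrative choices ($f(x, y) = xy$, $g(t) = t^2$, $h(t) = \sin t$, the point $t_0 = 1.3$ and the helper `central`); it shows that the two terms of the chain rule add up to the derivative, while neither term alone equals it, which is why naive "cancelling" fails.

```python
import math

def g(t):
    return t * t

def h(t):
    return math.sin(t)

def f(x, y):
    return x * y

def central(func, t, eps=1e-6):
    # Central difference approximation of func'(t)
    return (func(t + eps) - func(t - eps)) / (2 * eps)

t0 = 1.3  # illustrative evaluation point

# Derivative of t -> f(g(t), h(t)) computed directly
total = central(lambda t: f(g(t), h(t)), t0)

# The two chain-rule terms: (df/dg)(dg/dt) and (df/dh)(dh/dt)
term_g = central(lambda x: f(x, h(t0)), g(t0)) * central(g, t0)
term_h = central(lambda y: f(g(t0), y), h(t0)) * central(h, t0)

print(total, term_g + term_h)  # these agree
print(term_g, term_h)          # neither equals total on its own
```

If each term could be "cancelled" to the full derivative, both `term_g` and `term_h` would equal `total`; the numbers show they do not.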
















    answered Apr 3 '14 at 9:20









    Ellya













    • I know that the notation represents the limit of some fractions, but I suppose my question is "why can we manipulate this limit like it was a fraction?!?" (or rather, the "notation representing this limit...")
      – user1729
      Apr 3 '14 at 9:44












    • We can because when we consider these objects, they are changing, becoming smaller and smaller, so they act like non zero objects, allowing us to manipulate them like fractions. But like I said there are "paradoxical" examples.
      – Ellya
      Apr 3 '14 at 10:06
































    Consider the geometrical interpretation: you have a right triangle with legs $\Delta x$ and $\Delta u$, and the slope of its hypotenuse is $k=\tan(\alpha)=\frac{\Delta u}{\Delta x}$. Now let $\Delta x \rightarrow 0$ and you get the definition of the derivative, $f'=\lim_{\Delta x \to 0}\frac{\Delta u}{\Delta x}$... So $du$ and $dx$ have a meaning, and doing something like $dx=\frac{du}{f'(x)}$ does make sense.
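The limiting process described here can be watched numerically. This Python sketch is only an illustration with assumed choices ($u = \sin x$ at $x_0 = 0.5$, and step sizes $10^{-1}, \dots, 10^{-5}$); it shows the ratio $\Delta u / \Delta x$ approaching $f'(x_0)$, and says nothing by itself about what $du$ and $dx$ mean separately.

```python
import math

def f(x):
    return math.sin(x)

def f_prime(x):
    # Known exact derivative, used only to measure the error
    return math.cos(x)

x0 = 0.5  # illustrative point
errors = []
for k in range(1, 6):
    dx = 10.0 ** (-k)
    du = f(x0 + dx) - f(x0)       # the vertical leg of the triangle
    ratio = du / dx                # slope of the hypotenuse, tan(alpha)
    errors.append(abs(ratio - f_prime(x0)))
    print(dx, ratio)

print(errors)  # the errors shrink as dx shrinks
```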














    edited Apr 3 '14 at 9:46









    user1729











    answered Apr 3 '14 at 9:36









    MarkisaB
















    I know that they are not fractions [...]




    Well, in non-standard analysis (following a book I referred to in a comment on a similar answer), that's where you're wrong. And that is the premise supporting the whole question: if you had assumed the opposite, you wouldn't have asked it.




    My question is: what on earth is going on at line (1)?!?




    First, the $u$-substitution, while used in integration, is on its own an operation of differentiation. Taking the differential is a function (on functions), and both sides of $u=f(x)$ are equal, so their differentials must be equal. For any function $F$, it holds by definition that $a=b\Rightarrow F(a)=F(b)$.
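For readers who prefer to stay within standard analysis, line (1) can also be read as shorthand for the chain rule. The following is only a sketch: $g$ (a continuous integrand) and $G$ (an antiderivative of $g$) are names introduced here for illustration and do not appear in the question, and $f$ is assumed to be differentiable.

$$\begin{align*}
\frac{d}{dx}\,G(f(x)) &= G'(f(x))\,f'(x) = g(f(x))\,f'(x)\\
\Rightarrow \int g(f(x))\,f'(x)\,dx &= G(f(x)) + C = \left.\int g(u)\,du\,\right|_{u=f(x)}
\end{align*}$$

Read this way, replacing $f'(x)\,dx$ by $du$ is bookkeeping for the chain rule together with the fundamental theorem of calculus.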



    So, what is a differential? It is the infinitesimal variation along the tangent line to the graph of a function at a point. What you might be thinking is: why make a distinction between the variation along the tangent and along the function itself if they coincide when you zoom in enough? The answer is that this allows us both to define the derivative as a (hyper)real fraction (pun intended) and to avoid simply and informally discarding the smaller infinitesimals.



    An accurate image to illustrate this is the following taken from the book:



    http://i.stack.imgur.com/OqYKE.png



    As the differential is measured along the tangent, and to know the tangent one must know the derivative, the differential is defined by $dy = f'(x)\,dx$. Note that everything here is a (hyperreal) number, and by the transfer principle the usual rules of algebra apply.
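The distinction between the variation along the tangent, $dy = f'(x)\,dx$, and the variation of the function itself, $\Delta y = f(x+dx)-f(x)$, can be checked numerically. The sketch below is an illustration only: it uses ordinary floating-point numbers with a small real `dx` rather than true infinitesimals, and the choice $f(x)=x^2$ and the helper name `increment_vs_differential` are assumptions made for this example.

```python
def f(x):
    """Example function, chosen only for illustration."""
    return x ** 2


def f_prime(x):
    """Derivative of the example function."""
    return 2 * x


def increment_vs_differential(x, dx):
    """Return (delta_y, dy): the change along the curve and along the tangent."""
    delta_y = f(x + dx) - f(x)  # actual change of the function
    dy = f_prime(x) * dx        # change along the tangent line
    return delta_y, dy


if __name__ == "__main__":
    for dx in (1e-1, 1e-2, 1e-3):
        delta_y, dy = increment_vs_differential(3.0, dx)
        # For f(x) = x^2 the gap delta_y - dy equals dx^2, so the gap
        # divided by dx is dx itself and shrinks along with dx.
        print(dx, delta_y, dy, (delta_y - dy) / dx)
```

For this $f$, the gap $\Delta y - dy$ is exactly $dx^2$, which is the "smaller infinitesimal" that the informal argument throws away.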














        edited Apr 13 '17 at 12:20









        Community











        answered Apr 4 '14 at 20:17









        JMCF125
