Matrix Differentiation of Kronecker Product












I have a question about differentiating an expression that contains multiple Kronecker products.

I have the following objective function, which I would like to differentiate with respect to $\mathbf{Q}$:
\begin{equation*}
\lVert\mathbf{y}-\mathbf{A}(\mathbf{Q}\otimes\mathbf{Q}\otimes\mathbf{Q}\otimes\mathbf{Q})\mathbf{x}\rVert^2_2
\end{equation*}

where $\mathbf{y}\in\mathbb{R}^m$, $\mathbf{A}\in\mathbb{R}^{m\times K^4}$, $\mathbf{Q}\in\mathbb{R}^{K\times K}$ and $\mathbf{x}\in\mathbb{R}^{K^4}$. I am confused about how the chain rule works for matrix differentiation. This is how I proceeded:

Let $f=\lVert\mathbf{y}-\mathbf{A}(\mathbf{Q}\otimes\mathbf{Q}\otimes\mathbf{Q}\otimes\mathbf{Q})\mathbf{x}\rVert^2_2$
and $\mathbf{B}=\mathbf{Q}\otimes\mathbf{Q}\otimes\mathbf{Q}\otimes\mathbf{Q}$. Then $\frac{df}{d\mathbf{Q}}=\frac{df}{d\mathbf{B}}\frac{d\mathbf{B}}{d\mathbf{Q}}$.



When I calculate $\frac{df}{d\mathbf{B}}=-2\mathbf{A}^T(\mathbf{y}-\mathbf{ABx})\mathbf{x}^T$, I get a $K^4\times K^4$ matrix, not the $K\times K$ matrix I am hoping for.
So I must be applying the chain rule incorrectly, because the dimensions do not work out when a scalar is differentiated with respect to a matrix.



Thank you for your help in advance.







matrix-calculus chain-rule kronecker-product
















asked Dec 13 '18 at 14:20









shex95












  • Of course $df/dB$ is $K^4\times K^4$: $B$ itself is $K^4\times K^4$. There is no mystery here.
    – Federico
    Dec 13 '18 at 16:50










  • Thanks for your comments! In reply to your first comment: I agree there is no mystery, but I was led to believe that the derivative of a scalar with respect to a matrix has the same dimensions as the matrix, i.e. $\dim(\frac{df}{d\mathbf{Q}})=\dim(\mathbf{Q})$, so it seemed that the chain rule does not preserve dimensionality in this case. Secondly, thanks for your longer answer, which definitely clears up part of my question :)
    – shex95
    Dec 13 '18 at 17:04






  • Yes, $\frac{df}{dQ}$ has the same dimensions as $Q$, that is correct. The problem is how you interpret $\frac{df}{dB}\frac{dB}{dQ}$. The former is a $K^4\times K^4$ matrix, the latter a $(K^4\times K^4)\times(K\times K)$ tensor, and you have to contract the correct indices to obtain the chain rule: $$ \frac{df}{dQ_{ij}} = \sum_{k,l=1}^{K^4} \frac{df}{dB_{kl}} \frac{dB_{kl}}{dQ_{ij}} = \sum_{k,l=1}^{K^4} \left(\frac{df}{dB}\right)_{kl} \frac{dB_{kl}}{dQ_{ij}} $$
    – Federico
    Dec 13 '18 at 17:15












  • Thanks Federico, you have cleared things up there. I appreciate your help.
    – shex95
    Dec 13 '18 at 17:17










  • So the contraction between $\frac{df}{dB}$ and $\frac{dB}{dQ}$ is what some people call the double dot product. But my advice is to write out the indices in order to avoid mistakes.
    – Federico
    Dec 13 '18 at 17:18
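
For reference, the contraction described in the comments can be written compactly as a Frobenius inner product. The following is only a sketch: it assumes $\frac{\partial f}{\partial B} = -2A^T(y-ABx)x^T$ (with the factor $-2$ that comes from differentiating the squared norm) and treats $\frac{\partial B}{\partial Q_{ij}}$ as a $K^4\times K^4$ matrix for each fixed pair $(i,j)$.

$$
\frac{\partial f}{\partial Q_{ij}}
= \sum_{k,l=1}^{K^4} \left(\frac{\partial f}{\partial B}\right)_{kl} \frac{\partial B_{kl}}{\partial Q_{ij}}
= \operatorname{tr}\!\left[\left(\frac{\partial f}{\partial B}\right)^{T} \frac{\partial B}{\partial Q_{ij}}\right]
= -2\,(y-ABx)^T A\, \frac{\partial B}{\partial Q_{ij}}\, x .
$$

Each pair $(i,j)$ yields one scalar, so collecting them gives a $K\times K$ array, which matches the shape of $Q$.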


















1 Answer
Short answer: the derivative of $Q\otimes Q\otimes Q\otimes Q$ with respect to $Q$ looks like a mess at first sight...

Let's start simple. Let $Q$ be a $K\times K$ matrix with entries $Q_{ij}$, and let $E^{ab}$ be the $K\times K$ matrix whose entries are all $0$ except for the $(a,b)$ entry, which is $1$; in other words, $(E^{ab})_{ij} = \delta_a^i\delta_b^j$.

Then I claim that
$$
\frac{\partial(Q\otimes Q)}{\partial Q_{ij}} = E^{ij}\otimes Q+Q\otimes E^{ij} .
$$

I leave it to you to see why, because trying to write out the matrices involved would probably crash the entire Stack Exchange network...

Jokes aside, this is really immediate to see: just write $Q\otimes Q$ in block form, as in the first formula of the definition of the Kronecker product, and think about which elements are affected by $Q_{ij}$. There is the entire $(i,j)$th block, so you get $E^{ij}\otimes Q$, but there is also the $(i,j)$th entry in each block, which gives you $Q\otimes E^{ij}$.
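
A quick way to convince yourself is to check the claim on a small example. Below is a minimal pure-Python sketch for $K=2$; the helper names `kron`, `unit`, `add`, `dkron2` and `dkron2_fd` are made up for this illustration. It compares the closed form with a central difference of step 1, which is exact here because every entry of $Q\otimes Q$ is at most quadratic in $Q_{ij}$.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def unit(K, i, j):
    """E^{ij}: K x K matrix with a 1 at (i, j), zeros elsewhere (0-based)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(K)]
            for r in range(K)]


def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def dkron2(Q, i, j):
    """Closed form: d(Q kron Q)/dQ_ij = E^ij kron Q + Q kron E^ij."""
    E = unit(len(Q), i, j)
    return add(kron(E, Q), kron(Q, E))


def dkron2_fd(Q, i, j):
    """Central difference with step 1 (exact for quadratic entries)."""
    Qp = [row[:] for row in Q]
    Qm = [row[:] for row in Q]
    Qp[i][j] += 1
    Qm[i][j] -= 1
    Bp, Bm = kron(Qp, Qp), kron(Qm, Qm)
    return [[(p - m) // 2 for p, m in zip(rp, rm)]
            for rp, rm in zip(Bp, Bm)]


Q = [[1, 2], [3, 4]]
matches = all(dkron2(Q, i, j) == dkron2_fd(Q, i, j)
              for i in range(2) for j in range(2))
print(matches)
```

Integer entries are used on purpose so that the comparison involves no floating-point tolerance.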



Now, if $A$ and $B$ are matrices which are functions of $Q$ (generic matrices here, not the $\mathbf{A}$ and $\mathbf{B}$ of the question), by the same reasoning you get
$$
\frac{\partial(A\otimes B)}{\partial Q_{ij}} = \frac{\partial A}{\partial Q_{ij}}\otimes B + A\otimes \frac{\partial B}{\partial Q_{ij}} .
$$
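
The reasoning can be spelled out entrywise. As a sketch, index the rows of $A\otimes B$ by pairs $(a,c)$ and the columns by pairs $(b,d)$; this pair indexing is only a notational convenience introduced here.

$$
(A\otimes B)_{(a,c),(b,d)} = A_{ab}B_{cd}
\quad\Longrightarrow\quad
\frac{\partial (A\otimes B)_{(a,c),(b,d)}}{\partial Q_{ij}}
= \frac{\partial A_{ab}}{\partial Q_{ij}}\, B_{cd} + A_{ab}\,\frac{\partial B_{cd}}{\partial Q_{ij}} .
$$

This is the ordinary scalar product rule applied to each entry; reassembling the entries gives the two Kronecker products in the formula above.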



So you can iterate, for instance
$$
\begin{split}
\frac{\partial (Q\otimes Q\otimes Q)}{\partial Q_{ij}}
&= \frac{\partial(Q\otimes Q)}{\partial Q_{ij}}\otimes Q
+ (Q\otimes Q)\otimes \frac{\partial Q}{\partial Q_{ij}} \\
&= (E^{ij}\otimes Q+Q\otimes E^{ij})\otimes Q + (Q\otimes Q)\otimes E^{ij} \\
&= E^{ij}\otimes Q\otimes Q + Q\otimes E^{ij}\otimes Q + Q\otimes Q\otimes E^{ij}.
\end{split}
$$



Now you can prove by induction that
$$
\frac{\partial \bigl(\bigotimes_{n=1}^N Q\bigr)}{\partial Q_{ij}}
= \sum_{n=1}^N \left(\bigotimes_{h=1}^{n-1} Q\right) \otimes E^{ij} \otimes \left(\bigotimes_{h=n+1}^{N} Q\right).
$$

Written more concisely,
$$
\frac{\partial Q^{\otimes N}}{\partial Q_{ij}}
= \sum_{n=1}^N Q^{\otimes (n-1)}\otimes E^{ij} \otimes Q^{\otimes (N-n)} .
$$
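
To connect this back to the objective in the question, here is a hedged, dependency-free Python sketch. It assumes the chain rule takes the contracted form $\frac{\partial f}{\partial Q_{ij}} = -2\,(y-ABx)^T A\,\frac{\partial B}{\partial Q_{ij}}\,x$ with $B = Q^{\otimes 4}$, and it compares the result with central finite differences. All function names and the test data are invented for the illustration, and the loops favour readability over speed.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def kron_all(factors):
    """Kronecker product of a non-empty list of matrices, left to right."""
    out = factors[0]
    for M in factors[1:]:
        out = kron(out, M)
    return out


def unit(K, i, j):
    """E^{ij}: K x K matrix with a 1 at (i, j), zeros elsewhere (0-based)."""
    return [[1.0 if (r, c) == (i, j) else 0.0 for c in range(K)]
            for r in range(K)]


def dkron_power(Q, N, i, j):
    """Sum over n of Q^{kron (n-1)} kron E^{ij} kron Q^{kron (N-n)}."""
    E = unit(len(Q), i, j)
    terms = [kron_all([Q] * n + [E] + [Q] * (N - 1 - n)) for n in range(N)]
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*terms)]


def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def residual(Q, A, x, y, N=4):
    """r = y - A Q^{kron N} x."""
    Bx = matvec(kron_all([Q] * N), x)
    return [yi - pi for yi, pi in zip(y, matvec(A, Bx))]


def objective(Q, A, x, y, N=4):
    """f(Q) = ||y - A Q^{kron N} x||_2^2."""
    return sum(ri * ri for ri in residual(Q, A, x, y, N))


def gradient(Q, A, x, y, N=4):
    """df/dQ_ij = -2 r^T A (dB/dQ_ij) x for every (i, j)."""
    K = len(Q)
    r = residual(Q, A, x, y, N)
    # w = A^T r, so that r^T A v equals the dot product of w and v.
    w = [sum(row[c] * ri for row, ri in zip(A, r)) for c in range(len(x))]
    return [[-2.0 * sum(wc * vc for wc, vc in
                        zip(w, matvec(dkron_power(Q, N, i, j), x)))
             for j in range(K)] for i in range(K)]


def numerical_gradient(Q, A, x, y, h=1e-5):
    """Central finite differences of the objective, one entry at a time."""
    K = len(Q)
    G = [[0.0] * K for _ in range(K)]
    for i in range(K):
        for j in range(K):
            Qp = [row[:] for row in Q]
            Qm = [row[:] for row in Q]
            Qp[i][j] += h
            Qm[i][j] -= h
            G[i][j] = (objective(Qp, A, x, y)
                       - objective(Qm, A, x, y)) / (2 * h)
    return G


# Arbitrary small test data (K = 2, m = 3), chosen only for illustration.
Q = [[0.5, -0.3], [0.2, 0.8]]
x = [((3 * c) % 7 - 3) / 4 for c in range(16)]
A = [[(((a + 2) * (c + 1)) % 5 - 2) / 3 for c in range(16)] for a in range(3)]
y = [0.3, -0.2, 0.5]

G = gradient(Q, A, x, y)
G_fd = numerical_gradient(Q, A, x, y)
max_err = max(abs(g - n) for rg, rn in zip(G, G_fd) for g, n in zip(rg, rn))
scale = max(1.0, max(abs(g) for row in G for g in row))
print(len(G), len(G[0]))        # the gradient has the shape of Q: 2 2
print(max_err <= 1e-5 * scale)  # agreement with finite differences
```

The $(K^4\times K^4)\times(K\times K)$ tensor is never stored as a whole: each slice $\frac{\partial B}{\partial Q_{ij}}$ is built, contracted to a scalar, and discarded. For larger $K$ one would want to avoid forming even the slices explicitly, but that is beyond this sketch.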

















answered Dec 13 '18 at 16:47
Federico





























