extract characters between two commas?




















I have a file with about 3 million rows. Here are the first few lines of my file:



head out.txt
NA
NA
NA
NA
NA
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753,gene85754
gene85752,gene85753,gene85754
gene85752,gene85753,gene85754
gene85752,gene85753,gene85754
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752,gene85753
gene85752
gene85752


For rows that contain a ",", I want to keep only the text after the first comma and before the second comma (that is, the second field). Rows without a comma should stay unchanged.
This is my desired output:



outgood.txt
NA
NA
NA
NA
NA
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85753
gene85752
gene85752









      text-processing awk
















      asked 8 hours ago by Anna1364






















          3 Answers
            Since cut by default prints lines that contain no delimiter unchanged, the following works:



            cut -f2 -d, file
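As a small sanity check of that behaviour, here is a sketch on a few lines shaped like the question's input. The function names second_field and second_field_strict are made up for this illustration. The second helper adds the -s option, which tells cut to suppress lines that contain no delimiter, to show the contrast with the default:

```shell
# Hypothetical helper names, used only for this illustration.
second_field() { cut -d, -f2 "$@"; }            # default: comma-free lines pass through
second_field_strict() { cut -s -d, -f2 "$@"; }  # -s: comma-free lines are dropped

printf '%s\n' 'NA' 'gene85752,gene85753' 'gene85752,gene85753,gene85754' 'gene85752' | second_field
# prints:
# NA
# gene85753
# gene85753
# gene85752

printf '%s\n' 'NA' 'gene85752,gene85753' 'gene85752,gene85753,gene85754' 'gene85752' | second_field_strict
# prints:
# gene85753
# gene85753
```

For the desired output in the question, the default behaviour (without -s) is the one that applies.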
















            answered 8 hours ago by iruvar








            • It's nice when someone remembers the little quirks of standard tools.

              – Kusalananda
              8 hours ago














            awk -F, 'NF > 1 { $1 = $2 } { print $1 }' file


            This uses awk to parse the file as lines consisting of comma-delimited fields.



            The code detects when there is more than a single field on a line, and when there is, the first field is replaced by the second field. The first field, either unmodified or modified by the conditional code, is then printed.
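A minimal sketch of the same one-liner applied to three sample lines, wrapped in a function so it can be reused. The name keep_second is hypothetical and the sample lines are taken from the question:

```shell
# keep_second is a hypothetical wrapper around the awk one-liner above.
keep_second() {
    # If a line has more than one comma-separated field, copy field 2 into field 1;
    # then print field 1 for every line.
    awk -F, 'NF > 1 { $1 = $2 } { print $1 }' "$@"
}

printf '%s\n' 'NA' 'gene85752,gene85753,gene85754' 'gene85752' | keep_second
# prints:
# NA
# gene85753
# gene85752
```

Only $1 is printed, so the rest of the modified record never reaches the output.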

















            answered 8 hours ago by Kusalananda













            • With a big file, this would probably be faster: awk -F, '{print(NF>1 ? $2 : $1)}' -- since you won't have to rewrite $0

              – glenn jackman
              6 hours ago











            • @glennjackman Well, the cut solution would be even faster in any case.

              – Kusalananda
              6 hours ago






























                awk -F, 'NF == 1 {print $1}
                NF > 1 { print $2}' filename


                This prints the first field if the line contains no comma, and the second field if it contains one or more commas.
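One detail worth knowing, shown here as a sketch rather than as part of the original answer: for an empty input line awk sets NF to 0, so neither rule matches and the line is dropped from the output. The function name two_rules is made up for this illustration, and the sample in the question does not show any empty lines, so this may not matter for that file:

```shell
# two_rules is a hypothetical wrapper around the two-rule awk program above.
two_rules() {
    awk -F, 'NF == 1 { print $1 }
             NF > 1  { print $2 }' "$@"
}

# The second input line is empty: NF is 0 there, so nothing is printed for it.
printf '%s\n' 'NA' '' 'gene85752,gene85753' | two_rules
# prints:
# NA
# gene85753
```

If empty lines need to be preserved, changing the first condition to NF <= 1 would print them as empty lines.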

















                answered 8 hours ago by unxnut





























