How can I replace a line ending with fixed text when the next line begins with a defined set of characters?











I have several big files with some measurements.



The data looks like this:



N 12344;PE 9.9999999;...
#S 0 0 31 44 75 130 165 196...
#S_+ "2 5 2 3 3 1 1 2 3 1 2 2...

N 12345;PE 9.9999999;...
#S 0 0 34 57 84 133 152...
#S_+ "1 0 1 1 2 3 0 0 0...

N 12346;PE 9.9999999;...
#S 0 0 31 44 73 140 169...
#S_+ "3 3 4 0 0 2 1 2 4...

N 25104;PE 9.9999999;...
#S 0 0 36 52 102 108 145...
#S_+ "1 1 0 1 0 0 3 0 1...

N 25105;PE 9.9999999;...
#S 0 0 32 58 88 130 143...


Sample is here:
http://pasted.co/d9806b7c4



The file is much bigger but I replaced part of the data with "..." to make it shorter.



I need to replace the line endings before "#S", that is, merge each "N" line with the following two lines into one line (or with the following three, so that I also get rid of the blank lines). I expect output like this:



N 12344;PE 9.9999999; #S 0 0 31 44 75 130 165 196 #S_+ "2 5 2 3 3 1 1 2 3 1 2 2...
N 12345;PE 9.9999999; #S 0 0 34 57 84 133 152 #S_+ "1 0 1 1 2 3 0 0 0...
N 12346;PE 9.9999999; #S 0 0 31 44 73 140 169 #S_+ "3 3 4 0 0 2 1 2 4...
N 25104;PE 9.9999999; #S 0 0 36 52 102 108 145 #S_+ "1 1 0 1 0 0 3 0 1...
N 25105;PE 9.9999999; #S 0 0 32 58 88 130 143...


Is it possible to achieve this with a command-line utility in Linux?



My knowledge is quite limited in this area so I would appreciate any help.



thanks










  • thanks to grawity for helping me with the code :-)
    – Juhele
    Nov 21 at 14:09






  • @Pimp Juice IT: OK, I updated the question.
    – Juhele
    Nov 21 at 14:14










  • Hi @Juhele, can you specify the output format more precisely? Do you need to cut the first line after, e.g., PE 9.9999999;, do you need to cut the second line after the 7th (8th) number, or, as you write, merge the "N" line with the following two? What about the " that is present only in the output? I made some edits to your post, please check them. Could the file be incomplete? BTW, for the simplest case you already have more than one good answer.
    – Hastur
    Nov 22 at 9:09












  • The " is in both the input and the output, as in #S_+ "2 5. It is not an important character for me (I am going to remove it in the next processing step); it is simply part of the input data. "do you need to cut the second after the 7th (8th) number": no, I just shortened the example, as the data has several hundred "columns".
    – Juhele
    Dec 4 at 9:05






  • From personal experience with acquisition devices: do redundancy checks (the more the better). I suppose you already know this, but with enough lines it does not matter how small the likelihood of corrupted data is; if it is not zero, corruption will occur. Often it is enough to run awk '{print NF}' YourFILE | sort -n -u | uniq -c to see whether you have the same number of columns in (almost, because of headers) each line, or a consistent structure (3 lines of data, 1 blank...).
    – Hastur
    Dec 4 at 11:09

















linux command-line regex






edited Nov 22 at 11:39 by Toto
asked Nov 21 at 13:52 by Juhele












6 Answers
With sed:

sed -z -e 's/\n#S/ #S/g' -e 's/\nN /N /g' data

In slow-mo:

  • -z (a GNU sed option) makes sed split its input on NUL bytes instead of newlines, so a text file is treated as a single line and the line ends become plain characters
  • -e 's/\n#S/ #S/g' replaces every LF occurring just before a #S with a space
  • -e 's/\nN /N /g' removes every LF occurring just before "N " (i.e., it removes the blank lines)






  • Hmm, it looks like I will have to adjust it a little, as "sed -z -e 's/\n#S/ #S/g' -e 's/\nN /N /g' test1.txt >> test1_mod.txt" still does not do the same as replacing "\r\n#S" with "#S" in Notepad++: a CR is left at the end of the line before "#S".
    – Juhele
    Dec 4 at 10:59

  • OK, "sed -z -e 's/\r\n#S/ #S/g' -e 's/\nN /N /g' test1.txt >> test1_mod.txt" finally turns the multiple lines into one. Only a blank line remains, as the "N..." line ends with CR LF (this is fine) and then there is a blank line with a CR.
    – Juhele
    Dec 4 at 11:09
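
The comments above indicate that the real files use Windows (CRLF) line endings. One way to handle that, sketched here under the assumption that GNU sed is available and that no CR needs to survive, is to strip every CR with tr before applying the same sed expressions. The file name sample.txt and its contents are hypothetical, modelled on the question.

```shell
# Build a small CRLF sample (hypothetical data modelled on the question).
printf 'N 1;PE 9;\r\n#S 0 0 31\r\n#S_+ "2 5\r\n\r\nN 2;PE 9;\r\n#S 0 0 34\r\n' > sample.txt

# Remove every CR, then join the lines exactly as in the answer (-z needs GNU sed).
merged=$(tr -d '\r' < sample.txt | sed -z -e 's/\n#S/ #S/g' -e 's/\nN /N /g')
printf '%s\n' "$merged"
```

The result has plain LF line endings; if the next processing step expects CRLF, it would have to be converted back.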


















With paste (this requires the input to always consist of groups of 4 lines):

paste -s -d '   \n' data

In slo-mo:

  • paste -s concatenates the lines of the file
  • -d specifies the characters to insert as delimiters. When there are several characters, they are used in round-robin fashion, so with 3 spaces and a LF:
    • the first space is used on the first splice (N to #S),
    • the second space is used on the second splice (#S to #S_+),
    • the third space is used on the third splice (#S_+ to blank line),
    • the last delimiter, a LF, is used on the fourth splice (blank line to N),
    • and the cycle repeats for the next 4 lines.
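
Since the paste approach depends on the 4-line grouping, it may be worth checking that assumption before merging. The following is only a sketch: the file name sample.txt and its contents are hypothetical, and the awk check simply counts lines that break the "three data lines, one blank line" pattern.

```shell
# Hypothetical sample: two complete groups of 4 lines (N, #S, #S_+, blank).
printf 'N 1;PE 9;\n#S 0 0 31\n#S_+ "2 5\n\nN 2;PE 9;\n#S 0 0 34\n#S_+ "1 0\n\n' > sample.txt

# Count lines that break the pattern: every 4th line must be empty, all others non-empty.
bad=$(awk '(NR % 4 == 0) != ($0 == "") { n++ } END { print n + 0 }' sample.txt)

if [ "$bad" -eq 0 ]; then
    merged=$(paste -s -d '   \n' sample.txt)
    printf '%s\n' "$merged"
else
    echo "input is not in groups of 4 lines" >&2
fi
```

Each merged line ends with a space, because the delimiter placed before the blank fourth line is a space.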








This is a portable solution with POSIX sed, implementing the following rules:

  • empty lines shall be deleted;
  • any line starting with #S shall be merged with the previous non-empty line, with a single space character between them, unless there is no previous non-empty line.

The code:

<data sed '/^$/ d; :start; N; s/\n$//; t start; s/\n#S/ #S/; t start; P; D'

The same with comments (still working code):

<data sed '
/^$/ d        # If empty line read, delete it and start a new cycle.
:start        # A label.
N             # Read additional line, there are now two lines in the pattern space.
s/\n$//       # If the second line is empty, replace the newline with nothing.
t start       # If the above replacement occurred, go to start (to add another line).
              # Otherwise
s/\n#S/ #S/   # if the second line starts with #S, replace the newline with space.
t start       # If the above replacement occurred, go to start (to add another line).
              # Otherwise
              # (i.e. when a non-empty line not starting with #S occurred)
P             # print the pattern space up to the first newline and...
D             # delete the initial segment of the pattern space
              # through the first newline (i.e. everything just printed),
              # and start the next cycle with the resultant pattern space
              # and without reading any new input
              # (in our case the new input will be explicitly read by N then).
'

Note that the solution uses the sed pattern space to accumulate many input lines. This remark from the POSIX specification applies:

    The pattern and hold spaces shall each be able to hold at least 8192 bytes.

Just before the P command the pattern space holds one (relatively long) line meant to be printed and a single (relatively short) input line, plus a newline in between. Whether or not such a structure exceeds 8192 bytes at some point depends on your data. If it does, some sed implementations may fail.
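
To judge whether the 8192-byte minimum could matter for a given file, one can estimate the length of the longest merged line beforehand. This is a rough sketch, not part of the answer: the file name sample.txt and its contents are hypothetical, and awk's length() counts characters, which equals bytes only for single-byte data such as the ASCII sample here.

```shell
# Hypothetical sample modelled on the question.
printf 'N 1;PE 9;\n#S 0 0 31\n#S_+ "2 5\n\nN 2;PE 9;\n#S 0 0 34\n' > sample.txt

# For each record (an N line plus its #S lines) add up the line lengths,
# plus one joining space per appended line, and remember the largest total.
longest=$(awk '
    /^$/  { next }                                    # blank lines are dropped by the merge
    /^#S/ { len += length($0) + 1; if (len > max) max = len; next }
          { len = length($0); if (len > max) max = len }
    END   { print max + 0 }
' sample.txt)
echo "longest merged line: $longest characters"
```

Because the pattern space also holds the next input line while a merged line waits to be printed, some headroom below 8192 is advisable when interpreting the number.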






Using Perl:

perl -0 -ape 's/\R(?=\RN|#)/ /g' file.txt
N 12344;PE 9.9999999;... #S 0 0 31 44 75 130 165 196... #S_+ "2 5 2 3 3 1 1 2 3 1 2 2...
N 12345;PE 9.9999999;... #S 0 0 34 57 84 133 152... #S_+ "1 0 1 1 2 3 0 0 0...
N 12346;PE 9.9999999;... #S 0 0 31 44 73 140 169... #S_+ "3 3 4 0 0 2 1 2 4...
N 25104;PE 9.9999999;... #S 0 0 36 52 102 108 145... #S_+ "1 1 0 1 0 0 3 0 1...
N 25105;PE 9.9999999;... #S 0 0 32 58 88 130 143...

Regex explanation:

s/       : substitute
\R       : any kind of line break (i.e. \r, \n, \r\n)
(?=      : positive lookahead, a zero-length assertion that makes sure that what follows is
  \RN    : a line break followed by the letter N
  |      : OR
  #      : a # character
)        : end of lookahead
/ /g     : replace with a space, globally





awk (gawk [1])

As usual, besides sed you can use awk (and in many different ways...):

awk 'ORS=" "; NR % 4 == 0 && ORS="\n" ' data

where

  • ORS=" " sets the output record separator, by default a newline, to a space (you can change it)
  • NR % 4 == 0 && ORS="\n" sets it back to a newline \n on every 4th line
  • If nothing else is specified, awk prints the full line
  • data is your data file.

If you want, you can use a regex as in sed (in a similar way).
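
As an illustration of the regex-based variant mentioned above, here is one possible sketch; it is not the answer's own code. It keys on the line content (blank lines and lines starting with #S) instead of on line numbers, so it does not depend on groups of exactly 4 lines. The file name sample.txt and its contents are hypothetical.

```shell
# Hypothetical sample modelled on the question.
printf 'N 1;PE 9;\n#S 0 0 31\n#S_+ "2 5\n\nN 2;PE 9;\n#S 0 0 34\n' > sample.txt

merged=$(awk '
    /^$/  { next }                                    # skip blank lines
    /^#S/ { line = line " " $0; next }                # append #S lines to the current record
          { if (line != "") print line; line = $0 }   # any other line starts a new record
    END   { if (line != "") print line }              # flush the last record
' sample.txt)
printf '%s\n' "$merged"
```

Unlike the ORS version, this leaves no trailing space at the end of each merged line.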





A format check version with awk

Even if not requested, you may want to handle a truncated file by dropping the corrupted output line, printing an error message, and returning an error code.

awk '{a=$0; getline b; getline c;
if ( getline > 0 ) {print a, b, c, $0 }
else { print "Ohi " > "/dev/stderr" ; exit 65; } }' data

where

  • a=$0; puts the full line in the variable a
  • getline b; reads the next line and puts it in the variable b
  • getline c; does the same for the variable c
  • if ( getline > 0 ) if it is able to read a fourth line...
  • ...{print a, b, c, $0} prints the 4 lines, separated by spaces
  • else prints an error to the stderr device (screen or other); you can customize it here...
  • exit 65 returns an exit code different from 0, signalling an error
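
A calling script can act on that exit code. The sketch below runs the same awk program on a deliberately truncated, hypothetical file (truncated.txt) and captures the status; the variable names and the redirection of the error message to /dev/null are choices made only for this illustration.

```shell
# Hypothetical truncated input: the second group stops after its #S line.
printf 'N 1;PE 9;\n#S 0 0 31\n#S_+ "2 5\n\nN 2;PE 9;\n#S 0 0 34\n' > truncated.txt

status=0
merged=$(awk '{ a = $0; getline b; getline c
    if ((getline) > 0) { print a, b, c, $0 }
    else { print "Ohi " > "/dev/stderr"; exit 65 } }' truncated.txt 2>/dev/null) || status=$?

echo "exit status: $status"
if [ "$status" -ne 0 ]; then
    echo "input looks truncated; only complete groups were merged" >&2
fi
```

The complete first group is still printed before awk stops, so the caller has to decide whether to keep or discard that partial output.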


Bonus: why 65?

Searching for a good value for your exit code [2], you may find the suggestion to look in /usr/include/sysexits.h, among some C standards...

  #define EX_DATAERR      65      /* data format error */

65 is the most appropriate for a data format error...

Honestly, as an answer I preferred 42, but any value different from zero (and not reserved [2]) would do, and 65 is the specific one...






  • One disadvantage though: the last pack of lines may consist of three of them (i.e. no empty line at the very end), or may not. If three, then the last character of your output is a space, not a newline. POSIX defines "line" as a sequence of zero or more non-<newline> characters plus a terminating <newline> character. This will probably backfire if the output is parsed further.
    – Kamil Maciorowski
    Nov 22 at 9:36

  • Nice thought, but the OP, among some other points not completely specified, states that there are sets of 4 lines, the last of them blank. With a truncated file, the next (unknown) processing step may be compromised anyway. A format check that was not requested is out of scope for this thread, and IMHO a good practice is to generate an error. If you require robustness, it is better to opt for a script (awk, sed, perl are scripting languages), which also allows you to reproduce the data processing. Then you have to decide how to deal with errors, but that is another question... :-) I just try to keep it simple.
    – Hastur
    Nov 22 at 10:45

  • @KamilMaciorowski ... nonetheless I added another version with error check...
    – Hastur
    Nov 22 at 11:29


















You can do it with any text editor that supports regular expressions, such as Notepad++.

A new line is just a non-printable character, or two characters: in Windows usually CarriageReturn and LineFeed, and in Unix-based systems usually LineFeed only.

To see them you need to turn on the display of non-printable characters (usually a paragraph icon).
See here: https://imgur.com/cqiTvrp

Now use the regular-expression replace dialog (CTRL + H) to replace CRLF#S with #S.
The symbol for CR is \r and for LF is \n, so you end up replacing \r\n#S or \n#S with #S.
https://imgur.com/GoeVn70

Or you can replace it with a space followed by #S if you need a separator between the merged lines.






  • The question is tagged "Linux"....
    – xenoid
    Nov 21 at 14:16

  • I think regular expressions in Geany are the same. I used Notepad++ as an example because I am currently on Windows.
    – KaRolthas
    Nov 21 at 14:20

  • The question also asks for a command-line utility...
    – xenoid
    Nov 21 at 14:22

  • Nice, works. I need to process at least a few files now, so even Notepad++ helps when I am working on my other machine with Windows. Thanks.
    – Juhele
    Nov 21 at 14:44











The sed -z answer was answered Nov 21 at 14:32 by xenoid.












        • hmm, looks I will have to adjust it a little bit as "sed -z -e 's/n#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" still does not do the same as replacing "rn#S" with "#S" in Notepad++ so there is still CR left on the previous line before "#S".
          – Juhele
          Dec 4 at 10:59










        • OK, "sed -z -e 's/rn#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" finally turns the multiple lines into one. Just blank line remains as the "N..." line ends with CR LF (this is fine) and then there is a blank line with CR.
          – Juhele
          Dec 4 at 11:09






























        up vote
        4
        down vote













        With paste (this requires the input to always consist of groups of 4 lines):



         paste -s -d '   \n' data


        In slo-mo:





        • paste -s concatenates the lines from the file


        • -d specifies characters to be inserted as delimiters. When there are several characters, they are used in a round-robin fashion, so with 3 spaces and a LF:


          • the first space is used on the first splice (N to #S),

          • the second space is used on the second splice (#S to #S_+),

          • the third space is used on the third splice (#S_+ to blank line),

          • the last delimiter, a LF, is used on the fourth splice (blank line to N),

          • and the cycle repeats for the next 4 lines.
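The round-robin use of the delimiters can be tried on a small file. This is a sketch only: the temporary file and its shortened contents are invented for illustration.

```shell
# Build a small sample: two groups of 4 lines (N, #S, #S_+, blank).
sample=$(mktemp)
printf '%s\n' 'N 1;PE 9.9;' '#S 0 0 31' '#S_+ "2 5 2' '' \
              'N 2;PE 9.9;' '#S 0 0 34' '#S_+ "1 0 1' '' > "$sample"

# Three spaces splice lines 1-4 of each group; the newline ends the group.
joined=$(paste -s -d '   \n' "$sample")
printf '%s\n' "$joined"
rm -f "$sample"
```

Each output line ends with one trailing space, because the third space joins the #S_+ line to the (empty) blank line.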








            answered Nov 21 at 14:42









            xenoid























                up vote
                4
                down vote













                This is a portable solution with POSIX sed, implementing the following rules:




                • empty lines shall be deleted;

                • any line starting with #S shall be merged with the previous non-empty line, with a single space character between them, unless there is no previous non-empty line.


                The code:



                <data sed '/^$/ d; :start; N; s/\n$//; t start; s/\n#S/ #S/; t start; P; D'


                The same with comments (still working code):



                <data sed '
                /^$/ d       # If empty line read, delete it and start a new cycle.
                :start       # A label.
                N            # Read additional line, there are now two lines in the pattern space.
                s/\n$//      # If the second line is empty, replace the newline with nothing.
                t start      # If the above replacement occurred, go to start (to add another line).
                             # Otherwise
                s/\n#S/ #S/  # if the second line starts with #S, replace the newline with space.
                t start      # If the above replacement occurred, go to start (to add another line).
                             # Otherwise
                             # (i.e. when non-empty line not starting with #S occurred)
                P            # print the pattern space up to the first newline and...
                D            # delete the initial segment of the pattern space
                             # through the first newline (i.e. everything just printed),
                             # and start the next cycle with the resultant pattern space
                             # and without reading any new input
                             # (in our case the new input will be explicitly read by N then).
                '
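To see the two rules at work on irregular input, here is a sketch with an invented sample containing two consecutive blank lines and a last group without #S_+. It assumes GNU sed, which prints the pattern space when N runs out of input; the GNU manual notes that most other versions exit without printing in that case, so the last record is worth checking elsewhere.

```shell
# Hypothetical sample: extra blank line between groups, incomplete last group.
sample=$(mktemp)
printf '%s\n' 'N 1;PE 9.9;' '#S 0 0 31' '#S_+ "2 5 2' '' '' \
              'N 2;PE 9.9;' '#S 0 0 34' > "$sample"

# Same script as above: drop empty lines, glue #S lines to the previous line.
merged=$(sed '/^$/ d; :start; N; s/\n$//; t start; s/\n#S/ #S/; t start; P; D' < "$sample")
printf '%s\n' "$merged"
rm -f "$sample"
```

Unlike the fixed-group approaches, this does not depend on every record having exactly four lines.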


                Note that the solution uses the sed pattern space to accumulate many input lines. This remark applies:




                The pattern and hold spaces shall each be able to hold at least 8192 bytes.




                Just before the P command the pattern space holds one (relatively long) line meant to be printed and a single (relatively short) input line, plus a newline in between. Whether such a structure ever exceeds 8192 bytes depends on your data. If it does, some sed implementations may fail.
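A rough way to judge whether the 8192-byte figure matters for your data is to measure the longest merged line. The sketch below uses a made-up result file; it counts characters, which equals bytes only for plain ASCII data, and it ignores the extra input line that sits in the pattern space next to the merged one.

```shell
# Hypothetical merged output; in practice this would be the sed result.
merged_file=$(mktemp)
printf '%s\n' 'N 1;PE 9.9; #S 0 0 31 #S_+ "2 5 2' 'N 2;PE 9.9; #S 0 0 34' > "$merged_file"

# Length of the longest line in the file.
longest=$(awk 'length($0) > max { max = length($0) } END { print max + 0 }' "$merged_file")

if [ "$longest" -gt 8192 ]; then
    echo "longest line ($longest) exceeds 8192"
else
    echo "longest line ($longest) fits in 8192"
fi
rm -f "$merged_file"
```

If the number comes close to the limit, one of the approaches that does not accumulate lines in the pattern space may be a safer choice.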






                    edited Nov 22 at 6:35

























                    answered Nov 21 at 18:17









                    Kamil Maciorowski























                        up vote
                        3
                        down vote













                        Using Perl:



                        perl -0 -ape 's/\R(?=\RN|#)/ /g' file.txt
                        N 12344;PE 9.9999999;... #S 0 0 31 44 75 130 165 196... #S_+ "2 5 2 3 3 1 1 2 3 1 2 2...
                        N 12345;PE 9.9999999;... #S 0 0 34 57 84 133 152... #S_+ "1 0 1 1 2 3 0 0 0...
                        N 12346;PE 9.9999999;... #S 0 0 31 44 73 140 169... #S_+ "3 3 4 0 0 2 1 2 4...
                        N 25104;PE 9.9999999;... #S 0 0 36 52 102 108 145... #S_+ "1 1 0 1 0 0 3 0 1...
                        N 25105;PE 9.9999999;... #S 0 0 32 58 88 130 143...


                        Regex explanation:



                        s/      : substitute
                          \R    : any kind of line break (i.e. \r, \n, \r\n)
                          (?=   : positive lookahead, a zero-length assertion that makes sure the line break is followed by
                            \RN : a line break followed by the letter N
                            |   : OR
                            #   : a # character
                          )     : end of lookahead
                        / /g    : replace with a space, globally





                            edited Nov 21 at 16:31

























                            answered Nov 21 at 15:58









                            Toto























                                up vote
                                3
                                down vote













                                awk (gawk [1])



                                As usual, besides sed you can use awk (and in many different ways...)



                                awk 'ORS=" "; NR % 4 == 0 && ORS="\n" ' data


                                where





                                • ORS=" " sets the output record separator, by default a newline, to a space (you can choose another string)


                                • NR % 4 == 0 && ORS="\n" sets it back to a newline, \n, on every 4th line

                                • If nothing else is specified, awk prints the full line


                                • data is your data file.
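A quick trial on a small file shows what the one-liner produces. Sketch only: the temporary file and its contents are invented for illustration.

```shell
# Hypothetical sample: two complete groups of 4 lines (N, #S, #S_+, blank).
sample=$(mktemp)
printf '%s\n' 'N 1;PE 9.9;' '#S 0 0 31' '#S_+ "2 5 2' '' \
              'N 2;PE 9.9;' '#S 0 0 34' '#S_+ "1 0 1' '' > "$sample"

# Every line is printed followed by a space; the 4th line of each group is
# printed a second time, this time followed by a newline.
joined=$(awk 'ORS=" "; NR % 4 == 0 && ORS="\n" ' "$sample")
printf '%s\n' "$joined"
rm -f "$sample"
```

Because the blank 4th line matches both patterns, it is printed twice, so each output line ends with two trailing spaces. That is harmless here only because that line is empty.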


                                If you want you can use regex as in sed (in a similar way).





                                A format check version with awk



                                Even if not requested, you may want to handle a truncated file by dropping the corrupted output line, printing an error message and returning an error status.



                                awk '{a=$0; getline b; getline c; 
                                if ( getline > 0 ) {print a, b, c, $0 }
                                else { print "Ohi " > "/dev/stderr" ; exit 65; } }' data


                                where





                                • a=$0; puts the full line in the variable a


                                • getline b; reads the next line and puts it in the variable b


                                • getline c; obscure unfathomable command :-) (the same again, for the variable c)


                                • if ( getline > 0 ) if it is able to read one more line (into $0)...

                                • ..............{print a, b, c, $0} it prints the 4 lines on a single output line


                                • else it prints an error message to stderr (the screen or wherever it is redirected); you can customize the message here...


                                • exit 65 returns an exit code different from 0, i.e. an error
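The error path can be exercised with a deliberately truncated file. This is a sketch: the sample data is invented, stderr is discarded to keep the output quiet, and the `|| status=$?` idiom is my addition for capturing the exit code.

```shell
# Hypothetical sample: one complete group, then a group cut short after 2 lines.
sample=$(mktemp)
printf '%s\n' 'N 1;PE 9.9;' '#S 0 0 31' '#S_+ "2 5 2' '' \
              'N 2;PE 9.9;' '#S 0 0 34' > "$sample"

status=0
out=$(awk '{a=$0; getline b; getline c;
if ( getline > 0 ) {print a, b, c, $0 }
else { print "Ohi " > "/dev/stderr" ; exit 65; } }' "$sample" 2>/dev/null) || status=$?

printf 'exit status: %s\n' "$status"
printf '%s\n' "$out"
rm -f "$sample"
```

Only the complete first group reaches standard output; the truncated second group is dropped and the exit status is 65, which a calling script can test.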


                                Bonus: why 65?



                                Searching for a good value for your exit code [2], you may find the suggestion to look in the C header /usr/include/sysexits.h...



                                  #define EX_DATAERR      65      /* data format error */


                                65 is the most appropriate one for a data format error...



                                Honestly, as an answer I would have preferred 42,

                                but any value different from zero (and not reserved [2]) would do, and 65 is the specific one...





























                                • One disadvantage though: the last pack of lines may consist of three of them (i.e. no empty line at the very end); or may not. If three, then the last character of your output is a space, not a newline. POSIX defines "line" as a sequence of zero or more non-<newline> characters plus a terminating <newline> character. This will probably backfire if the output is parsed further.
                                  – Kamil Maciorowski
                                  Nov 22 at 9:36










                                • Nice thought, but the OP, among some other points not completely specified, states that there are sets of 4 lines, the last of them blank. With a truncated file, the next (unknown) processing step may be compromised anyway. A format check that was not requested is out of this thread's scope, and IMHO good practice is to generate an error. If you require robustness, it is better to opt for a script (awk, sed, perl are scripting languages), which also allows you to reproduce the data processing. Then you have to decide how to deal with errors, but that is another question... :-) I just try to keep it simple.
                                  – Hastur
                                  Nov 22 at 10:45












                                • @KamilMaciorowski ... nonetheless I added another version with error check...
                                  – Hastur
                                  Nov 22 at 11:29
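The missing final newline discussed in these comments can be detected after the fact. A minimal sketch, assuming an invented three-line input and two temporary files; the check relies on command substitution stripping a trailing newline.

```shell
# Hypothetical input whose last group has only three lines (no blank line).
sample=$(mktemp)
result=$(mktemp)
printf '%s\n' 'N 1;PE 9.9;' '#S 0 0 31' '#S_+ "2 5 2' > "$sample"
awk 'ORS=" "; NR % 4 == 0 && ORS="\n" ' "$sample" > "$result"

# If the last byte is a newline, $(...) strips it and the test string is empty.
if [ -n "$(tail -c 1 "$result")" ]; then
    last_line_terminated=no
else
    last_line_terminated=yes
fi
printf 'last line terminated: %s\n' "$last_line_terminated"
rm -f "$sample" "$result"
```

For this input the result is "no": the output ends with a space, so tools that expect newline-terminated lines may mishandle the last record.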















                                up vote
                                3
                                down vote













                                awk (gawk [1])



                                As usually other than sed you can use awk (and in many different ways...)



                                awk 'ORS=" "; NR % 4 == 0 && ORS="n" ' data


                                where





                                • ORS=" " fixes the output record separator, by default a newline, to a space (you can change)


                                • NR % 4 == 0 && ORS="n" each 4th line it fixes back to the newline n

                                • If nothing else is specified awk prints the full line


                                • data is your data file.


                                If you want you can use regex as in sed (in a similar way).





                                A format check version with awk



                                Even if not requested, you may want to manage a truncated file eliminating the corrupted output line and generating an error and an error message.



                                awk '{a=$0; getline b; getline c; 
                                if ( getline > 0 ) {print a, b, c, $0 }
                                else { print "Ohi " > "/dev/stderr" ; exit 65; } }' data


                                where





                                • a=$0; puts the full line in the variable a


                                • getline b; reads a line and puts the variable b


                                • getline c; obscure unfathomable command :-)


                                • if (getline) if it is able to read a line...

                                • ..............{print a, b, c, $0} prints the 4 lines


                                • else prints an error on the stderr device (screen or other) you can custom here...


                                • exit 65 return an exit code different from 0 --->error


                                Bonus: why 65?



                                Searching for a good value for your exit code [2] you may found that it is suggested to see in /usr/include/sysexits.h among some C standards...



                                  #define EX_DATAERR      65      /* data format error */


                                65 is the most appropriate for the a data format error...



                                Honestly as answer I preferred 42,

                                but each value different from zero (and not reserved[2]) could be good and 65 is the specific one...






                                share|improve this answer























                                • One disadvantage though: the last pack of lines may consist of three of them (i.e. no empty line at the very end); or may not. If three, then the last character of your output is space, not a newline. POSIX defines "line" as a sequence of zero or more non- <newline> characters plus a terminating <newline> character. This will probably backfire if the output is parsed further.
                                  – Kamil Maciorowski
                                  Nov 22 at 9:36










                                • Nice though, but the OP, among some other points not completely specified, states that are sets of 4 lines, last of them blank. With a truncated file the next unknown processing may be however compromised. A not requested formats check is out of this thread scope, and IMHO a good practice is to generate an error. If you require solidity it is better to opt for a script (awk,sed,perl are scripting languages) that also allows you to reproduce the data processing. Then you have to decide how to deal with errors, but that is another quesiton...:-) I just try to keep it simple.
                                  – Hastur
                                  Nov 22 at 10:45












                                • @KamilMaciorowski ... nonetheless I added another version with error check...
                                  – Hastur
                                  Nov 22 at 11:29













                                up vote
                                3
                                down vote










                                up vote
                                3
                                down vote









                                awk (gawk [1])



                                As usually other than sed you can use awk (and in many different ways...)



                                awk 'ORS=" "; NR % 4 == 0 && ORS="n" ' data


                                where





                                • ORS=" " fixes the output record separator, by default a newline, to a space (you can change)


                                • NR % 4 == 0 && ORS="n" each 4th line it fixes back to the newline n

                                • If nothing else is specified awk prints the full line


                                • data is your data file.


                                If you want you can use regex as in sed (in a similar way).





                                A format check version with awk



                                Even if not requested, you may want to manage a truncated file eliminating the corrupted output line and generating an error and an error message.



                                awk '{a=$0; getline b; getline c; 
                                if ( getline > 0 ) {print a, b, c, $0 }
                                else { print "Ohi " > "/dev/stderr" ; exit 65; } }' data


                                where





                                • a=$0; puts the full line in the variable a


                                • getline b; reads a line and puts the variable b


                                • getline c; obscure unfathomable command :-)


                                • if (getline) if it is able to read a line...

                                • ..............{print a, b, c, $0} prints the 4 lines


                                • else prints an error on the stderr device (screen or other) you can custom here...


                                • exit 65 return an exit code different from 0 --->error


                                Bonus: why 65?



                                Searching for a good value for your exit code [2] you may found that it is suggested to see in /usr/include/sysexits.h among some C standards...



                                  #define EX_DATAERR      65      /* data format error */


                                65 is the most appropriate for the a data format error...



                                Honestly as answer I preferred 42,

                                but each value different from zero (and not reserved[2]) could be good and 65 is the specific one...






                                share|improve this answer














                                awk (gawk [1])



                                As usually other than sed you can use awk (and in many different ways...)



                                awk 'ORS=" "; NR % 4 == 0 && ORS="n" ' data


                                where





                                • ORS=" " sets the output record separator, by default a newline, to a space (you can choose another separator)

                                • NR % 4 == 0 && ORS="\n" sets it back to the newline \n on every 4th line

                                • If no action is specified, awk prints the full line

                                • data is your data file.

                                If you want, you can also use a regex, much as in sed.
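One case the one-liner does not cover is a final group of only three lines (no blank line at the very end): the output then ends with a space instead of a newline. A possible variant, sketched here under the assumption that groups are counted purely by line number, adds the missing newline in an END block. The function name join_four and the file groups.txt are invented for the example:

```shell
# join_four is a hypothetical name; it joins every 4 input lines into one.
join_four() {
    awk '{ printf "%s%s", $0, (NR % 4 == 0 ? "\n" : " ") }
         END { if (NR % 4 != 0) print "" }' "$1"
}

# Made-up sample: a complete group followed by a group without the blank line.
printf 'a\nb\nc\n\nd\ne\nf\n' > groups.txt
join_four groups.txt
```

Unlike the format-check version, this accepts the short last group silently instead of reporting an error, so it suits the case where a missing final blank line is considered harmless.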













                                edited Nov 22 at 11:49

























                                answered Nov 21 at 22:28









                                Hastur

                                13k53266
















                                • One disadvantage though: the last pack of lines may consist of three of them (i.e. no empty line at the very end); or may not. If three, then the last character of your output is space, not a newline. POSIX defines "line" as a sequence of zero or more non- <newline> characters plus a terminating <newline> character. This will probably backfire if the output is parsed further.
                                  – Kamil Maciorowski
                                  Nov 22 at 9:36










                                • Nice thought, but the OP, among other points not completely specified, states that the data come in sets of 4 lines, the last of them blank. With a truncated file, any further unknown processing may be compromised anyway. A format check that was not requested is outside the scope of this thread, and IMHO good practice is to generate an error. If you need robustness, it is better to opt for a script (awk, sed and perl are scripting languages), which also lets you reproduce the data processing. Then you have to decide how to deal with errors, but that is another question... :-) I just tried to keep it simple.
                                  – Hastur
                                  Nov 22 at 10:45












                                • @KamilMaciorowski ... nonetheless I added another version with error check...
                                  – Hastur
                                  Nov 22 at 11:29









































                                You can do it with any text editor that supports regular expressions, such as Notepad++.

                                A line break is just one or two non-printable characters: on Windows usually Carriage Return followed by Line Feed (CRLF), and on Unix-based systems usually Line Feed (LF) only.

                                To see them, turn on the display of non-printable characters (usually a paragraph icon).
                                See here: https://imgur.com/cqiTvrp

                                Now use the regular-expression replace dialog (CTRL + H) to replace CRLF#S with #S.
                                The symbol for CR is \r and for LF is \n, so you end up replacing \r\n#S (or \n#S) with #S.
                                https://imgur.com/GoeVn70

                                Or you can replace it with a space followed by #S if you need a separator.
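Since the question asks for a command-line tool, the same idea (drop the line break in front of every line that starts with #S) can be sketched in awk. This is only one possible translation of the editor replacement, not the answer's own method. The function name join_s_lines and the file sample.txt are made up, and note one difference: any trailing CR is stripped from every line, so the output always uses LF line endings.

```shell
# join_s_lines is a hypothetical name; it appends every line that starts
# with "#S" to the previous line, separated by a space.
join_s_lines() {
    awk '{ sub(/\r$/, "") }                       # drop a trailing CR, if any
         /^#S/  { printf " %s", $0; next }        # continue the previous line
         NR > 1 { printf "\n" }                   # close the previous line
                { printf "%s", $0 }               # start a new output line
         END    { if (NR) printf "\n" }' "$1"
}

# Made-up sample shaped like the data in the question.
printf 'N 1;\n#S 0 0 31\n#S_+ "2 5\n\nN 2;\n#S 0 0 34\n' > sample.txt
join_s_lines sample.txt
```

Blank lines between groups are kept as they are, exactly as in the editor version, and #S_+ lines are joined too because they also begin with #S.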



























                                • The question is tagged "Linux"....
                                  – xenoid
                                  Nov 21 at 14:16










                                • I think regular expressions in Geany are the same. I used Notepad++ as an example because I am currently on Windows.
                                  – KaRolthas
                                  Nov 21 at 14:20












                                • The question also asks for a command-line utility...
                                  – xenoid
                                  Nov 21 at 14:22










                                • Nice, works. I need to somehow process at least few files now so even Notepad++ helps when I am working on my other machine with Windows. thanks
                                  – Juhele
                                  Nov 21 at 14:44

























                                answered Nov 21 at 14:15









                                KaRolthas

                                1

































