How can I replace end line with fixed text when the next line begins with a defined set of characters?

I have several big files with some measurements.

It looks this way:

N 12344;PE 9.9999999;...
#S 0 0 31 44 75 130 165 196...
#S_+ "2 5 2 3 3 1 1 2 3 1 2 2...

N 12345;PE 9.9999999;...
#S 0 0 34 57 84 133 152...
#S_+ "1 0 1 1 2 3 0 0 0...

N 12346;PE 9.9999999;...
#S 0 0 31 44 73 140 169...
#S_+ "3 3 4 0 0 2 1 2 4...

N 25104;PE 9.9999999;...
#S 0 0 36 52 102 108 145...
#S_+ "1 1 0 1 0 0 3 0 1...

N 25105;PE 9.9999999;...
#S 0 0 32 58 88 130 143...

Sample is here:
http://pasted.co/d9806b7c4

The file is much bigger but I replaced part of the data with "..." to make it shorter.

I need to replace the line ends before "#S", in effect merging each "N" line with the two lines that follow it into one line (or with the following three, so that I also get rid of the blank lines). Expected output:

N 12344;PE 9.9999999; #S 0 0 31 44 75 130 165 196 #S_+ "2 5 2 3 3 1 1 2 3 1 2 2...
N 12345;PE 9.9999999; #S 0 0 34 57 84 133 152 #S_+ "1 0 1 1 2 3 0 0 0...
N 12346;PE 9.9999999; #S 0 0 31 44 73 140 169 #S_+ "3 3 4 0 0 2 1 2 4...
N 25104;PE 9.9999999; #S 0 0 36 52 102 108 145 #S_+ "1 1 0 1 0 0 3 0 1...
N 25105;PE 9.9999999; #S 0 0 32 58 88 130 143...

Is it possible to achieve this with a command-line utility in Linux?

My knowledge is quite limited in this area so I would appreciate any help.

thanks

edited Nov 22 at 11:39 by Toto

asked Nov 21 at 13:52 by Juhele

thanks to grawity for helping me with the code :-)
– Juhele
Nov 21 at 14:09

@Pimp Juice IT: OK, I updated the question.
– Juhele
Nov 21 at 14:14

Hi @Juhele, can you specify the output format better? Do you need to cut the first line after e.g. PE 9.9999999;, do you need to cut the second after the 7th (8th) number or, as you write, merge the "N" line with the following two? What about the " present only in the output?! I made some edits to your post, please check them. Can the file be incomplete? BTW, for the simplest case you already have more than one good answer.
– Hastur
Nov 22 at 9:09

the " is both in input and output like #S_+ "2 5 - it is not important character for me (I am going to remove it in next processing step) but it is just in the input data. "do you need to cut the second after the 7th (8th) number" - No, I just shortened the example as the data has several hundreds "columns".
– Juhele
Dec 4 at 9:05

From personal experience with acquisition devices: do redundancy checks (the more the better). I suppose you already know this, but with enough lines it does not matter how small the likelihood of corrupted data is; if it is not zero, it will occur. Often awk '{print NF}' YourFILE | sort -n -u | uniq -c is enough to tell whether you have the same number of columns in (almost, because of headers) each line, or a consistent structure (3 lines of data, 1 blank...)
– Hastur
Dec 4 at 11:09

linux command-line regex

6 Answers

With sed (GNU sed is needed for -z):

sed -z -e 's/\n#S/ #S/g' -e 's/\nN /N /g' data

In slow-mo:

-z makes sed use NUL instead of LF as the line separator, so a text file is read as a single line (and the line ends are plain characters)

-e 's/\n#S/ #S/g' replaces every LF occurring just before a #S with a space

-e 's/\nN /N /g' removes every LF just before an N (i.e. the blank lines)
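A comment on this answer reports that a file with Windows line endings (CR LF) keeps a CR in front of each joined piece. A minimal sketch of one way to handle that, assuming GNU sed (for -z); the function name merge_records is made up, and stripping every CR with tr first is an assumption about what is wanted, not part of the original answer:

```shell
# merge_records: hypothetical helper; reads the measurement file on stdin.
# Assumes GNU sed, because -z is a GNU extension.
# tr removes every CR first, so CR LF line endings become plain LF.
merge_records() {
    tr -d '\r' |
        sed -z -e 's/\n#S/ #S/g' -e 's/\nN /N /g'
}

# Example input with CR LF endings: one full record, a blank line,
# and one shorter record.
printf 'N 1;\r\n#S 0\r\n#S_+ "2\r\n\r\nN 2;\r\n#S 1\r\n' | merge_records
```

For a real file this would be called as merge_records < test1.txt > test1_mod.txt. Note that the output then has Unix line endings throughout.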

answered Nov 21 at 14:32 by xenoid

hmm, looks I will have to adjust it a little bit as "sed -z -e 's/\n#S/ #S/g' -e 's/\nN /N /g' test1.txt >> test1_mod.txt" still does not do the same as replacing "\r\n#S" with "#S" in Notepad++ so there is still CR left on the previous line before "#S".
– Juhele
Dec 4 at 10:59

OK, "sed -z -e 's/\r\n#S/ #S/g' -e 's/\nN /N /g' test1.txt >> test1_mod.txt" finally turns the multiple lines into one. Just blank line remains as the "N..." line ends with CR LF (this is fine) and then there is a blank line with CR.
– Juhele
Dec 4 at 11:09

With paste (this requires the input to always come in groups of 4 lines):

 paste -s -d '   \n' data

In slo-mo:

paste -s concatenates the lines of the file

-d specifies the characters to insert as delimiters. When there are several characters, they are used in round-robin fashion, so with 3 spaces and a LF:
- the first space is used on the first splice (N to #S),
- the second space is used on the second splice (#S to #S),
- the third space is used on the third splice (#S to blank line),
- the last delimiter, a LF, is used on the fourth splice (blank line to N)
- and the cycle repeats for the next 4 lines.
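Because this approach silently produces shifted output when a group is not exactly 4 lines long, it may be worth checking the layout first. A minimal sketch, assuming the layout from the question (3 data lines followed by 1 blank line); groups_ok and merge_with_paste are hypothetical names, not part of paste:

```shell
# groups_ok: hypothetical helper. Exit status 0 only if lines 4, 8, 12, ...
# of the file are empty and every other line is non-empty.
# A last group without its trailing blank line is still accepted.
groups_ok() {
    awk '
        NR % 4 == 0 && $0 != "" { bad = 1 }   # 4th line of a group must be blank
        NR % 4 != 0 && $0 == "" { bad = 1 }   # data lines must not be blank
        END { exit bad + 0 }
    ' "$1"
}

# merge_with_paste: run the paste command only when the layout check passes.
merge_with_paste() {
    groups_ok "$1" && paste -s -d '   \n' "$1"
}
```

Called as merge_with_paste data, this prints the merged lines, or prints nothing and returns a non-zero status when the layout is off.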

answered Nov 21 at 14:42 by xenoid

This is a portable solution with POSIX sed, implementing the following rules:

empty lines shall be deleted;

any line starting with #S shall be merged with the previous non-empty line, with a single space character between them, unless there is no previous non-empty line.

The code:

<data sed '/^$/ d; :start; N; s/\n$//; t start; s/\n#S/ #S/; t start; P; D'

The same with comments (still working code):

<data sed '
  /^$/ d       # If empty line read, delete it and start a new cycle.
  :start       # A label.
  N            # Read additional line, there are now two lines in the pattern space.
  s/\n$//      # If the second line is empty, replace the newline with nothing.
  t start      # If the above replacement occurred, go to start (to add another line).
               # Otherwise
  s/\n#S/ #S/  # if the second line starts with #S, replace the newline with space.
  t start      # If the above replacement occurred, go to start (to add another line).
               # Otherwise
               # (i.e. when non-empty line not starting with #S occurred)
  P            # print the pattern space up to the first newline and...
  D            # delete the initial segment of the pattern space
               # through the first newline (i.e. everything just printed),
               # and start the next cycle with the resultant pattern space
               # and without reading any new input
               # (in our case the new input will be explicitly read by N then).
  '

Note the solution uses sed pattern space to accumulate many input lines. This remark applies:

The pattern and hold spaces shall each be able to hold at least 8192 bytes.

Just before the P command the pattern space holds one (relatively long) line meant to be printed and a single (relatively short) input line, plus a newline in between. Obviously it depends on your data, whether or not such structure exceeds 8192 bytes at some point. If it does, some sed implementations may fail.
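To judge whether the 8192-byte minimum matters for a given file, one can measure the longest line. A minimal sketch; longest_line is a hypothetical name, and running it on the merged output only approximates the peak pattern-space size (which also includes the next input line and a newline):

```shell
# longest_line: hypothetical helper; prints the length of the longest line
# read from stdin. LC_ALL=C asks awk to count bytes rather than characters.
longest_line() {
    LC_ALL=C awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }'
}

# Small demonstration on two lines of different length.
printf 'short\na much longer line\n' | longest_line
```

Piping the output of the sed command above through longest_line gives the length of the longest merged record; values well below 8192 suggest the limit is not a concern.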

edited Nov 22 at 6:35

answered Nov 21 at 18:17 by Kamil Maciorowski

Using Perl:

perl -0 -ape 's/\R(?=\RN|#)/ /g' file.txt

N 12344;PE 9.9999999;... #S 0 0 31 44 75 130 165 196... #S_+ "2 5 2 3 3 1 1 2 3 1 2 2...

N 12345;PE 9.9999999;... #S 0 0 34 57 84 133 152... #S_+ "1 0 1 1 2 3 0 0 0...

N 12346;PE 9.9999999;... #S 0 0 31 44 73 140 169... #S_+ "3 3 4 0 0 2 1 2 4...

N 25104;PE 9.9999999;... #S 0 0 36 52 102 108 145... #S_+ "1 1 0 1 0 0 3 0 1...

N 25105;PE 9.9999999;... #S 0 0 32 58 88 130 143...

Regex explanation:

s/              : substitute
    \R          : any kind of line break (i.e. \r, \n, \r\n)
    (?=         : positive lookahead, a zero-length assertion that makes sure what follows is
        \RN     : a line break followed by the letter N
      |         : OR
        #       : a # character
    )           : end of lookahead
/ /g            : replace with a space, globally

edited Nov 21 at 16:31

answered Nov 21 at 15:58 by Toto

awk (gawk ^[1])

As usual, besides sed you can use awk (and in many different ways...)

awk 'ORS=" "; NR % 4 == 0 && ORS="\n" ' data

where

ORS=" " sets the output record separator, by default a newline, to a space (you can change it)

NR % 4 == 0 && ORS="\n" sets it back to a newline, \n, on every 4th line

If nothing else is specified, awk prints the full line

data is your data file.

If you want, you can use a regex as in sed (in a similar way).
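As an illustration of the regex-based route, here is a minimal sketch that merges by line prefix instead of by line count. The function name merge_by_prefix and the CR handling are assumptions added for this example, not part of the answer:

```shell
# merge_by_prefix: hypothetical helper; reads the measurement file on stdin.
# Blank lines are dropped and every line starting with #S is appended to the
# previous non-empty line, separated by a single space.
merge_by_prefix() {
    awk '
        { sub(/\r$/, "") }                            # tolerate CR LF endings
        /^$/ { next }                                 # skip blank lines
        /^#S/ && started { printf " %s", $0; next }   # append to current record
        { if (started) printf "\n"; printf "%s", $0; started = 1 }
        END { if (started) printf "\n" }              # terminate the last record
    '
}

# Example: two records separated by a blank line.
printf 'N 1;\n#S 0\n#S_+ "2\n\nN 2;\n#S 1\n' | merge_by_prefix
```

Unlike the NR % 4 version, this does not depend on every group having exactly 4 lines, and the output always ends with a newline.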

A format check version with awk

Even though it was not requested, you may want to handle a truncated file by dropping the corrupted output line and producing an error message and an error exit code.

awk '{a=$0; getline b; getline c;
     if ( getline > 0 ) {print a, b, c, $0 }
     else { print "Ohi " > "/dev/stderr" ; exit 65; }  }' data

where

a=$0; puts the full line in the variable a

getline b; reads a line and puts it in the variable b

getline c; obscure unfathomable command :-)

if ( getline > 0 ) if it is able to read a line...

..............{print a, b, c, $0} prints the 4 lines

else prints an error on the stderr device (screen or other); you can customize it here...

exit 65 returns an exit code different from 0 ---> error

Bonus: why 65?

Searching for a good value for your exit code ^[2] you may find the suggestion to look in /usr/include/sysexits.h, among some C standards...

  #define EX_DATAERR      65      /* data format error */

65 is the most appropriate for a data format error...

Honestly, as an answer I preferred 42,

but any value different from zero (and not reserved^[2]) could be good, and 65 is the specific one...
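A caller can then branch on that exit status. A rough sketch with hypothetical function names (merge_checked, merge_or_report) and made-up messages; only the awk logic and the value 65 come from the answer:

```shell
# merge_checked: hypothetical wrapper around the format-checking awk program.
# Prints one merged line per complete group of 4 input lines and exits
# with status 65 as soon as a group is incomplete.
merge_checked() {
    awk '{ a = $0; getline b; getline c
           if ((getline) > 0) { print a, b, c, $0 }
           else { print "truncated record" > "/dev/stderr"; exit 65 } }' "$1"
}

# merge_or_report: hypothetical caller that reacts to the exit status.
merge_or_report() {
    status=0
    merge_checked "$1" || status=$?
    case $status in
        0)  ;;
        65) echo "$1: data format error (EX_DATAERR)" >&2 ;;
        *)  echo "$1: awk failed with status $status" >&2 ;;
    esac
    return "$status"
}
```

For example, merge_or_report data > merged.txt writes the merged lines to merged.txt and returns 65 when the last group of the file is truncated.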

edited Nov 22 at 11:49

answered Nov 21 at 22:28 by Hastur

One disadvantage though: the last pack of lines may consist of three of them (i.e. no empty line at the very end); or may not. If three, then the last character of your output is space, not a newline. POSIX defines "line" as a sequence of zero or more non- <newline> characters plus a terminating <newline> character. This will probably backfire if the output is parsed further.
– Kamil Maciorowski
Nov 22 at 9:36

Nice thought, but the OP, among some other points not completely specified, states that there are sets of 4 lines, the last of them blank. With a truncated file the next (unknown) processing step may be compromised anyway. A format check that was not requested is out of this thread's scope, and IMHO a good practice is to generate an error. If you require robustness it is better to opt for a script (awk, sed, perl are scripting languages) that also allows you to reproduce the data processing. Then you have to decide how to deal with errors, but that is another question... :-) I just try to keep it simple.
– Hastur
Nov 22 at 10:45

@KamilMaciorowski ... nonetheless I added another version with error check...
– Hastur
Nov 22 at 11:29

You can do it with any text editor that supports regular expressions, like Notepad++.

The new line is just a non-printable character, or two characters: on Windows usually Carriage Return and Line Feed, and on Unix-based systems usually Line Feed only.

To see them you need to turn on the display of non-printable characters (usually a paragraph icon).
See here: https://imgur.com/cqiTvrp

Now what you need to do is use the regular-expression replace (CTRL + H) to replace CRLF#S with #S.
The symbol for CR is \r and for LF is \n, so you end up replacing \r\n#S or \n#S with #S.
https://imgur.com/GoeVn70

Or you can replace it with a space if you need.

answered Nov 21 at 14:15

KaRolthas

The question is tagged "Linux"....
– xenoid
Nov 21 at 14:16

I think regular expressions in Geany are the same. I used Notepad++ as an example because I am currently on Windows.
– KaRolthas
Nov 21 at 14:20

The question also asks for a command-line utility...
– xenoid
Nov 21 at 14:22

Nice, works. I need to somehow process at least few files now so even Notepad++ helps when I am working on my other machine with Windows. thanks
– Juhele
Nov 21 at 14:44

6 Answers
6

active

oldest

votes

6 Answers
6

active

oldest

votes

up vote
4
down vote

With sed:

sed -z -e 's/n#S/ #S/g' -e 's/nN /N /g' data

In slow-mo:

-z makes sed consider the file as a single line (so the line ends are plain characters)

's/n#S/#S/g' replaces all LF's occurring just before a #S by a space

-e 's/nN /N /g' replaces all LFs before N (ie, the blank lines)

answered Nov 21 at 14:32

xenoid

3,5533718

hmm, looks I will have to adjust it a little bit as "sed -z -e 's/n#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" still does not do the same as replacing "rn#S" with "#S" in Notepad++ so there is still CR left on the previous line before "#S".
– Juhele
Dec 4 at 10:59

OK, "sed -z -e 's/rn#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" finally turns the multiple lines into one. Just blank line remains as the "N..." line ends with CR LF (this is fine) and then there is a blank line with CR.
– Juhele
Dec 4 at 11:09

add a comment |

up vote
4
down vote

With sed:

sed -z -e 's/n#S/ #S/g' -e 's/nN /N /g' data

In slow-mo:

-z makes sed consider the file as a single line (so the line ends are plain characters)

's/n#S/#S/g' replaces all LF's occurring just before a #S by a space

-e 's/nN /N /g' replaces all LFs before N (ie, the blank lines)

answered Nov 21 at 14:32

xenoid

3,5533718

hmm, looks I will have to adjust it a little bit as "sed -z -e 's/n#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" still does not do the same as replacing "rn#S" with "#S" in Notepad++ so there is still CR left on the previous line before "#S".
– Juhele
Dec 4 at 10:59

OK, "sed -z -e 's/rn#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" finally turns the multiple lines into one. Just blank line remains as the "N..." line ends with CR LF (this is fine) and then there is a blank line with CR.
– Juhele
Dec 4 at 11:09

add a comment |

up vote
4
down vote

With sed:

sed -z -e 's/n#S/ #S/g' -e 's/nN /N /g' data

In slow-mo:

-z makes sed consider the file as a single line (so the line ends are plain characters)

's/n#S/#S/g' replaces all LF's occurring just before a #S by a space

-e 's/nN /N /g' replaces all LFs before N (ie, the blank lines)

answered Nov 21 at 14:32

xenoid

3,5533718

With sed:

sed -z -e 's/n#S/ #S/g' -e 's/nN /N /g' data

In slow-mo:

-z makes sed consider the file as a single line (so the line ends are plain characters)

's/n#S/#S/g' replaces all LF's occurring just before a #S by a space

-e 's/nN /N /g' replaces all LFs before N (ie, the blank lines)

answered Nov 21 at 14:32

xenoid

3,5533718

answered Nov 21 at 14:32

xenoid

3,5533718

answered Nov 21 at 14:32

xenoid

3,5533718

answered Nov 21 at 14:32

xenoid

3,5533718

hmm, looks I will have to adjust it a little bit as "sed -z -e 's/n#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" still does not do the same as replacing "rn#S" with "#S" in Notepad++ so there is still CR left on the previous line before "#S".
– Juhele
Dec 4 at 10:59

OK, "sed -z -e 's/rn#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" finally turns the multiple lines into one. Just blank line remains as the "N..." line ends with CR LF (this is fine) and then there is a blank line with CR.
– Juhele
Dec 4 at 11:09

add a comment |

hmm, looks I will have to adjust it a little bit as "sed -z -e 's/n#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" still does not do the same as replacing "rn#S" with "#S" in Notepad++ so there is still CR left on the previous line before "#S".
– Juhele
Dec 4 at 10:59

OK, "sed -z -e 's/rn#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" finally turns the multiple lines into one. Just blank line remains as the "N..." line ends with CR LF (this is fine) and then there is a blank line with CR.
– Juhele
Dec 4 at 11:09

hmm, looks I will have to adjust it a little bit as "sed -z -e 's/n#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" still does not do the same as replacing "rn#S" with "#S" in Notepad++ so there is still CR left on the previous line before "#S".
– Juhele
Dec 4 at 10:59

OK, "sed -z -e 's/rn#S/ #S/g' -e 's/nN /N /g' test1.txt >> test1_mod.txt" finally turns the multiple lines into one. Just blank line remains as the "N..." line ends with CR LF (this is fine) and then there is a blank line with CR.
– Juhele
Dec 4 at 11:09

add a comment |

up vote
4
down vote

With paste (this requires to always have groups of 4 lines):

 paste -s -d '   n' data

In slo-mo:

paste -s concatenates the lines from the file

-d specifies characters to be inserted as delimiters. When there are several characters, they are used in a round-robin fashion, so with 3 spaces and a LF:
- the first space is used on the first splice (N to #S),
- the second space is used on the second splice (#S to #S),
- the third space is used on the thrid splice (#S to blank line),
- the last delimiter, a LF, is used on the fourth splice (blank line to N)
- and the cycle repeats for the next 4 lines.

answered Nov 21 at 14:42

xenoid

3,5533718

add a comment |

up vote
4
down vote

With paste (this requires to always have groups of 4 lines):

 paste -s -d '   n' data

In slo-mo:

paste -s concatenates the lines from the file

-d specifies characters to be inserted as delimiters. When there are several characters, they are used in a round-robin fashion, so with 3 spaces and a LF:
- the first space is used on the first splice (N to #S),
- the second space is used on the second splice (#S to #S),
- the third space is used on the thrid splice (#S to blank line),
- the last delimiter, a LF, is used on the fourth splice (blank line to N)
- and the cycle repeats for the next 4 lines.

answered Nov 21 at 14:42

xenoid

3,5533718

add a comment |

up vote
4
down vote

With paste (this requires to always have groups of 4 lines):

 paste -s -d '   n' data

In slo-mo:

paste -s concatenates the lines from the file

-d specifies characters to be inserted as delimiters. When there are several characters, they are used in a round-robin fashion, so with 3 spaces and a LF:
- the first space is used on the first splice (N to #S),
- the second space is used on the second splice (#S to #S),
- the third space is used on the thrid splice (#S to blank line),
- the last delimiter, a LF, is used on the fourth splice (blank line to N)
- and the cycle repeats for the next 4 lines.

answered Nov 21 at 14:42

xenoid

3,5533718

With paste (this requires to always have groups of 4 lines):

 paste -s -d '   n' data

In slo-mo:

paste -s concatenates the lines from the file

-d specifies characters to be inserted as delimiters. When there are several characters, they are used in a round-robin fashion, so with 3 spaces and a LF:
- the first space is used on the first splice (N to #S),
- the second space is used on the second splice (#S to #S),
- the third space is used on the thrid splice (#S to blank line),
- the last delimiter, a LF, is used on the fourth splice (blank line to N)
- and the cycle repeats for the next 4 lines.

answered Nov 21 at 14:42

xenoid

3,5533718

answered Nov 21 at 14:42

xenoid

3,5533718

answered Nov 21 at 14:42

xenoid

3,5533718

answered Nov 21 at 14:42

xenoid

3,5533718

add a comment |

up vote
4
down vote

This is a portable solution with POSIX sed, implementing the following rules:

empty lines shall be deleted;

any line starting with #S shall be merged with the previous non-empty line, with a single space character between them, unless there is no previous non-empty line.

The code:

<data sed '/^$/ d; :start; N; s/n$//; t start; s/n#S/ #S/; t start; P; D'

The same with comments (still working code):

<data sed '

  /^$/ d      # If empty line read, delete it and start a new cycle.

  :start      # A label.

  N           # Read additional line, there are now two lines in the pattern space.

  s/n$//     # If the second line is empty, replace the newline with nothing.

  t start     # If the above replacement occurred, go to start (to add another line).

              # Otherwise

  s/n#S/ #S/ # if the second line starts with #S, replace the newline with space.

  t start     # If the above replacement occurred, go to start (to add another line).

              # Otherwise

              # (i.e when non-empty line not starting with #S occurred)

  P           # print the pattern space up to the first newline and...

  D           # delete the initial segment of the pattern space

              # through the first newline (i.e. everything just printed),

              # and start the next cycle with the resultant pattern space

              # and without reading any new input

              # (in our case the new input will be explicitly read by N then).

  '

Note the solution uses sed pattern space to accumulate many input lines. This remark applies:

The pattern and hold spaces shall each be able to hold at least 8192 bytes.

edited Nov 22 at 6:35

answered Nov 21 at 18:17

Kamil Maciorowski

23.1k155072

up vote
3
down vote

Using Perl:

perl -0 -ape 's/\R(?=\RN|#)/ /g' file.txt

N 12344;PE 9.9999999;... #S 0 0 31 44 75 130 165 196... #S_+ "2 5 2 3 3 1 1 2 3 1 2 2...

N 12345;PE 9.9999999;... #S 0 0 34 57 84 133 152... #S_+ "1 0 1 1 2 3 0 0 0...

N 12346;PE 9.9999999;... #S 0 0 31 44 73 140 169... #S_+ "3 3 4 0 0 2 1 2 4...

N 25104;PE 9.9999999;... #S 0 0 36 52 102 108 145... #S_+ "1 1 0 1 0 0 3 0 1...

N 25105;PE 9.9999999;... #S 0 0 32 58 88 130 143...

Regex explanation:

s/              : substitute
    \R          : any kind of line break (i.e. \r, \n, \r\n)
    (?=         : positive lookahead, a zero-length assertion that makes sure what follows is
        \RN     : a line break followed by the letter N
      |         : OR
        #       : a # character
    )           : end of lookahead
/ /g            : replace with a space, globally

edited Nov 21 at 16:31

answered Nov 21 at 15:58

Toto

3,37191126

up vote
3
down vote

awk (gawk ^[1])

As usual, besides sed you can also use awk (and in many different ways...)

awk 'ORS=" "; NR % 4 == 0 && ORS="\n" ' data

where

ORS=" " sets the output record separator, by default a newline, to a space (you can change it)

NR % 4 == 0 && ORS="\n" sets it back to the newline \n on every 4th line

If nothing else is specified, awk prints the full line

data is your data file.

If you want, you can use a regex as in sed (in a similar way).
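
The regex-based variant mentioned above could look like the following sketch. It is not the author's code: it keys on line content instead of counting lines, dropping blank lines and gluing every line that starts with #S onto the current record. The wrapper name join_records and the sample data are made up.

```shell
# join_records is a hypothetical wrapper; the awk program uses only POSIX awk features.
join_records() {
  awk '
    /^$/  { next }                                        # drop blank lines
    /^#S/ { rec = (rec == "" ? $0 : rec " " $0); next }   # glue onto the current record
          { if (rec != "") print rec; rec = $0 }          # any other line starts a new record
    END   { if (rec != "") print rec }                    # flush the last record
  '
}

printf '%s\n' 'N 1;PE 9.9;' '#S 0 0 31' '#S_+ "2 5' '' \
              'N 2;PE 9.9;' '#S 0 0 34' '#S_+ "1 0' | join_records
```

Because each record is written with print, every output line ends with a newline and has no trailing space, whether or not the input ends with a blank line.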

A format check version with awk

Even if not requested, you may want to handle a truncated file by dropping the corrupted output line and generating an error with an error message.

awk '{a=$0; getline b; getline c;
     if ( getline > 0 ) {print a, b, c, $0 }
     else { print "Ohi " > "/dev/stderr" ; exit 65; }  }' data

where

a=$0; puts the full line in the variable a

getline b; reads the next line and puts it in the variable b

getline c; does the same for the variable c :-)

if ( getline > 0 ) if it is able to read one more line...

    {print a, b, c, $0} prints the 4 lines

else prints an error on the stderr device (screen or other); you can customize it here...

exit 65 returns an exit code different from 0 ---> error
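
As a rough check of the error path, the same program can be run on input that is cut off in the middle of the second record. The wrapper name check_format and the sample lines are made up for this sketch.

```shell
# check_format is a hypothetical wrapper around the awk program from this answer.
check_format() {
  awk '{a=$0; getline b; getline c;
       if ( getline > 0 ) {print a, b, c, $0 }
       else { print "Ohi " > "/dev/stderr" ; exit 65; }  }'
}

# The first record is complete (4 lines), the second one is truncated after 2 lines.
printf '%s\n' 'N 1;PE 9.9;' '#S 0 0 31' '#S_+ "2 5' '' \
              'N 2;PE 9.9;' '#S 0 0 34' | check_format || echo "exit status: $?"
```

The complete record is printed as one line, the truncated one is dropped, the message goes to stderr, and the reported exit status is 65.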

Bonus: why 65?

Searching for a good value for the exit code ^[2], you may find the suggestion to look in /usr/include/sysexits.h, which contains...

  #define EX_DATAERR      65      /* data format error */

65 is the most appropriate value for a data format error...

Honestly, as an answer I would have preferred 42,

but any value different from zero (and not reserved ^[2]) would do, and 65 is the specific one...

edited Nov 22 at 11:49

answered Nov 21 at 22:28

Hastur

13k53266

One disadvantage though: the last pack of lines may consist of three of them (i.e. no empty line at the very end); or may not. If three, then the last character of your output is space, not a newline. POSIX defines "line" as a sequence of zero or more non-<newline> characters plus a terminating <newline> character. This will probably backfire if the output is parsed further.
– Kamil Maciorowski
Nov 22 at 9:36

Nice thought, but the OP, among some other points not completely specified, states that there are sets of 4 lines, the last of them blank. With a truncated file, the subsequent, unknown processing may be compromised anyway. A format check that was not requested is out of this thread's scope, and IMHO good practice is to generate an error. If you require robustness, it is better to opt for a script (awk, sed, perl are scripting languages), which also allows you to reproduce the data processing. Then you have to decide how to deal with errors, but that is another question... :-) I just try to keep it simple.
– Hastur
Nov 22 at 10:45

@KamilMaciorowski ... nonetheless I added another version with error check...
– Hastur
Nov 22 at 11:29

up vote
0
down vote

You can do it with any text editor that supports regular expressions, such as Notepad++.

A newline is just a non-printable character, or two characters: on Windows usually CarriageReturn and LineFeed, and on Unix-based systems usually LineFeed only.

To see them, you need to turn on the display of non-printable characters (usually a paragraph icon).
See here: https://imgur.com/cqiTvrp

Or you can replace it with a SPACE if you need.

answered Nov 21 at 14:15

KaRolthas

The question is tagged "Linux"....
– xenoid
Nov 21 at 14:16

I think regular expressions in Geany are the same. I used Notepad++ as an example because I am currently on Windows.
– KaRolthas
Nov 21 at 14:20

The question also asks for a command-line utility...
– xenoid
Nov 21 at 14:22

Nice, works. I need to somehow process at least few files now so even Notepad++ helps when I am working on my other machine with Windows. thanks
– Juhele
Nov 21 at 14:44
