I have a massive amount of text I have to parse through. Basically, it is a plain-text email, about 400 lines long, with some fields I need to pull out.
I found the best way to do this is to explode everything along newlines, then use preg_grep to get to the right place. So if someone puts "Username: dave" in the email, I grep for "username" and match "dave" out to an array. The problem is that some input wraps onto the next line, so grepping for the start of the proper line does me no good: it misses everything on the following line (the next key in the array).
I cannot simply replace ALL the newlines, as there are parts where I require the newlines to be there.
I want to strip out the newlines ONLY if they are followed by any string of text. I was thinking maybe a positive lookahead assertion, but I cannot get it to work. So, any regex gurus out there?
In short: go through a file and convert any newline that is followed by a character (letter or number) to a space, leaving all other newlines (and carriage returns) intact.
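To make the requirement concrete, here is a minimal sketch of the behavior I am after, written in Python only for illustration. The helper name `join_wrapped_lines` and the sample string are made up, and the pattern is just a literal reading of the rule above: "followed by a character" is taken to mean "followed by anything that is not another line break". The same pattern should carry over to PHP's preg_replace, though I have not verified it against the real emails.

```python
import re

# Literal reading of the requirement (an assumption, not a tested solution):
#   \n            a newline
#   (?=[^\r\n])   positive lookahead: the next character exists and is
#                 neither a carriage return nor another newline
WRAPPED_NEWLINE = re.compile(r"\n(?=[^\r\n])")


def join_wrapped_lines(text: str) -> str:
    """Replace each newline that is directly followed by text with a space."""
    return WRAPPED_NEWLINE.sub(" ", text)


# Hypothetical sample: the value "dave smith" has wrapped onto a second line.
sample = "Username: dave\nsmith\n\n"
print(repr(join_wrapped_lines(sample)))
```

One caveat with this literal reading: the last newline of a blank-line gap is also followed by text, so it gets converted as well. If blank lines need to survive untouched, the pattern would need an extra condition on what precedes the newline.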