Sed awesomeness and inline file inclusion.
Sed is one of my favourite tools. It goes into pretty much every one-liner I write. Did you know that sed is so powerful that a single sed statement can turn cat into cement? Try:
echo cat | sed statement
Today I learnt something new about it, hence my first post in a few years (after which I’ll most likely go silent for another few). I wanted to replace part of a text with the contents of the file that the text refers to.
In other words, I’d like to turn:
blah blah INCLUDE:xxx blah blah
into:
blah blah $(cat xxx) blah blah
Sed happens to have a built-in command for including files. From info sed:
`r FILENAME' As a GNU extension, this command accepts two addresses. Queue the contents of FILENAME to be read and inserted into the output stream at the end of the current cycle, or when the next input line is read. Note that if FILENAME cannot be read, it is treated as if it were an empty file, without any error indication. As a GNU `sed' extension, the special value `/dev/stdin' is supported for the file name, which reads the contents of the standard input.
This works fine if the file name is static:
$ echo abc > f
$ echo foo REPLACEME bar | sed '/REPLACEME/ r f'
foo REPLACEME bar
abc
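Note that `r` appends the file after the matching line and does not replace anything. If the goal is to swap the whole matching line for the file contents, a common idiom is to pair `r` with `d`. A minimal sketch, assuming GNU sed and a scratch directory (the helper name `replace_line_with_file` is made up for illustration):

```shell
# Hypothetical helper: replace every line matching REPLACEME
# with the contents of the file named in "$1".
# The r command takes the rest of its line as the file name, so the
# following commands have to go into separate -e expressions.
replace_line_with_file() {
  sed -e "/REPLACEME/{r $1" -e 'd' -e '}'
}

cd "$(mktemp -d)"
echo abc > f
printf 'first\nREPLACEME\nlast\n' | replace_line_with_file f
```

This still only works with a file name known in advance, and it replaces the whole line, not just the marker.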
However, in my application I needed to use part of the matched text as the file name. So something like:
$ echo foo REPLACEME:f bar \
    | sed '/REPLACEME:\(\S\+\)/ r \1'
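Unfortunately, it seems that backreferences can’t be used once the regular expression is terminated (at least, the above does not work; sed presumably looks for a file literally named \1 and, as the manual warns, silently treats the missing file as empty). I started digging through the sed manual and came across this awesome flag to the s command: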
`e' This command allows one to pipe input from a shell command into pattern space. If a substitution was made, the command that is found in pattern space is executed and pattern space is replaced with its output. A trailing newline is suppressed; results are undefined if the command to be executed contains a NUL character. This is a GNU `sed' extension.
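To see the flag in isolation, here is a minimal sketch (assuming GNU sed and a scratch directory) in which the marker is the only thing on the line, so that after the substitution the pattern space holds nothing but the command:

```shell
cd "$(mktemp -d)"
echo abc > f
# After the substitution the pattern space is exactly "cat f";
# the e flag runs that and replaces the pattern space with its output.
echo REPLACEME:f | sed 's/^REPLACEME:\(\S\+\)$/cat \1/e'
# prints: abc
```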
So, how does it work? It applies the substitution and then evaluates the whole line (the entire pattern space) in the shell. This means that the match has to start at the beginning of the line. If I wrote:
$ echo foo REPLACEME:f bar \
    | sed 's/REPLACEME:\(\S\+\)/cat \1/e'
then the line handed to the shell would be “foo cat f bar”, which is (for most of us) not a valid command.
$ echo foo REPLACEME:f bar \
    | sed 's/REPLACEME:\(\S\+\)/cat \1/e'
sh: foo: command not found
This is not what I wanted, although it is what the manual says (note the “the command that is found in pattern space is executed” part), but there are two easy workarounds.
The first is to add newline characters before and after the matched pattern, but that requires removing them later on (if you need to):
$ echo foo REPLACEME:f bar \
    | sed 's/\(REPLACEME:\S\+\)/\n\1\n/g' \
    | sed 's/REPLACEME:\(\S\+\)$/cat \1/e'
foo
abc
 bar
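If the input is a single line, one way to remove the extra newlines afterwards is to delete every newline and add one back at the end. This is only a sketch: it assumes GNU sed, a scratch directory, and input that has no line breaks of its own worth keeping.

```shell
cd "$(mktemp -d)"
echo abc > f
echo foo REPLACEME:f bar \
  | sed 's/\(REPLACEME:\S\+\)/\n\1\n/g' \
  | sed 's/REPLACEME:\(\S\+\)$/cat \1/e' \
  | tr -d '\n'
echo   # put back the final newline that tr removed
# prints: foo abc bar
```

For multi-line input this would glue all lines together, so a different approach is needed there.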
The second is to recreate the whole line in the shell:
$ echo foo REPLACEME:f bar \
    | sed 's/^\(.*\)REPLACEME:\(\S\+\)\(.*\)$/echo "\1"`cat \2`"\3"/e'
foo abc bar
Awesome, isn’t it?
Don’t do this on input you don’t trust! The e flag hands the line to a shell, so anything in it that the shell treats as code gets executed.
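Recreating the whole line is the riskier of the two workarounds, because the text around the marker ends up inside double quotes in a shell command, where things like $(...) are still expanded. A somewhat more defensive sketch builds on the newline approach instead: only lines that consist of nothing but the marker and a file name from a conservative character set are executed. The helper name `include_files` and the allowed character set are my assumptions, it needs GNU sed, and it is an illustration, not a guarantee of safety.

```shell
# Hypothetical helper; the name and the allowed characters are assumptions.
include_files() {
  # Step 1: put each marker on a line of its own.
  # Step 2: execute only lines that are exactly a marker with a "safe" name.
  sed 's/REPLACEME:[A-Za-z0-9._-]\+/\n&\n/g' \
    | sed 's/^REPLACEME:\([A-Za-z0-9._-]\+\)$/cat -- \1/e'
}

cd "$(mktemp -d)"
echo abc > f
# The text after the file name is never handed to the shell,
# so no file called "pwned" is created.
echo 'foo REPLACEME:f;touch pwned bar' | include_files
```

The output still contains the extra line breaks, and whoever controls the input can still read any file in the current directory whose name fits the pattern.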