Understanding nested process substitution in bash, with a little awk
I can't figure out what's going on here. I have 4 versions of the same code; the only difference is the order of the code blocks / lines. My initial expectation was that the order of the redirections doesn't matter, but that doesn't seem to be right. I also assumed that a ( ) subshell provides some isolation from file descriptor shenanigans, but apparently not ...
Please don't ask what I'm using this for, and I don't need alternative solutions; I want to understand THIS piece of code. Otherwise my faith in my understanding of process substitution will be broken forever ...
Metacode:
1_cmd_producing_both_stdout_and_stderr
|
+-stdout-> 2_cmd_producing_both_stdout_and_stderr
|          |
|          +-stdout-> A_awk_writing_stdout_to_file_producing_stderr
|          |
|          +-stderr-> B_awk_writing_stdout_to_file_producing_stderr
|
+-stderr-> 3_cmd_producing_both_stdout_and_stderr
           |
           +-stdout-> C_awk_writing_stdout_to_file_producing_stderr
           |
           +-stderr-> D_awk_writing_stdout_to_file_producing_stderr
Version 1: 1 2 AB 3 CD
Version 2: 1 2 BA 3 DC
Version 3: 1 3 DC 2 BA
Version 4: 1 3 CD 2 AB
Note:
I tried 2 AB and 2 BA, they also produce consistent outputs similar to version 1.
AWK: GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p3, GNU MP 6.0.0)
bash: GNU bash, version 4.3.30(1)-release (x86_64-pc-linux-gnu)
Version 1, this gives the expected output:
( echo log; echo err 1>&2; ) \
    1> >( ( echo -n '1.'; cat; echo '1.ERR' 1>&2 ; ) \
        1> >( awk 'BEGIN { print "error 1" >"/dev/stderr" } { print $0 }' >out.out ) \
        2> >( awk 'BEGIN { print "error 2" >"/dev/stderr" } { print $0 }' >out.err )
    ) \
    2> >( ( echo -n '2.'; cat; echo '2.ERR' 1>&2 ; ) \
        1> >( awk 'BEGIN { print "error 3" >"/dev/stderr" } { print $0 }' >err.out ) \
        2> >( awk 'BEGIN { print "error 4" >"/dev/stderr" } { print $0 }' >err.err )
    )
file contents:
out.out 1.log
out.err 1.ERR
err.out 2.err
err.err 2.ERR
output:
error 4
error 2
error 1
error 3
Version 2:
NOTE: Compared to version 1, the awk lines are swapped pairwise: the 1st with the 2nd, and the 3rd with the 4th.
( echo log; echo err 1>&2; ) \
    1> >( ( echo -n '1.'; cat; echo '1.ERR' 1>&2 ; ) \
        2> >( awk 'BEGIN { print "error 1" >"/dev/stderr" } { print $0 }' >out.err ) \
        1> >( awk 'BEGIN { print "error 2" >"/dev/stderr" } { print $0 }' >out.out )
    ) \
    2> >( ( echo -n '2.'; cat; echo '2.ERR' 1>&2 ; ) \
        2> >( awk 'BEGIN { print "error 3" >"/dev/stderr" } { print $0 }' >err.err ) \
        1> >( awk 'BEGIN { print "error 4" >"/dev/stderr" } { print $0 }' >err.out )
    )
file contents:
(!) out.err error 2\n1.ERR
out.out 1.log
(!) err.err 2.ERR\nerror 4
err.out 2.err
output:
error 3
error 1
Version 3:
NOTE: Compared to version 2, the two first-level indented code blocks are swapped.
( echo log; echo err 1>&2; ) \
    2> >( ( echo -n '2.'; cat; echo '2.ERR' 1>&2 ; ) \
        2> >( awk 'BEGIN { print "error 1" >"/dev/stderr" } { print $0 }' >err.err ) \
        1> >( awk 'BEGIN { print "error 2" >"/dev/stderr" } { print $0 }' >err.out )
    ) \
    1> >( ( echo -n '1.'; cat; echo '1.ERR' 1>&2 ; ) \
        2> >( awk 'BEGIN { print "error 3" >"/dev/stderr" } { print $0 }' >out.err ) \
        1> >( awk 'BEGIN { print "error 4" >"/dev/stderr" } { print $0 }' >out.out )
    )
file contents:
(!) err.err error 2\n2.ERR
(!) err.out 2.err\nerror 3
(!) out.err 1.ERR\nerror 4
out.out 1.log
output:
error 1
(!)
Version 4:
NOTE: Compared to version 3, the awk lines are swapped pairwise: the 1st with the 2nd, and the 3rd with the 4th.
( echo log; echo err 1>&2; ) \
    2> >( ( echo -n '2.'; cat; echo '2.ERR' 1>&2 ; ) \
        1> >( awk 'BEGIN { print "error 1" >"/dev/stderr" } { print $0 }' >err.out ) \
        2> >( awk 'BEGIN { print "error 2" >"/dev/stderr" } { print $0 }' >err.err )
    ) \
    1> >( ( echo -n '1.'; cat; echo '1.ERR' 1>&2 ; ) \
        1> >( awk 'BEGIN { print "error 3" >"/dev/stderr" } { print $0 }' >out.out ) \
        2> >( awk 'BEGIN { print "error 4" >"/dev/stderr" } { print $0 }' >out.err )
    )
file contents:
(!) err.out 2.err\nerror 4\nerror 3
err.err 2.ERR
out.out 1.log
out.err 1.ERR
output:
error 2
error 1
(!)
What's going on in versions 2, 3 and 4?
Your assumption that the order of the output redirections doesn't matter is wrong. Order matters a lot. Let's just look at the line "error 2" in version 2 and see why it gets written to the file out.err. Code:
( echo log; echo err 1>&2; ) \
    1> >( ( echo -n '1.'; cat; echo '1.ERR' 1>&2 ; ) \
        2> >( awk 'BEGIN { print "error 1" >"/dev/stderr" } { print $0 }' >out.err ) \
        1> >( awk 'BEGIN { print "error 2" >"/dev/stderr" } { print $0 }' >out.out )
    ) \
    2> >( ( echo -n '2.'; cat; echo '2.ERR' 1>&2 ; ) \
        2> >( awk 'BEGIN { print "error 3" >"/dev/stderr" } { print $0 }' >err.err ) \
        1> >( awk 'BEGIN { print "error 4" >"/dev/stderr" } { print $0 }' >err.out )
    )
The line "error 2" is written to stderr by the awk process attached to 1> (process A in your diagram, the one writing out.out). Since the command the shell runs to start process A contains no redirection of stderr, process A inherits its stderr from its parent. Its parent is the subshell that you marked as 2. That subshell has 2 redirections, each of which refers to a process substitution, and they are set up from left to right. When process A was started, subshell 2's stderr had already been redirected to process B (the awk attached to 2>), so that is where process A writes the line. Process B reads it and writes it to its stdout, which is the file out.err. If the order of the two redirections is reversed (as in version 1), subshell 2's stderr has not yet been redirected when process A is started, so process A's stderr is the same as the stderr of the originating process (e.g. your tty).
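The same effect can be reproduced without awk or nesting. The following is only a minimal sketch, assuming bash and a writable temporary directory; the file names a.err, b.err and b.outer and the variable workdir are made up for this illustration. The trailing "| cat" is a common trick to make the shell wait for the process substitution: the substitution is started while stdout is still the pipe, so cat does not see end-of-file until the substitution has exited.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: workdir, a.err, b.err and b.outer are hypothetical names.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Case A: 2> is processed first. When the >( ) after 1> is started,
# fd 2 already points at a.err, so the substitution inherits that stderr.
{ :; } 2>"$workdir/a.err" 1> >( echo "written by the substitution" >&2 ) | cat

# Case B: 1> is processed first. The >( ) is started while fd 2 is still
# the caller's stderr (captured here in b.outer), so b.err stays empty.
{
    { :; } 1> >( echo "written by the substitution" >&2 ) 2>"$workdir/b.err" | cat
} 2>"$workdir/b.outer"

# Show where the message ended up in each case.
printf 'a.err:   %s\n' "$(cat "$workdir/a.err")"
printf 'b.err:   %s\n' "$(cat "$workdir/b.err")"
printf 'b.outer: %s\n' "$(cat "$workdir/b.outer")"
```

If bash behaves as described above, the message should show up in a.err and b.outer, while b.err should be empty. That mirrors versions 2 and 1 of the question respectively.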