Should I always use GAWK over AWK?
awk can refer to many things. There's awk-the-standard, and there are many different implementations, one of which is gawk.
If you avoid implementation-specific functions, there is a high probability that your code will work unchanged in other implementations of awk-the-language.
gawk, being one of the implementations of awk-the-language, claims to conform to awk-the-standard while adding some functionality of its own.
$ man awk
…
DESCRIPTION
Gawk is the GNU Project implementation of the AWK programming language. It conforms to the definition of the language in the POSIX 1003.1 Standard. This version in turn is based on the description in The AWK Programming Language, by Aho, Kernighan, and Weinberger. Gawk provides the additional features found in the current version of Brian Kernighan's awk and a number of GNU-specific extensions.
…
As far as speed is concerned, using gawk as "plain" awk shouldn't matter - often, when gawk is installed, awk will just be a symbolic link to gawk, which means they'll be exactly the same program.
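To see whether awk on a given machine really is a link to gawk, something like the following can help. It is only a sketch: it assumes readlink supports -f (true on GNU systems, not guaranteed elsewhere), and it relies on the fact that gawk accepts --version while mawk accepts -W version; other implementations may answer to neither, in which case it prints "unknown".

```shell
#!/bin/sh
# Find the awk on the PATH and follow symbolic links to the real binary.
# Falls back to the unresolved path if `readlink -f` is unavailable.
awk_path=$(command -v awk)
resolved=$(readlink -f "$awk_path" 2>/dev/null || printf '%s\n' "$awk_path")
printf 'awk resolves to: %s\n' "$resolved"

# Ask the implementation to identify itself, trying the common spellings.
# stdin is redirected so an unrecognised option can never wait for input.
version=$(awk --version </dev/null 2>/dev/null | head -n 1)
[ -n "$version" ] || version=$(awk -W version </dev/null 2>/dev/null | head -n 1)
printf 'version line: %s\n' "${version:-unknown}"
```

If the resolved path ends in gawk, then calling awk and calling gawk run the same program.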
However, using gawk-specific functions means you will be locked into that particular implementation - so if (hypothetically) you find a faster implementation, you will probably have to adapt your script instead of just swapping the binary. (There may be faster implementations, but I don't know of any, as I've never needed to make awk scripts run faster.)
Personally, I tend to stick with "plain" awk and not use gawk-specific functions, but if you don't care about switching to a different implementation, using gawk extensions can make your script easier to write and save you time.
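As an illustration of that trade-off, here is the same small task written both ways. The input date is made up for the example. The first version uses only POSIX awk features; the second uses gensub(), which is a gawk extension, and is skipped when gawk is not installed.

```shell
#!/bin/sh
# Hypothetical sample input: an ISO date to be rewritten as DD/MM/YYYY.
date_in='2024-01-15'

# Portable: only POSIX awk features (split and string concatenation).
portable=$(printf '%s\n' "$date_in" |
    awk '{ split($0, d, "-"); print d[3] "/" d[2] "/" d[1] }')
printf 'portable: %s\n' "$portable"

# gawk-specific: gensub() with backreferences; runs only if gawk is installed.
if command -v gawk >/dev/null 2>&1; then
    gnu=$(printf '%s\n' "$date_in" |
        gawk '{ print gensub(/([0-9]+)-([0-9]+)-([0-9]+)/, "\\3/\\2/\\1", 1) }')
    printf 'gawk:     %s\n' "$gnu"
fi
```

Both produce 15/01/2024. The gensub() version is arguably more direct for pattern-based rewrites, but it will not run under other awk implementations.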
Currently, the most common AWK implementation is gawk, and the second most common is probably mawk, if only because it is the system AWK on Debian.
To quote the output of apt-cache show mawk:
Mawk is smaller and much faster than gawk. It has some compile-time limits such as NF = 32767 and sprintf buffer = 1020.
gawk has more, and better thought out, extensions, and I think better error handling and error reporting. Those are a real bonus when debugging a complex script and can be a good reason to use gawk even if you are not interested in its extensions.
On the other hand, if you have a debugged script, if you don't need a particular extension, if you can live with mawk's built-in limits (which is a lot of ifs), and you want to squeeze out the last bit of performance without leaving the comfort of AWK, then mawk is the way to go.
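Whether the speed difference matters is easiest to judge on your own data. The sketch below is one way to compare them, under a few assumptions: it is written for bash, where time is a shell keyword; the generated input and the one-line script are placeholders for your real workload; and either implementation is simply skipped if it is not installed.

```shell
#!/usr/bin/env bash
# Generate throwaway input: one million lines of "i (i mod 7)".
input=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 1000000; i++) print i, i % 7 }' > "$input"

# A portable test script: count the lines whose second field is 0.
script='$2 == 0 { n++ } END { print n }'

# Reference result from whatever `awk` is on the PATH.
reference=$(awk "$script" "$input")
printf 'reference result: %s\n' "$reference"

# Time the same script under each implementation that is installed.
for impl in gawk mawk; do
    if command -v "$impl" >/dev/null 2>&1; then
        printf '== %s ==\n' "$impl"
        time "$impl" "$script" "$input"
    else
        printf '== %s: not installed, skipped ==\n' "$impl"
    fi
done

rm -f "$input"
```

Because the script sticks to portable features, it runs unchanged under both, which is what makes swapping implementations this cheap.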