Delete all Bash comments

How can I match and remove all comments from a string? Using sed, I can delete comments that start on a new line and comments on lines that contain no quotes. But my script doesn't work on an example like the following:

This one "# this is not a comment" # but this "is a comment"


Can sed handle this case? If so, what is the regular expression?


  • Input:

    This one "# this is not a comment" # but this "is a comment" 

  • Output:

    This one "# this is not a comment"



2 answers

If we assume that # is not a comment when it is quoted or escaped with a backslash, then we can define the following regex:

(ES | RT | QT)* C

where

ES - escape sequence: \ followed by 1 char

RT - regular text: any run of characters other than ", \ and #

QT - quoted text: a double-quoted string

C - a comment that starts with an unescaped, unquoted hash character # and ends with the end of the line



Possible solution using sed (GNU sed, because \| is a GNU extension to basic regular expressions):

sed 's/^\(\(\\.\|[^"\\#]*\|"[^"]*"\)*\)#.*$/\1/'
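As a rough sketch of how the pieces fit together, the same pattern can be assembled from the four named parts and tried on the example from the question. The function name strip_comments is hypothetical, and the script assumes GNU sed is the sed on the PATH.

```shell
#!/bin/sh
# Sketch only: rebuilds the sed pattern above from its named parts.
# Assumes GNU sed (needed for \| in a basic regular expression).
ES='\\.'         # escape sequence: a backslash followed by any one character
RT='[^"\\#]*'    # regular text: anything except ", \ and #
QT='"[^"]*"'     # quoted text: double quotes only, no escapes inside
C='#.*$'         # comment: first unquoted, unescaped # through end of line

# strip_comments is a hypothetical helper name; it filters stdin to stdout.
strip_comments() {
    sed "s/^\(\(${ES}\|${RT}\|${QT}\)*\)${C}/\1/"
}

printf '%s\n' 'This one "# this is not a comment" # but this "is a comment"' |
    strip_comments
```

Note that whitespace in front of the removed comment is kept, so the result still ends with a space after the closing quote. Lines with an unterminated double quote are left unchanged, because the pattern cannot match past the stray quote.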




You can use a lexical analyzer such as Flex, applied directly to the script. Its manual has an entry titled "How can I match C-style comments?", and I think you can adapt that part to suit your problem.

If you need an in-depth tutorial, you can find one here; in the Lexical Analysis section there is a PDF that introduces the tool and an archive with some practical examples, including "c99-comment-eater", from which you can draw inspiration.
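To give a feel for what such a scanner does, here is a minimal hand-written sketch in plain shell. It is not Flex and not taken from the Flex manual; it only illustrates the idea of reading one character at a time while tracking a state. The function name strip_comment_line and the three state names are assumptions made for this example, and only double quotes and backslash escapes are recognized.

```shell
#!/bin/sh
# Sketch of a tiny hand-rolled scanner (not Flex): walks one line character
# by character and stops at the first # that is neither quoted nor escaped.
# strip_comment_line and the state names are hypothetical.
# Variables are global because POSIX sh has no "local".
strip_comment_line() {
    rest=$1 out='' state=text
    while [ -n "$rest" ]; do
        tail=${rest#?}          # everything after the first character
        ch=${rest%"$tail"}      # the first character itself
        rest=$tail
        case $state in
            text)
                case $ch in
                    '#') break ;;              # comment starts here: stop copying
                    '"') state=quoted ;;       # entering a double-quoted string
                    \\)  state=escaped ;;      # next character is taken literally
                esac ;;
            quoted)
                if [ "$ch" = '"' ]; then state=text; fi ;;
            escaped)
                state=text ;;
        esac
        out=$out$ch
    done
    printf '%s\n' "$out"
}

strip_comment_line 'This one "# this is not a comment" # but this "is a comment"'
```

A generated lexer would express the same states as rules and start conditions instead of a case statement, which becomes easier to maintain once more syntax has to be covered.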


