Delete all Bash comments

How can I match and remove all comments from a string? With sed I can already delete comments that start at the beginning of a line and comments on lines that contain no quotes, but my script fails on a line like the following:

This one "# this is not a comment" # but this "is a comment"

      

Can sed handle this case? If so, what is the regular expression?

Example:

  • Input:

    This one "# this is not a comment" # but this "is a comment" 
    
          

  • Output:

    This one "# this is not a comment"
    
          



2 answers


If we assume that # is not a comment when it is quoted or escaped with a backslash, then we can define the following regex:

(ES|RT|QT)*C?

      

Where

ES - escape sequence: \ followed by 1 char

\\.

      

RT - regular text: any run of characters other than a double quote, a backslash, or a hash

[^"\\#]*

      



QT - quoted text

"[^"]*"

      

C - a comment, which starts with an unescaped, unquoted hash character # and runs to the end of the line

#.*
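
To see how these four pieces carve up a line, the sketch below tokenizes the example from the question with `grep -oE`, which prints each match on its own line. It assumes a grep that supports `-o` (GNU and BSD grep do). RT is written with `+` instead of `*` here, an adjustment made only so that it never produces an empty match.

```shell
# Alternation order: ES | QT | C | RT (RT uses + so it cannot match empty)
line='This one "# this is not a comment" # but this "is a comment"'
tokens=$(printf '%s\n' "$line" | grep -oE '\\.|"[^"]*"|#.*|[^"\\#]+')
# One token per line
printf '%s\n' "$tokens"
```

The quoted `#` stays inside a QT token, and the last token printed is the comment itself, `# but this "is a comment"`.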

      

Possible solution using sed (note that the \| alternation in a basic regular expression is a GNU extension):

sed 's/^\(\(\\.\|[^"\\#]*\|"[^"]*"\)*\)#.*$/\1/'
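
As a rough sketch of how this could be wrapped for reuse, the function below applies the same pattern in extended-regex form (`sed -E`, accepted by GNU and BSD sed). The name `strip_comments` is hypothetical, and the second expression, which trims the whitespace left in front of the removed comment, is an addition that is not part of the answer above. RT is written without its `*`, since the enclosing group already repeats.

```shell
# strip_comments: hypothetical helper, reads lines on stdin and writes them
# with trailing # comments removed.
strip_comments() {
    sed -E \
        -e 's/^((\\.|[^"\\#]|"[^"]*")*)#.*$/\1/' \
        -e 's/[[:space:]]+$//'
}

# Example from the question
printf '%s\n' 'This one "# this is not a comment" # but this "is a comment"' | strip_comments
```

This is still only an approximation of Bash's rules: it treats a `#` inside single quotes as a comment start, it would also cut at `$#`, `${#var}`, or a `#` in the middle of a word, and it empties a shebang line, so it is safest on input where those cases do not occur.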

      



You can use a lexical analyzer generator such as Flex, applied directly to the script. Its manual has a section titled "How can I match C-style comments?", and I think you can adapt that part to suit your problem.



If you need an in-depth tutorial, you can find one here; in the Lexical Analysis section there is a PDF that introduces the tool and an archive of practical examples, including "c99-comment-eater", from which you can draw inspiration.
