How do I match spaces that are not a multiple of 4?
I have reformatted a Python script using Notepad++, but some lines are not indented by 4 (or 8, 12, 16, etc.) spaces.
So I need to match runs of leading spaces (i.e. the indentation at the start of each line) whose length is NOT a multiple of 4, i.e. 1, 2, 3, 5, 6, 7, 9, 10, 11, etc. spaces.
E.g.:
>>>   a = 1 # match this, as there're 3 spaces at the beginning
>>>       b = a # match this too, as indent by 7 spaces
>>>    c = 2 # but not this, since it indented exactly by 4 spaces
>>>        d = c # not this either, since indented by 8 spaces
I was able to match whitespace in multiples of four using something like:
^( {16}| {12}| {8}| {4})
Then I tried to match the opposite with something like:
^[^( {16}| {12}| {8}| {4})]
but it only matches an empty string or a single character at the start of a line, which is not what I want.
I'm a complete newbie to regex, and I've been searching for hours with no luck. I know I could always list every count that is not a multiple of 4, but I was hoping someone could suggest a less cumbersome method.
Thanks.
Update 1
Using the regex from @user2864740:
^(?:\s{4})*\s{1,3}\S
or the one from @alpha bravo:
^(?!(\s{4})+\S)(.*)
matches indents that are not a multiple of 4, but it also matches a blank line containing 4 (8, 16, etc.) spaces together with the first character of the next non-empty line.
for example (at regex101.com)
How can I avoid matching the cases described above?
A character class can only contain a set of characters, so [^..] is not suitable for general negation. The regex [^( {16}| {12}| {8}| {4})] is equivalent to [^( {16}|284)], which matches any single character that is not in the listed set.
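To make the point about character classes concrete, here is a small sketch in Python (an assumption on my part; the question itself is about Notepad++, but the character-class behaviour is the same). The variable names are only illustrative.

```python
import re

# The attempted pattern from the question: everything between [^ and ] is
# read as a set of single characters, not as a group of alternatives.
attempt = re.compile(r"^[^( {16}| {12}| {8}| {4})]")

# It consumes exactly one character, and only if that character is not
# one of: ( ) { } | space 1 2 4 6 8
first_char_ok = attempt.match("a = 1")     # 'a' is not in the set
leading_space = attempt.match("   a = 1")  # a space is in the set
leading_digit = attempt.match("8 spaces")  # '8' is in the set

print(first_char_ok.group())  # a
print(leading_space)          # None
print(leading_digit)          # None
```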
Now, matching a run of n spaces where n is not a multiple of 4 is the same as matching runs where n mod 4 is 1, 2, or 3 (that is, anything except n mod 4 = 0). This can be done with a pattern such as:
(?:\s{4})*\s{1,3}\S
Explanation:
(?:\s{4})* - match any number of whole groups of 4 whitespace characters, and then ..
\s{1,3} - match 1, 2, or 3 more whitespace characters, such that ..
\S - they are followed by a non-whitespace character
The regex may need a trailing match-all (.*) or a leading line anchor (^), depending on how it is used.
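As a rough sketch of how this could be tried out in Python, the variant below makes two changes that are my own assumptions rather than part of the pattern above: it uses a literal space instead of \s (since \s also matches line breaks, a match can run from a whitespace-only line into the next line), and it uses a lookahead (?=\S) so the first code character is checked but not consumed. It assumes the file is indented with spaces only, not tabs.

```python
import re

# Literal spaces instead of \s, so the match cannot cross a line break,
# and a lookahead for \S so that whitespace-only lines are skipped.
BAD_INDENT = re.compile(r"^(?: {4})* {1,3}(?=\S)", re.MULTILINE)

sample = (
    "   a = 1\n"       # 3 spaces: should match
    "       b = a\n"   # 7 spaces: should match
    "    c = 2\n"      # 4 spaces: should not match
    "    \n"           # 4 spaces only: should not match
    "        d = c\n"  # 8 spaces: should not match
)

# Width of each badly indented run of leading spaces.
bad_widths = [len(m.group()) for m in BAD_INDENT.finditer(sample)]
print(bad_widths)  # [3, 7]
```

Whether the same pattern behaves identically in Notepad++ depends on its regex engine and search settings, so treat this as a starting point to test rather than a confirmed fix.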
I could suggest a Python script that tells you which lines are indented incorrectly:
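with open('path/to/code/file') as infile:
    # Number lines from 1 so the report matches what an editor shows.
    for i, line in enumerate(infile, 1):
        total = len(line)
        # Leading spaces = full length minus length without leading spaces.
        whitespace = total - len(line.lstrip(' '))
        if whitespace % 4:
            print("Inconsistent indenting on line", i)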
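The same check can be wrapped in a function that works on a string, which makes it easier to try on a small sample. This is only a sketch: the name find_bad_indents and the width parameter are hypothetical, and skipping whitespace-only lines is an added assumption that the script above does not make.

```python
def find_bad_indents(text, width=4):
    """Return 1-based numbers of lines whose leading-space count is not a multiple of width."""
    bad = []
    for number, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue  # assumption: whitespace-only lines are ignored
        indent = len(line) - len(line.lstrip(" "))
        if indent % width:
            bad.append(number)
    return bad


sample = "   a = 1\n       b = a\n    c = 2\n    \n        d = c\n"
print(find_bad_indents(sample))  # [1, 2]
```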