Using the regex OR operator to accept user input "A" or "An"

I am trying to validate user input against 'a' | 'an', which should satisfy the if statement. If that fails, the elif blocks check whether the second word is "about" and, if not, whether it is "anyone". Unfortunately, "about" and "anyone" also start with the letters "a" or "an", so I thought I needed to add a space after "a" and "an" for the regex to tell them apart.

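import re
from django.http import HttpResponse

# Receive user input (this excerpt runs inside a Django view).
secrets = {}
secrets['text'] = request.GET.get('text')

# Patterns as originally written; note that none of them is anchored at the end.
regex_a = re.compile("(a|an)")
regex_about = re.compile('about')
regex_anyone = re.compile('anyone')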

# Get second word from secrets[text]
secondword = secrets['text'].split()[1]
# If 2nd word is == 'a/an'
if regex_a.match(secondword):
    return HttpResponse("Text was (a) or (an)")

# Else if 2nd word is == about
elif regex_about.match(secondword):
    return HttpResponse("Second word was (about)")

elif regex_anyone.match(secondword):
    return HttpResponse("Second word was (anyone)")

else:
    return HttpResponse("Failed to interpret user input")

      

The current regex ("(a|an)") returns "Text was (a) or (an)" even if the user enters "about" or "anyone" as the second word, which is to be expected because both words start with "a".

So I also tried ("(a\s|an\s)"), which returns "Failed to interpret user input" when the second word is 'a' or 'an'. However, it returns the correct answer for "about" and "anyone". This is really confusing ...

Then I also tried ("(a_|an_)"), which returns the same results as the previous test.

Apart from these three tests, I have tried many others, but there are too many to list here.

+3




4 answers


You can use:

regex_a = re.compile("(a|an)$")

      



This tells the regular expression that the string must end right there to match.

The regex ("(a\s|an\s)") will never work because it expects the substrings 'a ' and 'an ' (each with a trailing space), whereas split() in secondword = secrets['text'].split()[1] returns words with the surrounding whitespace already removed.
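The behaviour of the three patterns can be checked side by side. This is a minimal sketch, not the asker's view code: the sample sentences and the helper `second_word` are assumptions made up for illustration.

```python
import re

regex_unanchored = re.compile(r"(a|an)")      # pattern from the question
regex_anchored = re.compile(r"(a|an)$")       # pattern suggested in this answer
regex_space = re.compile(r"(a\s|an\s)")       # second attempt from the question


def second_word(text):
    """Return the second whitespace-separated word (hypothetical helper)."""
    return text.split()[1]


# Hypothetical inputs, one per branch of the original if/elif chain.
samples = ["find a key", "find an apple", "talk about it", "ask anyone here"]

for text in samples:
    word = second_word(text)
    print(
        repr(word),
        bool(regex_unanchored.match(word)),  # True for every sample word
        bool(regex_anchored.match(word)),    # True only for 'a' and 'an'
        bool(regex_space.match(word)),       # always False: split() removed the space
    )
```

`re.match` only anchors the start of the string, which is why the unanchored pattern accepts "about" and "anyone"; adding `$` forces the word to end right after "a" or "an".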

+1




Use this:

(a\b|an\b)

      



\b is a word boundary. Placed after "a" or "an", it matches only where the word ends, so "about" and "anyone" are rejected.
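A short sketch of the word-boundary version applied to single words, as in the question. The helper name `is_article` is an assumption introduced here for illustration.

```python
import re

# Raw string so that \b is passed to the regex engine as a word boundary
# rather than being interpreted by Python as a backspace character.
regex_article = re.compile(r"(a\b|an\b)")


def is_article(word):
    """Return True if the word starts with a complete 'a' or 'an' (hypothetical helper)."""
    return regex_article.match(word) is not None


for word in ["a", "an", "about", "anyone"]:
    print(word, is_article(word))
```

Note that \b also matches between a letter and punctuation, so a token such as "a," still counts as a match; if only the bare word should pass, `re.fullmatch` or a trailing `$` is stricter.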

Demo is here. Welcome to Stack Overflow! Take a tour of the site in the Help section if you haven't already! :-)

+5




import re
sentence = "I am about to finish my Engineering and become an Engineer and a responsible person."
re.findall(r'(a|an)\s', sentence)

      

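The trailing \s requires whitespace right after "a" or "an", so words such as "about" and "and" are skipped. Because nothing anchors the start of the article, a word that merely ends in "a" or "an" would also match; prefixing the pattern with \b avoids that.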
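The following sketch compares the pattern from this answer with a variant that uses word boundaries on both sides. The second sentence and the variable names are assumptions added for illustration.

```python
import re

trailing_space = re.compile(r"(a|an)\s")   # pattern from this answer
both_bounds = re.compile(r"\b(a|an)\b")    # hypothetical stricter variant

sentence = (
    "I am about to finish my Engineering and become an Engineer "
    "and a responsible person."
)
tricky = "banana and an apple"  # made-up input with a word ending in 'a'

print(trailing_space.findall(sentence))  # ['an', 'a']
print(both_bounds.findall(sentence))     # ['an', 'a']

# The final 'a' of "banana" is followed by a space, so the first pattern
# reports it as an article; the boundary version does not.
print(trailing_space.findall(tricky))    # ['a', 'an']
print(both_bounds.findall(tricky))       # ['an']
```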

+1




Use re.match:

pattern = "^(A|An)$|^((?!^A$|^An$)about)$|^((?!^((?!^A$|^An$)about)$)anyone)$"
m = re.match(pattern, secondword)

if m:
    print(m.groups())

      

Here's a breakdown of this expression:

^(A|An)$

      

It matches "A" or "An" as isolated words. If this does not match, the regex moves on to the next alternative.

^((?!^A$|^An$)about)$

      

It matches the word "about", but only if the string is not "A" or "An" on its own. If this does not match, the regex moves on to the next alternative.

^((?!^((?!^A$|^An$)about)$)anyone)$

      

This matches the word "anyone", but only if the string is not "about" on its own (and, in turn, not "A" or "An").

You can test the regex here.
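Since every alternative in the pattern above is already anchored with ^ and $, the negative lookaheads can never reject a string that would otherwise match. A minimal sketch of a shorter alternation that agrees with it on the sample words below; the names `simplified`, `classify`, and `samples` are assumptions introduced here, not part of the answer.

```python
import re

# Pattern from the answer above, unchanged.
original = re.compile(
    r"^(A|An)$|^((?!^A$|^An$)about)$|^((?!^((?!^A$|^An$)about)$)anyone)$"
)

# Hypothetical simplification: a single alternation checked with fullmatch,
# which requires the whole string to match.
simplified = re.compile(r"A|An|about|anyone")


def classify(word):
    """Return the matched word, or None if the input is not one of the four."""
    m = simplified.fullmatch(word)
    return m.group(0) if m else None


samples = ["A", "An", "about", "anyone", "a", "Any", "aboutt", ""]
for word in samples:
    # Both patterns accept or reject each sample word identically.
    print(repr(word), bool(original.match(word)), classify(word))
```

One difference to keep in mind: `$` also matches just before a trailing newline, while `fullmatch` does not. To accept lowercase "a" and "an" as in the question, `re.IGNORECASE` can be passed to `re.compile`.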

0

