Match single quotes from python re
The following regular expression finds every single word enclosed in single quotes:
In [6]: re.findall(r"'(\w+)'", s)
Out[6]: ['Tom', 'Harry', 'rock']
Here:
- ' matches a single quote;
- \w+ matches one or more word characters;
- ' matches a single quote;
- the parentheses form a capturing group: they define the part of the match that is returned by findall().
If you only want to find words that start with a capital letter, the regex can be modified like this:
In [7]: re.findall(r"'([A-Z]\w*)'", s)
Out[7]: ['Tom', 'Harry']
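The snippets above never show how s is defined. A self-contained sketch, assuming s is the sentence quoted in another answer on this page (the variable names quoted and capitalized are only illustrative):

```python
import re

# Assumption: s is the sentence quoted in another answer on this page.
s = ("This hasn't been much that much of a twist and turn "
     "to 'Tom','Harry' and u know who..yes its 'rock'")

# The capturing group returns only the word, without the surrounding quotes.
quoted = re.findall(r"'(\w+)'", s)

# Same idea, but the word must start with an uppercase ASCII letter.
capitalized = re.findall(r"'([A-Z]\w*)'", s)

print(quoted)
print(capitalized)
```

The apostrophe in "hasn't" is not picked up because the text after it ("t been") is not closed by a second quote.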
I suggest
r = re.compile(r"\B'\w+'\B")
apos = r.findall("This hasn't been much that much of a twist and turn to 'Tom','Harry' and u know who..yes its 'rock'")
Result:
>>> apos
["'Tom'", "'Harry'", "'rock'"]
Negative word boundaries (\B) prevent matches such as 'n' in words like Rock'n'Roll.
Explanation:
\B # make sure that we're not at a word boundary
' # match a quote
\w+ # match one or more word characters (letters, digits, underscore)
' # match a quote
\B # make sure that we're not at a word boundary
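To see what \B buys you, here is a small sketch comparing the pattern with and without it. The test string is made up for illustration; the commented layout above is passed to re.compile using the re.VERBOSE flag, which ignores whitespace and # comments inside the pattern.

```python
import re

# Hypothetical test string, chosen to contain an in-word 'n'.
text = "Rock'n'Roll by 'Tom' today"

# The commented explanation, written as a verbose pattern.
with_b = re.compile(r"""
    \B    # make sure that we're not at a word boundary
    '     # match a quote
    \w+   # match one or more word characters
    '     # match a quote
    \B    # make sure that we're not at a word boundary
""", re.VERBOSE)

# The same pattern without the \B guards, for comparison.
without_b = re.compile(r"'\w+'")

guarded = with_b.findall(text)
unguarded = without_b.findall(text)

print(guarded)    # only the free-standing quoted word
print(unguarded)  # also picks up 'n' from inside Rock'n'Roll
```

Between "k" and the quote there is a word boundary, so \B fails there and the match inside Rock'n'Roll is rejected.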
^ ('hat' or 'caret', among other names) in a regex means "start of the string" (or, with options such as re.MULTILINE, "start of a line"), which is not what you want here. Omitting it, your regex works fine:
>>> re.findall(r'\'+\w+\'', s)
["'Tom'", "'Harry'", "'rock'"]
The regular expressions suggested by others may be better for what you are trying to achieve; this is just the minimal change that fixes your problem.
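A short sketch of the effect of ^, assuming the original pattern was the one above with a leading ^ (the question's exact regex is not shown here, so treat that as a guess). The two-line string at the end is made up to show re.MULTILINE.

```python
import re

# Assumption: s is the sentence quoted in another answer on this page.
s = ("This hasn't been much that much of a twist and turn "
     "to 'Tom','Harry' and u know who..yes its 'rock'")

# Anchored: can only match at position 0, and s starts with "T", not a quote.
anchored = re.findall(r"^'+\w+'", s)

# Unanchored: free to match anywhere in the string.
unanchored = re.findall(r"'+\w+'", s)

# With re.MULTILINE, ^ also matches right after each newline.
lines = "'Tom'\n'Harry'"
per_string = re.findall(r"^'\w+'", lines)
per_line = re.findall(r"^'\w+'", lines, re.MULTILINE)

print(anchored, unanchored, per_string, per_line)
```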