Filename matching with regex in python

Question

I'm looking for a regex to match file names in a folder. I already have all the file names in a list. Now I want to match each one against a pattern in a loop (file is a string such as):

./test1_word1_1.1_1.2_1.3.csv

with:

match = re.search(r'./{([\w]+)}_word1_{([0-9.]+)}_{([0-9.]+)}_{([0-9.]+)}*',file)

I have used regex before, but in this particular case it just doesn't work. Can you help me?

I want to keep working with the match like this (expected results shown):

match[0] = test1
match[1] = 1.1
match[2] = 1.2
match[3] = 1.3

The curly brackets are my mistake. They don't make any sense, sorry.

Regards, Sebastian

+4

python regex

sebastian 18 jul. 17 at 8:56

3 answers

Since the file name is test_word<>.csv, and the content inside <> always changes and consists of digits and dots, can you try this?

r"test1_word[_0-9.]*.csv"

Sample code and test strings

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"test1_word[_0-9.]*.csv"

test_str = ("./test1_word1_1.1_1.2_1.3.csv\n"
    "./test1_word1_1.31.2_1.555.csv\n"
    "./test1_word1_10.31.2_2000.00.csv")

matches = re.finditer(regex, test_str)

for matchNum, match in enumerate(matches):
    matchNum = matchNum + 1

    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))

    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1

        print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.

Want to check it out? https://regex101.com/ will help you.
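
One caveat with the pattern above: the dot before csv is unescaped, so it matches any character, and the pattern has no capture groups, so the group loop in the sample prints nothing. A minimal sketch of a stricter variant, assuming the names always end in .csv (the names loose, strict and typo are only illustrative and not part of the answer):

```python
import re

# Pattern from the answer: the "." before "csv" is unescaped and matches any character.
loose = re.compile(r"test1_word[_0-9.]*.csv")
# Variant with the dot escaped and the variable part captured
# (an assumption for illustration, not the answerer's code).
strict = re.compile(r"test1_word([_0-9.]*)\.csv")

name = "./test1_word1_1.1_1.2_1.3.csv"
typo = "./test1_word1_1.1_1.2_1.3Xcsv"  # hypothetical malformed name

loose_hits = [bool(loose.search(n)) for n in (name, typo)]
strict_hits = [bool(strict.search(n)) for n in (name, typo)]
captured = strict.search(name).group(1)

print(loose_hits)   # the loose pattern accepts both names
print(strict_hits)  # the strict pattern rejects the malformed one
print(captured)
```

The captured text is still a single string, so the individual numbers would have to be split on "_" afterwards.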

+1

Ajay2588 18 jul. 17 at 9:27 am

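from glob import glob

# Python's glob understands *, ? and [...] but not the extended +(...) syntax,
# so each field is approximated as one character from the set followed by *.
glob("[0-9a-zA-Z]*_word1_[0-9.]*_[0-9.]*_[0-9.]*.*")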
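
Since the file names are already in a list, shell-style wildcards can also be applied to that list with fnmatch.filter, without reading the directory again. A rough sketch, assuming the simple wildcard pattern below is selective enough for the folder (the list names is made up for illustration). Wildcards only select names; they do not extract the individual fields:

```python
import fnmatch

# Hypothetical list of names, standing in for the list the asker already has.
names = [
    "./test1_word1_1.1_1.2_1.3.csv",
    "./test2_word1_10.5_2.0_3.csv",
    "./notes.txt",
    "./test3_word2_1.1_1.2_1.3.csv",
]

# "*" matches any run of characters, including dots and slashes.
selected = fnmatch.filter(names, "*_word1_*_*_*.csv")
print(selected)
```

A regex is still needed afterwards if the parts of each name have to be pulled out separately.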

0

leturn cloud Jul 25 At 9:58 am

Wiktor Stribiżew · Accepted Answer · 2017-07-18T09:17:20+0000

You can use

r'\./([^\W_]+)_word1_([0-9.]+)_([0-9.]+)_([0-9]+(?:\.[0-9]+)*)'

See regex demo

Details:

\. - a literal dot (if unescaped, it matches any char other than a line break char)
/ - a / character (it does not need to be escaped in a Python regex pattern)
([^\W_]+) - Group 1, matching 1 or more letters or digits (if you want to match a chunk containing _, keep your original (\w+) pattern)
_word1_ - a literal substring
([0-9.]+) - Group 2, matching 1 or more digits and/or . symbols
_ - an underscore character
([0-9.]+) - Group 3, matching 1 or more digits and/or . symbols
_ - an underscore character
([0-9]+(?:\.[0-9]+)*) - Group 4, matching 1 or more digits followed by 0 or more sequences of a . and 1 or more digits
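
Two of these choices can be checked directly: the first group excludes underscores, and the last group cannot end in a dot. The sketch below compares the pattern with two looser variants; the variants and the underscored file name are assumptions made up for the comparison, not part of the answer:

```python
import re

# Pattern from the answer, plus two looser variants for comparison (illustrative only).
accepted = r"\./([^\W_]+)_word1_([0-9.]+)_([0-9.]+)_([0-9]+(?:\.[0-9]+)*)"
word_first = r"\./(\w+)_word1_([0-9.]+)_([0-9.]+)_([0-9]+(?:\.[0-9]+)*)"
loose_last = r"\./([^\W_]+)_word1_([0-9.]+)_([0-9.]+)_([0-9.]+)"

plain = "./test1_word1_1.1_1.2_1.3.csv"
underscored = "./my_test1_word1_1.1_1.2_1.3.csv"  # hypothetical name with "_" in the first part

# [^\W_]+ excludes "_", so the first part cannot span "my_test1"; \w+ can.
strict_match = re.search(accepted, underscored)
first_with_w = re.search(word_first, underscored).group(1)

# A plain [0-9.]+ at the end also swallows the dot that starts the extension.
last_strict = re.search(accepted, plain).group(4)
last_loose = re.search(loose_last, plain).group(4)

print(strict_match, first_with_w, last_strict, last_loose)
```

Which variant fits depends on whether the first part of the real file names can contain underscores.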

Python demo:

import re
rx = r"\./([^\W_]+)_word1_([0-9.]+)_([0-9.]+)_([0-9]+(?:\.[0-9]+)*)"
s = "./test1_word1_1.1_1.2_1.3.csv"
m = re.search(rx, s)
if m:
    print("Part1: {}\nPart2: {}\nPart3: {}\nPart4: {}".format(m.group(1), m.group(2), m.group(3), m.group(4) ))

Output:

Part1: test1
Part2: 1.1
Part3: 1.2
Part4: 1.3
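
The question asks for matching inside a loop over a list of names. A minimal sketch of that, assuming the names follow the layout from the question (the list files and the dictionary parsed are hypothetical). Note that m.group(0) is the whole match, so the captured parts start at index 1:

```python
import re

# Pattern from the answer, compiled once and reused for every name in the list.
rx = re.compile(r"\./([^\W_]+)_word1_([0-9.]+)_([0-9.]+)_([0-9]+(?:\.[0-9]+)*)")

# Hypothetical list of names, standing in for the list the asker already has.
files = [
    "./test1_word1_1.1_1.2_1.3.csv",
    "./test2_word1_10.5_2.0_3.csv",
    "./readme.txt",
]

parsed = {}
for file in files:
    m = rx.search(file)
    if m is None:
        continue  # skip names that do not follow the expected layout
    name, a, b, c = m.groups()  # groups() returns only the captured parts
    parsed[name] = (a, b, c)

print(parsed)
```

The values are kept as strings here; converting them with float() would be a separate step and would fail on a part such as 1.1.1, which [0-9.]+ also accepts.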
