Group files based on common prefix

Consider the following set of files:

/wr_vjxeacn/lzx/vjx/rkkelkwrvkjl.o
/wr_vjxeacn/lzx/vjx/wllnxncvr.o
/wr_hvlx/lzx/hvlx/wlxkjjlnr/Sbisln.xww
/wr_hvlx/lzx/wllqepse/lzx/xww/ANTLR/evi
/wr_hvlx/lzx/wllqepse/lzx/xww/ANTLR/zajrvhn/sjrez3x.cee
/wr_hvlx/lzx/wllqepse/lzx/xww/ivj/GNUhstnmven
/wr_hvlx/eklr+mkajc/sjrez3x64.evi.7ss153m930724031i252iic841n68i6i
/wr_hvlx/lzx/wllqepse/lzx/xww/ANTLR/evi/sjrez3x.evi
/wnkwenrkkel/lzx
/wnkwenrkkel/lzx/GNUhstnmven.xkhhkj
/wnkwenrkkel/lzx/GNUhstnmven.cnwl
/wnkwenrkkel/lzx/GNUhstnmven.evlr
/wnkwenrkkel/lzx/GNUhstnmven.gvjckgl-vs32

      

I am trying to find the best way to group items that share a common dirname prefix, where:

common prefix dirname = os.path.dirname(os.path.commonprefix(...))
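
To make the definition concrete, here is a minimal sketch of that expression wrapped in a helper. The name common_prefix_dirname is mine, purely for illustration. Note that os.path.commonprefix compares character by character, so it can stop in the middle of a file name; taking os.path.dirname of the result trims that partial component.

```python
import os.path


def common_prefix_dirname(paths):
    """Directory part of the character-wise common prefix of `paths`."""
    return os.path.dirname(os.path.commonprefix(list(paths)))


# Two files in the same directory.
print(common_prefix_dirname([
    "/wr_vjxeacn/lzx/vjx/rkkelkwrvkjl.o",
    "/wr_vjxeacn/lzx/vjx/wllnxncvr.o",
]))  # -> /wr_vjxeacn/lzx/vjx

# The raw common prefix ends in ".../GNUhstnmven." (a partial file name);
# dirname cuts it back to the containing directory.
print(common_prefix_dirname([
    "/wnkwenrkkel/lzx/GNUhstnmven.xkhhkj",
    "/wnkwenrkkel/lzx/GNUhstnmven.cnwl",
]))  # -> /wnkwenrkkel/lzx
```

One consequence worth knowing: when one path is itself a prefix of another (for example /wnkwenrkkel/lzx and a file below it), the result is the parent of that path, here /wnkwenrkkel.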

      

Ideally, after grouping, the list above would look like this:

/wr_vjxeacn/lzx/vjx
/wr_vjxeacn/lzx/vjx/rkkelkwrvkjl.o
/wr_vjxeacn/lzx/vjx/wllnxncvr.o
*************************************************************
/wr_hvlx/lzx/hvlx
/wr_hvlx/lzx/hvlx/wlxkjjlnr/Sbisln.xww
*************************************************************
/wr_hvlx/lzx/wllqepse
/wr_hvlx/lzx/wllqepse/lzx/xww/ANTLR/evi
/wr_hvlx/lzx/wllqepse/lzx/xww/ANTLR/zajrvhn/sjrez3x.cee
/wr_hvlx/lzx/wllqepse/lzx/xww/ivj/GNUhstnmven
*************************************************************
/wr_hvlx/eklr+mkajc
/wr_hvlx/eklr+mkajc/sjrez3x64.evi.7ss153m930724031i252iic841n68i6i
*************************************************************
/wr_hvlx/lzx/wllqepse/lzx/xww/ANTLR/evi
/wr_hvlx/lzx/wllqepse/lzx/xww/ANTLR/evi/sjrez3x.evi
*************************************************************
/wnkwenrkkel
/wnkwenrkkel/lzx
*************************************************************
/wnkwenrkkel/lzx
/wnkwenrkkel/lzx/GNUhstnmven.xkhhkj
/wnkwenrkkel/lzx/GNUhstnmven.cnwl
/wnkwenrkkel/lzx/GNUhstnmven.evlr
/wnkwenrkkel/lzx/GNUhstnmven.gvjckgl-vs32

      

What I tried

  • itertools.groupby, but it has no lookahead or lookbehind.
  • Iterating over the list and starting a new group whenever the prefix is violated. I could not find a working solution that handles all the edge cases.
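
For reference, here is a rough sketch of one possible reading of the grouping I am after. It is not a full solution. It assumes that a "module" is the directory of a path cut to at most three components (MODULE_DEPTH is my guess from the example, not a given), that paths are POSIX-style, and that only neighbouring entries are merged. With a key function like this, itertools.groupby needs no lookahead. The names module_key and group_consecutive are hypothetical.

```python
import itertools
import os.path

MODULE_DEPTH = 3  # assumption: a module is at most three directories deep


def module_key(path, depth=MODULE_DEPTH):
    """Return the directory of `path`, cut to at most `depth` components."""
    parts = [part for part in os.path.dirname(path).split("/") if part]
    return "/" + "/".join(parts[:depth])


def group_consecutive(paths, depth=MODULE_DEPTH):
    """Group neighbouring paths whose truncated directory is the same."""
    return [
        (key, list(members))
        for key, members in itertools.groupby(
            paths, key=lambda p: module_key(p, depth)
        )
    ]


sample = [
    "/wr_hvlx/lzx/hvlx/wlxkjjlnr/Sbisln.xww",
    "/wr_hvlx/lzx/wllqepse/lzx/xww/ANTLR/evi",
    "/wr_hvlx/lzx/wllqepse/lzx/xww/ANTLR/zajrvhn/sjrez3x.cee",
    "/wr_hvlx/eklr+mkajc/sjrez3x64.evi.7ss153m930724031i252iic841n68i6i",
    "/wr_hvlx/lzx/wllqepse/lzx/xww/ANTLR/evi/sjrez3x.evi",
]

for key, members in group_consecutive(sample):
    print(key)
    for member in members:
        print(member)
    print("*" * 61)
```

On the full list this gives the same seven groups as the expected output, with one difference in labelling: the group holding only sjrez3x.evi is headed /wr_hvlx/lzx/wllqepse rather than the deeper .../ANTLR/evi directory, so the fixed depth is clearly not the whole rule.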

Motivation

I have a list of checked files that I want to group based on modules to identify the respective owners of the modules.

python python-2.7 group-by

