Python: count files by extension and display the counts from high to low
I am trying to count the files in a directory by type and then show how many there are of each. What's the best way to do this? This script becomes very long with all the extra lines of code such as pdf = 0.
Also, how can I display the file counts from high to low?
import os
pdf = 0
doc = 0
docx = 0
xls = 0
xlsx = 0
ppt = 0
pptx = 0
for file in os.listdir("C:\\Users\\joey\\Desktop\\school\\ICOMMH"):
    if file.endswith(".pdf"):
        pdf += 1
        print(file)
    if file.endswith(".doc"):
        doc += 1
        print(file)
    if file.endswith(".docx"):
        docx += 1
        print(file)
    if file.endswith(".xls"):
        xls += 1
        print(file)
    if file.endswith(".xlsx"):
        xlsx += 1
        print(file)
    if file.endswith(".ppt"):
        ppt += 1
        print(file)
    if file.endswith(".pptx"):
        pptx += 1
        print(file)
print(pdf)
print(doc)
print(docx)
You can use a dictionary:
import os

exts = {}
my_exts = ('pdf', 'doc', 'docx', 'xls', 'xlsx', 'ppt', 'pptx')
for file in os.listdir("C:\\Users\\joey\\Desktop\\school\\ICOMMH"):
    # splitext returns (name, ".ext"); keep only the extension
    ext = os.path.splitext(file)[1]
    if ext and ext[1:] in my_exts:
        exts[ext] = exts.get(ext, 0) + 1
# Sort the (extension, count) pairs by count, highest first
print(sorted(exts.items(), key=lambda x: x[1], reverse=True))
The output will be:
[('.doc', 4), ('.pdf', 2), ('.xlsx', 1)]
You should use collections.defaultdict here:
from collections import defaultdict
import os

d = defaultdict(int)  # missing keys start at 0
for fname in os.listdir("C:\\Users\\joey\\Desktop\\school\\ICOMMH"):
    _, ext = os.path.splitext(fname)
    d[ext] += 1
Then you get a dictionary that looks like this:
{'.pdf': 7,  # or however many...
 '.doc': 3,
 '.docx': 2, ...}
Then you can get the extension that occurs most often:
max(d, key=lambda k: d[k])
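Note that max returns only the single most frequent extension. To list every extension from high to low, as the question asks, sorting the dictionary items by count should work. A minimal sketch, with made-up extensions standing in for the result of the directory loop:

```python
from collections import defaultdict

# Hypothetical data standing in for the extensions found by os.listdir.
d = defaultdict(int)
for ext in [".pdf", ".doc", ".pdf", ".docx", ".pdf", ".doc"]:
    d[ext] += 1

# Sort the (extension, count) pairs by count, largest first.
ranked = sorted(d.items(), key=lambda item: item[1], reverse=True)
for ext, count in ranked:
    print(ext, count)
```

Because sorted is stable, extensions with equal counts keep the order in which they were first seen.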
You can reduce each filename to its extension and then use collections.Counter:
from collections import Counter
import os
import re
print(Counter(re.sub(r'.*\.(\w+)', r'\1', i) for i in os.listdir("C:\\Users\\joey\\Desktop\\school\\ICOMMH")))
Or, as @Adam Smith mentioned, you can use os.path.splitext(i)[1] instead of re.sub:
print(Counter(os.path.splitext(i)[1] for i in os.listdir("C:\\Users\\joey\\Desktop\\school\\ICOMMH")))
And to display the counts from high to low, you can use the most_common method:
count = Counter(re.sub(r'.*\.(\w+)', r'\1', i) for i in os.listdir("C:\\Users\\joey\\Desktop\\school\\ICOMMH"))
for i, j in count.most_common():
    print(i, j)
Using both splitext and Counter, along with its most_common method, since you said you want high to low:
import os, collections
extensions = (os.path.splitext(f)[1] for f in os.listdir())
for ext, cnt in collections.Counter(extensions).most_common():
    print(ext, cnt)
Printing, for example:
.txt 33
.csv 12
.py 10
.png 8
4
.json 2
.class 1
.c 1
.pl 1
.exe 1
.java 1
.sh 1
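The row in the sample output that shows only a count (4) belongs to the empty extension, which splitext returns for names without one, including most subdirectories. If only the document types from the question matter, the counting could be restricted to them. The following is a sketch under assumptions rather than part of the answer: count_extensions and WANTED are hypothetical names, lowercasing the extension is an added choice, and the demo uses a temporary directory with made-up file names.

```python
import collections
import os
import tempfile

# Extensions listed in the question (assumed to be the only ones of interest).
WANTED = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}

def count_extensions(directory, wanted=WANTED):
    """Count regular files in `directory` whose extension is in `wanted`."""
    counts = collections.Counter()
    for name in os.listdir(directory):
        ext = os.path.splitext(name)[1].lower()  # treat ".PDF" like ".pdf"
        if ext in wanted and os.path.isfile(os.path.join(directory, name)):
            counts[ext] += 1
    return counts

# Demo in a throwaway directory so the example does not depend on real files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["a.pdf", "b.PDF", "c.doc", "notes.txt", "README"]:
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "archive.doc"))  # a directory, so it is skipped
    result = count_extensions(tmp)

for ext, cnt in result.most_common():
    print(ext, cnt)
```

Here notes.txt, README, and the archive.doc directory are all left out of the count.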
In my opinion, it will be easier if you use the glob library (link: https://docs.python.org/2/library/glob.html ). It is simple to use: you pass in a filename pattern that includes the extension, and it returns a list of the files that match. A relative pattern such as "*.pdf" is matched against the current working directory, which is not necessarily the directory that contains the .py file.
Example:
import glob
pdf_files_list = glob.glob("*.pdf")  # the star "*" is a wildcard
Then you can find out the number of files using:
len(pdf_files_list)  # len returns the length of a list
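glob.glob also accepts a pattern with a directory part, so the same idea can be applied once per extension to a specific folder and the results sorted from high to low. A minimal sketch, not the answerer's code: count_by_pattern is a hypothetical helper, and the demo builds a temporary directory with made-up file names.

```python
import glob
import os
import tempfile

def count_by_pattern(directory, extensions):
    """Return {extension: number of files in `directory` matching *.extension}."""
    counts = {}
    for ext in extensions:
        # glob.escape keeps any special characters in the directory name literal.
        pattern = os.path.join(glob.escape(directory), "*." + ext)
        counts[ext] = len(glob.glob(pattern))
    return counts

# Demo in a throwaway directory so the example does not depend on real files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["a.pdf", "b.pdf", "c.docx", "d.doc"]:
        open(os.path.join(tmp, name), "w").close()
    counts = count_by_pattern(tmp, ["pdf", "doc", "docx"])

# Sort the (extension, count) pairs by count, largest first.
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
for ext, n in ranked:
    print(ext, n)
```

The pattern "*.doc" has to match the whole file name, so c.docx is counted only under docx.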