How do I handle a small data grid in Python?

I need to process all the MP3 files in a folder: read and change some ID3 tags, and collect details such as file size. The end goal is to create an RSS file so that these MP3s can be served as a regular podcast. I expect up to about 200 files (rows?) and 5 or 6 pieces of data (columns?) for each file. I need to read all the data in, use it to determine the sort order, and then create the RSS/XML file. I'm unclear on the best approach in Python for holding the data while it is processed.

I saw this code idea for a dictionary of dictionaries, but it looks a little awkward?

mydict = {
    'MP3_File_1.mp3':
        {'SIZE': '123456789', 'MODDATE': '20120508', 'TRKNUM': '152'},
    'MP3_File_2.mp3':
        {'SIZE': '45689654', 'MODDATE': '20120515', 'TRKNUM': '003'},
    'MP3_File_3.mp3':
        {'SIZE': '98754651', 'MODDATE': '20130101', 'TRKNUM': '062'},
}

      

Either a real database or PyTables seems like overkill. I'm also considering creating a custom class, but I don't have enough experience in Python (yet). Is there a module or best practice that I am missing?

+3


4 answers


I would create a custom class representing one MP3 file, with one attribute for each field. This way, you can easily write methods to change these fields. Then I would construct one object per file (passing the filename to the constructor and letting the constructor populate the fields) and put all the objects in a list. The class would also define the comparison methods needed to sort the objects. Finally, I would write a function to generate the XML file from this list.

This is not the only way to do it, but it is how I would do it.



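import os


class Mp3file(object):
    def __init__(self, filename):
        # read the file and fill in the fields (left as placeholders here)
        self.name = filename
        self.size = ...
        self.moddate = ...
        self.track_num = ...
        ...

    def to_xml(self):
        # return the XML for this one file
        return ...

    # sorted() only needs __lt__; the field compared here is just an example,
    # use whichever field defines your sort order
    def __lt__(self, other):
        return self.track_num < other.track_num

    def __eq__(self, other):
        return self.track_num == other.track_num
    ...

mp3list = []
# directory is the path of the folder holding the MP3 files
for filename in os.listdir(directory):
    mp3list.append(Mp3file(os.path.join(directory, filename)))

def mp3list_to_xml(mylist):
    # write preamble
    for mf in sorted(mylist):
        x = mf.to_xml()
        # Add x to xml
    # write footer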
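
The XML step above could be filled in with the standard library's xml.etree.ElementTree. The following is only a sketch under some assumptions: the Episode class is a hypothetical stand-in for Mp3file with its fields already populated, base_url is a made-up parameter for where the files are hosted, and the choice of RSS 2.0 elements (title, pubDate, enclosure) is mine rather than the answer's.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


class Episode:
    """Hypothetical stand-in for Mp3file, with the fields already filled in."""

    def __init__(self, name, size, mtime, track_num):
        self.name = name            # file name, e.g. "ep1.mp3"
        self.size = size            # size in bytes
        self.mtime = mtime          # modification time, seconds since the epoch
        self.track_num = track_num  # track number as an int

    def to_xml(self, base_url):
        """Build one RSS <item> element for this file."""
        item = ET.Element("item")
        ET.SubElement(item, "title").text = self.name
        # RSS dates use the RFC 822 style that formatdate() produces
        ET.SubElement(item, "pubDate").text = formatdate(self.mtime, usegmt=True)
        ET.SubElement(item, "enclosure", url=base_url + self.name,
                      length=str(self.size), type="audio/mpeg")
        return item


def episodes_to_rss(episodes, title, base_url):
    """Return the whole feed as a string, items ordered by track number."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = title
    for ep in sorted(episodes, key=lambda e: e.track_num):
        channel.append(ep.to_xml(base_url))
    return ET.tostring(rss, encoding="unicode")
```

Building the tree with ElementTree instead of concatenating strings means characters such as & in file names are escaped automatically.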

      

+2




A list of dictionaries makes more sense to me.

mp3s = [
      {'NAME': 'lalala.mp3', 'SIZE': '123456789','MODDATE': '20120508', 'TRKNUM': '152'}, 
      {'NAME': 'lelele.mp3', 'SIZE': '45689654', 'MODDATE': '20120515', 'TRKNUM': '003'}, 
      {'NAME': 'lululu.mp3', 'SIZE': '98754651', 'MODDATE': '20130101', 'TRKNUM': '062'}]

      



If you want to keep the sorting simple:

sor = sorted(mp3s, key=lambda x: x['NAME'])
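
One thing to watch with this layout is that every value is a string, so numeric fields need converting before they are compared. A small sketch, reusing the sample data from this answer (the variable names by_date, by_size and newest_first are only for illustration):

```python
from operator import itemgetter

mp3s = [
    {'NAME': 'lalala.mp3', 'SIZE': '123456789', 'MODDATE': '20120508', 'TRKNUM': '152'},
    {'NAME': 'lelele.mp3', 'SIZE': '45689654', 'MODDATE': '20120515', 'TRKNUM': '003'},
    {'NAME': 'lululu.mp3', 'SIZE': '98754651', 'MODDATE': '20130101', 'TRKNUM': '062'},
]

# YYYYMMDD strings already sort chronologically as plain text.
by_date = sorted(mp3s, key=itemgetter('MODDATE'))

# Sizes have different lengths, so compare them as integers, not strings.
by_size = sorted(mp3s, key=lambda row: int(row['SIZE']))

# reverse=True gives newest first, which is the usual order for a podcast feed.
newest_first = sorted(mp3s, key=itemgetter('MODDATE'), reverse=True)

print([row['NAME'] for row in by_size])
# ['lelele.mp3', 'lululu.mp3', 'lalala.mp3']
```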

      

+2




Why not just use sqlite? It comes bundled with Python for free.

You get sorting and searching built in, and each database is a single file, so there are no additional processes to manage.

Also, chances are that as your code evolves you will want to add more attributes, and then a dict of dicts and the like become unmanageable.

It also helps to be able to open the database, look at it, and think:

Yes, my data looks good, so the next part should be simple.
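
As a rough illustration of this approach, here is a sketch using the standard sqlite3 module. The table name, column names and the in-memory database are assumptions made for the example; the rows reuse the sample values from the question.

```python
import sqlite3

# ":memory:" keeps everything in RAM; pass a file name to keep the data on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mp3 (name TEXT PRIMARY KEY, size INTEGER, moddate TEXT, trknum INTEGER)"
)

rows = [
    ("MP3_File_1.mp3", 123456789, "20120508", 152),
    ("MP3_File_2.mp3", 45689654, "20120515", 3),
    ("MP3_File_3.mp3", 98754651, "20130101", 62),
]
# Placeholders (?) let sqlite3 handle quoting of the values.
conn.executemany("INSERT INTO mp3 VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Sorting is done by the database through ORDER BY.
by_track = [name for (name,) in conn.execute("SELECT name FROM mp3 ORDER BY trknum")]
print(by_track)
# ['MP3_File_2.mp3', 'MP3_File_3.mp3', 'MP3_File_1.mp3']
```

Adding another attribute later is then a matter of adding a column, rather than touching every dictionary.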

+1




A list of namedtuples, perhaps? Tuples should be one of the least memory-hungry types in Python, AFAIK.
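
A minimal sketch of that idea, assuming a record type called Mp3 with field names chosen for this example and the sample values from the question:

```python
from collections import namedtuple
from operator import attrgetter

# One lightweight, immutable record per file; fields are accessed by name.
Mp3 = namedtuple("Mp3", ["name", "size", "moddate", "trknum"])

mp3s = [
    Mp3("MP3_File_1.mp3", 123456789, "20120508", 152),
    Mp3("MP3_File_2.mp3", 45689654, "20120515", 3),
    Mp3("MP3_File_3.mp3", 98754651, "20130101", 62),
]

by_track = sorted(mp3s, key=attrgetter("trknum"))
print(by_track[0].name)
# MP3_File_2.mp3

# namedtuples cannot be modified in place; _replace() returns an updated copy.
renumbered = mp3s[0]._replace(trknum=1)
```

Because the records are immutable, changing a tag value means building a new tuple, which may or may not suit the read-then-modify workflow in the question.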

0

