How can I create separate Pandas DataFrames for each CSV file and give them meaningful names?

I have searched thoroughly and cannot find the guidance I am looking for on this subject, so I hope this question is not redundant. I have several CSV files that represent rasters. I would like to do some statistical analysis on them, so I am trying to create a pandas DataFrame for each file so that I can slice 'em, dice 'em, and plot 'em, but I am having trouble looping through the list of files to create a DataFrame with a meaningful name for each file.

Here's what I have so far:

import glob
import os
from pandas import read_csv

#list of .csv files
#I'd like to turn each file into a dataframe
dataList = glob.glob(r'C:\Users\Charlie\Desktop\Qvik\textRasters\*.csv')

#name that I'd like to use for each data frame
nameList = []
for raster in dataList:
    #file name without the directory or the ".csv" extension, e.g. "Aspect"
    name = os.path.splitext(os.path.basename(raster))[0]
    nameList.append(name)

#zip these lists into a dict
dataDct = dict(zip(nameList, dataList))
dataDct

      

So now I have a dict where the key is the name I want for each DataFrame and the value is the path to pass to read_csv(path):

{'Aspect': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\Aspect.csv',
 'Curvature': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\Curvature.csv',
 'NormalZ': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\NormalZ.csv',
 'Slope': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\Slope.csv',
 'SnowDepth': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\SnowDepth.csv',
 'Vegetation': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\Vegetation.csv',
 'Z': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\Z.csv'}

      

My instinct was to try variations of this:

for k, v in dataDct.items():
    k = read_csv(v)

      

but that leaves me with a single 'k' dataframe that is filled with data from the last file read in the loop.

I'm probably missing something fundamental here, but I'm starting to spin my wheels, so I thought I'd ask you ... any ideas appreciated!

Greetings.

2 answers


Are you trying to get each DataFrame stored separately in a dictionary, one DataFrame per key? If so, the following gives you the same dict you described, except that each key holds the data read from its file instead of the file path.

dataDct = {}
for k, v in zip(nameList,dataList):
    dataDct[k] = read_csv(v)

      



So now you can do this, for example:

dataDct['SnowDepth'][['cola','colb']].plot()
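
A minimal end-to-end sketch of the same dict approach, assuming Python 3 and an installed pandas. The helper name load_csv_dir and the two tiny demo files are hypothetical stand-ins for the real raster CSVs; pathlib's stem is used here in place of splitting the path by hand.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csv_dir(folder):
    """Return {file name without extension: DataFrame} for each .csv in folder."""
    return {path.stem: pd.read_csv(path) for path in sorted(Path(folder).glob("*.csv"))}


# Demo with two made-up files standing in for the real rasters.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "Slope.csv").write_text("cola,colb\n1,2\n3,4\n")
    Path(tmp, "SnowDepth.csv").write_text("cola,colb\n5,6\n7,8\n")
    frames = load_csv_dir(tmp)

print(sorted(frames))                     # the keys are the file stems
print(frames["SnowDepth"]["cola"].sum())  # each value is a full DataFrame
```

Keeping the frames in one dict means the names stay data rather than variables, so you can loop over all rasters or pick one by key.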

      


It's not clear why you are overwriting your object here: each pass through the loop rebinds k, so only the last DataFrame survives. I think you want either a list or a dict of DataFrames:

df_list = []
for k, v in dataDct.items():
    df_list.append(read_csv(v))

      



or

df_dict = {}
for k, v in dataDct.items():
    df_dict[k] = read_csv(v)
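
To see why the loop in the question ends up with a single object, here is a small sketch in plain Python. It needs no pandas; str.upper stands in for read_csv, and the two-entry paths dict is made up for illustration.

```python
paths = {"Aspect": "Aspect.csv", "Slope": "Slope.csv"}

# Assigning to the loop variable only rebinds the local name k.
# It does not create a variable called Aspect or Slope, and the dict is unchanged.
for k, v in paths.items():
    k = v.upper()

print(k)      # only the value from the last iteration is left
print(paths)  # still maps names to the original paths

# Storing each result under its key keeps all of them.
loaded = {}
for name, path in paths.items():
    loaded[name] = path.upper()

print(loaded)
```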

      
