How can I create separate Pandas DataFrames for each CSV file and give them meaningful names?
I have searched thoroughly and cannot find a guide on this subject, so I hope this question is not redundant. I have several CSV files that represent bitmaps. I would like to do some statistical analysis on them, so I am trying to create a Pandas DataFrame for each file so that I can slice 'em, dice 'em, and plot 'em, but I am having trouble looping through the list of files to create a DataFrame with a meaningful name for each file.
Here's what I have so far:
import glob
import os
from pandas import *

# list of .csv files
# I'd like to turn each file into a dataframe
dataList = glob.glob(r'C:\Users\Charlie\Desktop\Qvik\textRasters\*.csv')

# name that I'd like to use for each data frame
nameList = []
for raster in dataList:
    path_list = raster.split(os.sep)
    name = path_list[6][:-4]
    nameList.append(name)

# zip these lists into a dict
dataDct = {}
for k, v in zip(nameList, dataList):
    dataDct[k] = dataDct.get(k, "") + v
dataDct
So now I have a dict where the key is the name I want for each dataframe and the value is the path to pass to read_csv(path):
{'Aspect': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\Aspect.csv',
'Curvature': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\Curvature.csv',
'NormalZ': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\NormalZ.csv',
'Slope': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\Slope.csv',
'SnowDepth': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\SnowDepth.csv',
'Vegetation': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\Vegetation.csv',
'Z': 'C:\\Users\\Charlie\\Desktop\\Qvik\\textRasters\\Z.csv'}
My instinct was to try variations of this:
for k, v in dataDct.iteritems():
    k = read_csv(v)
but that leaves me with a single 'k' dataframe that is filled with data from the last file read in the loop.
I'm probably missing something fundamental here, but I'm starting to spin my wheels, so I thought I'd ask you ... any ideas appreciated!
Cheers.
Are you trying to get all the data frames separately in a dictionary, one data frame per key? If so, this will leave you with the same dict you showed, except that each key will hold the data from its file instead of the path.
dataDct = {}
for k, v in zip(nameList, dataList):
    dataDct[k] = read_csv(v)
So now you can do this, for example:
dataDct['SnowDepth'][['cola','colb']].plot()
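As for why the loop in the question did not work: `k = read_csv(v)` only rebinds the loop variable `k` on each pass, so nothing is ever stored under the name held in `k`. Keeping the frames in a dict avoids that. Here is a minimal, self-contained sketch of the same idea that also derives each name from the file name rather than from a fixed position in the path. The helper name `load_frames`, the temporary folder, and the columns `cola` and `colb` are made up for the demo and are not your real files:

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_frames(folder, pattern="*.csv"):
    """Return {file name without extension: DataFrame} for each matching file."""
    return {
        path.stem: pd.read_csv(path)
        for path in sorted(Path(folder).glob(pattern))
    }


# Demo on throwaway files (hypothetical names and columns, not the real rasters).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Slope", "SnowDepth"):
        Path(tmp, f"{name}.csv").write_text("cola,colb\n1,2\n3,4\n")
    frames = load_frames(tmp)

print(sorted(frames))             # ['Slope', 'SnowDepth']
print(frames["SnowDepth"].shape)  # (2, 2)
```

With your data you would presumably call something like `load_frames(r'C:\Users\Charlie\Desktop\Qvik\textRasters')` and then index the result by name, as in the `dataDct['SnowDepth']` line above.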