How to load data using numpy without fixed column size

How can I load a tab-delimited text file whose rows do not have a fixed number of columns, so that missing values are skipped entirely? The result could be a list, an array, or any other container holding one numpy array per row. (A single integer numpy array is probably not possible, because numpy requires fixed sizes.)

Is this only possible by reading each line in Python and then converting it from a string to a 1D array?

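import numpy as np

rows = []  # avoid shadowing the built-in name `list`
for line in file:  # `file` is an already opened text-file handle
    # loadtxt expects a file name or file object, so parse the string directly
    rows.append(np.array(line.split(), dtype=float))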

      

Or is there some way to do this directly with loadtxt?



1 answer


Perhaps you could use pandas.

If your file looks like this:

1   2   3   4   5   6
1   2
8.0 9   97  54

      

Then do the following:

import pandas as pd
pd.read_csv('yourfile.txt',sep='\t')

      



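gives the following (the first line of the file is consumed as the column header; pass header=None to keep it as data):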

   1  2   3   4   5   6
0  1  2 NaN NaN NaN NaN
1  8  9  97  54 NaN NaN

      

To convert to numpy array:

import numpy as np
np.array(pd.read_csv('yourfile.txt', sep='\t'))


array([[  1.,   2.,  nan,  nan,  nan,  nan],
       [  8.,   9.,  97.,  54.,  nan,  nan]])
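
If what you actually want is one variable-length array per row, as the question asks, the NaN padding can be stripped again afterwards. Below is a minimal sketch; it assumes the padded array has the values shown above, and the names `padded` and `rows` are only illustrative. Note that it drops every NaN in a row, so a value that is genuinely missing in the middle of a row would disappear as well.

```python
import numpy as np

# Padded array with the same values as the output shown above
# (typed in by hand here so that the sketch is self-contained).
padded = np.array([[1., 2., np.nan, np.nan, np.nan, np.nan],
                   [8., 9., 97., 54., np.nan, np.nan]])

# Keep only the non-NaN entries of each row: one 1D array per row,
# each with its own length.
rows = [row[~np.isnan(row)] for row in padded]

for row in rows:
    print(row)
```

The result is a plain Python list of 1D arrays rather than a single 2D array, because numpy cannot store rows of different lengths in one regular array.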

      
