How to load data using numpy without fixed column size
How can we load a tab-delimited text file that has no fixed number of columns, so that missing values are skipped entirely, into a list, array, or other container holding one numpy array per row? (A single integer numpy array might not be possible, because numpy requires fixed sizes.)
Is this only possible by reading each line in Python and then converting it from a string to a 1D array with loadtxt?
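import numpy as np
rows = []  # renamed so the built-in list is not shadowed
for lineString in file:
    # np.loadtxt treats a bare string as a file name, so wrap the line in a list
    rows.append(np.loadtxt([lineString], ndmin=1))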
or is there some way to do this with loadtxt directly?
1 answer
Perhaps you could use pandas
If your file looks like this:
1 2 3 4 5 6
1 2
8.0 9 97 54
Then do the following:
import pandas as pd
pd.read_csv('yourfile.txt',sep='\t')
gives (the first line is used as the column header):
1 2 3 4 5 6
0 1 2 NaN NaN NaN NaN
1 8 9 97 54 NaN NaN
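If the first line should be kept as data rather than consumed as the header, passing header=None is one option. The sketch below is only an illustration: the in-memory sample string is a hypothetical stand-in for 'yourfile.txt', and it assumes the first line is the longest one in the file.

```python
import io
import pandas as pd

# Hypothetical sample standing in for 'yourfile.txt' (tab-delimited, ragged rows).
sample = "1\t2\t3\t4\t5\t6\n1\t2\n8.0\t9\t97\t54\n"

# header=None keeps the first line as a data row instead of column names.
# Shorter rows are padded with NaN up to the width of the first line.
df = pd.read_csv(io.StringIO(sample), sep="\t", header=None)
print(df)
```

If a later line could be longer than the first, passing names=range(n) with a large enough n is a common workaround.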
To convert to a numpy array:
import numpy as np
np.array(pd.read_csv('yourfile.txt', sep='\t'))
array([[ 1., 2., nan, nan, nan, nan],
[ 8., 9., 97., 54., nan, nan]])
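The padded array still contains the NaN placeholders. To get closer to what the question asks for (one array per row, with missing values skipped), the NaNs can be stripped row by row. This is a minimal sketch, not the only way to do it: the sample string is a hypothetical stand-in for the file, and it assumes the data contains no genuine NaN values, since those would be dropped as well.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical sample standing in for 'yourfile.txt' (tab-delimited, ragged rows).
sample = "1\t2\t3\t4\t5\t6\n1\t2\n8.0\t9\t97\t54\n"

# Read every line as data; short rows are padded with NaN.
df = pd.read_csv(io.StringIO(sample), sep="\t", header=None)
padded = df.to_numpy(dtype=float)

# Drop the NaN padding from each row, giving a list of 1D arrays of varying length.
rows = [row[~np.isnan(row)] for row in padded]

for r in rows:
    print(r)
```

The result is a plain Python list rather than a 2D array, because numpy cannot store rows of different lengths in a regular numeric array.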