Read ascii file into numpy array
I have an ASCII file and I want to read it into a NumPy array. When I use numpy.genfromtxt, it returns "NaN" for the first number in the file. I then tried reading the file into an array this way:
lines = file('myfile.asc').readlines()
X = []
for line in lines:
    s = str.split(line)
    X.append([float(s[i]) for i in range(len(s))])
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
ValueError: could not convert string to float: 15.514
When I print the first line of the file after splitting it, it looks like this:
>>> s
['\xef\xbb\xbf15.514', '15.433', '15.224', '14.998', '14.792', '15.564', '15.386', '15.293', '15.305', '15.132', '15.073', '15.005', '14.929', '14.823', '14.766', '14.768', '14.789']
How can I read a file like this into a NumPy array without problems and without any assumption about the number of rows and columns?
The file is UTF-8 encoded with a BOM (byte order mark). Use codecs.open with the utf-8-sig encoding to handle it correctly; that codec strips the leading \xef\xbb\xbf bytes.
import codecs
X = []
with codecs.open('myfile.asc', encoding='utf-8-sig') as f:
    for line in f:
        s = line.split()
        X.append([float(s[i]) for i in range(len(s))])
UPDATE: You don't need to use an index at all:
with codecs.open('myfile.asc', encoding='utf-8-sig') as f:
    X = [[float(x) for x in line.split()] for line in f]
By the way, instead of calling the unbound method str.split(line), use line.split() unless you have a specific reason not to.
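On Python 3, the built-in open() accepts an encoding argument, so the same idea works without codecs.open. The following is a minimal sketch under that assumption; read_rows and the sample data are hypothetical, made up only to show the BOM being stripped, and skipping blank lines is an extra choice not taken from the code above.

```python
import codecs
import os
import tempfile


def read_rows(path):
    """Parse a whitespace-delimited text file into a list of float rows."""
    # 'utf-8-sig' decodes UTF-8 and drops a leading BOM if one is present.
    with open(path, encoding='utf-8-sig') as f:
        # Skip blank lines so they do not produce empty rows.
        return [[float(x) for x in line.split()] for line in f if line.strip()]


# Build a small sample file that starts with a BOM (hypothetical data).
fd, sample_path = tempfile.mkstemp(suffix='.asc')
with os.fdopen(fd, 'wb') as f:
    f.write(codecs.BOM_UTF8 + b'15.514 15.433 15.224\n14.998 14.792 15.564\n')

rows = read_rows(sample_path)
os.remove(sample_path)
print(rows)
```

Because utf-8-sig only removes the BOM when it is actually there, the same function should also read files that were saved without one.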
Based on @falsetru's answer, here is a solution that uses NumPy's own file reading capabilities:
import numpy as np
import codecs
with codecs.open('myfile.asc', encoding='utf-8-sig') as f:
    X = np.loadtxt(f)
This opens the file with the correct encoding and passes the open file object to np.loadtxt. NumPy accepts this kind of file handle (it also accepts handles from open()) and works just as in any other case.
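As a rough illustration of the open() variant, the sketch below assumes Python 3 and a reasonably recent NumPy. The sample file and its contents are hypothetical, created only so that the snippet runs on its own.

```python
import codecs
import os
import tempfile

import numpy as np

# Build a small sample file that starts with a BOM (hypothetical data).
fd, sample_path = tempfile.mkstemp(suffix='.asc')
with os.fdopen(fd, 'wb') as f:
    f.write(codecs.BOM_UTF8 + b'15.514 15.433 15.224\n14.998 14.792 15.564\n')

# The built-in open() strips the BOM when given encoding='utf-8-sig',
# so np.loadtxt only ever sees clean numeric text.
with open(sample_path, encoding='utf-8-sig') as f:
    X = np.loadtxt(f)

os.remove(sample_path)
print(X.shape)
```

np.loadtxt infers the number of rows and columns from the file, so no shape has to be given in advance, although it does expect every row to have the same number of values.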