Python - Parse file into magic number / length based outputs
I'm a complete newbie to coding. I started just 3 weeks ago and only have a Codecademy Python course under my belt, so simple explanations would really be appreciated!
I am trying to write a python script that reads a file as a HEX string and then parses the file into separate output files based on looking for a "magic number" inside the HEX string.
For example, if my HEX string was "0011AABB00BBAACC00223344", I would need to split it into new output files based on the magic number "00" and tell Python that each output should be 8 characters long. The result for the example above should be 3 files containing the HEX values:
"0011AABB" "00BBAACC" "00223344"
Here is what I have so far (assuming in this case that the line above is contained in the file "hextests"):
import os
import binascii

filename = "hextests"

# read file as a binary string
with open(filename, 'rb') as f:
    content = f.read()

# convert binary string to hex string
hexString = binascii.hexlify(content)

# define magic number as "00"
magic_N = "00"

# attempting to create a new substring called newFile that is equal to each instance magic_N repeats throughout the file for a length of 8 characters
for chars in hexString:
    newFile = ""
    if chars == magic_N:
        newFile += chars.len(9)

# attempting to create a series of new output files for each instance of newFile - while incrementing the output file name
if newFile != "":
    i = 0
    while os.path.exists("output_file%s.xyz" % i):
        i += 1
    fh = with open("output_file%s.xyz" % i, "wb"):
        newFile
I'm sure I have a lot of mistakes to deal with here, and it is probably more difficult than I think, but my main question is about the correct way of defining chars and newFile. I'm pretty sure chars only ever holds a single character of the string, so the comparison never matches because magic_N is more than 1 character long. Am I right that this is part of the problem?
Also, if you understand the main purpose of this script, any other thoughts on things I should be doing differently?
Many thanks for the help!
You can try something like this:
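filename = "hextests"

# read file as a binary string
with open(filename, "rb") as f:
    content = f.read()

# You don't need this part if you want
# to parse the hex string as it is given in the file
# convert binary string to hex string (requires "import binascii")
# hexString = binascii.hexlify(content)

# Remove the newline at the end of the string
hexString = content.strip()

# define magic number as "00" (a bytes literal, because the file is read in binary mode)
magic_N = b"00"

i = 0  # position in hexString where the next search starts
j = 0  # counter used to number the output files
while i < len(hexString) - 1:
    # find the next occurrence of the magic number at or after position i
    index = hexString.find(magic_N, i)
    if index == -1:
        break  # no more magic numbers in the string
    # This is the part which was incorrect in your code.
    with open("output_file_%s.xyz" % j, "wb") as output:
        output.write(hexString[index:index + 8])
    i = index + 8
    j += 1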
Note that you need to explicitly call the write method to write data to the output file.
This assumes that the chunks of data are exactly 8 hexadecimal characters long and always start with 00. It is not a flexible solution, but it should give you an idea of how to approach the problem.
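If you want something a little more flexible, one possible sketch is to put the splitting logic in a function that takes the magic number and the chunk length as parameters. The names split_on_magic and write_chunks are made up for this example, and it assumes the hex string is an ordinary text string. It also touches on the question about chars: looping over a string with for only ever yields one character at a time, so it can never equal a two-character magic number, whereas find and slicing work on whole substrings.

```python
def split_on_magic(hex_string, magic="00", chunk_len=8):
    """Return chunks of up to chunk_len characters, each starting at a magic number."""
    chunks = []
    pos = 0
    while True:
        # find() looks for the whole multi-character magic string,
        # unlike "for ch in hex_string", which yields single characters.
        index = hex_string.find(magic, pos)
        if index == -1:
            break  # no further magic number found
        chunks.append(hex_string[index:index + chunk_len])
        # Continue after this chunk, so a "00" inside the chunk's own
        # data is not mistaken for the start of a new chunk.
        pos = index + chunk_len
    return chunks


def write_chunks(chunks, pattern="output_file_%s.xyz"):
    """Write each chunk to its own numbered file (hypothetical naming pattern)."""
    for number, chunk in enumerate(chunks):
        with open(pattern % number, "w") as output:
            output.write(chunk)


chunks = split_on_magic("0011AABB00BBAACC00223344")
print(chunks)  # ['0011AABB', '00BBAACC', '00223344']
# write_chunks(chunks)  # uncomment to create output_file_0.xyz, output_file_1.xyz, ...
```

This is only a sketch: it does not check that a match is aligned on a byte boundary (an even position in the hex string), and the last chunk will be shorter than chunk_len if the string ends early.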