Python - Parse file into magic number / length based outputs

Question

I'm a complete newbie to coding. I started just 3 weeks ago and only have a Codecademy Python course under my belt, so simple explanations would really be appreciated!

I am trying to write a Python script that reads a file as a hex string and then splits it into separate output files by looking for a "magic number" inside the hex string.

For example, if my hex string were "0011AABB00BBAACC00223344", I might need to split it into new output files based on the magic number "00" and tell Python that each output should be 8 characters long. The result for the example above should be 3 files that contain the hex values:

"0011AABB" "00BBAACC" "00223344"

Here is what I have so far (assuming in this case that the line above is contained in the file "hextests"):

import os
import binascii

filename = "hextests"

# read file as a binary string
with open(filename, 'rb') as f:
    content = f.read()

# convert binary string to hex string
hexString = binascii.hexlify(content)

# define magic number as "00"
magic_N = "00"

# attempting to create a new substring called newFile that is equal to each instance magic_N repeats throughout the file for a length of 8 characters
for chars in hexString:
    newFile = ""
    if chars == magic_N:
        newFile += chars.len(9)

# attempting to create a series of new output files for each instance of newFile - while incrementing the output file name
    if newFile != "":
        i = 0
        while os.path.exists("output_file%s.xyz" % i):
          i += 1
        fh = with open("output_file%s.xyz" % i, "wb"):
            newFile

I'm sure I have a lot of mistakes to deal with here, and it is probably more difficult than I think, but my main question is about the correct way of defining chars and newFile. I'm pretty sure Python treats chars as a single character of the string, so the comparison doesn't work because magic_N is more than 1 character long. Am I right that this is part of the problem?

Also, if you understand the main purpose of this script, any other thoughts on things I should be doing differently?

Many thanks for the help!

python string

occvtech May 25 '17 at 4:05

1 answer

Nurjan · Accepted Answer · 2017-05-25T04:42:43+0000

You can try something like this:

filename = "hextests"

# read the file as a bytes object
with open(filename, "rb") as f:
    content = f.read()

# You don't need this part if you want
# to parse the hex string as it is given in the file
# convert binary string to hex string
# hexString = binascii.hexlify(content)

# Remove the newline at the end of the string
hexString = content.strip()


# define magic number as "00"
# (a bytes literal, because the file was opened in binary mode)
magic_N = b"00"

i = 0
j = 0
while i < len(hexString) - 1:
    # position of the next magic number at or after i (not used below,
    # because every chunk is assumed to start with the magic number)
    index = hexString.find(magic_N, i)

    # This is the part which was incorrect in your code.
    with open("output_file_%s.xyz" % j, "wb") as output:
        output.write(hexString[i:i+8])

    # move on to the next 8-character chunk and the next file number
    i += 8
    j += 1
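On the question about chars: looping over a string yields one character at a time, so a comparison with a two-character magic number can never be true. Here is a minimal sketch of the difference, using the sample string from the question. The variable names are made up for this illustration.

```python
hex_string = "0011AABB00BBAACC00223344"
magic_n = "00"

# Looping over a string gives single characters: '0', '0', '1', '1', ...
single_chars = [c for c in hex_string]

# A one-character item can never equal the two-character magic number,
# so this list stays empty.
matches = [c for c in single_chars if c == magic_n]

# Slicing can take more than one character at a time.
starts_with_magic = hex_string[0:2] == magic_n

# startswith() expresses the same check more directly.
also_starts_with_magic = hex_string.startswith(magic_n)
```

This is why the loop above works with slices such as hexString[i:i+8] instead of individual characters.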

Note that you need to explicitly call the write method to write data to the output file.
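As a rough sketch of that point, the snippet below writes one chunk to a numbered file. The helper name write_chunk and its directory argument are assumptions made for this example; they do not appear in the original code.

```python
import os
import tempfile


def write_chunk(data, number, directory):
    """Write one chunk of bytes to output_file_<number>.xyz and return its path."""
    path = os.path.join(directory, "output_file_%s.xyz" % number)
    # "with" is a statement, so its result cannot be assigned to a variable;
    # the file object is bound by "as output" instead.
    with open(path, "wb") as output:
        # Naming the data on a line by itself does nothing; write() must be called.
        output.write(data)
    return path


# Example: write a single chunk into a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = write_chunk(b"0011AABB", 0, demo_dir)
```

The file is closed automatically when the with block ends, so no explicit close() call is needed.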

This assumes that the chunks of data have exactly 8 hexadecimal characters and always start with 00. This is not a flexible solution, but it does give you an idea of how to fix this problem.
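If you want something a little more flexible, one possible approach is to search for the magic number with find and cut a chunk from every position where it occurs. The function name split_on_magic and its parameters are hypothetical, and this is only a sketch: it does not check that a match falls on an even position (a whole byte boundary), and the last chunk may be shorter than chunk_len if the data runs out.

```python
def split_on_magic(data, magic=b"00", chunk_len=8):
    """Return chunks of up to chunk_len bytes, each starting at an occurrence of magic."""
    chunks = []
    i = 0
    while True:
        # Look for the next magic number at or after position i.
        index = data.find(magic, i)
        if index == -1:
            # No more magic numbers in the rest of the data.
            break
        chunks.append(data[index:index + chunk_len])
        # Continue searching after the chunk that was just taken.
        i = index + chunk_len
    return chunks


# The sample string from the question splits into three chunks.
chunks = split_on_magic(b"0011AABB00BBAACC00223344")

# Data in front of the first magic number is skipped.
chunks_with_junk = split_on_magic(b"FF0011AABB00BBAACC")
```

Each item in the returned list can then be written to its own output file, numbered with enumerate.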
