Interpret PNG Pixel Data
Looking at the PNG spec, it appears that the PNG pixel data starts with IDAT and ends with IEND (slightly clearer explanation here). There are values in the middle that don't make sense to me.
How can I get useful RGB values from this, without using any libraries (i.e. from a raw binary)?
As an example, I made a 2x2px image with four black rgb(0,0,0)
pixels in Photoshop:
Here's the resulting data (as raw binary, hexadecimal values, and human-readable ASCII):
BINARY HEX ASCII
01001001 49 'I'
01000100 44 'D'
01000001 41 'A'
01010100 54 'T'
01111000 78 'x'
11011010 DA '\xda'
01100010 62 'b'
01100000 60 '`'
01000000 40 '@'
00000110 06 '\x06'
00000000 00 '\x00'
00000000 00 '\x00'
00000000 00 '\x00'
00000000 00 '\x00'
11111111 FF '\xff'
11111111 FF '\xff'
00000011 03 '\x03'
00000000 00 '\x00'
00000000 00 '\x00'
00001110 0E '\x0e'
00000000 00 '\x00'
00000001 01 '\x01'
10000011 83 '\x83'
11010100 D4 '\xd4'
11101100 EC '\xec'
10001110 8E '\x8e'
00000000 00 '\x00'
00000000 00 '\x00'
00000000 00 '\x00'
00000000 00 '\x00'
01001001 49 'I'
01000101 45 'E'
01001110 4E 'N'
01000100 44 'D'
You missed a pretty important detail in both specs:
Official:
The IDAT chunk contains the actual image data, which is the output stream of the compression algorithm.
[...]
Deflate-compressed datastreams within PNG are stored in the "zlib" format.
Wikipedia:
IDAT contains the image, which may be split across multiple IDAT chunks. This splitting increases the file size slightly, but allows the PNG to be generated in a streaming manner. The IDAT chunk contains the actual image data, which is the output stream of the compression algorithm.
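Both statements indicate that the raw image data is compressed. Looking at your data, the first 2 bytes, 78 DA, contain compression flags as specified in RFC 1950. The rest of the data is compressed, apart from the trailing checksums: the zlib stream ends with a 4-byte Adler-32 checksum (00 0E 00 01), the next 4 bytes (83 D4 EC 8E) are the CRC of the IDAT chunk, and the 00 00 00 00 right before IEND is the length field of the IEND chunk, not image data.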
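To make those two bytes less mysterious, here is a minimal Python sketch of how the RFC 1950 header fields can be pulled apart. The function name `parse_zlib_header` and the dictionary keys are made up for illustration; only the bit layout comes from the RFC.

```python
def parse_zlib_header(cmf, flg):
    """Decode the two-byte zlib header (CMF, FLG) described in RFC 1950."""
    # The two bytes, read as a big-endian 16-bit number, must be a multiple of 31.
    if (cmf * 256 + flg) % 31 != 0:
        raise ValueError("FCHECK failed: not a valid zlib header")
    return {
        "method": cmf & 0x0F,                    # CM: 8 means Deflate
        "window_size": 1 << ((cmf >> 4) + 8),    # CINFO: log2(window size) - 8
        "preset_dict": bool(flg & 0x20),         # FDICT: preset dictionary follows
        "level": flg >> 6,                       # FLEVEL: 0..3, informational only
    }

header = parse_zlib_header(0x78, 0xDA)
print(header)
# {'method': 8, 'window_size': 32768, 'preset_dict': False, 'level': 3}
```

So 78 DA says: Deflate, 32 KiB window, no preset dictionary, and the encoder claims to have used its strongest compression setting.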
Decompressing this with a generic zlib-compatible routine yields 14 bytes of output:
00 00 00 00 00 00 00
00 00 00 00 00 00 00
Each row starts with a PNG filter-type byte (0 for both rows), followed by two RGB triplets (0,0,0), one for each of the two pixels in that row of your image.
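If you just want to check this result before writing your own inflater, a quick sketch with Python's built-in `zlib` module reproduces it. This is only a verification aid, since it does use a library. The width, height and pixel format are assumptions here (2x2, 8-bit RGB, not interlaced); in a real decoder they come from the IHDR chunk.

```python
import zlib

# The 18 bytes between "IDAT" and the chunk CRC (83 D4 EC 8E) in the dump.
idat = bytes.fromhex("78 da 62 60 40 06 00 00 00 00 ff ff 03 00 00 0e 00 01")
raw = zlib.decompress(idat)

# Assumed image layout; normally read from IHDR.
width, height, bytes_per_pixel = 2, 2, 3
stride = 1 + width * bytes_per_pixel        # one filter byte + pixel bytes per row

rows = [raw[y * stride:(y + 1) * stride] for y in range(height)]
for row in rows:
    filter_type, pixels = row[0], row[1:]
    rgb = [tuple(pixels[i:i + 3]) for i in range(0, len(pixels), 3)]
    print(filter_type, rgb)
# 0 [(0, 0, 0), (0, 0, 0)]
# 0 [(0, 0, 0), (0, 0, 0)]
```

Because both rows use filter type 0 (None), the bytes after the filter byte already are the final RGB values.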
Without using any libraries, you need 3 separate routines to:
- read and parse the PNG chunk structure; this provides the compressed IDAT data as well as necessary information such as width, height and color depth;
- decompress the zlib part into raw binary data;
- parse the decompressed data, handle Adam7 interlacing if necessary, and apply the row filters.
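The first of these routines could be sketched roughly as follows in Python. The helper names (`iter_chunks`, `read_png_structure`, `make_chunk`) are hypothetical, and `zlib.crc32` is borrowed purely for brevity; a truly library-free version would implement the CRC as described in the PNG specification. The demo file is built in memory, since the full original file is not shown in the question.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def iter_chunks(data):
    """Yield (type, body) for each chunk: length, type, body, CRC."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        # The CRC covers the chunk type and body, but not the length field.
        if zlib.crc32(chunk_type + body) != crc:
            raise ValueError("bad CRC in chunk %r" % chunk_type)
        yield chunk_type, body
        pos += 12 + length
        if chunk_type == b"IEND":
            break

def read_png_structure(data):
    """Return the IHDR fields and the concatenated IDAT payload."""
    header, idat = None, bytearray()
    for chunk_type, body in iter_chunks(data):
        if chunk_type == b"IHDR":
            fields = struct.unpack(">IIBBBBB", body)   # 13 bytes in total
            names = ("width", "height", "bit_depth", "color_type",
                     "compression", "filter", "interlace")
            header = dict(zip(names, fields))
        elif chunk_type == b"IDAT":
            idat += body            # several IDAT chunks form one zlib stream
    return header, bytes(idat)

def make_chunk(chunk_type, body):
    """Build a chunk for the demo file below."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)

# Demo: a 2x2, 8-bit RGB (color type 2), non-interlaced all-black image.
demo = (PNG_SIGNATURE
        + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 2, 2, 8, 2, 0, 0, 0))
        + make_chunk(b"IDAT", zlib.compress(bytes(14)))
        + make_chunk(b"IEND", b""))
header, idat = read_png_structure(demo)
print(header["width"], header["height"], header["color_type"])
```

Note that the IDAT payloads are concatenated before decompression; the zlib stream may be split at arbitrary points across chunks.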
Only after completing these three steps will you have access to the raw image data. Of these, you seem to understand step (1) well. Step (2) is much harder to do yourself; I personally cheated and used miniz in my own PNG handling programs. Step (3) is, again, just a matter of persistence. All the necessary bits of information can be found on the internet, but it takes a while to get things in the right order. (I recently ran into a bug in my handling of the rarely used Paeth row filter; it had gone unnoticed because that filter is rarely used in real-world images.)
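For step (3), a rough Python sketch of row unfiltering, including the Paeth predictor, might look like this. It assumes a bit depth of 8 and a non-interlaced image, and the names `paeth` and `unfilter` are just for illustration; sub-byte pixel formats and Adam7 passes need extra handling that is not shown.

```python
def paeth(a, b, c):
    """Paeth predictor: a = left, b = above, c = upper left."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    # Tie-breaking order matters: left, then above, then upper left.
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c

def unfilter(raw, width, height, bpp):
    """Undo PNG row filters; bpp is bytes per pixel (3 for 8-bit RGB)."""
    stride = width * bpp
    prev = bytearray(stride)          # the row above the first row is all zeros
    rows, pos = [], 0
    for _ in range(height):
        filter_type = raw[pos]
        cur = bytearray(raw[pos + 1:pos + 1 + stride])
        pos += 1 + stride
        for i in range(stride):
            a = cur[i - bpp] if i >= bpp else 0     # already reconstructed
            b = prev[i]
            c = prev[i - bpp] if i >= bpp else 0
            if filter_type == 0:                    # None
                pred = 0
            elif filter_type == 1:                  # Sub
                pred = a
            elif filter_type == 2:                  # Up
                pred = b
            elif filter_type == 3:                  # Average
                pred = (a + b) // 2
            elif filter_type == 4:                  # Paeth
                pred = paeth(a, b, c)
            else:
                raise ValueError("unknown filter type %d" % filter_type)
            cur[i] = (cur[i] + pred) & 0xFF
        rows.append(bytes(cur))
        prev = cur
    return rows

# The question's image: two rows, both with filter type 0.
print(unfilter(bytes(14), 2, 2, 3))

# A made-up 2x2 RGB example using Sub on row 1 and Paeth on row 2.
sample = bytes([1, 10, 20, 30, 5, 5, 5,
                4, 1, 1, 1, 2, 2, 2])
print([list(r) for r in unfilter(sample, 2, 2, 3)])
# [[10, 20, 30, 15, 25, 35], [11, 21, 31, 17, 27, 37]]
```

The predictor works on bytes, not pixels: "left" means the corresponding byte one whole pixel to the left, which is an easy thing to get wrong.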
See "Generating Quick PNG Encoder Errors" for a similar discussion and "Attempting to Understand zlib / deflate in PNG Files" for an in-depth look at the Deflate scheme.