Interpret PNG Pixel Data

Looking at the PNG spec, it appears that the PNG pixel data chunk starts with IDAT and ends with IEND (slightly clearer explanation here). There are values in the middle that don't make sense to me.

How can I get useful RGB values from this, without using any libraries (i.e. from the raw binary)?

As an example, I made a 2x2px image with four black rgb(0,0,0) pixels in Photoshop:

[Image: just four black pixels ...]

Here's the resulting data (as raw binary, hexadecimal values, and human-readable ASCII):

BINARY      HEX ASCII
01001001    49  'I'
01000100    44  'D'
01000001    41  'A'
01010100    54  'T'
01111000    78  'x'
11011010    DA  '\xda'
01100010    62  'b'
01100000    60  '`'
01000000    40  '@'
00000110    06  '\x06'
00000000    00  '\x00'
00000000    00  '\x00'
00000000    00  '\x00'
00000000    00  '\x00'
11111111    FF  '\xff'
11111111    FF  '\xff'
00000011    03  '\x03'
00000000    00  '\x00'
00000000    00  '\x00'
00001110    0E  '\x0e'
00000000    00  '\x00'
00000001    01  '\x01'
10000011    83  '\x83'
11010100    D4  '\xd4'
11101100    EC  '\xec'
10001110    8E  '\x8e'
00000000    00  '\x00'
00000000    00  '\x00'
00000000    00  '\x00'
00000000    00  '\x00'
01001001    49  'I'
01000101    45  'E'
01001110    4E  'N'
01000100    44  'D'

      



1 answer


You missed a pretty important detail in both specs:

Official:

... The IDAT chunk contains the actual image data, which is the output stream of the compression algorithm.
[...]
Deflate-compressed datastreams within PNG are stored in the "zlib" format.

Wikipedia:

IDAT contains the image, which may be split among multiple IDAT chunks. Such splitting increases the file size slightly, but makes it possible to generate a PNG in a streaming manner. The IDAT chunk contains the actual image data, which is the output stream of the compression algorithm.

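Both statements indicate that the raw image data is compressed. Looking at your data, the first 2 bytes

78 DA

contain the compression flags as specified in RFC 1950. The data that follows is the compressed stream; the last bytes of your dump before IEND are the zlib Adler-32 checksum (00 0E 00 01), the IDAT chunk's CRC (83 D4 EC 8E), and the zero length field of the IEND chunk.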
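Those two header bytes can be decoded with a few bit operations. The following is a minimal sketch based on the field layout in RFC 1950; the function name parse_zlib_header and the dictionary keys are made up for illustration.

```python
def parse_zlib_header(cmf: int, flg: int) -> dict:
    """Decode the two-byte zlib header (CMF and FLG) described in RFC 1950."""
    # The 16-bit value CMF*256 + FLG must be a multiple of 31 (FCHECK).
    if (cmf * 256 + flg) % 31 != 0:
        raise ValueError("invalid zlib header: FCHECK failed")
    return {
        "method": cmf & 0x0F,                  # CM: 8 means Deflate
        "window_size": 1 << ((cmf >> 4) + 8),  # CINFO: log2(window size) - 8
        "preset_dict": bool(flg & 0x20),       # FDICT: preset dictionary follows
        "level": flg >> 6,                     # FLEVEL: 0 (fastest) .. 3 (maximum)
    }

header = parse_zlib_header(0x78, 0xDA)
print(header)
```

For 78 DA this reports Deflate with a 32 KiB window, no preset dictionary, and the "maximum compression" level hint.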



Decompressing this with a generic zlib-compatible routine yields 14 bytes of output:

00 00 00 00 00 00 00
00 00 00 00 00 00 00

where the first byte of each row is the PNG row filter type (0 for both rows), followed by 2 RGB triplets (0,0,0) for each of the two rows of your image.
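To reproduce this, here is a rough sketch that uses Python's built-in zlib module as the "generic zlib-compatible routine" (so it does not yet satisfy the no-libraries requirement). The 18 payload bytes are taken from the dump in the question, leaving out the trailing chunk CRC; the 2x2, 8-bit RGB, non-interlaced layout is an assumption based on the description of the image, since the IHDR chunk is not shown.

```python
import zlib

# Bytes between the "IDAT" tag and the 4-byte chunk CRC (83 D4 EC 8E).
idat_payload = bytes.fromhex("78da6260400600000000ffff0300000e0001")

raw = zlib.decompress(idat_payload)

# Assumed image layout: 2x2 pixels, 8-bit RGB, no interlacing.
width, height, channels = 2, 2, 3
stride = 1 + width * channels  # one filter byte plus the pixel bytes per row

rows = []
for y in range(height):
    line = raw[y * stride:(y + 1) * stride]
    filter_type, pixels = line[0], line[1:]
    rgb = [tuple(pixels[i:i + channels]) for i in range(0, len(pixels), channels)]
    rows.append((filter_type, rgb))

print(len(raw), rows)
```

Because both filter bytes are 0 (no filtering), the pixel bytes can be read directly here; any other filter type would have to be reversed first.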

"Without using any libraries" means you need 3 separate routines to:

  • read and parse the PNG chunk structure; this gives you the compressed IDAT data as well as necessary information such as width, height, and color depth;
  • decompress the zlib part(s) into raw binary data;
  • parse the decompressed data, handle Adam7 interlacing if necessary, and apply the row filters.
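The first of these routines could look roughly like the sketch below. It assumes the standard chunk layout (4-byte big-endian length, 4-byte type, payload, 4-byte CRC-32 over type plus payload) and a well-formed 13-byte IHDR. The names read_chunks and parse_png are hypothetical, and zlib.crc32 is borrowed only for the checksum; a library-free version would need its own CRC-32 table.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_chunks(data: bytes):
    """Yield (type, payload) for each chunk, verifying each CRC-32."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(ctype + payload) != crc:
            raise ValueError(f"bad CRC in {ctype!r} chunk")
        yield ctype, payload
        pos += 12 + length
        if ctype == b"IEND":
            break

def parse_png(data: bytes):
    """Return the IHDR fields and the concatenated IDAT payloads."""
    header, idat = None, b""
    for ctype, payload in read_chunks(data):
        if ctype == b"IHDR":
            keys = ("width", "height", "bit_depth", "color_type",
                    "compression", "filter", "interlace")
            header = dict(zip(keys, struct.unpack(">IIBBBBB", payload)))
        elif ctype == b"IDAT":
            idat += payload  # multiple IDAT chunks form one zlib stream
    return header, idat
```

Calling parse_png on the bytes of a whole file returns the header dictionary plus a single zlib stream, which is the input for the decompression step.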

Only after completing these three steps will you have access to the raw image data. Of these, you seem to understand step (1) well. Step (2) is much harder to do yourself; I personally cheated and used miniz in my own PNG processing programs. Step (3), again, is just a matter of determination. All the necessary bits of information can be found on the internet, but it takes a while to get everything in the right order. (I recently ran into a bug with the rarely used Paeth row filter; it went unnoticed because that filter rarely shows up in real-world images.)
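As an illustration of step (3), here is a sketch of reversing the five PNG row filters (None, Sub, Up, Average, Paeth) as defined in the PNG specification. It assumes a non-interlaced image whose pixels occupy whole bytes (for example 8-bit RGB, where bpp is 3); the function names are hypothetical and this is not the code from my own programs.

```python
def paeth(a: int, b: int, c: int) -> int:
    """Paeth predictor: a = left, b = above, c = upper-left."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c

def unfilter(raw: bytes, width: int, height: int, bpp: int) -> list:
    """Undo the row filters; bpp is bytes per pixel (3 for 8-bit RGB)."""
    stride = width * bpp
    prev = bytearray(stride)  # the row above the first row counts as all zeros
    rows = []
    for y in range(height):
        start = y * (stride + 1)
        ftype = raw[start]  # filter type byte at the start of each row
        cur = bytearray(raw[start + 1:start + 1 + stride])
        for i in range(stride):
            a = cur[i - bpp] if i >= bpp else 0   # left, already reconstructed
            b = prev[i]                           # above
            c = prev[i - bpp] if i >= bpp else 0  # upper-left
            if ftype == 0:
                pred = 0
            elif ftype == 1:
                pred = a
            elif ftype == 2:
                pred = b
            elif ftype == 3:
                pred = (a + b) // 2
            elif ftype == 4:
                pred = paeth(a, b, c)
            else:
                raise ValueError(f"unknown filter type {ftype}")
            cur[i] = (cur[i] + pred) & 0xFF
        rows.append(bytes(cur))
        prev = cur
    return rows
```

For the 14 zero bytes above, unfilter(raw, 2, 2, 3) returns two rows of six zero bytes each, i.e. four black pixels. The tie-breaking order in paeth (left, then above, then upper-left) matters, which is one place a subtle bug can hide.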

See "Generating Quick PNG Encoder Errors" for a similar discussion and "Attempting to Understand zlib / deflate in PNG Files" for an in-depth look at the Deflate scheme.
