Determine whether a PDF is color, grayscale, or black and white

What are the ways to check whether a PDF file is color, grayscale, or black and white?



3 answers


You can use Ghostscript's inkcov device to get color information for each PDF page. Here is an example command for a sample PDF of mine (cmyk.pdf), together with its output:

gs -o - -sDEVICE=inkcov cmyk.pdf

   GPL Ghostscript 9.10 (2013-08-30)
   Processing pages 1 through 5.

   Page 1
    0.00000  0.00000  0.00000  0.02231 CMYK OK
   Page 2
    0.02360  0.02360  0.02360  0.02360 CMYK OK
   Page 3
    0.02525  0.02525  0.02525  0.00000 CMYK OK
   Page 4
    0.00000  0.00000  0.00000  0.01983 CMYK OK
   Page 5
    0.13274  0.13274  0.13274  0.03355 CMYK OK

      

If you add the -q parameter, the output is reduced to the following:

gs -q -o - -sDEVICE=inkcov cmyk.pdf

    0.00000  0.00000  0.00000  0.02231 CMYK OK
    0.02360  0.02360  0.02360  0.02360 CMYK OK
    0.02525  0.02525  0.02525  0.00000 CMYK OK
    0.00000  0.00000  0.00000  0.01983 CMYK OK
    0.13274  0.13274  0.13274  0.03355 CMYK OK
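
For scripting, the same command can be run from a program and its output parsed. The following is a minimal Python sketch, assuming gs is on the PATH and that the output has the format shown above. The function names parse_inkcov and run_inkcov are made up for illustration.

```python
import re
import subprocess

# One inkcov result line: four decimal numbers followed by "CMYK OK".
_LINE = re.compile(
    r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+CMYK OK\s*$"
)


def parse_inkcov(text):
    """Return one (c, m, y, k) tuple of floats per page found in `text`."""
    pages = []
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            pages.append(tuple(float(v) for v in match.groups()))
    return pages


def run_inkcov(pdf_path):
    """Run the gs command shown above on `pdf_path` and parse its output."""
    result = subprocess.run(
        ["gs", "-q", "-o", "-", "-sDEVICE=inkcov", pdf_path],
        capture_output=True, text=True, check=True,
    )
    return parse_inkcov(result.stdout)
```

Lines that do not match the pattern (the banner and the "Page N" headers) are skipped, so the parser should work with or without -q.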

      

How should these numbers be interpreted?

  • Each column represents one ink; from left to right: Cyan (C), Magenta (M), Yellow (Y), and Black (K).
  • A value of 0.00000 means that the ink is not used at all. A value of 1.00000 means 100% coverage of the page with the corresponding ink. The value 0.02360 for each ink on page 2 means that each ink (including black) covers 2.36% of the full page.

Look at the values for page 1: the same value, 0.00000, for cyan, magenta, and yellow, but 0.02231 for black. This means that page 1 uses only black ink, and that 2.231% of the page area is covered with it.

Take page 2: here each of the 4 inks is listed with a value of 0.02360. Each ink covers 2.36% of the full page.

Look also at the values for page 3: 0.02525 for C, M, and Y, and 0.00000 for black. So this page uses no black ink at all, but equal amounts of each color ink, each covering 2.525% of the full page.

Page 4: the result is similar to page 1.

Page 5: see for yourself ...
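
The reading of the numbers above can be turned into a small per-page check. This is only a sketch: classify_page is a hypothetical helper, and the values are copied by hand from the inkcov output for cmyk.pdf.

```python
def classify_page(c, m, y, k):
    """Classify one page from its inkcov values (fractions of the page area)."""
    if c == 0 and m == 0 and y == 0:
        # No color ink at all: the page is empty or uses black ink only.
        return "blank" if k == 0 else "black ink only"
    return "uses color inks"


# Values copied from the inkcov output for cmyk.pdf above.
pages = [
    (0.00000, 0.00000, 0.00000, 0.02231),
    (0.02360, 0.02360, 0.02360, 0.02360),
    (0.02525, 0.02525, 0.02525, 0.00000),
    (0.00000, 0.00000, 0.00000, 0.01983),
    (0.13274, 0.13274, 0.13274, 0.03355),
]

for number, values in enumerate(pages, start=1):
    print(f"Page {number}: {classify_page(*values)}")
```

Note that "black ink only" does not distinguish pure black from shades of gray made with black ink (pages 1 and 4 fall into the same class). Equal C, M, and Y values do not prove that a page looks gray either: page 5 has equal values but contains colored rectangles.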

Caveats:

  • The inkcov device always prints CMYK values, never RGB values. The reason is that it converts all RGB colors to CMYK before analyzing the color coverage of the pages. This, of course, introduces some inaccuracies (which you should consider before relying on this tool).
  • You need to use Ghostscript version 9.05 or newer (on MS Windows: v9.07 or newer). Earlier versions do not have the inkcov device.
  • You will most likely come across PDF pages that appear to contain no color, only shades of gray, when viewed in a PDF viewer or printed on paper, yet for which inkcov reports non-zero C, M, and Y values (as on pages 2 and 3 above). This is because shades of gray can be composed from equal amounts of the different color inks.

Update

The following picture reproduces the 5 pages of the cmyk.pdf used above. It should give a rough impression of what they look like in a PDF viewer, and make it easier to understand how the ink coverage values above come about:

[Image: the 5 pages of cmyk.pdf]

Here is the Ghostscript command I originally used to create the cmyk.pdf above:

gs                  \
  -o cmyk.pdf       \
  -sDEVICE=pdfwrite \
  -g5950x2105       \
  -c "/F1 {100 100 moveto /Helvetica findfont 42 scalefont setfont} def" \
  -c "F1                        (100% 'pure' black)    show showpage"    \
  -c "F1 .5 .5 .5   setrgbcolor  (50% 'rich' rgbgray)  show showpage"    \
  -c "F1 .5 .5 .5 0 setcmykcolor (50% 'rich' cmykgray) show showpage"    \
  -c "F1 .5         setgray      (50% 'pure' gray)     show showpage"    \
  -c "   1 0 0 0 setcmykcolor 100 130 64 64 rectfill"                    \
  -c "   0 1 0 0 setcmykcolor 200 130 64 64 rectfill"                    \
  -c "   0 0 1 0 setcmykcolor 300 130 64 64 rectfill"                    \
  -c "   0 0 0 1 setcmykcolor 400 130 64 64 rectfill"                    \
  -c "   0 1 1 0 setcmykcolor 100  30 64 64 rectfill"                    \
  -c "   1 0 1 0 setcmykcolor 200  30 64 64 rectfill"                    \
  -c "   1 1 0 0 setcmykcolor 300  30 64 64 rectfill"                    \
  -c "   1 1 1 0 setcmykcolor 400  30 64 64 rectfill        showpage"

      



The traditional way to do this would be to use a preflight tool such as the tools from callas software (caveat: I'm affiliated with this company). But if this is the only aspect of the PDF you want to check, that is probably overkill.



I would think a second possible approach would be to use a tool that can convert the PDF to images, convert those images to CMYK, and then check whether there is anything on the C, M, or Y channels of the generated images.
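
The channel check at the end of that approach could look roughly like the following Python sketch. It assumes that some other tool has already rasterized a page into interleaved 8-bit CMYK samples (4 bytes per pixel); that rendering step is not shown, and the names cmy_channel_maxima and uses_color_ink are hypothetical.

```python
def cmy_channel_maxima(cmyk_bytes):
    """Return the largest C, M and Y sample in interleaved 8-bit CMYK data."""
    if len(cmyk_bytes) % 4 != 0:
        raise ValueError("expected 4 bytes (C, M, Y, K) per pixel")
    # Slicing with a step of 4 picks out one channel at a time.
    return tuple(max(cmyk_bytes[i::4], default=0) for i in range(3))


def uses_color_ink(cmyk_bytes, threshold=0):
    """True if any C, M or Y sample exceeds `threshold` (0..255)."""
    return any(value > threshold for value in cmy_channel_maxima(cmyk_bytes))


# Two pixels each: pure K samples versus one pixel with some cyan.
black_only = bytes([0, 0, 0, 255, 0, 0, 0, 128])
with_cyan = bytes([0, 0, 0, 255, 40, 0, 0, 0])
print(uses_color_ink(black_only), uses_color_ink(with_cyan))
```

The result depends on how the converter maps the PDF's colors to CMYK: a gray that is rendered as equal amounts of C, M, and Y will be reported as using color ink.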



Amine,

This is Mohammad from LEADTOOLS support. I noticed that you posted a similar question on our LEADTOOLS support forums. I already posted an answer there, and here is a slightly modified copy of that answer:

/******************************************/

If the PDF page has only black text on a white background, loading it with the default settings produces shades of gray around the edges of the text for a smoother display, as shown in the accompanying image.

If you want black text to be rasterized as pure black without grayscale, change the settings before loading with LEADTOOLS v18 as follows:

  • Set the UsePdfEngine property of the PDF load options like this:

    RasterCodecs.Options.Pdf.Load.UsePdfEngine = true;

  • Set the TextAlpha property of the PDF load options as follows:

    RasterCodecs.Options.Pdf.Load.TextAlpha = 1;

  • Load the PDF file using the default bits per pixel (24 bits):

    RasterCodecs.Load("BlackTextWhiteBackground.pdf");

  • Count the unique colors in the image using the ColorCountCommand class. If the number of colors is more than two, the image is not black and white. This can happen if it contains non-black text or other colored images or graphics:

    ColorCountCommand MyCommand = new ColorCountCommand();
    MyCommand.Run(_viewer.Image);

Make sure "Leadtools.PdfEngine.dll" is in the output folder of your project (next to the EXE).

/****************************************/

[Image: black text rendered with gray shades]

Edit, in response to a comment about detecting gray pages:

You can also determine whether the page is color or pure grayscale. Add the following code after loading as 24 bits and counting the colors:

if (MyCommand.ColorCount > 2 && MyCommand.ColorCount <= 256) //could be gray
{
   ColorResolutionCommand colorRes = new ColorResolutionCommand(ColorResolutionCommandMode.InPlace, 8, 
      RasterByteOrder.Bgr,RasterDitheringMethod.None, ColorResolutionCommandPaletteFlags.Optimized, null);
   colorRes.Run(_viewer.Image);
   if(_viewer.Image.GrayscaleMode == RasterGrayscaleMode.None)
      MessageBox.Show("image is NOT grayscale");
   else
      MessageBox.Show("image is grayscale, its mode is: " + _viewer.Image.GrayscaleMode);
}
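
For readers without LEADTOOLS, the decision logic (count the unique colors, then check for gray) can be sketched independently of any library. The Python below is not the LEADTOOLS API: classify_pixels is a hypothetical helper that takes already-decoded 8-bit RGB pixels, and it tests R = G = B directly instead of reducing to an 8-bit palette and reading a grayscale mode as the C# snippet does.

```python
def classify_pixels(pixels):
    """Classify an iterable of (r, g, b) tuples with 8-bit samples."""
    colors = set(pixels)  # the unique colors in the image
    if colors <= {(0, 0, 0), (255, 255, 255)}:
        return "black and white"
    if all(r == g == b for r, g, b in colors):
        # Every color is neutral: equal red, green and blue.
        return "grayscale"
    return "color"


print(classify_pixels([(0, 0, 0), (255, 255, 255)]))
print(classify_pixels([(0, 0, 0), (128, 128, 128), (255, 255, 255)]))
print(classify_pixels([(0, 0, 0), (200, 30, 30)]))
```

With anti-aliased text, a black-on-white page is classified as grayscale here, which matches the behavior described above for the default load settings.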

      







