How to check if file content is valid

To determine the real file type from the file content (not the extension), I am using Apache Tika.

I wrote the following code:

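    import java.io.BufferedInputStream;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import org.apache.tika.detect.Detector;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.mime.MediaType;
    import org.apache.tika.parser.AutoDetectParser;

    // Open the file and wrap it so the detector can mark/reset the stream
    try (InputStream is = new FileInputStream("D:\\video.mp4");
            BufferedInputStream bis = new BufferedInputStream(is)) {
        AutoDetectParser parser = new AutoDetectParser();
        Detector detector = parser.getDetector();
        Metadata md = new Metadata();
        // Detect the media type from the stream content, not the file name
        MediaType mediaType = detector.detect(bis, md);
        System.out.println(mediaType);
    }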

      

This code outputs image/jpeg.

This is correct: the file is really a JPEG, and I only changed its extension.
Now I want to check that the file is an image.
I cannot find an enum for this in the MediaType class.
The only way I know is the following:

    mediaType.toString().startsWith("image");

      

But this code looks ugly.
Can you recommend a nicer solution?


2 answers


You will see that the MediaType class has the methods getType() and getSubtype(). What you are looking for is the type, which is "image" for any image/* media type. The subtype in this case will be "jpeg".

So your test should be:



    if (mediaType.getType().equals("image")) {
        // Deal with image
    }

      


OK, AFAIK the only somewhat more reliable way to check whether the file is a real GIF, PNG, or whatever format you need is to check the unique "magic" byte sequence at the start of each of those file types.
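To illustrate what checking the magic bytes means, here is a minimal sketch in plain Java. The class name ImageMagic and its sniff methods are hypothetical helpers made up for this example, not part of Tika or the JDK, and only three formats are covered. A library such as Tika performs the same kind of comparison against a much larger table of signatures.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper (not part of Tika): recognizes a few image formats
// by comparing the first bytes of the data with known signatures.
final class ImageMagic {

    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] GIF87 = {'G', 'I', 'F', '8', '7', 'a'};
    private static final byte[] GIF89 = {'G', 'I', 'F', '8', '9', 'a'};

    /** Returns "png", "jpeg", "gif", or null if the header matches none of them. */
    static String sniff(byte[] header) {
        if (startsWith(header, PNG)) return "png";
        if (startsWith(header, JPEG)) return "jpeg";
        if (startsWith(header, GIF87) || startsWith(header, GIF89)) return "gif";
        return null;
    }

    /** Reads at most the first 8 bytes of the file and sniffs them. */
    static String sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] header = new byte[8];
            int n = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (n < header.length) {
                int r = in.read(header, n, header.length - n);
                if (r < 0) break;
                n += r;
            }
            return sniff(Arrays.copyOf(header, n));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

Calling ImageMagic.sniff(Paths.get("D:\\video.mp4")) on the renamed file from the question should then report the format by content rather than by extension. Note that a matching signature only shows that the file starts like an image; it does not prove the rest of the file is well-formed.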

If you are using Java 7, you can find a solution here: https://odoepner.wordpress.com/2013/07/29/transparently-improve-java-7-mime-type-recognition-with-apache-tika/



I am not the author of this and have not tested it!
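For reference, a rough sketch of the Java 7 side of this, assuming the linked post is about Files.probeContentType (its title suggests so). The class ProbeExample and its methods are hypothetical names for illustration. Files.probeContentType is a real JDK method, but the default detectors are platform-specific and may rely on the file extension, so the result is only as good as the detector that is installed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper built on the JDK's Files.probeContentType (Java 7+).
final class ProbeExample {

    /** Null-safe check that a MIME string such as "image/jpeg" has top-level type "image". */
    static boolean isImageType(String contentType) {
        return contentType != null && contentType.startsWith("image/");
    }

    /** Asks the installed file type detectors for the content type of the file. */
    static boolean isImage(Path file) throws IOException {
        // probeContentType returns null when no detector recognizes the file
        return isImageType(Files.probeContentType(file));
    }
}
```

The null check matters here because probeContentType returns a plain String rather than a MediaType object, so there is no getType() to call on it.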
