How can I extract a specific file from a TAR using Apache Commons?

I am using the Apache Commons 1.4.1 library to unpack ".tar" files.

Problem: I don't need to extract all files. I need to extract specific files from a specific location inside a tar archive. I only need a few .xml files, the TAR file is around 300MB, and unpacking the entire content is a waste of resources.

I am stuck and confused: should I be walking the nested directories myself, or is there another way?

Note: the location of the required .xml files is always the same.

TAR structure:

directory:E:\Root\data
     file:E:\Root\datasheet.txt
directory:E:\Root\map
     file:E:\Root\mapers.txt
directory:E:\Root\ui
     file:E:\Root\ui\capital.txt
     file:E:\Root\ui\info.txt
directory:E:\Root\ui\sales
     file:E:\Root\ui\sales\Reqest_01.xml
     file:E:\Root\ui\sales\Reqest_02.xml
     file:E:\Root\ui\sales\Reqest_03.xml
     file:E:\Root\ui\sales\Reqest_04.xml
directory:E:\Root\ui\sales\stores
directory:E:\Root\ui\stores
directory:E:\Root\urls
directory:E:\Root\urls\fullfilment
     file:E:\Root\urls\fullfilment\Cams_01.xml
     file:E:\Root\urls\fullfilment\Cams_02.xml
     file:E:\Root\urls\fullfilment\Cams_03.xml
     file:E:\Root\urls\fullfilment\Cams_04.xml
directory:E:\Root\urls\fullfilment\profile
directory:E:\Root\urls\fullfilment\registration
     file:E:\Root\urls\options.txt
directory:E:\Root\urls\profile

      

Limitation: I cannot use JDK 7 and have to stick with the Apache Commons library.

My current solution:

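import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
// TarEntry and TarInputStream come from the tar library in use.

public static void untar(File[] files) throws Exception {
        String path = files[0].toString();
        File tarPath = new File(path);
        // Directory prefix of the archive; entries are extracted next to it.
        String pathWithoutName = path.substring(0, path.lastIndexOf(tarPath.getName()));
        TarEntry entry;
        TarInputStream inputStream = null;
        FileOutputStream outputStream = null;
        try {
            inputStream = new TarInputStream(new FileInputStream(tarPath));
            while (null != (entry = inputStream.getNextEntry())) {
                int bytesRead;
                System.out.println("tarpath:" + tarPath.getName());
                System.out.println("Entry:" + entry.getName());
                System.out.println("pathname:" + pathWithoutName);
                if (entry.isDirectory()) {
                    // mkdirs() also creates any missing parent directories.
                    File directory = new File(pathWithoutName + entry.getName());
                    directory.mkdirs();
                    continue;
                }
                byte[] buffer = new byte[1024];
                outputStream = new FileOutputStream(pathWithoutName + entry.getName());
                while ((bytesRead = inputStream.read(buffer, 0, 1024)) > -1) {
                    outputStream.write(buffer, 0, bytesRead);
                }
                // Close each output file before moving on to the next entry.
                outputStream.close();
                outputStream = null;
                System.out.println("Extracted " + entry.getName());
            }
        } finally {
            // Release both streams even if extraction fails part-way.
            if (outputStream != null) {
                outputStream.close();
            }
            if (inputStream != null) {
                inputStream.close();
            }
        }
}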

      



1 answer


The TAR format is designed to be written or read as a stream (i.e. to/from a tape drive) and does not have a centralized header. So no, there is no way around reading the entire file to extract individual entries.
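Reading the whole stream does not have to mean writing every entry to disk, though. The following is a minimal sketch of that idea, not the Apache API: it parses plain 512-byte tar headers by hand using only the JDK (Java 6 compatible), copies out entries whose name starts with a given prefix and ends in ".xml", and skips the data of everything else. The class name SelectiveUntar and the method extractXml are hypothetical. It assumes forward-slash entry names and ignores long-name extensions (GNU/PAX), the ustar prefix field, and checksum validation.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of any Apache library.
class SelectiveUntar {
    private static final int BLOCK = 512; // tar headers and data are 512-byte aligned
    private static final Charset ASCII = Charset.forName("US-ASCII");

    /** Writes matching .xml entries (flattened to their file name) into outDir; returns how many. */
    static int extractXml(File tar, String prefix, File outDir) throws IOException {
        InputStream in = new FileInputStream(tar);
        int extracted = 0;
        try {
            byte[] header = new byte[BLOCK];
            while (readBlock(in, header)) {
                String name = field(header, 0, 100);
                if (name.length() == 0) {
                    break; // an all-zero block marks the end of the archive
                }
                String sizeField = field(header, 124, 12).trim(); // octal text
                long size = sizeField.length() == 0 ? 0 : Long.parseLong(sizeField, 8);
                long padding = (BLOCK - size % BLOCK) % BLOCK;
                boolean regularFile = header[156] == '0' || header[156] == 0;
                if (regularFile && name.startsWith(prefix) && name.endsWith(".xml")) {
                    OutputStream out = new FileOutputStream(new File(outDir, new File(name).getName()));
                    try {
                        copy(in, out, size);
                    } finally {
                        out.close();
                    }
                    extracted++;
                } else {
                    skipFully(in, size); // unwanted entry: skip its data, write nothing
                }
                skipFully(in, padding);
            }
        } finally {
            in.close();
        }
        return extracted;
    }

    // Fills buf completely; returns false only on a clean end of stream.
    private static boolean readBlock(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                if (off == 0) return false;
                throw new EOFException("truncated tar header");
            }
            off += n;
        }
        return true;
    }

    // Decodes a NUL-terminated ASCII header field.
    private static String field(byte[] block, int offset, int length) {
        int end = offset;
        while (end < offset + length && block[end] != 0) end++;
        return new String(block, offset, end - offset, ASCII);
    }

    private static void copy(InputStream in, OutputStream out, long remaining) throws IOException {
        byte[] buf = new byte[8192];
        while (remaining > 0) {
            int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n < 0) throw new EOFException("truncated tar entry");
            out.write(buf, 0, n);
            remaining -= n;
        }
    }

    private static void skipFully(InputStream in, long remaining) throws IOException {
        while (remaining > 0) {
            long n = in.skip(remaining);
            if (n <= 0) { // skip() may make no progress; fall back to reading one byte
                if (in.read() < 0) throw new EOFException("truncated tar archive");
                n = 1;
            }
            remaining -= n;
        }
    }
}
```

With the structure from the question, a call might look like SelectiveUntar.extractXml(new File("archive.tar"), "Root/ui/sales/", targetDir), where the archive name and the exact entry prefix are assumptions. The same name check could equally sit inside the question's getNextEntry() loop, so that only matching entries are opened for writing.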



If you want random access, you should use the ZIP format and open it with the JDK's ZipFile. Assuming you have enough virtual memory, the file will be memory-mapped, making random access very fast (I haven't looked to see whether it falls back to a random-access file if it cannot be memory-mapped).
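For comparison, here is a minimal sketch of reading a single entry by name with java.util.zip.ZipFile, assuming the data can be repackaged as a ZIP. The class name ZipEntryReader and the method readEntry are hypothetical, and the code avoids try-with-resources so that it stays usable before JDK 7.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration only.
class ZipEntryReader {

    /** Returns the bytes of the named entry, or null if the archive has no such entry. */
    static byte[] readEntry(File zip, String entryName) throws IOException {
        ZipFile zipFile = new ZipFile(zip);
        try {
            // Looked up by name; the other entries are not read.
            ZipEntry entry = zipFile.getEntry(entryName);
            if (entry == null) {
                return null;
            }
            InputStream in = zipFile.getInputStream(entry);
            try {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                return out.toByteArray();
            } finally {
                in.close();
            }
        } finally {
            zipFile.close();
        }
    }
}
```

A call such as ZipEntryReader.readEntry(new File("archive.zip"), "Root/ui/sales/Reqest_01.xml") would then return just that one file's contents; the archive name and entry path here are assumptions based on the structure in the question.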
