curl to tar to zip using pipes

I want to download a tar.gz archive, extract it, and compress it into a zip file in a single command in a bash script. The reason for this is to be independent of temporary files.

The code I'm using:

curl -L "someURL" | tar xOz --strip-components=1 | zip -@ test.zip

      

gives a lot of output on STDOUT, so I guess zip doesn't accept the pipe.

Maybe I'm missing something, but neither the zip man page nor the internet gives me more information than using -@ or -.



2 answers


tar will send all file data to stdout (but not filenames).

zip can't do anything with that (short of storing the one giant blob of all file contents as a single entry in the zip file, and I can't imagine you want that).



You need to extract the files to disk if you want to create a zip archive of them.

I was going to say that you could iterate over the entries in the tarball (by name) and extract each one into a pipe (although that would be very costly, since the tarball has to be scanned once per entry), but I don't see, at least in the zip man page I have, a way to get zip to compress data given to it via stdin. It looks like only filenames are taken that way.



The manpage for zip says (at least on my system):

If the file list is specified as -@ [Not on MacOS], zip takes the list of input files from standard input instead of from the command line. For example,
zip -@ foo

will store the files listed one per line on stdin in foo.zip.

The manpage for tar says:

-O, --to-stdout


              extract files to standard output.

So, in short: tar -O can output files (but not their names) in one long stream to stdout. But zip expects a list of filenames on stdin. So that's not going to work. And it's hard to figure out how to make it work, because a shell pipe just carries an unstructured stream, but to transfer the information from tar to zip you need to add some structure, even if it's minimal:

[filename][filedata][filename][filedata]...

      



Both the sender (tar) and the receiver (zip) would have to agree on the format of this structure. That isn't going to happen.
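To make the difference concrete, here is a small sketch that builds a sample tar archive in memory and views it both ways: as (filename, filedata) pairs, and as the flat stream that roughly corresponds to what tar -O prints. The file names, contents, and helper function names are made up for illustration.

```python
import io
import tarfile


def build_sample_tar():
    """Return the bytes of a small in-memory tar archive (hypothetical sample data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in [("a.txt", b"alpha\n"), ("b.txt", b"beta\n")]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def as_pairs(tar_bytes):
    """Structured view: one (filename, filedata) pair per regular file."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        return [(m.name, tar.extractfile(m).read()) for m in tar if m.isreg()]


def as_flat_stream(tar_bytes):
    """Roughly what `tar -O` emits: contents concatenated, names dropped."""
    return b"".join(data for _, data in as_pairs(tar_bytes))


sample = build_sample_tar()
print(as_pairs(sample))        # names and contents stay paired
print(as_flat_stream(sample))  # the file boundaries and names are gone
```

Once the contents are concatenated, nothing downstream can tell where one file ends and the next begins, which is why zip has nothing to work with.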

However, you can use interfaces to tar and zip other than the command-line utilities. For example, if you have Python installed, the following should work:

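#!/usr/bin/env python3
# Arguments: input tar archive, output zip file.
import sys
import tarfile
import zipfile

# "r:*" lets tarfile detect the compression (none, gzip, bzip2, xz).
with tarfile.open(sys.argv[1], "r:*") as tarf, \
     zipfile.ZipFile(sys.argv[2], "w", zipfile.ZIP_DEFLATED) as zipf:
    for m in tarf:
        # Copy regular files only; directories, links and devices are skipped.
        if m.isreg():
            zipf.writestr(m.name, tarf.extractfile(m).read())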

      

(It needs a lot more error checking. As written, it just fails on any error.)

You can do this as one very long shell one-liner, although personally I would just use the Python script above.

 python3 -c "$(printf %s \
   'import sys;import tarfile;import zipfile;' \
   'T=tarfile.open(sys.argv[1],"r:*");' \
   'Z=zipfile.ZipFile(sys.argv[2],"w",zipfile.ZIP_DEFLATED);' \
   '[Z.writestr(m.path,T.extractfile(m).read()) for m in T if m.isreg()];' \
   'Z.close()')" \
   input.tar output.zip

      

(If you want to pipe into it from curl, use /dev/stdin as the input. I think this will avoid Python trying to interpret stdin as a UTF-8 stream.)
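As an alternative for the piped case, here is a hedged sketch that reads the archive from standard input using tarfile's stream mode ("r|*") and the binary stdin buffer. Stream mode reads the input once from front to back, so it should not need a seekable file, whereas the "r:*" mode used above may need to seek, which a pipe cannot do. The function name tar_stream_to_zip and the script name in the comment are made up for illustration, and unlike the original command this does not strip the leading path component.

```python
import sys
import tarfile
import zipfile


def tar_stream_to_zip(src, zip_path):
    """Copy every regular file from a (possibly compressed) tar stream into a zip.

    src is a binary file object such as sys.stdin.buffer. It is read once,
    front to back, so it does not need to be seekable.
    """
    with tarfile.open(fileobj=src, mode="r|*") as tar, \
         zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for member in tar:
            if member.isreg():
                # In stream mode each member must be read before moving on.
                zf.writestr(member.name, tar.extractfile(member).read())


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: curl -L "someURL" | python3 tar2zip.py test.zip
    tar_stream_to_zip(sys.stdin.buffer, sys.argv[1])
```

Each file is still held in memory while it is copied, so very large members may call for a chunked copy instead.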
