Why does putting the GCM authentication tag at the end of the cipher stream require internal buffering during decryption?
In Java, the "default" AES / GCM provider SunJCE will - during the decryption process - an internal buffer 1) the encrypted bytes used as input, or 2) the decrypted bytes received as a result. The decryption application code will notice that it is Cipher.update(byte[])
returning an empty byte array and Cipher.update(ByteBuffer, ByteBuffer)
return the written length of 0. Then, when the process finishes, it Cipher.doFinal()
will return all decoded bytes.
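Here is a minimal sketch of what I mean (a throwaway example I put together, not the actual client/server code; key and IV handling are simplified):

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class GcmUpdateDemo {
    public static void main(String[] args) throws Exception {
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);

        // Encrypt a small message with the SunJCE default provider.
        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertextAndTag = enc.doFinal("hello world".getBytes(StandardCharsets.UTF_8));

        // Decrypt: update() produces nothing ...
        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] fromUpdate = dec.update(ciphertextAndTag);
        int produced = (fromUpdate == null) ? 0 : fromUpdate.length;
        System.out.println("update() produced " + produced + " bytes");    // 0

        // ... and doFinal() verifies the tag and returns all the plaintext at once.
        byte[] plaintext = dec.doFinal();
        System.out.println("doFinal() produced " + plaintext.length + " bytes");
    }
}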
My first question is: which bytes are buffered, number 1 or number 2 above?
I assume the buffering only happens during decryption and not encryption because, firstly, the problems arising from this buffering (described shortly) do not occur in my Java client, which encrypts files read from disk; they always occur on the server that receives those files and does the decryption. Secondly, it is said so here. From my own experience alone I cannot be sure, because my client uses a CipherOutputStream. The client does not explicitly use methods on the Cipher instance, so I cannot tell whether internal buffering is in use, because I cannot see what the update and final methods return.
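Roughly, the client side looks like this (a simplified sketch, not the real code; the method and parameter names are made up, and key/IV handling is omitted):

import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class ClientSideEncryption {

    // Encrypts a file from disk straight onto an already open socket stream.
    static void encryptTo(Path file, OutputStream socketOut, SecretKey key, byte[] iv)
            throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));

        // The stream drives cipher.update()/doFinal() internally, so this code
        // never sees what those methods return.
        try (InputStream in = Files.newInputStream(file);
             CipherOutputStream out = new CipherOutputStream(socketOut, cipher)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } // close() calls doFinal(), which appends the 16-byte tag to the stream
    }
}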
My real problems arise when the encrypted files I transfer from client to server become large. By large I mean over 100 MB.
What happens then is that Cipher.update() throws an OutOfMemoryError. Obviously because the internal buffer grows and grows.
Also, despite the internal buffering and the absence of result bytes from Cipher.update(), Cipher.getOutputSize(int) keeps reporting an ever-increasing target buffer length. Hence my application code is forced to allocate an ever-growing ByteBuffer that is fed into Cipher.update(ByteBuffer, ByteBuffer). If I try to cheat and hand over a byte buffer with less capacity, the update method throws a ShortBufferException (see #1 below). Knowing that I allocate huge byte buffers for no use at all is pretty demoralizing.
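So the decrypt loop on the server ends up looking something like this sketch (names are illustrative, not the actual code):

import javax.crypto.Cipher;
import java.nio.ByteBuffer;

class DecryptLoopSketch {

    // Called once per chunk of ciphertext arriving from the network.
    static void feed(Cipher cipher, ByteBuffer chunk) throws Exception {
        int required = cipher.getOutputSize(chunk.remaining()); // grows with every call
        ByteBuffer out = ByteBuffer.allocate(required);         // ever-larger allocation
        int written = cipher.update(chunk, out);                // always 0 for SunJCE GCM decryption
        // Handing over a smaller buffer than getOutputSize() reports makes
        // update() throw a ShortBufferException instead.
    }
}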
Given that internal buffering is the root of all evil, the obvious solution for me would be to split the files into chunks of, say, 1 MB each - I never have problems sending small files, only large ones. But I have a hard time understanding why the internal buffering happens in the first place.
The previously linked SO answer says that the GCM authentication tag is "appended at the end of the ciphertext", but that it "doesn't need to be" put there, and that this practice is what "mess[es] up the online nature of GCM decryption".
Why does putting the tag at the end mess things up for the decrypting server?
Here is my reasoning. The client uses some kind of hash function to compute the authentication tag, or MAC if you will. Apparently, MessageDigest.update() does not use an ever-growing internal buffer.
Then, on the receiving end, can't the server do the same thing? For starters, it could decrypt the bytes, albeit unauthenticated ones, feed them into the update function of its hash algorithm, and when the tag arrives, finish the digest and validate the MAC that the client sent.
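For comparison, this is the kind of incremental digesting I have in mind; MessageDigest folds each chunk into a fixed-size internal state instead of buffering everything (GCM computes its tag with GHASH rather than a plain hash, but the streaming idea is the same):

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

class StreamingDigest {

    // Memory use stays constant no matter how large the file is.
    static byte[] sha256(Path file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return md.digest();
    }
}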
I'm not a crypto guy, so please talk to me as if I were both dumb and crazy, but loving enough to care =) I sincerely thank you for taking the time to read this question and perhaps even shed some light!
UPDATE # 1
I am not using AD (associated data).
UPDATE # 2
I wrote a piece of software demonstrating AES/GCM encryption in Java, as well as the Secure Remote Password (SRP) protocol and binary file transfers in Java EE. The front-end client is written in JavaFX and can be used to dynamically change the encryption configuration or send files in chunks. At the end of a file transfer, statistics about the time taken to transfer the file and the time for the server to decrypt it are presented. The repository also has a document with some of my own research on GCM and Java.
Enjoy: https://github.com/MartinanderssonDotcom/secure-login-file-transfer/
# 1
It is interesting to note that if my server, which does the decryption, does not handle the cipher itself but instead uses a CipherInputStream, then no OutOfMemoryError is thrown. Instead, the client manages to transfer all the bytes across the wire, but somewhere during the decryption the request thread hangs indefinitely, and I can see that one Java thread (it might be the same thread) fully utilizes a CPU core, all while the file being written to disk remains inaccessible with a reported file size of 0. Then, after a tremendously long time, the Closeable source is closed and my catch clause manages to catch an IOException caused by: "javax.crypto.AEADBadTagException: Input too short - need tag".
What makes this situation odd is that transfers of small files work flawlessly with the same piece of code, so the tag can obviously be verified correctly. The problem must have the same root cause as when using the cipher explicitly, i.e. an ever-growing internal buffer. I cannot track on the server how many bytes were successfully read/decrypted, because as soon as reading of the cipher input stream begins, compiler reordering or other JIT optimizations make all my logging statements evaporate into thin air. They are [apparently] not executed at all.
Please note that this GitHub project and its associated blog post say that CipherInputStream is broken. But the tests provided by that project do not fail for me when using Java 8u25 and the SunJCE provider. And as already said, everything works for me as long as I use small files.
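For reference, the server-side pattern from this footnote looks roughly like the following (a simplified sketch with made-up names; key/IV handling omitted):

import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class ServerSideDecryption {

    // CipherInputStream drives cipher.update() internally. With SunJCE GCM the
    // ciphertext is buffered inside the cipher, so read() yields nothing until
    // the end of the stream is reached and doFinal() has verified the tag.
    static void decryptTo(InputStream requestBody, Path target, SecretKey key, byte[] iv)
            throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        try (CipherInputStream in = new CipherInputStream(requestBody, cipher);
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n); // with large files nothing arrives here for a very long time
            }
        }
    }
}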
The short answer is that update() cannot tell the ciphertext apart from the tag. The final() method can.
Long answer: Since the Sun specification requires the tag to be appended to the ciphertext, the tag needs to be stripped from the source buffer (the ciphertext) during (or rather, before) decryption. However, because the ciphertext can be provided over several calls to update(), Sun's code does not know when to strip off the tag (in the context of update()). The last call to update() does not know that it is the last call to update().
By waiting until final() to actually do any crypto, it knows that the full ciphertext + tag has been provided, and it can easily strip the tag off the end, given the tag length (which is specified in the parameter spec). It cannot do the crypto during update because it would either treat some ciphertext as the tag or vice versa.
Basically, this is the drawback of simply appending the tag to the ciphertext. Most other implementations (e.g. OpenSSL) provide the ciphertext and the tag as separate outputs (final() returns the ciphertext and another get() function returns the tag). Sun undoubtedly chose to do it this way so that GCM would fit their API (and not require any special GCM-aware code from developers).
The reason encryption works more straightforwardly is that it has no need to modify its input (the plaintext) the way decryption does. It simply takes all the data as plaintext. During final(), the tag is easily appended to the ciphertext output.
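A small sketch of what that means with SunJCE (my own illustration, key/IV handling simplified): the tag is simply the last tagLen/8 bytes of the encryption output, so you can split it off yourself to mimic the OpenSSL-style separate outputs, and you have to glue it back on before handing everything to the decrypting cipher.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

class TagAtTheEnd {
    public static void main(String[] args) throws Exception {
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        byte[] plaintext = "sixteen byte msg".getBytes(StandardCharsets.UTF_8);

        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] output = enc.doFinal(plaintext);

        // SunJCE's output is ciphertext || tag, i.e. 16 bytes (128 bits) longer.
        System.out.println(output.length - plaintext.length); // 16

        // An OpenSSL-style API would hand these out separately:
        byte[] ciphertext = Arrays.copyOfRange(output, 0, plaintext.length);
        byte[] tag = Arrays.copyOfRange(output, plaintext.length, output.length);

        // To decrypt with SunJCE they must be concatenated again.
        byte[] rejoined = new byte[output.length];
        System.arraycopy(ciphertext, 0, rejoined, 0, ciphertext.length);
        System.arraycopy(tag, 0, rejoined, ciphertext.length, tag.length);

        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        System.out.println(new String(dec.doFinal(rejoined), StandardCharsets.UTF_8));
    }
}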
What @blaze says about protecting you from yourself is a possible rationale, but it is not true that nothing can be returned until all of the ciphertext is known. Only a single ciphertext block is needed (OpenSSL, for example, will give it to you). Sun's implementation only waits because it cannot know that the first ciphertext block is just the first ciphertext block. For all it knows, you are encrypting less than one block (which would need padding) and supplying the tag all at once. Of course, even if it gave you the plaintext incrementally, you could not be sure of its authenticity until final(). All of the ciphertext is required for that.
There are, of course, any number of ways Sun could have gotten this to work: passing and retrieving the tag through special functions, requiring the length of the ciphertext during init(), or requiring the tag to be passed on the call to final(). But, as I said, they probably wanted to keep the usage as close to the other Cipher implementations as possible and maintain a uniform API.
I don't know why, but the current implementation writes every encrypted byte you throw at it into an internal buffer until doFinal(), no matter what you do.
The source can be found here: GaloisCounterMode.java
This method is called from update, gets the (buffered plus new) bytes, and is expected to decrypt them if it can:
int decrypt(byte[] in, int inOfs, int len, byte[] out, int outOfs) {
    processAAD();
    if (len > 0) {
        // store internally until decryptFinal is called because
        // spec mentioned that only return recovered data after tag
        // is successfully verified
        ibuffer.write(in, inOfs, len);
    }
    return 0;
}
But it just adds the data to ibuffer (a ByteArrayOutputStream) and returns 0 as the number of decrypted bytes. The full decryption is then performed in doFinal().
Given this implementation, your only choices are to avoid this cipher or to manually create chunks of data that you know your server can handle. There is no way to provide the tag data in advance and make it behave more nicely.
Until the entire ciphertext is known, the algorithm cannot determine whether it is correct or has been tampered with. No decrypted bytes can be returned for use before decryption and authentication are complete.
The ciphertext buffering might happen for the reasons @NameSpace stated, but the plaintext buffering is there to keep you from shooting yourself in the foot.
Your best option is to encrypt the data in small chunks. And don't forget to change the nonce value between them.
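A sketch of what chunked encryption could look like, with a fresh IV derived from a counter for every chunk (my own illustration; the length-prefix framing and helper names are made up):

import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;

class ChunkedGcmEncryption {

    private static final int CHUNK_SIZE = 1024 * 1024; // 1 MB per GCM message

    // Encrypts the input as a sequence of independent GCM messages, one per chunk.
    // The 12-byte IV is a random 4-byte prefix plus a chunk counter, so no
    // (key, IV) pair is ever reused.
    static void encrypt(InputStream in, OutputStream out, SecretKey key, byte[] ivPrefix)
            throws Exception {
        byte[] buf = new byte[CHUNK_SIZE];
        long counter = 0;
        int n;
        while ((n = readFully(in, buf)) > 0) {
            byte[] iv = ByteBuffer.allocate(12).put(ivPrefix, 0, 4).putLong(counter++).array();
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] chunk = cipher.doFinal(buf, 0, n); // ciphertext || tag for this chunk
            out.write(ByteBuffer.allocate(4).putInt(chunk.length).array()); // length prefix
            out.write(chunk);
        }
    }

    // Fills buf as far as possible; returns the number of bytes read (0 at end of stream).
    private static int readFully(InputStream in, byte[] buf) throws Exception {
        int total = 0;
        int n;
        while (total < buf.length && (n = in.read(buf, total, buf.length - total)) != -1) {
            total += n;
        }
        return total;
    }
}

The receiver can then decrypt and authenticate each chunk independently, so memory use stays bounded; a real protocol would additionally want to bind the chunk index and mark the final chunk, to detect reordering or truncation.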