Java NIO: how to read characters from a memory-mapped file with the correct encoding
For a new project, I have to read characters from files (with a particular encoding) to process the input. Since some of these files can be quite large (> 100 MB), I would like to test Java NIO's support for memory-mapped files for faster access.
However, I have not been able to figure out how to create something like a Reader that reads from a MappedByteBuffer and decodes with the correct encoding.
To create a MappedByteBuffer I am currently using:
RandomAccessFile raFile = new RandomAccessFile("myFile.bla", "r");
FileChannel channel = raFile.getChannel();
MappedByteBuffer mappedByteBuffer = channel.map(MapMode.READ_ONLY, 0, channel.size());
I know I can use getChar() to get a character from the MappedByteBuffer, but how can I specify the encoding? The Javadoc says that it always reads two bytes and combines them into one char, but what about ASCII-encoded files?
I also found the Channels.newReader(...) methods, which, however, only accept a channel and not a memory-mapped buffer. Is there something similar for a MappedByteBuffer?
Just to be clear, I know that memory mapping is somewhat expensive and therefore only useful for large files. I haven't decided yet whether to use it, but I want to evaluate it for my particular use case.
Thanks a lot in advance + best wishes Andreas
You can use a CharsetDecoder obtained from the Charset of your choice with Charset#newDecoder().
StandardCharsets.UTF_8.newDecoder().decode(mappedByteBuffer)
This returns a CharBuffer from which you can get the char values.
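To make this concrete, here is a minimal sketch that maps a file and decodes it in one pass. The class name MappedDecodeExample, the helper decodeMapped, and the temporary demo file are made up for illustration; only the NIO calls themselves come from the JDK. It assumes the file is smaller than 2 GB (the limit of a single mapping) and that the decoded text fits on the heap, since decode(...) allocates a new CharBuffer for the whole result.

```java
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedDecodeExample {

    // Hypothetical helper: maps the whole file read-only and decodes it with the given charset.
    // Malformed input makes decode(...) throw a CharacterCodingException (an IOException).
    public static CharBuffer decodeMapped(Path file, Charset charset) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // The mapping stays valid after the channel is closed.
            return charset.newDecoder().decode(mapped);
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo input: a small temporary UTF-8 file standing in for the real data.
        Path file = Files.createTempFile("mapped-demo", ".txt");
        file.toFile().deleteOnExit();
        Files.write(file, "h\u00e9llo".getBytes(StandardCharsets.UTF_8));

        CharBuffer chars = decodeMapped(file, StandardCharsets.UTF_8);
        for (int i = 0; i < chars.length(); i++) {
            System.out.println(chars.charAt(i)); // one decoded char per line
        }
    }
}
```

For an ASCII file you would pass StandardCharsets.US_ASCII instead; the decoder, not getChar(), decides how many bytes make up each char.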
Please note that this consumes the entire MappedByteBuffer: everything from its current position up to its limit is decoded. If you only want a few bytes, create a new ByteBuffer covering just those bytes of the original MappedByteBuffer and decode that.
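One possible way to decode only part of the buffer is to work on a duplicate with a narrowed position and limit, as sketched below. SliceDecodeExample and decodeRange are hypothetical names, and a wrapped byte array stands in for the MappedByteBuffer (which is itself a ByteBuffer). The sketch assumes the chosen range starts and ends on character boundaries; with a multi-byte encoding, a range that cuts a character in half makes the decoder throw.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SliceDecodeExample {

    // Hypothetical helper: decodes only the bytes [offset, offset + length) of source.
    public static CharBuffer decodeRange(ByteBuffer source, int offset, int length,
                                         Charset charset) throws CharacterCodingException {
        // duplicate() shares the content but has its own position and limit,
        // so the original buffer is left untouched.
        ByteBuffer view = source.duplicate();
        view.clear();                 // position = 0, limit = capacity
        view.limit(offset + length);
        view.position(offset);
        return charset.newDecoder().decode(view);
    }

    public static void main(String[] args) throws CharacterCodingException {
        // Stand-in for a MappedByteBuffer.
        ByteBuffer buffer =
                ByteBuffer.wrap("hello world".getBytes(StandardCharsets.US_ASCII));

        CharBuffer chars = decodeRange(buffer, 6, 5, StandardCharsets.US_ASCII);
        System.out.println(chars);             // prints "world"
        System.out.println(buffer.position()); // prints 0, the source was not consumed
    }
}
```

Decoding in ranges like this also keeps memory use bounded for large files, because each call allocates a CharBuffer only for the requested range.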