C++: working with bytes
Before you get started, make sure you understand endianness, C++ type sizes, and how weird they can be.
unsigned char is the only type with a guaranteed size: sizeof(unsigned char) is always 1, the natural byte of the machine (usually 8 bits). Therefore, if you are aiming for portability, it is a safe bet. It is not hard to use unsigned int or even long long instead to speed things up, with sizeof telling you how many bytes you get in each read (multiply by CHAR_BIT for the number of bits), although the code becomes more complex this way.
You should be aware that, for true portability, none of the built-in C++ types has a fixed width. An unsigned char can be 9 bits wide, and an int can be as narrow as 16 bits (an unsigned int is only guaranteed to cover the range 0 to 65535), as stated in this and this answer.
Another alternative, as suggested by user 1200129, is to use the Boost.Integer library to reduce all these uncertainties, provided it is supported on your platform. That said, if you are going to use external libraries anyway, there are many serialization libraries to choose from.
But above all, before you even start optimizing, do something simple that will work. Then you can start profiling when you start having timing issues.
It really depends on what you want to do, but I would say that, in general, the best speed comes from integers of the size your program is compiled for. So if you have a 32-bit program, choose 32-bit integers, and if you have a 64-bit program, choose 64-bit integers.
It might be different if you have multiple bytes or integers in your file. Without knowing the exact structure of your file, it is difficult to determine the optimal value.
If you are dealing with bytes, the best way to do this is to use a type of a specific size.
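#include <algorithm>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <vector>

int main()
{
    std::vector<std::int8_t> file_data;
    std::ifstream file("file_name", std::ios::binary);
    // Read every byte. istreambuf_iterator works on the raw stream buffer,
    // so whitespace bytes are kept (istream_iterator would silently skip them).
    std::copy(std::istreambuf_iterator<char>(file),
              std::istreambuf_iterator<char>(),
              std::back_inserter(file_data));
    // Write the bytes back unchanged; the output is opened in binary mode too.
    std::ofstream out("outfile", std::ios::binary);
    std::copy(file_data.begin(), file_data.end(),
              std::ostreambuf_iterator<char>(out));
}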
EDIT: fixed an error.
If you need to guarantee the number of bits in an integer type, use the header <stdint.h>. It is present in both C and C++ (as <cstdint> in C++). It defines types such as uint8_t (an unsigned 8-bit integer) that resolve to the appropriate built-in type on any platform that has one. Using them also tells other programmers reading your code that the number of bits is important.
If you're worried about performance, you can use types wider than 8 bits, for example uint32_t. However, when reading and writing files, you need to pay attention to the endianness of your system. Notably, if you have a little-endian system (for example, x86 and most ARM), the 32-bit value 0x12345678 will be written to the file as the four bytes 0x78 0x56 0x34 0x12, and if you have a big-endian system (for example, SPARC, PowerPC, Cell, some ARM, and Internet network byte order), it will be written as 0x12 0x34 0x56 0x78. (The same applies when the value is read back.) Of course, you can work with 8-bit types and eliminate this problem entirely.