C++: Any faster method to write a large binary?
Purpose
My goal is to quickly create a file from a large binary string (a string containing only 1s and 0s).
Straight to the point
I need a function that can achieve my goal. If I'm not clear enough, please read on.
Example
Test.exe is running...
.
Inputted binary string:
1111111110101010
Writing to: c:\users\admin\desktop\Test.txt
Done!
File(Test.txt) In Byte(s):
0xFF, 0xAA
.
Test.exe executed successfully!
Explanation
- First, Test.exe asks the user to enter a binary string.
- Then it converts each group of 8 bits in the input string to a byte value (shown in hexadecimal above).
- Finally, it writes the converted values to the Test.txt file.
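The conversion step described above can be sketched roughly as follows. This is only an illustration, not the code from the question: the helper name packBits is hypothetical, and it assumes the input contains only '0' and '1' characters, most significant bit first, with a length that is a multiple of 8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: packs a string of '0'/'1' characters into bytes,
// most significant bit first. Assumes the length is a multiple of 8.
std::vector<unsigned char> packBits(const std::string& bits)
{
    assert(bits.size() % 8 == 0); // assumption: whole bytes only
    std::vector<unsigned char> bytes;
    bytes.reserve(bits.size() / 8);
    for (std::size_t i = 0; i < bits.size(); i += 8) {
        unsigned value = 0;
        for (std::size_t j = 0; j < 8; ++j) {
            // Shift the bits collected so far and append the next one.
            value = (value << 1) | (bits[i + j] == '1' ? 1u : 0u);
        }
        bytes.push_back(static_cast<unsigned char>(value));
    }
    return bytes;
}
```

For the input from the example, "1111111110101010", this yields the two bytes 0xFF and 0xAA.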
What I tried
In a failed attempt to achieve my goal, I created this simple (and arguably terrible) function (hey, at least I tried):
void BinaryStrToFile( __in const char* Destination,
__in std::string &BinaryStr )
{
std::ofstream OutputFile( Destination, std::ofstream::binary );
for( ::UINT Index1 = 0, Dec = 0;
// 8-Bit binary.
Index1 != BinaryStr.length( )/8;
// Get the next set of binary value.
// Write the decimal value as unsigned char to file.
// Reset decimal value to 0.
++ Index1, OutputFile << ( ::BYTE )Dec, Dec = 0 )
{
// Convert the 8-bit binary to hexadecimal using the
// positional notation method - this is how its done:
// http://www.wikihow.com/Convert-from-Binary-to-Decimal
for( ::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
if( BinaryStr.substr( Index1 * 8, 8 )[ Index2 ] == '1' ) Dec += Inc;
}
OutputFile.close( );
};
Usage example
#include "Global.h"
void BinaryStrToFile( __in const char* Destination,
__in std::string &BinaryStr );
int main( void )
{
std::string Bin = "";
// Create a binary string that is a size of 9.53674 mb
// Note: The creation of this string will take awhile.
// However, I only start to calculate the speed of writing
// and converting after it is done generating the string.
// This string is just created for an example.
std::cout << "Generating...\n";
while( Bin.length( ) != 80000000 )
Bin += "10101010";
std::cout << "Writing...\n";
BinaryStrToFile( "c:\\users\\admin\\desktop\\Test.txt", Bin );
std::cout << "Done!\n";
#ifdef IS_DEBUGGING
std::cout << "Paused...\n";
::getchar( );
#endif
return( 0 );
};
Problem
Again, this was my unsuccessful attempt to achieve my goal. The problem is speed. It's too slow. It took over 7 minutes. Is there a way to quickly create a file from a large binary string?
Thanks in advance,
CLearner
I suggest removing the call to substr in the inner loop. You allocate a new string and then destroy it for every character you process. Replace this code:
for(::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
if( BinaryStr.substr( Index1 * 8, 8 )[ Index2 ] == '1' )
Dec += Inc;
with something like:
for(::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
if( BinaryStr[Index1 * 8 + Index2 ] == '1' )
Dec += Inc;
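Taking the same idea one step further, a complete function might look roughly like the sketch below. The name BinaryStrToFileFast is hypothetical, and the choice to collect all bytes in memory and write them with a single call is an assumption on my part rather than something from the question. It also assumes the string holds only '0' and '1' and that its length is a multiple of 8. It has not been benchmarked here.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical variant of BinaryStrToFile: indexes the string directly
// (no substr), collects the bytes in memory, and writes them in one call.
// Assumes BinaryStr holds only '0'/'1' and its length is a multiple of 8.
bool BinaryStrToFileFast(const char* Destination, const std::string& BinaryStr)
{
    assert(BinaryStr.length() % 8 == 0);
    std::string Bytes(BinaryStr.length() / 8, '\0');
    for (std::size_t Index1 = 0; Index1 != Bytes.size(); ++Index1) {
        unsigned Dec = 0;
        // Most significant bit first: shift, then append the next bit.
        for (std::size_t Index2 = 0; Index2 != 8; ++Index2)
            Dec = (Dec << 1) | (BinaryStr[Index1 * 8 + Index2] == '1' ? 1u : 0u);
        Bytes[Index1] = static_cast<char>(Dec);
    }
    std::ofstream OutputFile(Destination, std::ofstream::binary);
    OutputFile.write(Bytes.data(), static_cast<std::streamsize>(Bytes.size()));
    OutputFile.close();
    return !OutputFile.fail(); // false if opening, writing, or closing failed
}
```

For an 80,000,000-character input this needs an extra 10,000,000 bytes of memory for the output buffer, which seems an acceptable trade for avoiding a stream operation per byte.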
I think I'd consider something like this as a starting point:
#include <algorithm>
#include <bitset>
#include <fstream>
#include <iterator>
int main() {
std::ifstream in("junk.txt", std::ios::binary | std::ios::in);
std::ofstream out("junk.bin", std::ios::binary | std::ios::out);
std::transform(std::istream_iterator<std::bitset<8> >(in),
std::istream_iterator<std::bitset<8> >(),
std::ostream_iterator<unsigned char>(out),
[](std::bitset<8> const &b) { return b.to_ulong();});
return 0;
}
In a quick test, this processes an input file of 80 million bytes in 6 seconds on my machine. Unless your files are much larger than what you mentioned in your question, I'd guess that this is fast enough, and the simplicity will be hard to beat.
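To see what the stream extraction into std::bitset<8> does, here is a small sketch that reads from any std::istream instead of a file. The function name bytesFromBitStream is hypothetical. It assumes the input is an unbroken run of '0' and '1' characters whose length is a multiple of 8; each extraction consumes up to 8 such characters.

```cpp
#include <bitset>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads groups of 8 '0'/'1' characters from a stream
// and converts each group to one byte, most significant bit first.
std::vector<unsigned char> bytesFromBitStream(std::istream& in)
{
    std::vector<unsigned char> bytes;
    std::bitset<8> b;
    // operator>> extracts up to 8 characters per call and fails once
    // no '0'/'1' character is left to read.
    while (in >> b)
        bytes.push_back(static_cast<unsigned char>(b.to_ulong()));
    return bytes;
}
```

Passing a std::istringstream holding "1111111110101010" gives the bytes 0xFF and 0xAA, the same result the transform-based program would write to junk.bin for that input.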
So, instead of converting back and forth between std::strings, why don't you use a buffer of fixed-size integers for quick access?
// Needs <cstdint>, <cstring>, <fstream> and <iomanip>.
const size_t bufsz = 1000000;
uint32_t *buf = new uint32_t[bufsz];
memset(buf, 0xFA, sizeof(*buf) * bufsz);
std::ofstream ofile("foo.bin", std::ofstream::binary);
for (size_t i = 0; i < bufsz; i++) {
ofile << std::hex << std::setw(8) << std::setfill('0') << buf[i];
// or if you want raw binary data instead of formatted hex:
ofile.write(reinterpret_cast<char *>(&buf[i]), sizeof(buf[i]));
}
delete[] buf;
For me, this is done in a split second.
Something along these lines should be significantly faster:
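#include <climits>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <string>
void
text_to_binary_file(const std::string& text, const char *fname)
{
unsigned char wbuf[4096]; // 4k is a good size of "chunk to write to file"
unsigned int i = 0, j = 0;
std::filebuf fp; // dropping down to filebufs may well be faster
// for this problem
if (!fp.open(fname, std::ios::out|std::ios::trunc|std::ios::binary))
abort();
memset(wbuf, 0, 4096);
for (std::string::const_iterator p = text.begin(); p != text.end(); ++p) {
if (*p == '1') // set a bit only for a '1' character, MSB first
wbuf[i] |= (1u << (CHAR_BIT - (j+1)));
j++;
if (j == CHAR_BIT) {
j = 0;
i++;
}
if (i == 4096) {
if (fp.sputn(reinterpret_cast<const char *>(wbuf), 4096) != 4096)
abort();
memset(wbuf, 0, 4096);
i = 0;
j = 0;
}
}
// write what is left, including a trailing partial byte if there is one
const std::streamsize rest = i + (j != 0 ? 1 : 0);
if (fp.sputn(reinterpret_cast<const char *>(wbuf), rest) != rest)
abort();
fp.close();
}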
Correct error handling is left as an exercise.
Although this is late, I want to post my example for handling strings like this. Architecture-specific optimizations could use unaligned loads of characters into multiple registers to "squeeze" out the bits in parallel. This untested example code does not validate the characters and avoids alignment and endianness requirements. It assumes that the characters of the binary string represent contiguous octets (bytes) with the most significant bit first, rather than words, doublewords, etc., whose particular representation in memory (and in that string) would require special handling for portability.
//THIS CODE HAS NEVER BEEN TESTED! But I hope you get the idea.
//set up an ofstream with a 64KiB buffer
std::vector<char> buffer(65536);
std::ofstream ofs("out.bin", std::ofstream::binary|std::ofstream::out|std::ofstream::trunc);
ofs.rdbuf()->pubsetbuf(&buffer[0],buffer.size());
std::string::size_type bits = Bin.length();
std::string::const_iterator cIt = Bin.begin();
//You may treat cases, where (bits % 8 != 0) as error
uint8_t byte = 0;
for(std::string::size_type i = 0;i < (bits & (~std::string::size_type(0x7)));++i,++cIt)
{
//shift in the next bit, most significant bit first
byte = uint8_t((byte << 1) | (uint8_t(*cIt) - uint8_t('0')));
if((i & 0x7) == 0x7) //last bit of an octet: write it and start the next one
{
ofs.put(char(byte));
byte = 0;
}
}
ofs.flush();
This creates a 76.2 MB file (80,000,000 bytes) containing 1010101010101 ......
#include <stdio.h>
#include <iostream>
#include <fstream>
using namespace std;
int main( void )
{
char Bin=0;
ofstream myfile;
myfile.open (".\\example.bin", ios::out | ios::app | ios::binary);
int c=0;
Bin = 0xAA;
while( c!= 80000000 ){
myfile.write(&Bin,1);
c++;
}
myfile.close();
cout << "Done!\n";
return( 0 );
};