C++: Any faster method to write a large binary file?

Purpose

My goal is to quickly create a file from a large binary string (a string containing only 1s and 0s).

Straight to the point

I need a function that can achieve my goal. If that isn't clear enough, please read on.

Example

Test.exe is running...
.
Inputted binary string:
        1111111110101010
Writing to: c:\users\admin\desktop\Test.txt
        Done!
File(Test.txt) In Byte(s):
        0xFF, 0xAA
.
Test.exe executed successfully!

      

Explanation

  • First, Test.exe asks the user to enter a binary string.
  • Then it converts the input binary string to hexadecimal.
  • Finally, it writes the converted value to the Test.txt file.

I tried

In a (failed) attempt to achieve my goal, I created this simple (and arguably terrible) function (hey, at least I tried):

void BinaryStrToFile( __in const char* Destination,
                      __in std::string &BinaryStr )
{
    std::ofstream OutputFile( Destination, std::ofstream::binary );

    for( ::UINT Index1 = 0, Dec = 0;
         // 8-Bit binary.
         Index1 != BinaryStr.length( )/8;

         // Get the next set of binary value.
         // Write the decimal value as unsigned char to file.
         // Reset decimal value to 0.
         ++ Index1, OutputFile << ( ::BYTE )Dec, Dec = 0 )
    {
        // Convert the 8-bit binary to a decimal value using the
        // positional notation method - this is how it's done:
        // http://www.wikihow.com/Convert-from-Binary-to-Decimal
        for( ::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
            if( BinaryStr.substr( Index1 * 8, 8 )[ Index2 ] == '1' ) Dec += Inc;
    }
    OutputFile.close( );
};

      

Usage example

#include "Global.h"

void BinaryStrToFile( __in const char* Destination,
                      __in std::string &BinaryStr );

int main( void )
{
    std::string Bin = "";

    // Create a binary string of 80,000,000 characters, which
    // converts to about 9.53674 MB of data.
    // Note: The creation of this string will take a while.
    // However, I only start timing the conversion and the
    // writing after the string has been generated.
    // This string is just created for an example.
    std::cout << "Generating...\n";
    while( Bin.length( ) != 80000000 )
        Bin += "10101010";

    std::cout << "Writing...\n";
    BinaryStrToFile( "c:\\users\\admin\\desktop\\Test.txt", Bin );

    std::cout << "Done!\n";
#ifdef IS_DEBUGGING
    std::cout << "Paused...\n";
    ::getchar( );
#endif

    return( 0 );
};

      

Problem

Again, this was my unsuccessful attempt to achieve my goal. The problem is speed. It's too slow. It took over 7 minutes. Is there a way to quickly create a file from a large binary string?

Thanks in advance,

CLearner

+3




7 replies


I suggest removing the substr call in the inner loop. You allocate a new string and then destroy it for every character you process. Replace this code:

for(::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
    if( BinaryStr.substr( Index1 * 8, 8 )[ Index2 ] == '1' )
        Dec += Inc;

      



with something like:

for(::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
    if( BinaryStr[Index1 * 8 + Index2 ] == '1' )
        Dec += Inc;

      

+4




Most of your time is spent here:

   for( ::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
        if( BinaryStr.substr( Index1 * 8, 8 )[ Index2 ] == '1' ) Dec += Inc;

      



When I comment this out, the file is written in seconds. I think you need a better way to do your conversion.
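
For illustration, here is a minimal sketch of one way the conversion could be restructured without substr, indexing the string directly and building each byte with shifts. The name BinaryStrToFileFast is just for this sketch, not from the original post:

#include <cstddef>
#include <fstream>
#include <string>

// Sketch only: pack each group of 8 '0'/'1' characters into one byte
// and stream the bytes out through a single ofstream.
void BinaryStrToFileFast(const char* Destination, const std::string& BinaryStr)
{
    std::ofstream OutputFile(Destination, std::ofstream::binary);

    for (std::size_t i = 0; i + 8 <= BinaryStr.length(); i += 8)
    {
        unsigned char Byte = 0;
        for (std::size_t Bit = 0; Bit < 8; ++Bit)
            Byte = static_cast<unsigned char>((Byte << 1) | (BinaryStr[i + Bit] == '1'));
        OutputFile.put(static_cast<char>(Byte));
    }
}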

+3




I think I'd do something like this as a starting point:

#include <bitset>
#include <fstream>
#include <iterator>
#include <algorithm>

int main() { 
    std::ifstream in("junk.txt", std::ios::binary | std::ios::in);
    std::ofstream out("junk.bin", std::ios::binary | std::ios::out);

    std::transform(std::istream_iterator<std::bitset<8> >(in),
                   std::istream_iterator<std::bitset<8> >(),
                   std::ostream_iterator<unsigned char>(out),
                   [](std::bitset<8> const &b) { return b.to_ulong();});
    return 0;
}

      

In a quick test, this processed an 80-million-byte input file in about 6 seconds on my machine. Unless your files are much larger than what you mentioned in your question, my guess is that this is fast enough, and the simplicity will be hard to beat.
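
If the bits are already in a std::string, as in the question, the same transform can read from a std::istringstream instead of a file. A rough sketch, where Bin is the string from the question and write_bits is just a name for this example:

#include <algorithm>
#include <bitset>
#include <fstream>
#include <iterator>
#include <sstream>
#include <string>

// Sketch: run the in-memory binary string through the same bitset transform.
void write_bits(const std::string &Bin, const char *path) {
    std::istringstream in(Bin);
    std::ofstream out(path, std::ios::binary | std::ios::out);

    std::transform(std::istream_iterator<std::bitset<8> >(in),
                   std::istream_iterator<std::bitset<8> >(),
                   std::ostream_iterator<unsigned char>(out),
                   [](std::bitset<8> const &b) { return b.to_ulong(); });
}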

+2




So, instead of converting back and forth between std::strings, why don't you work directly with a buffer of integers and write that out, either formatted or as raw bytes?

#include <cstdint>
#include <cstring>
#include <fstream>
#include <iomanip>

const size_t bufsz = 1000000;

uint32_t *buf = new uint32_t[bufsz];
memset(buf, 0xFA, sizeof(*buf) * bufsz);
std::ofstream ofile("foo.bin", std::ofstream::binary);

for (size_t i = 0; i < bufsz; i++) {
    ofile << std::hex << std::setw(8) << std::setfill('0') << buf[i];
    // or, if you want raw binary data instead of formatted hex:
    // ofile.write(reinterpret_cast<char *>(&buf[i]), sizeof(buf[i]));
}

delete[] buf;

      

For me, this is done in a split second.
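
As a side note not in the original answer: when raw bytes are the goal, the whole buffer can also be written with a single call instead of element by element, which is usually faster still (reusing ofile, buf and bufsz from the snippet above):

// Write the entire buffer in one call instead of looping over elements.
ofile.write(reinterpret_cast<char *>(buf), sizeof(*buf) * bufsz);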

+1




For something completely different, this should be significantly faster:

#include <climits>   // CHAR_BIT
#include <cstdlib>   // abort
#include <cstring>   // memset
#include <fstream>
#include <string>

void
text_to_binary_file(const std::string& text, const char *fname)
{
    unsigned char wbuf[4096];  // 4k is a good size of "chunk to write to file"
    unsigned int i = 0, j = 0;
    std::filebuf fp;           // dropping down to filebufs may well be faster
                               // for this problem
    fp.open(fname, std::ios::out|std::ios::trunc|std::ios::binary);
    memset(wbuf, 0, 4096);

    for (std::string::const_iterator p = text.begin(); p != text.end(); p++) {
        if (*p == '1')
            wbuf[i] |= (1u << (CHAR_BIT - (j+1)));
        j++;
        if (j == CHAR_BIT) {
            j = 0;
            i++;
        }
        if (i == 4096) {
            if (fp.sputn(reinterpret_cast<const char *>(wbuf), 4096) != 4096)
                abort();
            memset(wbuf, 0, 4096);
            i = 0;
            j = 0;
        }
    }
    // flush whatever is left in the partially filled buffer
    std::streamsize tail = i + (j ? 1 : 0);
    if (fp.sputn(reinterpret_cast<const char *>(wbuf), tail) != tail)
        abort();
    fp.close();
}

      

Correct error handling is left as an exercise.
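
For reference, a hypothetical call site reusing the Bin string and output path from the question might look like:

// Hypothetical usage; Bin and the path come from the question's example.
text_to_binary_file(Bin, "c:\\users\\admin\\desktop\\Test.txt");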

+1




Late though it is, I want to post my example for handling strings like this. Architecture-specific optimizations could use unaligned loads of many characters into multiple registers to "squeeze" the bits out in parallel. This untested example code does not validate the characters and avoids those alignment and endianness requirements. It assumes that the characters of the binary string represent contiguous octets (bytes) with the most significant bit first, rather than words, doublewords, etc., whose particular memory representation would require special handling for portability.

//THIS CODE HAS NEVER BEEN TESTED! But I hope you get the idea.

#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

//set up an ofstream with a 64KiB buffer
std::vector<char> buffer(65536);
std::ofstream ofs("out.bin", std::ofstream::binary|std::ofstream::out|std::ofstream::trunc);
ofs.rdbuf()->pubsetbuf(&buffer[0],buffer.size());

//Bin is the binary string from the question
std::string::size_type bits = Bin.length();
std::string::const_iterator cIt = Bin.begin();

//You may treat cases where (bits % 8 != 0) as an error

//Initialize the first octet with its most significant bit
uint8_t byte = uint8_t(*cIt++) - uint8_t('0');
byte <<= 1;
for(std::string::size_type i = 1;i < (bits & (~std::string::size_type(0x7)));++i,++cIt)
{
    if((i & 0x7) != 0x7) //bits 6 ... 1 of the current octet
    {
        byte |= uint8_t(*cIt) - uint8_t('0');
        byte <<= 1;
    }
    else //bit 0: complete the octet, write it, and start the next one
    {
        byte |= uint8_t(*cIt) - uint8_t('0');
        ofs.put(char(byte));

        //advance to the most significant bit of the next octet, if any
        ++i;
        ++cIt;
        if(i >= (bits & (~std::string::size_type(0x7))))
            break;
        byte = uint8_t(*cIt) - uint8_t('0');
        byte <<= 1;
    }
}

ofs.flush();
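
To illustrate the "bits in parallel" idea mentioned above, here is an untested sketch of a SWAR-style variant: it loads eight '0'/'1' characters at a time, keeps only their low bits, and gathers them into one output byte with a single multiply. It assumes a little-endian machine and a bit count that is a multiple of 8; the function name and constants are mine, not part of the original answer.

#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>

// Sketch: pack 8 ASCII '0'/'1' characters into one byte, most significant bit first.
// Assumes little-endian byte order and Bin.length() % 8 == 0.
void pack_bits_swar(const std::string &Bin, std::ofstream &ofs)
{
    for (std::size_t i = 0; i + 8 <= Bin.length(); i += 8)
    {
        std::uint64_t chunk;
        std::memcpy(&chunk, Bin.data() + i, 8);   // alignment-safe 8-character load

        // '0' is 0x30 and '1' is 0x31, so the low bit of each character is the bit value.
        std::uint64_t bits = chunk & 0x0101010101010101ULL;

        // One multiply gathers the eight scattered bits into the top byte.
        std::uint8_t byte = std::uint8_t((bits * 0x8040201008040201ULL) >> 56);
        ofs.put(char(byte));
    }
}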

      

+1




This makes a 76.2 MB (80,000,000 byte) file filled with 10101010... bytes:

#include <stdio.h>
#include <iostream>
#include <fstream>

using namespace std;

int main( void )
{
    char Bin=0;
    ofstream myfile;
    myfile.open (".\\example.bin", ios::out | ios::app | ios::binary);
    int c=0;
    Bin = 0xAA;
    while( c!= 80000000 ){
        myfile.write(&Bin,1);
        c++;
    }
    myfile.close();
    cout << "Done!\n";
    return( 0 );
};

      

Here are the first bytes of the file: 0xAA, 0xAA, 0xAA, ...

-1

