Collatz and other sequences: how to achieve more precision and avoid segfault?

Here's a simple C program that I created from some old Java code (my first C program, so be nice;)). This is much faster than the corresponding Java code, but Java allows me to be more precise (I also use longs in Java) before it crashes.

To my surprise, Java code doesn't crash until the input is over 170,000,000. C code cannot handle anything over 1,050,000. Any suggestions for making this C code work better without requiring some crazy libraries? Thank!

#include <stdio.h>
#include <string.h> /* lets me use memset */
#include <stdlib.h> /* home of strtoull */

/* segfault 11 after 1047993 (Not anymore!!) */ 

unsigned long long getnext( unsigned long long x );

unsigned long long getnext( unsigned long long x )
{
    if (x == 1) { return 1; }
    if (x % 2 == 0) { return x/2; }
    return (3 * x + 1);
}

int main(int argc, char *argv[])
{
    int argind;
    for ( argind = 0; argind < argc; argind++ )
    {
        unsigned long long intrange;
        intrange = strtoull(argv[argind], NULL, 10);

       /** Improper allocation of memory. Fixed directly below. Thanks Marcus!
        * unsigned long long lengths[intrange+1];
        * memset(lengths, 0, sizeof(lengths));
        */ 

        unsigned long long *lengths= malloc((intrange+1) * sizeof(unsigned long long)); /* ptr points to mem. location of lengths array. */
        memset(lengths, 0, (intrange+1) * sizeof(unsigned long long));

        unsigned long long longestlen[2];
        unsigned long long seqindex;
        unsigned long long origdex;

        longestlen[0] = 0;
        longestlen[1] = 0;

        for (origdex = 2; origdex <= intrange; origdex++) 
        {
            seqindex = origdex;
            while ( seqindex > 1 )
            {
                lengths[ origdex ] += 1;
                seqindex = getnext(seqindex);

                if ( seqindex <= intrange )
                {
                    if ( lengths[ seqindex ] > 0)
                    {
                        lengths[ origdex ] =  lengths[ origdex ] + lengths[ seqindex ];
                        if ( lengths[origdex] > longestlen[1] ) { longestlen[0] = origdex; longestlen[1] = lengths[origdex]; }
                        seqindex = 1;
                        continue;
                    }
                }
            }
        }
        if(longestlen[0] > 0 ){ printf("Longest Collatz sequence for first %llu positive integers found at %llu with length %llu. \n", intrange, longestlen[0], longestlen[1]); }

    }
    return 0 ;
}

      

+3


source to share


1 answer


If you let your program run through the debugger (in my case:

gcc -o collatz -g collatz.c
gdb --args collatz 2000000
$run
...segfault at memset(lengths,0,sizeof(lengths))...

      

) you will see your segfault happens when you try to access the length!

The point is that you are not allocating heap for arrays correctly; which is a bit tricky in C (which is why I usually recommend that people coming from other languages ​​learn modern C ++, not C).

You must do



unsigned long long *lengths = malloc(...);

      

to provide you with fresh allocated memory.

Also, sizeof(lengths)

that is wrong, since it will be the size of the pointer to the address of the beginning of your array. Use instead sizeof(unsigned long long)

. Of course, your code then has to read memset(lengths, 0, sizeof(unsigned long long)*(intrange+1))

to set intrange+1

times the number of bytes a unsigned long long

for a byte 0

.

More on malloc

you can use man malloc

if you are logged in with man

; Typically, you should "free" the memory that you allocate malloc

after use. In your case, at the end of the program, this is done automatically, but for more complex software that does not call free(ptr)

what you purchased type *ptr = malloc(...)

, after you no longer need memory, it is a memory leak.

Again, this is a very typical C problem. If you write your code in C ++ you get almost no performance penalty, get the ability to use things like std::vector

to provide you with variable-sized arrays with automatic construction and value padding. which will automatically be deallocated when exiting the area in which it was declared.

+3


source







All Articles