Ruby: summing integers from stdin

I have:

$ ruby -v
ruby 2.0.0p648 (2015-12-16 revision 53162) [universal.x86_64-darwin16]

      

Suppose you have a sequence of integers, 1..n.

A Ruby newbie might sum the sequence like this:

$ ruby -e 's=0
     for i in 1..500000
        s+=i
     end
     puts s'
125000250000

      

Now let's say I have the same sequence coming from stdin:

$ seq 1 500000 | ruby -lne 'BEGIN{s=0}
                            s+=$_.to_i
                            END{puts s} '   
125000250000

      

So far so good.

Now increase the terminal value from 500,000 to 5,000,000:

$ ruby -e 's=0
         for i in 1..5000000
            s+=i
         end
         puts s'
12500002500000   <=== CORRECT

$ seq 1 5000000 | ruby -lne 'BEGIN{s=0}
                             s+=$_.to_i
                             END{puts s} '
500009500025     <=== WRONG!

      

Ruby produces a different sum.

awk and perl give the correct result with the same sequence:

$ seq 1 5000000 | awk '{s+=$1} END{print s}'
12500002500000
$ seq 1 5000000 | perl -nle '$s+=$_; END{print $s}'
12500002500000

      

Why is ruby producing the wrong sum? I do not think it is overflow, since awk and perl work properly on the same input.


Conclusions:

Thanks to David Aldridge for diagnosing this.

  • The OS X and BSD seq switches to floating-point output at 1,000,000, whereas GNU seq supports arbitrary-precision integers. With its default format, the OS X seq is useless as a source of integers of 1,000,000 or more. OS X example:

    $ seq  999999 1000002
    999999
    1e+06
    1e+06
    1e+06
    
          

  • Ruby's .to_i method silently converts the leading part of a string to an integer, and that is the "bug" in this case. Example:

    irb(main):002:0> '5e+06'.to_i
    #=> 5
    
          

  • The "proper" line in the script is either to use $_.to_f.to_i, which parses the input as a float, or to use Integer($_), so that the script does not fail silently. awk and perl parse 5e+06 as a float implicitly; ruby does not:

    $ echo '5e+06' | awk '{print $1+0}'
    5000000
    $ echo '5e+06' | ruby -lne 'print $_.to_i+0'
    5
    
          

  • And thanks to Stefan Schüßler for opening a Ruby feature request regarding the behavior of .to_i.
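
The difference between the conversions named above can be seen side by side. This is a minimal sketch, assuming the input line is the string "5e+06" as printed by the OS X seq; the variable names are only illustrative.

```ruby
# The string that OS X seq emits for 5000000 with its default format.
line = "5e+06"

lenient   = line.to_i        # reads leading digits, ignores the rest
via_float = line.to_f.to_i   # parses the exponent, then truncates to an integer

strict = begin
  Integer(line)              # accepts only a complete integer literal
rescue ArgumentError => e
  e.class                    # keep the exception class to show what happened
end

puts lenient    # 5
puts via_float  # 5000000
puts strict     # ArgumentError
```

Which one is appropriate depends on whether a malformed line should be tolerated or should stop the script.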



2 answers


I'm not sure if this is 100% the answer, but I noticed that:

seq 500000 500001 | ruby -lne 'BEGIN{}
                             puts $_
                             END{} '
500000
500001

      

... but ...

seq 5000000 5000001 | ruby -lne 'BEGIN{}
                             puts $_
                             END{} '
5e+06
5e+06

      

... so the "relaxed" approach that #to_i takes to converting values to integers still runs without complaint ...

seq 5000000 5000001 | ruby -lne 'BEGIN{}
                             puts $_.to_i
                             END{} '
5
5

      

... but the stricter #to_int won't, since String does not define it:



seq 5000000 5000001 | ruby -lne 'BEGIN{}
                             puts $_.to_int
                             END{} '
-e:2:in `<main>': undefined method `to_int' for "5e+06":String (NoMethodError)
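
As a side note, #to_int is meant for objects that are already numeric, which is why String lacks it; the strict way to parse a string is Integer(). A rough sketch of the idea, where strict_sum is a hypothetical helper and StringIO stands in for stdin:

```ruby
require 'stringio'

# String has no #to_int, while Float does.
p "5e+06".respond_to?(:to_int)  # false
p 5.0e6.respond_to?(:to_int)    # true

# Hypothetical helper: sum integer lines, raising on the first malformed one.
def strict_sum(io)
  io.each_line.inject(0) { |sum, line| sum + Integer(line.chomp) }
end

p strict_sum(StringIO.new("1\n2\n3\n"))  # 6

begin
  strict_sum(StringIO.new("999999\n1e+06\n"))
rescue ArgumentError => e
  # Integer() rejects "1e+06" instead of quietly reading it as 1.
  puts "rejected: #{e.message}"
end
```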

      

Edit: I also noticed:

seq 5000000 5000001

5e+06
5e+06

      

So the -f flag has to be passed to seq to get output without the exponent notation.

Edit again, final answer:

seq -f %f 1 5000000 | ruby -lne 'BEGIN{s=0}
                                  s+=$_.to_i
                                 END{puts s} '

12500002500000

      



To explain the e-notation output, the OS X man page for seq gives some insight:

Use a printf(3) style format to print each number. [...] The default is %g.



Hence the output of seq is equivalent to Ruby's:

sprintf('%g', 100000)
#=> "100000"

sprintf('%g', 1000000)
#=> "1e+06"
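
Building on that equivalence, the effect can be reproduced without seq at all. This is only a sketch: bsd_seq_lines is a hypothetical helper that assumes the OS X seq formats every number with %g, as the man page excerpt suggests.

```ruby
# Hypothetical stand-in for `seq first last` on OS X, assuming a "%g" format.
def bsd_seq_lines(first, last)
  (first..last).map { |n| format('%g', n) }
end

lines = bsd_seq_lines(999_998, 1_000_001)
p lines
# => ["999998", "999999", "1e+06", "1e+06"]

# String#to_i reads only the leading digits of "1e+06".
lenient_sum = lines.inject(0) { |sum, line| sum + line.to_i }

# Going through a float parses the exponent correctly.
float_sum = lines.inject(0) { |sum, line| sum + line.to_f.to_i }

p lenient_sum  # 1999999
p float_sum    # 3999997
```

Note that even the float-based sum is one short of the true total (3999998), because %g keeps only six significant digits and 1000001 has already been printed as 1e+06. Fixing the parsing on the Ruby side cannot recover digits that seq never printed.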

      
