Ruby: summing integers from stdin
I have:
$ ruby -v
ruby 2.0.0p648 (2015-12-16 revision 53162) [universal.x86_64-darwin16]
Suppose you have a sequence of integers, 1..n.
A Ruby newbie might sum the sequence like this:
$ ruby -e 's=0
for i in 1..500000
s+=i
end
puts s'
125000250000
Now let's say I have the same sequence coming from stdin:
$ seq 1 500000 | ruby -lne 'BEGIN{s=0}
s+=$_.to_i
END{puts s} '
125000250000
So far so good.
Now increase the terminal value from 500,000 to 5,000,000:
$ ruby -e 's=0
for i in 1..5000000
s+=i
end
puts s'
12500002500000 <=== CORRECT
$ seq 1 5000000 | ruby -lne 'BEGIN{s=0}
s+=$_.to_i
END{puts s} '
500009500025 <=== WRONG!
Ruby produces a different sum.
awk and perl give the correct result with the same sequence:
$ seq 1 5000000 | awk '{s+=$1} END{print s}'
12500002500000
$ seq 1 5000000 | perl -nle '$s+=$_; END{print $s}'
12500002500000
Why is Ruby producing the wrong sum? I do not think it is an overflow, since awk and perl work properly on the same input.
Conclusions:
Thanks to David Aldridge for diagnosing this.
- OS X and BSD seq switch to floating-point output at 1,000,000, whereas GNU seq supports arbitrary-precision integers. OS X seq is useless as a source of integers greater than 1,000,000. OS X example:
$ seq 999999 1000002
999999
1e+06
1e+06
1e+06
- Ruby's .to_i method silently converts a partial string to an integer, which in this case is a bug. Example:
irb(main):002:0> '5e+06'.to_i
#=> 5
- The "correct" fix in the script is either to use $_.to_f.to_i to go through floats, or to use Integer($_) so that the script does not fail silently. awk and perl parse 5e+06 as a float; Ruby does not do so implicitly:
$ echo '5e+06' | awk '{print $1+0}'
5000000
$ echo '5e+06' | ruby -lne 'print $_.to_i+0'
5
- And thanks to Stefan Schüßler for opening a Ruby feature request about the behavior of .to_i.
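The two effects above can be combined to reproduce the wrong total without seq at all. This is only a sketch: the helper name sum_as_ruby_reads_it is made up for illustration, and it assumes that BSD seq prints each number with the C format "%g", which is not stated in the question.

```ruby
# Hypothetical helper: mimic what Ruby sees when BSD/OS X seq feeds it numbers.
# Assumption: seq formats every number with "%g" (six significant digits).
def sum_as_ruby_reads_it(range)
  range.inject(0) do |sum, i|
    printed = format('%g', i) # e.g. 1000001 is printed as "1e+06"
    sum + printed.to_i        # String#to_i stops at the first non-digit
  end
end

# Small range that crosses the 1,000,000 boundary.
puts sum_as_ruby_reads_it(999_998..1_000_001)
```

Under that assumption, every value from 1,000,000 upward contributes only its leading digit, so calling the helper with 1..5_000_000 should land on the 500009500025 shown in the question rather than 12500002500000.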
I'm not sure if this is 100% the answer, but I noticed that:
seq 500000 500001 | ruby -lne 'BEGIN{}
puts $_
END{} '
500000
500001
... but ...
seq 5000000 5000001 | ruby -lne 'BEGIN{}
puts $_
END{} '
5e+06
5e+06
... so the "relaxed" approach that #to_i takes to convert values to integers will still work ...
seq 5000000 5000001 | ruby -lne 'BEGIN{}
puts $_.to_i
END{} '
5
5
... but the stricter #to_int won't
seq 5000000 5000001 | ruby -lne 'BEGIN{}
puts $_.to_int
END{} '
-e:2:in `<main>': undefined method `to_int' for "5e+06":String (NoMethodError)
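String does not define #to_int at all, which is why the call above raises NoMethodError instead of doing a strict parse. A strict conversion that does exist for strings is Kernel#Integer. The following sketch compares the three conversions on the string that seq emitted; the variable names are only illustrative.

```ruby
# How three conversions treat the string that BSD/OS X seq emits.
line = '5e+06'

lenient   = line.to_i      # stops at the first non-digit character
via_float = line.to_f.to_i # parses the scientific notation first

strict = begin
  Integer(line)            # rejects anything that is not an integer literal
rescue ArgumentError => e
  e.class
end

puts lenient    # 5
puts via_float  # 5000000
puts strict     # ArgumentError
```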
Edit: I also noticed:
seq 5000000 5000001
5e+06
5e+06
So, the flag -f has to be passed to seq to get the integer format.
Edit again, final answer:
seq -f %f 1 5000000 | ruby -lne 'BEGIN{s=0}
s+=$_.to_i
END{puts s} '
12500002500000
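If the summing has to live in a script rather than a one-liner, a stricter version might look like the sketch below. The method name sum_lines is hypothetical, and the sample input is hard-coded so the example does not depend on seq.

```ruby
# Hypothetical helper: sum numeric lines, refusing to guess on bad input.
def sum_lines(lines)
  lines.inject(0) do |sum, line|
    text = line.strip
    value = begin
      Integer(text, 10)  # accepts plain base-10 integers only
    rescue ArgumentError
      Float(text).to_i   # handles "5e+06" or "1.000000"; raises on garbage
    end
    sum + value
  end
end

puts sum_lines(["1\n", "2.000000\n", "5e+06\n"])
```

Passing $stdin.each_line instead of the array would read from a pipe. Note that this only fixes the parsing: if seq has already printed 5000001 as 5e+06, the lost digit cannot be recovered, so the seq -f fix is still needed.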