Python and Ruby Silently Discard Information?

Ovid on 2006-12-10T11:41:10

Any Python or Ruby programmers out there able to explain the following?

$ python -c 'print 7/2'
3

$ ruby -e 'puts 7/2'
3

$ perl -le 'print 7/2'
3.5

Apparently Python and Ruby default to integer math unless you explicitly use a float. For example:

$ python -c 'print 7.0/2'
3.5

By default Python and Ruby silently discard information? Now I can understand that in C. For example, run gcc -Wall on the following:

#include <stdio.h>

int main(int argc, char *argv[]) {
    printf("%f\n", 7/2);
    return 0;
}

You'll get the following warning (is this something I could enable in Python or Ruby?):

divide.c: In function 'main':
divide.c:4: warning: format '%f' expects type 'double', but argument 2 has type 'int'

C pretends to be a statically typed language and if you get confused about types, your language will misbehave. So you either want to be explicit and cast one operand, (float)7/2, or better still, use floats in the first place (some argue that the ability to cast data types means a type system is broken). If you use integers in division, it will give you an integer result. If you use more than one data type, the result of the expression is the one which loses the least information:

#include <stdio.h>

int main(int argc, char *argv[]) {
    double numerator   = 7.0;
    int    denominator = 2;

    // no warnings with -Wall because the result is a double
    printf("%f\n", numerator/denominator);
    return 0;
}

In Perl, because it's dynamically typed, the result of a mathematical expression is simply whatever result has the most information (er, that's an oversimplification):

$ perl -le 'print 7/2'
3.5

That strikes me as far more intuitive for a 'dynamic' programming language. Do I have some fundamental misunderstanding about Python, Ruby, or C?


Cue the Mathematicians!

djberg96 on 2006-12-10T15:35:49

Ruby and Python aren't discarding information. Perl is adding information, but only when it thinks it should be added. After all, if 5/2 returns 2.5, why doesn't 4/2 return 2.0? Or so one could argue.

It boils down to a design decision. Guido and Matz decided that if you want integer division, you get integer division. Larry took a DWIM approach.

As a guy who doesn't care one way or the other really, I think Perl's approach is better for simple cases, and worse as you get into more complex operations. Obviously there are times when you don't want that behavior, which is why Perl offers the Integer() function (which I do see from time to time).

From a standards point of view, the Ruby and Python behavior is IEEE compliant, while Perl's behavior is not. Or so I'm told - I haven't verified that.

From a design point of view, Ruby uses immediate values for integers, but a complex data type for floats. Thus, it's more efficient for Ruby to stay with integers in this case.

From a practical point of view, I think it's 6 one way and half a dozen the other. For example, if I want to convert a time value into hours, minutes and seconds, I don't want floats.
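The time-conversion case can be sketched in Python, assuming the duration is a whole number of seconds. The function name `to_hms` is made up for illustration; `divmod` does floor division and remainder in one step, so every result stays an integer.

```python
def to_hms(total_seconds):
    """Split a whole number of seconds into (hours, minutes, seconds)."""
    # divmod returns (quotient, remainder) using integer floor division
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return hours, minutes, seconds

print(to_hms(3725))  # (1, 2, 5)
```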

Lastly, if you want your '/' operator to act as Perl does, Ruby's open classes let you do this:

class Fixnum
   def /(num)
      self.to_f / num.to_f
   end
end
And now all of your integer division will return a float value. Just keep in mind that it will be slower than integer division. BTW, I'm not positive, but I don't think Python lets you do that.
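A minimal Python sketch of the equivalent attempt, assuming CPython and Python 3 (where the `/` hook is `__truediv__`). The helper name `float_div` is invented here. CPython refuses to assign attributes on built-in types such as `int`, so the patch should fail with a `TypeError` rather than change division globally.

```python
def float_div(self, other):
    # hypothetical replacement that would always divide as floats
    return float(self) / float(other)

try:
    int.__truediv__ = float_div  # try to reopen the built-in int class
    patched = True
except TypeError:
    # CPython does not allow setting attributes on built-in types
    patched = False

print(patched)
```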

My final comment is that I think this is just one of those things you adapt to if/when you switch from Perl to Ruby, along with other conventions you're used to, such as 0 != false. But, that's another topic. :)

Re:Cue the Mathematicians!

jhi on 2006-12-10T15:54:04

> Perl offers the Integer() function (which I do see from time to time)

I hope you mean the integer pragma.

Re:Cue the Mathematicians!

djberg96 on 2006-12-10T16:13:14

Either int() or Integer I guess. My Perl is rusty. :)

Re:Cue the Mathematicians!

jhi on 2006-12-10T15:59:20

> From a standards point of view, the Ruby and Python behavior is IEEE compliant, while Perl's behavior is not

Huh? (I assume by IEEE we are talking IEEE 754, the floating point standard...) This has nothing to do with compliance of languages, since IEEE 754 specifies the representation and behaviour of floating point, not how programming languages understand numeric constants (are they integer or float) and arithmetic operations (is division truncating or not).

Re:Cue the Mathematicians!

djberg96 on 2006-12-10T16:09:59

I was definitely parroting things I had read elsewhere on that point (as I indicated). I couldn't begin to tell you the relationship between IEEE and computer programming languages.

If it's not a valid point, then so be it. But, I thought I would at least bring it up for discussion. It's definitely something I wouldn't mind hearing more argument/philosophy on.

Re:Cue the Mathematicians!

rjbs on 2006-12-10T16:17:31

> Ruby and Python aren't discarding information. Perl is adding information, but only when it thinks it should be added. After all, if 5/2 returns 2.5, why doesn't 4/2 return 2.0? Or so one could argue.

One would be making a ridiculous argument. "2" does not mean "2 and maybe some fractional part." It means "2, exactly." "2.0" does not add information, unless your language has some sense of significant figures, which was not at issue here.

> It boils down to a design decision. Guido and Matz decided that if you want integer division, you get integer division. Larry took a DWIM approach.

You're begging the question! Yes, they made this decision. Why?

> As a guy who doesn't care one way or the other really, I think Perl's approach is better for simple cases, and worse as you get into more complex operations. Obviously there are times when you don't want that behavior, which is why Perl offers the Integer() function (which I do see from time to time).

You must not use it very often, or you'd know that it is "int" and not "Integer". Perhaps that's a bit of an indicator as to which division is used more often.

> From a standards point of view, the Ruby and Python behavior is IEEE compliant, while Perl's behavior is not. Or so I'm told - I haven't verified that.

IEEE 754, on floating point, refers to how floats are handled, not whether integer operations can produce floats.

> From a design point of view, Ruby uses immediate values for integers, but a complex data type for floats. Thus, it's more efficient for Ruby to stay with integers in this case.

This is reasonable, I suppose. I'd like to see how much of an impact it really has, but since I know it won't be "none," it is at least not ludicrous.

> From a practical point of view, I think it's 6 one way and half a dozen the other. For example, if I want to convert a time value into hours, minutes and seconds, I don't want floats.

This is not a good argument. It's like saying that there's no reason that "2 + 3" should not be written as "integer_of_val(2) {plus_integer} integer_of_val(3)". One of them is more obnoxious, time-consuming, and prone to introduce error, even if they are equivalent.

> Lastly, if you want your '/' operator to act as Perl does, Ruby's open classes let you do this:
>
>     class Fixnum
>        def /(num)
>           self.to_f / num.to_f
>        end
>     end

Ha! You're kidding, right? Now any other module that you've loaded is going to get your modified division, and anyone else who is relying on int/int=int will be screwed.

Re:Cue the Mathematicians!

djberg96 on 2006-12-10T16:59:27

> You must not use it very often, or you'd know that it is "int" and not "Integer". Perhaps that's a bit of an indicator as to which division is used more often.

You're right. I don't use it very often, because I don't use Perl very often any more, though I still maintain some old Perl code, and occasionally translate Perl modules into Ruby modules.

> This is not a good argument. It's like saying that there's no reason that "2 + 3" should not be written as "integer_of_val(2) {plus_integer} integer_of_val(3)". One of them is more obnoxious, time-consuming, and prone to introduce error, even if they are equivalent.

I find your analogy flawed. My point is simply that the default behavior you want may not always be what Perl provides. It may work to your benefit. It may not.

I guess I'm asking what Perl's philosophy is on this issue. Is it merely trying to be useful vs correct? Or does Larry view the current behavior as correct? I'm pretty sure Matz has discussed this issue directly before, but I can't find a link atm. I suspect it has more to do with internals than philosophy. I suspect the opposite is true in Guido's case.

> Ha! You're kidding, right? Now any other module that you've loaded is going to get your modified division, and anyone else who is relying on int/int=int will be screwed.

Hey, I didn't say it was wise, just possible. :)

Oh, and one thing I forgot to mention:

5.quo 2 => 2.5
:-D

Question Begging

DAxelrod on 2006-12-10T18:12:27

Very much tangential, but...

> You're begging the question!

Thank you so much. I had started to think that there was nobody left on the internet who was able to use that phrase correctly. I salute you, rjbs.

Re:Cue the Mathematicians!

ctitusbrown on 2006-12-10T18:29:31

This is a known problem, initially built in because it's the way that C works.

Guido acknowledges this as one of his mistakes.

Python 3000 will fix this, but it won't be fixed before then because it would break existing programs. See http://www.python.org/doc/essays/foreword2/ for some discussion of how we in the Python world think about things like this... (search for "integer division").
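For reference, a short sketch of the changed semantics as specified in PEP 238, assuming a Python 3 interpreter. In Python 2.2 and later the same behavior can be switched on per module with `from __future__ import division`. The variable names are only illustrative.

```python
# Python 3: "/" is true division, "//" is floor division
true_quotient = 7 / 2    # always a float, even for two ints
floor_quotient = 7 // 2  # keeps the old integer result

print(true_quotient)   # 3.5
print(floor_quotient)  # 3
print(4 / 2)           # 2.0
```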

Computers Silently Discard Information

DAxelrod on 2006-12-10T18:29:16

I'm sorry, but Perl also silently discards information. (Yes, I know that .9repeating is equal to 1, but perl's implementation only stores an imperfect representation of that).

The problem is that processors are designed to do approximate math quickly, so most computer math ends up being approximate. Perl just happens to be using a closer approximation to what you expect, in this case.
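The same kind of silent loss is easy to show in Python, assuming floats are IEEE 754 doubles (as they are in CPython on common platforms). This is only a sketch of the general point, not a statement about Perl's internals.

```python
total = 0.1 + 0.2
print(total == 0.3)  # False: none of these values is stored exactly
print(repr(total))   # 0.30000000000000004

# 2**53 + 1 has no exact double representation, so the +1 is lost
big = float(2**53)
print(big + 1 == big)  # True
```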

Personally, I would prefer that Perl's DWIM resulted in preserving all information available, like (IIRC) Haskell does (of course, this is still limited by memory availability). Yes, this would be slower. But Perl usually is designed to be correct first and fast second.

Obviously, it's too late to do this with Perl 5 because of backward compatibility concerns, and I'm not sure if this behavior is desirable in Perl 6.

Re:Computers Silently Discard Information

speters on 2006-12-11T01:43:27

One thing to remember is that many compilers and operating systems may help you down the road to discarding this information. Perl compiled with Intel C++ on Linux, for example, failed several floating point tests in Perl. By removing optimizations (that is, making the compiler generate code to do the math correctly), the tests were made to pass. Various operating systems and compilers occasionally include "fast" math libraries as well that do a similarly good job of providing optimized approximations. I would suggest using them only if #11912 applies.

Re:Computers Silently Discard Information

DAxelrod on 2006-12-14T20:23:36

This is a good point. I think the central problem here is that you have to be careful what guarantees you make. If a datatype is defined as a lower-level language's datatype, you can only assume the guarantees that the lower-level language makes. (And thanks to nonstandard compilers, sometimes not even those...)

You can easily make Ruby work that way...

Adrian on 2006-12-11T14:31:55

Not a Python head, but on the Ruby front take a look at the (standard) 'mathn' module:

% ruby -e "require 'mathn'; puts 7/2"
7/2

(and that's a proper rational too - none of that floating point nonsense :-)
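On the Python side, the closest standard-library analogue is probably `fractions.Fraction` (in the standard library since Python 2.6). It does not change what `/` does on plain integers; you have to construct the rational explicitly. A minimal sketch:

```python
from fractions import Fraction

q = Fraction(7, 2)  # exact rational, no rounding
print(q)            # 7/2
print(q * 2 == 7)   # True

# Exact where floats are not: 1/10 + 2/10 is exactly 3/10
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True
```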