Propagation of NaN by Calculation
Usually NaN (not a number) propagates through calculations, so I don't have to check for NaN at every step. This almost always works, but apparently there are exceptions. For example:
>>> nan = float('nan')
>>> pow(nan, 0)
1.0
I found the following comment on this:
Propagating quiet NaNs through arithmetic operations allows errors to be detected at the end of a sequence of operations without extensive testing during intermediate stages. However, note that depending on the language and the function, NaNs can silently be removed in expressions that would give a constant result for all other floating-point values, e.g. NaN^0, which may be defined as 1, so in general a later test for a set INVALID flag is needed to detect all cases where NaNs are introduced.
To satisfy those wishing a more strict interpretation of how the power function should act, the 2008 standard defines two additional power functions: pown(x, n), where the exponent must be an integer, and powr(x, y), which returns a NaN whenever a parameter is a NaN or the exponentiation would give an indeterminate form.
Is there a way to check the INVALID flag mentioned above via Python? Alternatively, is there any other approach for detecting cases where NaN does not propagate?
Motivation: I decided to use NaN for missing data. In my application, missing inputs should lead to a missing result. It works great except for what I've described.
I realize it has been a month since this was asked, but I ran into a similar problem (i.e. pow(float('nan'), 1) throws an exception in some Python implementations, such as Jython 2.52b2), and I found that the answers given weren't quite what I was looking for.
Using a MissingData type, as suggested by 6502, seems to be the way to go, but I needed a concrete example. I tried Ethan Furman's NullType class, but found that it didn't work with all arithmetic operations because it doesn't coerce data types (see below), and I also didn't like that it explicitly names every arithmetic function that is overridden.
Starting from Ethan's example and adapting code I found here, I arrived at the class below. Although the class is heavily commented, you can see that it actually contains only a few lines of functional code.
The key points are:
1. Use the __coerce__() method to return two NoData objects for mixed-type arithmetic operations (e.g. NoData + float), and two strings for string-based operations (e.g. concatenation).
2. Use __getattr__() to return a NoData() object for all other attribute and method accesses.
3. Use __call__() to implement all other methods of the NoData() object, by returning a NoData() object.
Here are some examples of its use.
>>> import math
>>> nd = NoData()
>>> nd + 5
NoData()
>>> pow(nd, 1)
NoData()
>>> math.pow(NoData(), 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: nb_float should return float object
>>> nd > 5
NoData()
>>> if nd > 5:
...     print "Yes"
... else:
...     print "No"
...
No
>>> "The answer is " + nd
'The answer is NoData()'
>>> "The answer is %f" % (nd)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float argument required, not instance
>>> "The answer is %s" % (nd)
'The answer is '
>>> nd.f = 5
>>> nd.f
NoData()
>>> nd.f()
NoData()
I noticed that using pow with NoData() invokes the ** operator and therefore works with NoData, but math.pow does not, because it first tries to convert the NoData() object to a float. I'm happy to use the non-math pow; hopefully 6502 and the others were using math.pow when they had the problems with pow described in their comments above.
Another problem that I can't think of a way to solve is use with the format operator (%f). In this case no NoData methods are called; the operator simply fails if you don't provide a float. Anyway, here is the class itself.
class NoData():
    """NoData object - any interaction returns NoData()"""
    def __str__(self):
        # I want '' returned as it represents no data in my output (e.g. csv) files
        return ''
    def __unicode__(self):
        return ''
    def __repr__(self):
        return 'NoData()'
    def __coerce__(self, other_object):
        if isinstance(other_object, str) or isinstance(other_object, unicode):
            # Return string objects when coerced with another string object.
            # This ensures that e.g. concatenation operations produce strings.
            return repr(self), other_object
        else:
            # Otherwise return two NoData objects - these will then be passed to the appropriate
            # operator method for NoData, which should then return a NoData object
            return self, self
    def __nonzero__(self):
        # __nonzero__ is called whenever e.g. "if NoData:" occurs.
        # Since all operations involving NoData return NoData, any test in a
        # branch statement that a NoData object propagates to evaluates as False.
        return False
    def __hash__(self):
        # Prevent NoData() from being used as a key for a dict or used in a set
        raise TypeError("Unhashable type: " + repr(self))
    def __setattr__(self, name, value):
        # This is overridden to prevent any attributes from being created on NoData
        # when e.g. "NoData().f = x" is called
        return None
    def __call__(self, *args, **kwargs):
        # If a NoData object is called (i.e. used as a method), return a NoData object
        return self
    def __getattr__(self, name):
        # For all other attribute accesses or method accesses, return a NoData object.
        # Remember that the NoData object can be called (__call__), so if a method is called,
        # a NoData object is first returned and then called. This works for operators,
        # so e.g. NoData() + 5 will:
        #  - call NoData().__coerce__, which returns a (NoData, NoData) tuple
        #  - call __getattr__, which returns a NoData object
        #  - call the returned NoData object with args (self, NoData)
        #  - this call (i.e. __call__) returns a NoData object
        # For attribute accesses NoData will be returned, and that's it.
        # print name  # (uncomment for debugging, i.e. to see which attribute was accessed/method was called)
        return self
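The class above depends on Python 2 features (old-style classes, __coerce__, __nonzero__) that no longer exist in Python 3, where operators also bypass __getattr__. The following is only a rough sketch of how the same idea might be carried over, assuming Python 3; the name NoData3 and the list of operators are my own choices for illustration, and the operator methods are generated in a loop rather than spelled out one by one.

```python
class NoData3:
    """Hypothetical Python 3 counterpart: any interaction returns the same object."""
    __hash__ = None  # unhashable: cannot be a dict key or a set member

    def __str__(self):
        return ''            # empty string in output files

    def __repr__(self):
        return 'NoData()'

    def __bool__(self):
        return False         # Python 3 name for __nonzero__

    def __setattr__(self, name, value):
        pass                 # silently ignore attribute assignment

    def __call__(self, *args, **kwargs):
        return self

    def __getattr__(self, name):
        return self          # only reached for ordinary (non-operator) attributes


def _return_self(self, *args):
    return self

# Python 3 looks operators up on the type, so they must exist as real methods.
for _op in ("add", "sub", "mul", "truediv", "floordiv", "mod", "pow"):
    setattr(NoData3, "__%s__" % _op, _return_self)
    setattr(NoData3, "__r%s__" % _op, _return_self)
for _op in ("lt", "le", "gt", "ge", "eq", "ne"):
    setattr(NoData3, "__%s__" % _op, _return_self)
```

Note that this sketch does not reproduce the string-concatenation behaviour of the original, and math.pow still fails because the object cannot be converted to a float.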
If it is just pow() that is giving you headaches, you can easily redefine it so that it returns NaN whenever the base is NaN.
def pow(x, y):
    return x ** y if x == x else float("NaN")
If NaN can be used as the exponent, you will also want to check for that; this raises a ValueError exception unless the base is 1 (presumably on the theory that 1 raised to any power, even one that is not a number, is 1).
(And of course pow() actually takes three operands, the third being optional, an omission I'll leave as an exercise...)
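As a minimal sketch of the stricter variant, the wrapper below checks both the base and the exponent. The name nan_pow is hypothetical, and I have left out the three-argument form on the assumption that it only makes sense for integers, which cannot be NaN anyway.

```python
import math

def nan_pow(x, y):
    """Like x ** y, but return NaN whenever either operand is NaN."""
    if math.isnan(x) or math.isnan(y):
        return float("nan")
    return x ** y
```

With this, nan_pow(float('nan'), 0) and nan_pow(1.0, float('nan')) both give NaN rather than 1.0, while ordinary inputs behave like the ** operator.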
Unfortunately, the ** operator has the same behavior, and there is no way to redefine it for the built-in numeric types. One way to catch this is to write a subclass of float that implements __pow__() and __rpow__(), and to use that class for your NaN values.
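A float subclass along those lines might look roughly like this. StrictFloat and the helper _is_nan are names I made up for the sketch, and it only covers exponentiation: other arithmetic on a StrictFloat returns a plain float, so the special behaviour is lost after, say, an addition.

```python
def _is_nan(value):
    # Only floats can be NaN; NaN is the one value not equal to itself.
    return isinstance(value, float) and value != value

class StrictFloat(float):
    """Hypothetical float whose ** always propagates NaN."""

    def __pow__(self, other):
        if _is_nan(self) or _is_nan(other):
            return StrictFloat("nan")
        result = float.__pow__(self, other)
        return result if result is NotImplemented else StrictFloat(result)

    def __rpow__(self, other):
        # Called for other ** self when self is the exponent.
        if _is_nan(self) or _is_nan(other):
            return StrictFloat("nan")
        result = float.__rpow__(self, other)
        return result if result is NotImplemented else StrictFloat(result)
```

Both StrictFloat('nan') ** 0 and 1 ** StrictFloat('nan') then yield NaN, as does the built-in pow() when given the same operands.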
Python does not seem to provide access to any flags set by calculations; even if it did, it is something you would have to check after every single operation.
In fact, on further reflection, I think the best solution might be to simply use an instance of a dummy class for missing values. Python will choke on any operation you try to perform on these values, raising an exception, and you can catch the exception and return a default value or whatever. There is no reason to proceed with the rest of the calculation if a needed value is missing, so an exception should be fine.
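To make the dummy-class idea concrete, here is a small sketch written for Python 3. Missing, MISSING and safe_compute are hypothetical names, and it assumes the failure surfaces as a TypeError, which is what Python raises for unsupported operand types.

```python
class Missing(object):
    """Hypothetical placeholder; it defines no operators, so arithmetic on it fails."""

MISSING = Missing()

def safe_compute(func, *args, default=None):
    """Run func(*args); return default if a missing value breaks the calculation."""
    try:
        return func(*args)
    except TypeError:
        # Raised by e.g. MISSING ** 0 or MISSING + 1 (unsupported operand types).
        return default
```

For example, safe_compute(lambda a, b: a ** b + 1, MISSING, 0) returns None instead of 2.0. Catching TypeError this broadly can also hide genuine bugs inside func, so in real code it may be worth raising a dedicated exception from the placeholder class instead.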
To answer your question: no, there is no way to check the flags using normal floats. However, you can use the Decimal class, which provides much more control, but is a little slower.
Another option is to use an EmptyData or Null class, such as this one:
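As a rough illustration of the control Decimal offers, the sketch below disables the InvalidOperation trap inside a local context and then inspects the corresponding flag. The variable names are mine, and the choice of 0/0 as the invalid operation is just an example.

```python
from decimal import Decimal, InvalidOperation, localcontext

with localcontext() as ctx:
    ctx.traps[InvalidOperation] = False   # return NaN instead of raising
    ctx.clear_flags()

    result = Decimal(0) / Decimal(0)      # an invalid operation
    flagged = bool(ctx.flags[InvalidOperation])

    # Check whether a quiet NaN survives exponentiation by zero.
    propagated = (Decimal("NaN") ** 0).is_nan()
```

Here flagged records that an invalid operation occurred somewhere in the block, so it only needs to be checked once at the end rather than after every step.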
import sys

class NullType(object):
    "Null object -- any interaction returns Null"
    def _null(self, *args, **kwargs):
        return self
    __eq__ = __ne__ = __ge__ = __gt__ = __le__ = __lt__ = _null
    __add__ = __iadd__ = __radd__ = _null
    __sub__ = __isub__ = __rsub__ = _null
    __mul__ = __imul__ = __rmul__ = _null
    __div__ = __idiv__ = __rdiv__ = _null
    __mod__ = __imod__ = __rmod__ = _null
    __pow__ = __ipow__ = __rpow__ = _null
    __and__ = __iand__ = __rand__ = _null
    __xor__ = __ixor__ = __rxor__ = _null
    __or__ = __ior__ = __ror__ = _null
    __divmod__ = __rdivmod__ = _null
    __truediv__ = __itruediv__ = __rtruediv__ = _null
    __floordiv__ = __ifloordiv__ = __rfloordiv__ = _null
    __lshift__ = __ilshift__ = __rlshift__ = _null
    __rshift__ = __irshift__ = __rrshift__ = _null
    __neg__ = __pos__ = __abs__ = __invert__ = _null
    __call__ = __getattr__ = _null

    # divmod() must return a pair, so this overrides the assignment above
    def __divmod__(self, other):
        return self, self
    __rdivmod__ = __divmod__

    if sys.version_info[:2] >= (2, 6):
        __hash__ = None
    else:
        def __hash__(yo):
            raise TypeError("unhashable type: 'Null'")

    def __new__(cls):
        # Always hand back the single shared instance
        return cls.null
    def __nonzero__(yo):
        return False
    def __repr__(yo):
        return '<null>'
    def __setattr__(yo, name, value):
        return None
    def __setitem__(yo, index, value):
        return None
    def __str__(yo):
        return ''

NullType.null = object.__new__(NullType)
Null = NullType()
You may want to change the __repr__ and __str__ methods. Also, keep in mind that Null cannot be used as a dictionary key, nor stored in a set.