How do I set up a class that has all the methods and functions of a built-in like float, but holds additional data?
I am working with 2 datasets on the order of ~100,000 values. Both datasets are just lists, and each item in a list is an instance of a small class.
class Datum(object):
    def __init__(self, value, dtype, source, index1=None, index2=None):
        self.value = value
        self.dtype = dtype
        self.source = source
        self.index1 = index1
        self.index2 = index2
For every item in one list, there is a corresponding item in the other list that has the same dtype, source, index1, and index2, which I use to sort the two datasets so that they align. I then do various work with the values of the corresponding data points, which are always floats.
Currently, if I want to determine the relative values of the floats in one dataset, I do something like this.
minimum = min([x.value for x in data])
for datum in data:
    datum.value -= minimum
However, it would be nice if my own class inherited from float and could act like this.
minimum = min(data)
data = [x - minimum for x in data]
I tried the following.
class Datum(float):
    def __new__(cls, value, dtype, source, index1=None, index2=None):
        new = float.__new__(cls, value)
        new.dtype = dtype
        new.source = source
        new.index1 = index1
        new.index2 = index2
        return new
However, doing
data = [x - minimum for x in data]
removes all of the additional attributes (dtype, source, index1, index2).
How do I set up a class that works like a float but contains additional data that I create for it?
UPDATE: I do many kinds of math operations besides subtraction, so rewriting every method that works with floats would be very troublesome, and to be honest, I'm not sure I could rewrite them all correctly.
I suggest subclassing float and using a couple of decorators to "capture" the float output of any method (except __new__, of course) and return a Datum object instead of a float object.
First, we write a method decorator (it is not actually applied with decorator syntax below; it is just a function that modifies the output of another function, i.e. a wrapper):
def mydecorator(f,cls):
    #f is the method being modified, cls is its class (in this case, Datum)
    def func_wrapper(*args,**kwargs):
        #*args and **kwargs are all the arguments that were passed to f
        newvalue = f(*args,**kwargs)
        #newvalue now contains the output float would normally produce
        ##Now get cls instance provided as part of args (we need one
        ##if we're going to reattach instance information later):
        try:
            self = args[0]
            ##Now check to make sure new value is an instance of some numerical
            ##type, but NOT a bool or a cls type (which might lead to recursion)
            ##Including ints so things like modulo and round will work right
            if (isinstance(newvalue,float) or isinstance(newvalue,int)) and not isinstance(newvalue,bool) and type(newvalue) != cls:
                ##If newvalue is a float or int, now we make a new cls instance using the
                ##newvalue for value and using the previous self instance information (args[0])
                ##for the other fields
                return cls(newvalue,self.dtype,self.source,self.index1,self.index2)
        #IndexError raised if no args provided, AttributeError raised if self isn't a cls instance
        except (IndexError, AttributeError):
            pass
        ##If newvalue isn't numerical, or we don't have a self, just return what
        ##float would normally return
        return newvalue
    #the function has now been modified and we return the modified version
    #to be used instead of the original version, f
    return func_wrapper
The first decorator only applies to the method it is attached to. But we want it to decorate all (actually, almost all) of the methods inherited from float (well, those that appear in float's __dict__, anyway). This second decorator applies our first decorator to every method of the float subclass except those listed as exceptions (see this answer):
def for_all_methods_in_float(decorator,*exceptions):
    def decorate(cls):
        for attr in float.__dict__:
            if callable(getattr(float, attr)) and attr not in exceptions:
                setattr(cls, attr, decorator(getattr(float, attr),cls))
        return cls
    return decorate
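We now write the subclass the same way as before, but decorate it and exclude __new__ from decoration (I think we could exclude __init__ as well, but __init__ doesn't return anything anyway). Three more methods are excluded here because wrapping them breaks things: __getattribute__ (with it wrapped, reading an attribute that holds a plain int or float, such as a numeric index1, recurses without end), and __hash__ and __int__ (Python requires both to return a real int):
@for_all_methods_in_float(mydecorator,'__new__','__getattribute__','__hash__','__int__')
class Datum(float):
    def __new__(klass, value, dtype="dtype", source="source", index1="index1", index2="index2"):
        return super(Datum,klass).__new__(klass,value)
    def __init__(self, value, dtype="dtype", source="source", index1="index1", index2="index2"):
        self.value = value
        self.dtype = dtype
        self.source = source
        self.index1 = index1
        self.index2 = index2
        super(Datum,self).__init__()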
Here are our tests; iteration also works correctly:
d1 = Datum(1.5)
d2 = Datum(3.2)
d3 = d1+d2
assert d3.source == 'source'
L=[d1,d2,d3]
d4=max(L)
assert d4.source == 'source'
L = [i for i in L]
assert L[0].source == 'source'
assert type(L[0]) == Datum
minimum = min(L)
assert [x - minimum for x in L][0].source == 'source'
Notes:
- I am using Python 3. Not sure if this will affect you.
- This approach effectively overrides every float method other than the exceptions, even those whose result does not change. There might be side effects to this (subclassing a built-in and then overriding all of its methods), e.g. a performance hit or something else; I really do not know.
- This will decorate nested classes as well.
- This same approach can also be implemented using a metaclass.
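As a rough illustration of the metaclass variant mentioned in the last note, here is a minimal sketch rather than a drop-in replacement. The names RewrapMeta, WRAPPED and _wrap are made up for this example, and it only wraps a short, hand-picked list of arithmetic operators (an assumption; extend the tuple as needed) instead of everything in float's __dict__. It assumes Python 3 and a constructor signature like the one in the question.

```python
class RewrapMeta(type):
    """Hypothetical metaclass that re-wraps plain float results as the new class."""

    # Assumption: only these operators need wrapping; extend as required.
    WRAPPED = (
        "__add__", "__radd__", "__sub__", "__rsub__",
        "__mul__", "__rmul__", "__truediv__", "__rtruediv__",
        "__neg__", "__abs__",
    )

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        for attr in mcls.WRAPPED:
            if attr not in namespace:  # respect methods the class defines itself
                setattr(cls, attr, mcls._wrap(getattr(float, attr), cls))
        return cls

    @staticmethod
    def _wrap(method, cls):
        def wrapper(self, *args):
            result = method(self, *args)
            # Re-wrap plain floats only; NotImplemented passes through untouched.
            if type(result) is float:
                return cls(result, self.dtype, self.source, self.index1, self.index2)
            return result
        return wrapper


class Datum(float, metaclass=RewrapMeta):
    def __new__(cls, value, dtype, source, index1=None, index2=None):
        new = super().__new__(cls, value)
        new.dtype = dtype
        new.source = source
        new.index1 = index1
        new.index2 = index2
        return new


data = [Datum(1.5, "f", "a", 0), Datum(4.0, "f", "a", 1)]
minimum = min(data)
shifted = [x - minimum for x in data]
print([(float(x), x.source, x.index1) for x in shifted])
```

Wrapping an explicit list keeps methods such as __hash__ and __int__ untouched, at the cost of having to add each operator you rely on by hand.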
The problem is that when you do:
x - minimum
in terms of types you are doing either:
datum - float, or datum - integer
Python doesn't know how to do either of those directly, so it looks at the parent classes of the arguments where it can. Since datum is a subclass of float, it can simply use float's subtraction, and the calculation ends up as
float - float
which obviously results in a float. Python has no way of knowing how to construct a datum object unless you tell it how.
To solve this, you either need to implement the math operators so that Python knows how to do datum - float, or come up with a different design.
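The behaviour described above can be seen in a few lines. This is a small sketch using a made-up float subclass named Tagged with a single extra attribute; it is not the asker's class, just the smallest thing that shows the effect.

```python
class Tagged(float):
    """Hypothetical float subclass carrying one extra attribute."""

    def __new__(cls, value, source=None):
        obj = super().__new__(cls, value)
        obj.source = source
        return obj


d = Tagged(3.5, source="sensor")
result = d - 1.5  # dispatches to float.__sub__, which builds a plain float

print(type(d).__name__)           # Tagged
print(type(result).__name__)      # float
print(hasattr(result, "source"))  # False
```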
Assuming dtype, source, index1 and index2 should remain unchanged after a calculation, your class needs something like this, as an example:
def __sub__(self, other):
    return Datum(self.value - other, self.dtype, self.source, self.index1, self.index2)
This should work (not tested), and it will then allow you to do:
d = Datum(23.0, dtype="float", source="me", index1=1)
e = d - 16
print(e.value, e.dtype, e.source, e.index1, e.index2)
which should result in:
7.0 float me 1 None
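For the float-subclass version of the class from the question (which has no separate value attribute), the same idea might look like the sketch below. The _rewrap helper is a made-up name, and only subtraction is covered; every other operator you need would have to be added in the same way. Calling float.__sub__ directly, rather than writing float(self) - other, avoids Python's reflected-operand dispatch and lets NotImplemented pass through for unsupported operand types.

```python
class Datum(float):
    def __new__(cls, value, dtype, source, index1=None, index2=None):
        new = super().__new__(cls, value)
        new.dtype = dtype
        new.source = source
        new.index1 = index1
        new.index2 = index2
        return new

    def _rewrap(self, value):
        # Hypothetical helper: build a new Datum that keeps this one's metadata.
        return type(self)(value, self.dtype, self.source, self.index1, self.index2)

    def __sub__(self, other):
        result = float.__sub__(self, other)
        if result is NotImplemented:  # unsupported operand type
            return result
        return self._rewrap(result)

    def __rsub__(self, other):
        # Handles "other - datum", e.g. 30 - d.
        result = float.__rsub__(self, other)
        if result is NotImplemented:
            return result
        return self._rewrap(result)


d = Datum(23.0, dtype="float", source="me", index1=1)
e = d - 16
print(float(e), e.dtype, e.source, e.index1, e.index2)  # 7.0 float me 1 None
```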