Generalized generic function in numpy
I am trying to do a generic ufunc using the numpy API. Inputs represent a matrix (n x m)
and the scalar, and the outputs - two matrices ( (n x p)
and (p x m)
). But I don't know how to do it. Can someone help me? In the init function, I use a function PyUFunc_FromFuncAndDataAndSignature
with signature:
"(n,m),()->(n,p),(p,m)"
I can read the inputs (matrix and scalar), but I wanted to use the scalar input as the p dimension in the signature. Is it possible?
Here's some sample code that just prints the inputs:
#include "Python.h"
#include "math.h"
#include "numpy/ndarraytypes.h"
#include "numpy/ufuncobject.h"
static PyMethodDef nmfMethods[] = {
{NULL, NULL, 0, NULL}
};
static void double_nmf(char **args, npy_intp *dimensions,
npy_intp* steps, void* data)
{
npy_intp i, j,
n = dimensions[1], //dimensions of input matrix
m = dimensions[2]; //
printf("scalar: %d\n",*(int*)args[1]); // input scalar
// just print input matrix
printf("Input matrix:\n");
for(i=0;i<n;i++){
for(j=0;j<m;j++){
printf("%.1f ",*(double*)(args[0]+8*(i*m+j)));
}
printf("\n");
}
return;
}
static PyUFuncGenericFunction nmf_functions[] = { double_nmf };
static void * nmf_data[] = { (void *)NULL };
static char nmf_signatures[] = { PyArray_DOUBLE, PyArray_INT, PyArray_DOUBLE, PyArray_DOUBLE };
char *nmf_signature = "(n,m),()->(n,p),(p,m)";
PyMODINIT_FUNC initnmf(void)
{
PyObject *m, *d, *version, *nmf;
m = Py_InitModule("nmf", nmfMethods);
if (m == NULL) {
return;
}
import_array();
import_umath();
d = PyModule_GetDict(m);
version = PyString_FromString("0.1");
PyDict_SetItemString(d, "__version__", version);
Py_DECREF(version);
nmf = PyUFunc_FromFuncAndDataAndSignature(nmf_functions, nmf_data, nmf_signatures, 1,
2, 2, PyUFunc_None, "nmf",
"", 0, nmf_signature);
PyDict_SetItemString(d, "nmf", nmf);
Py_DECREF(nmf);
}
This code compiles and works. The python script is here:
#/usr/bin/python
import numpy as np
import nmf
x = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15],[16,17,18,19,20]])
y,z = nmf.nmf(x,2)
print "Shapes of outputs: ", y.shape, z.shape
And the terminal output:
scalar: 2
Input matrix:
1.0 2.0 3.0 4.0 5.0
6.0 7.0 8.0 9.0 10.0
11.0 12.0 13.0 14.0 15.0
16.0 17.0 18.0 19.0 20.0
Shapes of outputs: (4, 1) (1, 5)
My doubt is how to use scalar input (2 in case) as p dimension of output matrices. In the example p = 1 and I am not setting it.
There is no way to set a dimension in gufunc other than by providing an array of a specific size. 1
is the value that all dimensions are initialized internally, and you shouldn't rely on the fact that it doesn't change. My personal view is that the undefined dimension should throw an error.
The only thing you can do to set p
is create an empty array with the correct shape and pass it as the output array. To achieve this, you would redefine yours nmf
to have a signature "(m,n)->(m,p),(p,n)"
and wrap it in some Python like:
def nmf_wrap(x, p):
x = np.asarray(x)
assert x.ndim >= 2
shape = x.shape[:-2]
m, n = x.shape[-2:]
out1 = np.empty(shape+(m, p), dtype=x.dtype)
out2 = np.empty(shape+(p, n), dtype=x.dtype)
return nmf.nmf(x, out1, out2)
There has been some discussion recently about extending the functionality provided by gufuncs signatures on the numpy-dev mailing list . What you described has some similarities to what I call "computed dimensions" there. If you want to see something implemented in numpy 1.10 it would be great if you could explain your use case in more detail in this list: we don't know about many (any?) Gufunc codecs in the wild!
Thanks @jaime for your answer, helped me a lot. I made the changes you suggested and it works. Here's a C code example. It just copies some elements of the input matrix into the outputs.
#include "Python.h"
#include "numpy/ndarraytypes.h"
#include "numpy/ufuncobject.h"
static PyMethodDef nmfMethods[] = {
{NULL, NULL, 0, NULL}
};
static void double_nmf(char **args, npy_intp *dimensions,
npy_intp* steps, void* data)
{
npy_intp i, j,
n = dimensions[1],
m = dimensions[2],
p = dimensions[3];
char *in = args[0], *out1 = args[1], *out2 = args[2];
for(i=0; i<n; i++){
for(j=0; j<p; j++){
*(double*)(out1 + 8*(j + p*i)) = *(double*)(in + 8*(j + m*i));
}
}
for(i=0; i<p; i++){
for(j=0; j<m; j++){
*(double*)(out2 + 8*(j + m*i)) = *(double*)(in + 8*(j + m*i));
}
}
return;
}
static PyUFuncGenericFunction nmf_functions[] = { double_nmf };
static void * nmf_data[] = { (void *)NULL };
static char nmf_signatures[] = { PyArray_DOUBLE, PyArray_DOUBLE, PyArray_DOUBLE };
char *nmf_signature = "(n,m)->(n,p),(p,m)";
PyMODINIT_FUNC initnmf(void)
{
PyObject *m, *d, *version, *nmf;
m = Py_InitModule("nmf", nmfMethods);
if (m == NULL) {
return;
}
import_array();
import_umath();
d = PyModule_GetDict(m);
version = PyString_FromString("0.1");
PyDict_SetItemString(d, "__version__", version);
Py_DECREF(version);
nmf = PyUFunc_FromFuncAndDataAndSignature(nmf_functions, nmf_data, nmf_signatures, 1,
1, 2, PyUFunc_None, "nmf",
"", 0, nmf_signature);
PyDict_SetItemString(d, "nmf", nmf);
Py_DECREF(nmf);
}
Here's some sample Python code and terminal output.
#/usr/bin/python
import numpy as np
import nmf
def nmf_wrap(x,p):
x = np.asarray(x)
assert x.ndim >=2
shape = x.shape[-2:]
n,m = shape[-2:]
out1 = np.empty((n, p), dtype=x.dtype)
out2 = np.empty((p, m), dtype=x.dtype)
return nmf.nmf(x, out1, out2)
x = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15],[16,17,18,19,20]])
y,z = nmf_wrap(x,2)
print 'Input:\n', x
print 'Output 1:\n', y
print 'Output 2:\n', z
Input:
[[ 1 2 3 4 5]
[ 6 7 8 9 10]
[11 12 13 14 15]
[16 17 18 19 20]]
Output 1:
[[ 1 2]
[ 6 7]
[11 12]
[16 17]]
Output 2:
[[ 1 2 3 4 5]
[ 6 7 8 9 10]]
Now I can continue programming.
I am also writing non-ufunc using Python / C API (no numpy) as @ nathaniel-j-smith suggested (this is what I got). I hope to get some results and briefly compare the two approaches.