Returning two lists of tensors as the outputs of a Theano function

I am using Theano to build a neural network, but when I try to return two lists of tensors at once in the outputs list, I get this error:

#This is the line that causes the error
#type(nabla_w) == <type 'list'>
#type(nabla_w[0]) == <class 'theano.tensor.var.TensorVariable'>
backpropagate = function(func_inputs, [nabla_w, nabla_b])

TypeError: Outputs must be theano Variable or Out instances. Received [dot.0, dot.0, dot.0, dot.0] of type <type 'list'>


What structure should I use in Theano to return the two lists of tensors as outputs, so that I can unpack them like this:

nabla_w, nabla_b = backpropagate(*args)


I have tried some of the things I found on the Basic Tensor Functionality documentation page, but none of them work (for example, I tried stack and stacklists).

Here is the error I get using theano.tensor.stack or stacklists:

ValueError: all the input array dimensions except for the concatenation axis must match exactly
Apply node that caused the error: Join(TensorConstant{0}, Rebroadcast{0}.0, Rebroadcast{0}.0, Rebroadcast{0}.0, Rebroadcast{0}.0)
Inputs shapes: [(), (1, 10, 50), (1, 50, 100), (1, 100, 200), (1, 200, 784)]
Inputs strides: [(), (4000, 400, 8), (40000, 800, 8), (160000, 1600, 8), (1254400, 6272, 8)]
Inputs types: [TensorType(int8, scalar), TensorType(float64, 3D), TensorType(float64, 3D), TensorType(float64, 3D), TensorType(float64, 3D)]
Use the Theano flag 'exception_verbosity=high' for a debugprint of this apply node.


A little extra context for the code:

import theano.tensor as T
from theano import function

weights = [T.dmatrix('w'+str(x)) for x in range(0, len(self.weights))]
biases = [T.dmatrix('b'+str(x)) for x in range(0, len(self.biases))]
nabla_b = []
nabla_w = []
# feedforward
x = T.dmatrix('x')
y = T.dmatrix('y')
activations = []
inputs = []
activations.append(x)
for i in xrange(0, self.num_layers-1):
    inputt = T.dot(weights[i], activations[i])+biases[i]
    activation = 1 / (1 + T.exp(-inputt))
    activations.append(activation)
    inputs.append(inputt)

# error and gradients at the output layer
delta = activations[-1]-y
nabla_b.append(delta)
nabla_w.append(T.dot(delta, T.transpose(inputs[-2])))

# propagate the error back through the hidden layers
for l in xrange(2, self.num_layers):
    z = inputs[-l]
    spv = (1 / (1 + T.exp(-z))*(1 - (1 / (1 + T.exp(-z)))))  # sigmoid'(z)
    delta = T.dot(T.transpose(weights[-l+1]), delta) * spv
    nabla_b.append(delta)
    nabla_w.append(T.dot(delta, T.transpose(activations[-l-1])))
# note: T.set_subtensor returns a new variable; calling it without
# assigning the result, as below, has no effect on nabla_w
T.set_subtensor(nabla_w[-l], T.dot(delta, T.transpose(inputs[-l-1])))
func_inputs = list(weights)
func_inputs.extend(biases)
func_inputs.append(x)
func_inputs.append(y)


backpropagate = function(func_inputs, [nabla_w, nabla_b])

1 answer


This is not supported by Theano. When you call theano.function(inputs, outputs), the outputs can be only one of two things:

1) a Theano variable
2) a list of Theano variables

(2) does not allow a list inside the top-level list, so you must flatten the lists in the outputs. The function will then return more than two outputs.
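For example, here is a minimal sketch of the flattening approach, reusing the names from the question (args and the *_vals names are placeholders I introduce here):

n = len(nabla_w)
# pass one flat list of output variables
backpropagate = function(func_inputs, nabla_w + nabla_b)

# the outputs come back in the same order, so split the result list again
results = backpropagate(*args)
nabla_w_vals, nabla_b_vals = results[:n], results[n:]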

The hackish solution to your problem is to merge each inner list into a single variable:

tensor_nabla_w = theano.tensor.stack(*nabla_w)



This requires that all the elements of nabla_w have the same shape. It also adds an extra copy into the compute graph (so it can be slightly slower).
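A short sketch of how the stacked variant would be used, again with the question's names; this assumes every nabla_w[i] is an r x c matrix of the same shape (and likewise for nabla_b):

tensor_nabla_w = theano.tensor.stack(*nabla_w)  # shape (len(nabla_w), r, c)
tensor_nabla_b = theano.tensor.stack(*nabla_b)
backpropagate = function(func_inputs, [tensor_nabla_w, tensor_nabla_b])

# two 3-D arrays come back, one slice per layer
w_stack, b_stack = backpropagate(*args)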

Update 1: fixed the call to stack().

Update 2:

You have since clarified that the elements all have different shapes, so stack cannot be used. If they all have the same number of dimensions and the same dtype, you can use typed_list; otherwise, you will need to modify Theano yourself or flatten the lists in the outputs.
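A minimal, hedged sketch of the typed_list route, assuming your Theano version ships theano.typed_list with a make_list function; the two matrices here are stand-ins for gradients of different shapes (same ndim, same dtype):

import theano
import theano.typed_list
import theano.tensor as T

a = T.dmatrix('a')
b = T.dmatrix('b')

# pack two results with different shapes (but the same ndim and dtype)
# into one typed-list variable that can be returned as a single output
packed = theano.typed_list.make_list([T.transpose(a), T.dot(a, b)])
f = theano.function([a, b], packed)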
