Strange accuracy in Keras multi-label classification

I have a problem with multi-label classification. I used the code below, but already in the first epoch the validation accuracy rises to 99%, which is strange given the complexity of the data: the input features are 2048-dimensional vectors extracted from the inception model (the pool3:0 layer), and the labels are [1000]-dimensional. (Here is a link to a file containing sample features and labels: https://drive.google.com/file/d/0BxI_8PO3YBPPYkp6dHlGeExpS1k/view?usp=sharing ). Am I doing something wrong here?

Note: the sparse label vectors contain only 1~10 ones; all other entries are zeros.

model.compile(optimizer='adadelta', loss='binary_crossentropy', metrics=['accuracy'])

The prediction result is all zeros!

What am I doing wrong when training the model?

import numpy as np

# Input: the features file and labels file. Each line holds an
# identifier, 2048 feature values and 1000 binary labels,
# separated by commas.
def generate_arrays_from_file(path, batch_size=100):
    x = np.empty([batch_size, 2048])
    y = np.empty([batch_size, 1000])
    while True:
        f = open(path)
        i = 0
        for line in f:
            # Create NumPy arrays of input data and labels
            # from each line in the file.
            words = line.split(',')
            words = list(map(float, words[1:]))  # skip the leading identifier
            x[i] = np.array(words[0:2048])
            y[i] = np.array(list(map(int, words[2048:])))
            i += 1
            if i == batch_size:
                i = 0
                # Yield copies so the batch is not overwritten
                # while Keras is still queuing it.
                yield (x.copy(), y.copy())
        f.close()

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(units=2048, activation='sigmoid', input_dim=2048))
model.add(Dense(units=1000, activation='sigmoid',
                kernel_initializer='uniform'))
model.compile(optimizer='adadelta', loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit_generator(generate_arrays_from_file('train.txt'),
                    validation_data=generate_arrays_from_file('test.txt'),
                    validation_steps=1000, epochs=100,
                    steps_per_epoch=1000, verbose=1)

      



2 answers


I think the problem with accuracy is that your output is sparse.

Keras calculates the accuracy using this formula:

K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)

So in your case, with only 1~10 non-zero entries out of 1000 labels, a model that predicts all zeros still achieves 99%~99.9% accuracy.
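You can verify this with a quick sketch (the count of 5 positive labels is just an illustrative assumption):

import numpy as np

# A 1000-dimensional target with 5 ones, the rest zeros.
y_true = np.zeros(1000)
y_true[:5] = 1

# A model that outputs all zeros.
y_pred = np.zeros(1000)

# NumPy equivalent of K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
print(np.mean(np.equal(y_true, np.round(y_pred))))  # 0.995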

As for the network not learning: I believe the problem is that you use a sigmoid as the last activation while targeting output values of exactly 0 or 1. This is bad practice, because for a sigmoid to output 0 or 1 its input must be very large or very small, which pushes the network toward very large (in absolute value) weights. In addition, since each training target contains far fewer ones than zeros, the network quickly settles into a stationary point where it simply outputs all zeros (the loss in this case is also not very large; it should be around 0.016-0.16).

What you can do is scale the output labels so they lie in (0.2, 0.8), for example, so the network weights do not become too large or too small. Alternatively, you can use relu as the activation function, as in the sketch below.
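A minimal sketch of the label-scaling idea (the scale_labels helper and the exact 0.2/0.8 bounds are illustrative assumptions, not part of the original code):

import numpy as np

def scale_labels(y, low=0.2, high=0.8):
    # Map hard 0/1 targets into (low, high) so the sigmoid
    # output never has to saturate to match them.
    return y * (high - low) + low

y_batch = np.array([[0, 1, 0, 0, 1]], dtype=float)
print(scale_labels(y_batch))  # [[0.2 0.8 0.2 0.2 0.8]]

At prediction time you would then treat anything above the midpoint (0.5) as a positive label.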



Have you tried using cosine similarity as a loss function?

I had the same problem with multi-label outputs + high dimensionality.

The cosine distance takes into account the orientation of the model output (the prediction) relative to the target output vector (the true class).



It is the normalized dot product between two vectors.

In Keras, the cosine_proximity function is -1 * cosine similarity, so a value of -1 means the two vectors have the same orientation.
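If you want to try it, a minimal sketch (assuming the same model as in the question; 'cosine_proximity' is the built-in loss name in older Keras versions, renamed to CosineSimilarity in later tf.keras releases):

# Cosine proximity instead of binary crossentropy: the loss is
# -1 * cosine similarity, so values approaching -1 mean the
# prediction vector points in the same direction as the target.
model.compile(optimizer='adadelta', loss='cosine_proximity',
              metrics=['accuracy'])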







