How do I change the shape of the input to feed it into a 1D convolutional layer for classifying sequences?

I have a csv file with 339732 rows and two groups of columns:

  • the first 29 columns are feature values, i.e. X
  • the last column is a binary label, i.e. Y

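    import pandas as pd
    from sklearn.model_selection import train_test_split

    dataframe = pd.read_csv("features.csv", header=None)
    dataset = dataframe.values

    X = dataset[:, 0:29].astype(float)
    Y = dataset[:, 29]
    # train_test_split returns X_train, X_test, y_train, y_test, in that order
    X_train, X_test, y_train, y_test = train_test_split(X, Y, random_state=42)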
    
          

I am trying to train a model with 1D convolutional layers on it:

model = Sequential()
model.add(Conv1D(64, 3, activation='relu', input_shape=(X_train.shape[0], 29)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(X_train, y_train, batch_size=16, epochs=2)
score = model.evaluate(X_test, y_test, batch_size=16)

      

Since the Conv1D layer expects a 3-D input, I transformed my input like this:

X_train = np.reshape(X_train, (1, X_train.shape[0], X_train.shape[1]))
X_test = np.reshape(X_test, (1, X_test.shape[0], X_test.shape[1]))

      

However, this still throws an error:

ValueError: Negative dimension size caused by subtracting 3 from 1 for 'conv1d_1/convolution/Conv2D' (op: 'Conv2D') with input shapes: [?,1,1,29], [1,3,29,64].

Is there a way to properly shape my input?



2 answers


As far as I know, a 1D convolution layer accepts inputs of shape (batch size, width, channels). You change the shape with

X_train = np.reshape(X_train, (1, X_train.shape[0], X_train.shape[1]))



But X_train.shape[0] is your batch size, and this reshape moves it to the width axis while leaving a batch of 1. I guess the problem is here. Could you please tell me the shape of X_train before the reshape?
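To make the shape bookkeeping concrete, here is a small sketch that uses a hypothetical stand-in array of 8 samples in place of the real training set. It assumes the 29 features are treated as an ordered sequence with a single channel; the names `X_wrong` and `X_right` are illustrative only.

```python
import numpy as np

# Stand-in for X_train: 8 samples with 29 features each (the real data has far more rows).
X_train = np.arange(8 * 29, dtype=float).reshape(8, 29)

# The reshape from the question: all samples are folded into ONE batch element,
# so the layer would see a single sequence of 8 steps with 29 channels.
X_wrong = np.reshape(X_train, (1, X_train.shape[0], X_train.shape[1]))

# Keeping the samples on axis 0: 8 sequences of 29 steps with 1 channel each.
X_right = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))

print(X_wrong.shape)  # (1, 8, 29)
print(X_right.shape)  # (8, 29, 1)

# Keras input_shape excludes the batch axis, so under this layout it would be (29, 1).
input_shape = X_right.shape[1:]
print(input_shape)  # (29, 1)
```

Under this assumption the first layer would be declared with `input_shape=(29, 1)` rather than `(X_train.shape[0], 29)`.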



You need to think about whether there is any relationship among the 339732 samples or among the 29 features, that is, whether the order matters. If not, I don't think a CNN is right for this case.

If the 29 features "indicate the progression of something" (their order matters):

X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))

If the 29 features are independent, they are like the channels of an image, but it doesn't make sense to convolve over a sequence of length 1:



X_train = X_train.reshape((X_train.shape[0], 1, X_train.shape[1]))
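The difference between the two layouts can be checked with plain arithmetic. The sketch below is not Keras code: `valid_output_length` and `trace_question_model` are hypothetical helpers that only apply the usual output-length formula for unpadded ("valid") windows to the layer stack from the question, assuming Keras defaults (stride 1 for `Conv1D`, stride equal to the pool size for `MaxPooling1D`). The error text it raises is made up for illustration.

```python
def valid_output_length(steps, window, stride=1):
    """Output length of a 'valid' (unpadded) 1D convolution or pooling window."""
    if steps < window:
        raise ValueError(
            f"negative dimension: window of {window} does not fit into {steps} step(s)"
        )
    return (steps - window) // stride + 1


def trace_question_model(steps):
    """Follow the sequence length through the Conv1D/MaxPooling1D stack from the question."""
    lengths = [steps]
    # (window, stride) per layer: Conv1D(.., 3) twice, MaxPooling1D(3), Conv1D(.., 3) twice
    for window, stride in [(3, 1), (3, 1), (3, 3), (3, 1), (3, 1)]:
        lengths.append(valid_output_length(lengths[-1], window, stride))
    return lengths


# 29 steps with 1 channel: every layer still has enough steps to work on.
print(trace_question_model(29))  # [29, 27, 25, 8, 6, 4]

# 1 step with 29 channels: the very first kernel of size 3 cannot fit.
try:
    trace_question_model(1)
except ValueError as err:
    print("fails:", err)
```

A sequence length of 1 is what the "subtracting 3 from 1" part of the error in the question refers to.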

If you want to group the 339732 records into blocks where order matters (trim the 339732 records or add zero padding so that the count is divisible by timesteps):

X_train = X_train.reshape((X_train.shape[0] // timesteps, timesteps, X_train.shape[1]))
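A minimal sketch of the zero-padding variant follows. It assumes a 3-D layout of (blocks, timesteps, features), which is what a `Conv1D` layer consumes; `to_blocks` is a hypothetical helper and the 10-row array is a stand-in for the real data.

```python
import numpy as np


def to_blocks(X, timesteps):
    """Zero-pad the rows of X to a multiple of `timesteps`, then group them into blocks."""
    n_samples, n_features = X.shape
    remainder = n_samples % timesteps
    if remainder:
        padding = np.zeros((timesteps - remainder, n_features), dtype=X.dtype)
        X = np.concatenate([X, padding], axis=0)
    # Result: (number of blocks, timesteps, features)
    return X.reshape((-1, timesteps, n_features))


X_train = np.ones((10, 29))  # stand-in: 10 consecutive records
blocks = to_blocks(X_train, timesteps=4)
print(blocks.shape)  # (3, 4, 29)
```

The labels would need the same grouping. How to assign a single label to a block is a modelling decision that the answer does not cover.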
