FaceNet for dummies

The FaceNet algorithm (described in this article) uses a convolutional neural network to embed an image in a 128-dimensional Euclidean space.
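To make the idea concrete, here is a minimal sketch of how such embeddings are used for face verification. The `embed` function below is a stand-in for the trained CNN (its output is random here, only the shape and normalization match FaceNet's), and the threshold value is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(image):
    """Stand-in for the CNN: returns an L2-normalized 128-d embedding.
    A real model would map similar faces to nearby points."""
    v = rng.standard_normal(128)
    return v / np.linalg.norm(v)

emb_a = embed("face_a.jpg")
emb_b = embed("face_b.jpg")

# Same-identity decision: threshold the squared Euclidean distance
# between the two embeddings (threshold tuned on a validation set).
distance = np.sum((emb_a - emb_b) ** 2)
same_person = distance < 1.1
```

Because the embeddings live on the unit sphere, the squared distance between any two of them lies in [0, 4], which is what makes a single fixed threshold workable.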

While reading the article, I did not understand:

  • How does the loss function affect the convolutional network? In conventional networks, the weights are slightly adjusted via backpropagation to minimize the loss — so what happens in this case?


  • How are triplets selected?

    2.1. How do I know whether a negative image is "hard"?

    2.2. Why is the loss function used to pick the negative image?

    2.3. When I check my images for hardness relative to the anchor — I believe this happens before the triplet is sent through the network. Is that correct?




1 answer


Here are some answers that may clear up your doubts:

  • Here too the weights are adjusted via backpropagation to minimize the loss; it is just that the loss itself is a little tricky. The loss has two parts (separated by the + sign in the equation): the first part, `||f(x_a) - f(x_p)||^2`, compares a picture of a person with another picture of the same person; the second part, `||f(x_a) - f(x_n)||^2`, compares it with a picture of a different person. We want the first part to be smaller than the second part, and the loss equation essentially expresses this. So you are basically tuning the weights to make the same-person error smaller and the different-person error larger.

  • The loss term involves three images: the anchor `x_a`, its positive pair `x_p` (same person), and its negative pair `x_n` (different person). The hardest positive of `x_a` is the positive image with the largest distance to the anchor among all positives. The hardest negative of `x_a` is the closest image of a different person. Thus, you want to pull the most distant positives close together and push away the closest negatives. This is what the loss equation captures.

  • FaceNet selects its triplets during training (online). In each minibatch (a set of 40 images) it selects the hardest negative for each anchor, and instead of selecting the single hardest positive image, it uses all anchor-positive pairs in the batch.
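The loss and the hardest-negative selection described above can be sketched in NumPy. This is an illustrative reimplementation of the equations, not the paper's actual code; `alpha` is the margin from the paper (0.2 there):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Triplet loss: max(||a - p||^2 - ||a - n||^2 + alpha, 0).
    The first term pulls same-person pairs together; the second
    pushes different-person pairs at least `alpha` further apart."""
    pos_dist = np.sum((anchor - positive) ** 2)
    neg_dist = np.sum((anchor - negative) ** 2)
    return max(pos_dist - neg_dist + alpha, 0.0)

def hardest_negative(anchor_emb, negative_embs):
    """Online mining within a batch: pick the negative embedding
    closest to the anchor (the 'hardest' one)."""
    dists = np.sum((negative_embs - anchor_emb) ** 2, axis=1)
    return negative_embs[np.argmin(dists)]
```

Note that the loss is zero once the negative is already `alpha` further from the anchor than the positive — such "easy" triplets contribute no gradient, which is exactly why mining hard triplets matters for training speed.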



If you want to implement face recognition, it is better to consider this paper, which introduces center loss — it is much easier to train and has been shown to perform better.
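For intuition, here is a minimal sketch of the center-loss idea: each class keeps a running center in embedding space, and the loss penalizes each embedding's distance to its class center. The class names, update rule, and learning rate below are simplified illustrations, not the paper's exact formulation:

```python
import numpy as np

class CenterLoss:
    """Sketch of center loss: intra-class compactness via per-class
    centers that are nudged toward their class's embeddings."""

    def __init__(self, num_classes, dim, lr=0.5):
        self.centers = np.zeros((num_classes, dim))
        self.lr = lr

    def __call__(self, embeddings, labels):
        # Distance of each embedding to its class center.
        diffs = embeddings - self.centers[labels]
        loss = 0.5 * np.sum(diffs ** 2) / len(labels)
        # Move each center that appears in the batch toward the
        # mean of its class's embeddings.
        for c in np.unique(labels):
            mask = labels == c
            self.centers[c] += self.lr * diffs[mask].mean(axis=0)
        return loss
```

Unlike the triplet loss, this needs no triplet mining at all: it is combined with an ordinary softmax classification loss during training, which is the main reason it is easier to train.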
