Cross-Entropy loss
The Cross-Entropy Loss is actually the only loss we are discussing here. The other loss names in the title are alternative names for it or variations of it. The CE Loss is defined as:
$$CE = -\sum_{i}^{C} t_i \log(s_i)$$
Where $t_i$ and $s_i$ are the ground truth and the CNN score for each class $i$ in $C$. Since an activation function (Sigmoid / Softmax) is usually applied to the scores before the CE Loss computation, we write $f(s_i)$ to refer to the activations.
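As a small worked example of this definition, the sketch below takes raw scores for a single sample, applies Softmax to obtain the activations, and sums the negative log activations weighted by the ground truth. The score values and the helper names `softmax` and `cross_entropy` are illustrative assumptions, not taken from any framework.

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability; the result is unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(targets, activations):
    # CE = -sum_i t_i * log(f(s_i)); classes with t_i == 0 contribute nothing.
    return -sum(t * math.log(a) for t, a in zip(targets, activations) if t > 0)

scores = [2.0, 1.0, 0.1]    # hypothetical raw CNN scores for 3 classes
targets = [1.0, 0.0, 0.0]   # one-hot ground truth: class 0 is the positive class

activations = softmax(scores)
loss = cross_entropy(targets, activations)
print(activations, loss)
```

With a one-hot target only the positive class contributes to the sum, so the loss equals the negative log of that class's activation.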
In a binary classification problem, where $C' = 2$, the Cross-Entropy Loss can also be defined as [discussion]:
$$CE = -\sum_{i=1}^{C'=2} t_i \log(s_i) = -t_1 \log(s_1) - (1 - t_1) \log(1 - s_1)$$
Where it’s assumed that there are two classes: $C_1$ and $C_2$. $t_1 \in [0,1]$ and $s_1$ are the ground truth and the score for $C_1$, and $t_2 = 1 - t_1$ and $s_2 = 1 - s_1$ are the ground truth and the score for $C_2$. This is the case when we split a Multi-Label classification problem into binary classification problems. See the next Binary Cross-Entropy Loss section for more details.
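The equivalence between the two-class sum and the binary form can be checked numerically. This is a minimal sketch: the activation value 0.8 and the function names `ce_two_classes` and `binary_ce` are hypothetical, chosen only for illustration.

```python
import math

def ce_two_classes(t, s):
    # General form: sum over both classes of -t_i * log(s_i).
    return -sum(ti * math.log(si) for ti, si in zip(t, s) if ti > 0)

def binary_ce(t1, s1):
    # Binary form, using t2 = 1 - t1 and s2 = 1 - s1.
    return -t1 * math.log(s1) - (1 - t1) * math.log(1 - s1)

s1 = 0.8  # hypothetical activation for the first class
for t1 in (0.0, 1.0):
    general = ce_two_classes([t1, 1 - t1], [s1, 1 - s1])
    binary = binary_ce(t1, s1)
    print(t1, general, binary)
```

Both forms print the same value for each ground truth, since the second class's target and score are fully determined by the first.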
Logistic Loss and Multinomial Logistic Loss are other names for Cross-Entropy loss. [Discussion]
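import numpy as np
# Assumption: log_loss below is scikit-learn's sklearn.metrics.log_loss.
from sklearn.metrics import log_loss

def softmax(X):
    # Subtract the row-wise max for numerical stability, then normalize each row.
    exps = np.exp(X - np.max(X, axis=-1, keepdims=True))
    return exps / np.sum(exps, axis=-1, keepdims=True)

def cross_entropy(predictions, targets, epsilon=1e-12):
    # predictions: activations (probabilities), shape (N, num_classes)
    # targets: one-hot ground truth, shape (N, num_classes)
    predictions = np.clip(predictions, epsilon, 1.0)  # avoid log(0)
    N = predictions.shape[0]
    ce = -np.sum(targets * np.log(predictions)) / N
    return ce

predictions = np.array([[0.25, 0.25, 0.25, 0.25], [0.01, 0.01, 0.01, 0.97]])  # (N, num_classes)
targets = np.array([[1, 0, 0, 0], [0, 0, 0, 1]])  # (N, num_classes)

cross_entropy(predictions, targets)
# 0.7083767843022996
log_loss(targets, predictions)
# 0.7083767843022996
# Compare with a tolerance rather than ==, since both values are floats.
np.isclose(log_loss(targets, predictions), cross_entropy(predictions, targets))
# True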
The layers of Caffe, PyTorch and TensorFlow that use a Cross-Entropy loss without an embedded activation function are:
- Caffe: Multinomial Logistic Loss Layer. It is limited to multi-class classification (does not support multiple labels).
- PyTorch: BCELoss. It is limited to binary classification (between two classes).
- TensorFlow: log_loss.
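In practice, a loss without an embedded activation expects probabilities as input, so the caller has to apply the Sigmoid (or Softmax) to the raw scores first. The sketch below illustrates that division of work in plain Python. It is not the code of any of the layers listed above; `sigmoid`, `binary_cross_entropy`, the `eps` parameter and the example values are all hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binary_cross_entropy(probs, targets, eps=1e-12):
    # Expects probabilities in (0, 1), not raw scores: no activation is applied here.
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)

scores = [2.0, -1.0, 0.5]    # hypothetical raw scores from a network
targets = [1.0, 0.0, 1.0]    # binary ground truth per sample

# The activation is the caller's responsibility.
probs = [sigmoid(s) for s in scores]
loss = binary_cross_entropy(probs, targets)
print(loss)
```

Passing the raw scores directly would be a mistake here, because values outside (0, 1) are not valid probabilities for the logarithm terms.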