Consistent Rank Logits for Ordinal Regression
Network design
The last fully-connected layer outputs a single value instead of num_classes values (and has no bias of its own); on top of it, a 1D bias vector with num_classes - 1 entries is introduced.
self.fc = nn.Linear(4096, 1, bias=False)  # single output shared by all K-1 binary tasks
self.linear_1_bias = nn.Parameter(torch.zeros(num_classes-1).float())  # K-1 independent bias units
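To see how the two pieces fit together, here is a minimal sketch of the forward pass through this head; the feature tensor x (a batch of 4096-dimensional backbone features) is an assumption for illustration. The single fc output is broadcast against the num_classes - 1 bias units, giving one logit per binary task.

# x: backbone features of shape (batch_size, 4096) -- assumed for illustration
logits = self.fc(x) + self.linear_1_bias   # shape: (batch_size, num_classes-1)
probas = torch.sigmoid(logits)             # one probability per binary classifier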
Loss function
Let $W$ denote the weight parameters of the neural network excluding the bias units of the final layer. The penultimate layer, whose output is denoted as $g(x_i, W)$, shares a single weight with all nodes in the final output layer. $K-1$ independent bias units are then added to $g(x_i, W)$ such that $\{g(x_i, W) + b_k\}_{k=1}^{K-1}$ are the inputs to the corresponding binary classifiers in the final layer. Let $\sigma(z) = 1/(1 + \exp(-z))$ be the logistic sigmoid function. The predicted empirical probability for task $k$ is defined as:

$$\hat{P}\left(y_i^{(k)} = 1\right) = \sigma\left(g(x_i, W) + b_k\right)$$
For model training, we minimize the loss function:

$$L(W, \mathbf{b}) = -\sum_{i=1}^{N} \sum_{k=1}^{K-1} \lambda^{(k)} \left[ \log\!\left(\sigma(g(x_i, W) + b_k)\right) y_i^{(k)} + \log\!\left(1 - \sigma(g(x_i, W) + b_k)\right) \left(1 - y_i^{(k)}\right) \right],$$
which is the weighted cross-entropy of $K-1$ binary classifiers ($\lambda^{(k)}$ is the importance weight of task $k$). The binary labels for training are obtained by extending each rank label $y_i$ into $K-1$ binary labels via:

$$y_i^{(k)} = \mathbb{1}\{y_i > r_k\}$$

For rank prediction, the predicted label is the number of binary tasks whose predicted probability exceeds 0.5.
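A minimal PyTorch sketch of this loss, assuming logits has shape (batch_size, num_classes-1) as in the forward sketch above, levels holds the extended binary labels, and imp carries the task weights $\lambda^{(k)}$ (the function and argument names are illustrative, not the paper's reference implementation):

import torch
import torch.nn.functional as F

def coral_loss(logits, levels, imp=1.0):
    # log(sigmoid(z)) = F.logsigmoid(z);  log(1 - sigmoid(z)) = F.logsigmoid(z) - z
    term = F.logsigmoid(logits) * levels + (F.logsigmoid(logits) - logits) * (1 - levels)
    return torch.mean(-torch.sum(term * imp, dim=1))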
Example
Let's take a look at the labels, for 7 ranks:
- For Cross-Entropy, the one-hot encoded label for class 3 is $[0, 0, 0, 1, 0, 0, 0]$,
- For the CORAL loss, it's $[1, 1, 1, 0, 0, 0]$ (six binary labels, one per rank threshold), generated by:
levels = [[1] * label + [0] * (self.num_classes - 1 - label) for label in batch_y]
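For instance, with a hypothetical batch_y = [3, 0, 6] and num_classes = 7, the comprehension yields:

batch_y = [3, 0, 6]
levels = [[1, 1, 1, 0, 0, 0],   # label 3
          [0, 0, 0, 0, 0, 0],   # label 0
          [1, 1, 1, 1, 1, 1]]   # label 6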
The CORAL head outputs $K-1 = 6$ sigmoid probabilities, one per binary task. To predict, we count how many of them are $\geq 0.5$; for the label-3 example above, the first three probabilities exceed 0.5, so the prediction is 3.
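In code, this counting rule is a one-liner (probas is assumed to be the sigmoid output from the forward sketch above):

predicted_labels = torch.sum(probas > 0.5, dim=1)   # number of tasks with probability >= 0.5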
During training, the loss for the current sample $x_i$ is calculated as:

$$-\sum_{k=1}^{K-1} \lambda^{(k)} \left[ \log\!\left(\sigma(g(x_i, W) + b_k)\right) y_i^{(k)} + \log\!\left(1 - \sigma(g(x_i, W) + b_k)\right) \left(1 - y_i^{(k)}\right) \right]$$
Ordinal Regression
Network design
The last fully-connected layer outputs (num_classes - 1) * 2 logits.
self.fc = nn.Linear(2048 * block.expansion, (self.num_classes-1)*2)
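Since the softmax below is taken over dim=2, the flat (num_classes - 1) * 2 logits are presumably reshaped into one 2-way softmax per rank threshold first; a minimal sketch, assuming x is the pooled backbone feature:

logits = self.fc(x)                                # shape: (batch_size, (num_classes-1)*2)
logits = logits.view(-1, self.num_classes - 1, 2)  # shape: (batch_size, num_classes-1, 2)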
The final prediction is obtained similarly to the CORAL loss:
probas = F.softmax(logits, dim=2)[:, :, 1]            # probability of the "1" class for each binary task
predict_levels = probas > 0.5                         # threshold each task
predicted_labels = torch.sum(predict_levels, dim=1)   # count of positive tasks = predicted rank
Loss function
def cost_fn(logits, levels, imp):
    # logits: (batch, num_classes-1, 2); levels: (batch, num_classes-1) binary labels; imp: task weights
    log_probas = F.log_softmax(logits, dim=2)
    val = -torch.sum((log_probas[:, :, 1] * levels
                      + log_probas[:, :, 0] * (1 - levels)) * imp, dim=1)
    return torch.mean(val)
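A hypothetical usage sketch, assuming uniform task-importance weights and dummy inputs whose shapes match the reshaped logits above:

num_classes = 7
batch_y = [3, 0, 6]                                        # dummy rank labels
levels = torch.tensor([[1] * y + [0] * (num_classes - 1 - y) for y in batch_y],
                      dtype=torch.float)                   # extended binary labels, shape (3, 6)
imp = torch.ones(num_classes - 1)                          # uniform task weights (assumption)
logits = torch.randn(len(batch_y), num_classes - 1, 2)     # dummy network output, shape (3, 6, 2)
loss = cost_fn(logits, levels, imp)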