Neural Network from Scratch in Cangjie: Part 5

Today, we will implement a loss and accuracy function to be able to observe how our network is performing in its currently untrained state. This is the last step before we move on to optimization and training. Training involves doing multiple forward and backward passes, adjusting weights and biases, and monitoring loss and accuracy. If loss goes down and accuracy goes up after each pass, the network is actually learning from the data.

For multi-class classification tasks like the one we are tackling here, Categorical Crossentropy is the most popular loss function. For binary classification with only 2 classes, the equivalent is Binary Crossentropy.

These functions take the output of the Softmax activation function and compare it to the ground truths - the `y` in the dataset. When the confidence for the correct class equals 1 (the network is 100% sure about its prediction), the loss is 0; the lower that confidence, the higher the loss.

Imagine Softmax outputs that look like the following:

let softmaxOutputs = [[0.7, 0.1, 0.2], [0.1, 0.5, 0.4], [0.02, 0.9, 0.08]]

...and the `y` values for that batch of outputs are the following (representing cat, dog, and dog):

let classTargets = [0, 1, 1]

In this case, we have 2 classes: 0 - cat, 1 - dog. If we encode the classes this way (and we do in the sample spiral data used in this tutorial series), the class labels double as indexes into the Softmax output, which lets us retrieve the corresponding predictions directly. To keep this consistent, the labels need to be encoded appropriately at the data cleaning stage.

We get [0.7, 0.5, 0.9] - the model is 70% sure that the first sample is a cat, 50% sure that the second sample is a dog, and so on.

Categorical crossentropy boils down to the negative logarithm of the confidence assigned to the correct class, or -log(x), and the batch loss is the average (mean) of those negative logarithms across samples. What would be the loss and accuracy for this batch of 3?
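
For reference, here is the general formula in standard notation (the symbols are generic, not from this series): for sample i with one-hot target row y_i, Softmax output row ŷ_i, and correct-class index k,

\[
L_i = -\sum_{j} y_{i,j}\,\log(\hat{y}_{i,j}) = -\log(\hat{y}_{i,k}),
\qquad
L = \frac{1}{N}\sum_{i=1}^{N} L_i
\]

Every term except the one for the correct class is multiplied by a zero in the one-hot vector, so the sum collapses to a single negative log - which is exactly what we compute below.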

First, we get the confidences.

// Assumes `import std.collection.*` (ArrayList, zip, map, collectArray) and `import std.math.*` (log, clamp)
var confidencesList = ArrayList<Float64>([])
for ((targIdx, distribution) in classTargets |> zip(softmaxOutputs)) {
    confidencesList.append(distribution[targIdx])
}

>>> [0.700000, 0.500000, 0.900000]

Second, we calculate the negative logarithm of each confidence. Because -log(0) is `inf` (infinity), a single fully-wrong prediction would blow the mean loss up to infinity, so we have to make sure our confidences are never exactly 0; to keep things symmetric we also cap them just below 1. What we can do is `clip` our values to the range [1e-7, 1 - 1e-7]: add a very small number to a prediction if it is 0 and subtract the same number if it is 1 (the `clamp` call below does exactly this). The book recommends using 1e-7, which is 0.0000001.

let negLog = confidencesList |> map {i => -log(clamp(i, 1e-7, 1.0 - 1e-7))} |> collectArray

>>> [0.356675, 0.693147, 0.105361]

Because we defined loss as the mean of the negative log values, the final step is simply to average them: (0.356675 + 0.693147 + 0.105361) / 3 ≈ 0.385. Accuracy, in turn, is the fraction of samples whose highest-confidence class matches the target; here the argmax of each Softmax row is [0, 1, 1], which matches `classTargets` exactly, so accuracy is 1.0.
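
Here is a minimal sketch of that last step in Cangjie, reusing `negLog`, `softmaxOutputs`, and `classTargets` from above together with the same imports (the names `lossSum`, `correct`, `loss`, and `accuracy` are just illustrative, not part of the series):

// Mean loss: average of the clipped negative log confidences
var lossSum = 0.0
for (l in negLog) {
    lossSum += l
}
let loss = lossSum / Float64(negLog.size)                     // ≈ 0.385061

// Accuracy: fraction of samples whose highest-confidence class matches the target
var correct = 0
for ((targIdx, distribution) in classTargets |> zip(softmaxOutputs)) {
    // Find the index of the largest confidence in this Softmax row
    var best = 0
    for (j in 1..distribution.size) {
        if (distribution[j] > distribution[best]) {
            best = j
        }
    }
    if (best == targIdx) {
        correct += 1
    }
}
let accuracy = Float64(correct) / Float64(classTargets.size)  // 1.0

For these three hand-picked samples the loss is roughly 0.385 with perfect accuracy. A completely unsure network that spreads its confidence evenly over n classes would instead sit at -log(1/n) - about 1.0986 for three classes - and both numbers should improve as training progresses.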
