papers
The Unreasonable Effectiveness of Recurrent Neural Networks
https://karpathy.github.io/2015/05/21/rnn-effectiveness/
Understanding LSTM Networks
https://colah.github.io/posts/2015-08-Understanding-LSTMs/
output
RNN(
(i2h): Linear(in_features=205, out_features=128, bias=True)
(i2o): Linear(in_features=205, out_features=59, bias=True)
(o2o): Linear(in_features=187, out_features=59, bias=True)
(dropout): Dropout(p=0.1)
(softmax): LogSoftmax()
)
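
The layer widths in the summary above can be checked with a little arithmetic. The sketch below assumes that i2h and i2o take a concatenation of an extra input vector, the current letter, and the hidden state, and that o2o takes the hidden state concatenated with the i2o output. That layout is an assumption based on the printed sizes, not something the output states. The name n_categories and the helper infer_extra_inputs are hypothetical.

```python
# Sizes printed in the module summary.
N_LETTERS = 59   # out_features of i2o and o2o
N_HIDDEN = 128   # out_features of i2h
I2H_IN = 205     # in_features of i2h and i2o
O2O_IN = 187     # in_features of o2o


def infer_extra_inputs(i2h_in: int, n_letters: int, n_hidden: int) -> int:
    """Return the input width left over after the letter and hidden vectors."""
    extra = i2h_in - n_letters - n_hidden
    if extra < 0:
        raise ValueError("input layer is narrower than letters + hidden")
    return extra


# ASSUMPTION: the leftover width is a one-hot category vector.
n_categories = infer_extra_inputs(I2H_IN, N_LETTERS, N_HIDDEN)

# ASSUMPTION: o2o consumes the hidden state concatenated with the i2o output.
o2o_matches = (N_HIDDEN + N_LETTERS == O2O_IN)

print(n_categories, o2o_matches)
```

Under these assumptions the extra input is 18 wide, which would correspond to 18 categories in the names dataset, and the o2o input width of 187 is consistent with 128 + 59.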
config:
eval_epoch_steps : 10
train_load_check_point_file : True
num_workers : 4
momentum : 0.9
early_stop_epoch_limit : 10
optimizer : SGD
early_stop_epoch : True
train_epoch_steps : 10
dataset : names
steps : 100000
early_stop_step : True
n_hidden : 128
epoch_only : True
learn_rate : 0.0005
all_letters : abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ .,;-
epochs : 100
n_letters : 59
batch_size : 1000
max_epoch_stop : True
max_step_stop : True
device : cpu
print_every : 5
data_path : ./data/names
loss : NLL
early_stop_step_limit : 100
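
The early_stop_epoch_limit and early_stop_step_limit entries suggest a patience-style rule: stop once the validation loss has not improved for a given number of epochs or steps. The sketch below is one plausible reading of those settings, not the code that produced this output. The class EarlyStopper and its method names are hypothetical.

```python
class EarlyStopper:
    """Track the best validation loss and signal when patience runs out."""

    def __init__(self, limit: int):
        self.limit = limit            # e.g. early_stop_epoch_limit or early_stop_step_limit
        self.best_loss = float("inf")
        self.best_index = None        # epoch or step at which the best loss was seen

    def update(self, index: int, loss: float) -> bool:
        """Record a validation loss; return True when training should stop."""
        if self.best_index is None or loss < self.best_loss:
            self.best_loss = loss
            self.best_index = index
            return False
        # No improvement: stop once `limit` epochs/steps have passed since the best.
        return index - self.best_index >= self.limit


# Illustrative values only, with a short limit so the stop is visible.
stopper = EarlyStopper(limit=2)
decisions = [stopper.update(epoch, loss)
             for epoch, loss in enumerate([2.10, 2.05, 2.06, 2.07])]
print(decisions, stopper.best_index, stopper.best_loss)
```

One instance with limit=10 indexed by epoch and another with limit=100 indexed by step would mirror the two limits in the config, assuming they are measured in epochs and steps respectively.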
[E: 0/100] [S: 5/100000] [Train: Loss:2.958800 5/10 (50%)] [Val: Loss:2.839336 0/10000 (0%)] [Best epoch:0 Loss:2.839336] [Best step:5 Loss:2.839336] [0.00s 0.0s]
[E: 0/100] [S: 10/100000] [Train: Loss:2.647621 10/10 (100%)] [Val: Loss:2.631263 0/10000 (0%)] [Best epoch:0 Loss:2.631263] [Best step:10 Loss:2.631263] [0.00s 0.0s]
[E: 0/100] [S: 10/100000] [Train: Loss:2.968491 10/10 (100%)] [Val: Loss:2.629430 0/10000 (0%)] [Best epoch:0 Loss:2.629430] [Best step:10 Loss:2.631263] [40.92s 40.9s]
[E: 1/100] [S: 15/100000] [Train: Loss:2.736446 5/10 (50%)] [Val: Loss:2.579854 0/10000 (0%)] [Best epoch:1 Loss:2.579854] [Best step:15 Loss:2.579854] [40.92s 40.9s]
[E: 1/100] [S: 20/100000] [Train: Loss:2.443919 10/10 (100%)] [Val: Loss:2.463921 0/10000 (0%)] [Best epoch:1 Loss:2.463921] [Best step:20 Loss:2.463921] [40.92s 40.9s]
[E: 1/100] [S: 20/100000] [Train: Loss:2.628853 10/10 (100%)] [Val: Loss:2.466182 0/10000 (0%)] [Best epoch:1 Loss:2.463921] [Best step:20 Loss:2.463921] [40.96s 81.9s]
[E: 2/100] [S: 25/100000] [Train: Loss:2.594146 5/10 (50%)] [Val: Loss:2.435748 0/10000 (0%)] [Best epoch:2 Loss:2.435748] [Best step:25 Loss:2.435748] [40.96s 81.9s]
[E: 2/100] [S: 30/100000] [Train: Loss:2.365779 10/10 (100%)] [Val: Loss:2.369891 0/10000 (0%)] [Best epoch:2 Loss:2.369891] [Best step:30 Loss:2.369891] [40.96s 81.9s]
[E: 1/100] [S: 25/100000] [Train: Loss:2.570118 5/10 (50%)] [Val: Loss:2.437161 0/10000 (0%)] [Best epoch:1 Loss:2.437161] [Best step:25 Loss:2.437161] [40.96s 81.9s] Step Status
[E: 1/100] [S: 30/100000] [Train: Loss:2.366355 10/10 (100%)] [Val: Loss:2.368035 0/10000 (0%)] [Best epoch:1 Loss:2.368035] [Best step:30 Loss:2.368035] [40.96s 81.9s] Step Status
[E: 1/100] [S: 30/100000] [Train: Loss:2.366355 10/10 (100%)] [Val: Loss:2.368035 0/10000 (0%)] [Best epoch:1 Loss:2.357966] [Best step:30 Loss:2.368035] [40.80s 122.7s] Epoch Status
[E: 2/100] [S: 35/100000] [Train: Loss:2.515979 5/10 (50%)] [Val: Loss:2.340214 0/10000 (0%)] [Best epoch:2 Loss:2.340214] [Best step:35 Loss:2.340214] [40.80s 122.7s] Step Status
[E: 2/100] [S: 40/100000] [Train: Loss:2.323779 10/10 (100%)] [Val: Loss:2.298239 0/10000 (0%)] [Best epoch:2 Loss:2.298239] [Best step:40 Loss:2.298239] [40.80s 122.7s] Step Status
[E: 2/100] [S: 40/100000] [Train: Loss:2.323779 10/10 (100%)] [Val: Loss:2.298239 0/10000 (0%)] [Best epoch:2 Loss:2.287091] [Best step:40 Loss:2.298239] [41.18s 163.9s] Epoch Status
[E: 3/100] [S: 45/100000] [Train: Loss:2.438657 5/10 (50%)] [Val: Loss:2.293576 0/10000 (0%)] [Best epoch:2 Loss:2.287091] [Best step:45 Loss:2.293576] [41.18s 163.9s] Step Status
[E: 3/100] [S: 50/100000] [Train: Loss:2.246784 10/10 (100%)] [Val: Loss:2.264124 0/10000 (0%)] [Best epoch:3 Loss:2.264124] [Best step:50 Loss:2.264124] [41.18s 163.9s] Step Status
[E: 3/100] [S: 50/100000] [Train: Loss:2.246784 10/10 (100%)] [Val: Loss:2.264124 0/10000 (0%)] [Best epoch:3 Loss:2.260864] [Best step:50 Loss:2.264124] [41.46s 205.3s] Epoch Status
[E: 4/100] [S: 55/100000] [Train: Loss:2.416693 5/10 (50%)] [Val: Loss:2.260192 0/10000 (0%)] [Best epoch:4 Loss:2.260192] [Best step:55 Loss:2.260192] [41.46s 205.3s] Step Status
[E: 4/100] [S: 60/100000] [Train: Loss:2.222714 10/10 (100%)] [Val: Loss:2.227812 0/10000 (0%)] [Best epoch:4 Loss:2.227812] [Best step:60 Loss:2.227812] [41.46s 205.3s] Step Status
[E: 4/100] [S: 60/100000] [Train: Loss:2.222714 10/10 (100%)] [Val: Loss:2.227812 0/10000 (0%)] [Best epoch:4 Loss:2.227812] [Best step:60 Loss:2.227812] [41.39s 246.7s] Epoch Status
[E: 5/100] [S: 65/100000] [Train: Loss:2.408526 5/10 (50%)] [Val: Loss:2.221667 0/10000 (0%)] [Best epoch:5 Loss:2.221667] [Best step:65 Loss:2.221667] [41.39s 246.7s] Step Status
[E: 5/100] [S: 70/100000] [Train: Loss:2.190458 10/10 (100%)] [Val: Loss:2.185826 0/10000 (0%)] [Best epoch:5 Loss:2.185826] [Best step:70 Loss:2.185826] [41.39s 246.7s] Step Status
[E: 5/100] [S: 70/100000] [Train: Loss:2.190458 10/10 (100%)] [Val: Loss:2.185826 0/10000 (0%)] [Best epoch:5 Loss:2.185826] [Best step:70 Loss:2.185826] [41.28s 288.0s] Epoch Status
[E: 6/100] [S: 75/100000] [Train: Loss:2.313870 5/10 (50%)] [Val: Loss:2.186711 0/10000 (0%)] [Best epoch:5 Loss:2.185826] [Best step:70 Loss:2.185826] [41.28s 288.0s] Step Status
[E: 6/100] [S: 80/100000] [Train: Loss:2.149164 10/10 (100%)] [Val: Loss:2.153440 0/10000 (0%)] [Best epoch:6 Loss:2.153440] [Best step:80 Loss:2.153440] [41.28s 288.0s] Step Status
[E: 6/100] [S: 80/100000] [Train: Loss:2.149164 10/10 (100%)] [Val: Loss:2.153440 0/10000 (0%)] [Best epoch:6 Loss:2.150477] [Best step:80 Loss:2.153440] [39.34s 327.3s] Epoch Status
[E: 7/100] [S: 85/100000] [Train: Loss:2.325105 5/10 (50%)] [Val: Loss:2.165497 0/10000 (0%)] [Best epoch:6 Loss:2.150477] [Best step:80 Loss:2.153440] [39.34s 327.3s] Step Status
[E: 7/100] [S: 90/100000] [Train: Loss:2.171192 10/10 (100%)] [Val: Loss:2.125976 0/10000 (0%)] [Best epoch:7 Loss:2.125976] [Best step:90 Loss:2.125976] [39.34s 327.3s] Step Status
[E: 7/100] [S: 90/100000] [Train: Loss:2.171192 10/10 (100%)] [Val: Loss:2.125976 0/10000 (0%)] [Best epoch:7 Loss:2.125976] [Best step:90 Loss:2.125976] [41.46s 368.8s] Epoch Status
[E: 8/100] [S: 95/100000] [Train: Loss:2.307920 5/10 (50%)] [Val: Loss:2.147382 0/10000 (0%)] [Best epoch:7 Loss:2.125976] [Best step:90 Loss:2.125976] [41.46s 368.8s] Step Status
[E: 8/100] [S: 100/100000] [Train: Loss:2.107451 10/10 (100%)] [Val: Loss:2.116395 0/10000 (0%)] [Best epoch:8 Loss:2.116395] [Best step:100 Loss:2.116395] [41.46s 368.8s] Step Status
[E: 8/100] [S: 100/100000] [Train: Loss:2.107451 10/10 (100%)] [Val: Loss:2.116395 0/10000 (0%)] [Best epoch:8 Loss:2.115855] [Best step:100 Loss:2.116395] [41.35s 410.1s] Epoch Status
[E: 9/100] [S: 105/100000] [Train: Loss:2.316520 5/10 (50%)] [Val: Loss:2.133158 0/10000 (0%)] [Best epoch:8 Loss:2.115855] [Best step:100 Loss:2.116395] [41.35s 410.1s] Step Status
[E: 9/100] [S: 110/100000] [Train: Loss:2.101270 10/10 (100%)] [Val: Loss:2.119242 0/10000 (0%)] [Best epoch:8 Loss:2.115855] [Best step:100 Loss:2.116395] [41.35s 410.1s] Step Status
[E: 9/100] [S: 110/100000] [Train: Loss:2.101270 10/10 (100%)] [Val: Loss:2.119242 0/10000 (0%)] [Best epoch:9 Loss:2.107559] [Best step:100 Loss:2.116395] [41.22s 451.4s] Epoch Status
[E: 10/100] [S: 115/100000] [Train: Loss:2.301298 5/10 (50%)] [Val: Loss:2.107354 0/10000 (0%)] [Best epoch:10 Loss:2.107354] [Best step:115 Loss:2.107354] [41.22s 451.4s] Step Status
[E: 10/100] [S: 120/100000] [Train: Loss:2.120560 10/10 (100%)] [Val: Loss:2.098027 0/10000 (0%)] [Best epoch:10 Loss:2.098027] [Best step:120 Loss:2.098027] [41.22s 451.4s] Step Status
[E: 10/100] [S: 120/100000] [Train: Loss:2.120560 10/10 (100%)] [Val: Loss:2.098027 0/10000 (0%)] [Best epoch:10 Loss:2.081124] [Best step:120 Loss:2.098027] [41.28s 492.6s] Epoch Status
[E: 11/100] [S: 125/100000] [Train: Loss:2.274269 5/10 (50%)] [Val: Loss:2.093999 0/10000 (0%)] [Best epoch:10 Loss:2.081124] [Best step:125 Loss:2.093999] [41.28s 492.6s] Step Status
[E: 11/100] [S: 130/100000] [Train: Loss:2.076419 10/10 (100%)] [Val: Loss:2.086819 0/10000 (0%)] [Best epoch:10 Loss:2.081124] [Best step:130 Loss:2.086819] [41.28s 492.6s] Step Status
[E: 11/100] [S: 130/100000] [Train: Loss:2.076419 10/10 (100%)] [Val: Loss:2.086819 0/10000 (0%)] [Best epoch:10 Loss:2.081124] [Best step:130 Loss:2.086819] [41.07s 533.7s] Epoch Status
[E: 12/100] [S: 135/100000] [Train: Loss:2.307883 5/10 (50%)] [Val: Loss:2.092391 0/10000 (0%)] [Best epoch:10 Loss:2.081124] [Best step:130 Loss:2.086819] [41.07s 533.7s] Step Status
[E: 12/100] [S: 140/100000] [Train: Loss:2.105602 10/10 (100%)] [Val: Loss:2.056846 0/10000 (0%)] [Best epoch:12 Loss:2.056846] [Best step:140 Loss:2.056846] [41.07s 533.7s] Step Status
[E: 12/100] [S: 140/100000] [Train: Loss:2.105602 10/10 (100%)] [Val: Loss:2.056846 0/10000 (0%)] [Best epoch:12 Loss:2.056846] [Best step:140 Loss:2.056846] [40.88s 574.6s] Epoch Status
[E: 13/100] [S: 145/100000] [Train: Loss:2.229566 5/10 (50%)] [Val: Loss:2.066366 0/10000 (0%)] [Best epoch:12 Loss:2.056846] [Best step:140 Loss:2.056846] [40.88s 574.6s] Step Status
[E: 13/100] [S: 150/100000] [Train: Loss:2.048885 10/10 (100%)] [Val: Loss:2.070952 0/10000 (0%)] [Best epoch:12 Loss:2.056846] [Best step:140 Loss:2.056846] [40.88s 574.6s] Step Status
[E: 13/100] [S: 150/100000] [Train: Loss:2.048885 10/10 (100%)] [Val: Loss:2.070952 0/10000 (0%)] [Best epoch:13 Loss:2.052732] [Best step:140 Loss:2.056846] [40.91s 615.5s] Epoch Status
[E: 14/100] [S: 155/100000] [Train: Loss:2.234232 5/10 (50%)] [Val: Loss:2.075228 0/10000 (0%)] [Best epoch:13 Loss:2.052732] [Best step:140 Loss:2.056846] [40.91s 615.5s] Step Status
[E: 13/100] [S: 155/100000] [Train: Loss:2.257734 5/10 (50%)] [Val: Loss:2.064717] [Best epoch:13 Loss:2.052732] [Best step:140 Loss:2.056846] [40.91s 615.5s] Step Status
[E: 13/100] [S: 160/100000] [Train: Loss:2.035141 10/10 (100%)] [Val: Loss:2.039078] [Best epoch:13 Loss:2.039078] [Best step:160 Loss:2.039078] [40.91s 615.5s] Step Status
[E: 13/100] [S: 160/100000] [Train: Loss:2.035141 10/10 (100%)] [Val: Loss:2.039078] [Best epoch:13 Loss:2.029834] [Best step:160 Loss:2.039078] [42.38s 657.9s] Epoch Status
[E: 14/100] [S: 165/100000] [Train: Loss:2.241119 5/10 (50%)] [Val: Loss:2.037968] [Best epoch:13 Loss:2.029834] [Best step:165 Loss:2.037968] [42.38s 657.9s] Step Status
[E: 14/100] [S: 170/100000] [Train: Loss:2.031683 10/10 (100%)] [Val: Loss:2.042479] [Best epoch:13 Loss:2.029834] [Best step:165 Loss:2.037968] [42.38s 657.9s] Step Status
[E: 14/100] [S: 170/100000] [Train: Loss:2.031683 10/10 (100%)] [Val: Loss:2.042479] [Best epoch:13 Loss:2.029834] [Best step:165 Loss:2.037968] [40.78s 698.6s] Epoch Status
[E: 15/100] [S: 175/100000] [Train: Loss:2.227793 5/10 (50%)] [Val: Loss:2.041410] [Best epoch:13 Loss:2.029834] [Best step:165 Loss:2.037968] [40.78s 698.6s] Step Status
[E: 15/100] [S: 180/100000] [Train: Loss:2.041359 10/10 (100%)] [Val: Loss:2.021471] [Best epoch:15 Loss:2.021471] [Best step:180 Loss:2.021471] [40.78s 698.6s] Step Status
[E: 15/100] [S: 180/100000] [Train: Loss:2.041359 10/10 (100%)] [Val: Loss:2.021471] [Best epoch:15 Loss:2.016103] [Best step:180 Loss:2.021471] [41.15s 739.8s] Epoch Status
[E: 16/100] [S: 185/100000] [Train: Loss:2.255320 5/10 (50%)] [Val: Loss:2.027396] [Best epoch:15 Loss:2.016103] [Best step:180 Loss:2.021471] [41.15s 739.8s] Step Status
[E: 16/100] [S: 190/100000] [Train: Loss:2.016954 10/10 (100%)] [Val: Loss:2.019632] [Best epoch:15 Loss:2.016103] [Best step:190 Loss:2.019632] [41.15s 739.8s] Step Status
[E: 16/100] [S: 190/100000] [Train: Loss:2.016954 10/10 (100%)] [Val: Loss:2.019632] [Best epoch:15 Loss:2.016103] [Best step:190 Loss:2.019632] [40.85s 780.6s] Epoch Status
[E: 17/100] [S: 195/100000] [Train: Loss:2.210386 5/10 (50%)] [Val: Loss:2.018931] [Best epoch:15 Loss:2.016103] [Best step:195 Loss:2.018931] [40.85s 780.6s] Step Status
[E: 17/100] [S: 200/100000]
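
To plot or compare the losses in a log like the one above, the status lines can be parsed with a regular expression. The following is a minimal sketch that assumes the bracketed field layout seen in this output. The names LINE_RE and parse_line are hypothetical, and truncated lines such as the last one are skipped rather than guessed at.

```python
import re

# Matches the leading fields of a status line: epoch, step, train loss, val loss.
LINE_RE = re.compile(
    r"\[E:\s*(?P<epoch>\d+)/\d+\]\s*"
    r"\[S:\s*(?P<step>\d+)/\d+\]\s*"
    r"\[Train: Loss:(?P<train_loss>\d+\.\d+)[^\]]*\]\s*"
    r"\[Val: Loss:(?P<val_loss>\d+\.\d+)[^\]]*\]"
)


def parse_line(line: str):
    """Return a dict of parsed fields, or None if the line is incomplete."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "epoch": int(match.group("epoch")),
        "step": int(match.group("step")),
        "train_loss": float(match.group("train_loss")),
        "val_loss": float(match.group("val_loss")),
    }


sample = ("[E: 3/100] [S: 45/100000] [Train: Loss:2.438657 5/10 (50%)] "
          "[Val: Loss:2.293576 0/10000 (0%)] [Best epoch:2 Loss:2.287091] "
          "[Best step:45 Loss:2.293576] [41.18s 163.9s] Step Status")
record = parse_line(sample)
truncated = parse_line("[E: 17/100] [S: 200/100000]")
print(record, truncated)
```

The same pattern also accepts the shorter "[Val: Loss:...]" form that appears later in the log, because the trailing counts inside the brackets are optional.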

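This article takes an in-depth look at how recurrent neural networks (RNNs) and long short-term memory networks (LSTMs) work. It uses examples to show how both kinds of network handle sequence data, and it provides the detailed network structure, configuration, and training process, including the loss function, the choice of optimizer, and the early stopping strategy.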