Keras embedding layer causes a dimensionality problem

This post describes a sequence-to-sequence autoencoder built with the Keras functional API, which includes an embedding layer for handling sequence data. It shows the full model code, including the encoder and decoder parts, and how the model is trained.

I am now trying to include an embedding layer in my sequence-to-sequence autoencoder, built with the Keras functional API.

The model code looks like this:

```python
from keras.layers import Input, Embedding, LSTM, Dense, TimeDistributed
from keras.models import Model

#Encoder inputs
encoder_inputs = Input(shape=(None,))

#Embedding
embedding_layer = Embedding(input_dim=n_tokens, output_dim=2)
encoder_embedded = embedding_layer(encoder_inputs)

#Encoder LSTM
encoder_outputs, state_h, state_c = LSTM(n_hidden, return_state=True)(encoder_embedded)
lstm_states = [state_h, state_c]

#Decoder inputs
decoder_inputs = Input(shape=(None,))

#Embedding
decoder_embedded = embedding_layer(decoder_inputs)

#Decoder LSTM
decoder_lstm = LSTM(n_hidden, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_embedded, initial_state=lstm_states)

#Dense + Time
decoder_dense = TimeDistributed(Dense(n_tokens, activation='softmax'), input_shape=(None, None, 256))
#decoder_dense = Dense(n_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
```
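For context on the shapes involved: an `Embedding` layer is essentially a lookup table that maps integer token ids of shape `(batch, seq_len)` to dense vectors of shape `(batch, seq_len, output_dim)`, adding one trailing dimension. A plain NumPy sketch (toy sizes, not from the post) illustrates this:

```python
import numpy as np

# Hypothetical toy sizes, chosen only for illustration
n_tokens, emb_dim, batch, seq_len = 10, 2, 3, 5

# An embedding is a lookup table: integer ids in, dense vectors out
table = np.random.randn(n_tokens, emb_dim)          # like the Embedding weights
ids = np.random.randint(0, n_tokens, (batch, seq_len))  # like encoder_inputs
embedded = table[ids]                               # like encoder_embedded

print(embedded.shape)  # (3, 5, 2) -> (batch, seq_len, output_dim)
```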

The model is trained like this:

```python
model.fit([X, y], X, epochs=n_epoch, batch_size=n_batch)
```

X and y both have shape (n_samples, n_seq_len).

The model compiles fine, but during training I always get:

ValueError: Error when checking target: expected time_distributed_1 to have 3 dimensions, but got array with shape (n_samples, n_seq_len)
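The shape mismatch behind this error can be reproduced with plain NumPy: the `TimeDistributed(Dense(n_tokens, activation='softmax'))` head produces output of shape `(n_samples, n_seq_len, n_tokens)`, and `categorical_crossentropy` therefore expects a one-hot-encoded 3-D target of that same shape, while `X` here is a 2-D array of integer ids. A sketch of the expected target shape (toy sizes, not from the post):

```python
import numpy as np

# Hypothetical toy sizes, chosen only for illustration
n_samples, n_seq_len, n_tokens = 4, 6, 10
X = np.random.randint(0, n_tokens, (n_samples, n_seq_len))  # 2-D integer targets

# categorical_crossentropy wants one-hot targets matching the 3-D
# (n_samples, n_seq_len, n_tokens) output of TimeDistributed(Dense)
X_onehot = np.eye(n_tokens)[X]

print(X.shape)         # (4, 6)      -> what the fit() call passes
print(X_onehot.shape)  # (4, 6, 10)  -> what the loss expects
```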
