Fixing an embedding error in TensorFlow

Latest recommended article published 2025-03-23 07:18:32

momaojia

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/momaojia/article/details/77533129

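This article describes an error encountered when using tf.contrib.layers.embed_sequence in TensorFlow and how to fix it. The error is caused by a vocabulary size that is too small; the fix is to add one to the vocabulary size so that every index falls within the valid range.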

Today, while using tf.contrib.layers.embed_sequence in TensorFlow to embed my inputs, I ran into the following error:

InvalidArgumentError (see above for traceback): indices[1,2] = 6 is not in [0, 6)
	 [[Node: EmbedSequence_8/embedding_lookup = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@EmbedSequence_8/embeddings"], validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](EmbedSequence_8/embeddings/read, EmbedSequence_8/embedding_lookup/indices)]]
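The message says that the id 6 at position [1,2] of the input was looked up in an embedding table with only 6 rows, whose valid ids form the half-open range [0, 6), that is, 0 through 5. The following is a minimal NumPy sketch of that bounds check. The `lookup` helper is hypothetical and only imitates the behaviour; it is not TensorFlow's actual implementation.

```python
import numpy as np

def lookup(table, ids):
    """Hypothetical helper: gather rows of `table`, rejecting out-of-range ids."""
    ids = np.asarray(ids)
    n_rows = table.shape[0]
    # Positions of all ids outside [0, n_rows)
    bad = np.argwhere((ids < 0) | (ids >= n_rows))
    if bad.size:
        pos = tuple(int(i) for i in bad[0])
        raise IndexError(
            "indices[%s] = %d is not in [0, %d)"
            % (",".join(map(str, pos)), ids[pos], n_rows)
        )
    return table[ids]

features = [[1, 2, 3], [4, 5, 6]]
table = np.zeros((6, 4))  # 6 rows, so only ids 0..5 are valid

try:
    lookup(table, features)
except IndexError as err:
    message = str(err)
    print(message)
```

With a table of 7 rows the same call succeeds, because id 6 then has a row to read from.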

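After some searching, I found that vocab_size has to be increased by 1. vocab_to_int assigns word ids starting from 1, and 0 is reserved as the padding value for reviews that have too few words. An embedding table with vocab_size rows only accepts ids in [0, vocab_size), so with 6 words the largest id is 6 and the table needs 7 rows. The code is as follows: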

   
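import tensorflow as tf  # TensorFlow 1.x; tf.contrib was removed in 2.x

# Word ids start at 1; id 0 is reserved for padding.
features = [[1, 2, 3], [4, 5, 6]]
n_words = 6
# Valid ids are [0, vocab_size), so the largest id (6) requires n_words + 1 rows.
outputs = tf.contrib.layers.embed_sequence(features, vocab_size=n_words + 1, embed_dim=4)

# The original passed an undefined `config`; the default session settings are enough here.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    a = sess.run(outputs)
    print(a)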

The resulting word vectors are:

[[[ 0.42639822 -0.45257723  0.44895023  0.17683214]
  [ 0.68834776  0.25755352  0.18518716 -0.36953419]
  [-0.20138246 -0.35034212  0.44844049  0.3326121 ]]

 [[-0.55106479 -0.64119202 -0.06463015 -0.68032914]
  [ 0.58467633  0.58155423  0.63106912  0.17282218]
  [ 0.46636218 -0.73744893  0.38337153  0.64258808]]]
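To make the off-by-one easier to see, here is a small sketch of how a vocab_to_int mapping that starts at 1 might be built and how the matching vocab_size follows from it. The example reviews, the `encode` helper, the frequency-based ordering, and the choice to pad on the left are all assumptions for illustration; the original only states that ids start at 1 and that 0 is used for padding.

```python
from collections import Counter

reviews = ["good movie", "bad movie plot", "good"]  # made-up example data

# Assign ids from 1 upward, most frequent word first; 0 stays free for padding.
counts = Counter(word for review in reviews for word in review.split())
vocab_to_int = {
    word: i for i, (word, _) in enumerate(counts.most_common(), start=1)
}

def encode(review, seq_len):
    """Hypothetical helper: map words to ids and left-pad with 0 up to seq_len."""
    ids = [vocab_to_int[w] for w in review.split()][:seq_len]
    return [0] * (seq_len - len(ids)) + ids

features = [encode(r, seq_len=3) for r in reviews]

# Ids run from 0 (padding) to len(vocab_to_int), so the table needs one extra row.
vocab_size = len(vocab_to_int) + 1

print(features)
print(vocab_size)
```

Passing `len(vocab_to_int)` instead of `vocab_size` would leave the word with the highest id without a row in the embedding table, which is the situation that produces the error above.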