1. In the embedding layer of a CNN I came across the following code:
embedding=tf.get_variable('embedding',[3,2])
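This creates a variable shaped like a 3x2 matrix, but the variable is not yet initialized, so it has to be initialized or assigned a value before it can be used.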
embedding=tf.get_variable('embedding',[3,2])
session = tf.Session()
session.run(tf.global_variables_initializer())
The last two lines perform the initialization: the session is created first, and the variable initializer is then run in it.
2. Defining a variable in TensorFlow
For example:
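x = tf.Variable(1)
This defines a variable x; the 1 is a rank-0 tensor (a scalar).
At this point, however, only a structure has been defined: a node of type variable named x with initial value 1. x does not yet hold the value 1 in memory. TensorFlow (1.x graph mode) runs as a graph: the value of a node is computed only when sess.run() is called with that node as the fetch. Everything before that call merely builds the graph and assigns no actual values. Calling tf.Variable(1) therefore only defines the structure (type variable, initial value 1); the variable receives that value only when the variable initializer is run.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(x))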
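The define-first, initialize-later behaviour described above can be pictured with a toy model in plain Python. This is only a sketch: ToyVariable and ToySession are hypothetical names made up for illustration, and nothing here is TensorFlow code. In real TensorFlow 1.x, fetching an uninitialized variable raises a FailedPreconditionError; the toy version raises a RuntimeError instead.

```python
# Toy model of graph-mode variables; NOT TensorFlow code.
# ToyVariable and ToySession are hypothetical names used only for illustration.

class ToyVariable:
    def __init__(self, initial_value):
        # Building the "graph": only the recipe (the initial value) is recorded.
        self.initial_value = initial_value
        self.value = None  # nothing is stored until the initializer runs


class ToySession:
    def __init__(self, variables):
        self.variables = variables

    def run_initializer(self):
        # Plays the role of sess.run(tf.global_variables_initializer()).
        for var in self.variables:
            var.value = var.initial_value

    def run(self, var):
        # Plays the role of sess.run(x): fetch the current value of a node.
        if var.value is None:
            raise RuntimeError("attempting to use an uninitialized variable")
        return var.value


x = ToyVariable(1)      # structure only: type "variable", initial value 1
sess = ToySession([x])

try:
    sess.run(x)         # fails: x has not been initialized yet
except RuntimeError as err:
    print("error:", err)

sess.run_initializer()
print(sess.run(x))      # the initial value is now actually stored
```

The point of the sketch is the ordering: constructing the variable records what its value should be, and only the initializer step makes that value available to fetch.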
3.
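4. Notes on using tf.nn.embedding_lookup(), which I came across while reading a TextCNN project. The classification samples look like the following; this is only a very small example.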
体育 鲍勃库西奖归谁属? NCAA最强控卫是坎巴还是弗神新浪体育讯如今,本赛季的NCAA进入到了末段,各项奖项的评选结果也即将出炉,其中评选最佳控卫的鲍勃-库西奖就将在下周最终四强战时公布,
鲍勃-库西奖是由奈史密斯篮球名人堂提供,旨在奖励年度最佳大学控卫。最终获奖的球员也即将在以下几名热门人选中产生。〈〈〈 NCAA疯狂三月专题主页上线,点击链接查看精彩内容吉梅尔-弗雷戴特,
杨百翰大学“弗神”吉梅尔-弗雷戴特一直都备受关注,他不仅仅是一名射手,他会用“终结对手脚踝”
娱乐 林俊杰为电影《夏日乐悠悠》献首唱HOLD住全场新浪娱乐讯 近日,林俊杰(微博)在成都举办歌友会,现场演唱了其为由马楚成(微博)执导的电影《夏日乐悠悠》创作的主题曲《LOVE YOU YOU》,
这是林俊杰首次公开演唱这首新创作的歌曲却在现场出乎意料的引发全场歌迷大合唱场面感人。目前电影《夏日乐悠悠》计划于9月30日在全国上映,该片由当红明星Angelababy(微博)、彭于晏(微博)、朱雨辰(微博)、周扬(微博)
游戏 《海湾战争-空中王者》歼-20保卫升空一款绚丽的战斗画面,横越海、陆、空三大战场,加上紧张的剧情,就构成了今天小编给各大玩家介绍的这款射击游戏:《海湾战争-空中王者》。剧情介绍:海湾地区储存了丰富的石油和天然气,加上地处亚非欧交界,历为兵家必争之地。经过新一次全球经济危机之后,海湾战争一触即发,中国也被卷入纷争之中。为了掌握实地军情,我军派出了有空中王者之称的歼-20,远赴海湾地区……更多精彩震撼感觉,立即下载该款游戏尽情体验吧。玩家交流才是王道,讯易游戏玩家交流中心
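As background for why samples like these have to be turned into integer ids: for a single embedding matrix, tf.nn.embedding_lookup(params, ids) selects the rows of params indexed by ids. The sketch below imitates that row selection with plain Python lists. lookup_rows is a hypothetical helper, and the table values and ids are made up; it is a stand-in for the idea, not the TensorFlow implementation.

```python
# Plain-Python stand-in for the row selection done by tf.nn.embedding_lookup;
# lookup_rows is a hypothetical helper, not a TensorFlow function.

def lookup_rows(embedding, ids):
    """Return the embedding row for every id in a batch of id sequences."""
    return [[embedding[i] for i in sequence] for sequence in ids]


# A 3x2 table: 3 vocabulary entries, each mapped to a 2-dimensional vector.
embedding = [
    [0.1, 0.2],  # id 0
    [0.3, 0.4],  # id 1
    [0.5, 0.6],  # id 2
]

# Two sequences of character ids (made-up values).
ids = [[0, 2], [1, 1]]

vectors = lookup_rows(embedding, ids)
print(vectors)
# A (2, 2) batch of ids becomes a (2, 2, 2) batch of vectors: one row per id.
```

Each character therefore needs an integer id first, which is what the reading and vocabulary steps prepare.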
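a: Now read the data; the samples need to be vectorized. There are three labels above, and the first line under each label is read, using the following code: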
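def read_file(filename):
    """Read the data file; return the character lists and their labels."""
    # open_file and native_content are assumed to be helpers defined elsewhere in the project.
    contents, labels = [], []
    with open_file(filename) as f:
        for line in f:
            try:
                # Each line is expected to be "label<TAB>content".
                label, content = line.strip().split('\t')
                if content:
                    # list(...) splits the text into single characters.
                    contents.append(list(native_content(content)))
                    labels.append(native_content(label))
            except ValueError:
                # Skip lines that do not split into exactly a label and a content field.
                pass
    return contents, labels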
After this code runs, labels contains ['体育', '娱乐', '游戏'].
`
contents: [
['鲍', '勃', '库', '西', '奖', '归', '谁', '属', '?', ' ', 'N', 'C', 'A', 'A', '最', '强', '控', '卫', '是', '坎', '巴', '还', '是', '弗', '神', '新', '浪', '体', '育', '讯', '如', '今', ',', '本', '赛', '季', '的', 'N', 'C', 'A', 'A', '进', '入', '到', '了', '末', '段', ',', '各', '项', '奖', '项', '的', '评', '选', '结', '果', '也', '即', '将', '出', '炉', ',', '其', '中', '评', '选', '最', '佳', '控', '卫', '的', '鲍', '勃', '-', '库', '西', '奖', '就', '将', '在', '下', '周', '最', '终', '四', '强', '战', '时', '公', '布', ','],
['林', '俊', '杰', '为', '电', '影', '《', '夏', '日', '乐', '悠', '悠', '》', '献', '首', '唱', 'H', 'O', 'L', 'D', '住', '全', '场', '新', '浪', '娱', '乐', '讯', ' ', '近', '日', ',', '林', '俊', '杰', '(', '微', '博', ')', '在', '成', '都', '举', '办', '歌', '友', '会', ',', '现', '场', '演', '唱', '了', '其', '为', '由', '马', '楚', '成', '(', '微', '博', ')', '执', '导', '的', '电', '影', '《', '夏', '日', '乐', '悠', '悠', '》', '创', '作', '的', '主', '题', '曲', '《', 'L', 'O', 'V', 'E', ' ', 'Y', 'O', 'U', ' ', 'Y', 'O', 'U', '》', ','],
['《', '海', '湾', '战', '争', '-', '空', '中', '王', '者', '》', '歼', '-', '2', '0', '保', '卫', '升', '空', '一', '款', '绚', '丽', '的', '战', '斗', '画', '面', ',', '横', '越', '海', '、', '陆', '、', '空', '三', '大', '战', '场', ',', '加', '上', '紧', '张', '的', '剧', '情', ',', '就', '构', '成', '了', '今', '天', '小', '编', '给', '各', '大', '玩', '家', '介', '绍', '的', '这', '款', '射', '击', '游', '戏', ':', '《', '海', '湾', '战', '争', '-', '空', '中', '王', '者', '》', '。', '剧', '情', '介', '绍', ':', '海', '湾', '地', '区', '储', '存', '了', '丰', '富', '的', '石', '油', '和', '天', '然', '气', ',', '加', '上', '地', '处', '亚', '非', '欧', '交', '界', ',', '历', '为', '兵', '家', '必', '争', '之', '地', '。', '经', '过', '新', '一', '次', '全', '球', '经', '济', '危', '机', '之', '后', ',', '海', '湾', '战', '争', '一', '触', '即', '发', ',', '中', '国', '也', '被', '卷', '入', '纷', '争', '之', '中', '。', '为', '了', '掌', '握', '实', '地', '军', '情', ',', '我', '军', '派', '出', '了', '有', '空', '中', '王', '者', '之', '称', '的', '歼', '-', '2', '0', ',', '远', '赴', '海', '湾', '地', '区', '…', '…', '更', '多', '精', '彩', '震', '撼', '感', '觉', ',', '立', '即', '下', '载', '该', '款', '游', '戏', '尽', '情', '体', '验', '吧', '。', '玩', '家', '交', '流', '才', '是', '王', '道', ',', '讯', '易', '游', '戏', '玩', '家', '交', '流', '中', '心']
]
The above is what contents holds: one list of single characters per sample.
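The parsing step can be tried without any files. The sketch below is a minimal stand-in, not the project's code: parse_lines is a hypothetical function that takes lines from memory instead of a file, the sample lines are shortened, and the third line is made up to show what happens to malformed input.

```python
# Self-contained sketch of the parsing step; parse_lines is a hypothetical
# stand-in for read_file that works on in-memory lines instead of a file.

def parse_lines(lines):
    """Split 'label<TAB>content' lines into character lists and labels."""
    contents, labels = [], []
    for line in lines:
        parts = line.strip().split('\t')
        if len(parts) != 2:
            continue  # skip lines that are not exactly label + content
        label, content = parts
        if content:
            contents.append(list(content))  # list(str) yields one item per character
            labels.append(label)
    return contents, labels


sample = [
    "体育\t鲍勃库西奖归谁属?\n",
    "娱乐\t林俊杰为电影\n",
    "这一行没有制表符\n",  # malformed: no tab, so it is skipped
]

contents, labels = parse_lines(sample)
print(labels)
print(contents[0])
```

Calling list() on a string is what makes this a character-level pipeline: every character, including punctuation and spaces, becomes its own token.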
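Next, build the vocabulary, or more precisely a character table, since the project I was reading does character-level text classification. The vocabulary can be built with the following code: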
def build_vocab(train_dir, vocab_dir, vocab_size=5000):
    # train_dir is the path to the sample file shown above
    data_train, _ = read_file(train_dir)
    # data_train is the contents list shown above
    all_data = []
    for content in data_train:  # each content is the text that follows a label
        print('content', content)
        all_data.extend(content)  # flatten: add the characters, not the list itself
    print('all_data:', all_data)
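The flattening done by extend can be checked on a small input. In the sketch below, data_train is a shortened, made-up stand-in for the real character lists. The counting step at the end is an assumption about how a vocabulary of at most vocab_size entries might be chosen (collections.Counter from the standard library); it is not taken from the code shown so far.

```python
from collections import Counter

# Shortened stand-in for data_train: one character list per sample.
data_train = [
    ['鲍', '勃', '库', '西', '奖'],
    ['林', '俊', '杰'],
    ['海', '湾', '战', '争', '战'],
]

all_data = []
for content in data_train:
    all_data.extend(content)  # extend adds the characters, not the list itself

print(all_data)
print(len(all_data))  # total number of characters across all samples

# Assumed next step (not shown in the code so far): keep the most frequent characters.
vocab_size = 5  # small value for the demo only
counter = Counter(all_data)
words = [char for char, _ in counter.most_common(vocab_size)]
print(words)
```

Using append instead of extend would give a list of three lists, which is just data_train again; extend is what produces the single flat sequence of characters.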
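The contents of all_data can be compared with data_train: the characters are the same, but all_data is one flat list instead of one list per sample.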
all_data:all_data: ['鲍', '勃', '库', '西', '奖', '归', '谁', '属', '?', ' ', 'N', 'C', 'A', 'A', '最', '强', '控', '卫', '是', '坎', '巴', '还', '是', '弗', '神', '新', '浪', '体', '育', '讯', '如', '今', ',', '本', '赛', '季', '的', 'N', 'C', 'A', 'A', '进', '入', '到', '了', '末', '段', ',', '各', '项', '奖', '项', '的', '评', '选', '结', '果', '也', '即', '将', '出', '炉', ',', '其', '中', '评', '选', '最', '佳', '控', '卫', '的', '鲍', '勃', '-', '库', '西', '奖', '就', '将', '在', '下', '周', '最', '终', '四', '强', '战', '时', '公', '布', ',', '林', '俊', '杰', '为', '电', '影', '《', '夏', '日', '乐', '悠', '悠', '》', '献', '首', '唱', 'H', 'O', 'L', 'D', '住', '全', '场', '新', '浪', '娱', '乐', '讯', ' ', '近', '日', ',', '林', '俊', '杰', '(', '微', '博', ')', '在', '成', '都', '举', '办', '歌', '友', '会', ',', '现', '场', '演', '唱', '了', '其', '为', '由', '马', '楚', '成', '(', '微', '博', ')', '执', '导', '的', '电', '影', '《', '夏', '日', '乐', '悠', '悠', '》', '创', '作', '的', '主', '题', '曲', '《', 'L', 'O', 'V', 'E', ' ', 'Y', 'O', 'U', ' ', 'Y', 'O', 'U', '》', ',', '《', '海', '湾', '战', '争', '-', '空', '中', '王', '者', '》', '歼', '-', '2', '0', '保', '卫', '升', '空', '一', '款',