
nlp
sinat_24395003
First learn to use the wheel, then learn how wheels are built, then build your own.
Articles in this column
Crudely filtering similar texts with the TF of TF-IDF, part 2 (computational performance optimization)
from sklearn.feature_extraction.text import TfidfVectorizer from sklearn.metrics.pairwise import linear_kernel import numpy as np all_list= ['大雨預報1:16pm:大雨正影響台北東部,市民應提高警覺', '大雨預報1:02pm:大雨正影響台北東部,市民應提高警覺', '大雨預報12:35pm:大雨正影響台北東部,市民應提高警覺', '大雨預報3:46pm:未.
Original · 2020-09-18 13:11:08 · 374 views · 0 comments
Crudely filtering similar texts with the TF of TF-IDF: the process
from sklearn.feature_extraction.text import TfidfVectorizer from sklearn.metrics.pairwise import linear_kernel import numpy as np all_list= ['大雨預報1:16pm:大雨正影響台北東部,市民應提高警覺', '大雨預報1:02pm:大雨正影響台北東部,市民應提高警覺', '大雨預報12:35pm:大雨正影響台北東部,市民應提高警覺', '大雨預報3:46pm:未.
Original · 2020-09-17 14:55:07 · 265 views · 0 comments
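As a rough illustration of the idea named in the title, the sketch below drops near-duplicate texts by comparing character-count vectors with cosine similarity. It deliberately uses only the Python standard library instead of the `TfidfVectorizer` and `linear_kernel` calls shown in the excerpt, so it is not the post's code. The function names, the 0.8 threshold, and the last sample string are assumptions made for illustration; the first three strings come from the excerpt.

```python
import math
from collections import Counter


def tf_vector(text):
    """Character-level term-frequency vector, L2-normalised (hypothetical helper)."""
    counts = Counter(text)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}  # empty text has no direction; treat it as similar to nothing
    return {ch: c / norm for ch, c in counts.items()}


def cosine(a, b):
    """Dot product of two normalised sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(w * b.get(ch, 0.0) for ch, w in a.items())


def filter_similar(texts, threshold=0.8):
    """Keep a text only if it is below the threshold against every text kept so far."""
    kept, kept_vecs = [], []
    for text in texts:
        vec = tf_vector(text)
        if all(cosine(vec, other) < threshold for other in kept_vecs):
            kept.append(text)
            kept_vecs.append(vec)
    return kept


texts = [
    '大雨預報1:16pm:大雨正影響台北東部,市民應提高警覺',
    '大雨預報1:02pm:大雨正影響台北東部,市民應提高警覺',
    '大雨預報12:35pm:大雨正影響台北東部,市民應提高警覺',
    'Stock market closes higher on Friday',  # assumed, unrelated sample
]
unique_texts = filter_similar(texts)
print(unique_texts)
```

The pairwise loop is quadratic in the number of kept texts, which is the kind of cost a performance-focused follow-up would aim to reduce, for instance by computing similarities in matrix form.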
Building a sampler for imbalanced samples: Imbalanced Dataset Sampler
from fastNLP.io import SST2Pipe from fastNLP import DataSetIter from torchsampler import ImbalancedDatasetSampler pipe = SST2Pipe() databundle = pipe.process_from_file() vocab = databundle.vocabs['words'] print(databundle) print(databundle.datasets['train.
Original · 2020-08-12 10:06:51 · 1545 views · 4 comments
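The general idea behind rebalancing a skewed dataset at sampling time can be sketched without any framework: give each example a weight inversely proportional to how often its label occurs, then draw indices with replacement using those weights. The snippet below is a minimal standard-library sketch of that idea, not the `torchsampler` implementation; the function names, the toy 90/10 label split, and the seed are assumptions.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each example by 1 / (number of examples sharing its label)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def sample_indices(labels, num_samples, seed=0):
    """Draw indices with replacement so every class is equally likely overall."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# Toy data: 90 examples of class 0, 10 examples of class 1 (assumed).
labels = [0] * 90 + [1] * 10
indices = sample_indices(labels, num_samples=1000)
resampled = Counter(labels[i] for i in indices)
print(resampled)
```

With these weights each class carries the same total probability mass, so the resampled label counts come out roughly even even though the raw data is 9:1.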
Understanding the pad mechanism of gluonnlp.data.batchify
import math import mxnet as mx import numpy as np import warnings def _pad_arrs_to_max_length(arrs, pad_axis, pad_val, use_shared_mem, dtype, round_to=None): """Inner Implementation of the Pad batchify Pads the arrays in the [arr, arr] list so that the padded dimension equals the maximum size along that axis Parameters .
Original · 2020-06-09 21:38:06 · 500 views · 0 comments
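The behaviour described in the docstring above, padding every array in a batch up to the longest length on the padded axis, can be shown on plain Python lists. This is a simplified one-dimensional sketch, not the gluonnlp code: `pad_to_max_length` is a hypothetical name, and the handling of `round_to` (rounding the target length up to a multiple) is an assumption based on the parameter name in the excerpt's signature.

```python
import math


def pad_to_max_length(seqs, pad_val=0, round_to=None):
    """Pad each sequence with pad_val up to the longest length in the batch.

    Assumes seqs is a non-empty list of 1-D sequences.
    Returns the padded batch and the original lengths.
    """
    lengths = [len(s) for s in seqs]
    max_len = max(lengths)
    if round_to is not None:
        # Assumed behaviour: round the target length up to a multiple of round_to.
        max_len = round_to * math.ceil(max_len / round_to)
    padded = [list(s) + [pad_val] * (max_len - len(s)) for s in seqs]
    return padded, lengths


batch = [[1, 2, 3], [4], [5, 6]]
padded, lengths = pad_to_max_length(batch)
print(padded)   # [[1, 2, 3], [4, 0, 0], [5, 6, 0]]
print(lengths)  # [3, 1, 2]
```

Returning the original lengths alongside the padded batch lets downstream code mask out the padding positions.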
Implementing CBOW in PyTorch
Code source, with a few Chinese annotations added, for study purposes only: https://github.com/joosthub/PyTorchNLPBook/blob/master/chapters/chapter_5/5_2_CBOW/5_2_Continuous_Bag_of_Words_CBOW.ipynb import json import os from argparse import Namespace from tq...
Original · 2019-12-19 18:58:00 · 1344 views · 0 comments
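CBOW predicts a target word from the words surrounding it, so the first step in any implementation is turning a token sequence into (context, target) pairs. The sketch below shows only that data-preparation step in plain Python; it is not taken from the linked notebook, and `cbow_pairs`, the window size, and the sample sentence are assumptions.

```python
def cbow_pairs(tokens, window=2):
    """Build (context, target) pairs: the context is up to `window` tokens on each side."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip a lone token that has no neighbours
            pairs.append((context, target))
    return pairs


tokens = "the quick brown fox".split()
pairs = cbow_pairs(tokens, window=1)
for context, target in pairs:
    print(context, "->", target)
```

In a model, the context tokens would be mapped to embeddings and combined (summed or averaged) before a classifier predicts the target.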
Generating chapter 40 of The Legend of the Condor Heroes with a 3-layer LSTM + word2vec + RNN API + beam search
Based on https://github.com/PacktPublishing/Natural-Language-Processing-with-TensorFlow/blob/master/ch8/lstm_word2vec_rnn_api.ipynb embedding.py import re import os import numpy as np import collections im...
Original · 2019-07-04 17:02:14 · 755 views · 0 comments
Hand-coding a minimal LSTM with beam search to generate chapter 40 of The Legend of the Condor Heroes
# Reference: https://github.com/PacktPublishing/Natural-Language-Processing-with-TensorFlow/blob/master/ch8/lstms_for_text_generation.ipynb import re import collections import numpy as np import tens...
Original · 2019-06-29 14:31:04 · 680 views · 0 comments
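Beam search, mentioned in the two text-generation titles above, keeps the few highest-scoring partial sequences at each step instead of committing to the single most likely next token. The following is a generic sketch over a toy lookup table standing in for a language model; it is not the code from either post, and `beam_search`, `toy_next_probs`, and the probability table are assumptions for illustration.

```python
import math


def beam_search(next_probs, start, steps, beam_width=2):
    """Return the top `beam_width` sequences as (tokens, log_probability) pairs.

    next_probs(seq) must return a dict mapping each candidate next token
    to its probability given the sequence so far.
    """
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for token, prob in next_probs(seq).items():
                if prob > 0:
                    candidates.append((seq + [token], score + math.log(prob)))
        if not candidates:
            break  # no sequence can be extended further
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Toy model (assumed): the next token depends only on the previous one.
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "w": 0.1},
}


def toy_next_probs(seq):
    return TABLE.get(seq[-1], {})


best = beam_search(toy_next_probs, "<s>", steps=2, beam_width=2)
greedy = beam_search(toy_next_probs, "<s>", steps=2, beam_width=1)
print(best[0])
print(greedy[0])
```

In this toy table the greedy search (beam width 1) commits to "a" first and ends with probability 0.3, while a beam of width 2 keeps "b" alive and finds the sequence ending in "z" with probability 0.36.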