frank_zhaojianbo - CSDN Blog

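Original: BERT Architecture Explained

BERT Architecture Explained. 1. BERT model structure. Figure 1: we load a 14-class BERT classification model and print its structure. https://blog.youkuaiyun.com/frank_zhaojianbo/article/details/107547338?spm=1001.2014.3001.5501 Figure 2 shows the structure of the BertForSequenceClassification model. The BERT model consists of two main parts: embeddings and encoder. We have already introduced the Transformer above, B...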

2022-02-04 00:06:49 7464 1

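Original: Transformer Explained

Transformer Explained. First, we look at the Transformer from a high-level view. Encoder. Attention mechanism. Query, key, and value vectors. Implementing self-attention with matrix operations. Multi-head attention. Representing the order of the sequence with positional encoding (positional embedding). Residual module. Layer normalization step: https://arxiv.org/abs/1607.06450 Decoder. Once the encoding phase is complete, the decoding phase begins. Each step of the decoding phase outputs one element of the output sequence (in this example...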

2022-02-03 21:40:29 1880

Original: Intent Recognition with BERT

import sys
import os
import copy
import json
import logging
import argparse
import torch
import numpy as np
from tqdm import tqdm, trange
import torch.nn as nn
from torchcrf import CRF
from torch.utils.data import TensorDataset
from seqeval.metrics import ...

2021-06-07 01:51:03 2169 3

Original: LightGBM Recommendation Model

import os
import pandas as pd
import numpy as np
from tqdm import tqdm
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics import accuracy_score
import time
import datetime
from scipy.sparse import hstack
from sklearn.model_se...

2021-06-06 11:21:20 373

Original: DeepFM Recommendation Model

import os
import pandas as pd
import numpy as np
from tqdm import tqdm
from tqdm.autonotebook import *
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
imp...

2021-06-06 11:11:34 425

Original: LSTM+Transformer Recommendation Model

import os
import pandas as pd
import numpy as np
from tqdm import tqdm
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics import accuracy_score
import time
import datetime
from scipy.sparse import hstack
from sklearn.model_se...

2021-06-06 11:08:20 4487 6

Original: LSTM Recommendation Model

import os
import pandas as pd
import numpy as np
from tqdm import tqdm
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics import accuracy_score
import time
import datetime
from scipy.sparse import hstack
from sklearn.model_se...

2021-06-06 11:01:41 863

Original: E-commerce Recommender System Architecture

E-commerce Recommender System Architecture
## Data toolkits
import numpy as np
import pandas as pd
from tqdm import tqdm
## String-processing toolkits
import string
import re
import gensim
from collections import Counter
import pickle
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import T...

2021-06-05 20:49:56 400 1

Original: Text Classification Model Comparison

Use kashgari to quickly build text classification models and compare the accuracy of each model.
import kashgari
from kashgari.embeddings import BERTEmbedding
# chinese_roberta_wwm_large
bert_embed = BERTEmbedding('chinese_roberta_wwm_large', task=kashgari.CLASSIFICATION, sequence_length=256)
Load the dataset
from kashgari.corp...

2021-06-05 18:00:26 691 1

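Original: Recommender System Framework Explained

Recommender System Framework Explained
1. Data preparation
The data for this project is user-behavior data collected from an e-commerce website. It mainly covers four kinds of user actions: search, click, order, and payment. The e-commerce data is saved in logs; we usually process it and then store it in a MySQL database.
Viewing the data with Spark
2. Recall and evaluation
## Data toolkits
import random
import numpy as np
import pandas as pd
from tqdm import tqdm
import collections
from collections import defaultdict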

2021-06-05 17:53:41 976

Original: FFM

An FFM example
Our data
import pandas as pd
df = pd.read_csv("./train.csv")
df
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked

2021-02-25 23:18:25 201

Original: FM and DeepFM

FM and DeepFM examples
FM internal structure
DeepFM internal structure
Our data
df = pd.read_csv("./train.csv")
df
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked

2021-02-25 23:13:33 313

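Original: Database-Backed Knowledge Graph Question Answering System

CCKS 2020: COVID-19 Knowledge Graph Construction and Question Answering Evaluation (4): COVID-19 Knowledge Graph Question Answering Evaluation
Link: https://www.biendata.xyz/competition/ccks_2020_7_4/evaluation/
Initialize the random seed
Import the required packages
import os
import re
import math
import torch
import random
import pickle
import numpy as np
import codecs as cs
import pandas as pd
import ...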

2020-09-16 14:26:05 1372 1

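Original: LSTM Part-of-Speech Tagging

LSTM Part-of-Speech Tagging (POS tagging)
How the code works: the model is extremely simple. The word vectors w are fed into an LSTM, the loss is computed against the corresponding tags, and the model weights are updated.
Reference paper: https://arxiv.org/pdf/1510.06168.pdf
import os
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.optim impo...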

2020-08-10 10:26:40 1955 3

Original: LSTM-Based English Translation System

LSTM-Based English Translation System
The code follows the paper: Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
Link: https://arxiv.org/pdf/1406.1078.pdf?source=post_page---------------------------
import torch
if torch.cuda.is_available():
    # Te...

2020-08-06 06:44:24 1825 3

Original: BERT-Based Chinese Text Question Answering System

BERT-Based Chinese Text Question Answering System
# -!- coding: utf-8 -!-
import torch
if torch.cuda.is_available():
    device = torch.device("cuda")
    print('there are %d GPU(s) available.' % torch.cuda.device_count())
    print('we will use the GPU: ', torch.cuda.get_device_name(0))
e...

2020-08-06 06:43:29 9107 7

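Translation: Multi-Head Attention Mechanism and Code Explained

Self-Attention
Say the following sentence is the input sentence we want to translate: "The animal didn't cross the street because it was too tired". What does "it" refer to in this sentence? Is it the street or the animal? This is a simple question for a human, but not so simple for an algorithm. When the model processes the word "it", self-attention allows it to associate "it" with "animal". As the model processes each word (each position in the input sequence), self-attention lets it look at the other positions in the input sequence for clues that help it better...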

2020-07-28 13:07:20 7173 1
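
The self-attention idea in the excerpt above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the three 2-d token vectors are hypothetical, hand-picked so that "it" points in nearly the same direction as "animal", and the same vectors are used as queries, keys, and values, whereas a real Transformer learns separate projection matrices for each.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

tokens = ["animal", "street", "it"]
# Hypothetical vectors (assumption, not learned embeddings).
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
outputs, weights = self_attention(vectors, vectors, vectors)
it_weights = dict(zip(tokens, weights[2]))
print(it_weights)
```

With these toy vectors, the row of weights for "it" puts more mass on "animal" than on "street", which is the kind of association the excerpt describes.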

Original: Binary Text Sentiment Classification (Attention Weighted Word Averaging Model)

Binary Text Sentiment Classification (Attention Weighted Word Averaging Model)
# -*- coding:utf-8 -*-
import random
from collections import defaultdict, Counter
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
def set_random_seed(seed):
    rand...

2020-07-24 16:03:30 1700

Original: BERT News Classification System

import torch
if torch.cuda.is_available():
    device = torch.device("cuda")
    print('There are %d GPU(s) available.' % torch.cuda.device_count())
    print('We will use the GPU:', torch.cuda.get_device_name(0))
else:
    print('No GPU available, using t...

2020-07-23 20:41:16 4014 1

Original: RNN, LSTM, and GRU in Code: A Simple Text Generation Task

RNN, LSTM, and GRU in Code: A Simple Text Generation Task
import torch
if torch.cuda.is_available():
    # Tell PyTorch to use the GPU.
    device = torch.device("cuda")
    print('There are %d GPU(s) available.' % torch.cuda.device_count())
    print('We will use the GPU:', torch...

2020-07-17 10:42:52 2648

Original: RNN, LSTM, and GRU Explained

2020-07-17 07:11:00 225

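Original: Simple Text Generation with LSTM

How the model works: an LSTM model uses the preceding words to predict the next word. Take "I like the world." as an example. We add <s> and </s> to the start and end of the sentence, where <s> marks the start and </s> marks the end. The input data is "<s> I like the world." and the labels are "I like the world. </s>".
Code:
import torch
if torch.cuda.is_available():
    # Tell PyTo...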

2020-07-12 14:33:14 1247
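
The shifted input/label scheme from the excerpt above can be sketched as follows. The helper name `make_lm_pair` and the whitespace tokenization are assumptions made for illustration; the post's own preprocessing is not visible in the excerpt.

```python
def make_lm_pair(sentence, bos="<s>", eos="</s>"):
    """Build next-word-prediction inputs and labels for one sentence.

    The labels are the inputs shifted left by one position, so the model
    at step i sees tokens[0..i] and is trained to predict tokens[i + 1].
    """
    tokens = [bos] + sentence.split() + [eos]
    return tokens[:-1], tokens[1:]

inputs, labels = make_lm_pair("I like the world.")
for x, y in zip(inputs, labels):
    print(f"{x!r} -> {y!r}")
```

Each input position is paired with the token that follows it, so "<s>" is paired with "I" and the final word is paired with "</s>".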

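Original: Text Summarization Task

A text summarization task in Keras. The first part walks through the code; the second part explains the underlying principles.
1. Code walkthrough
2. How UniLM works
3. How beam search works
1. Code walkthrough
First, install bert4keras==0.2.0:
!pip install bert4keras==0.2.0
Import the required packages:
from tqdm import tqdm
import os, json, codecs
from bert4keras.bert import load_pretrained_model
from bert4keras...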

2020-07-06 12:41:45 1996 3

Original: Entity Linking for Chinese Short Text

import torch
if torch.cuda.is_available():
    # Tell PyTorch to use the GPU.
    device = torch.device("cuda")
    print('There are %d GPU(s) available.' % torch.cuda.device_count())
    print('We will use the GPU:', torch.cuda.get_device_name(0))
else:
    ...

2020-06-20 22:39:50 4343 14

Original: BERT Named Entity Recognition

Load the named entities
import torch
import pandas as pd
import numpy as np
path = './'
comments = pd.read_csv(path + '英文命名实体信息.csv', encoding="latin1").fillna(method="ffill")
print('Total number of named entities: %d' % comments.shape[0])
Tags = list(set(comments['Tag']))
for tag in Tags:
    p...

2020-06-12 21:39:42 2597 1

Original: BERT Chinese Sentiment Analysis: Multi-Class Classification Explained

Check the GPU version and usage
import torch
if torch.cuda.is_available():
    device = torch.device("cuda")
    print('There are %d GPU(s) available.' % torch.cuda.device_count())
    print('We will use the GPU:', torch.cuda.get_device_name(0))
else:
    print('No GPU avail...

2020-06-11 02:55:19 7411 26

Original: BERT Chinese Sentiment Analysis: Binary Classification Explained

Check the GPU version and usage
import torch
if torch.cuda.is_available():
    device = torch.device("cuda")
    print('There are %d GPU(s) available.' % torch.cuda.device_count())
    print('We will use the GPU:', torch.cuda.get_device_name(0))
else:
    print('No GPU avail...

2020-06-11 02:36:52 4019 7

Original: Beijing Air Pollution PM2.5 Forecasting (LSTM)

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
data = pd.read_excel('./北京空气_2010.1.1-2014.12.31.xlsx')
Data processing: for the data where PM2.5 is null...

2020-05-13 14:12:19 8167 8

Original: Skip-Gram from Scratch in PyTorch

import torch
import numpy as np
import torch.nn.functional as F
from torch.autograd import Variable
from collections import Counter
import random
MAX_VOCAB_SIZE = 30000
S = 'F:\\机器学习高级'
path = S + "...

2020-03-04 16:02:57 528

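Original: HMM Part-of-Speech Tagging with Code Walkthrough

# A simple HMM implementation with a detailed explanation
import nltk
import sys
from nltk.corpus import brown

brown_tags_words = [ ]
# P(tags | words) is proportional to P(ti | t{i-1}) * P(wi | ti), i.e. A * B (state transition probability * emission probability)
# The preprocessing needed here is: give the words...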

2020-02-29 08:14:20 1266
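
The proportionality in the HMM excerpt above, P(tags | words) ∝ Π P(ti | t{i-1}) * P(wi | ti), can be illustrated with a toy example. Every probability below is a hypothetical number invented for the sketch (the post estimates them from the Brown corpus), and the search is a brute-force enumeration over tag sequences rather than the Viterbi algorithm, which is only practical for very short sentences.

```python
from itertools import product

# Hypothetical transition probabilities P(tag | previous tag); "<s>" is the start state.
transition = {
    ("<s>", "DT"): 0.8, ("<s>", "NN"): 0.2,
    ("DT", "NN"): 0.9, ("DT", "DT"): 0.1,
    ("NN", "NN"): 0.5, ("NN", "DT"): 0.5,
}
# Hypothetical emission probabilities P(word | tag).
emission = {
    ("the", "DT"): 0.7, ("the", "NN"): 0.01,
    ("dog", "NN"): 0.4, ("dog", "DT"): 0.0,
}

def sequence_score(words, tags):
    """Product of transition * emission terms along one tag sequence."""
    score, prev = 1.0, "<s>"
    for word, tag in zip(words, tags):
        score *= transition.get((prev, tag), 0.0) * emission.get((word, tag), 0.0)
        prev = tag
    return score

def best_tags(words, tagset):
    """Exhaustively score every tag sequence and return the highest-scoring one."""
    candidates = product(tagset, repeat=len(words))
    return max(candidates, key=lambda tags: sequence_score(words, tags))

words = ["the", "dog"]
best = best_tags(words, ["DT", "NN"])
print(best, sequence_score(words, best))
```

Under these made-up numbers the sequence DT, NN wins because both its transition and emission terms are large, while any sequence containing a zero-probability term scores zero.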

Original: Building a Multilingual Language Detector with CountVectorizer and a Custom Vocabulary

data.csv
Code implementation:
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB  # use multinomial naive Bayes
from sklearn.feature_extraction.text import CountVectorizer  # from sklearn...

2020-02-21 12:46:39 976

Original: AdaBoost Explained with a Worked Example

2020-02-14 22:48:58 505

Original: Skip-Gram Code Explained (PyTorch)

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data as tud
from collections import Counter
import numpy as np
import random
import math
import pandas as pd
imp...

2020-02-14 21:53:32 993

