Paper Reading Notes
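Paper reading notes are an important tool for academic research: they systematically organize a paper's core ideas, research methods, and the reader's own observations, helping to build a body of knowledge and support later writing.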
happyprince
Articles in this column
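[Paper Reading Notes 79] An Index-based Approach for Efficient and Effective Web Content Extrac
This paper proposes an index-based method for web content extraction (Index-based Web Content Extraction): it splits HTML into structured segments and predicts the position indices of the relevant ones, addressing the efficiency and adaptability shortcomings of existing methods. The authors train the IndexLM family of models (0.6B/1.7B/4B parameters), which perform well both in a RAGQA system and in direct evaluation: the best average F1 reaches 57.94 (RAGQA), 87.40 (main-content extraction), and 31.69 (query-relevant extraction), and extraction is more than 10 times faster than generative methods. The experiments show that the method can handle massive amounts of web content and overcomes the LLM context-length limit.
Original · 2025-12-22 00:47:39 · 496 views · 0 comments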
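[Paper Reading Notes 78] BookRAG:A Hierarchical Structure-aware Index-based Approach for Retrie
BookRAG proposes a new retrieval-augmented generation method for complex hierarchical documents (such as books and manuals). Its contributions are: 1) a BookIndex index structure that fuses the document's native hierarchy tree with a fine-grained knowledge graph; 2) an intelligent retrieval strategy based on Information Foraging Theory that dynamically classifies the query type and matches it to a tailored pipeline. The method improves knowledge-graph quality through gradient-based entity disambiguation and achieves SOTA performance on three benchmarks, with retrieval recall of up to 71.2% and markedly higher QA accuracy, while keeping responses fast and token consumption low. Its core advantage is combining document structure with semantic information, which addresses the limitations of conventional RAG methods on complex documents.
Original · 2025-12-22 00:39:35 · 562 views · 0 comments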
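Too many people are chatting with Kimi, so you have to queue
Technical analysis of the Kimi K2 Thinking model: this natively INT4-quantized model uses an MoE architecture with 1.04 trillion parameters and achieves efficient inference through 384 expert modules and the MLA attention mechanism. Key techniques include: the MuonClip optimizer, which keeps training on 15.5 trillion tokens stable; a novel post-training method that incorporates 3000+ real tool libraries; and the reinforcement-learning framework VerifiableRewards, which implements a verifiable reward mechanism. Compared with similar models, K2 stands out in long-sequence processing (256k context) and complex tasks (200-300 call steps), with GPU memory use of only 30GB and inference latency reduced by 50%. Across code, mathematics, and other domains, the model shows excellent Age…
Original · 2025-11-09 23:59:03 · 978 views · 0 comments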
[Paper Reading Notes 01] Neural Architectures for Nested NER through Linearization
Title: Neural Architectures for Nested NER through Linearization. Abstract: The paper proposes two architectures and a BILOU schema. The first, based on the standard LSTM+CRF model, combines all labels via a Cartesian product to form a multi-label task; the second treats nested NER as a seq2seq task, from tok…
Original · 2020-08-21 12:00:27 · 1490 views · 3 comments
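[Paper Reading Notes 02] Nested Named Entity Recognition via Second-best Sequence Learning and Decoding
Title: Nested Named Entity Recognition via Second-best Sequence Learning and Decoding. Abstract: For training, the paper designs an objective function that treats the tag sequence of a nested entity as the second-best path within the span of its parent entity; for decoding, entities are extracted iteratively from the outermost to the innermost. The results were state of the art at the time. Background: The example sentence comes from the GENIA dataset and contains three entities. Many papers also report that nested entities are very common. If it is assumed by default that named entit…
Original · 2020-08-26 11:53:21 · 1652 views · 0 comments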
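[Paper Reading Notes 03] Multi-Grained Named Entity Recognition
Title: Multi-Grained Named Entity Recognition. Authors' affiliations: University of Illinois at Chicago, Tencent Medical AI Lab, Alibaba Group, University at Buffalo, Zhejiang Lab. Abstract: The paper proposes the MGNER framework (Multi-Grained Named Entity Recognition). Compared with ordinary entity recognition, MGNER detects and recognizes entities at multiple granularities, whether non-overlapping or fully…
Original · 2020-08-31 18:08:42 · 634 views · 0 comments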
[Paper Reading Notes 04] GFTE:Graph-based Financial Table Extraction
1. Title: GFTE: Graph-based Financial Table Extraction. 2. Authors: Yiren Li, Zheng Huang, Junchi Yan, Yi Zhou, Fan Ye, and Xianhui Liu; Shanghai Jiao Tong University, China Financial Fraud Research Center. 3. Problem addressed. [Background] Current tools extract data from financial tables poorly, so the paper proposes a … for…
Original · 2020-12-25 11:39:20 · 1256 views · 1 comment
[Paper Reading Notes 05] Deep Active Learning for Named Entity Recognition
1. Title: Deep Active Learning for Named Entity Recognition. Source: ICLR 2018. Original paper: DEEP ACTIVE LEARNING FOR NAMED ENTITY RECOGNITION. 2. Authors: Yanyao Shen, Hyokun Yun, Zachary C. Lipton, Yakov Kronrod, Animashree Anandkumar; University of Texas at Austin…
Original · 2020-12-30 18:05:08 · 1136 views · 0 comments
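[Paper Reading Notes 06] OpenUE:An Open Toolkit of Universal Extraction from Text
1. Paper title: OpenUE: An Open Toolkit of Universal Extraction from Text. Venue: EMNLP 2020 (Demo). 2. Author: Ningyu Zhang (张宁豫), lecturer at Zhejiang University / Alibaba; research interests are natural language processing and knowledge graphs. This is a demo paper published at EMNLP 2020 jointly by Zhejiang University and Alibaba DAMO Academy. 3. Abstract: The paper proposes the idea that most NLP tasks can be represented with a single model and provides OpenUE, an open-source and extensible extraction toolkit [3]; it also deploys a re…
Original · 2021-01-04 23:34:14 · 1676 views · 0 comments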
[Paper Reading Notes 07] Learning from Context or Names? An Empirical Study on Neural Relation Extraction
1. Title: Learning from Context or Names? An Empirical Study on Neural Relation Extraction. 2. Authors: Hao Peng1*, Tianyu Gao2*, Xu Han1, Yankai Lin3, Peng Li3, Zhiyuan Liu1*†, Maosong Sun1, Jie Zhou3. Affiliations: Tsinghua University, P…
Original · 2021-01-07 14:54:19 · 1524 views · 0 comments
[Paper Reading Notes 08] Generalizing from a Few Examples:A Survey on Few-Shot Learning
1. Title: Generalizing from a Few Examples: A Survey on Few-Shot Learning (FSL: Few-Shot Learning). 2. Authors: Yaqing Wang, Hong Kong University of Science and Technology and 4Paradigm Inc; Quanming Yao, 4Paradigm Inc; James T. Kwok, Hong Kong University of…
Original · 2021-01-12 16:30:23 · 3945 views · 0 comments
[Paper Reading Notes 09] A Frustratingly Easy Approach for Joint Entity and Relation Extraction
1. Title: A Frustratingly Easy Approach for Joint Entity and Relation Extraction. 2. Authors: Zexuan Zhong; Danqi Chen (https://www.cs.princeton.edu/~danqic/, email: danqic@cs.princeton.edu); Department of Computer Science, Princeton University; world ranking…
Original · 2021-01-15 16:49:44 · 3577 views · 3 comments
[Paper Reading Notes 10] A General Framework for Information Extraction using Dynamic Span Graphs
1. Title. Paper title: A General Framework for Information Extraction using Dynamic Span Graphs. Source: NAACL 2019; Google AI Language, University of Washington. Paper link: https://www.aclweb.org/anthology/N19-1308/ Code link: https://github.com/luanyi/DyGIE Keywords: information extraction, dynamic span graph,…
Original · 2021-01-19 11:13:30 · 1226 views · 1 comment
[Paper Reading Notes 11] Entity,Relation,Event Extraction with Contextualized Span Representations
1. Title. Paper title: Entity, Relation, and Event Extraction with Contextualized Span Representations. Source: EMNLP 2019; University of Washington, Google AI Language. Paper links: https://www.aclweb.org/anthology/D19-1585/ and https://arxiv.org/pdf/1909.03546.pdf Code link: https://github.com/dwadden/dygie
Original · 2021-01-21 09:21:17 · 1863 views · 0 comments
[Paper Reading Notes 12] An Effective Transition-based Model for Discontinuous NER
1. Title: "An Effective Transition-based Model for Discontinuous NER". Paper: An Effective Transition-based Model for Discontinuous NER.pdf Code: https://github.com/daixiangau/acl2020-transition-discontinuous-ner Experimental data: https://data.csiro.au/dap/landingpage?pid=csi
Original · 2021-01-21 18:15:09 · 2642 views · 0 comments
[Paper Reading Notes 13] A Survey on Deep Learning for Named Entity Recognition
1. Title: A Survey on Deep Learning for Named Entity Recognition. 2. Authors: Jing Li, Aixin Sun, Jianglei Han, and Chenliang Li; Nanyang Technological University, SAP, Wuhan University. Accepted in IEEE TKDE (TKDE: Transactions on Knowledge an…
Original · 2021-01-27 15:47:01 · 2459 views · 2 comments
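[Paper Reading Notes 14] Nested named entity recognition revisited
1. Title: Nested Named Entity Recognition Revisited. 2. Authors: Arzoo Katiyar and Claire Cardie; Department of Computer Science, Cornell University (a top private research university, ranked 18th in the 2021 QS World University Rankings), Ithaca, NY, 14853, USA. 3. Abstract: Building on RNNs, the paper proposes a method for detecting and recognizing nested named entities that extracts a hypergraph representation from an RNN.
Original · 2021-01-28 17:26:15 · 1970 views · 0 comments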
[Paper Reading Notes 15] Recognizing Complex Entity Mentions:A Review and Future Directions
1. Title: Recognizing Complex Entity Mentions: A Review and Future Directions. Citation: Dai X. Recognizing Complex Entity Mentions: A Review and Future Directions[C]// The ACL 2018 Student Research Workshop. 2018. 2. Author: Xiang Dai; CSIRO Data61 and Sc…
Original · 2021-01-29 16:30:32 · 626 views · 0 comments
[Paper Reading Notes 16] More data,relations,context ,openness:A review and outlook for relation extraction
1. Title: More data, more relations, more context and more openness: A review and outlook for relation extraction. Paper: https://arxiv.org/pdf/2004.03186.pdf Year: 2020. Citation: Xu Han, Tianyu Gao, Yankai Lin, Hao Peng, Yaoliang Yang, Chaojun Xiao, Zhiyuan Liu,…
Original · 2021-02-04 17:47:05 · 1008 views · 0 comments
[Paper Reading Notes 17] A Survey on Knowledge Graph-Based Recommender Systems
1. Title (TKDE 2020): A Survey on Knowledge Graph-Based Recommender Systems. In IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE), 2020. doi: 10.1109/TKDE.2020.3028705. 2. Authors: Qingyu Guo, Fuzhen Zhuang, Chuan Qin, Hengshu Zhu, Xing Xie, H…
Original · 2021-02-19 17:41:31 · 5239 views · 1 comment
[Paper Reading Notes 18] Jointly Multiple EE via Attention-based Graph Information Aggregation
1. Paper title: Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation. Source: EMNLP 2018. Paper link: https://arxiv.org/abs/1809.09078 Code link: https://github.com/lx865712528/EMNLP2018-JMEE Keywords: multiple event extraction, GCN, attention, syntactic dependency structure. 2. Authors: Xiao Liu† and Zhunchen…
Original · 2021-02-22 17:37:36 · 857 views · 0 comments
[Paper Reading Notes 19] Scalable multi-hop relational reasoning for knowledge-aware question answering
1. Title: Scalable multi-hop relational reasoning for knowledge-aware question answering. Citation: Feng Y, Chen X, Lin B Y, et al. Scalable multi-hop relational reasoning for knowledge-aware question answering[J]. 2020.emnlp-main.99 Link: https://arxiv.org/pdf/2005.00646.pdf GitHub project: https://github.com/INK-USC/MHGRN 2. Authors: Yanlin…
Original · 2021-02-24 23:46:18 · 1875 views · 1 comment
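[Paper Reading Notes 20] Temporal Ensembling for Semi-Supervised Learning
Authors: Samuli Laine, Timo Aila (NVIDIA). Year: 2016. Core principle: consistency regularization. Consistency regularization requires a model to produce similar outputs for similar inputs: when noise is injected into the input, the output should not change, i.e. the model is robust (description from [5]). Proposed models. Model 1: Π-model. The model's flow diagram shows that a data sample xi (an image in the paper) is passed through the model twice with stochastic computation (which can also be understood as adding perturbation factors). Because the computation is stochastic, the two results differ somewhat; this…
Original · 2021-02-25 22:14:44 · 737 views · 0 comments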
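The consistency-regularization idea described in this entry can be illustrated with a minimal sketch. It is not the authors' implementation: the toy linear classifier, the Gaussian input noise standing in for augmentation or dropout, the fixed `unsup_weight` (the paper ramps this weight up over training), and every function name below are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def noisy_forward(x, weights, rng, noise_std=0.1):
    """Toy linear classifier; Gaussian input noise stands in for
    the stochastic augmentation/dropout of a real network."""
    noisy = [v + rng.gauss(0.0, noise_std) for v in x]
    logits = [sum(w * v for w, v in zip(row, noisy)) for row in weights]
    return softmax(logits)


def consistency_loss(p1, p2):
    """Mean squared difference between two prediction vectors."""
    return sum((a - b) ** 2 for a, b in zip(p1, p2)) / len(p1)


def pi_model_loss(x, weights, rng, label=None, unsup_weight=1.0):
    """Two stochastic passes over the same input: the consistency term
    applies to every sample, cross-entropy only to labelled ones."""
    z1 = noisy_forward(x, weights, rng)
    z2 = noisy_forward(x, weights, rng)
    loss = unsup_weight * consistency_loss(z1, z2)
    if label is not None:
        loss += -math.log(z1[label])
    return loss
```

An unlabelled sample contributes only the consistency term, which is what lets unlabelled data influence training; a labelled sample adds the usual cross-entropy on top.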
[Paper Reading Notes 21] Mean teachers are better role models
Paper title: Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. Authors: Antti Tarvainen (Aalto University, Finland), Harri Valpola (The Curious AI Company). One of the authors is quoted as saying, "All the artificial intelligence currently in use is second-rate." The Curious AI Company is a Finnish deep…
Original · 2021-02-25 22:17:42 · 4568 views · 0 comments
[Paper Reading Notes 22] Pseudo-Label: A Simple and Effective Semi-Supervised Learning Method
Title: Pseudo-Label:The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks. Year: 2013. Author: Dong-Hyun Lee; Université de Montréal, Canada. Paper link: https://www.researchgate.net/profile/Dong-Hyun-Lee/publication/280581078_Pseudo-Label_The
Original · 2021-02-26 11:23:35 · 7583 views · 0 comments
[Paper Reading Notes 23] MixText: Semi-Supervised Text Classification with TMix Data Augmentation
1. Title: ACL 2020, "MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification". Paper download: https://arxiv.org/pdf/2004.12239 Open-source code: https://github.com/GT-SALT/MixText 2. Authors: Jiaao Chen, Georgia Tech, jchen896@gatech.edu…
Original · 2021-02-26 16:19:49 · 1221 views · 1 comment
[Paper Reading Notes 24] NeuroNER: A Neural NER Tool
1. Title. Model paper: Feature-Augmented Neural Networks for Patient Note De-identification, http://export.arxiv.org/pdf/1610.09704 Tool paper: NeuroNER: an easy-to-use program for named-entity recognition based on neural networks. 2. Paper link: https://www.ac
Original · 2021-03-01 13:54:38 · 532 views · 0 comments
[Paper Reading Notes 25] GCN4NER: Solving NER with a GCN Model
Title: Graph Convolutional Networks for Named Entity Recognition. Citation: Cetoli A, Bragaglia S, O'Harney A D, et al. Graph Convolutional Networks for Named Entity Recognition[J]. https://arxiv.org/pdf/1709.10053.pdf Authors: Cetoli, A.; Bragaglia, S.; O'Harney, A. D.; Slo…
Original · 2021-03-01 17:47:44 · 1796 views · 0 comments
[Paper Reading Notes 26] MRC4NER: Solving NER with a Machine Reading Comprehension Approach
Title: A Unified MRC Framework for Named Entity Recognition. Paper URL: https://www.semanticscholar.org/paper/A-Unified-MRC-Framework-for-Named-Entity-Li-Feng/d3c7971f5e1e13712a31722073983599bf71ac43 Code: https://github.com/ShannonAI/mrc-for-flat-nest
Original · 2021-03-03 12:16:10 · 2218 views · 0 comments
[Paper Reading Notes 27] biaffine4NER: Applying a Biaffine Classifier to NER
Title: Named Entity Recognition as Dependency Parsing. Citation: Yu, J., Bohnet, B., & Poesio, M. (2020). Named Entity Recognition as Dependency Parsing. ArXiv, abs/2005.07150. Code: https://github.com/juntaoy/biaffine-ner Authors: Juntao Yu, Queen Mary University London, UK…
Original · 2021-03-03 17:37:58 · 11846 views · 3 comments
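Common Code for the NER Workflow [continuously updated]
Extract entities from a BIO tag sequence, i.e. convert it to the ANN file format: def seq_to_enity(string, predict): """ Transcribe tags in BIO format :param string: example: "小明,出生于广州,3岁,学英文;5岁学哲学;想成为,麻省理工学院的老师。" :param predict: example: ["B-per", "I-per", "O","O", "O", "O", "B-loc", "I-loc", "O", "B.
Original · 2021-03-18 11:18:45 · 938 views · 2 comments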
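Since the snippet above is cut off, here is a minimal sketch of extracting entities from a character-level BIO tag sequence. It is not the author's `seq_to_enity`: the names `bio_to_entities` and `to_ann_lines` are hypothetical, and the sketch assumes one tag per character and brat-style standoff lines (`T<id>`, tab, label and offsets, tab, text) as the target ANN format.

```python
def bio_to_entities(text, tags):
    """Collect (span_text, label, start, end) tuples from BIO tags,
    assuming tags[i] labels the character text[i]."""
    entities = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # The open span continues only on an "I-" tag with the same label.
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            entities.append((text[start:i], label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    # Close a span that runs to the end of the sequence.
    if start is not None:
        entities.append((text[start:len(tags)], label, start, len(tags)))
    return entities


def to_ann_lines(entities):
    """Format the tuples as brat-style standoff annotation lines."""
    return [
        f"T{n}\t{label} {start} {end}\t{span}"
        for n, (span, label, start, end) in enumerate(entities, 1)
    ]
```

With the start of the example sentence from the snippet, `bio_to_entities("小明,出生于广州,3岁", tags)` yields one `per` span for 小明 and one `loc` span for 广州. An `I-` tag with no open span is ignored, which is one of several reasonable ways to handle malformed predictions.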
[Paper Reading Notes 28] Deep Biaffine Attention for Neural Dependency Parsing
Title: Deep Biaffine Attention for Neural Dependency Parsing. Paper: https://arxiv.org/pdf/1611.01734.pdf Code: https://github.com/tdozat/Parser-v1 and https://github.com/bamtercelboo/PyTorch_Biaffine_Dependency_Parsing Authors: Timothy Dozat, Stanford University; Christop…
Original · 2021-03-19 18:04:27 · 2759 views · 0 comments
[Paper Reading Notes 29] Biomedical Text Summarization
Paper: Clinical Context–Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation. Citation: Afzal M, Alam F, Malik K, Malik G. Clinical Context–Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Va…
Original · 2021-03-19 18:06:14 · 1516 views · 0 comments
[Paper Reading Notes 30] Research on PICO Extraction, Part 1 (4 papers)
Paper 1: Extracting PICO elements from RCT abstracts using 1-2gram analysis and multitask classification. Citation: Yuan X, Xiaoli L, Shilei L, et al. Extracting PICO elements from RCT abstracts using 1-2gram analysis and multitask classification[C]// the third Interna…
Original · 2021-04-30 17:39:55 · 3344 views · 1 comment
[Paper Reading Notes 31] UER: An Open-Source Toolkit for Pre-training Models
Title: UER: An Open-Source Toolkit for Pre-training Models. Affiliations: School of Information and DEKE, MOE, Renmin University of China, Beijing, China; Tencent AI Lab; School of Electronics Engineering and Computer Science, Peking University, Beijing, China. Citation: Zhao Z,…
Original · 2021-04-30 17:43:23 · 1107 views · 1 comment
[Paper Reading Notes 32] A guide to deep learning in healthcare
Title: A guide to deep learning in healthcare. https://doi.org/10.1038/s41591-018-0316-z Citation: A guide to deep learning in healthcare.[J]. Nature Medicine, 2019. Abstract: Four areas of deep learning applied to medicine: computer vision, natural language processing, reinforcement learning, generalized metho…
Original · 2021-04-30 17:46:52 · 582 views · 1 comment
[Paper Reading Notes 33] CASREL: Tagging- and BERT-based Entity and Relation Extraction
Title: A Novel Cascade Binary Tagging Framework for Relational Triple Extraction. Affiliations: Jilin University; Shenzhen Zhuiyi Technology; University of North Carolina at Chapel Hill. Abstract. Problem addressed: solving the overlapping triple problem…
Original · 2021-05-06 22:44:11 · 5029 views · 5 comments
[Paper Reading Notes 34] Joint Extraction of Entities and Relations Based on a Decomposition Strategy
Title: Joint Extraction of Entities and Relations Based on a Novel Decomposition Strategy. Affiliations: Chinese Academy of Sciences; Xiaomi AI Lab; Peking University. Abstract. Problems addressed: redundant entity pairs; ignore the important inner structure…
Original · 2021-05-19 21:26:04 · 929 views · 0 comments
[Paper Reading Notes 35] RobotReviewer
1. Title: RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. 2. Authors: Iain J Marshall: Department of Primary Care and Public Health Sciences, King's College London, UK. Joël Kuiper: University Medical Center Groningen, the Netherlands. School of Information, University of Texas at Austin, USA. Published Online First 22 June 201…
Original · 2021-05-23 07:27:28 · 651 views · 0 comments
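[Paper Reading Notes 36] Notes on Running the CASREL Code
See "[Paper Reading Notes 33] CASREL: Tagging- and BERT-based Entity and Relation Extraction": https://blog.youkuaiyun.com/ld326/article/details/116465089 Overall the documentation is well written, and following it (readme.md) is enough; the one small difference is the file naming, which is recorded here as a supplement. 0. Code structure: very clear and worth learning from. 1. Environment: I followed the key commands in the instructions, but the versions of the dependency packages were still wrong. This is the requirement.txt; there are still some warnings, which I am leaving unhandled for now: absl-py==0.12.0…
Original · 2021-05-23 08:01:20 · 4018 views · 50 comments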