Classic Machine Learning Papers (Reposted)

Active Learning

Two Faces of Active Learning, Dasgupta, 2011

Active Learning Literature Survey, Settles, 2010

Applications

A Survey of Emerging Approaches to Spam Filtering, Caruana, 2012

Ambient Intelligence: A Survey, Sadri, 2011

A Survey of Online Failure Prediction Methods, Salfner, 2010

Anomaly Detection: A Survey, Chandola, 2009

Mining Data Streams: A Review, Gaber, 2005

Workflow Mining: A Survey of Issues and Approaches, Aalst, 2003

Biology

Support Vector Machines in Bioinformatics: a Survey, Chicco, 2012

Computational Epigenetics: The New Scientific Paradigm, Lim, 2010

Automated Protein Structure Classification: A Survey, Hassanzadeh, 2009

Chemoinformatics - An Introduction for Computer Scientists, Brown, 2009

Computational Challenges in Systems Biology, Heath, 2009

Computational Epigenetics, Bock, 2008

Progress and Challenges in Protein Structure Prediction, Zhang, 2008

A Review of Feature Selection in Bioinformatics, Saeys, 2007

Machine Learning in Bioinformatics: A Brief Survey and Recommendations for Practitioners, Bhaskar, 2006

Bioinformatics - An Introduction for Computer Scientists, Cohen, 2004

Computational Systems Biology, Kitano, 2002

Protein Structure Prediction and Structural Genomics, Baker, 2001

Recent Developments and Future Directions in Computational Genomics, Tsoka, 2000

Molecular Biology for Computer Scientists, Hunter, 1993

Classification

Supervised Machine Learning: A Review of Classification Techniques, Kotsiantis, 2007

Clustering

XML Data Clustering: An Overview, Algergawy, 2011

Data Clustering: 50 Years Beyond K-Means, Jain, 2010

Clustering Stability: An Overview, Luxburg, 2010

Parallel Clustering Algorithms: A Survey, Kim, 2009

A Survey: Clustering Ensembles Techniques, Ghaemi, 2009

A Tutorial on Spectral Clustering, Luxburg, 2007

Survey of Clustering Data Mining Techniques, Berkhin, 2006

Survey of Clustering Algorithms, Xu, 2005

Clustering of Time Series Data - A Survey, Liao, 2005

Clustering Methods, Rokach, 2005

Recent Advances in Clustering: A Brief Survey, Kotsiantis, 2004

Subspace Clustering for High Dimensional Data: A Review, Parsons, 2004

Unsupervised and Semi-supervised Clustering: a Brief Survey, Grira, 2004

Clustering in Life Sciences, Zhao, 2002

On Clustering Validation Techniques, Halkidi, 2001

Data Clustering: A Review, Jain, 1999

A Survey of Fuzzy Clustering, Yang, 1993

Computer Vision

Pedestrian Detection: An Evaluation of the State of the Art, Dollar, 2012

A Comparative Study of Palmprint Recognition Algorithms, Zhang, 2012

Human Activity Analysis: A Review, Aggarwal, 2011

Subspace Methods for Face Recognition, Rao, 2010

Context Based Object Categorization: A Critical Survey, Galleguillos, 2010

Object Tracking: A Survey, Yilmaz, 2006

Detecting Faces in Images: A Survey, Yang, 2002

Databases

Data Fusion, Bleiholder, 2008

Duplicate Record Detection: A Survey, Elmagarmid, 2007

Overview of Record Linkage and Current Research Directions, Winkler, 2006

A Survey of Schema-based Matching Approaches, Shvaiko, 2005

Deep Learning

Representation Learning: A Review and New Perspectives, Bengio, 2012

Dimension Reduction

Dimensionality Reduction: A Comparative Review, Maaten, 2009

Dimension Reduction: A Guided Tour, Burges, 2009

A Survey of Manifold-Based Learning Methods, Huo, 2007

Toward Integrating Feature Selection Algorithms for Classification and Clustering, Liu, 2005

An Introduction to Variable and Feature Selection, Guyon, 2003

A Survey of Dimension Reduction Techniques, Fodor, 2002

Economics

Auctions and Bidding: A Guide for Computer Scientists, Parsons, 2011

Computational Sustainability, Gomes, 2009

Computational Finance, Tsang, 2004

Game Theory

Computer Poker: A Review, Rubin, 2011

Graphical Models

An Introduction to Variational Methods for Graphical Models, Jordan, 1999

Kernel Methods

Kernels for Vector-Valued Functions: a Review, Alvarez, 2012

Learning Theory

Introduction to Statistical Learning Theory, Bousquet, 2004

Machine Learning

A Few Useful Things to Know about Machine Learning, Domingos, 2012

A Tutorial on Bayesian Nonparametric Models, Blei, 2011

Decision Forests for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning, Criminisi, 2011

Top 10 Algorithms in Data Mining, Wu, 2008

Semi-Supervised Learning Literature Survey, Zhu, 2007

Interestingness Measures for Data Mining: A Survey, Geng, 2006

A Survey of Interestingness Measures for Knowledge Discovery, McGarry, 2005

A Tutorial on the Cross-Entropy Method, Boer, 2005

A Survey of Kernels for Structured Data, Gartner, 2003

Survey on Frequent Pattern Mining, Goethals, 2003

The Boosting Approach to Machine Learning: An Overview, Schapire, 2003

A Survey on Wavelet Applications in Data Mining, Li, 2002

Mathematics

Topology and Data, Carlsson, 2009

Multi-armed Bandit

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Bubeck, 2012

Natural Computing

Reservoir Computing Approaches to Recurrent Neural Network Training, Jaeger, 2009

Artificial Immune Systems, Aickelin, 2005

A Survey of Evolutionary Algorithms for Data Mining and Knowledge Discovery, Freitas, 2003

Data Mining in Soft Computing Framework: A Survey, Mitra, 2002

Neural Networks for Classification: A Survey, Zhang, 2000

Natural Language Processing

Probabilistic Topic Models, Blei, 2012

Ontology Learning From Text: A Look Back And Into The Future, Wong, 2012

Machine Transliteration Survey, Karimi, 2011

Translation Techniques in Cross-Language Information Retrieval, Zhou, 2011

Comprehensive Review of Opinion Summarization, Kim, 2011

A Survey on Sentiment Detection of Reviews, Tang, 2009

Word Sense Disambiguation: A Survey, Navigli, 2009

Topic Models, Blei, 2009

Opinion Mining and Sentiment Analysis, Pang, 2008

Information Extraction, Sarawagi, 2008

Statistical Machine Translation, Lopez, 2008

A Survey of Named Entity Recognition and Classification, Nadeau, 2007

Adaptive Information Extraction, Turmo, 2006

Survey of Text Clustering, Jing, 2005

Machine Learning in Automated Text Categorization, Sebastiani, 2002

Web Mining Research: A Survey, Kosala, 2000

Networks

Community Detection in Graphs, Fortunato, 2010

A Survey of Statistical Network Models, Goldenberg, 2010

Communities in Networks, Porter, 2009

Graph Clustering, Schaeffer, 2007

Graph Mining: Laws, Generators, and Algorithms, Chakrabarti, 2006

Comparing Community Structure Identification, Danon, 2005

Link Mining: A Survey, Getoor, 2005

Detecting Community Structure in Networks, Newman, 2004

Link Mining: A New Data Mining Challenge, Getoor, 2003

On-Line Learning

On-Line Algorithms in Machine Learning, Blum, 1998

Others

A Survey of Very Large-Scale Neighborhood Search Techniques, Ahuja, 2001

Planning and Scheduling

A Review of Machine Learning for Automated Planning, Jimenez, 2009

Probabilistic

Approximate Policy Iteration: A Survey and Some New Methods, Bertsekas, 2011

An Introduction to MCMC for Machine Learning, Andrieu, 2003

Probabilistic Models

An Introduction to Conditional Random Fields, Sutton, 2010

Randomized Algorithms

Randomized Algorithms for Matrices and Data, Mahoney, 2011

Recommender Systems

Recent Advances in Personalized Recommender Systems, Liu, 2009

Matrix Factorization Techniques for Recommender Systems, Koren, 2009

A Survey of Collaborative Filtering Techniques, Su, 2009

Regression

Ensemble Approaches for Regression: a Survey, Moreira, 2012

Reinforcement Learning

A Survey of Reinforcement Learning in Relational Domains, Otterlo, 2005

Reinforcement Learning: A Survey, Kaelbling, 1996

Rule Learning

Association Mining, Ceglar, 2006

Algorithms for Association Rule Mining - A General Survey and Comparison, Hipp, 2000

Testing

Controlled Experiments on the Web: Survey and Practical Guide, Kohavi, 2009

Time Series

Time-Series Data Mining, Esling, 2012

A Review on Time Series Data Mining, Fu, 2011

Discrete Wavelet Transform-Based Time Series Analysis and Mining, Chaovalit, 2011

Transfer Learning

A Survey on Transfer Learning, Pan, 2010

Web Mining

A Taxonomy of Sequential Pattern Mining Algorithms, Mabroukeh, 2010

A Survey of Web Clustering Engines, Carpineto, 2009

Web Page Classification: Features and Algorithms, Qi, 2009

Mining Interesting Knowledge from Weblogs: A Survey, Facca, 2005

An Overview of Web Data Clustering Practices, Vakali, 2005

A Survey of Web Metrics, Dhyani, 2002

Data Mining for Hypertext: A Tutorial Survey, Chakrabarti, 2000
