
Machine_Learning
Average article quality score: 92
ML theories, applications, dev tools, etc.
EverNoob
simply bumping around
A Visual Guide to Quantization
Content is summarized by the title; an easy read, good stuff. Reposted 2025-03-31 11:07:12 · 12 views · 0 comments
Large sequence models for software development activities
Reposted 2024-07-09 12:13:43 · 128 views · 0 comments
LLM Benchmarks
We very often see a menagerie of performance benchmarks listed in LLM papers to showcase the "breakthroughs", while very likely knowing very little about the specifics of each particular test suite. There, then, lies a danger of being misled and manipulated b... Original 2024-04-08 11:47:21 · 825 views · 0 comments
1 bit LLM and 1 trit LLM
In light of NV's recent addition of fp4, I'm once again curious about the bottom line for LLMs, at least for inference; let's go back to this BitNet paper from Microsoft, featuring a 1-bit LLM, with 1-bit weights trained from scratch, and later on another feat... Original 2024-03-22 18:17:10 · 1151 views · 0 comments
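To make the weight formats concrete, here is a minimal NumPy sketch of the two quantizers involved: sign-based 1-bit binarization as in BitNet, and the absmean ternarization (weights in {-1, 0, +1}) from the b1.58 follow-up. Function names are illustrative, not from the papers.

```python
import numpy as np

def binarize(w):
    """1-bit weights a la BitNet: sign of the zero-centered weights."""
    return np.sign(w - w.mean())                 # values in {-1, +1}

def ternarize(w, eps=1e-5):
    """1.58-bit (trit) weights via absmean scaling, as in BitNet b1.58."""
    gamma = np.abs(w).mean()                     # absmean scale
    return np.clip(np.round(w / (gamma + eps)), -1, 1)  # values in {-1, 0, +1}

w = np.random.randn(4, 4)
print(binarize(w))
print(ternarize(w))
```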
SORA: text-to-video Generator by OpenAI
Sources: 1. OpenAI's blog piece: Video generation models as world simulators; 2. DiTs (Diffusion Transformers): Scalable Diffusion Models with Transformers. This is so far the most contentious point for SORA, regarding whether it is "learning" physics and gene... Original 2024-02-24 21:47:08 · 1244 views · 0 comments
vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention
Paper: https://arxiv.org/pdf/2309.06180.pdf; repo: GitHub - vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs; highlights blog by the authors: vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention | vLLM Blog. LLMs... Reposted 2023-12-05 20:49:41 · 343 views · 0 comments
DeConvolution(Transposed Convolution)
DeConv fundamentals. Original 2023-11-09 20:47:31 · 269 views · 0 comments
Understanding Gated Recurrent Unit (GRU) in Deep Learning
Source: GRU stands for Gated Recurrent Unit, a type of recurrent neural network (RNN) architecture that is similar to the LSTM (Long Short-Term Memory). Like LSTM, GRU is designed to model sequential data by allowing information to be selectively remembe... Reposted 2023-11-07 19:01:17 · 204 views · 0 comments
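As a refresher on the gating, here is a minimal NumPy sketch of one GRU step under the standard equations (conventions for where z versus 1 - z multiplies vary across references); all names here are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h_prev, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)          # how much to update
    r = sigmoid(Wr @ x + Ur @ h_prev + br)          # how much past to expose
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev) + bh)
    return (1 - z) * h_prev + z * h_tilde           # selective remembering

# toy dimensions: input size 3, hidden size 2
rng = np.random.default_rng(0)
x, h = rng.standard_normal(3), np.zeros(2)
params = [rng.standard_normal(s) for s in [(2, 3), (2, 2), (2,)] * 3]
print(gru_cell(x, h, *params))
```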
The Reversal Curse: LLMs trained on “A is B“ fail to learn “B is A“
Paper: https://owainevans.github.io/reversal_curse.pdf; blog with interactions with the authors: Paper: LLMs trained on "A is B" fail to learn "B is A" — LessWrong. This is a linkpost for https://owainevans.github.io/reversal_curse.pdf. This post is the copy of... Original 2023-09-28 18:07:11 · 543 views · 0 comments
Illustrated Stable Diffusion
AI image generation is the most recent AI capability blowing people's minds (mine included). The ability to create striking visuals from text descriptions has a magical quality to it and points clearly to a shift in how humans create art. Reposted 2023-08-17 14:02:34 · 284 views · 0 comments
Automatic Differentiation
For beginners, the most daunting aspect of deep learning algorithms is perhaps Back-Propagation (BP), which requires derivations of some highly complex mathematical expressions. Luckily, when actually implementing BP, we do not have to rely on summary symbolic... Original 2023-07-28 13:46:54 · 320 views · 0 comments
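The point fits in a few lines of code: reverse-mode autodiff needs only each op's local derivative plus the chain rule, with no global symbolic expression ever formed. A toy sketch, all names illustrative:

```python
class Var:
    """Tiny reverse-mode autodiff node: tracks a value and local gradients."""
    def __init__(self, value, parents=()):
        self.value, self.parents, self.grad = value, parents, 0.0

    def __mul__(self, other):
        # local derivatives: d(xy)/dx = y, d(xy)/dy = x
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def backward(self, seed=1.0):
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)   # chain rule, one edge at a time

x, y = Var(2.0), Var(3.0)
z = x * y + x          # z = xy + x
z.backward()
print(x.grad, y.grad)  # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0
```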
Hierarchical Clustering: Agglomerative and Divisive
Reposted 2023-04-04 18:05:53 · 192 views · 0 comments
Common architectures in convolutional neural networks
From: https://www.jeremyjordan.me/convnet-architectures/#lenet5 ==> most of the graphs cannot be copied to this platform, so just check the linked original. In this post, I'll discuss commonly used architectures for convolutional networks. As you'll see, almo... Reposted 2023-02-22 18:56:56 · 219 views · 0 comments
Domain Specific Compiling: The Past and Present of Domain Compilers • AI-Oriented Compilation Techniques
About the author: Zhang Shuoming is a Ph.D. student in computer architecture at the Institute of Computing Technology, Chinese Academy of Sciences, advised by Prof. Cui Huimin; his main research direction is AI compilation. zhangshuoming17@mails.ucas.ac.cn. This article is in two parts; the first part is a survey (The Past and Present of Domain Compilers • Survey), and this part focuses on AI-oriented compilation techniques. 0. Preface: With the arrival of the AI era, the emergence of many AI applications has driven the development of domain-specific compilation, most visibly in the widespread adoption of AI compilers. Several characteristics of the AI domain present AI compilers with new opportunities and challenges: first, programming in the AI domain... Reposted 2023-02-21 18:59:50 · 788 views · 0 comments
TorchSparse: 3D SC/SSC Acceleration on GPU
Paper: TorchSparse: Efficient Point Cloud Inference Engine. Notation: mapping to get the output position set: when down-sampling, since we want to sample as many sparse input sites as possible, we relax the SSC i/o mapping condition to p < s*... Original 2022-05-26 17:19:00 · 601 views · 0 comments
3D (Input) Sparse Convolution
Review: 2D sparsity in DNNs: Sparsity in Deep Learning_EverNoob的博客-CSDN博客 ==> the above-mentioned 2D sparsity is decidedly different from the 3D sparsity situation, in that we manually created the structured sparsity to cut down the memory footprint, whil... Original 2022-05-24 17:28:13 · 789 views · 0 comments
Focal Loss
Definition. Source: http://arxiv.org/abs/1708.02002v2. A Focal Loss function addresses class imbalance during training in tasks like object detection. Focal loss applies a modulating term to the cross entropy loss in order to focus learning on hard miscla... Reposted 2022-05-09 16:54:24 · 262 views · 0 comments
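For reference, the paper's definition is FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t); a minimal NumPy sketch of the binary case:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).

    p: predicted probability of the positive class; y: label in {0, 1}.
    gamma=0 recovers the (alpha-weighted) cross entropy loss.
    """
    p_t = np.where(y == 1, p, 1 - p)
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

# an easy example (p_t = 0.9) is down-weighted far more than a hard one (0.1)
print(focal_loss(np.array([0.9, 0.1]), np.array([1, 1])))
```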
Cross Entropy (Loss)
Cross Entropy: https://en.wikipedia.org/wiki/Cross_entropy. Cross Entropy Loss: https://towardsdatascience.com/cross-entropy-loss-function-f38c4ec8643e. A Gentle Introduction to Cross-Entropy for Machine Learning. Reposted 2022-05-08 12:35:59 · 938 views · 0 comments
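The definition underlying all three links is H(p, q) = -sum_x p(x) log q(x); a minimal sketch (with p one-hot, this reduces to the usual classification loss):

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_x p(x) * log(q(x)); eps guards against log(0)."""
    return -np.sum(p * np.log(q + eps))

p = np.array([1.0, 0.0, 0.0])   # one-hot "true" distribution
q = np.array([0.7, 0.2, 0.1])   # model's predicted distribution
print(cross_entropy(p, q))      # == -log(0.7), the usual CE loss
```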
PointNet++: PointNet for Fine-grained Features
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. PointNet++ [arXiv version] [Code and Data (GitHub)]. Abstract: Few prior works study deep learning on point sets. PointNet by Qi et al. is a pioneer in this direction. Howeve... Reposted 2022-05-07 21:19:18 · 538 views · 0 comments
RCNN and Variants
Intro video: https://www.youtube.com/watch?v=vr5rs_cTKCs; (short summary) https://towardsdatascience.com/object-detection-explained-r-cnn-a6c813937a76; 13.8. Region-based CNNs (R-CNNs) — Dive into Deep Learning 0.17.5 documentation; Variants: https://t... Reposted 2022-05-06 14:04:59 · 307 views · 0 comments
RoI: Region of Interest Projection and Pooling
RoI is a technique/layer introduced in the Fast R-CNN paper: https://arxiv.org/abs/1504.08083. Here is an easy-to-read intro: Understanding Region of Interest (RoI Pooling) - Blog by Kemal Erdem ==> in short, RoI projection shrinks the RoI after CNN pre-proces... Reposted 2022-05-06 11:19:17 · 326 views · 0 comments
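A toy sketch of the pooling step: the projected RoI is divided into a fixed output grid and each cell is max-pooled, so any RoI size yields the same output shape. Cell-boundary quantization varies by implementation; this uses naive floor/ceil.

```python
import numpy as np

def roi_max_pool(feat, roi, out_size=(2, 2)):
    """Max-pool one RoI on a 2D feature map down to a fixed out_size grid.

    feat: (H, W) feature map; roi: (x0, y0, x1, y1) in feature-map coords.
    """
    x0, y0, x1, y1 = roi
    oh, ow = out_size
    out = np.zeros(out_size)
    for i in range(oh):
        for j in range(ow):
            ya = y0 + int(np.floor(i * (y1 - y0) / oh))
            yb = y0 + max(int(np.ceil((i + 1) * (y1 - y0) / oh)), 1)
            xa = x0 + int(np.floor(j * (x1 - x0) / ow))
            xb = x0 + max(int(np.ceil((j + 1) * (x1 - x0) / ow)), 1)
            out[i, j] = feat[ya:yb, xa:xb].max()   # max over this grid cell
    return out

feat = np.arange(36, dtype=float).reshape(6, 6)
print(roi_max_pool(feat, (1, 1, 5, 5)))  # a 4x4 RoI pooled to 2x2
```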
PointNet
PointNet Paper Highlights. Figure 1: Applications of PointNet. We propose a novel deep net architecture that consumes raw point clouds (sets of points) without voxelization or rendering. It is a unified architecture that learns both global and local po... Reposted 2022-05-06 09:18:57 · 400 views · 0 comments
NNs for Point Cloud: PRNN and PV-CRNN
For basics on point clouds, see: (3D Imaging) Point Cloud_EverNoob的博客-CSDN博客. Moving Point Cloud Processing: PointRNN: https://arxiv.org/abs/1910.08287. In this paper, we introduce a Point Recurrent Neural Network (PointRNN) for moving point cloud process... Original 2022-05-09 20:04:49 · 1013 views · 0 comments
Ridge, Lasso, Group Lasso and Sparse Group Lasso
Main Article ==> this is a great introductory article with visual cues about the statistical regularization techniques: https://en.wikipedia.org/wiki/Lasso_(statistics) (secondary title: Complete Guide Using Scikit-Learn). Moving on from a very impor... Reposted 2022-04-19 15:40:01 · 804 views · 0 comments
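For orientation, the penalty terms themselves are one-liners: ridge adds lambda * ||beta||_2^2, lasso adds lambda * ||beta||_1, and group lasso sums unsquared per-group L2 norms (group weighting conventions vary). A minimal sketch:

```python
import numpy as np

def ridge_penalty(beta, lam):
    return lam * np.sum(beta ** 2)        # L2: shrinks, never exactly zeros

def lasso_penalty(beta, lam):
    return lam * np.sum(np.abs(beta))     # L1: drives coefficients to zero

def group_lasso_penalty(beta, groups, lam):
    """Sum of unsquared L2 norms per group: zeroes whole groups at once."""
    return lam * sum(np.sqrt(len(g)) * np.linalg.norm(beta[g]) for g in groups)

beta = np.array([0.5, -0.2, 0.0, 1.5])
groups = [np.array([0, 1]), np.array([2, 3])]
print(ridge_penalty(beta, 0.1), lasso_penalty(beta, 0.1),
      group_lasso_penalty(beta, groups, 0.1))
```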
(Weight) Sparsity in Deep Learning
SOTA Overview [Submitted on 31 Jan 2021]: Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks: https://arxiv.org/abs/2102.00554. The growing energy and performance costs of deep learning have driven th... Original 2022-04-20 18:43:40 · 3790 views · 0 comments
PINN: Physics Informed Neural Networks
Intro: https://en.wikipedia.org/wiki/Physics-informed_neural_networks. Physics-informed neural networks (PINNs) are a type of universal function approximator that can embed the knowledge of any physical laws that govern a given data-set in the learning p... Original 2022-04-15 10:42:58 · 2023 views · 0 comments
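A minimal PyTorch sketch of the idea for the toy ODE u' = -u: the loss adds a physics residual, computed with autograd, to the ordinary data term. Everything here (network size, collocation points) is illustrative, not from any specific PINN paper.

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))

def pinn_loss(t_data, u_data, t_phys):
    """Data term + physics residual for the toy ODE du/dt = -u."""
    data_loss = ((net(t_data) - u_data) ** 2).mean()

    t = t_phys.requires_grad_(True)
    u = net(t)
    du_dt = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    physics_loss = ((du_dt + u) ** 2).mean()   # residual of u' + u = 0
    return data_loss + physics_loss

t_data, u_data = torch.zeros(1, 1), torch.ones(1, 1)   # condition u(0) = 1
t_phys = torch.linspace(0, 2, 50).reshape(-1, 1)        # collocation points
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = pinn_loss(t_data, u_data, t_phys)
    loss.backward()
    opt.step()
```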
Batch Normalization: BP
Understanding the backward pass through the Batch Normalization Layer. Flair of Machine Learning, posted on February 12, 2016. (For an intro and how it could possibly work, see: Batch Normalization_EverNoob的博客-CSDN博客.) (For a concise mathematical solution, see: B... Reposted 2022-03-28 15:46:10 · 274 views · 0 comments
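For reference, the compact closed form that such derivations arrive at, per feature over a batch of size N, in a hedged NumPy sketch:

```python
import numpy as np

def batchnorm_backward(dout, x, gamma, eps=1e-5):
    """Gradients of batch norm w.r.t. x, gamma, beta for x of shape (N, D)."""
    mu = x.mean(axis=0)
    std = np.sqrt(x.var(axis=0) + eps)
    xhat = (x - mu) / std                       # normalized input (forward)

    dbeta = dout.sum(axis=0)
    dgamma = (dout * xhat).sum(axis=0)
    dxhat = dout * gamma
    # compact form after folding the d(var) and d(mu) terms back in:
    dx = (dxhat - dxhat.mean(axis=0) - xhat * (dxhat * xhat).mean(axis=0)) / std
    return dx, dgamma, dbeta

x, dout = np.random.randn(8, 3), np.random.randn(8, 3)
dx, dgamma, dbeta = batchnorm_backward(dout, x, gamma=np.ones(3))
```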
Batch Normalization: Basics and Intuition
Wiki Intro: https://en.wikipedia.org/wiki/Batch_normalization ==> this wiki article is technical enough for further reference on related concepts and deeper looks. Batch normalization (also known as batch norm) is a method used to make artificial neu... Reposted 2022-03-28 14:27:09 · 1128 views · 0 comments
Wireless Communication and Wifi
For communication channels and channel interference, see: https://blog.youkuaiyun.com/maxzcl/article/details/123753591. Wireless Communication Briefing. Wireless Communication: Introduction, Types and Applications. Wireless communication is the fastest growing an... Reposted 2022-03-26 14:34:47 · 2274 views · 0 comments
Minifloats: FP Types for DNNs
https://en.wikipedia.org/wiki/Minifloat. In computing, minifloats are floating-point values represented with very few bits. Predictably, they are not well suited for general-purpose numerical calculations. They are used for special purposes, most often in... Original 2022-03-24 12:59:55 · 2709 views · 0 comments
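A small sketch decoding a generic IEEE-like minifloat with e exponent bits and m mantissa bits (the defaults pick an E4M3-style 8-bit layout with bias 7); real fp8 formats differ in NaN/infinity handling, which is omitted here.

```python
def decode_minifloat(bits, e_bits=4, m_bits=3, bias=7):
    """Decode an IEEE-like minifloat (default: an E4M3-style 8-bit layout)."""
    sign = -1.0 if (bits >> (e_bits + m_bits)) & 1 else 1.0
    exp = (bits >> m_bits) & ((1 << e_bits) - 1)
    man = bits & ((1 << m_bits) - 1)
    if exp == 0:                                  # subnormal number
        return sign * man / (1 << m_bits) * 2.0 ** (1 - bias)
    return sign * (1 + man / (1 << m_bits)) * 2.0 ** (exp - bias)

# 0_0111_000 -> +1.0 ; 1_1000_100 -> -3.0
print(decode_minifloat(0b00111000), decode_minifloat(0b11000100))
```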
int8 quantization in DNN
From: What Is int8 Quantization and Why Is It Popular for Deep Neural Networks? - MATLAB & Simulink. By Ram Cherukuri, MathWorks. Deep learning deployment on the edge for... Reposted 2021-12-28 11:08:42 · 299 views · 0 comments
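A minimal sketch of the affine (scale and zero-point) scheme that such int8 quantization builds on; the float range is calibrated with a simple min/max here, while real toolchains offer more options.

```python
import numpy as np

def quantize_int8(x):
    """Affine quantization: map float range [min, max] onto int8 [-128, 127]."""
    scale = (x.max() - x.min()) / 255.0
    zero_point = np.round(-128 - x.min() / scale)
    q = np.clip(np.round(x / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

x = np.random.randn(5).astype(np.float32)
q, s, z = quantize_int8(x)
print(x, dequantize(q, s, z))   # recovered values match within one step
```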
Deep Learning with 4-bit systems (int4)
4-bit introduction paper: https://papers.nips.cc/paper/2020/file/13b919438259814cd5be8cb45877d577-Paper.pdf; 4-bit CNN paper: https://arxiv.org/pdf/2009.06488.pdf; short news articles: https://medium.com/swlh/4-bit-deep-learning-d1614c0883e3, https://towa... Reposted 2021-12-25 11:04:42 · 238 views · 0 comments
warpAffine and Affine Transformation
Interface from OpenCV: Geometric Image Transformations. warpAffine(): void cv::warpAffine(InputArray src, OutputArray dst, InputArray M, Size dsize, int flags = INTER_LINEAR, int borderMode = BORDER_CONSTANT, const Scalar& borderValue = Scalar()). Reposted 2021-11-24 12:24:15 · 164 views · 0 comments
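A short usage sketch via the Python bindings of the same interface, rotating an image about its center (the zero array stands in for a real cv2.imread result):

```python
import cv2
import numpy as np

img = np.zeros((200, 300, 3), dtype=np.uint8)   # stand-in for cv2.imread(...)
h, w = img.shape[:2]

# 2x3 affine matrix: rotate 30 degrees about the image center, no scaling
M = cv2.getRotationMatrix2D(center=(w / 2, h / 2), angle=30, scale=1.0)
dst = cv2.warpAffine(img, M, dsize=(w, h), flags=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT, borderValue=(0, 0, 0))
```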
CNN Parameter Estimation
From: https://towardsdatascience.com/understanding-and-calculating-the-number-of-parameters-in-convolution-neural-networks-cnns-fc88790d530d. Reposted 2021-11-04 17:06:04 · 96 views · 0 comments
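The counting rule the article walks through: a conv layer holds one k_h x k_w x c_in kernel per output channel, plus a bias, so params = (k_h * k_w * c_in + 1) * c_out; likewise (n_in + 1) * n_out for a dense layer. A minimal sketch:

```python
def conv_params(k_h, k_w, c_in, c_out, bias=True):
    """Parameters in a conv layer: one k_h x k_w x c_in kernel per out channel."""
    return (k_h * k_w * c_in + (1 if bias else 0)) * c_out

def dense_params(n_in, n_out, bias=True):
    return (n_in + (1 if bias else 0)) * n_out

# e.g. a 3x3 conv from 64 to 128 channels: (3*3*64 + 1) * 128 = 73,856
print(conv_params(3, 3, 64, 128), dense_params(4096, 1000))
```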
Large Language Models: A New Moore's Law?
Published October 26, 2021. Update on GitHub. By Julien Simon. Opinion piece. A few days ago, Microsoft and NVIDIA introduced Megatron-Turing NLG 530B, a Transformer-based model hailed as "the world's largest and most powerful generative language m... Reposted 2021-11-03 11:23:08 · 512 views · 0 comments
Mathematical Morphology and Filter Dilation
Morphology: https://en.wikipedia.org/wiki/Mathematical_morphology. Dilation: https://en.wikipedia.org/wiki/Dilation_(morphology). Dilated Convolution Example: https://towardsdatascience.com/understanding-2d-dilated-convolution-operation-with-examples. Original 2021-11-01 14:47:49 · 458 views · 0 comments
Transformer and Bert
BERT is the main application of the transformer model. Transformer main article: The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time. Transformer main video: https://www.youtube.com/watch?v=ugWDIIOHtPA (~10:00). Original 2021-10-19 08:57:03 · 173 views · 0 comments
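The core operation from the illustrated article is scaled dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V; a minimal single-head NumPy sketch, with no masking or batching:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)   # (4, 8): one mix of V per query
```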
Caffe Eltwise (combined share)
From Eltwise层解析 - greathuman - 博客园 (cnblogs). Although the Concat layer makes use of contextual semantic information, it merely concatenates the features; the reason it helps is that it increases the channel count without adding algorithmic complexity. Is there a way to directly associate contextual semantic information? The answer is the Eltwise layer, which is widely used and reliably effective, and which we often compare with Concat, so I usually discuss the two layers together. It is generally believed that an "encoder-decoder" process like this helps exploit higher-dimensional feature map information and improves detection of small objects. Eltwi... Reposted 2021-10-18 09:43:13 · 337 views · 0 comments
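A small NumPy sketch contrasting the two fusion styles (Caffe's Eltwise layer supports SUM, PROD, and MAX; SUM and MAX shown):

```python
import numpy as np

a = np.random.randn(1, 64, 32, 32)   # (N, C, H, W) feature maps
b = np.random.randn(1, 64, 32, 32)

concat = np.concatenate([a, b], axis=1)   # Concat: channels grow to 128
eltwise_sum = a + b                       # Eltwise SUM: shape unchanged
eltwise_max = np.maximum(a, b)            # Eltwise MAX
print(concat.shape, eltwise_sum.shape)    # (1, 128, 32, 32) (1, 64, 32, 32)
```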
cnblogs (博客园) neural network posts by Liu Jianping (Pinard): RNN
RNN: Recurrent Neural Network (RNN) model with forward and backward propagation algorithms; LSTM model with forward and backward propagation algorithms. Reposted 2021-04-25 22:03:01 · 654 views · 2 comments
cnblogs (博客园) neural network post summaries by Liu Jianping (Pinard): CNN
CNN: Convolutional Neural Network (CNN) model structure; Convolutional Neural Network (CNN) forward propagation algorithm; Convolutional Neural Network (CNN) backward propagation algorithm. Reposted 2021-04-25 22:02:04 · 659 views · 0 comments