
Machine Learning
hebastast
Articles in this column
Santander unhappy customer
import pandas as pd import numpy as np import warnings #drop warnings generated by warnings.filterwarnings('ignore') import seaborn as sns %matplotlib inline import matplotlib.pyplot as plt sns.set(s…
Original · 2016-08-29 19:46:42 · 1212 views · 0 comments
neural network -recognize handwritten digits
""" network.py ~~~~~~~~~~ A module to implement the stochastic gradient descent learning algorithm for a feedforward neural network. Gradients are calculated using backpropagation. Note that I have f原创 2016-09-25 22:27:52 · 522 阅读 · 0 评论 -
Facial_keypoints_deeplearning_cnn
import os import numpy as np import pandas as pd from sklearn.utils import shuffle FTRAIN='./input/training.csv' FTEST='./input/test.csv' df_train=pd.read_csv(FTRAIN) df_train['Image']=df_train['Image'].…
Original · 2016-10-12 14:04:21 · 1034 views · 0 comments
theano_scan_demo_compute_Jacobian_matrix
import numpy as np import theano.tensor as T import theano floatX='float32' V=T.vector('V') A=T.matrix('A') y=T.tanh(T.dot(V,A)) results,updates=theano.scan(lambda i:T.grad(y[i],V),sequences=[T.arange(y.…
Original · 2016-10-06 20:15:26 · 350 views · 0 comments
Principal Component Analysis
The goal of PCA is to apply a change of basis so that the variances of the components are sorted from largest to smallest and the covariance between components is 0 (they are linearly uncorrelated). In the covariance matrix of the data, the diagonal entries are the variances of the components and the off-diagonal entries are the covariances. Our goal is therefore to find an orthonormal basis such that, after the change of basis, the component variances are in descending order and the covariance between different components is 0. The relationship between the covariance matrix of the original data and the covariance matrix after the change of basis is as follows: the original data is X, the covariance matrix of X is C, and Y = PX, where P is the transformation matrix and Y is the transformed…
Original · 2017-02-08 20:41:42 · 458 views · 0 comments
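A minimal numpy sketch of the relationship described above, assuming the rows of X are components (features) and the columns are observations, and taking P to be the matrix whose rows are the eigenvectors of C: after Y = PX, the covariance of Y equals P C P^T, which is diagonal with the variances sorted from largest to smallest.

```python
import numpy as np

# Toy data: 3 components (rows) x 200 observations (columns), zero-mean per component.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 200))
X -= X.mean(axis=1, keepdims=True)

C = X @ X.T / X.shape[1]               # covariance matrix of the original data

eigvals, eigvecs = np.linalg.eigh(C)   # C is symmetric
order = np.argsort(eigvals)[::-1]      # sort variances from largest to smallest
P = eigvecs[:, order].T                # rows of P form the orthonormal basis

Y = P @ X                              # change of basis
D = Y @ Y.T / Y.shape[1]               # covariance after the transform

# D equals P C P^T: diagonal, decreasing variances, ~0 covariances.
print(np.round(D, 6))
print(np.allclose(D, P @ C @ P.T))
```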
Local Response Normalization (LRN)
This concept was introduced in AlexNet; click here to learn more. The local response normalization algorithm was inspired by real neurons; as the authors put it, it “bears some resemblance to the local contrast…
Reposted · 2017-03-27 14:20:26 · 1304 views · 0 comments
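For reference, AlexNet's across-channel LRN divides each activation by a term built from the sum of squared activations in the n neighbouring channels. Below is a minimal numpy sketch of that formula; the parameter names (k, alpha, beta, n) follow the paper, and the default values used here are the ones reported there.

```python
import numpy as np

def local_response_norm(a, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Across-channel LRN in the style of AlexNet.

    a: activations with shape (channels, height, width). Each activation is
    divided by (k + alpha * sum of squares over n neighbouring channels) ** beta.
    """
    C = a.shape[0]
    out = np.empty_like(a)
    for i in range(C):
        lo, hi = max(0, i - n // 2), min(C - 1, i + n // 2)
        denom = (k + alpha * np.sum(a[lo:hi + 1] ** 2, axis=0)) ** beta
        out[i] = a[i] / denom
    return out

# Example: normalize a random activation volume with 8 channels.
acts = np.random.rand(8, 4, 4).astype('float32')
print(local_response_norm(acts).shape)   # (8, 4, 4)
```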
word2vec
Part 1: The Model. The skip-gram neural network model is actually surprisingly simple in its most basic form; I think it's all the little tweaks and enhancements that start to clutter the explanatio…
Reposted · 2017-03-27 14:22:31 · 875 views · 0 comments
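As a companion to the entry above, here is a minimal sketch (not from the original post) of how skip-gram training pairs are generated from a sentence: for each target word, every word within a fixed window becomes a (target, context) example, and the network is trained to predict the context word from the target word. The window size of 2 is just an illustrative choice.

```python
def skipgram_pairs(tokens, window=2):
    """Yield (target, context) pairs as used to train a skip-gram model."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

sentence = "the quick brown fox jumps over the lazy dog".split()
print(skipgram_pairs(sentence)[:6])
# [('the', 'quick'), ('the', 'brown'), ('quick', 'the'), ('quick', 'brown'), ...]
```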
Feature discretization, feature crossing, and discretizing continuous features
1. Feature engineering for online advertising. The post 《互联网广告综述之点击率系统》 (an overview of click-through-rate systems for online advertising) discusses the CTR system behind online ads. As it shows, the logistic regression model used there is fairly simple and practical. Although there are several ways to train it, the goal is the same; the training result has a considerable impact on performance, but the training method itself is not decisive, because what is being trained are the weights of the individual features, and small differences in the weights do not cause large changes in CTR. Once the training method is fixed, what plays the decisive role in CTR estimation is the sel…
Original · 2017-04-28 09:55:15 · 954 views · 0 comments
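The techniques named in the title, discretizing a continuous feature into buckets and crossing two discrete features, are typical of the feature work a linear CTR model depends on. A minimal pandas sketch (the column names are made up for illustration):

```python
import pandas as pd

# Hypothetical toy data: one continuous feature and two discrete ones.
df = pd.DataFrame({
    'age':    [18, 25, 33, 47, 62],
    'gender': ['m', 'f', 'f', 'm', 'f'],
    'city':   ['bj', 'sh', 'bj', 'gz', 'sh'],
})

# Discretize the continuous feature into buckets (equal-width bins here;
# equal-frequency binning via pd.qcut is another common choice).
df['age_bucket'] = pd.cut(df['age'], bins=[0, 20, 30, 40, 60, 100], labels=False)

# Feature cross: concatenate two discrete features into one, so a linear
# model such as logistic regression can learn their interaction.
df['gender_x_city'] = df['gender'] + '_' + df['city']

# One-hot encode the resulting categorical features for the linear model.
X = pd.get_dummies(df[['age_bucket', 'gender_x_city']].astype(str))
print(X.columns.tolist())
```

Discretization lets logistic regression fit a piecewise-constant response to a continuous variable, and the cross feature lets it capture an interaction it could not express through the two features separately.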
DigitRecongnizer_CNN_DeepLearning
import numpy as np import pandas as pd %matplotlib inline import matplotlib.pyplot as plt import matplotlib.cm as cm from lasagne.layers import Conv2DLayer from lasagne.layers import MaxPool2DLayer fr…
Original · 2016-10-10 22:16:05 · 649 views · 0 comments
backpropagation
1. A simple way to understand the gradient. For f(x,y) = xy it is easy to derive the partial derivatives of f(x,y) with respect to x and y; the partial derivative with respect to each variable tells you how sensitive the whole function is to that single variable. 2. The chain rule. For f(x,y,z) = (x+y)z, the formula above can be decomposed into q = x + y and f = qz; taking the partial derivative with respect to y works the same way. # set some inputs x = -2; y = 5; z = -…
Original · 2016-09-08 10:33:36 · 504 views · 0 comments
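A minimal sketch completing the staged computation that the truncated snippet starts; the value of z is cut off in the preview, so -4 is assumed here purely for illustration. The backward pass applies the chain rule to each stage, exactly as the entry describes.

```python
# Forward pass: f(x, y, z) = (x + y) * z, staged as q = x + y and f = q * z.
x, y = -2, 5
z = -4          # the preview cuts off here; -4 is assumed for illustration

q = x + y       # q = 3
f = q * z       # f = -12

# Backward pass (chain rule): propagate df/df = 1 back through each stage.
dfdq = z                 # f = q * z  =>  df/dq = z
dfdz = q                 # f = q * z  =>  df/dz = q
dfdx = dfdq * 1.0        # q = x + y  =>  dq/dx = 1, so df/dx = df/dq * dq/dx
dfdy = dfdq * 1.0        # likewise df/dy = df/dq * dq/dy

print(dfdx, dfdy, dfdz)  # -4.0 -4.0 3
```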
digit_recongnition
# Standard scientific Python imports %matplotlib inline import matplotlib.pyplot as plt # Import datasets, classifiers and performance metrics from sklearn import datasets, svm, metrics # The digits data…
Original · 2016-09-01 16:23:15 · 831 views · 0 comments
titanic prediction
# Imports # pandas import pandas as pd from pandas import Series,DataFrame # numpy, matplotlib, seaborn import numpy as np import matplotlib.pyplot as plt import seaborn as sns sns.set_style('whitegrid')…
Original · 2016-08-25 21:44:59 · 1781 views · 0 comments
softmax_linear_classifier
import numpy as np %matplotlib inline import matplotlib.pyplot as plt N = 100 # number of points per class D = 2 # dimensionality K = 3 # number of classes X = np.zeros((N*K,D)) # data matrix (each row…
Original · 2016-09-18 22:43:34 · 1743 views · 0 comments
neural_network
import numpy as np %matplotlib inline import matplotlib.pyplot as plt N = 100 # number of points per class D = 2 # dimensionality K = 3 # number of classes X = np.zeros((N*K,D)) # data matrix (each row…
Original · 2016-09-18 22:44:18 · 565 views · 0 comments
iris_visualization
import pandas as pd import warnings #ignore the warnings generated by seaborn warnings.filterwarnings('ignore') import seaborn as sns %matplotlib inline import matplotlib.pyplot as plt sns.set(s…
Original · 2016-08-26 19:36:37 · 1068 views · 0 comments
DigitRecognizer
from sklearn.ensemble import RandomForestClassifier import numpy as np import pandas as pd dataset=pd.read_csv('input/train.csv') test=pd.read_csv('input/test.csv') dataset.describe()…
Original · 2016-09-06 19:01:53 · 688 views · 0 comments
deeplearning_cnn_theano
#### Libraries # Standard library import gzip import pickle # Third-party libraries import numpy as np import theano import theano.tensor as T from theano.tensor.nnet import conv from theano.tensor.nne…
Original · 2016-10-08 20:07:25 · 891 views · 1 comment
logistic_regression
import numpy import theano import theano.tensor as T rng = numpy.random N = 400 # training sample size feats = 784 # number of input varia…
Original · 2016-09-07 11:33:04 · 1169 views · 0 comments
Implementing the k-means algorithm on Hadoop: a MapReduce approach
To implement the k-means algorithm as a MapReduce program, the idea might go like this: 1. Use a global variable to hold the centroids from the previous iteration. 2. In map, compute the distance between each centroid and the sample, find the centroid closest to the sample, and emit that centroid as the key with the sample as the value. 3. In reduce, the input key is a centroid and the values are the samples assigned to it; recompute the cluster center and put it into a global variable t. 4. In main, compare the previous centroids with the current centroids to decide whether…
Reposted · 2017-04-24 10:36:29 · 984 views · 0 comments
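A minimal Python sketch of one iteration of the map/reduce logic described in steps 1-4 above (it is not Hadoop code; it simulates the key grouping locally just to show what map and reduce each compute):

```python
import numpy as np
from collections import defaultdict

def map_step(sample, centroids):
    """Map: emit (index of the nearest centroid, sample) for one input record."""
    dists = [np.linalg.norm(sample - c) for c in centroids]
    return int(np.argmin(dists)), sample

def reduce_step(grouped):
    """Reduce: for each centroid key, recompute the center as the mean of its samples."""
    return {k: np.mean(samples, axis=0) for k, samples in grouped.items()}

def kmeans_iteration(data, centroids):
    grouped = defaultdict(list)
    for x in data:                        # the shuffle phase groups emitted pairs by key
        key, value = map_step(x, centroids)
        grouped[key].append(value)
    new_centroids = reduce_step(grouped)
    # Keep the old centroid if no sample was assigned to it this round.
    return [new_centroids.get(i, c) for i, c in enumerate(centroids)]

# One driver round; per step 4, main would compare old and new centroids
# and stop once they no longer change (or after a fixed number of iterations).
data = np.random.rand(100, 2)
centroids = [data[0], data[50], data[99]]
print(np.round(kmeans_iteration(data, centroids), 3))
```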