
Data Science Introduction and Practice
范德彪陕西分彪
Chosen one
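Plotting a probability density curve with seaborn
    # IPython notebook magic: figures render right after each cell runs, so plt.show() is no longer needed
    %matplotlib inline
    import numpy as np  # NumPy, used to generate the array
    import seaborn as sns  # conventionally abbreviated as sns
    sns.set()  # switch to seaborn's default settings
    x = np.random.randn(100)  # draw 100 random numbers from a normal distribution
    sns.kdeplot(x)
    # normally y = …
Original · 2022-02-04 21:51:08 · 1825 views · 0 comments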
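To make the curve drawn by `sns.kdeplot` less of a black box, here is a minimal hand-rolled sketch of a one-dimensional Gaussian kernel density estimate in plain NumPy. It is not seaborn's implementation: the helper name `gaussian_kde_1d`, the rule-of-thumb bandwidth, and the fixed seed are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on `grid` (illustrative helper)."""
    # One Gaussian bump per sample, evaluated at every grid point
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    # Average the bumps and rescale so the estimate integrates to 1
    return bumps.mean(axis=1) / bandwidth


rng = np.random.default_rng(0)   # fixed seed, chosen here for repeatability
x = rng.standard_normal(100)     # 100 standard-normal samples, as in the snippet
bandwidth = 1.06 * x.std(ddof=1) * len(x) ** (-1 / 5)  # rule-of-thumb choice (assumption)
grid = np.linspace(x.min() - 5 * bandwidth, x.max() + 5 * bandwidth, 1000)
density = gaussian_kde_1d(x, grid, bandwidth)

# Riemann-sum check: the area under a density should be close to 1
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

A smaller bandwidth gives a bumpier curve and a larger one a smoother curve, which is the main knob to experiment with.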
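ToPILImage
    import matplotlib.pyplot as plt
    import numpy as np
    import torch
    from torchvision.transforms import ToPILImage

    show = ToPILImage()  # converts a tensor into a PIL image
    data = torch.rand(3, 64, 128)  # random tensor: 3 channels, 64 high, 128 wide
    data = show(data)
    plt.imshow(data)
    plt.show()
Original · 2021-04-15 16:18:41 · 2352 views · 1 comment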
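Plotting 1 / (1 + np.exp(-x)) with plt
    import matplotlib.pyplot as plt
    import numpy as np

    x = np.linspace(-20, 20, 10000)
    y = np.array(1 / (1 + np.exp(-x)))
    # The exponent 0 binds first, so y1 is the constant 0.5 at every x
    y1 = np.array(1 / (1 + np.exp(-x)) ** 0) / 2.0
    plt.plot(x, y)
    plt.plot(x, y1)
    plt.axhline(y=0.7, ls="-", c="red")  # horizontal reference line at y = 0.7
    plt.axis([-20, 20, -0.1, 1])
    plt.show()
Original · 2021-03-30 09:44:48 · 1209 views · 0 comments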
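The plotted expression is the logistic (sigmoid) function. As a small sketch, the checks below confirm three properties that the plot suggests visually; the function name `sigmoid` and the variable names are chosen here and do not come from the original post.

```python
import numpy as np


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


x = np.linspace(-20, 20, 10000)
y = sigmoid(x)

midpoint = sigmoid(0.0)                          # the curve passes through 0.5 at x = 0
symmetric = np.allclose(sigmoid(-x), 1.0 - y)    # sigmoid(-x) equals 1 - sigmoid(x)
bounded = bool((y > 0).all() and (y < 1).all())  # values stay strictly between 0 and 1
print(midpoint, symmetric, bounded)
```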
Visualization
    import matplotlib.pyplot as plt
    import PIL as img
    import cv2
    import numpy as np

    img1 = cv2.imread('./ImgDB/tiger_001.jpg')
    img2 = cv2.imread('./ImgDB/tiger_002.jpg')
    img3 = cv2.imread('./ImgDB/tiger_003.jpg')
    img…
Original · 2021-03-12 21:05:38 · 152 views · 0 comments
Data Science Introduction and Practice: Seaborn 002, Heatmaps and More
show me the code
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
    import matplotlib.pyplot as plt
    import seaborn as sns

    s1 = Series(np.random.randn(1000))
    plt.hist(s1)
    s1.plot(kind='kde')
Then use sns. It is easy to picture how the plot looks with hist, kde = False…
Original · 2020-11-28 22:41:24 · 181 views · 0 comments
Data Science Introduction and Practice: Seaborn 001
Goal: draw a scatter plot of petal and sepal length, with the classes distinguished by color.
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
    import matplotlib.pyplot as plt
    import seaborn as sns

    iris = pd.read_csv('iris.csv')
    iris.head()
    type(iris)
Drawing the scatter plot with matplotlib
    print(iris.Na…
Original · 2020-11-28 22:20:53 · 127 views · 0 comments
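One way to get a different color per class in plain matplotlib is to issue one `scatter` call per group. The sketch below uses made-up data in place of `iris.csv`; the column names `SepalLength`, `PetalLength`, and `Name` are assumptions, since the preview is cut off before it shows them.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for iris.csv; the column names are hypothetical
rng = np.random.default_rng(0)
flowers = pd.DataFrame({
    "SepalLength": rng.normal(5.8, 0.8, 30),
    "PetalLength": rng.normal(3.7, 1.7, 30),
    "Name": np.repeat(["setosa", "versicolor", "virginica"], 10),
})

fig, ax = plt.subplots()
# One scatter call per class: each call takes the next color in the color cycle
for name, group in flowers.groupby("Name"):
    ax.scatter(group["SepalLength"], group["PetalLength"], label=name)
ax.set_xlabel("SepalLength")
ax.set_ylabel("PetalLength")
ax.legend()
```

In a notebook the figure appears inline; in a script, `fig.savefig(...)` or `plt.show()` would be the final step.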
Data Science Introduction and Practice: Matplotlib Plotting, hist
A histogram shows the overall distribution of the data.
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
    import matplotlib.pyplot as plt

    s = Series(np.random.randn(1000))
    print(s)
    plt.hist(s)
Adjusting the bar width
    plt.hist(s, rwidth=0.5)
Another example
    # another example
    a = np.arange(10)
    print(a)
    p…
Original · 2020-11-28 22:09:31 · 164 views · 0 comments
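Behind the picture, a histogram is just a set of bin edges and a count per bin. The sketch below computes them with `np.histogram`, using 10 equal-width bins to match the default of `plt.hist`; the seed and variable names are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen here for repeatability
s = rng.standard_normal(1000)

# 10 equal-width bins between the smallest and the largest value
counts, edges = np.histogram(s, bins=10)
print(counts)   # how many values fall into each bin
print(edges)    # 11 edges delimit 10 bins
```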
Data Science Introduction and Practice: Matplotlib Plotting, DataFrame
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
    import matplotlib.pyplot as plt
Create a DataFrame and plot it
    df = DataFrame(np.random.randint(1, 10, 40).reshape(10, 4), columns=['A', 'B', 'C', 'D'])
    print(df)
    df.plot()
bar
    df.plot(kind='bar')
Original · 2020-11-28 21:57:26 · 1077 views · 0 comments
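`DataFrame.plot` returns a matplotlib `Axes`, which makes it easy to see what each plot kind draws. The sketch below is an illustration with a fixed seed and variable names chosen here: the line plot has one line per column, and the bar plot has one bar per cell.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(1, 10, size=(10, 4)), columns=["A", "B", "C", "D"])

line_ax = df.plot()           # default kind: one line per column
bar_ax = df.plot(kind="bar")  # one group of four bars per row

n_lines = len(line_ax.get_lines())
n_bars = len(bar_ax.patches)
print(n_lines, n_bars)
```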
Data Science Introduction and Practice: Matplotlib Plotting, Series
Import the required packages
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
    import matplotlib.pyplot as plt
Create the data
    s1 = Series(np.random.randn(10)).cumsum()
    s2 = Series(np.random.randn(10)).cumsum()
    s = Series([1, 2, 3, 4, 5])
    print(s.cumsum())  # cumulative sum
Plot s1
    s1…
Original · 2020-11-28 21:02:48 · 541 views · 0 comments
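As a quick worked example of what `cumsum` does, each element of the result is the sum of all elements up to and including that position. The variable name `running_total` is chosen here for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])
running_total = s.cumsum()
print(running_total.tolist())  # [1, 3, 6, 10, 15]
```

Applied to random normal steps, as with `s1` and `s2` above, the same operation produces a random walk, which is why those series make for interesting line plots.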
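Data Science Introduction and Practice: Matplotlib Plotting Basics, Part 1
Why plot with Python: GUI tools are too complicated, Excel is too much of a headache, and Python is simple and free.
What is matplotlib: a Python package for 2D plotting, with many extensions such as seaborn.
    # hello world in matplotlib
    import matplotlib.pyplot as plt
    import numpy as np
    # magic function..
    %matplotlib inline
    x = np.linspace(0, 2 * np.pi, 100)
    y = np.sin(x)
    plt.plot(x, y)
Original · 2020-11-28 20:51:29 · 174 views · 0 comments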
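Data Science Introduction and Practice: Playing with pandas, Hands-on Project: Analyzing Flight Delays
Import the required packages
    import numpy as np
    import pandas as pd
    from pandas import DataFrame, Series
Read the data file
    df = pd.read_csv('usa_flights.csv')
Check how much data there is
    print(df.size)  # total number of elements
    print(df.shape)  # number of rows and columns
Look at the data
    print(df.head())  # one month of data..
    # arr_delay is the main column to look at to tell whether a flight was delayed
About the data: it includes the flight date, airline, flight number, …
Original · 2020-11-28 20:38:59 · 763 views · 1 comment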
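The difference between `size` and `shape`, and one possible first step of a delay analysis, can be sketched on a tiny made-up table. Only the column name `arr_delay` comes from the preview above; the `carrier` column, the values, and the rule "delayed means `arr_delay > 0`" are assumptions.

```python
import pandas as pd

# Tiny made-up sample standing in for usa_flights.csv
flights = pd.DataFrame({
    "carrier": ["AA", "AA", "DL", "DL", "UA"],   # hypothetical column
    "arr_delay": [12.0, -5.0, 0.0, 30.0, -2.0],
})

n_cells = flights.size            # rows * columns
n_rows, n_cols = flights.shape    # (rows, columns)

# Assumption: a flight counts as delayed when arr_delay is positive
flights["delayed"] = flights["arr_delay"] > 0
delay_rate = flights.groupby("carrier")["delayed"].mean()
print(n_cells, (n_rows, n_cols))
print(delay_rate)
```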
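Data Science Introduction and Practice: Playing with pandas, Part 7: Pivot Tables
Pivot tables
Import the required packages
    # pivot tables
    import numpy as np
    import pandas as pd
    from pandas import DataFrame, Series
Take a look…
    df = pd.read_excel('sales-funnel.xlsx')
    print(df.head())
Build a pivot table
    # build a pivot table
    print('-' * 100)
    print(pd.pivot_table(df, index=['Name']))
The aggregation parameter
    # the aggregation parameter
    print(pd.pivot_tabl…
Original · 2020-11-27 09:12:45 · 189 views · 1 comment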
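Since the preview stops just before the aggregation call, here is a small sketch of how `aggfunc` changes a pivot table. The data is made up; the `Price` and `Quantity` columns are assumptions, with only `Name` taken from the snippet above.

```python
import pandas as pd

# Made-up stand-in for sales-funnel.xlsx
sales = pd.DataFrame({
    "Name": ["Acme", "Acme", "Birch"],
    "Price": [100, 50, 30],      # hypothetical column
    "Quantity": [1, 2, 3],       # hypothetical column
})

# Without aggfunc, pivot_table averages the values within each group
mean_table = pd.pivot_table(sales, index=["Name"], values=["Price", "Quantity"])

# aggfunc="sum" adds them up instead
sum_table = pd.pivot_table(sales, index=["Name"], values=["Price"], aggfunc="sum")
print(mean_table)
print(sum_table)
```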
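Data Science Introduction and Practice: Playing with pandas, Part 7: Binning, Grouping, and Aggregation
First import the required packages
    import pandas as pd
    import numpy as np
    from pandas import Series, DataFrame
    # data binning
Data binning
Create an array of length 20 with values from 25 up to, but not including, 100
    score_list = np.random.randint(25, 100, size=20)
    print(score_list)
Set the bin edges and count how many values fall into each interval
    # count the values in each interval; it just occurred to me that a histogram could do this count
    bins = [0,59,…
Original · 2020-11-26 23:08:40 · 262 views · 0 comments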
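Binning with `pd.cut` can be sketched as follows. The bin edges, the labels, and the fixed scores are illustrative choices made here, not the ones from the truncated snippet.

```python
import numpy as np
import pandas as pd

scores = np.array([25, 59, 60, 75, 80, 99])
bins = [0, 60, 80, 100]           # illustrative edges, not the ones from the snippet
labels = ["low", "mid", "high"]   # hypothetical labels

# Intervals are right-inclusive by default: (0, 60], (60, 80], (80, 100]
binned = pd.cut(scores, bins, labels=labels)

# Count the values per bin, listed in bin order
counts = pd.Series(binned).value_counts().sort_index()
print(counts)
```

Note that 60 lands in `low` and 80 in `mid` because the right edge belongs to the interval; passing `right=False` to `pd.cut` flips that convention.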
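Data Science Introduction and Practice: Playing with pandas, Part 6: Time Series
Basic time series operations
First import the required packages
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
    from datetime import datetime
Create a datetime object
    t1 = datetime(2009, 10, 20, 0, 0)
    print(t1)
Create several datetime objects
    date_list = [
        datetime(2016, 9, 1),
        datetime(2017, 9, 2),
        …
Original · 2020-11-23 21:09:39 · 205 views · 1 comment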
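Once dates are used as an index, a Series can be looked up and sliced by date. The sketch below uses `pd.date_range` and made-up values; the variable names and the date range are chosen here for illustration and do not continue the truncated `date_list`.

```python
from datetime import datetime

import pandas as pd

t1 = datetime(2009, 10, 20, 0, 0)

# Five consecutive days as the index of a Series
dates = pd.date_range("2016-09-01", periods=5, freq="D")
ts = pd.Series(range(5), index=dates)

value = ts.loc[datetime(2016, 9, 3)]        # lookup by a datetime label
window = ts.loc["2016-09-02":"2016-09-04"]  # date-string slices include both ends
print(t1)
print(value)
print(window)
```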
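Data Science Introduction and Practice: Playing with pandas, Part 5
Data preprocessing with apply
First read in the CSV file
    import pandas as pd
    from pandas import Series

    df = pd.read_csv('apply_demo.csv')
    print(df.head())
Check the size
    print(df.size)
Create a new Series
    s1 = Series(['a'] * 7879)
    df['A'] = s1
    print(df.head())
Print the result
Convert the a values in column A to uppercase
    df['A'] = df['A'].apply(str.upper)
    print(df.head())
Take the symbol seq… in data
Original · 2020-11-22 22:11:30 · 321 views · 0 comments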
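`apply` calls a function once per element. The sketch below repeats the uppercase step on a three-row made-up frame and, for comparison, shows the vectorized `.str` accessor; the `A_lower` column is an addition for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a"] * 3})

# apply calls str.upper on every element of the column
df["A"] = df["A"].apply(str.upper)

# The .str accessor offers the same kind of operation without an explicit apply
df["A_lower"] = df["A"].str.lower()
print(df)
```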
Data Science Introduction and Practice: Playing with pandas, Part 4
The DataFrame merge operation
First
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
Create df1 and df2
    df1 = DataFrame({'key': ['X', 'Y', 'Z', 'X'], 'data_set_1': [1, 2, 3, 4]})
    print(df1)
    df2 = DataFrame({'key': ['X', 'B', 'C'], 'data_set_2': [4, 5, 6]})
    print(df…
Original · 2020-11-22 20:42:48 · 113 views · 0 comments
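Using the same `df1` and `df2` as in the preview, the sketch below shows how the `how` parameter of `pd.merge` changes which rows survive. The result variable names are chosen here; the preview is cut off before the original merge call, so this is one plausible continuation rather than the author's exact code.

```python
import pandas as pd

df1 = pd.DataFrame({"key": ["X", "Y", "Z", "X"], "data_set_1": [1, 2, 3, 4]})
df2 = pd.DataFrame({"key": ["X", "B", "C"], "data_set_2": [4, 5, 6]})

inner = pd.merge(df1, df2, on="key")               # default how="inner": keys present in both
left = pd.merge(df1, df2, on="key", how="left")    # every row of df1 is kept
outer = pd.merge(df1, df2, on="key", how="outer")  # keys from either frame
print(inner)
print(len(left), len(outer))
```

Only `X` appears in both frames, and it appears twice in `df1`, so the inner merge has two rows; unmatched rows in the left and outer merges are filled with NaN.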
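Data Science Introduction and Practice: Playing with pandas, Part 3
Renaming the index of a DataFrame
Import the packages
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
Create df1
    df1 = DataFrame(np.arange(9).reshape(3, 3), index=['BJ', 'SH', 'GZ'], columns=['A', 'B', 'C'])
    print(df1)
Assign new values to the index
First show the original index
    print(df1.index)
Change the inde…
Original · 2020-11-22 18:07:55 · 111 views · 0 comments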
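Two common ways to rename index labels are sketched below with `DataFrame.rename`: passing a function that is applied to every label, or passing a dict that maps selected labels. The new label names are made up for illustration, and the preview does not show which approach the original post uses.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(9).reshape(3, 3),
                   index=["BJ", "SH", "GZ"], columns=["A", "B", "C"])

# A function is applied to every index label
lowered = df1.rename(index=str.lower)

# A dict renames only the labels it mentions; columns work the same way
partial = df1.rename(index={"BJ": "Beijing"}, columns={"A": "a"})
print(lowered.index.tolist())
print(partial)
```

`rename` returns a new DataFrame, so `df1` itself keeps its original labels.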
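Data Science Introduction and Practice: Playing with pandas, Part 2
Sorting a Series and a DataFrame
Import the required packages
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
Create a Series
    s1 = Series(np.random.randn(10))  # randn returns an array drawn from the standard normal distribution
Print the Series along with its values and index
Sort it by value
    s2 = s1.sort_values(ascending=True)  # smallest to largest
    print(s2)
Sort by ind…
Original · 2020-11-22 17:35:46 · 131 views · 0 comments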
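The difference between sorting by value and sorting by index is easier to see on a tiny fixed example than on random data. The values and names below are made up for illustration.

```python
import pandas as pd

s = pd.Series([3, 1, 2], index=["a", "c", "b"])

by_value = s.sort_values(ascending=True)  # orders by the data, labels travel along
by_index = s.sort_index()                 # orders by the labels, data travels along

df = pd.DataFrame({"A": [2, 3, 1], "B": [9, 8, 7]})
df_sorted = df.sort_values(by="A", ascending=False)  # whole rows move together
print(by_value)
print(by_index)
print(df_sorted)
```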
Data Science Introduction and Practice: Playing with pandas, Part 1
001
    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame

    s1 = Series([1, 2, 3], index=['A', 'B', 'C'])
    print(s1)
    s2 = Series([4, 5, 6, 7], index=['B', 'C', 'D', 'E'])
    print(s2)
    print(s1 + s2)  # values are added only where the index labels match
Output:
    A    1
    B    2
    C    3
    dtype: int64
    B…
Original · 2020-11-22 17:09:49 · 126 views · 0 comments
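The comment about matching index labels can be made concrete: labels present in only one Series produce NaN in the sum. The `fill_value` variant at the end is an addition for illustration and is not part of the original snippet.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["A", "B", "C"])
s2 = pd.Series([4, 5, 6, 7], index=["B", "C", "D", "E"])

total = s1 + s2                    # aligned on labels; unmatched labels become NaN
filled = s1.add(s2, fill_value=0)  # treat a missing side as 0 instead
print(total)
print(filled)
```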
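Data Science Introduction and Practice: Getting Started with numpy & pandas
001
    import numpy as np
    # create a Python list
    list_1 = [1, 2, 3, 4]
    list_1
    # output: [1, 2, 3, 4]
    array_1 = np.array(list_1)
    array_1
    # output: array([1, 2, 3, 4])
    list_2 = [5, 6, 7, 8]
    # with two lists, remember to wrap them in an outer pair of brackets
    array_2 = np.array([list_1, list_2])
    array_2
    # output: array([[1, 2, 3, 4], [5,…
Original · 2020-11-17 22:40:25 · 385 views · 1 comment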
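The outer brackets matter because they turn two lists into one nested list, which NumPy reads as a two-dimensional array. A short sketch of the resulting shapes, with the inspection lines added here for illustration:

```python
import numpy as np

list_1 = [1, 2, 3, 4]
list_2 = [5, 6, 7, 8]

array_1 = np.array(list_1)            # one list gives a 1-D array
array_2 = np.array([list_1, list_2])  # a list of two lists gives a 2-D array

print(array_1.shape, array_1.ndim)
print(array_2.shape, array_2.ndim)
print(array_2[1, 0])                  # row 1, column 0
```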