Machine Learning in R: Code and Study Notes

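This article takes a detailed look at applying R to machine learning, covering data preprocessing, model training, evaluation, and tuning, with example code showing how to carry out a machine learning project in R. Readers will get to know R's machine learning libraries, such as caret and randomForest, and learn how to combine these tools to solve practical problems.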


#### packages
  install.packages("ggplot2")
  install.packages("ROCR")
  install.packages("glmnet")
  install.packages("Metrics")
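  ## DMwR has been archived on CRAN, so this call may fail on a current R setup;
  ## in that case install it from the CRAN archive.
  install.packages("DMwR")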
  install.packages("Rcpp")
   
   
  library(ggplot2)
  library(ROCR)
  library(glmnet)
  library(Metrics)
  #### Input
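  ## R >= 4.0 reads text columns as character by default; keep them as factors
  ## so the categorical predictors can be handled by SMOTE further down
  marketing<- read.csv("marketing.csv", stringsAsFactors = TRUE)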
   
  head(marketing)
  summary(marketing)
   
   
  #### Data Visualization #############################
   
  ### Average age for each occupation
  ## `fun` replaces the deprecated `fun.y` argument (ggplot2 >= 3.3.0)
  ggplot(marketing, aes(job, age)) +
    geom_bar(stat = "summary", fun = "mean", color = "black", fill = "grey", width = 0.5) +
    theme_bw() +
    labs(y = "Age",
         title = "Average Age by Occupation") +
    theme(plot.title = element_text(hjust = 0.5),
          plot.subtitle = element_text(hjust = 0.5))
   
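  ### Average age for each occupation, split by response y
  ## position = "dodge" puts the yes/no bars side by side; stacking them would add the two means together
  ggplot(marketing, aes(job, age, fill = y)) +
    geom_bar(stat = "summary", fun = "mean", position = "dodge", width = 0.5) +
    theme_bw() +
    labs(y = "Age",
         title = "Average Age by Occupation and Response")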
   
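  ### Same plot, one panel per marital status
  ggplot(marketing, aes(job, age, fill = y)) +
    geom_bar(stat = "summary", fun = "mean", position = "dodge", width = 0.5) +
    facet_wrap( ~ marital) +
    theme_bw() +
    labs(y = "Age",
         title = "Average Age by Occupation, Response and Marital Status")

  ### Age density by response (assumed intent of the trailing geom_density() call,
  ### which cannot be added to a plot that maps job to x and age to y)
  ggplot(marketing, aes(age, fill = y)) +
    geom_density(alpha = 0.5) +
    theme_bw()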
   
   
  ####Data preparation##################################
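  ## Training and Testing: stratified 70/30 split, sampled separately within
  ## each class so that train and test keep the same yes/no ratio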
  data_y<- marketing[marketing$y == "yes",]
  data_n<- marketing[marketing$y == "no", ]
   
  set.seed(1234)
  ysub<- sample(nrow(data_y), floor(nrow(data_y)*0.7))
  nsub<- sample(nrow(data_n), floor(nrow(data_n)*0.7))
   
  train_yes<- data_y[ysub,]
  train_no<- data_n[nsub,]
   
  test_yes<- data_y[-ysub,]
  test_no<- data_n[-nsub,]
   
  train<- rbind(train_yes, train_no)
  train$y<- ifelse(train$y== "yes", 1, 0)
  test<- rbind(test_yes, test_no)
  test$y<- ifelse(test$y== "yes", 1, 0)
   
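  ## Sanity check: should print 0 (every row lands in exactly one of train/test)
  nrow(marketing)- nrow(train)- nrow(test)
  ## Class proportions in the training set
  print(prop.table(table(train$y)))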
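   
  ## Illustration of the stratified split on toy data. This is a sketch, not part
  ## of the marketing analysis: the `toy` data frame and its 10/90 class mix are
  ## made-up assumptions, chosen so the proportions are easy to check by eye.
  toy <- data.frame(id = 1:100, y = rep(c("yes", "no"), times = c(10, 90)))
  set.seed(1)
  ## Sample 70% of the rows inside each class, then stack the pieces
  toy_train <- do.call(rbind, lapply(split(toy, toy$y), function(d) {
    d[sample(nrow(d), floor(nrow(d) * 0.7)), ]
  }))
  ## Everything not sampled goes to the test set
  toy_test <- toy[!toy$id %in% toy_train$id, ]
  ## Both tables show the same class proportions as the full toy data
  print(prop.table(table(toy_train$y)))
  print(prop.table(table(toy_test$y)))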
   
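  #### Explore SMOTE
  library(DMwR)
   
  X<- nrow(train_no)    # "no" class size (assumed to be the larger class)
  Y<- nrow(train_yes)   # "yes" class size
  ## perc.over: synthetic minority cases to create, as a percentage of Y
  ## (enough to bring the minority class up to about X rows)
  perc.over<- ((X-Y)*100/Y)
  ## perc.under: majority cases to keep, as a percentage of the synthetic cases
  ## (about X rows, i.e. roughly the whole majority class)
  perc.under<- X*100/(X-Y)
   
  ## SMOTE expects the target as a factor
  train$y<- as.factor(train$y)
  train_bal <- SMOTE(y ~ . , train, perc.over=perc.over, perc.under = perc.under)
   
  ## Should be close to 50/50 after balancing
  print(prop.table(table(train_bal$y)))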
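   
  ## Worked check of the two percentages above. The counts (900 "no", 100 "yes")
  ## are made-up assumptions for illustration, not figures from marketing.csv.
  ## The arithmetic follows how DMwR documents perc.over and perc.under; when the
  ## ratios are not whole numbers the realised counts may differ slightly.
  n_major <- 900
  n_minor <- 100
  over_pct  <- (n_major - n_minor) * 100 / n_minor   # 800: eight synthetic cases per minority row
  under_pct <- n_major * 100 / (n_major - n_minor)   # 112.5
  n_synthetic  <- n_minor * over_pct / 100           # synthetic minority cases created
  n_major_kept <- n_synthetic * under_pct / 100      # majority cases kept
  ## The two classes end up the same size
  print(c(minority = n_minor + n_synthetic, majority = n_major_kept))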
   
   
   
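  ################## Model result function
  ## ypredict: predicted probabilities; ytrue: 0/1 labels (numeric or factor);
  ## cutoff: probability above which a case is classified as 1
  modelperf<- function(ypredict, ytrue, cutoff) {
  library(ROCR)
  ypredict <- as.numeric(ypredict)
  ytrue<- as.numeric(as.character(ytrue))
  yresult<- ifelse(ypredict > cutoff, 1,0)
  accuracy <- 1 - mean(yresult != ytrue)
   
  ## The source listing breaks off at this point. Returning the accuracy together
  ## with ROCR's AUC is an assumed completion, not the author's original code.
  auc <- performance(prediction(ypredict, ytrue), "auc")@y.values[[1]]
  list(accuracy = accuracy, auc = auc)
  }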