Adative-lasso+灰色预测（R）_灰色预测平均相对误差-优快云博客

本文链接：https://blog.youkuaiyun.com/csqazwsxedc/article/details/52250232

本文探讨了使用R语言进行Adaptive Lasso特征选择在分析地方财政收入影响因素上的应用，并结合选定的特征，对广州市2015年财政收入及其类别进行预测。通过实例展示了如何利用数据进行模型构建和预测，旨在识别关键的财政收入影响特征。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

最近在看特征的选择，看到lasso对特征选择不错，下面直接上干货

数据为广州统计年检2015年数据

目标：

1) 梳理影响地方财政收入的关键特征，分析、识别影响地方财政收入的关键特征的选择模型；

2) 结合目标1的因素分析，对广州市2015年的财政总收入及各个类别收入进行预测。

下面为R语言代码部分

head(data)

##        x1     x2      x3       x4      x5      x6     x7      x8    x9
## 1 3831732 181.54  448.19  7571.00 6212.70 6370241 525.71  985.31 60.62
## 2 3913824 214.63  549.97  9038.16 7601.73 6467115 618.25 1259.20 73.46
## 3 3928907 239.56  686.44  9905.31 8092.82 6560508 638.94 1468.06 81.16
## 4 4282130 261.58  802.59 10444.60 8767.98 6664862 656.58 1678.12 85.72
## 5 4453911 283.14  904.57 11255.70 9422.33 6741400 758.83 1893.52 88.88
## 6 4548852 308.58 1000.69 12018.52 9751.44 6850024 878.26 2139.18 92.85
##      x10   x11   x12  x13      y
## 1  65.66 120.0 1.029 5321  64.87
## 2  95.46 113.5 1.051 6529  99.75
## 3  81.16 108.2 1.064 7008  88.11
## 4  91.70 102.2 1.092 7694 106.07
## 5 114.61  97.7 1.200 8027 137.32
## 6 152.78  98.5 1.198 8549 188.14

Data <- data

###数据概括性度量
Min=sapply(Data,min)      #最小值
Max=sapply(Data,max)      #最大值
Mean=sapply(Data,mean)    #均值
SD=sapply(Data,sd)        #方差
cbind(Min,Max,Mean,SD)

##             Min         Max         Mean           SD
## x1  3831732.000 7599295.000 5579519.9500 1.262195e+06
## x2      181.540    2110.780     765.0350 5.956983e+02
## x3      448.190    6882.850    2370.8250 1.919167e+03
## x4     7571.000   42049.140   19644.6850 1.020302e+04
## x5     6212.700   33156.830   15870.9480 8.199771e+03
## x6  6370241.000 8323096.000 7350513.6000 6.213419e+05
## x7      525.710    4454.550    1712.2390 1.184714e+03
## x8      985.310   15420.140    5705.7990 4.478400e+03
## x9       60.620     228.460     129.4935 5.050983e+01
## x10      65.660     852.560     340.2165 2.515779e+02
## x11      97.500     120.000     103.3050 5.513283e+00
## x12       1.029       1.906       1.4222 2.532348e-01
## x13    5321.000   41972.000   17273.8000 1.110919e+04
## y        64.870    2088.140     618.0840 6.092545e+02

#pearson相关系数，保留两位小数
round(cor(Data,method = c("pearson")),2)

##        x1    x2    x3    x4    x5    x6    x7    x8    x9   x10   x11
## x1   1.00  0.95  0.95  0.97  0.97  0.99  0.95  0.97  0.98  0.98 -0.29
## x2   0.95  1.00  1.00  0.99  0.99  0.92  0.99  0.99  0.98  0.98 -0.13
## x3   0.95  1.00  1.00  0.99  0.99  0.92  1.00  0.99  0.98  0.99 -0.15
## x4   0.97  0.99  0.99  1.00  1.00  0.95  0.99  1.00  0.99  1.00 -0.19
## x5   0.97  0.99  0.99  1.00  1.00  0.95  0.99  1.00  0.99  1.00 -0.18
## x6   0.99  0.92  0.92  0.95  0.95  1.00  0.93  0.95  0.97  0.96 -0.34
## x7   0.95  0.99  1.00  0.99  0.99  0.93  1.00  0.99  0.98  0.99 -0.15
## x8   0.97  0.99  0.99  1.00  1.00  0.95  0.99  1.00  0.99  1.00 -0.15
## x9   0.98  0.98  0.98  0.99  0.99  0.97  0.98  0.99  1.00  0.99 -0.23
## x10  0.98  0.98  0.99  1.00  1.00  0.96  0.99  1.00  0.99  1.00 -0.17
## x11 -0.29 -0.13 -0.15 -0.19 -0.18 -0.34 -0.15 -0.15 -0.23 -0.17  1.00
## x12  0.94  0.89  0.89  0.91  0.90  0.95  0.89  0.90  0.91  0.90 -0.43
## x13  0.96  1.00  1.00  1.00  0.99  0.94  1.00  1.00  0.99  0.99 -0.16
## y    0.94  0.98  0.99  0.99  0.99  0.91  0.99  0.99  0.98  0.99 -0.12
##       x12   x13     y
## x1   0.94  0.96  0.94
## x2   0.89  1.00  0.98
## x3   0.89  1.00  0.99
## x4   0.91  1.00  0.99
## x5   0.90  0.99  0.99
## x6   0.95  0.94  0.91
## x7   0.89  1.00  0.99
## x8   0.90  1.00  0.99
## x9   0.91  0.99  0.98
## x10  0.90  0.99  0.99
## x11 -0.43 -0.16 -0.12
## x12  1.00  0.90  0.87
## x13  0.90  1.00  0.99
## y    0.87  0.99  1.00

#加载adapt-lasso源代码

lasso.adapt.bic2<-function(x,y){
   
   
require(lars)
ok<-complete.cases(x,y)
x<-x[ok,]                           
y<-y[ok]                             
m<-ncol(x)
n<-nrow(x)
x<-as.matrix(x)                      
one <- rep(1, n)
meanx <- drop(one %*% x)/n
xc <- scale(x, meanx, FALSE)         
normx <- sqrt(drop(one %*% (xc^2)))
names(normx) <- NULL
xs <- scale(xc, FALSE, normx)        
out.ls=lm(y~xs)                      
beta.ols=out.ls$coeff[2:(m+1)]       
w=abs(beta.ols)                      
xs=scale(xs,center=FALSE,scale=1/w) 
object=lars(xs,y,