Storm Data Preprocessing

 The data is huge and messy, and there are lots of ways to preprocess it.
 The way I dealt with it is probably not the best, but it still shows what I need.

 

When I started preprocessing the data, I ran into a lot of different problems. For example, the tail of the data is a real mess: free-text remarks have spilled into columns such as X.STATE__. and X.BGN_DATE.

> tail(storm)
                                                                                                                                                                                                                                                                                                                                                                                              X.STATE__.
1769564                                                                                                                                                                                                                                                                                                                                                        68 kt (78 mph) at the Cape Lisburne AWOS.
1769565                                                                                                                               Zone 202: Blizzard conditions were observed at Barrow from approximately 1021AKST through 1700AKST on the 9th. The visibility was frequently reduced to one quarter mile or less in blowing snow. There was a peak wind gust to 46 kt (53 mph) at the Barrow ASOS.
1769566 Zone 207: Blizzard conditions were observed at Kivalina from approximately 0400AKST through 1230AKST on the 9th. The visibility was frequently reduced to one quarter of a mile in snow and blowing snow. There was a peak wind gust to 61 kt (70 mph) at the Kivalina ASOS.  The doors to the village transportation shed were blown out to sea.  Many homes lost portions of their tin roofing
1769567                                                                                                                                                                                                                                                                                                                                                                                             1.00
1769568                                                                                                                                                                                                                                                        with rainfall remaining light to moderate during most its duration.  The rainfall resulted in minor river flooding along the Little River
1769569                                                                                                                                                                                                                                                                                 The rain mixed with and changed to snow across north Alabama during the afternoon and  evening hours of the 28th
                                                                                                                                               X.BGN_DATE.
1769564                                                                                                                                                   
1769565                                                                                                                                                   
1769566                                                     and satellite dishes were ripped off of roofs. One home had its door blown off.  At Point Hope
1769567                                                                                                                                 11/28/2011 0:00:00
1769568 Big Wills Creek and Paint Rock.   A landslide occurred on Highway 35 just north of Section in Jackson County.  A driver was trapped in his vehicle
1769569                        and lasted into the 29th.  The heaviest bursts of snow occurred in northwest Alabama during the afternoon and evening hours
                                              ..........           
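To get a feel for how many rows are broken like this, one option is to count the rows whose begin date does not parse. This is only a minimal sketch on made-up toy data: `storm_toy` is a hypothetical stand-in for the real `storm` data frame, and the `%m/%d/%Y` format is an assumption based on the one valid date visible in the output above.

```r
# Hypothetical toy stand-in for the real `storm` data frame.
storm_toy <- data.frame(
  X.BGN_DATE. = c("11/28/2011 0:00:00",
                  "",
                  " and satellite dishes were ripped off of roofs.",
                  "4/18/1950 0:00:00"),
  stringsAsFactors = FALSE
)

# as.Date() with an explicit format returns NA for anything that does not
# start with a month/day/year date; the trailing time part is ignored.
parsed <- as.Date(storm_toy$X.BGN_DATE., format = "%m/%d/%Y")

# Rows where the date column holds spilled text or nothing at all.
bad_rows <- is.na(parsed)
sum(bad_rows)
```

On the real data the same check would run on `storm$X.BGN_DATE.`, assuming that column uses the same date format.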

Then I started to clean the data, as shown below:

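# Keep only the columns needed for the health impact analysis.
healthData <- storm[, c("X.EVTYPE.", "X.BGN_DATE.", "X.FATALITIES.", "X.INJURIES.")]

# Convert via as.character() so factor columns yield their values, not level codes;
# malformed rows become NA and are dropped by subset().
healthData$FATALITIES <- as.numeric(as.character(healthData$X.FATALITIES.))
healthData <- subset(healthData, FATALITIES > 0)
healthData$X.FATALITIES. <- NULL

healthData$INJURIES <- as.numeric(as.character(healthData$X.INJURIES.))
# Note: together with the filter above, this keeps only events with both fatalities and injuries.
healthData <- subset(healthData, INJURIES > 0)
healthData$X.INJURIES. <- NULL
healthData$total <- healthData$FATALITIES + healthData$INJURIES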
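
One thing to be careful about with as.numeric() here: if a column was read in as a factor, as.numeric() returns the internal level codes and not the numbers you see printed. A tiny sketch with made-up values (the vector `f` is hypothetical, not taken from the storm data):

```r
# Hypothetical factor holding numbers as text, as read.csv can produce.
f <- factor(c("0.00", "1.00", "25.00"))

# Direct conversion gives the level codes, not the values.
codes <- as.numeric(f)

# Going through as.character() first gives the actual numbers.
values <- as.numeric(as.character(f))
```

If the column is already a character vector, the extra as.character() call changes nothing, so it is a cheap safety step.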

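# Keep only the columns needed for the property damage analysis.
propData <- storm[, c("X.EVTYPE.", "X.BGN_DATE.", "X.PROPDMG.", "X.PROPDMGEXP.")]
propData$pronum <- as.numeric(as.character(propData$X.PROPDMG.))
propData <- subset(propData, pronum > 0)
propData$X.PROPDMG. <- NULL

library(plyr)
# The exponent values still carry literal quotes (K is stored as "\"K\""), so the comparisons include them.
# H = hundreds, K = thousands, M = millions, B = billions; anything else leaves pronum unchanged.
propData <- mutate(propData, PropertyDamage = ifelse(toupper(X.PROPDMGEXP.) == "\"K\"", pronum * 1000,
                                                     ifelse(toupper(X.PROPDMGEXP.) == "\"M\"", pronum * 1000000,
                                                            ifelse(toupper(X.PROPDMGEXP.) == "\"B\"", pronum * 1000000000,
                                                                   ifelse(toupper(X.PROPDMGEXP.) == "\"H\"", pronum * 100, pronum)))))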
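
The nested ifelse() calls work, but they get hard to read. A possible alternative is a named lookup vector, which is a different technique from the one used above and is shown here only as a sketch. `prop_toy`, `multiplier`, `exp_clean` and `mult` are hypothetical names, and the toy values just mimic the quoted exponent strings.

```r
# Hypothetical toy data; the exponent values keep their literal quotes.
prop_toy <- data.frame(
  pronum = c(25, 2.5, 1, 3, 10),
  X.PROPDMGEXP. = c("\"K\"", "\"m\"", "\"B\"", "\"H\"", "\"\""),
  stringsAsFactors = FALSE
)

# Multiplier for each exponent letter.
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Strip the literal quotes and upper-case the letter before the lookup.
exp_clean <- toupper(gsub("\"", "", prop_toy$X.PROPDMGEXP., fixed = TRUE))

# Unknown or empty exponents come back as NA; treat them as a multiplier of 1,
# which matches the fallback of the nested ifelse() version.
mult <- unname(multiplier[exp_clean])
mult[is.na(mult)] <- 1

prop_toy$PropertyDamage <- prop_toy$pronum * mult
```

Adding another exponent letter then only means adding one entry to `multiplier`.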

After that, the result is much easier to analyze.

 

To see more, feel free to check out my RPubs page: storm data analysis
