之前的一个小项目的记录,公司的一台服务器上保存了过去一年到项目开展时的日志,全是CSV的,公司同事整理了将其导回测试数据库的方法,但希望我能想办法对其进行分析,由于其中大部分有用数据都集中在其中的一个列中,必须先利用存储过程对其进行提取和筛选(有效数据)形成数据仓库,文档的数据分析部分由此开始,由于要保存在项目记录中,因此为英文。
1 Data Analysis Rules
1.1 Data pretreatment
Table “EFMLOGGING” still need to be pretreated because nearly all the information we need is in the “MESSAGE” column.
Base on the table “EFMLOGGING”, we create a Table “ANALYSIS”:
Column Name | Source & Type |
ID | “EFMLOGGING . ID” |
Action | “EFMLOGGING . MESSAGE” à “[Action]” |
Time | “EFMLOGGING . MESSAGE” à “[Time]” |
User | “EFMLOGGING . MESSAGE” à “[User]” |
From | “EFMLOGGING . MESSAGE” à “[From]” |
Session | “EFMLOGGING . MESSAGE” à “[Session]” |
Params | “EFMLOGGING . MESSAGE” à “[Params]” |
1.2 Analysis Rules
The analysis is base on table “ANALYSIS”
1.2.1 Action
According to the “Action” column, we need the statistic as below:
In the table , action=action_1, action_2, action_3……
action’=action_1+ action_2+ action_3……;
Statistic | Description |
MaxTime | The max time for every type of action: |
MinTime | The minimum time for every type of action: |
AvgTime | The average time for every type of action:
|
SortTime |
|
ActionRank | The percentage of an action in all action: |
TimeRank | |
TimePercentage | |
1.2.2 DayTime
According to the “DayTime” column, we need the statistic as below:
In the table, daytime’=daytime_1+daytime_2+daytime_3……;
daytime=daydaytime_1, daytime_2, daytime_3……;
action’=action_1+ action_2+ action_3……;
action=action_1, action_2, action_3……;
statistic | Description |
SortAction | |
CountActionByDate | Show how the actions change by date : |
CountActionByTime | Find out the busy time slice: |
1.3 Analysis the statistic
Base on the SortTime, the biggish “Action” in , the lesser “Action” in
, the biggish “Action” in
have low efficiency and should be improved.
Base on the ActionRank, higher one ActionRank of an action is, more often it is used.
Base on the TimeRank, the biggish “Action” in TimeRank will make great influence on the behavior of the system. And the TimePercentage also shows the efficiency of actions.
Base on the SortAction, the lesser “Action” in are not often used , they have low PRI on the “Need to be improved” List.
Base on the CountActionByTime, we can find out the busy time in one day, then we can give more resource to the server at the right time.
2 Data Analysis
Count all available data:
Result: 1949261
Compute MaxTime:
Compute MinTime:
Compute AvgTime:
Compute ActionRank:
Compute TimeRank:
Compute TimePercentage:
Sum Time:
Result: 395343319
Compute SortAction:
Compute CountActionByDate:
Compute CountActionByTime:
3 Analysis Result
MaxTime:
MinTime:
AvgTime:
SortTime:
| EFORM_ACTION | MAX_TIME |
1 | ExportToCO.action | 8998788 |
2 | ExportToGEMS.action | 2700708 |
3 | HdCaseList.action | 2084602 |
4 | COToInventory.action | 2033733 |
5 | DelegateeWorkItemListAct.action | 1416415 |
6 | RndAssetRequestList.action | 1368801 |
7 | Login.action | 442272 |
8 | CompleteWork.action | 237044 |
9 | WfWorkDoneList.action | 190792 |
10 | CommonFileUpload.action | 143056 |
11 | AssetManageLogList.action | 133538 |
12 | HdAttachmentUpload.action | 109373 |
13 | RndResourceRequestList.action | 99027 |
14 | CODataImport.action | 84933 |
15 | HdCaseStartProcess.action | 78109 |
16 | AssetITList.action | 74687 |
17 | AssetManageList.action | 67080 |
18 | DoneWorkItemListAct.action | 46896 |
19 | CompleteWorkWithReassign.action | 38973 |
20 | HdServiceDefineList.action | 37070 |
21 | HdWFWorkingUpdateComplete.action | 36601 |
22 | AssignedTo.action | 35443 |
| EFORM_ACTION | MIN_TIME |
1 | StartInvCheck.action | 17978 |
2 | ExportToGEMS.action | 2489 |
3 | KnQueryList.action | 764 |
4 | SSOAppList.action | 666 |
5 | HdRerouteSave.action | 649 |
6 | BeanFlexFormEditDisplay.action | 631 |
7 | EftdAccessRecordEditDisplay.action | 559 |
8 | NewhireManagerRejectSave.action | 490 |
9 | HostAppsEditDisplay.action | 469 |
10 | RndCustomReject.action | 454 |
11 | KnDataCollectionLkupSearch.action | 450 |
12 | HostAppsView.action | 428 |
13 | ITManagerApprove.action | 411 |
14 | HdGoThroughSave.action | 402 |
15 | RndDirector.action | 393 |
16 | RndAssetOther.action | 370 |
17 | WfProcessCancel.action | 366 |
18 | EftdAccessRecordPrint.action | 357 |
19 | EftdAccessRecordView.action | 357 |
20 | RndLevel2Manager.action | 342 |
21 | LtMgrRejectSave.action | 339 |
22 | ModuleView.action | 317 |
| EFORM_ACTION | MAX_TIME | MIN_TIME | AVG_TIME |
1 | ExportToGEMS.action | 2700708 | 2489 | 105730 |
2 | ExportToCO.action | 8998788 | 8 | 96004 |
3 | COToInventory.action | 2033733 | 51 | 25545 |
4 | StartInvCheck.action | 17978 | 17978 | 17978 |
5 | CODataImport.action | 84933 | 103 | 5738 |
6 | CommonFileUpload.action | 143056 | 47 | 4277 |
7 | TerminateITCleanCompleteAction.action | 19942 | 15 | 3511 |
8 | NewhireITManager.action | 12059 | 12 | 3066 |
9 | CompleteWorkWithReassign.action | 38973 | 2 | 2927 |
10 | TerminateCaseStartProcess.action | 25109 | 17 | 2753 |
11 | BatchApprovalCase.action | 34892 | 13 | 2740 |
12 | AssignedTo.action | 35443 | 3 | 2650 |
13 | AssetImport.action | 18068 | 2 | 1833 |
14 | BatchAssignCase.action | 27175 | 4 | 1789 |
15 | GsrpBookingInfo.action | 4045 | 1 | 1784 |
16 | WfWorkingList.action | 8591 | 5 | 1576 |
17 | CloseCaseWithPassword.action | 17230 | 2 | 1510 |
18 | HdCaseReassignView.action | 11742 | 6 | 1347 |
19 | GoHomeNew.action | 2200 | 4 | 1102 |
20 | LoadSupporters.action | 11590 | 5 | 1070 |
21 | ResourceImport.action | 26312 | 3 | 940 |
22 | KnDataResultList.action | 2017 | 10 | 873 |
ActionRank:
TimeRank:
TimePercentage:
SortAction:
| EFORM_ACTION | COUNT_ACTION |
1 | ShowComments.action | 327382 |
2 | WorkHistoryDisplay.action | 226470 |
3 | GoHome.action | 159703 |
4 | ShowWorkLog.action | 150331 |
5 | Login.action | 116902 |
6 | RndResourceRequestList.action | 79082 |
7 | HdCaseEditDisplay.action | 52189 |
8 | TreeRpcController.action | 42645 |
9 | RndResourceRequestView.action | 42447 |
10 | InventoryList.action | 41200 |
11 | TagEmployeeLookup.action | 37362 |
12 | HdCaseList.action | 35135 |
13 | RndResourceRequestEditDisplay.action | 34469 |
14 | RndAssetRequestList.action | 26210 |
15 | HdWFWorking.action | 25738 |
16 | CompleteWorkWithReassign.action | 24530 |
17 | HdCaseView.action | 18616 |
18 | EmployeeList.action | 17446 |
19 | WFApprovalWithPushback.action | 14833 |
CountActionByDate:
CountActionByTime:
文档中只是对数据进行和较初步的分析,而使用SPSS对数据进行深入挖掘的工作在一次会议后便中途停止了,因为得知部门不会因根据些分析结论而给我们增加服务器,哪怕在不变动当前系统结构的情况下系统的压力增加已经造成系统的日益缓慢。深入挖掘的结论非常零散而没有放入文档,不过中间存储过程对日志的处理还是非常有趣的,值得记录。