场景:
1.有些关键词以excel表格的形式存储,但是某列的关键词可能重复了,需要过滤掉它,如何做更快呢?用excel的vbs编程是一种方案。这里用我的是独立的vbs脚本。首先存为.csv格式,excel的另一种文本编辑模式,当然任意脚本语言都能做到,这里用vbs只是为了对vbs函数和语法有进一步认识。。
过滤关键词.vbs (ANSI编码)
' Author: sai
' The include sub is used to import some files like java.
Sub include(file)
Set objFS = CreateObject("Scripting.FileSystemObject")
Set logFile=objFS.OpenTextFile(file,1,false,TristateFalse)
Dim all
all=logFile.readAll
logFile.close
ExecuteGlobal all
End Sub
' Begin
Dim currentDirectory,fileName
Set WshShell = WScript.CreateObject("WScript.Shell")
Set objFS = CreateObject("Scripting.FileSystemObject")
currentDirectory = WshShell.CurrentDirectory
fileName=InputBox("请输入文件名:"&vbCr&vbCr&" ")
Dim currentFile
currentFile = currentDirectory & "/" & fileName
If Not objFS.FileExists(currentFile) Then
Wscript.echo "文件不存在: " & currentFile & " " & "已退出"
WScript.Quit(1)
End If
Wscript.echo "过滤开始"
Set csvFile=objFS.OpenTextFile(currentFile,1,false,TristateFalse)
Dim today,newFileName, newFolderPath, newFilePath
today = DatePart("yyyy",Date) & DatePart("m",Date) & DatePart("d",Date)
newFileName = today & "_" & fileName
newFolderPath = currentDirectory & "/" & today
newFilePath = newFolderPath & "/" & newFileName
'1.创建新文件夹,复制Excel文件
If Not objFS.FolderExists(newFolderPath) Then
objFS.CreateFolder(newFolderPath)
End If
'创建以ANSI编码的文件,这样Excel打开CSV时能正确识别逗号
Set txtFile=objFS.CreateTextFile(newFilePath,True,False)
'是否记录日志
Set logFile=objFS.CreateTextFile(newFolderPath & "/" & "delete_" & newFileName,True,False)
Dim line,tempFix,keyword
Set map = CreateObject("Scripting.Dictionary")
While csvFile.AtEndOfStream <> True
line = csvFile.readline()
tempFix = Split(line,",",-1,1)
keyword = tempFix(0)
If map.Exists(keyword) Then
logFile.WriteLine(line)
Else
map.Add keyword,line
txtFile.WriteLine(line)
End If
Wend
csvFile.Close()
txtFile.Close()
logFile.Close()
Wscript.echo "过滤结束"
原文件.csv
第一列是关键词.
我们,大家
大像,1
我们,23
刘,ad
哦,asdf
过滤后的.csv
我们,大家
大像,1
刘,ad
哦,asdf