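FCKeditor includes a feature that strips the redundant HTML left behind when content is pasted from Word; it is implemented in JavaScript. With it, page content can be copied straight from Word without worrying that a pile of extraneous markup will take up database space and slow down page rendering.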
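As a rough illustration of the kind of cleanup involved, the JavaScript sketch below applies a few regular-expression rules in the style of FCKeditor's routine. The function name `cleanWordSketch` and the sample markup are hypothetical, and only a handful of rules are shown, so treat it as a sketch rather than the editor's actual implementation.

```javascript
// Hypothetical helper: applies a small subset of Word-cleanup rules.
function cleanWordSketch(html) {
  // Drop the empty <o:p></o:p> placeholders that Word inserts.
  html = html.replace(/<o:p>\s*<\/o:p>/g, "");
  // Strip Word-specific mso-* style declarations.
  html = html.replace(/\s*mso-[^:]+:[^;"]+;?/gi, "");
  // Remove class attributes, keeping the rest of the tag.
  html = html.replace(/<(\w[^>]*) class=([^ |>]*)([^>]*)/gi, "<$1$3");
  // Remove style attributes left empty by the rules above.
  html = html.replace(/\s*style="\s*"/gi, "");
  // Remove any remaining namespaced tags such as <o:p>.
  html = html.replace(/<\/?\w+:[^>]*>/gi, "");
  return html;
}

// Sample markup of the kind Word produces (made up for illustration).
const pasted =
  '<p class=MsoNormal style="mso-pagination: none">Hello<o:p></o:p></p>';
console.log(cleanWordSketch(pasted)); // prints <p>Hello</p>
```

The order of the rules matters: the `mso-*` declarations are removed first so that the emptied `style=""` attribute can then be dropped by the later rule.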
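Using FCKeditor's JavaScript routine as a reference, we wrote a CFScript version of the CleanWord function. It can be called when page content is saved to the database, so the redundant markup is stripped before it is stored.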
function CleanWord(html)
{
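    // Remove empty <o:p> placeholders; replace non-empty ones with a non-breaking space.
    html = REReplaceNoCase(html, "<o:p>\s*</o:p>", "", "all");
    html = REReplaceNoCase(html, "<o:p>.*?</o:p>", "&nbsp;", "all");
    // Remove mso-xxx styles.
    html = REReplaceNoCase(html, "\s*mso-[^:]+:[^;""]+;?", "", "all");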
    // Remove margin, indent, alignment, page-break, font-variant and tab-stop styles.
    html = REReplaceNoCase(html, "\s*MARGIN: 0cm 0cm 0pt\s*;", "", "all");
    html = REReplaceNoCase(html, "\s*MARGIN: 0cm 0cm 0pt\s*""", """", "all");
    html = REReplaceNoCase(html, "\s*TEXT-INDENT: 0cm\s*;", "", "all");
    html = REReplaceNoCase(html, "\s*TEXT-INDENT: 0cm\s*""", """", "all");
    html = REReplaceNoCase(html, "\s*TEXT-ALIGN: [^\s;]+;?""", """", "all");
    html = REReplaceNoCase(html, "\s*PAGE-BREAK-BEFORE: [^\s;]+;?""", """", "all");
    html = REReplaceNoCase(html, "\s*FONT-VARIANT: [^\s;]+;?""", """", "all");
    html = REReplaceNoCase(html, "\s*tab-stops:[^;""]*;?", "", "all");
    html = REReplaceNoCase(html, "\s*tab-stops:[^""]*", "", "all");
    // Remove font face attributes and font-family styles.
    html = REReplaceNoCase(html, "\s*face=""[^""]*""", "", "all");
    html = REReplaceNoCase(html, "\s*face=[^ >]*", "", "all");
    html = REReplaceNoCase(html, "\s*FONT-FAMILY:[^;""]*;?", "", "all");
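    // Remove class and style attributes, keeping the rest of each tag (\1 and \3).
    html = REReplaceNoCase(html, "<(\w[^>]*) class=([^ |>]*)([^>]*)", "<\1\3", "all");
    html = REReplaceNoCase(html, "<(\w[^>]*) style=""([^""]*)""([^>]*)", "<\1\3", "all");
    html = REReplaceNoCase(html, "\s*style=""\s*""", "", "all");
    // Collapse <span> elements that are empty or hold only a non-breaking space.
    html = REReplaceNoCase(html, "<SPAN\s*[^>]*>\s*&nbsp;\s*</SPAN>", "&nbsp;", "all");
    html = REReplaceNoCase(html, "<SPAN\s*[^>]*></SPAN>", "", "all");
    // Remove lang attributes, then unwrap attribute-less <span> and <font> tags.
    html = REReplaceNoCase(html, "<(\w[^>]*) lang=([^ |>]*)([^>]*)", "<\1\3", "all");
    html = REReplaceNoCase(html, "<SPAN\s*>(.*?)</SPAN>", "\1", "all");
    html = REReplaceNoCase(html, "<FONT\s*>(.*?)</FONT>", "\1", "all");
    // Remove XML declarations and namespaced tags such as <o:p>.
    html = REReplaceNoCase(html, "<\\?\?xml[^>]*>", "", "all");
    html = REReplaceNoCase(html, "</?\w+:[^>]*>", "", "all");
    // Remove empty headings.
    html = REReplaceNoCase(html, "<H\d>\s*</H\d>", "", "all");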
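    // Convert headings to sized, bold <div> blocks. The replacement strings were lost
    // from this listing; the ones below follow FCKeditor's CleanWord and are assumed.
    html = REReplaceNoCase(html, "<H1([^>]*)>", "<div\1><b><font size=""6"">", "all");
    html = REReplaceNoCase(html, "<H2([^>]*)>", "<div\1><b><font size=""5"">", "all");
    html = REReplaceNoCase(html, "<H3([^>]*)>", "<div\1><b><font size=""4"">", "all");
    html = REReplaceNoCase(html, "<H4([^>]*)>", "<div\1><b><font size=""3"">", "all");
    html = REReplaceNoCase(html, "<H5([^>]*)>", "<div\1><b><font size=""2"">", "all");
    html = REReplaceNoCase(html, "<H6([^>]*)>", "<div\1><b><font size=""1"">", "all");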
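    // Close the heading wrappers (pattern and replacement assumed from FCKeditor's CleanWord).
    html = REReplaceNoCase(html, "</H\d>", "</font></b></div>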