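I got my advisor's research project from her PhD, on reusing web information. From a first look, it is mainly about web page scraping and data wrapping.
Here is how to set it up:
The file list is as follows:
The tools needed are Tomcat + Eclipse.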
1. Copy servlet-api.jar from the lib folder of the Tomcat directory into jdk/jre/lib/ext.
2. Copy the jars in HTMLParser-2.0-SNAPSHOT\lib and the jars in htmlunit-2.11\lib into Tomcat's lib folder.
3. Create the WEB-INF/classes folder and copy the class files from pathreader(class) into classes. The content of WEB-INF/web.xml is:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd"
version="3.0"
metadata-complete="true">
<display-name>Welcome to Tomcat</display-name>
<description>
Welcome to Tomcat
</description>
<!-- pathreader mappings start -->
<servlet>
<servlet-name>PathReaderServlet</servlet-name>
<servlet-class>pathreader.PathReaderServlet</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>PathReaderServlet</servlet-name>
<url-pattern>/pathreader</url-pattern>
</servlet-mapping>
<!-- pathreader mappings end -->
</web-app>
4. Configure a Tomcat virtual directory named pathreader, then start Tomcat.
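Step 4 does not say how the virtual directory gets configured. One common way on Tomcat 7, sketched here as an assumption rather than what the original project necessarily did, is a context descriptor file named pathreader.xml placed under conf/Catalina/localhost. The docBase path below is a hypothetical placeholder; point it at the folder that contains WEB-INF.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Save as conf/Catalina/localhost/pathreader.xml.
     The file name (pathreader) becomes the context path /pathreader. -->
<!-- docBase is a hypothetical location: replace it with the folder that holds WEB-INF. -->
<Context docBase="D:/projects/pathreader" />
```

If Tomcat is listening on its default HTTP port 8080 (an assumption about the local setup), the servlet mapped in web.xml should then be reachable at http://localhost:8080/pathreader/pathreader.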