I. Import the namespace
using HtmlAgilityPack;
II. Parsing HTML with HtmlAgilityPack
1. The HTML fragment to parse, copied directly from the web page's source
<tbody>
<tr>
<td align="center">11111111111</td>
<td align="center">11111111111</td>
<td align="center">xxxxxxx分公司(冷饮)</td>
<td align="center">采购订单</td>
<td align="center">经销商采购订单</td>
<td align="center"><span>部分收货</span></td>
<td align="center">已出厂</td>
<td align="center">2022-03-02</td>
<td align="center"> - </td>
<td align="center">66077.25</td>
<td align="center">2022-03-01 16:43:46</td>
<td align="center">姓名</td>
<td align="center">手工</td>
<!--
<td align="center"></td>
<td align="center"></td>
-->
<td align="center">
<a href='view.do?id=xxxxxxxxxxxxx'>查</a>
<a href='copy.do?id=xxxxxxxxxxxx'> 复制</a>
<a href="javascript:;"name="vhc_route_map_btn"order_id="xxxxxxxxxxxx"> | 轨迹</a>
</td>
</tr>
</tbody>
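2. Parsing code. The XPath expression must match the target elements exactly; if it matches nothing, SelectNodes returns null and no data is found.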
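string strhtml = ""; // strhtml holds the page source shown above
var doc = new HtmlDocument();
doc.LoadHtml(strhtml);
try
{
    // Use XPath to select every <td align="center"> inside a table row
    HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//tbody/tr//td[@align=\"center\"]");
    if (nodes == null)
    {
        // SelectNodes returns null (not an empty collection) when nothing matches
        System.Console.WriteLine("No matching nodes found");
        return "0";
    }
    foreach (var item in nodes)
    {
        // Print the cell's inner HTML with surrounding whitespace trimmed
        System.Console.WriteLine(item.InnerHtml.Trim());
        System.Console.WriteLine("-----");
    }
    return nodes.Count.ToString();
}
catch (System.Exception ex)
{
    System.Console.WriteLine("No information found: " + ex);
    return "0";
}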
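The loop above prints every cell as one flat list, and InnerHtml keeps nested markup such as the <span> in the status cell. As a rough sketch of one alternative, the cells can be grouped by row with a relative XPath and read through InnerText. The class name RowParser, the method ParseRows, and the sample HTML in Main are hypothetical and not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class RowParser
{
    // Hypothetical helper: returns one list of cell texts per <tr>.
    public static List<List<string>> ParseRows(string html)
    {
        var rows = new List<List<string>>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so check before iterating.
        HtmlNodeCollection trNodes = doc.DocumentNode.SelectNodes("//tbody/tr");
        if (trNodes == null)
        {
            return rows;
        }

        foreach (HtmlNode tr in trNodes)
        {
            var cells = new List<string>();
            // "./td" is evaluated relative to the current row, not the whole document.
            HtmlNodeCollection tdNodes = tr.SelectNodes("./td[@align='center']");
            if (tdNodes != null)
            {
                foreach (HtmlNode td in tdNodes)
                {
                    // InnerText drops tags such as <span>; DeEntitize decodes entities like &amp;.
                    cells.Add(HtmlEntity.DeEntitize(td.InnerText).Trim());
                }
            }
            rows.Add(cells);
        }
        return rows;
    }

    public static void Main()
    {
        // Made-up sample input with the same shape as the fragment in this article.
        string html = "<tbody><tr><td align=\"center\"><span>A</span></td>"
                    + "<td align=\"center\"> - </td></tr></tbody>";
        foreach (List<string> row in ParseRows(html))
        {
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```

Keeping one list per row makes it easier to map cells to columns by index, under the assumption that every row has the same column layout.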
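3. The printed output looks like the following (an excerpt; the values come from a different row than the fragment above)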
-----
采购订单
-----
经销商采购订单
-----
<span>已收货</span>
-----
已出厂
-----
2022-03-02
-----
-
-----
66188.95
-----
2022-03-01 16:21:50
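The last cell of each row in the fragment holds <a> links rather than plain text, so printing InnerHtml would show raw tags for it. A minimal sketch of reading the link text, the href, and the custom order_id attribute with GetAttributeValue might look like this. The class name LinkParser, the method ParseLinks, and the ids in the sample HTML are hypothetical placeholders.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkParser
{
    // Hypothetical helper: returns { text, href, order_id } for every <a> directly under a <td>.
    public static List<string[]> ParseLinks(string html)
    {
        var links = new List<string[]>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//td/a");
        if (anchors == null)
        {
            return links; // nothing matched
        }

        foreach (HtmlNode a in anchors)
        {
            string text = a.InnerText.Trim();
            // The second argument is the default returned when the attribute is missing.
            string href = a.GetAttributeValue("href", "");
            string orderId = a.GetAttributeValue("order_id", "");
            links.Add(new[] { text, href, orderId });
        }
        return links;
    }

    public static void Main()
    {
        // Made-up sample input modeled on the action cell of the fragment.
        string html =
            "<tr><td align=\"center\">" +
            "<a href='view.do?id=123'>View</a>" +
            "<a href=\"javascript:;\" name=\"vhc_route_map_btn\" order_id=\"456\">Track</a>" +
            "</td></tr>";
        foreach (string[] link in ParseLinks(html))
        {
            Console.WriteLine($"{link[0]} -> href={link[1]}, order_id={link[2]}");
        }
    }
}
```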