How to save a web page as HTML or MHT

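This article shows how to use the TWebBrowser component in Delphi to save a web page as a raw HTML file or as a single-file MHT (MHTML) archive. Two approaches are covered: saving only the HTML source, or saving a single MHT file that includes all of the page's resources.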


Here's how to save a web page displayed inside a TWebBrowser component either as a raw HTML file or as a single MHT file (MHTML format: web archive, single file).

When working with Delphi, the TWebBrowser component allows you to create a customized Web browsing application or to add Internet, file and network browsing, document viewing, and data downloading capabilities to your applications.

File ... Save As; or how to save a web page from TWebBrowser
When using Internet Explorer, you can view the source HTML code of a page and save that page as a file on your local drive. If you are viewing a page that you wish to keep, go to the File/Save As ... menu item. The dialog box that opens offers several file types, and the file type you choose determines how the page is saved.

SaveAs in IE

The TWebBrowser component (located on the "Internet" page of the Component Palette) provides access to the Web browser functionality from your Delphi applications. In general, you'll want to let users save a web page displayed inside a WebBrowser as an HTML file on disk.

Saving a web page as a raw HTML
If you only want to save a web page as raw HTML, you would select "Web Page, HTML only (*.htm, *.html)". This simply saves the current page's source HTML to your drive intact. It does NOT save the graphics from the page or any other files used within the page, which means that if you load the file back from the local disk, you will see broken image links.

Here's how to save a web page as raw HTML using Delphi code:

uses ActiveX, Classes, Dialogs, SHDocVw;
...
procedure WB_SaveAs_HTML(WB: TWebBrowser; const FileName: string);
var
  PersistStream: IPersistStreamInit;
  Stream: IStream;
  FileStream: TFileStream;
begin
  if not Assigned(WB.Document) then
  begin
    ShowMessage('Document not loaded!');
    Exit;
  end;

  PersistStream := WB.Document as IPersistStreamInit;
  FileStream := TFileStream.Create(FileName, fmCreate);
  try
    Stream := TStreamAdapter.Create(FileStream, soReference)
              as IStream;
    if Failed(PersistStream.Save(Stream, True)) then
      ShowMessage('SaveAs HTML fail!');
  finally
    FileStream.Free;
  end;
end; (* WB_SaveAs_HTML *)

 

Usage sample:

  //first navigate
  WebBrowser1.Navigate('http://delphi.about.com');
  
  //then, once the page has finished loading, save
  WB_SaveAs_HTML(WebBrowser1,'c:\WebBrowser1.html');

 

Note 1: the IPersistStreamInit and IStream interfaces are declared in the ActiveX unit.
Note 2: the web page is saved as raw HTML to the WebBrowser1.html file in the root folder of the C drive.
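
Navigate returns before the page has finished loading, so calling the save routine immediately afterwards may find no document yet. The following is a minimal sketch of one way to wait, assuming a VCL application; WB_WaitUntilLoaded and SaveWhenLoaded are hypothetical helper names that are not part of the original code, and the loop relies on the ReadyState property and the READYSTATE_COMPLETE constant from the SHDocVw unit.

```pascal
uses Forms, SHDocVw;

// Hypothetical helper (not in the original article): returns once the
// browser reports that the current document has finished loading.
procedure WB_WaitUntilLoaded(WB: TWebBrowser);
begin
  while WB.ReadyState <> READYSTATE_COMPLETE do
    Application.ProcessMessages; // keep pumping messages so loading can progress
end;

// Hypothetical usage: navigate, wait, then save with WB_SaveAs_HTML
// (the procedure defined earlier in this article).
procedure SaveWhenLoaded(WB: TWebBrowser);
begin
  WB.Navigate('http://delphi.about.com');
  WB_WaitUntilLoaded(WB);
  WB_SaveAs_HTML(WB, 'c:\WebBrowser1.html');
end;
```

A polling loop like this is simple but keeps the calling code busy until the page is ready; handling the component's OnDocumentComplete event is the usual alternative when you do not want to block.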

MHT : Web archive - single file
When you save a Web page as "Web archive, single file (*.mht)", the web document is saved in the Multipurpose Internet Mail Extension HTML (MHTML) format with a .mht file extension. All relative links in the Web page are remapped and the embedded content is included in the .mht file, rather than being saved in a separate folder (as is the case with "Web Page, complete (*.htm, *.html)").

MHTML enables you to send and receive Web pages and other HTML documents using e-mail programs such as Microsoft Outlook and Microsoft Outlook Express, or even your custom Delphi email sending solutions. It also lets you embed images directly into the body of your e-mail messages rather than attaching them to the message.

Here's how to save a web page as a single file (mht format) using Delphi code:

uses CDO_TLB, ADODB_TLB;
...
procedure WB_SaveAs_MHT(WB: TWebBrowser; FileName: TFileName);
var
  Msg: IMessage;
  Conf: IConfiguration;
  Stream: _Stream;
  URL : widestring;
begin
  if not Assigned(WB.Document) then Exit;
  URL := WB.LocationURL;

  Msg := CoMessage.Create;
  Conf := CoConfiguration.Create;
  try
    Msg.Configuration := Conf;
    Msg.CreateMHTMLBody(URL, cdoSuppressAll, '', '');
    Stream := Msg.GetStream;
    Stream.SaveToFile(FileName, adSaveCreateOverWrite);
  finally
    Msg := nil;
    Conf := nil;
    Stream := nil;
  end;
end; (* WB_SaveAs_MHT *)

 

Sample usage:

  //first navigate
  WebBrowser1.Navigate('http://delphi.about.com');
  
  //then, once the page has finished loading, save
  WB_SaveAs_MHT(WebBrowser1,'c:\WebBrowser1.mht');

 

Note 1: The _Stream interface is defined in the ADODB_TLB unit, which you have probably already created. The IMessage and IConfiguration interfaces come from the cdosys.dll library. CDO stands for Collaboration Data Objects: object libraries designed to enable SMTP messaging.
The CDO_TLB unit is auto-generated by Delphi. To create it, select "Import Type Library" from the main menu, select "C:/WINDOWS/system32/cdosys.dll", then click the "Create unit" button.

 

Saving a web document using TWebBrowser

Note that WB_SaveAs_MHT reads the URL from the WebBrowser through the WB.LocationURL property. You could rewrite the procedure to accept a URL string (instead of a TWebBrowser) and save a web page directly, with no need for the WebBrowser component.
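
A minimal sketch of that URL-based variant is shown here. It reuses the same CDO and ADO calls as WB_SaveAs_MHT; the name URL_SaveAs_MHT is hypothetical, and the sketch assumes the CDO_TLB and ADODB_TLB units have been generated as described in Note 1.

```pascal
uses SysUtils, CDO_TLB, ADODB_TLB;

// Hypothetical variant of WB_SaveAs_MHT: takes the URL directly,
// so no TWebBrowser is involved.
procedure URL_SaveAs_MHT(const URL: WideString; const FileName: TFileName);
var
  Msg: IMessage;
  Conf: IConfiguration;
  Stream: _Stream;
begin
  Msg := CoMessage.Create;
  Conf := CoConfiguration.Create;
  Msg.Configuration := Conf;

  // CDO fetches the page and its embedded resources itself
  Msg.CreateMHTMLBody(URL, cdoSuppressAll, '', '');

  // write the resulting MIME message to disk, replacing any existing file
  Stream := Msg.GetStream;
  Stream.SaveToFile(FileName, adSaveCreateOverWrite);
end;
```

Called as URL_SaveAs_MHT('http://delphi.about.com', 'c:\WebBrowser1.mht'), it would save the page without showing it first. The interface references are released automatically when the procedure exits, so no explicit cleanup is shown.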
