Parsing the Directory Listing Returned by the FTP LIST Command

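This article introduces a fairly complex FTP directory listing parser that handles many different listing formats and applies special handling for particular server types. The header shown here declares parsers for Unix, DOS, and other formats, and its comments discuss complications such as multi-line entries and ambiguous date formats.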


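I had a program that needed to parse the directory listings returned by an FTP server. I assumed this would be fairly simple and looked through a few posts online, but none of them covered the problem properly. So I downloaded the FileZilla source code. Its source files
directorylistingparser.h
directorylistingparser.cpp
do exactly this job. If you have the same need, they are worth a look. It is more work than it seems, because each platform needs its own special handling. Here is the header file: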


#ifndef __DIRECTORYLISTINGPARSER_H__
#define __DIRECTORYLISTINGPARSER_H__

/* This class is responsible for parsing the directory listings returned by
 * the server.
 * Unfortunately, RFC959 did not specify the format of directory listings, so
 * each server uses its own format. In addition to that, in most cases the
 * listings were not designed to be machine-parsable, they were meant to be
 * human readable by users of that particular server.
 * By far the most common format is the one returned by the Unix "ls -l"
 * command. However, legacy systems are still in place, especially in big
 * companies. These often use very exotic listing styles.
 * Another problem is localized listings containing date strings. In some
 * cases these listings are ambiguous and cannot be distinguished.
 * Example for an ambiguous date: 04-05-06. All of the 6 permutations for
 * the location of year, month and day are valid dates.
 * Some servers send multiline listings where a single entry can span two
 * lines, this has to be detected as well, as far as possible.
 *
 * Some servers send MVS style listings which can consist of just the
 * filename without any additional data. In order to prevent problems, this
 * format is only parsed if the server is in fact recognized as an MVS server.
 *
 * Please see tests/dirparsertest.cpp for a list of supported formats and the
 * expected parser result.
 *
 * If adding data to the parser, it first decomposes the raw data into lines,
 * which then are processed further. Each line gets consecutively tested for
 * different formats, starting with the most common Unix style format.
 * Lines not containing a recognized format (e.g. a part of a multiline
 * entry) are remembered and if the next line cannot be parsed either, they
 * get concatenated to be parsed again (and discarded if not recognized).
 */

class CLine;
class CToken;
class CControlSocket;
class CDirectoryListingParser
{
public:
  CDirectoryListingParser(CControlSocket* pControlSocket, const CServer& server);
  ~CDirectoryListingParser();

  CDirectoryListing Parse(const CServerPath &path);

  void AddData(char *pData, int len);
  void AddLine(const wxChar* pLine);

  void Reset();

  void SetTimezoneOffset(const wxTimeSpan& span) { m_timezoneOffset = span; }

  void SetServer(const CServer& server) { m_server = server; };

protected:
  CLine *GetLine(bool breakAtEnd = false);

  void ParseData(bool partial);

  bool ParseLine(CLine *pLine, const enum ServerType serverType, bool concatenated);

  bool ParseAsUnix(CLine *pLine, CDirentry &entry, bool expect_date);
  bool ParseAsDos(CLine *pLine, CDirentry &entry);
  bool ParseAsEplf(CLine *pLine, CDirentry &entry);
  bool ParseAsVms(CLine *pLine, CDirentry &entry);
  bool ParseAsIbm(CLine *pLine, CDirentry &entry);
  bool ParseOther(CLine *pLine, CDirentry &entry);
  bool ParseAsWfFtp(CLine *pLine, CDirentry &entry);
  bool ParseAsIBM_MVS(CLine *pLine, CDirentry &entry);
  bool ParseAsIBM_MVS_PDS(CLine *pLine, CDirentry &entry);
  bool ParseAsIBM_MVS_PDS2(CLine *pLine, CDirentry &entry);
  bool ParseAsIBM_MVS_Migrated(CLine *pLine, CDirentry &entry);
  bool ParseAsMlsd(CLine *pLine, CDirentry &entry);
  bool ParseAsOS9(CLine *pLine, CDirentry &entry);

  // Only call this if servertype set to ZVM since it conflicts
  // with other formats.
  bool ParseAsZVM(CLine *pLine, CDirentry &entry);

  // Only call this if servertype set to HPNONSTOP since it conflicts
  // with other formats.
  bool ParseAsHPNonstop(CLine *pLine, CDirentry &entry);

  // Date / time parsers
  bool ParseUnixDateTime(CLine *pLine, int &index, CDirentry &entry);
  bool ParseShortDate(CToken &token, CDirentry &entry, bool saneFieldOrder = false);
  bool ParseTime(CToken &token, CDirentry &entry);

  // Parse file sizes given like this: 123.4M
  bool ParseComplexFileSize(CToken& token, wxLongLong& size, int blocksize = -1);

  bool GetMonthFromName(const wxString& name, int &month);

  CControlSocket* m_pControlSocket;

  static std::map<wxString, int> m_MonthNamesMap;

  struct t_list
  {
    char *p;
    int len;
  };
  int m_currentOffset;

  std::list<t_list> m_DataList;
  std::list<CDirentry> m_entryList;

  CLine *m_prevLine;

  CServer m_server;

  bool m_fileListOnly;
  std::list<wxString> m_fileList;

  bool m_maybeMultilineVms;

  wxTimeSpan m_timezoneOffset;
};

#endif
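The header comment says that data added to the parser is first decomposed into lines. Here is a minimal sketch of that step. It is not FileZilla's code: `LineSplitter`, `addData`, `takeLines` and `takeRemainder` are names I made up for illustration, and treating both CRLF and bare LF as line ends while skipping empty lines is an assumption of the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: accumulates raw chunks and hands out complete lines.
class LineSplitter
{
public:
    // Append one chunk exactly as it arrived from the data connection.
    void addData(const char* data, std::size_t len)
    {
        m_buffer.append(data, len);
    }

    // Extract every complete line received so far. Text after the last
    // line break stays buffered until more data arrives.
    std::vector<std::string> takeLines()
    {
        std::vector<std::string> lines;
        std::size_t start = 0;
        std::size_t pos;
        while ((pos = m_buffer.find('\n', start)) != std::string::npos) {
            std::size_t end = pos;
            if (end > start && m_buffer[end - 1] == '\r')
                --end;                      // strip the CR of a CRLF pair
            if (end > start)                // skip empty lines
                lines.push_back(m_buffer.substr(start, end - start));
            start = pos + 1;
        }
        m_buffer.erase(0, start);           // keep only the unfinished tail
        return lines;
    }

    // Call once the transfer has ended to fetch a last line that had no
    // trailing line break.
    std::string takeRemainder()
    {
        std::string rest;
        rest.swap(m_buffer);
        return rest;
    }

private:
    std::string m_buffer;
};
```

The point of buffering the tail is that a chunk boundary can fall in the middle of a line, so a line may only become complete when the next chunk arrives.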

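The comment also describes how unrecognized lines are remembered and joined with the following line before being parsed again. The sketch below shows that idea under my own assumptions. `collectEntries` and `hasAtLeastThreeFields` are hypothetical names, the toy recognizer stands in for the real format parsers, joining with a single space is a guess, and keeping the newer fragment as pending after a failed join is a choice of this sketch rather than something the header states.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Toy recognizer used only for illustration: a line counts as an entry
// if it has at least three whitespace-separated fields.
bool hasAtLeastThreeFields(const std::string& line)
{
    std::istringstream in(line);
    std::string field;
    int count = 0;
    while (in >> field)
        ++count;
    return count >= 3;
}

// Run every line through `recognize`. A line that fails is remembered; if
// the next line fails too, both are joined and tried once more.
std::vector<std::string> collectEntries(
    const std::vector<std::string>& lines,
    const std::function<bool(const std::string&)>& recognize)
{
    assert(static_cast<bool>(recognize));

    std::vector<std::string> entries;
    std::string pending;        // previous unrecognized line, if any
    bool hasPending = false;

    for (const std::string& line : lines) {
        if (recognize(line)) {
            entries.push_back(line);
            hasPending = false; // a stale fragment is dropped
            continue;
        }
        if (hasPending) {
            const std::string joined = pending + " " + line;
            if (recognize(joined)) {
                entries.push_back(joined);
                hasPending = false;
                continue;
            }
        }
        // Not recognized alone or joined: the older fragment (if any) is
        // discarded and this line becomes the new pending fragment.
        pending = line;
        hasPending = true;
    }
    return entries;
}
```

For example, feeding the lines "longfilename" and "10 Jan" one after the other yields the single entry "longfilename 10 Jan" with the toy recognizer, while a stray pair like "x" and "y" is dropped.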

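The ambiguous date example from the comment, 04-05-06, is easy to check by brute force. The following sketch counts how many of the six field orders give a valid calendar date. `countValidDateOrders` and `daysInMonth` are names I invented, and reading a two-digit year as 2000 + yy is an assumption made only for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Days in a month for a given year (Gregorian leap-year rule).
static int daysInMonth(int year, int month)
{
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    const bool leap = (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
    return (month == 2 && leap) ? 29 : days[month - 1];
}

// Count how many of the six possible assignments of the three two-digit
// fields to (year, month, day) form a valid date.
int countValidDateOrders(int a, int b, int c)
{
    assert(a >= 0 && a <= 99 && b >= 0 && b <= 99 && c >= 0 && c <= 99);

    const std::array<int, 3> field = {a, b, c};
    // idx[0] picks the year field, idx[1] the month, idx[2] the day.
    std::array<int, 3> idx = {0, 1, 2};
    int valid = 0;
    do {
        const int year = 2000 + field[idx[0]];   // assumption: 20yy
        const int month = field[idx[1]];
        const int day = field[idx[2]];
        if (month >= 1 && month <= 12 &&
            day >= 1 && day <= daysInMonth(year, month))
            ++valid;
    } while (std::next_permutation(idx.begin(), idx.end()));
    return valid;
}
```

Calling `countValidDateOrders(4, 5, 6)` returns 6, which matches the comment: no field order can be ruled out from the numbers alone. By contrast, `countValidDateOrders(31, 12, 99)` returns 1, because only one field can be the month and only one of the others can be the day.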

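The header declares `ParseComplexFileSize` for sizes written like 123.4M, but does not show how it works. Below is a rough standalone sketch of what such a routine might do. It is my own guess, not FileZilla's implementation: the function name `parseComplexFileSize`, the K/M/G/T suffix set, the factor of 1024 per step, and truncating the fractional bytes are all assumptions, and the `blocksize` parameter of the real method is not modeled.

```cpp
#include <cctype>
#include <string>

// Parse "123", "1.5K" or "123.4M" into a byte count.
// Returns false if the token does not match the expected shape.
// Note: this sketch does no overflow checking on very large inputs.
bool parseComplexFileSize(const std::string& token, long long& size)
{
    std::size_t i = 0;
    long long whole = 0;
    long long frac = 0;        // fractional digits read as an integer
    long long fracScale = 1;   // 10^(number of fractional digits)

    // Integer part: at least one digit is required.
    while (i < token.size() && std::isdigit(static_cast<unsigned char>(token[i]))) {
        whole = whole * 10 + (token[i] - '0');
        ++i;
    }
    if (i == 0)
        return false;

    // Optional fractional part: a dot followed by at least one digit.
    if (i < token.size() && token[i] == '.') {
        ++i;
        const std::size_t start = i;
        while (i < token.size() && std::isdigit(static_cast<unsigned char>(token[i]))) {
            frac = frac * 10 + (token[i] - '0');
            fracScale *= 10;
            ++i;
        }
        if (i == start)
            return false;
    }

    // Optional single-letter unit suffix (assumed to be powers of 1024).
    long long factor = 1;
    if (i < token.size()) {
        switch (std::toupper(static_cast<unsigned char>(token[i]))) {
        case 'K': factor = 1LL << 10; break;
        case 'M': factor = 1LL << 20; break;
        case 'G': factor = 1LL << 30; break;
        case 'T': factor = 1LL << 40; break;
        default:  return false;
        }
        ++i;
    }
    if (i != token.size())
        return false;          // trailing garbage after the suffix

    // Integer arithmetic avoids floating-point rounding surprises.
    size = whole * factor + frac * factor / fracScale;
    return true;
}
```

With these assumptions, "1.5K" gives 1536 and "2M" gives 2097152, while tokens such as "1.K" or "12X" are rejected.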






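This article is reposted from h2appy's 51CTO blog. Original link: http://blog.51cto.com/h2appy/122279. If you want to repost it, please contact the original author yourself.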