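The original problem is as follows:
Write a program that parses URLs.
A URL is a string of the following form:
schema + "://" + authority + host + port + fullpath
Here authority, port and fullpath are all optional: a URL may contain all three, only some of them, or none.
schema is a string of English letters (upper or lower case). Typical examples: http, ftp, mms.
authority has one of these forms:
user + ":" + passwd + "@"
or user + "@"
or ":" + "@"
or ":" + passwd + "@"
or "@"
host is a string of English letters (upper or lower case) and the ten Arabic digits.
port has the form:
":" + digit string
fullpath has the same format as an absolute path on Unix, for example /user/1.mp3,2.mp3
Given a text file 1.txt in which each line is a URL string, write a program that parses schema, user, passwd, host, port and fullpath out of every URL in the file and writes the results to 2.txt. Each line of 2.txt holds the result for one URL in six neatly aligned columns: the values of schema, user, passwd, host, port and fullpath, in that order. If an item has no value, use the string "default" in its place.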
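The problem asks for user and passwd as separate columns, while the authority forms above pack them into one string. Below is a minimal sketch of one way to split them, assuming the trailing "@" has already been removed. The name split_authority and its buffer-size parameters are hypothetical and are not part of the program in this post.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the original program.
 * Splits an authority string whose trailing '@' has already been removed
 * ("user:passwd", "user", ":", ":passwd" or "") into user and passwd.
 * A missing or empty part is written as "default". */
void split_authority(const char *authority, char *user, size_t user_size,
                     char *passwd, size_t passwd_size)
{
    const char *colon;
    size_t user_len;

    assert(authority != NULL && user != NULL && passwd != NULL);
    assert(user_size > 0 && passwd_size > 0);

    /* Everything before the first ':' is the user name */
    colon = strchr(authority, ':');
    user_len = colon ? (size_t)(colon - authority) : strlen(authority);

    if (user_len == 0)
        snprintf(user, user_size, "%s", "default");
    else
        snprintf(user, user_size, "%.*s", (int)user_len, authority);

    /* Everything after the first ':' is the password */
    if (colon == NULL || colon[1] == '\0')
        snprintf(passwd, passwd_size, "%s", "default");
    else
        snprintf(passwd, passwd_size, "%s", colon + 1);
}
```

Called with "sss", this fills user with "sss" and passwd with "default". Note that a URL with no authority at all and a URL with a bare "@" both end up with "default" in both columns, which seems to be what the problem intends.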
Approach:
1. Open 1.txt and read one line of data
2. Parse that line
3. Append the output to 2.txt
I only implemented part of this, namely step 2. The data in 1.txt is as follows:
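For step 3, the "neatly aligned" requirement can be met with fixed-width printf fields. The following is only a sketch: format_row and or_default are hypothetical names, the column widths are arbitrary choices, and port is passed as text so that "default" can stand in for a missing port.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns value, or "default" when it is NULL or empty. */
static const char *or_default(const char *value)
{
    return (value != NULL && value[0] != '\0') ? value : "default";
}

/* Hypothetical helper: writes one six-column row into out.
 * "%-8s" left-aligns a value in a field of at least 8 characters.
 * Returns the value returned by snprintf. */
int format_row(char *out, size_t out_size, const char *schema, const char *user,
               const char *passwd, const char *host, const char *port,
               const char *fullpath)
{
    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size, "%-8s %-12s %-12s %-20s %-6s %s",
                    or_default(schema), or_default(user), or_default(passwd),
                    or_default(host), or_default(port), or_default(fullpath));
}
```

A value longer than its field is not truncated; it simply pushes the later columns to the right, so the widths should be chosen to suit the data.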
http://sss@ewr1234567890:78/user/1.mp3,2.mp3
http://sss@ewr1234567890:78/user/1.mp3,2.mp3
http://sss@ewr1234567890:78/user/1.mp3,2.mp3
The code is based on another web page:
http://blog.youkuaiyun.com/is2120/article/details/6251412
The code is as follows:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
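/* Parses url into its parts. *Schema_p, *Authority_p and *Host_p receive
   malloc'd copies that the caller must free (also after a failure);
   *Fullpath_p points into url.
   Returns 0 on success, -1 on a malformed URL or an allocation failure. */
int parse_url(char *url, char **Schema_p, char **Authority_p, char **Host_p, int *portp, char **Fullpath_p)
{
    char buf[256];
    int serverlen, numread = 0;

    *Schema_p = *Authority_p = *Host_p = NULL;

    /* Scheme (http, ftp, mms, ...): everything before "://" */
    if (sscanf(url, "%255[^:]", buf) != 1)
        return -1;
    serverlen = strlen(buf);
    if (strncmp(url + serverlen, "://", 3) != 0)
        return -1;
    *Schema_p = (char *)malloc(serverlen + 1);
    if (*Schema_p == NULL)
        return -1;
    strcpy(*Schema_p, buf);
    /* Skip the scheme and "://" */
    url = url + serverlen + 3;

    /* Authority is optional: it is present only if '@' appears before the path */
    buf[0] = '\0';
    serverlen = strcspn(url, "@/");
    if (url[serverlen] == '@')
    {
        sscanf(url, "%255[^@]", buf); /* buf stays empty for a bare "@" */
        /* Skip the authority and its '@' */
        url = url + serverlen + 1;
    }
    *Authority_p = (char *)malloc(strlen(buf) + 1); /* +1 for the terminating '\0' */
    if (*Authority_p == NULL)
        return -1;
    strcpy(*Authority_p, buf);

    /* Host: runs up to the port (':'), the path ('/') or the end of the string */
    buf[0] = '\0';
    sscanf(url, "%255[^:/]", buf);
    serverlen = strlen(buf);
    *Host_p = (char *)malloc(serverlen + 1);
    if (*Host_p == NULL)
        return -1;
    strcpy(*Host_p, buf);

    /* Port is optional */
    if (url[serverlen] == ':')
    {
        if (sscanf(&url[serverlen + 1], "%d%n", portp, &numread) != 1)
            return -1;
        /* add one to go past the ':' */
        numread++;
    }
    else
    {
        *portp = 80; /* value used when the URL gives no port */
    }
    /* The path is a pointer into the rest of url */
    *Fullpath_p = &url[serverlen + numread];
    return 0;
}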
int main()
{
    char url[256] = "http://sss@ewr1234567890:78/user/1.mp3,2.mp3";
    char url_1[256] = "ftp://sss@ewr1234567890:78/user/1.mp3,2.mp3";
    /* parse_url allocates the first three strings; Fullpath_p points into the URL itself */
    char *Schema_p = NULL;
    char *Authority_p = NULL;
    char *Host_p = NULL;
    char *Fullpath_p = NULL;
    int port;

    if (parse_url(url, &Schema_p, &Authority_p, &Host_p, &port, &Fullpath_p) == 0)
        printf("%s\n%s\n%s\n%s\n%d\n%s\n", url, Schema_p, Authority_p, Host_p, port, Fullpath_p);
    /* Release the copies before the pointers are reused */
    free(Schema_p);
    free(Authority_p);
    free(Host_p);
    printf("\n\n");
    if (parse_url(url_1, &Schema_p, &Authority_p, &Host_p, &port, &Fullpath_p) == 0)
        printf("%s\n%s\n%s\n%s\n%d\n%s\n", url_1, Schema_p, Authority_p, Host_p, port, Fullpath_p);
    free(Schema_p);
    free(Authority_p);
    free(Host_p);
    return 0;
}
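To implement steps 1 and 3 of the approach, read the URLs from 1.txt instead of using the hard-coded url and url_1, and change the final printf calls so that they write formatted rows to 2.txt. The program above leaves those two steps out.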
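For reference, the read and write loop for steps 1 and 3 might look roughly like the sketch below. The names process_file, line_handler and echo_line are hypothetical. echo_line is only a placeholder that copies each line unchanged; a real handler would call parse_url and print the six columns. The sketch opens the output file once in "w" mode and writes every row to it, rather than reopening 2.txt in append mode for each line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: turns one URL line into one output row. */
typedef void (*line_handler)(const char *line, FILE *out);

/* Placeholder handler: writes the line back unchanged. */
void echo_line(const char *line, FILE *out)
{
    fprintf(out, "%s\n", line);
}

/* Reads in_path line by line, strips the line ending, skips blank lines and
 * passes every other line to handle together with the output stream.
 * Returns the number of lines handled, or -1 if a file cannot be opened.
 * Lines longer than the buffer are split into several pieces. */
int process_file(const char *in_path, const char *out_path, line_handler handle)
{
    char line[1024];
    int count = 0;
    FILE *in, *out;

    assert(in_path != NULL && out_path != NULL && handle != NULL);

    in = fopen(in_path, "r");
    if (in == NULL)
        return -1;
    out = fopen(out_path, "w");
    if (out == NULL)
    {
        fclose(in);
        return -1;
    }
    while (fgets(line, sizeof line, in) != NULL)
    {
        line[strcspn(line, "\r\n")] = '\0'; /* drop the trailing newline */
        if (line[0] == '\0')
            continue;
        handle(line, out);
        count++;
    }
    fclose(in);
    fclose(out);
    return count;
}
```

With a suitable handler in place of echo_line, the whole task reduces to a single call such as process_file("1.txt", "2.txt", handler).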