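Background knowledge
Regular expressions:
1. What is a regular expression?
A regular expression gives certain characters a special meaning.
snake --cultivation--> Bai Suzhen
. --regex--> any character except a newline
2. Notes on regular expressions
1. Regular expressions are used very widely and exist in many programming languages.
2. Regular expressions are not the same thing as Linux shell wildcards and special characters.
3. To learn grep, sed, and awk well, you first need some understanding of regular expressions. Only once you know the rules can you apply them flexibly.
3. Regular expression rules
. any character except a newline
* zero or more repetitions of the preceding character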
grep: filter lines
seq 100 | grep "1"   keeps only the lines that contain a 1
seq 100 | grep "."   matches every line that has at least one character
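The two rules above can be checked directly against the seq 100 output. This is only a small sketch: the anchors ^ and $ (start and end of line) and the -c option (count matching lines instead of printing them) are not covered in these notes and are used here just to make the results easy to verify.

```shell
# Count the lines of `seq 100` that match each pattern.
with_one=$(seq 100 | grep -c "1")        # lines containing the digit 1
any_char=$(seq 100 | grep -c ".")        # lines with at least one character
one_char=$(seq 100 | grep -c '^.$')      # lines that are exactly one character long
one_zeros=$(seq 100 | grep -c '^10*$')   # a 1 followed by zero or more 0s: 1, 10, 100

echo "$with_one $any_char $one_char $one_zeros"   # prints: 20 100 9 3
```

The last pattern shows that * applies to the character right before it: 10* means "1, then any number of 0s", not "10 repeated".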
Back-references (a later part refers to an earlier captured group)
sed -r 's###g'
awk -F '' '{print $0}'
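To make the two templates above concrete, here is a minimal sketch on made-up input strings (both sample strings are assumptions chosen for illustration). It uses sed -E, which GNU sed treats the same as -r.

```shell
# Back-reference: \1 and \2 stand for the first and second parenthesised groups.
swapped=$(echo "hello_world" | sed -E 's#(.*)_(.*)#\2_\1#g')
echo "$swapped"    # prints: world_hello

# awk -F sets the field separator; here the line is split on ":".
first=$(echo "root:x:0:0" | awk -F ':' '{print $1}')
echo "$first"      # prints: root
```

Using # as the sed delimiter instead of / is convenient later on, because the URLs being processed contain many slashes.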
seq 100   prints the numbers 1 to 100, one per line
|   passes the output of the previous command to the next command
curl
curl 'https://kg.qq.com/node/personal?uid=639b9f8c252c348c37' # fetch the page source
-s   silent mode (no progress meter)
wget
wget http://tx.stream.kg.qq.com/vcloud1021.tc.qq.com/1021_6ab00ff17ce9474db25e62dccd56a330.f0.m4a   downloads the file
Loop over each line of a file:
while IFS= read -r line
do
echo "$line"
done < xxx.txt
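A loop of this kind can be written in two ways, and they differ as soon as a line contains a space. The sketch below uses a temporary sample file whose contents are made up for illustration: looping over the output of cat splits on any whitespace, while read consumes exactly one line per iteration.

```shell
# Build a small sample file; the second line contains a space.
list=$(mktemp)
printf '%s\n' "song-one" "song two" > "$list"

# for + cat splits on any whitespace, so "song two" becomes two items.
words=0
for item in $(cat "$list")
do
    words=$((words + 1))
done

# while + read consumes exactly one line per iteration.
lines=0
while IFS= read -r line
do
    lines=$((lines + 1))
done < "$list"

echo "for: $words items, while: $lines lines"   # prints: for: 3 items, while: 2 lines
rm -f "$list"
```

For a file of URLs without spaces both forms give the same result, which is why the simpler for form is often seen in quick scripts.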
Goal: batch-download songs from 全民K歌
Home page:
https://kg.qq.com/node/personal?uid=639b9f8c252c348c37
The matchmaking journey
1. Get the site (the singer's home page)
https://kg.qq.com/node/personal?uid=639b9f8c252c348c37
2. Get the girls' addresses (the play-page link of each song)
<a href="https://node.kg.qq.com/play?s=doBtj4dMh-U-3dM7&g_f=personal" class="mod_playlist__work" target="_blank"><span class="icon_tag icon_tag_main">主打</span>公主病</a>
<a href="https://node.kg.qq.com/play?s=hGw7lJhF-LWJSh1c&g_f=personal" class="mod_playlist__work" target="_blank"><span class="icon_tag icon_tag_main">主打</span>长生诀</a>
<a href="https://node.kg.qq.com/play?s=Mx0qWcMd1VwXCMbC&g_f=personal" class="mod_playlist__work" target="_blank">后会无期</a>
3. Go to the girl's home (open one play page)
https://node.kg.qq.com/play?s=doBtj4dMh-U-3dM7&g_f=personal
4. Find the girl (the audio element on the play page)
<audio id="player" title="公主病" meta="公主病" src="http://tx.stream.kg.qq.com/vcloud1021.tc.qq.com/1021_6ab00ff17ce9474db25e62dccd56a330.f0.m4a?vkey=CED45909187A9470DF7C290DF076DA57349C834CA4E7590EFF5B340279014B62D84C93CBB603273D2DCB1AFCABBC4073C7F550E117B58F53EFCEE8AD6A23682CBA774F09BFF651EE9739CBEA65D9947F7A65772D9810A00C&fromtag=1506&sdtfrom=v1506&ugcid=162901373_1463681994_77"></audio>
5. Get her (the address of the audio file)
http://tx.stream.kg.qq.com/vcloud1021.tc.qq.com/1021_6ab00ff17ce9474db25e62dccd56a330.f0.m4a
Procedure:
1. Get the site
[root@web02 ~]# curl https://kg.qq.com/node/personal?uid=639b9f8c252c348c37
2. Get the girls' addresses
[root@web02 ~]# curl -s https://kg.qq.com/node/personal?uid=639b9f8c252c348c37 | grep "mod_playlist__work"
<a href="https://node.kg.qq.com/play?s=I5rmkiI-SM9M_I-V&g_f=personal" class="mod_playlist__work" target="_blank"><span class="icon_tag icon_tag_main">主打</span>公主病</a>
<a href="https://node.kg.qq.com/play?s=Ghi_6rGzHa4ruGAj&g_f=personal" class="mod_playlist__work" target="_blank"><span class="icon_tag icon_tag_main">主打</span>长生诀</a>
<a href="https://node.kg.qq.com/play?s=98VWq29Li0AFR9uR&g_f=personal" class="mod_playlist__work" target="_blank">后会无期</a>
<a href="https://node.kg.qq.com/play?s=1ANHep1OMG8PJOuS&g_f=personal" class="mod_playlist__work" target="_blank">告白气球</a>
<a href="https://node.kg.qq.com/play?s=H-61ikH5tkwfMHxt&g_f=personal" class="mod_playlist__work" target="_blank">琴师</a>
<a href="https://node.kg.qq.com/play?s=vpKPOAveW2U14vBo&g_f=personal" class="mod_playlist__work" target="_blank">大碗宽面</a>
<a href="https://node.kg.qq.com/play?s=V39SYCVgzs8DOVJC&g_f=personal" class="mod_playlist__work" target="_blank">乡</a>
<a href="https://node.kg.qq.com/play?s=t2fdL8tqM9PKFtEo&g_f=personal" class="mod_playlist__work" target="_blank">山外小楼夜听雨</a>
Addresses:
[root@web02 ~]# curl -s https://kg.qq.com/node/personal?uid=639b9f8c252c348c37 | grep "mod_playlist__work" | sed -r 's#.*<a href="(.*)" class=.*#\1#g'
https://node.kg.qq.com/play?s=98VWq29LXpIps9Lr&g_f=personal
https://node.kg.qq.com/play?s=Pkzv5-PiJ3R-cPXS&g_f=personal
https://node.kg.qq.com/play?s=W4S9MoWjeY-ZTW3T&g_f=personal
https://node.kg.qq.com/play?s=rJIXFGrlCpE7AlLa&g_f=personal
https://node.kg.qq.com/play?s=1ANHep1OdpUBq1Qd&g_f=personal
https://node.kg.qq.com/play?s=D0xbuEDsXGBRnDU_&g_f=personal
https://node.kg.qq.com/play?s=2tCoa92Q-Wfj72w9&g_f=personal
https://node.kg.qq.com/play?s=D0xbuEDsCR7weD8Y&g_f=personal
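The sed step above can be tried offline on a single sample line, without contacting the site. The sketch below reuses one anchor line from the grep output; the second expression, with [^"]*, is an alternative of my own rather than something from these notes, shown only because it stops at the first closing quote.

```shell
# One sample line in the shape returned by the grep step above.
line='<a href="https://node.kg.qq.com/play?s=98VWq29Li0AFR9uR&g_f=personal" class="mod_playlist__work" target="_blank">后会无期</a>'

# Same expression as above: capture everything between href=" and " class=.
greedy=$(echo "$line" | sed -E 's#.*<a href="(.*)" class=.*#\1#g')

# Variant: [^"]* cannot run past the closing quote of the href value.
strict=$(echo "$line" | sed -E 's#.*<a href="([^"]*)".*#\1#g')

echo "$greedy"
echo "$strict"
```

Both commands print the same play-page URL for this line. The greedy (.*) form relies on " class= appearing only once per line, which holds for the output shown above.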
3. Go to the girl's home
[root@web02 ~]# curl -s 'https://node.kg.qq.com/play?s=98VWq29LXpIps9Lr&g_f=personal'
4. Find the girl
[root@web02 ~]# curl -s 'https://node.kg.qq.com/play?s=98VWq29LXpIps9Lr&g_f=personal' | grep m4a | head -n1 | sed -r 's#.*playurl":"(.*m4a).*#\1#g'
http://tx.stream.kg.qq.com/vcloud1021.tc.qq.com/1021_6ab00ff17ce9474db25e62dccd56a330.f0.m4a
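The extraction above can also be rehearsed on a local sample. The two-line page excerpt below is hypothetical: its layout and the shortened vkey=ABC value are assumptions made up for this sketch, based only on the playurl":" marker that the sed expression looks for.

```shell
# Hypothetical two-line excerpt of a play page; only the first line carries playurl.
page='{"playurl":"http://tx.stream.kg.qq.com/vcloud1021.tc.qq.com/1021_6ab00ff17ce9474db25e62dccd56a330.f0.m4a?vkey=ABC&fromtag=1506"}
<audio id="player" src="http://tx.stream.kg.qq.com/vcloud1021.tc.qq.com/1021_6ab00ff17ce9474db25e62dccd56a330.f0.m4a?vkey=ABC"></audio>'

# grep keeps the lines mentioning m4a, head keeps the first of them,
# and sed cuts out the text from after playurl":" up to the last "m4a" on that line.
song_url=$(printf '%s\n' "$page" | grep m4a | head -n1 | sed -E 's#.*playurl":"(.*m4a).*#\1#g')
echo "$song_url"
```

Because (.*m4a) is greedy, it extends to the last m4a on the line, so the query string after the file name is dropped and only the bare .m4a address remains.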
5. Get the girl
[root@web02 ~]# wget http://tx.stream.kg.qq.com/vcloud1021.tc.qq.com/1021_6ab00ff17ce9474db25e62dccd56a330.f0.m4a
Get the song title
https://node.kg.qq.com/play?s=98VWq29LXpIps9Lr&g_f=personal
<h2 class="play_name">公主病</h2>
[root@lvhanzhi tmp]# curl -s 'https://node.kg.qq.com/play?s=98VWq29LXpIps9Lr&g_f=personal'|grep play_name|awk -F '[<>]' '{print $3}'
公主病
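To see why $3 is the title, the same awk call can be run on the heading line alone. This is a small offline sketch using the line shown above as sample input.

```shell
# The title line as it appears in the page source.
line='<h2 class="play_name">公主病</h2>'

# Splitting on < or > gives: $1 empty, $2 the opening tag, $3 the title.
tag=$(echo "$line" | awk -F '[<>]' '{print $2}')
title=$(echo "$line" | awk -F '[<>]' '{print $3}')

echo "$tag"
echo "$title"
```

The bracket expression [<>] makes both characters act as separators, so the text between the opening tag and the closing tag lands in its own field.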
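Batch download script:
#!/bin/bash
# Collect the play-page URL of every song on the home page.
curl -s 'https://kg.qq.com/node/personal?uid=639b9f8c252c348c37' | grep "mod_playlist__work" | sed -r 's#.*<a href="(.*)" class=.*#\1#g' > girl_home.txt
while IFS= read -r home
do
    page=$(curl -s "$home")   # fetch each play page only once
    url=$(echo "$page" | grep m4a | head -n1 | sed -r 's#.*playurl":"(.*m4a).*#\1#g')
    name=$(echo "$page" | grep play_name | awk -F '[<>]' '{print $3}')
    curl -s -o "$name.m4a" "$url"   # save the audio under the song title
done < girl_home.txt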