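I have been reading code on OpenGrok lately. Its search and call-lookup features are very convenient, but the downside is that it lives on a server, so browsing becomes painful whenever the network is poor. I therefore wanted to download the code to my own machine and read it locally when the network is bad. I first looked online for a download tool but found nothing suitable. In the end I had to write a script myself: I found a Python script online, modified it to fit my situation, and it runs well.
The code is below. Because it was written for my environment, the keywords the script uses to extract directories and download links must be changed to match your setup.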
#!/usr/bin/env python3
import os
import re
import sys

import requests


def usage(script):
    text = 'python3 %s <link_address> [path]' % script
    print(text)


def get_file(url, path):  # download every file listed on one directory page
    content = requests.get(url).text
    file_str1 = 'class="p"'  # single file mark in html
    if file_str1 in content:
        # Get download link from html, you should change the key word according to your html
        sub_url = re.findall('class="p".*?href="(/source/download.*?)"', content)
        length = len(sub_url)
        print("\033[31m======Total %s files, will download=======\033[0m" % length)
        for sub_path in sub_url:
            cpath = sub_path[sub_path.rfind('/Qualcomm/'):]
            file_name = sub_path.split('/')[-1]
            print("downloading -> %-60s" % file_name, end=" ")
            file_path = path + cpath
            # Skip files that already exist; delete this check to download everything again
            if os.path.isfile(file_path):
                print("\033[33mFile already exists, ignoring!\033[0m")
                continue
            # The html does not contain the IP or domain, so take it from the input URL
            domain_name = url[0:url.rfind('/source/xref/Qualcomm')]
            res = requests.get(domain_name + sub_path)
            res.raise_for_status()  # stop the program if the download fails
            # Make sure the parent directory exists (needed for files in the top-level directory)
            os.makedirs(os.path.dirname(file_path), exist_ok=True)
            with open(file_path, 'wb') as out_file:
                for chunk in res.iter_content(100000):
                    out_file.write(chunk)
            print("\033[32mDone\033[0m")


def get_dir(url, path):  # walk one directory page, then recurse into subdirectories
    content = requests.get(url).text
    dir_str1 = 'class="r"'  # directory mark in html
    file_str1 = 'class="p"'  # single file mark in html
    if dir_str1 in content:
        sub_url = re.findall('class="r".*?href="(/source/history.*?)"', content)
        for sub_path in sub_url:
            path_slice = sub_path[sub_path.rfind('/Qualcomm/'):]
            if not os.path.exists(path + path_slice):
                print("will create directory %s" % (path + path_slice))
                os.makedirs(path + path_slice)
            name = sub_path.split('/')[-1]
            get_dir(url + "/" + name, path)
    if file_str1 in content:
        get_file(url, path)


if __name__ == '__main__':
    if len(sys.argv) < 2:
        usage(sys.argv[0])
        sys.exit(0)
    else:
        # Save under <path> if it is given, otherwise under the current directory
        get_dir(sys.argv[1], sys.argv[2] if len(sys.argv) > 2 else ".")
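
Before changing the keywords for your own server, it may help to try the two regular expressions and the path slicing on a small piece of HTML first. The following is only a minimal sketch: the HTML snippet, the host name opengrok.example, and the helper names local_path and server_prefix are all made up for illustration. The snippet merely imitates the class="r" / class="p" markup my script looks for, so the pages on your server may well look different.

```python
import re

# Hypothetical listing page: one directory row and one file row.
SAMPLE_HTML = (
    '<tr><td class="r"><a href="/source/history/Qualcomm/kernel/drivers">H</a></td></tr>\n'
    '<tr><td class="p"><a href="/source/download/Qualcomm/kernel/Makefile">D</a></td></tr>\n'
)
# Hypothetical page URL, as it would be passed on the command line.
PAGE_URL = "http://opengrok.example/source/xref/Qualcomm/kernel"

# The same patterns the script uses for directories and files.
DIR_PATTERN = 'class="r".*?href="(/source/history.*?)"'
FILE_PATTERN = 'class="p".*?href="(/source/download.*?)"'


def local_path(link, root="."):
    """Map a server link to a local path, keeping everything from /Qualcomm/ on."""
    return root + link[link.rfind('/Qualcomm/'):]


def server_prefix(page_url):
    """Return the part of the URL that precedes /source/xref/Qualcomm."""
    return page_url[0:page_url.rfind('/source/xref/Qualcomm')]


dir_links = re.findall(DIR_PATTERN, SAMPLE_HTML)
file_links = re.findall(FILE_PATTERN, SAMPLE_HTML)

print(dir_links)                                 # links treated as subdirectories
print(file_links)                                # links treated as downloadable files
print(local_path(file_links[0]))                 # where the file would be saved
print(server_prefix(PAGE_URL) + file_links[0])   # full URL that would be requested
```

If either list comes back empty on a page saved from your own server, that is a sign the class names, the /source/... prefixes, or the /Qualcomm/ project name need to be adjusted.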