Backing up a website as a local mirror

Latest recommended article published on 2024-03-06 15:47:52

License: CC 4.0 BY-SA

For reasons I can't go into, my organization's website needed to be mirrored locally, just in case. That is the background.

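Because the site consists of static pages, I first considered backing it up by hand, downloading each page and its resources manually. The workload turned out to be too large, so I gave up on that.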

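I searched online for tools and tried Teleport Pro first. I followed the online tutorials, but for reasons I could not determine, the downloads kept failing.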

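Searching further, I found WebZip. I downloaded a portable cracked build, configured it according to a tutorial, and the download went smoothly. A few pages and resources failed, but that mattered little, since manual adjustments were needed anyway.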

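I served the downloaded files through a local nginx and went through the 404s one by one, downloading the missing css, js, img, and html files. I also fixed the hyperlinks in the pages (many links do not end in .html, so they do not match the files in the directory).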
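
Checking the 404s one at a time can be partly automated. The following is a minimal sketch, not the procedure used in this post: it pulls `href` and `src` values out of a page with a regular expression and reports those that do not resolve to a file under the mirror root. The class name `MissingResources`, the methods `extractLinks` and `findMissing`, the regex, and the UTF-8 encoding are all illustrative assumptions. A regex only approximates HTML parsing and will miss unquoted or script-generated URLs.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MissingResources {

	// Simplified, hypothetical pattern: quoted href/src values, cut off at '#' or '?'.
	private static final Pattern LINK =
			Pattern.compile("(?:href|src)\\s*=\\s*[\"']([^\"'#?]+)", Pattern.CASE_INSENSITIVE);

	// Returns the local-looking references of a page; links with a scheme
	// (http:, mailto:, javascript:, ...) and protocol-relative links are skipped.
	static List<String> extractLinks(String html) {
		List<String> links = new ArrayList<>();
		Matcher m = LINK.matcher(html);
		while (m.find()) {
			String link = m.group(1).trim();
			if (link.isEmpty() || link.startsWith("//")
					|| link.matches("^[a-zA-Z][a-zA-Z0-9+.-]*:.*")) {
				continue;
			}
			links.add(link);
		}
		return links;
	}

	// Resolves "/x" against the mirror root and anything else against the
	// page's own directory, and keeps the references with no matching file.
	static List<String> findMissing(Path root, Path page, String html) {
		List<String> missing = new ArrayList<>();
		Path pageDir = page.toAbsolutePath().getParent();
		for (String link : extractLinks(html)) {
			Path target = link.startsWith("/")
					? root.resolve(link.substring(1))
					: pageDir.resolve(link);
			if (!Files.exists(target.normalize())) {
				missing.add(link);
			}
		}
		return missing;
	}

	public static void main(String[] args) throws Exception {
		if (args.length != 2) {
			System.out.println("usage: MissingResources <mirror-root> <page.html>");
			return;
		}
		Path root = Paths.get(args[0]);
		Path page = Paths.get(args[1]);
		// Assumption: the page is UTF-8 encoded.
		String html = new String(Files.readAllBytes(page), StandardCharsets.UTF_8);
		findMissing(root, page, html).forEach(System.out::println);
	}
}
```

Run against the nginx html directory and one page, it prints each reference that would currently return a 404, which gives a list of files still to download or links still to fix.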

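The worst part was that every page shares a common header (including the menu) with a lot of content. That means the css, js, and hyperlink references in the header of every single page need fixing. Given that workload, as a programmer, I had to write something myself.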

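1. Prepare a correct header that works locally, as HTML, and save it to a text file.

2. Have a program visit every html file, locate the part to be replaced, and substitute the prepared content.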

Without further ado, here is the Java code:


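import java.io.File;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class Rewrite {

	// Assumption: the pages and the header file are UTF-8.
	// Change this if the site uses another encoding (e.g. GBK).
	private static final Charset CHARSET = StandardCharsets.UTF_8;

	public static void main(String[] args) {
		// Root directory of the local mirror served by nginx.
		String path = "C:\\soft\\nginx-1.17.10\\html";
		// Only files with this suffix are rewritten.
		String suffix = ".html";
		// Text file holding the prepared, locally working header.
		String source = "C:\\Users\\lenovo\\Desktop\\s.txt";
		// The replaced region starts at `begin` (inclusive) and stops just before `end`.
		String begin = "<!DOCTYPE html>";
		String end = "<div class=\"main\">";
		try {
			String s = readSource(source);
			System.out.println("rp len:" + s.length());
			rewrite(path, s, suffix, begin, end);
		} catch (Exception e) {
			e.printStackTrace();
		}
	}

	// Reads a whole file as text using CHARSET.
	private static String readSource(String source) throws Exception {
		return new String(Files.readAllBytes(Paths.get(source)), CHARSET);
	}

	// Walks `path` recursively and replaces the header of every file ending in `suffix`.
	private static void rewrite(final String path, final String s, final String suffix, final String begin, final String end) throws Exception {

		if (Files.isDirectory(Paths.get(path))) {
			System.out.println("folder path:" + path);

			// listFiles() returns null if the directory cannot be read.
			File[] children = new File(path).listFiles();
			if (children == null) {
				System.out.println("cannot list folder:" + path);
				return;
			}
			for (File t : children) {
				rewrite(t.getAbsolutePath(), s, suffix, begin, end);
			}
		} else {
			System.out.println("file path:" + path);
			if (path.endsWith(suffix)) {
				String t = readSource(path);
				System.out.println("ori len:" + t.length());

				int beginindex = t.indexOf(begin);
				if (-1 == beginindex) {
					System.out.println("begin tag not found");
					return;
				}
				// Everything before the begin tag is kept.
				String pre = t.substring(0, beginindex);
				System.out.println("pre len:" + pre.length());

				// Look for the end tag only after the begin tag, so the two cannot cross.
				int endindex = t.indexOf(end, beginindex + begin.length());
				if (-1 == endindex) {
					System.out.println("end tag not found");
					return;
				}
				// The end tag itself and everything after it are kept.
				String sf = t.substring(endindex);
				System.out.println("sf len:" + sf.length());

				String n = pre + s + sf;
				System.out.println("new len:" + n.length());
				// Compare content rather than length: an equal length does not prove the header is already correct.
				if (t.equals(n)) {
					System.out.println("file already processed");
					return;
				}

				Files.write(Paths.get(path), n.getBytes(CHARSET), StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
			}
		}
	}
}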

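One thing to add: the region to replace is located by specifying a start marker and an end marker as HTML tags, and everything from the start marker up to the end marker is replaced (the end marker itself is kept). In my case, I needed to replace everything from the beginning of the file up to div class="main".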
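
To make the marker-based replacement concrete, here is a minimal sketch that works on an in-memory string instead of files. `HeaderSplice` and `splice` are hypothetical names, and the sample page and header are made up for illustration. One choice in this sketch is that the end marker is searched for only after the start marker.

```java
final class HeaderSplice {

	// Replaces everything from `begin` (inclusive) up to `end` (exclusive)
	// with `replacement`. Returns the page unchanged if a marker is missing.
	static String splice(String page, String replacement, String begin, String end) {
		int beginIndex = page.indexOf(begin);
		if (beginIndex < 0) {
			return page;
		}
		// Search for the end marker only after the begin marker.
		int endIndex = page.indexOf(end, beginIndex + begin.length());
		if (endIndex < 0) {
			return page;
		}
		return page.substring(0, beginIndex) + replacement + page.substring(endIndex);
	}

	public static void main(String[] args) {
		String begin = "<!DOCTYPE html>";
		String end = "<div class=\"main\">";
		// Made-up sample page and prepared header.
		String page = "<!DOCTYPE html><html><head>old</head><body>"
				+ "<div class=\"main\">content</div></body></html>";
		String header = "<!DOCTYPE html><html><head>new</head><body>";

		String once = splice(page, header, begin, end);
		String twice = splice(once, header, begin, end);
		System.out.println(once);
		// Running the replacement a second time changes nothing.
		System.out.println(once.equals(twice)); // prints "true"
	}
}
```

Because the prepared header ends right where the end marker begins, the tool can be run again on the same page without changing it.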