FastDFS Notes
I. Introduction
FastDFS is a high-performance distributed file system. Its main functions are file storage, synchronization, and access (upload and download), which makes it particularly well suited to file-centric web sites such as photo- and video-sharing services.
FastDFS has two roles:
- Tracker (scheduling server): schedules requests and load-balances file access
- Storage (storage server): manages the files themselves (storage, synchronization, access interfaces) and their metadata
FastDFS scales horizontally without any impact on the servers already online.
Storage servers are organized into volumes (also called groups); different volumes manage different files. A volume consists of one or more storage servers, and the servers within a volume back each other up.
Total storage capacity = the sum of all volume capacities
File ID within the cluster = volume name + file name (see the sketch below)
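For example, the upload command in section II returns a file ID such as group1/M00/00/00/wKgriFr0YmeAI_lcABr-dch7j3Q658.png. A minimal sketch of how such an ID splits into its volume (group) name and remote file name (plain Java, no FastDFS dependency needed):

public class FileIdDemo {
    public static void main(String[] args) {
        // A file ID as returned by fdfs_upload_file (example value from section II)
        String fileId = "group1/M00/00/00/wKgriFr0YmeAI_lcABr-dch7j3Q658.png";
        int slash = fileId.indexOf('/');
        String group = fileId.substring(0, slash);           // volume/group name: "group1"
        String remoteFileName = fileId.substring(slash + 1); // path on the storage server
        System.out.println(group + " | " + remoteFileName);
    }
}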
Architecture diagram (not reproduced in these notes).
II. Cluster Setup (Single Node)
1. Clone a virtual machine
Change the IP address:
vi /etc/sysconfig/network-scripts/ifcfg-eth0
Change the hostname:
vi /etc/sysconfig/network
Delete the persisted MAC address rule:
rm -rf /etc/udev/rules.d/70-persistent-net.rules
2. Installation
Install the basic build environment:
yum install gcc-c++ perl-devel pcre-devel openssl-devel zlib-devel wget
Download the sources:
wget https://github.com/happyfish100/fastdfs/archive/V5.11.tar.gz
wget https://github.com/happyfish100/libfastcommon/archive/V1.0.36.tar.gz
Compile and install:
tar -zxvf V1.0.36.tar.gz
cd libfastcommon-1.0.36/
./make.sh && ./make.sh install
cd ..
tar -zxvf V5.11.tar.gz
cd fastdfs-5.11/
(./make.sh clean cleans up after a failed build)
./make.sh && ./make.sh install
Note: be sure to use a double ampersand (&&); a single & will cause the build to fail. If that happens, run make clean and then compile again.
3. Configuration
The steps above leave several sample configuration files in /etc/fdfs; copy each one without its .sample suffix:
cd /etc/fdfs
cp tracker.conf.sample tracker.conf
cp storage.conf.sample storage.conf
cp client.conf.sample client.conf
Edit the configuration file tracker.conf:
base_path=/data/fastdfs/tracker
----------------------------------------
base_path is a directory you choose by hand; the tracker stores its data and metadata there.
The directory must be created manually:
mkdir -p /data/fastdfs/tracker
Edit the configuration file storage.conf:
base_path=/data/fastdfs/storage
store_path0=/data/fastdfs/storage/store
tracker_server=192.168.134.50:22122
base_path is the base data directory (logs, etc.)
store_path0 is the directory where the uploaded files themselves are stored
tracker_server is the tracker's IP address; the default port is 22122
mkdir -p /data/fastdfs/storage
mkdir -p /data/fastdfs/storage/store
Edit the configuration file client.conf:
base_path=/tmp
tracker_server=192.168.134.50:22122
4. Start the services
fdfs_trackerd /etc/fdfs/tracker.conf start
fdfs_storaged /etc/fdfs/storage.conf start
5. Test
# Monitoring command
fdfs_monitor /etc/fdfs/client.conf
# Upload a file
fdfs_upload_file /etc/fdfs/client.conf /root/1.png
# Download a file
fdfs_download_file /etc/fdfs/client.conf group1/M00/00/00/wKgriFr0YmeAI_lcABr-dch7j3Q658.png /root/2.png
# Delete a file
fdfs_delete_file /etc/fdfs/client.conf group1/M00/00/00/wKiGoFxzYyeAGJ0MAAAc3T8QekU.sample
III. Java API
1. Dependency (two options)
- Install the client jar manually (build the fastdfs-client-java source locally):
mvn install
- Or declare it in the pom:
<dependency>
    <groupId>net.oschina.zcx7878</groupId>
    <artifactId>fastdfs-client-java</artifactId>
    <version>1.27.0.0</version>
</dependency>
2. Prepare the configuration file fdfs_client.conf:
tracker_server = 192.168.134.50:22122
3. Test code
package io.gjf;

import org.csource.common.MyException;
import org.csource.common.NameValuePair;
import org.csource.fastdfs.*;
import org.junit.Before;
import org.junit.Test;

import java.io.FileOutputStream;
import java.io.IOException;

/**
 * Created by GuoJF on 2019/2/12
 */
public class Main {

    StorageClient client = null;

    @Before
    public void before() throws Exception {
        // Load the client configuration file
        ClientGlobal.init("fdfs_client.conf");
        TrackerClient trackerClient = new TrackerClient();
        TrackerServer trackerServer = trackerClient.getConnection();
        // All operations on the distributed file system go through this client
        client = new StorageClient(trackerServer, null);
    }

    /*
     * File upload
     */
    @Test
    public void testUpload() throws Exception {
        client.upload_file("F:\\大数据\\数据清洗.pdf", "pdf",
                new NameValuePair[]{new NameValuePair("author", "zs")});
    }

    /*
     * File download
     */
    @Test
    public void testDownload() throws Exception {
        byte[] group1s = client.download_file("group1", "M00/00/00/wKiGMlxiV4WATqHgAAaZKsx5sKo937.pdf");
        FileOutputStream fileOutputStream = new FileOutputStream("E:\\a.pdf");
        fileOutputStream.write(group1s);
        fileOutputStream.close();
    }

    /*
     * File deletion
     */
    @Test
    public void testDelete() throws IOException, MyException {
        client.delete_file("group1", "M00/00/00/wKiGMlxiV4WATqHgAAaZKsx5sKo937.pdf");
    }

    /*
     * Fetch file info
     */
    @Test
    public void testGetFileInfo() throws IOException, MyException {
        FileInfo file_info = client.get_file_info("group1", "M00/00/00/wKiGMlxiV4WATqHgAAaZKsx5sKo937.pdf");
        System.out.println("File size: " + file_info.getFileSize());
        System.out.println("Created at: " + file_info.getCreateTimestamp());
        System.out.println("Source IP: " + file_info.getSourceIpAddr());
    }

    /*
     * Fetch file metadata
     */
    @Test
    public void testMetadata() throws IOException, MyException {
        NameValuePair[] nameValuePairs = client.get_metadata("group1", "M00/00/00/wKiGMlxiV4WATqHgAAaZKsx5sKo937.pdf");
        for (NameValuePair nameValuePair : nameValuePairs) {
            System.out.println(nameValuePair.getValue());
        }
    }
}
IV. Integration with a Spring Boot Project
1. Create a Spring Boot project
A remote Spring Boot template project is available at:
https://github.com/GuoJiafeng/SpringBootSample.git
2. Configure the entry class
package io.yg;

import com.alibaba.fastjson.serializer.SerializerFeature;
import com.alibaba.fastjson.support.config.FastJsonConfig;
import com.alibaba.fastjson.support.spring.FastJsonHttpMessageConverter;
import com.github.tobato.fastdfs.FdfsClientConfig;
import org.mybatis.spring.annotation.MapperScan;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.web.HttpMessageConverters;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.EnableMBeanExport;
import org.springframework.context.annotation.Import;
import org.springframework.http.converter.HttpMessageConverter;
import org.springframework.jmx.support.RegistrationPolicy;

/**
 * Created by Administrator on 2017/12/2.
 */
@Import(FdfsClientConfig.class)
// Avoid duplicate JMX bean registration
// (background on JMX: https://blog.youkuaiyun.com/yaerfeng/article/details/28232435)
@EnableMBeanExport(registration = RegistrationPolicy.IGNORE_EXISTING)
@SpringBootApplication
@MapperScan("com.yg.dao")
public class SpringBootApp {

    public static void main(String[] args) {
        SpringApplication.run(SpringBootApp.class, args);
    }

    @Bean
    public HttpMessageConverters fastjsonHttpMessageConverter() {
        // Define a message converter
        FastJsonHttpMessageConverter fastConverter = new FastJsonHttpMessageConverter();
        // Configure fastjson, e.g. whether to pretty-print the returned JSON
        FastJsonConfig fastJsonConfig = new FastJsonConfig();
        fastJsonConfig.setSerializerFeatures(SerializerFeature.PrettyFormat);
        // Attach the configuration to the converter
        fastConverter.setFastJsonConfig(fastJsonConfig);
        HttpMessageConverter<?> converter = fastConverter;
        return new HttpMessageConverters(converter);
    }
}
3. Configure the yml file
# ===================================================================
# Distributed file system (FastDFS) configuration
# ===================================================================
fdfs:
  so-timeout: 1501
  connect-timeout: 601
  thumb-image: # thumbnail generation parameters
    width: 150
    height: 150
  tracker-list: # tracker list; multiple entries are supported
    - 192.168.134.50:22122
4. Dependency
<dependency>
    <groupId>com.github.tobato</groupId>
    <artifactId>fastdfs-client</artifactId>
    <version>1.26.5</version>
</dependency>
5. Test code
package io.yg.test;

import com.github.tobato.fastdfs.domain.StorePath;
import com.github.tobato.fastdfs.proto.storage.DownloadByteArray;
import com.github.tobato.fastdfs.service.FastFileStorageClient;
import io.yg.SpringBootApp;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

import java.io.*;

/**
 * Created by GuoJF on 2019/2/17
 */
@RunWith(SpringRunner.class)
@SpringBootTest
public class Main {

    @Autowired
    private FastFileStorageClient storageClient;

    /* File upload */
    @Test
    public void testUpload() throws FileNotFoundException {
        File file = new File("E:\\ee.md");
        FileInputStream inputStream = new FileInputStream(file);
        StorePath storePath = storageClient.uploadFile(inputStream,
                file.length(), "md", null);
        System.out.println(storePath.getGroup() + " | " + storePath.getPath());
    }

    /* File download */
    @Test
    public void testDownload() throws IOException {
        byte[] b = storageClient.downloadFile("group1",
                "M00/00/00/wKiGMlxpAHqAAAZHAAAhQD_azhs5434.md", new DownloadByteArray());
        FileOutputStream fileOutputStream = new FileOutputStream("E:\\aa.md");
        fileOutputStream.write(b);
        fileOutputStream.close();
    }
}
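The thumb-image settings in the yml above only take effect when an image is uploaded through the thumbnail-generating API. A minimal sketch of one more test method for the class above (insert it before the closing brace), assuming this client version's uploadImageAndCrtThumbImage method and a hypothetical local image path:

    /* Upload an image and generate a 150x150 thumbnail (per fdfs.thumb-image) */
    @Test
    public void testUploadWithThumbnail() throws IOException {
        File image = new File("E:\\photo.jpg"); // hypothetical test image
        try (FileInputStream in = new FileInputStream(image)) {
            StorePath storePath = storageClient.uploadImageAndCrtThumbImage(
                    in, image.length(), "jpg", null);
            // The thumbnail is stored next to the original, with the configured
            // size appended to the file name (e.g. xxx_150x150.jpg)
            System.out.println(storePath.getFullPath());
        }
    }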
V. Integrating fastdfs-nginx-module
1. Why
- Provides HTTP access to the files in the distributed file system
- Solves the replication-delay problem (requests are redirected to the source storage server that actually holds the file)
2. Environment setup
# Requires FastDFS version >= 5.11
# Upload fastdfs-nginx-module and nginx-1.11.1.tar.gz to the server
Extract nginx and build it together with fastdfs-nginx-module:
tar -zxvf nginx-1.11.1.tar.gz
cd nginx-1.11.1
./configure --add-module=/usr/local/src/fastdfs-nginx-module/src/ --prefix=/usr/local/nginx
make && make install
Copy the required configuration files into /etc/fdfs:
cp /usr/local/src/fastdfs-nginx-module/src/mod_fastdfs.conf /etc/fdfs
cp fastdfs-5.11/conf/http.conf /etc/fdfs
cp fastdfs-5.11/conf/mime.types /etc/fdfs
Edit the configuration file nginx.conf:
vi /usr/local/nginx/conf/nginx.conf
server {
    listen       8888; ## must match http.server_port in storage.conf
    server_name  localhost;
    location ~ /group[0-9]/ {
        ngx_fastdfs_module;
    }
    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root html;
    }
}
Edit the configuration file mod_fastdfs.conf:
vi /etc/fdfs/mod_fastdfs.conf
tracker_server=192.168.134.50:22122
url_have_group_name = true
group_name=group1
# the file storage directory of the storage server being exposed over HTTP
store_path0=/data/fastdfs/storage/store
Start the nginx server:
/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
Start the FastDFS services:
fdfs_trackerd /etc/fdfs/tracker.conf start
fdfs_storaged /etc/fdfs/storage.conf start
3. Test
Open a browser and request a stored file.
URL convention:
protocol://host(domain or IP):port/group/filename
http://nginxserver:8888/groupname/filename
http://192.168.134.50:8888/group1/M00/00/00/wKiGMlxlOS2AR4wnAAaZKjlbdDE408.pdf
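A minimal sketch that assembles such a URL from an upload result in the Java client (the host and port here are assumptions; substitute your own nginx address):

import com.github.tobato.fastdfs.domain.StorePath;

public class AccessUrlDemo {
    // Hedged helper: build the HTTP access URL for an uploaded file
    static String accessUrl(StorePath storePath) {
        final String NGINX_HOST = "192.168.134.50"; // assumed nginx host
        final int NGINX_PORT = 8888;                // the listen port configured above
        // getFullPath() returns e.g. "group1/M00/00/00/xxx.pdf"
        return "http://" + NGINX_HOST + ":" + NGINX_PORT + "/" + storePath.getFullPath();
    }
}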
VI. Integrating FastDHT
Deduplication principle:
The usual approach to deduplication is to store the MD5 of every uploaded file. When a new file arrives, its MD5 is compared against the stored values; a match means the file has already been uploaded, so a link to the existing file is placed in the current user's library instead of storing the bytes again. This avoids wasting storage space and saves the user's time (compare the "instant upload" feature of Baidu Netdisk, QQ, and WeChat).
FastDFS itself supports a file-deduplication mechanism (fdfs_crc32, which is faster than MD5), but it needs FastDHT to store the file-hash index. FastDHT is an open-source key-value database by the same author.
On every upload, the storage server computes the file's hash and looks it up on the FastDHT server. If nothing is found, the hash is written and the file is saved; if a match is found, a new symbolic link (ln -s) to the existing file is created and the file itself is not saved again, as sketched below.
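A minimal illustration of the dedup key (plain Java CRC32 over a file's bytes; the actual lookup against FastDHT happens inside the storage server):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.CRC32;

public class Crc32Demo {
    public static void main(String[] args) throws IOException {
        // Hash the file content, as fdfs_crc32 does before consulting FastDHT
        byte[] data = Files.readAllBytes(Paths.get("/root/1.png")); // example path
        CRC32 crc = new CRC32();
        crc.update(data);
        // Identical content always yields the same value, so a repeated
        // upload is detected and stored as a symlink instead of a copy
        System.out.printf("crc32=%08x%n", crc.getValue());
    }
}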
Enabling file deduplication in FastDFS
- Environment setup
Install the BerkeleyDB base library:
tar -zxf db-4.7.25.tar.gz
cd db-4.7.25
cd build_unix/
./../dist/configure
make && make install
(if the build fails, run make clean and retry)
Install FastDHT:
tar zxf FastDHT_v2.01.tar.gz
cd FastDHT
./make.sh
./make.sh install
(./make.sh clean cleans up a failed build)
After installation, an fdht directory appears under /etc:
/etc/fdht/
├── fdht_client.conf
├── fdhtd.conf
└── fdht_servers.conf
Configure fdhtd.conf:
base_path=/data/fastdht
mkdir /data/fastdht
Configure fdht_servers.conf:
group_count = 1
group0 = 192.168.134.50:11411
Configure storage.conf:
check_file_duplicate=1
keep_alive=1
#include /etc/fdht/fdht_servers.conf
Note: #include must follow this exact format: no space between # and include, and one space between #include and the path.
Start the services:
/usr/local/bin/fdhtd /etc/fdht/fdhtd.conf start
/usr/bin/fdfs_trackerd /etc/fdfs/tracker.conf restart
/usr/bin/fdfs_storaged /etc/fdfs/storage.conf restart
Test
Upload the same file more than once: only one copy of the original is kept, and each identical upload after the first is stored as a link to it. A quick programmatic check is sketched below.
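A minimal sketch of that check, reusing the StorageClient from section III (the local path is an assumption):

    /* With dedup enabled, uploading identical bytes twice stores the data only once */
    @Test
    public void testDedup() throws Exception {
        String[] first  = client.upload_file("/root/1.png", "png", null);
        String[] second = client.upload_file("/root/1.png", "png", null);
        // upload_file returns {group name, remote file name}
        System.out.println(first[0] + "/" + first[1]);
        System.out.println(second[0] + "/" + second[1]);
        // On the storage server, ls -l shows the second entry as a symlink
    }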
VII. Building a FastDFS Cluster
Prepare three virtual machines:
192.168.134.161 tracker server
192.168.134.162 storage server group1
192.168.134.163 storage server group2
After cloning, remember the usual three steps (delete the MAC rule, change the IP, change the hostname), then reboot.
Also delete the data left on each clone by the previous single-node setup, i.e. everything in the old data directories.
1. Plain cluster (no add-on modules)
FastDFS01 acts as the tracker server
FastDFS02 and FastDFS03 act as storage servers
Note: remember to delete the leftover files first:
rm -rf /data/fastdfs/storage/store/*
rm -rf /data/fastdfs/storage/log/*
rm -rf /data/fastdfs/storage/data/*
-------------------------------
rm -rf /data/fastdfs/tracker/*
------------------------------
rm -rf /data/fastdht/*
Configure the tracker server
FastDFS01 needs no configuration changes at first;
the tracker can be started directly.
Configure the storage servers: FastDFS02 and FastDFS03
Open storage.conf on each of the two nodes:
# Set the group name per node:
# 192.168.134.162 -> group1
# 192.168.134.163 -> group2
group_name=group1
base_path=/data/fastdfs/storage/
store_path0=/data/fastdfs/storage/store
tracker_server=192.168.134.161:22122

group_name=group2
base_path=/data/fastdfs/storage/
store_path0=/data/fastdfs/storage/store
tracker_server=192.168.134.161:22122

base_path and store_path0 can be customized; tracker_server must point at the tracker host's IP.
The two blocks above configure the two storage servers respectively.
Once configured, start storaged on both storage servers.
Both storage servers should now show up under the monitor command. (You must first update client.conf so that tracker_server points at the new tracker before the monitor command will work.)
Monitor command: fdfs_monitor /etc/fdfs/client.conf
[root@FDFS01 data]# fdfs_monitor /etc/fdfs/client.conf
[2019-02-23 13:26:09] DEBUG - base_path=/tmp, connect_timeout=30, network_timeout=60, tracker_server_count=1, anti_steal_token=0, anti_steal_secret_key length=0, use_connection_pool=0, g_connection_pool_max_idle_time=3600s, use_storage_id=0, storage server id count: 0
server_count=1, server_index=0
tracker server is 192.168.134.121:22122
group count: 2
Group 1:
group name = group1
disk total space = 17581 MB
disk free space = 10949 MB
trunk free space = 0 MB
storage server count = 1
active server count = 1
storage server port = 23000
storage HTTP port = 8888
store path count = 1
subdir count per path = 256
current write server index = 0
current trunk file id = 0
Storage 1:
id = 192.168.134.122
ip_addr = 192.168.134.122 ACTIVE
http domain =
version = 5.11
join time = 2019-02-23 13:24:42
up time = 2019-02-23 13:24:42
total storage = 17581 MB
free storage = 10949 MB
upload priority = 10
store_path_count = 1
subdir_count_per_path = 256
storage_port = 23000
storage_http_port = 8888
current_write_path = 0
source storage id =
if_trunk_server = 0
connection.alloc_count = 256
connection.current_count = 0
connection.max_count = 0
total_upload_count = 0
success_upload_count = 0
total_append_count = 0
success_append_count = 0
total_modify_count = 0
success_modify_count = 0
total_truncate_count = 0
success_truncate_count = 0
total_set_meta_count = 0
success_set_meta_count = 0
total_delete_count = 0
success_delete_count = 0
total_download_count = 0
success_download_count = 0
total_get_meta_count = 0
success_get_meta_count = 0
total_create_link_count = 0
success_create_link_count = 0
total_delete_link_count = 0
success_delete_link_count = 0
total_upload_bytes = 0
success_upload_bytes = 0
total_append_bytes = 0
success_append_bytes = 0
total_modify_bytes = 0
success_modify_bytes = 0
stotal_download_bytes = 0
success_download_bytes = 0
total_sync_in_bytes = 0
success_sync_in_bytes = 0
total_sync_out_bytes = 0
success_sync_out_bytes = 0
total_file_open_count = 0
success_file_open_count = 0
total_file_read_count = 0
success_file_read_count = 0
total_file_write_count = 0
success_file_write_count = 0
last_heart_beat_time = 2019-02-23 13:25:56
last_source_update = 1970-01-01 08:00:00
last_sync_update = 1970-01-01 08:00:00
last_synced_timestamp = 1970-01-01 08:00:00
Group 2:
group name = group2
disk total space = 17581 MB
disk free space = 10949 MB
trunk free space = 0 MB
storage server count = 1
active server count = 1
storage server port = 23000
storage HTTP port = 8888
store path count = 1
subdir count per path = 256
current write server index = 0
current trunk file id = 0
Storage 1:
id = 192.168.134.123
ip_addr = 192.168.134.123 ACTIVE
http domain =
version = 5.11
join time = 2019-02-23 13:24:45
up time = 2019-02-23 13:25:01
total storage = 17581 MB
free storage = 10949 MB
upload priority = 10
store_path_count = 1
subdir_count_per_path = 256
storage_port = 23000
storage_http_port = 8888
current_write_path = 0
source storage id =
if_trunk_server = 0
connection.alloc_count = 256
connection.current_count = 0
connection.max_count = 0
total_upload_count = 0
success_upload_count = 0
total_append_count = 0
success_append_count = 0
total_modify_count = 0
success_modify_count = 0
total_truncate_count = 0
success_truncate_count = 0
total_set_meta_count = 0
success_set_meta_count = 0
total_delete_count = 0
success_delete_count = 0
total_download_count = 0
success_download_count = 0
total_get_meta_count = 0
success_get_meta_count = 0
total_create_link_count = 0
success_create_link_count = 0
total_delete_link_count = 0
success_delete_link_count = 0
total_upload_bytes = 0
success_upload_bytes = 0
total_append_bytes = 0
success_append_bytes = 0
total_modify_bytes = 0
success_modify_bytes = 0
stotal_download_bytes = 0
success_download_bytes = 0
total_sync_in_bytes = 0
success_sync_in_bytes = 0
total_sync_out_bytes = 0
success_sync_out_bytes = 0
total_file_open_count = 0
success_file_open_count = 0
total_file_read_count = 0
success_file_read_count = 0
total_file_write_count = 0
success_file_write_count = 0
last_heart_beat_time = 2019-02-23 13:26:02
last_source_update = 1970-01-01 08:00:00
last_sync_update = 1970-01-01 08:00:00
last_synced_timestamp = 1970-01-01 08:00:00
[root@FDFS01 data]#
Tip: if this does not work, first comment out the file-deduplication settings configured earlier.
2. Integrating Nginx
Integrating nginx needs the same base environment as the single-node setup described earlier, so this walkthrough assumes the machines were cloned from that single-node image and the basic installation steps can be skipped.
Edit mod_fastdfs.conf on each storage server
FastDFS02
tracker_server=192.168.134.161:22122
url_have_group_name = true
group_name=group1
# the file storage directory of the storage server that serves these web requests
store_path0=/data/fastdfs/storage/store
FastDFS03
tracker_server=192.168.134.161:22122
url_have_group_name = true
group_name=group2
# the file storage directory of the storage server that serves these web requests
store_path0=/data/fastdfs/storage/store
Start nginx:
/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
Tip: if this does not work, first comment out the file-deduplication settings.
New problems:
- Accessing files in a different group requires switching to a different IP address
- Exposing the storage servers directly to users is a data-security risk
Solution:
Set up a gateway server (nginx) that provides a single entry point for all requests, so the storage servers are no longer exposed directly to users.
Build the gateway nginx server:
cd /usr/local
rm -rf nginx/
cd /usr/local/src/
rm -rf nginx-1.11.1
tar -zxvf nginx-1.11.1.tar.gz
cd nginx-1.11.1
./configure --prefix=/usr/local/nginx
make && make install
Edit the gateway nginx's configuration file:
vi /usr/local/nginx/conf/nginx.conf
# inside the http block:
upstream fdfs_group1 {
    server 192.168.134.162:8888 weight=1 max_fails=2 fail_timeout=30s;
}
upstream fdfs_group2 {
    server 192.168.134.163:8888 weight=1 max_fails=2 fail_timeout=30s;
}
---------------------------------------------------------------------------------------
# inside the server block:
location /group1/M00 {
    proxy_pass http://fdfs_group1;
}
location /group2/M00 {
    proxy_pass http://fdfs_group2;
}
Start the nginx on the tracker server; requests for /group1/... and /group2/... are then proxied through it to the matching storage group:
/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
3. Integrating DHT
Install the fdht service on the tracker server.
If it is already installed, there is no need to reinstall;
just delete the old DHT data.
Edit fdht_servers.conf on the storage servers:
group_count = 1
group0 = 192.168.134.161:11411
Both machines must use identical settings.
Edit storage.conf on both machines (restore the DHT settings that were commented out earlier):
check_file_duplicate=1
keep_alive=1
#include /etc/fdht/fdht_servers.conf
Start the services:
fdhtd /etc/fdht/fdhtd.conf start                          (on the tracker: start the dedup service)
/usr/bin/fdfs_trackerd /etc/fdfs/tracker.conf restart     (restart, on the tracker)
/usr/bin/fdfs_storaged /etc/fdfs/storage.conf restart     (restart, on each storage server)