Web Cache (Varnish)

钟明家

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/weixin_36888575/article/details/98641142

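This article takes a close look at cache control in the HTTP protocol, including the role of the request and response header fields and how they interact, and then covers how the Varnish cache works, how to configure it, and practical tips.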

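Age: when a proxy answers a request from its own cache, this header states how many seconds have passed since the entity was generated by (or last validated with) the origin server.
2. Request directives (Cache-Control):
no-cache: do not answer me from the cache without checking; the cache must go to the web server now. Browsers typically send this on a forced refresh (Ctrl+F5).
max-age: accept only objects whose Age is no greater than max-age and that have not expired.
max-stale: accept expired objects, provided they have been expired for no more than max-stale seconds.
min-fresh: accept only objects whose freshness lifetime is at least their current Age plus min-fresh, that is, objects that will stay fresh for at least min-fresh more seconds.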
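The request directives above can be read as a small decision function. The following is a minimal sketch in Python, not how any particular cache implements it; the function name acceptable and its parameters are hypothetical, and max-stale is modeled only in its form with a numeric value.

```python
def acceptable(age, lifetime, *, no_cache=False, max_age=None,
               max_stale=None, min_fresh=None):
    """Return True if a cached object may be served without contacting the origin.

    age      -- current Age of the cached object, in seconds
    lifetime -- freshness lifetime assigned by the origin, in seconds
    """
    if no_cache:
        return False                        # client insists on the origin server
    if max_age is not None and age > max_age:
        return False                        # too old for this client
    if min_fresh is not None:
        return lifetime >= age + min_fresh  # must stay fresh a while longer
    if age <= lifetime:
        return True                         # still fresh
    staleness = age - lifetime
    return max_stale is not None and staleness <= max_stale
```

For instance, an object with Age 90 and a lifetime of 60 seconds is rejected by default but accepted when the client sends max-stale=60.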
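3. Response directives (Cache-Control):
public: the cached content may be used to answer any user.
private: the cached content may be used only to answer the user who originally requested it, so a shared cache must not store it.
no-cache: the response may be stored, but it may be returned to a client only after it has been validated with the web server.
max-age: how long the object in this response stays fresh, in seconds.
s-maxage: like max-age, but it applies only to shared (public) caches, where it overrides max-age.
4. Requests and responses alike: no-store (the content must not be cached at all).
5. Freshness checking:
(1) Expiration date:
HTTP/1.0: Expires
Expires: Thu, 04 Jun 2015 23:38:18 GMT
HTTP/1.1: Cache-Control: max-age
Cache-Control: max-age=600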
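How a cache might turn these headers into a freshness lifetime can be sketched as follows. This is an illustrative Python model under simplifying assumptions: the function name freshness_lifetime is hypothetical, headers are passed as a plain dict with exact-case names, and only s-maxage, max-age and Expires are considered.

```python
import re
from email.utils import parsedate_to_datetime

def freshness_lifetime(headers, shared=True):
    """Return the freshness lifetime in seconds, or None if the headers give none."""
    cc = headers.get("Cache-Control", "")
    directives = dict(re.findall(r"(s-maxage|max-age)\s*=\s*(\d+)", cc))
    if shared and "s-maxage" in directives:
        return int(directives["s-maxage"])   # shared caches prefer s-maxage
    if "max-age" in directives:
        return int(directives["max-age"])    # HTTP/1.1 relative lifetime
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(0, int((expires - date).total_seconds()))  # HTTP/1.0 absolute date
    return None
```

Cache-Control takes priority over Expires here, which matches the HTTP/1.1 rule that max-age overrides Expires when both are present.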

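(2) Revalidation:
If the original content has not changed, only the headers are sent (no body), with status 304 (Not Modified).
If the original content has changed, a normal response is sent with status 200.
If the original content no longer exists, the response is 404, and the cache object should be removed from the cache as well.

(3) Conditional request headers:
If-Modified-Since: validates against the timestamp (Last-Modified) of the requested content.
If-Unmodified-Since: the inverse; the request succeeds only if the content has not been modified since the given time.
If-Match: the request succeeds only if the current ETag matches one of the listed values.
If-None-Match: validates against the ETag; the content is sent only if its current ETag matches none of the listed values.
ETag: "faiy89345"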
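The revalidation outcomes (304, 200, 404) can be illustrated from the origin server's point of view. This is a minimal sketch, not a complete implementation of conditional requests: the function revalidate and the shape of the current record are hypothetical, and If-Modified-Since is simplified to a numeric timestamp instead of an HTTP date.

```python
def revalidate(request_headers, current):
    """Return (status, body) for a conditional GET.

    current is None if the resource is gone, else a dict with keys
    "etag", "last_modified" (seconds since the epoch) and "body".
    """
    if current is None:
        return 404, b""                       # the cache should drop its copy
    inm = request_headers.get("If-None-Match")
    ims = request_headers.get("If-Modified-Since")
    if inm is not None:
        # If-None-Match takes precedence over If-Modified-Since
        tags = [t.strip() for t in inm.split(",")]
        if current["etag"] in tags or "*" in tags:
            return 304, b""
    elif ims is not None and current["last_modified"] <= ims:
        return 304, b""
    return 200, current["body"]
```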
II. Varnish architecture
DSL: VCL (Varnish Configuration Language), the interface for configuring cache policy; a simple domain-specific programming language.

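1. Management process: compiles VCL and applies new configurations; monitors Varnish; initializes Varnish; provides the CLI. Its role is similar to the nginx master process.
command line: the management interfaces (the CLI; telnet, which is insecure; a web interface, which is a paid product). VCL must be compiled into binary form before it can be used.
child process mgmt: manages the child process and monitors Varnish.
initialization: initializes the varnish program.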

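2. Child/cache process:
Acceptor: accepts new connection requests.
worker threads: handle client requests.
object expiry: removes expired objects from the cache.
backend communication: talks to the backend servers; in v4 the dedicated backend subroutines make this flow much easier to follow.
log/stats: logging and statistics.
command line: accepts CLI commands.
storage/hashing: object storage and hash computation.
Note: Varnish runs a single child (cache) process; concurrency is scaled with worker threads rather than with extra child processes. The ceiling is thread_pools x thread_pool_max, so with 2 thread pools of at most 500 threads each, about 1000 requests can be served concurrently.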

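3. Log tools:
varnishlog: the raw log.
varnishstat: counters and statistics.
varnishhist: a histogram of request processing times.
varnishtop: a ranking of the most frequent log entries.
varnishncsa: log output in the NCSA format used by web servers.

Logs: the Shared Memory Log is usually about 90 MB by default and is split into two parts: counters first, then request-related data.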

4. Installation:
yum -y install epel-release
yum -y install varnish
(1/3): jemalloc-3.6.0-1.el7.x86_64.rpm
(2/3): varnish-libs-4.0.4-3.el7.x86_64.rpm
(3/3): varnish-4.0.4-3.el7.x86_64.rpm
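Note: memory is allocated and released with malloc() and free(); jemalloc is required when building from source.
5. How Varnish stores cache objects:
(1) file: a single file whose layout is managed entirely by Varnish itself. The metadata (keys) live in memory and there is no persistence: after a restart the metadata is gone and the cached data is invalid. This backend generates a lot of disk I/O. If, say, a large number of images is cached this way, use SSDs, or, budget permitting, PCIe devices or other storage whose I/O approaches memory speed.
(2) malloc: memory, served by jemalloc. Memory is limited, and a very large allocation will inevitably fragment over time, so large malloc storage is not recommended.
Note on memory: compare the MySQL InnoDB buffer pool. On CentOS, 2.6-series and later kernels enable huge pages automatically by default, and Varnish will use them; a large amount of memory can sometimes reduce performance rather than improve it.
(3) persistent (experimental): file-based persistent storage; cached files survive a Varnish restart. It is still experimental in 4.0, so avoid it where possible.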
6. How to start Varnish:
/etc/varnish/varnish.params sets the parameters used at startup (on CentOS 6 the file is /etc/sysconfig/varnish)
/usr/lib/systemd/system/varnish.service
/usr/lib/systemd/system/varnishlog.service
/usr/lib/systemd/system/varnishncsa.service
(1) Appendix: varnish.params:
# Set this to 1 to make systemd reload try to switch VCL without restart.
RELOAD_VCL=1

# Main configuration file. You probably want to change it.
VARNISH_VCL_CONF=/etc/varnish/default.vcl

# Default address and port to bind to. Blank address means all IPv4
# and IPv6 interfaces, otherwise specify a host name, an IPv4 dotted
# quad, or an IPv6 address in brackets.
# VARNISH_LISTEN_ADDRESS=192.168.1.5
VARNISH_LISTEN_PORT=80

# Admin interface listen address and port
VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1
VARNISH_ADMIN_LISTEN_PORT=6082

# Shared secret file for admin interface
VARNISH_SECRET_FILE=/etc/varnish/secret

# Backend storage specification, see Storage Types in the varnishd(5)
# man page for details.
#VARNISH_STORAGE="malloc,256M"
VARNISH_STORAGE="file,/var/lib/varnish/varnish_storage.bin,1G"
# User and group for the varnishd worker processes
VARNISH_USER=varnish
VARNISH_GROUP=varnish

# Other options, see the man page varnishd(1)
DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"
(2) Start command (from varnish.service):
ExecStart=/usr/sbin/varnishd \
-P /var/run/varnish.pid \
-f $VARNISH_VCL_CONF \
-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \
-T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT} \
-S $VARNISH_SECRET_FILE \
-s $VARNISH_STORAGE \
$DAEMON_OPTS
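To see how the values in varnish.params end up as varnishd options, the expansion can be imitated in a few lines of Python. This is only a rough sketch: the helper varnishd_argv is hypothetical, it handles plain KEY=VALUE lines only, and it approximates rather than reproduces systemd's own variable expansion.

```python
import shlex

def varnishd_argv(params_text):
    """Parse KEY=VALUE lines and expand them the way the ExecStart line does."""
    env = {}
    for line in params_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue                          # skip blanks and comments
        key, _, value = line.partition("=")
        env[key] = " ".join(shlex.split(value))  # drop surrounding quotes
    listen = env.get("VARNISH_LISTEN_ADDRESS", "") + ":" + env["VARNISH_LISTEN_PORT"]
    admin = env["VARNISH_ADMIN_LISTEN_ADDRESS"] + ":" + env["VARNISH_ADMIN_LISTEN_PORT"]
    return (["/usr/sbin/varnishd", "-P", "/var/run/varnish.pid",
             "-f", env["VARNISH_VCL_CONF"], "-a", listen, "-T", admin,
             "-S", env["VARNISH_SECRET_FILE"], "-s", env["VARNISH_STORAGE"]]
            + shlex.split(env.get("DAEMON_OPTS", "")))  # word-split extra options
```

With the listen address left commented out, the -a argument becomes ":80", which is the form that makes varnishd listen on all interfaces.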

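III. Configuring Varnish
1. Command-line options of the varnishd program, which define how the program itself operates:
the listening socket, the storage type to use, and so on, plus additional configuration parameters.
-p param=value
-r param,param,… : marks the listed parameters as read-only. On this system the options are set in /etc/varnish/varnish.params.
-a address[:port][,address[:port][…] The packaged configuration listens on 6081 by default; a web service normally uses 80.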
Listen for client requests on the specified address and port. The address can be a host name ("localhost"), an IPv4 dotted-quad ("127.0.0.1"), or an IPv6 address enclosed in square brackets ("[::1]"). If address is not specified, varnishd will listen on all available IPv4 and IPv6 interfaces. If port is not specified, the default HTTP port as listed in /etc/services is used. Multiple listening addresses and ports can be specified as a whitespace or comma-separated list.
-b host[:port] Specifies the backend server and its port.
Use the specified host as backend server. If port is not specified, the default is 8080.
-s [name=]type[,options] Storage type. Use the specified storage backend. The storage backends can be one of the following:
· malloc[,size]
· file[,path[,size[,granularity]]] (granularity: the allocation granularity)
· persistent,path,size (still experimental in 4.0, so not ready for production)
-T address[:port] Management port. Offer a management interface on the specified address and port. See Management Interface for a list of management commands.

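2. Parameters given with -p: runtime parameters that control, for example, how the threads work; they can also be changed while the program is running, through its CLI.

3. VCL: configures the caching behavior of the cache system through a VCL configuration file; the file is compiled first and then applied, which requires a C compiler.
4. Basic configuration:
(1) VCL configuration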
# vim /etc/varnish/default.vcl
vcl 4.0;    # version marker required at the top of a Varnish 4 VCL file

backend default {
    .host = "172.16.31.125";
    .port = "80";
}

sub vcl_recv {
}

sub vcl_backend_response {
}

sub vcl_deliver {
}

# vim /etc/varnish/varnish.params
RELOAD_VCL=1
VARNISH_VCL_CONF=/etc/varnish/default.vcl
VARNISH_LISTEN_PORT=80
VARNISH_ADMIN_LISTEN_ADDRESS=172.16.31.124
VARNISH_ADMIN_LISTEN_PORT=6082
VARNISH_SECRET_FILE=/etc/varnish/secret
VARNISH_STORAGE="file,/var/lib/varnish/varnish_storage.bin,1G"
VARNISH_USER=varnish
VARNISH_GROUP=varnish
DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"
(2) Command-line tool:
varnishadm -S /etc/varnish/secret -T IP:PORT
VARNISH_ADMIN_LISTEN_ADDRESS=172.16.31.124
VARNISH_ADMIN_LISTEN_PORT=6082
# varnishadm -S /etc/varnish/secret -T 172.16.31.124:6082
ping
200
PONG 1496379784 1.0
status //show the state of the child process
200
Child in state running
vcl.list
200
active 0 boot
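vcl.load haha /etc/varnish/default.vcl //compile the file and load it under the name haha
200
VCL compiled.
vcl.use haha //make the configuration active
200
VCL 'haha' now active
vcl.list //list the loaded configurations
200
available 0 boot
active 0 haha
vcl.discard haha //discard a configuration (the active configuration cannot be discarded)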
200

vcl.list
200
active 0 boot
param.show thread_pool_max //show a single parameter
200
thread_pool_max
Value is: 500 [threads]
Default is: 5000
Minimum is: 5

The maximum number of worker threads in each pool.

Do not set this higher than you have to, since excess worker
threads soak up RAM and CPU and generally just get in the way
of getting work done.

Minimum is 10 threads.

NB: This parameter may take quite some time to take (full)
effect.
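param.set thread_pool_max 5000 //set the maximum number of worker threads per pool
200
panic.show //show panic information for the child process
300
Child has not panicked or panic has been cleared
vcl.show boot //show the VCL source of a configuration before compilation
backend.list //list the backends (no health probe has been configured here)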
200
Backend name Refs Admin Probe
default(172.16.31.125,,80) 1 probe Healthy (no probe)

ban.list //list the cache invalidation (ban) rules
200
Present bans:
1496379674.602972 0 C
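The interactive session above can also be scripted, since varnishadm accepts a CLI command as trailing arguments. The following is a sketch only: the helper names varnishadm_argv and reload_vcl are hypothetical, the address and secret path are the ones used in this article, and error handling is left to subprocess.

```python
import subprocess

def varnishadm_argv(command, secret="/etc/varnish/secret",
                    address="172.16.31.124:6082"):
    """Build the argv for one non-interactive varnishadm call."""
    return ["varnishadm", "-S", secret, "-T", address] + command.split()

def reload_vcl(name, path, run=subprocess.run):
    """Compile the VCL file under a new name, then make it active."""
    for command in (f"vcl.load {name} {path}", f"vcl.use {name}"):
        run(varnishadm_argv(command), check=True)  # raises if varnishadm fails
```

Calling reload_vcl("haha", "/etc/varnish/default.vcl") would issue the same vcl.load and vcl.use pair as the session; the run parameter exists only so that the call can be replaced in a dry run.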

(3) Log: (when testing from a browser, Ctrl+F5 sends no-cache, so the request must not be answered from cached content; F5 only skips the browser's local cache)
varnishlog (raw log format)
varnishncsa (httpd-style log)

Sample varnishncsa output:
172.16.31.123 - - [02/Jun/2017:02:39:23 -0400] "GET http://172.16.31.124/centos/ HTTP/1.1" 200 240 "http://172.16.31.124/centos/5.11/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0"
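A line in this format is easy to post-process. The sketch below parses one entry with a regular expression; the pattern and the function parse_ncsa are illustrative, assume the default field layout shown in the sample (client, two placeholder fields, time, request line, status, size, referer, user agent), and will not match a customized varnishncsa format.

```python
import re

NCSA = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_ncsa(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = NCSA.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])  # numeric status for easy filtering
    return fields
```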

（4）Statistics
# varnishstat: shows counters and status

(5) Top:
# varnishtop (ranks log entries by how often they occur)
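IV. VCL:
state engine: the engines depend on one another to some degree; when an engine can be followed by more than one downstream engine, it must name the next one with return.
vcl_recv: a request has been received.
vcl_hash: computes the normalized hash key.
vcl_hit: the object was found in the cache.
vcl_miss: the object was not found in the cache.
vcl_fetch: fetches the data from the backend server.
vcl_deliver: sends the response to the client.
vcl_pipe: if the request method is not understood, Varnish steps aside and sets up a pipe between the client and the backend.
vcl_pass: for requests that must not be cached, such as those carrying cookies or authorization data.
vcl_error: if Varnish can tell that a request will fail, it can build the error response itself.
state engine workflow (v3):
vcl_recv --> vcl_hash --> vcl_hit --> vcl_deliver
vcl_recv --> vcl_hash --> vcl_miss --> vcl_fetch --> vcl_deliver
vcl_recv --> vcl_pass --> vcl_fetch --> vcl_deliver
vcl_recv --> vcl_pipe
Note: vcl_hit and vcl_miss can both hand a request straight to vcl_pass; this serves special purposes such as removing cached objects by hand.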
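The v3 workflow can be written down as a transition table, which makes it easy to check whether a given path through the engines is allowed. This is a simplified model built only from the four flows and the note above: the names V3_NEXT and is_valid_path are hypothetical, and vcl_error is left out.

```python
# Allowed next engines for each v3 engine, taken from the flows listed above.
V3_NEXT = {
    "vcl_recv": ["vcl_hash", "vcl_pass", "vcl_pipe"],
    "vcl_hash": ["vcl_hit", "vcl_miss"],
    "vcl_hit": ["vcl_deliver", "vcl_pass"],
    "vcl_miss": ["vcl_fetch", "vcl_pass"],
    "vcl_pass": ["vcl_fetch"],
    "vcl_fetch": ["vcl_deliver"],
    "vcl_pipe": [],       # terminal: bytes are shuffled between client and backend
    "vcl_deliver": [],    # terminal: response goes to the client
}

def is_valid_path(path):
    """True if path starts at vcl_recv, follows allowed hops, and ends at a terminal engine."""
    if not path or path[0] != "vcl_recv":
        return False
    for here, there in zip(path, path[1:]):
        if there not in V3_NEXT.get(here, []):
            return False
    return V3_NEXT.get(path[-1]) == []
```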
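v4:
Client side:
vcl_recv: a request has been received.
vcl_hash: computes the normalized hash key.
vcl_hit: the object was found in the cache.
vcl_miss: the object was not found in the cache.
vcl_deliver: sends the response to the client.
vcl_pass: for requests that must not be cached, such as those carrying cookies or authorization data.
vcl_pipe: if the request method is not understood, Varnish steps aside and sets up a pipe between the client and the backend.
vcl_purge: removes a single cached object; in Varnish v3 this was handled in vcl_hit.
vcl_synth: builds a synthetic response; on the client side it takes over the role of v3's vcl_error.
Backend side:
vcl_backend_fetch
vcl_backend_response
vcl_backend_error
worker threads: handle client requests.
backend communication: talks to the backend servers through vcl_backend_fetch, vcl_backend_response and vcl_backend_error.
Each client request is handled by one thread, and the backend fetch runs in a separate thread.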
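V. Language syntax:
1. General rules
(1) //, # and /* */ introduce comments, which the compiler ignores.
(2) sub $name: defines a subroutine:
sub vcl_recv {

}
(3) Loops are not supported.
(4) There are many built-in variables; where a variable can be used depends closely on the state engine.
(5) The terminating statement return(action) is supported; subroutines have no return values.
(6) The language is domain-specific.
(7) Operators: =, ==, ~, !, &&, ||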

2. Conditional statements:
if (CONDITION) {

} else {

}
3. Variable assignment:
set name = value;
unset name;
===========================================================================================
req.http.HEADER: refers to the named HEADER of the HTTP request message;
req.http.X-Forwarded-For
req.http.Authorization
req.http.Cookie
req.request: the request method (v3 name; in v4 it is req.method)
For example:
===========================================================================================
sub vcl_recv {
    if (req.method == "PRI") {
        /* We do not support SPDY or HTTP/2.0 */
        return (synth(405));
    }

    if (req.restarts == 0) {
        /* First pass only: the request has not been restarted */
        if (req.http.X-Forwarded-For) {
            /* The header already exists: append client.ip to it */
            set req.http.X-Forwarded-For = req.http.X-Forwarded-For + "," + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }
    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. Varnish does not understand these methods, so it pipes them */
        return (pipe);
    }
    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default; other methods bypass the cache */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default: requests with authorization or cookies bypass the cache */
        return (pass);
    }
    return (hash);
}
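The order of the checks in this subroutine matters, and it can be easier to see in a compact model. The following Python function mirrors only the return decisions of the vcl_recv example (it ignores the X-Forwarded-For handling); the name recv_action and the dict-based header representation are assumptions made for illustration.

```python
KNOWN_METHODS = {"GET", "HEAD", "PUT", "POST", "TRACE", "OPTIONS", "DELETE"}

def recv_action(method, headers):
    """Return the action the vcl_recv logic above would pick for a request."""
    if method == "PRI":
        return "synth(405)"           # SPDY / HTTP/2.0 preface is refused
    if method not in KNOWN_METHODS:
        return "pipe"                 # unknown method: hand the connection through
    if method not in ("GET", "HEAD"):
        return "pass"                 # only GET and HEAD are looked up in the cache
    names = {name.lower() for name in headers}
    if "authorization" in names or "cookie" in names:
        return "pass"                 # per-user content is not cached by default
    return "hash"
```

A plain GET without cookies is therefore the only kind of request that reaches the cache lookup.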
4. Functions
===========================================================================================
The hash function (hash_data)
client.ip: the client's IP address;
For example:
===========================================================================================
sub vcl_hash {
    /* Hash the request URL */
    hash_data(req.url);

    if (req.http.host) {
        /* Add the host name to the hash */
        hash_data(req.http.host);
    } else {
        /* No Host header: add the server IP, usually Varnish's external address */
        hash_data(server.ip);
    }

    return (lookup);
}
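The effect of feeding the URL and then the host (or server IP) into the hash can be imitated outside Varnish. This is a conceptual sketch only: the function cache_key is hypothetical, the default server IP is the address used in this article, and the byte layout is not claimed to match the digest Varnish computes internally.

```python
import hashlib

def cache_key(url, host=None, server_ip="172.16.31.124"):
    """Hex digest of the URL plus the Host header, or the server IP if there is no Host."""
    digest = hashlib.sha256()
    for part in (url, host if host else server_ip):
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")   # separator so ("/ab", "c") and ("/a", "bc") differ
    return digest.hexdigest()
```

The same URL requested under two different host names yields two different keys, which is why virtual hosts do not share cache objects.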
===========================================================================================
vcl_backend_response
===========================================================================================
sub vcl_backend_response {
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Surrogate-Control ~ "no-store" ||
        (!beresp.http.Surrogate-Control &&
         beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
        beresp.http.Vary == "*") {

        /* Mark as "Hit-For-Pass" for the next 2 minutes: do not cache during that time */
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}
===========================================================================================
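The condition in vcl_backend_response combines five separate tests, so a small model helps when reasoning about which responses end up as hit-for-pass. The Python function below is a sketch of that condition only; the name backend_response_policy is hypothetical, the TTL is given in plain seconds, and header names must be passed in exactly the capitalization shown.

```python
import re

def backend_response_policy(ttl, headers):
    """Return (ttl, uncacheable) as the vcl_backend_response logic above would set them."""
    def get(name):
        return headers.get(name, "")

    surrogate = get("Surrogate-Control")
    cache_control_forbids = re.search(r"no-cache|no-store|private",
                                      get("Cache-Control")) is not None
    uncacheable = (
        ttl <= 0
        or bool(get("Set-Cookie"))
        or "no-store" in surrogate
        or (not surrogate and cache_control_forbids)  # Cache-Control counts only without Surrogate-Control
        or get("Vary") == "*"
    )
    return (120, True) if uncacheable else (ttl, False)
```

Note that a Surrogate-Control header without no-store makes the Cache-Control test irrelevant, exactly as in the VCL condition.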