Web Cache (Varnish)

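This article takes a close look at cache control in the HTTP protocol, including what the request and response header fields do and how they interact, and then walks through how the Varnish cache works, how to configure it, and some practical tips.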

I. Caching in the HTTP protocol
1. HTTP cache framework
Cache-Control   = "Cache-Control" ":" 1#cache-directive
cache-directive = cache-request-directive
    | cache-response-directive
cache-request-directive =
      "no-cache"
    | "no-store" (backup)
    | "max-age" "=" delta-seconds
    | "max-stale" [ "=" delta-seconds ]
    | "min-fresh" "=" delta-seconds
    | "no-transform"
    | "only-if-cached"
    | cache-extension
cache-response-directive =
      "public"
    | "private" [ "=" <"> 1#field-name <"> ]
    | "no-cache" [ "=" <"> 1#field-name <"> ]
    | "no-store"
    | "no-transform"
    | "must-revalidate"
    | "proxy-revalidate"
    | "max-age" "=" delta-seconds
    | "s-maxage" "=" delta-seconds
    | cache-extension

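Age: when a proxy server answers a request with an entity from its own cache, this header states how much time has passed since the entity was generated.
2. Request directives:
no-cache: do not answer me from the cache; fetch the entity from the web server now. A forced browser refresh (Ctrl+F5) sends this directive.
max-age: accept only objects whose Age is no greater than the max-age value and that have not expired.
max-stale: expired objects are acceptable, as long as they have been stale for no longer than the max-stale value.
min-fresh: accept only objects whose freshness lifetime is at least their current Age plus the min-fresh value, that is, objects that will stay fresh for at least min-fresh more seconds.
3. Response directives:
public: the cached content may be used to answer any user.
private: the cached content may only be used to answer the user who originally requested it, so shared caches must not store it.
no-cache: the response may be stored, but it can be returned to a client only after it has been revalidated with the web server.
max-age: the freshness lifetime of the object in this response, i.e. how long it may be cached.
s-maxage: the freshness lifetime of the object for shared (public) caches, where it takes precedence over max-age.
4. Requests and responses alike: no-store (the content must not be cached at all).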
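
As a rough illustration of the request directives above, the following Python sketch parses a Cache-Control value and decides whether a cached object may be served without going back to the origin. The function names and the `age` / `freshness_lifetime` parameters are assumptions made for this example, and the parser deliberately ignores quoted field-name lists such as private="a, b".

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directives without an argument are stored as True.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def request_accepts(cc, age, freshness_lifetime):
    """Return True if the cached object satisfies the request directives.

    cc: dict from parse_cache_control(); age and freshness_lifetime in seconds.
    """
    if "no-cache" in cc or "no-store" in cc:
        return False
    if "max-age" in cc and age > int(cc["max-age"]):
        return False
    if "min-fresh" in cc:
        # Must remain fresh for at least min-fresh more seconds.
        return freshness_lifetime >= age + int(cc["min-fresh"])
    if age < freshness_lifetime:
        return True  # still fresh
    if "max-stale" in cc:
        limit = cc["max-stale"]
        # A bare max-stale accepts any staleness.
        return limit is True or age - freshness_lifetime <= int(limit)
    return False
```

For instance, a request carrying `max-stale=30` accepts an object that expired 20 seconds ago but not one that expired 100 seconds ago.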
5. Freshness checking:
(1) Expiration time:
HTTP/1.0: Expires
Expires: Thu, 04 Jun 2015 23:38:18 GMT
HTTP/1.1: Cache-Control: max-age
Cache-Control: max-age=600
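
The two expiry mechanisms can be combined into a single freshness lifetime. The sketch below is one simplified way to do it in Python, assuming max-age wins over Expires when both are present and that Expires is measured against the response's Date header; `freshness_lifetime` is a name invented for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, now=None):
    """Return the freshness lifetime in seconds, or None if the response
    carries no explicit expiry information."""
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.strip().isdigit():
            return int(arg)  # HTTP/1.1: relative lifetime
    expires = headers.get("Expires")
    if expires:
        # HTTP/1.0: absolute date, compared with Date (or the current time).
        date = headers.get("Date")
        if date:
            base = parsedate_to_datetime(date)
        else:
            base = now or datetime.now(timezone.utc)
        delta = parsedate_to_datetime(expires) - base
        return max(0, int(delta.total_seconds()))
    return None
```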

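(2) Revalidation:
If the original content has not changed, the server responds with headers only (no body) and status 304 (Not Modified);
If the original content has changed, the server responds normally with status 200;
If the original content is gone, the server responds with 404, and the cache object should be deleted from the cache as well.

(3) Conditional request headers:
If-Modified-Since: validates against the timestamp of the requested content;
If-Unmodified-Since: proceed only if the content has not been modified since the given time;
If-Match: proceed only if the entity tag matches;
If-None-Match: validates against the entity tag, for example:
Etag: faiy89345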
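
To make the revalidation rules concrete, here is a minimal Python sketch of how an origin server might answer a conditional GET. It is a simplification under stated assumptions: `revalidate` is a hypothetical helper, entity tags are compared as plain strings (no weak-tag handling), and If-None-Match is checked before If-Modified-Since.

```python
from email.utils import parsedate_to_datetime


def revalidate(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200.

    etag: current entity tag of the resource (or None).
    last_modified: HTTP-date string of the last change (or None).
    """
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        tags = [t.strip() for t in inm.split(",")]
        return 304 if "*" in tags or etag in tags else 200
    ims = request_headers.get("If-Modified-Since")
    if ims is not None and last_modified is not None:
        # Unchanged if the resource is not newer than the client's copy.
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(ims):
            return 304
    return 200
```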
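II. Varnish architecture
DSL: VCL (Varnish Configuration Language), the interface for configuring caching policy; a simple domain-specific programming language.

1. Management process: compiles VCL and applies new configuration; monitors Varnish; initializes Varnish; provides the CLI interface. It plays a role similar to the nginx master process.
command line: the management interface (CLI; telnet, which is insecure; the web interface is a paid product). VCL has to be compiled into binary form before it is used.
child process mgmt: manages and monitors the child process
initialization: initializes the Varnish program

2. Child/cache:
Acceptor: accepts new connection requests;
worker threads: handle client requests;
object expiry: removes expired objects from the cache;
backend communication: talks to the backend servers; Varnish 4 has dedicated backend subroutines, which make this flow much easier to follow
log/stats: logging and statistics
command line: accepts CLI commands
storage/hashing: object storage and hash computation
Example: Varnish runs a single child process and scales through worker threads organized in thread pools. If each pool may start at most 500 worker threads (thread_pool_max=500), serving 1000 concurrent requests takes two thread pools rather than two child processes.

3. log/file:
varnishlog: the raw log
varnishstat: statistics counters
varnishhist: a histogram of request processing
varnishtop: a ranked list of the most frequent log entries
varnishncsa: log output in a web-server (NCSA) format

Log: the Shared Memory Log. The shared memory log is typically about 90 MB by default and has two parts: the first holds counters, the second holds request-related data.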
4. Installation:
yum -y install epel-release
yum -y install varnish
(1/3): jemalloc-3.6.0-1.el7.x86_64.rpm
(2/3): varnish-libs-4.0.4-3.el7.x86_64.rpm
(3/3): varnish-4.0.4-3.el7.x86_64.rpm
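Note: memory allocation and release (malloc(), free()) go through jemalloc, which is required when building from source.
5. How Varnish stores cached objects:
(1) file: a single file that Varnish manages as its own private storage area. The metadata (keys) lives in memory and there is no persistence: after a restart the metadata is gone and the cached data is invalid. It also generates a lot of disk I/O.
(2) malloc: memory, allocated through jemalloc. Memory is limited, and a very large allocation will inevitably fragment over time, so very large memory caches are not recommended.
Note on memory: compare the MySQL InnoDB buffer pool. CentOS kernels from 2.6 on enable huge pages by default, and Varnish will use them; with a lot of memory this can sometimes reduce performance.
(3) persistent: file-based persistent storage; cached objects survive a Varnish restart. It is still experimental in Varnish 4 (including 4.0), so avoid it where possible.
For example, when caching large numbers of images with file storage, use SSDs if you can, or PCIe flash or other devices whose I/O performance approaches that of memory if the budget allows.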
6. How to start Varnish:
/etc/varnish/varnish.params: sets the parameters the program starts with (on CentOS 6 the file is /etc/sysconfig/varnish)
/usr/lib/systemd/system/varnish.service
/usr/lib/systemd/system/varnishlog.service
/usr/lib/systemd/system/varnishncsa.service
(1) Appendix: varnish.params:
# Set this to 1 to make systemd reload try to switch VCL without restart.
RELOAD_VCL=1

# Main configuration file. You probably want to change it.
VARNISH_VCL_CONF=/etc/varnish/default.vcl

# Default address and port to bind to. Blank address means all IPv4
# and IPv6 interfaces, otherwise specify a host name, an IPv4 dotted
# quad, or an IPv6 address in brackets.
# VARNISH_LISTEN_ADDRESS=192.168.1.5
VARNISH_LISTEN_PORT=80

# Admin interface listen address and port
VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1
VARNISH_ADMIN_LISTEN_PORT=6082

# Shared secret file for admin interface
VARNISH_SECRET_FILE=/etc/varnish/secret

# Backend storage specification, see Storage Types in the varnishd(5)
# man page for details.
#VARNISH_STORAGE="malloc,256M"
VARNISH_STORAGE="file,/var/lib/varnish/varnish_storage.bin,1G"
# User and group for the varnishd worker processes
VARNISH_USER=varnish
VARNISH_GROUP=varnish

# Other options, see the man page varnishd(1)
DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"
(2) Startup command (from the systemd unit):
ExecStart=/usr/sbin/varnishd \
-P /var/run/varnish.pid \
-f $VARNISH_VCL_CONF \
-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \
-T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT} \
-S $VARNISH_SECRET_FILE \
-s $VARNISH_STORAGE \
$DAEMON_OPTS
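
To see how the variables in varnish.params end up on the varnishd command line, the following Python sketch expands them the same way the ExecStart line does. It is only an illustration: `build_varnishd_argv` is a hypothetical helper, and systemd performs this substitution itself.

```python
import shlex


def build_varnishd_argv(params):
    """Assemble the varnishd argument list from varnish.params values."""
    listen = "{}:{}".format(params.get("VARNISH_LISTEN_ADDRESS", ""),
                            params["VARNISH_LISTEN_PORT"])
    admin = "{}:{}".format(params["VARNISH_ADMIN_LISTEN_ADDRESS"],
                           params["VARNISH_ADMIN_LISTEN_PORT"])
    argv = [
        "/usr/sbin/varnishd",
        "-P", "/var/run/varnish.pid",
        "-f", params["VARNISH_VCL_CONF"],
        "-a", listen,                      # client-facing address:port
        "-T", admin,                       # management interface
        "-S", params["VARNISH_SECRET_FILE"],
        "-s", params["VARNISH_STORAGE"],
    ]
    # DAEMON_OPTS is a single string holding several "-p name=value" pairs.
    argv += shlex.split(params.get("DAEMON_OPTS", ""))
    return argv


params = {
    "VARNISH_VCL_CONF": "/etc/varnish/default.vcl",
    "VARNISH_LISTEN_PORT": "80",
    "VARNISH_ADMIN_LISTEN_ADDRESS": "127.0.0.1",
    "VARNISH_ADMIN_LISTEN_PORT": "6082",
    "VARNISH_SECRET_FILE": "/etc/varnish/secret",
    "VARNISH_STORAGE": "file,/var/lib/varnish/varnish_storage.bin,1G",
    "DAEMON_OPTS": "-p thread_pool_min=5 -p thread_pool_max=500",
}
print(" ".join(build_varnishd_argv(params)))
```

With VARNISH_LISTEN_ADDRESS left unset, the -a argument becomes ":80", which makes varnishd listen on all interfaces.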

III. Configuring Varnish
1. varnishd command-line options: how the program itself runs
The listening socket, the storage type, and so on, plus extra configuration parameters;
-p param=value
-r param,param,... : makes the listed parameters read-only; the options are set in /etc/varnish/varnish.params
-a address[:port][,address[:port][...] the packaged default port is 6081; a public web service normally uses 80
Listen for client requests on the specified address and port. The address can be a host name ("localhost"), an IPv4 dotted-quad ("127.0.0.1"), or an IPv6 address enclosed in square brackets ("[::1]"). If address is not specified, varnishd will
listen on all available IPv4 and IPv6 interfaces. If port is not specified, the default HTTP port as listed in /etc/services is used. Multiple listening addresses and ports can be specified as a whitespace or comma-separated list.
-b host[:port] specifies the backend server and its port
Use the specified host as backend server. If port is not specified, the default is 8080.
-s [name=]type[,options] storage type. Use the specified storage backend. The storage backends can be one of the following:
· malloc[,size]
· file[,path[,size[,granularity]]] (granularity: the allocation unit)
· persistent,path,size (still experimental in 4.0)
-T address[:port] (management port) Offer a management interface on the specified address and port. See Management Interface for a list of management commands.

2. Parameters given with -p: runtime parameters such as the threading behavior; they can also be changed while the program is running, through its CLI.

3. VCL: configures how the cache behaves; written in a VCL configuration file; compiled first and then applied; depends on a C compiler.
4. Basic configuration:
(1) VCL configuration
# vim /etc/varnish/default.vcl
backend default {
    .host = "172.16.31.125";
    .port = "80";
}

sub vcl_recv {
}

sub vcl_backend_response {
}

sub vcl_deliver {
}

# vim /etc/varnish/varnish.params
RELOAD_VCL=1
VARNISH_VCL_CONF=/etc/varnish/default.vcl
VARNISH_LISTEN_PORT=80
VARNISH_ADMIN_LISTEN_ADDRESS=172.16.31.124
VARNISH_ADMIN_LISTEN_PORT=6082
VARNISH_SECRET_FILE=/etc/varnish/secret
VARNISH_STORAGE="file,/var/lib/varnish/varnish_storage.bin,1G"
VARNISH_USER=varnish
VARNISH_GROUP=varnish
DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"
(2) Command-line tool:
varnishadm -S /etc/varnish/secret -T IP:PORT
VARNISH_ADMIN_LISTEN_ADDRESS=172.16.31.124
VARNISH_ADMIN_LISTEN_PORT=6082
# varnishadm -S /etc/varnish/secret -T 172.16.31.124:6082
ping
200
PONG 1496379784 1.0
status // show the child state
200
Child in state running
vcl.list
200
active 0 boot
vcl.load haha /etc/varnish/default.vcl // compile the file and load it as configuration "haha"
200
VCL compiled.
vcl.use haha // switch to this configuration
200
VCL 'haha' now active
vcl.list // list loaded configurations
200
available 0 boot
active 0 haha
vcl.discard haha // discard the configuration
200

vcl.list
200
active 0 boot
param.show thread_pool_max // show a single parameter
200
thread_pool_max
Value is: 500 [threads]
Default is: 5000
Minimum is: 5

The maximum number of worker threads in each pool.

Do not set this higher than you have to, since excess worker
threads soak up RAM and CPU and generally just get in the way
of getting work done.

Minimum is 10 threads.

NB: This parameter may take quite some time to take (full)
effect.
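param.set thread_pool_max 5000 // set the maximum number of worker threads
200
panic.show // show child panic information
300
Child has not panicked or panic has been cleared
vcl.show boot // show the VCL source as it was before compilation
backend.list // list backends (no health probe has been configured here)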
200
Backend name Refs Admin Probe
default(172.16.31.125,,80) 1 probe Healthy (no probe)

ban.list // list the cache invalidation (ban) rules
200
Present bans:
1496379674.602972 0 C

(3) Log: Ctrl+F5 means "do not answer me from the cache" (no-cache); F5 means "do not use the local browser cache"
varnishlog (raw log format)
varnishncsa (httpd-style access log)

Sample varnishncsa output:
172.16.31.123 - - [02/Jun/2017:02:39:23 -0400] "GET http://172.16.31.124/centos/ HTTP/1.1" 200 240 "http://172.16.31.124/centos/5.11/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0"

(4) Statistics
# varnishstat: shows the statistics counters

(5) Top:
# varnishtop (ranks log entries by frequency)
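IV. VCL:
state engine: the engines are related to one another to some degree; when an engine can be followed by more than one downstream engine, it must use return to name the engine to move to;
vcl_recv: a request has been received
vcl_hash: computes the hash of the normalized request
vcl_hit: cache hit
vcl_miss: cache miss
vcl_fetch: fetches the object from the backend server
vcl_deliver: sends the response to the client
vcl_pipe: when the request method is not understood, Varnish simply sets up a pipe between the client and the backend and relays the traffic
vcl_pass: for requests that must not be cached, such as those carrying cookies or authentication information
vcl_error: when Varnish can tell that a request will fail, it builds the error response itself
state engine workflow (v3):
vcl_recv --> vcl_hash --> vcl_hit --> vcl_deliver
vcl_recv --> vcl_hash --> vcl_miss --> vcl_fetch --> vcl_deliver
vcl_recv --> vcl_pass --> vcl_fetch --> vcl_deliver
vcl_recv --> vcl_pipe
Note: vcl_hit and vcl_miss can both hand the request directly to vcl_pass; this is used for special purposes (manually removing cached objects).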
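
The v3 workflow above can be modelled as a small transition table. The Python sketch below is only an illustration: the labels on the edges ("hit", "miss", and so on) are names chosen for this example and are not all actual VCL return actions.

```python
# Edges of the v3 state engine as listed above; keys are illustrative labels.
V3_TRANSITIONS = {
    "vcl_recv": {"lookup": "vcl_hash", "pass": "vcl_pass", "pipe": "vcl_pipe"},
    "vcl_hash": {"hit": "vcl_hit", "miss": "vcl_miss"},
    "vcl_hit": {"deliver": "vcl_deliver", "pass": "vcl_pass"},
    "vcl_miss": {"fetch": "vcl_fetch", "pass": "vcl_pass"},
    "vcl_pass": {"fetch": "vcl_fetch"},
    "vcl_fetch": {"deliver": "vcl_deliver"},
}


def trace(decisions, start="vcl_recv"):
    """Follow a sequence of decisions and return the engines visited."""
    path = [start]
    state = start
    for decision in decisions:
        state = V3_TRANSITIONS[state][decision]  # KeyError on an invalid move
        path.append(state)
    return path


print(" --> ".join(trace(["lookup", "miss", "fetch", "deliver"])))
```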
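v4:
Client side:
vcl_recv: a request has been received
vcl_hash: computes the hash of the normalized request
vcl_hit: cache hit
vcl_miss: cache miss
vcl_deliver: sends the response to the client
vcl_pass: for requests that must not be cached, such as those carrying cookies or authentication information
vcl_pipe: when the request method is not understood, Varnish simply sets up a pipe between the client and the backend and relays the traffic
vcl_purge: removes a single cached object; in Varnish 3 this was handled in vcl_hit
vcl_synth
Backend side:
vcl_backend_fetch
vcl_backend_response
vcl_backend_error
worker threads: handle client requests;
backend communication: talks to the backend servers; vcl_backend_fetch, vcl_backend_response, vcl_backend_error
Each client request is handled by one thread, and the backend fetch is handled by a separate thread.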
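V. Language syntax:
1. General rules
(1) //, #, /* */ mark comments, which the compiler ignores;
(2) sub $name: defines a subroutine;
sub vcl_recv {

}
(3) Loops are not supported;
(4) There are many built-in variables; where a variable can be used depends closely on the state engine;
(5) A terminating statement is supported: return(action); there is no return value;
(6) Domain-specific;
(7) Operators: =, ==, ~, !, &&, ||

2. Conditional statements:
if (CONDITION) {

} else {

}
3. Variable assignment:
set name = value;
unset name;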
===========================================================================================
req.http.HEADER: the named HEADER of the HTTP request message;
req.http.X-Forwarded-For
req.http.Authorization
req.http.Cookie
req.method: the request method (named req.request in Varnish 3)
Example:
===========================================================================================
sub vcl_recv {
    if (req.method == "PRI") {
        /* We do not support SPDY or HTTP/2.0 */
        return (synth(405));
    }

    if (req.restarts == 0) {
        /* First pass: this request has not been restarted */
        if (req.http.X-Forwarded-For) {
            /* X-Forwarded-For already exists: append client.ip to it */
            set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }
    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. Varnish does not recognize these methods, so it pipes them */
        return (pipe);
    }
    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default; other methods bypass the cache */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default: requests with authentication or cookies bypass the cache */
        return (pass);
    }
    return (hash);
}
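
The decision order in this vcl_recv can be restated as a plain function, which makes it easy to check which action a given request would get. This Python sketch is an illustration only; `recv_action` is a hypothetical name, and it leaves out the X-Forwarded-For handling.

```python
KNOWN_METHODS = {"GET", "HEAD", "PUT", "POST", "TRACE", "OPTIONS", "DELETE"}


def recv_action(method, headers):
    """Return the action the vcl_recv logic above would choose."""
    if method == "PRI":
        return "synth(405)"  # SPDY / HTTP/2.0 preface is rejected
    if method not in KNOWN_METHODS:
        return "pipe"        # unknown method: relay bytes to the backend
    if method not in ("GET", "HEAD"):
        return "pass"        # only GET and HEAD are looked up in the cache
    names = {name.lower() for name in headers}
    if "authorization" in names or "cookie" in names:
        return "pass"        # per-user content is not cached by default
    return "hash"


print(recv_action("GET", {"Host": "example.com"}))
```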
4. Functions
===========================================================================================
hash function
client.ip: the client IP address;
Example:
===========================================================================================
sub vcl_hash {
    /* Hash the request URL */

    hash_data(req.url);

    if (req.http.host) {
        /* Hash the host name */
        hash_data(req.http.host);
    } else {
        /* Hash the IP address, usually the external address of Varnish */
        hash_data(server.ip);
    }

    return (lookup);
}
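
Conceptually, each hash_data() call feeds one more component into the key under which the object is stored. The Python sketch below mimics that idea with SHA-256; it is not meant to reproduce the digest Varnish computes internally, and the `cache_key` name and the NUL separator are assumptions made for this example.

```python
import hashlib


def cache_key(url, host=None, server_ip="127.0.0.1"):
    """Build a lookup key from the URL plus the Host header (or server IP)."""
    digest = hashlib.sha256()
    digest.update(url.encode("utf-8"))
    digest.update(b"\0")  # separator so "/a" + "b" differs from "/ab" + ""
    digest.update((host if host else server_ip).encode("utf-8"))
    return digest.hexdigest()


print(cache_key("/centos/", "172.16.31.124"))
```

Two virtual hosts serving the same path therefore get different cache entries, which is why the host name is part of the key.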
===========================================================================================
vcl_backend_response
===========================================================================================
sub vcl_backend_response {
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Surrogate-Control ~ "no-store" ||
        (!beresp.http.Surrogate-Control &&
         beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
        beresp.http.Vary == "*") {

        /* Mark as "Hit-For-Pass" for the next 2 minutes: the object is not cached during that time */

        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}
===========================================================================================
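
The condition in this vcl_backend_response can be read as a single predicate over the backend response. The following Python sketch restates it for illustration; `is_uncacheable` is a hypothetical name, and the TTL is passed in as a plain number of seconds.

```python
import re


def is_uncacheable(ttl, headers):
    """Return True if the response would be marked hit-for-pass."""
    h = {name.lower(): value for name, value in headers.items()}
    if ttl <= 0:
        return True
    if "set-cookie" in h:
        return True
    surrogate = h.get("surrogate-control")
    if surrogate is not None and "no-store" in surrogate:
        return True
    # Cache-Control is only consulted when Surrogate-Control is absent.
    if surrogate is None and re.search(r"no-cache|no-store|private",
                                       h.get("cache-control", "")):
        return True
    return h.get("vary") == "*"


print(is_uncacheable(600, {"Cache-Control": "private"}))
```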

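### Varnish Cache deployment guide

#### 1. Install Varnish Cache

On Debian- or Ubuntu-based systems, install Varnish through the package manager:

```bash
sudo apt-get update
sudo apt-get install varnish
```

On Red Hat/CentOS systems, install it with yum:

```bash
sudo yum install epel-release
sudo yum install varnish
```

Once Varnish is installed, continue with the configuration.

#### 2. Change the default port settings

The web server normally listens on port 80. For traffic to pass through Varnish first instead of reaching the backend server directly, the listening ports of these services have to be changed. Edit `/etc/default/varnish` so that Varnish uses the default HTTP port 80:

```bash
DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,256m"
```

At the same time, change the configuration of the web server (Apache, Nginx, and so on) so that it no longer occupies this port and instead listens on an unused port such as `8080`.

#### 3. Edit the VCL (Varnish Configuration Language)

Create or edit `/etc/varnish/default.vcl` and add custom rules to tune the caching behavior. Here is a simple example that speeds up the loading of static resources:

```vcl
backend default {
    .host = "127.0.0.1";
    .port = "8080"; # assumes the web server now runs on this port
}

sub vcl_recv {
    if (req.url ~ "\.(jpg|jpeg|png|gif|css|js)$") {
        unset req.http.cookie;
    }
}
```

This code removes the Cookie header when the end of the request URL matches the given pattern, so that such resources can be cached.

#### 4. Set the caching policy

Beyond the basic configuration above, more complex scenarios need additional cache control logic. For example, adding suitable cache directives to the response headers lets browsers and other intermediate proxies take part in the caching as well. The same `.vcl` file can be extended as follows:

```vcl
sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT from Varnish";
    } else {
        set resp.http.X-Cache = "MISS from Varnish";
    }

    # Add cache control headers to instruct browsers and proxies how long they should keep this content.
    set resp.http.Cache-Control = "public,max-age=3600";
}
```

This tells the client whether the page was served from the cache or fetched for the first time, and suggests that it keep a copy for a while to reduce repeated requests.

After making all the changes, restart Varnish and the web server so that the new settings take effect.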