
Original article: [url]http://click.aliyun.com/m/22119/[/url]
Abstract: First-hand experience for individual webmasters on how to analyze their own websites.

Introduction
Many individual webmasters run nginx as the server for their websites. To understand how a site is being accessed, there are generally two approaches:

Use a service such as CNZZ: embed a JavaScript snippet in the front-end pages; the script fires on every page view and records the visit.
Analyze nginx's access log and mine useful information from it.
Each approach has pros and cons:

CNZZ is easy to use and its metrics are clearly defined, but it can only record page views: requests such as ajax calls are not captured, and neither are crawler visits.
The access log records every request, so the information is very detailed, but it requires the webmaster to understand the log in detail and be willing to get hands-on.
The two approaches complement each other; only together do they give a deeper picture of the site's state.

Log Service recently launched a SQL analysis feature, which greatly lowers the barrier for webmasters to analyze access logs. This article explains in detail how to use Log Service to compute various metrics from the access log.

Nginx access log format
A typical nginx access log configuration:

log_format main '$remote_addr - $remote_user [$time_local] "$request" $http_host '
                '$status $request_length $body_bytes_sent "$http_referer" '
                '"$http_user_agent" $request_time';

access_log access.log main;
Field reference:

remote_addr : client address
remote_user : client user name
time_local : server time
request : request line, including the method, URL, and HTTP protocol version
http_host : the host requested by the client
status : HTTP status code of the response
request_length : size of the request
body_bytes_sent : size of the response body
http_referer : referring page
http_user_agent : client (browser) identifier
request_time : total request latency
Collecting access logs into Log Service
First, collect the logs into Log Service.

See the 5-minute quick start documentation.

Once the logs are in Log Service, configure the type of each column:

(figure: index attribute configuration)

Note: the request field is split into two columns, method and url.
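A plausible sketch of the column types, inferred from the queries below rather than taken from the original figure: text for remote_addr, remote_user, method, url, http_host, http_referer, and user_agent; long for status, request_length, and body_bytes_sent; and double (or long) for request_time, so that avg() and max() can be applied to it.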
Log sample:

(figure: sample log entry)
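The original sample is shown as an image; a hypothetical log line in the format above (all values invented purely for illustration) would look like:

192.168.1.100 - - [11/Jun/2017:10:00:05 +0800] "GET /0 HTTP/1.1" www.example.com 200 312 1024 "http://www.example.com/index.html" "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36" 0.123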

Analyzing the access log
Typical needs when analyzing an access log include: the site's PV, UV, hot pages, hot methods, error requests, client types, referring pages, and so on. The sections below walk through how each metric is computed.
PV can be counted not only as a total over a period, but also per smaller time window, for example every 5 minutes:

Query:

*|select from_unixtime(__time__ - __time__ % 300) as t,
         count(1) as pv
         group by __time__ - __time__ % 300
         order by t limit 60

Result:

(figure: PV per 5-minute window)
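As a quick check of the bucketing arithmetic, with an assumed timestamp: for __time__ = 1497070805, __time__ % 300 = 5, so the log is grouped into the bucket starting at 1497070800; from_unixtime then renders that bucket start as a human-readable time, yielding one row per 300-second window.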

UV per 5 minutes over the last hour
approx_distinct computes an approximate distinct count; it is far cheaper than an exact count and usually accurate enough for UV.

Query:

*|select from_unixtime(__time__ - __time__ % 300) as t,
         approx_distinct(remote_addr) as uv
         group by __time__ - __time__ % 300
         order by t limit 60

(figure: UV per 5-minute window)

Total UV over the last hour

Query:

*|select approx_distinct(remote_addr)

Result:

(figure: total UV)

Top 10 most-visited pages in the last hour

*|select url, count(1) as pv group by url order by pv desc limit 10

(figure: top 10 pages)

Breakdown of request methods in the last hour

*| select method, count(1) as pv group by method

(figure: PV per method)

Breakdown of HTTP status codes in the last hour

*| select status, count(1) as pv group by status

(figure: PV per status code)
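The metric list above also mentions error requests. Building on the same filter-then-analyze pattern used later in this article (e.g. the request_time > 1000 filter), a minimal sketch for finding the pages that produce the most server errors, assuming status is indexed as a numeric (long) column so the range filter applies, might be:

status >= 500 |select url, count(1) as error_pv group by url order by error_pv desc limit 10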

Breakdown of browsers in the last hour

*| select user_agent, count(1) as pv group by user_agent

(figure: PV per user agent)

Breakdown of referring domains in the last hour

*|select url_extract_host(http_referer), count(1) group by url_extract_host(http_referer)

Note: url_extract_host extracts the host from a URL; for example, url_extract_host('http://example.com/a/b') returns 'example.com'.
(figure: PV per referring domain)

Breakdown of requested domains in the last hour

*|select http_host, count(1) group by http_host

(figure: PV per requested domain)

Some advanced features
Beyond traffic metrics, webmasters often need to diagnose individual requests: how large are the processing latencies, which requests are unusually slow, and which pages are slow?
The average and maximum latency per 5 minutes give an overall picture of latency:

*|select from_unixtime(__time__ - __time__ % 300) as time,
         avg(request_time) as avg_latency,
         max(request_time) as max_latency
         group by __time__ - __time__ % 300
         limit 60

(figure: average and maximum latency per 5 minutes)
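Averages and maxima can hide the shape of the distribution. If the engine supports Presto-style approx_percentile, which the original article does not show and is therefore an assumption, per-window latency percentiles can be computed the same way:

*|select from_unixtime(__time__ - __time__ % 300) as t,
         approx_percentile(request_time, 0.50) as p50,
         approx_percentile(request_time, 0.99) as p99
         group by __time__ - __time__ % 300
         order by t limit 60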

Once the maximum latency is known, the next step is to find which page that slowest request hit, so the page's response can be optimized. max_by(url, request_time) returns the url of the request with the largest request_time:

*|select from_unixtime(__time__ - __time__ % 60),
         max_by(url, request_time)
         group by __time__ - __time__ % 60

(figure: slowest URL per minute)

For an overall view, we want the latency distribution of all requests: split the latencies into ten buckets and count the requests in each latency range. numeric_histogram(10, request_time) computes such an approximate ten-bucket histogram:

*|select numeric_histogram(10, request_time)

(figure: latency histogram)

Besides the single largest latency, we also want the ten largest latency values. max(request_time, 10) returns them as an array:

*|select max(request_time, 10)

(figure: top 10 latencies)

Suppose we have found that the page /0 has the largest access latency. To tune /0, we next compute its PV, UV, counts per method, per status, and per browser, plus its average and maximum latency. histogram(x) returns the count of each distinct value of x:

url:"/0"|select count(1) as pv,
         approx_distinct(remote_addr) as uv,
         histogram(method) as method_pv,
         histogram(status) as status_pv,
         histogram(user_agent) as user_agent_pv,
         avg(request_time) as avg_latency,
         max(request_time) as max_latency

(figures: overall metrics for /0, and its method, user agent, and status breakdowns)

Likewise, we can restrict the analysis to slow requests, e.g. the PV, UV, and per-url request counts of requests with request_time greater than 1000:

request_time > 1000 |select count(1) as pv, approx_distinct(remote_addr) as uv, histogram(url) as url_pv

(figures: PV/UV and per-url counts of slow requests)
