第8章:可持续的优化——监控、告警与自动化运维
章节介绍
你有没有遇到过这样的情况?精心调优的 PHP 应用,上线初期运行流畅,但几周或几个月后,响应速度又慢了下来,内存占用也悄然升高。这并不是优化失效了,而是环境、数据量和访问模式一直在变化。真正的深度优化,不是一个“完成”的状态,而是一个需要持续观察、调整和维护的过程。这就是“可持续的优化”的意义所在。
如果优化是一次性的手术,那么监控、告警和自动化运维就是手术后的持续康复与定期体检。没有它们,我们无法知道优化措施是否长期有效,也无法在问题萌芽时就及时干预。
监控是这一切的眼睛。你需要知道系统在每一个时刻的真实状态。这不仅仅是看应用的错误日志,更是要建立一套完整的指标观测体系。例如,使用 `
<?php
function getSystemLoad(): array
{
// 获取系统负载平均值
$load = sys_getloadavg();
// 如果sys_getloadavg不可用,尝试从/proc/loadavg读取
if (!$load) {
if (is_readable('/proc/loadavg')) {
$loadData = file_get_contents('/proc/loadavg');
$load = array_slice(explode(' ', $loadData), 0, 3);
} else {
$load = [0, 0, 0];
}
}
return [
'load_1min' => round($load[0], 2),
'load_5min' => round($load[1], 2),
'load_15min' => round($load[2], 2),
'timestamp' => time()
];
}
可以让你了解服务器的负载压力趋势,是平稳还是存在周期性尖峰。
<?php
function getPhpMemoryUsage(): array
{
return [
'memory_usage' => round(memory_get_usage() / 1024 / 1024, 2), // 转换为MB
'memory_peak' => round(memory_get_peak_usage() / 1024 / 1024, 2), // 转换为MB
'timestamp' => time()
];
}
则直接反映了你的 PHP 代码在运行时的内存消耗情况,这是定位内存泄漏的关键。结合
<?php
function getDatabaseStats(PDO $pdo): array
{
$result = [
'connections' => 0,
'slow_queries' => 0,
'queries_per_second' => 0,
'timestamp' => time()
];
try {
// 获取当前连接数
$stmt = $pdo->query("SHOW STATUS LIKE 'Threads_connected'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$result['connections'] = intval($row['Value'] ?? 0);
}
// 获取慢查询数量
$stmt = $pdo->query("SHOW STATUS LIKE 'Slow_queries'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$result['slow_queries'] = intval($row['Value'] ?? 0);
}
// 获取每秒查询数(QPS)
$stmt = $pdo->query("SHOW STATUS LIKE 'Queries'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$queries = intval($row['Value'] ?? 0);
// 获取当前时间戳,计算与之前记录的差值
static $lastQueries = 0;
static $lastTimestamp = 0;
if ($lastTimestamp > 0) {
$timeDiff = time() - $lastTimestamp;
if ($timeDiff > 0) {
$result['queries_per_second'] = round(($queries - $lastQueries) / $timeDiff, 2);
}
}
$lastQueries = $queries;
$lastTimestamp = time();
}
} catch (Exception $e) {
// 记录错误但不抛出
error_log("获取数据库统计信息失败: " . $e->getMessage());
}
return $result;
}
`,你就能判断慢查询是否在增多,数据库连接池是否健康。
仅有监控数据还不够,当指标出现异常时,必须能及时通知到人。这就是告警的作用。一个优秀的告警机制,应该基于清晰的阈值和规则,避免“狼来了”的误报,也绝不能遗漏真正的危机。你可以使用 `
<?php
function checkSystemHealth(array $thresholds = []): array
{
// 默认阈值
$defaultThresholds = [
'cpu_threshold' => 90, // CPU使用率阈值(%)
'memory_threshold' => 90, // 内存使用率阈值(%)
'disk_threshold' => 90, // 磁盘使用率阈值(%)
'load_threshold' => 5.0, // 系统负载阈值(15分钟平均)
'connections_threshold' => 100 // 数据库连接数阈值
];
$thresholds = array_merge($defaultThresholds, $thresholds);
// 收集各项指标
$healthData = [
'cpu' => getCpuUsage(),
'memory' => getMemoryUsage(),
'disk' => getDiskUsage(),
'load' => getSystemLoad(),
'php_memory' => getPhpMemoryUsage(),
'timestamp' => time(),
'status' => 'healthy',
'issues' => []
];
// 检查CPU使用率
if ($healthData['cpu']['percentage'] > $thresholds['cpu_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'CPU使用率过高: %.2f%% (阈值: %d%%)',
$healthData['cpu']['percentage'],
$thresholds['cpu_threshold']
);
}
// 检查内存使用率
if ($healthData['memory']['percentage'] > $thresholds['memory_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'内存使用率过高: %.2f%% (阈值: %d%%)',
$healthData['memory']['percentage'],
$thresholds['memory_threshold']
);
}
// 检查磁盘使用率
if ($healthData['disk']['percentage'] > $thresholds['disk_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'磁盘使用率过高: %.2f%% (阈值: %d%%)',
$healthData['disk']['percentage'],
$thresholds['disk_threshold']
);
}
// 检查系统负载
if ($healthData['load']['load_15min'] > $thresholds['load_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'系统负载过高: %.2f (阈值: %.1f)',
$healthData['load']['load_15min'],
$thresholds['load_threshold']
);
}
// 检查PHP内存使用
if ($healthData['php_memory']['memory_peak'] > 128) { // 128MB阈值
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'PHP峰值内存使用过高: %.2fMB',
$healthData['php_memory']['memory_peak']
);
}
return $healthData;
}
函数,集中检查 CPU、内存、磁盘等核心资源的健康状况。一旦某项指标超出预设的阈值,比如 CPU 使用率超过 80% 并持续 5 分钟,
<?php
function sendAlert(string $level, string $message, array $data = []): bool
{
// 默认配置
$config = [
'email_enabled' => true,
'email_from' => 'alerts@example.com',
'email_to' => ['admin@example.com'],
'slack_enabled' => false,
'slack_webhook' => '',
'sms_enabled' => false,
'sms_provider' => '',
'log_enabled' => true,
'log_file' => '/var/log/php-alerts.log'
];
// 合并自定义配置
$config = array_merge($config, $data['config'] ?? []);
$result = false;
$timestamp = date('Y-m-d H:i:s');
$fullMessage = "[{$timestamp}] [{$level}] {$message}";
// 添加附加数据
if (!empty($data)) {
$fullMessage .= "\n数据: " . json_encode($data, JSON_PRETTY_PRINT);
}
// 1. 记录到日志文件
if ($config['log_enabled']) {
$logDir = dirname($config['log_file']);
if (!is_dir($logDir)) {
mkdir($logDir, 0755, true);
}
$logResult = file_put_contents($config['log_file'], $fullMessage . "\n", FILE_APPEND) !== false;
$result = $result || $logResult;
}
// 2. 发送邮件
if ($config['email_enabled'] && in_array($level, ['critical', 'error', 'warning'])) {
$subject = "系统告警 [{$level}]: " . substr($message, 0, 50) . '...';
$headers = "From: {$config['email_from']}\r\n";
$headers .= "Content-Type: text/html; charset=UTF-8\r\n";
$htmlMessage = nl2br($fullMessage);
foreach ($config['email_to'] as $email) {
$emailResult = mail($email, $subject, $htmlMessage, $headers);
$result = $result || $emailResult;
}
}
// 3. 发送到Slack
if ($config['slack_enabled'] && !empty($config['slack_webhook'])) {
$slackData = [
'text' => $fullMessage,
'username' => '系统监控机器人',
'icon_emoji' => ':warning:'
];
$ch = curl_init($config['slack_webhook']);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($slackData));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
$slackResult = ($response !== false && $httpCode === 200);
$result = $result || $slackResult;
}
// 4. 发送短信(需要具体提供商实现)
if ($config['sms_enabled'] && !empty($config['sms_provider'])) {
// 这里需要根据具体的短信提供商实现
// 示例:调用外部API发送短信
$smsResult = false;
try {
// 根据短信提供商调用不同的API
switch ($config['sms_provider']) {
case 'aliyun':
// 阿里云短信服务调用
// $smsResult = sendSMSByAliyun($fullMessage);
break;
case 'tencent':
// 腾讯云短信服务调用
// $smsResult = sendSMSByTencent($fullMessage);
break;
default:
// 默认实现或日志记录
error_log("不支持的短信提供商: {$config['sms_provider']}");
}
$result = $result || $smsResult;
} catch (Exception $e) {
error_log("发送短信失败: " . $e->getMessage());
}
}
// 5. 错误处理:记录发送失败的情况
if (!$result) {
error_log("告警发送失败 - 级别: {$level}, 消息: " . substr($message, 0, 200));
}
return $result;
}
函数就会被触发,通过邮件、钉钉、企业微信等渠道将详细的问题信息和上下文数据$data 发送给运维或开发人员。更进一步,
<?php
function checkAndAlert(array $thresholds = []): void
{
$healthData = checkSystemHealth($thresholds);
if ($healthData['status'] !== 'healthy' && !empty($healthData['issues'])) {
$level = 'warning';
$message = '系统健康检查发现问题: ' . implode('; ', $healthData['issues']);
// 如果问题严重,提升告警级别
if (count($healthData['issues']) > 3) {
$level = 'error';
}
// 检查是否有关键指标严重超标
if ($healthData['cpu']['percentage'] > 95 ||
$healthData['memory']['percentage'] > 95 ||
$healthData['disk']['percentage'] > 95) {
$level = 'critical';
}
sendAlert($level, $message, $healthData);
}
// 记录常规监控数据
$monitorLog = sprintf(
"[%s] CPU: %.2f%%, Memory: %.2f%%, Disk: %.2f%%, Load: %.2f\n",
date('Y-m-d H:i:s'),
$healthData['cpu']['percentage'],
$healthData['memory']['percentage'],
$healthData['disk']['percentage'],
$healthData['load']['load_15min']
);
$logFile = '/var/log/php-monitor.log';
if (!is_dir(dirname($logFile))) {
mkdir(dirname($logFile), 0755, true);
}
file_put_contents($logFile, $monitorLog, FILE_APPEND);
}
` 将这个检查和通知的过程自动化,你可以将它设置为一个定时任务。
说到定时任务,就引出了自动化运维。许多维护工作规律且重复,手动执行既低效又容易出错。例如,日志文件如果不加管理,很快就会撑满磁盘。你可以用 `
<?php
function setupMonitoringCron(int $interval = 300): bool
{
// 检查脚本路径是否存在
$scriptPath = __FILE__;
if (!file_exists($scriptPath)) {
error_log("监控脚本文件不存在: {$scriptPath}");
return false;
}
// 构建监控命令
$command = "php -f \"{$scriptPath}\" --check 2>&1 >> /var/log/php-monitor-cron.log";
// 计算cron表达式(每$interval秒执行一次)
if ($interval >= 60) {
// 大于等于60秒,转换为分钟
$minutes = floor($interval / 60);
if ($minutes < 1) {
$minutes = 1; // 最小为1分钟
}
$cronExpression = "*/{$minutes} * * * *";
} else {
// 如果间隔小于60秒,需要特殊处理(多行cron)
$lines = [];
for ($i = 0; $i < 60; $i += $interval) {
$lines[] = "{$i} * * * * {$command}";
}
$cronExpression = implode("\n", $lines);
}
// 准备要添加的cron行
$cronLine = "# PHP系统监控任务(监控间隔:{$interval}秒)\n";
if ($interval >= 60) {
$cronLine .= "{$cronExpression} {$command}\n";
} else {
$cronLine .= "{$cronExpression}\n";
}
// 获取当前用户的crontab
exec('crontab -l 2>/dev/null', $currentCron, $returnCode);
if ($returnCode !== 0) {
// 如果用户没有crontab,从空数组开始
$currentCron = [];
}
// 移除旧的监控任务
$newCron = [];
$monitorFound = false;
foreach ($currentCron as $line) {
// 跳过注释行和空行
if (trim($line) === '' || strpos($line, '#') === 0) {
// 检查是否是我们的监控任务注释
if (strpos($line, 'PHP系统监控任务') !== false) {
$monitorFound = true;
continue; // 跳过旧注释行
}
// 保留其他注释行
$newCron[] = $line;
continue;
}
// 检查是否是监控任务行
if (strpos($line, basename($scriptPath)) !== false || strpos($line, 'php-monitor-cron.log') !== false) {
$monitorFound = true;
continue; // 跳过旧任务行
}
// 保留其他任务行
$newCron[] = $line;
}
// 添加新的监控任务
$newCron[] = trim($cronLine);
// 创建临时文件保存新的crontab
$tempFile = tempnam(sys_get_temp_dir(), 'cron_');
if ($tempFile === false) {
error_log("无法创建临时文件");
return false;
}
// 将新的crontab内容写入临时文件
$newCronContent = implode("\n", $newCron) . "\n";
if (file_put_contents($tempFile, $newCronContent) === false) {
error_log("无法写入临时文件: {$tempFile}");
unlink($tempFile);
return false;
}
// 应用新的crontab
exec("crontab {$tempFile} 2>&1", $output, $installResult);
// 清理临时文件
unlink($tempFile);
if ($installResult !== 0) {
error_log("安装cron任务失败: " . implode("\n", $output));
return false;
}
// 验证cron任务是否安装成功
exec('crontab -l 2>/dev/null', $verifyCron, $verifyResult);
if ($verifyResult === 0) {
foreach ($verifyCron as $line) {
if (strpos($line, basename($scriptPath)) !== false) {
return true; // 确认任务已添加
}
}
}
error_log("cron任务安装后验证失败");
return false;
}
建立一个每 5 分钟运行一次的健康检查任务。同时,用
<?php
function cleanupOldLogs(string $logDir, int $daysToKeep = 30): array
{
$result = [
'total_files' => 0,
'deleted_files' => 0,
'deleted_size' => 0,
'errors' => []
];
if (!is_dir($logDir)) {
$result['errors'][] = "目录不存在: {$logDir}";
return $result;
}
$cutoffTime = time() - ($daysToKeep * 24 * 60 * 60);
$iterator = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($logDir, RecursiveDirectoryIterator::SKIP_DOTS),
RecursiveIteratorIterator::SELF_FIRST
);
foreach ($iterator as $file) {
if ($file->isFile()) {
$result['total_files']++;
// 检查文件修改时间
if ($file->getMTime() < $cutoffTime) {
$fileSize = $file->getSize();
if (unlink($file->getPathname())) {
$result['deleted_files']++;
$result['deleted_size'] += $fileSize;
} else {
$result['errors'][] = "无法删除文件: " . $file->getPathname();
}
}
}
}
// 尝试删除空目录
foreach (array_reverse(iterator_to_array($iterator)) as $file) {
if ($file->isDir()) {
@rmdir($file->getPathname());
}
}
return $result;
}
自动清理 30 天前的旧日志,用
<?php
function rotateLogFile(string $logFile, int $maxSize = 10485760, int $backups = 10): bool
{
if (!file_exists($logFile)) {
return false;
}
$fileSize = filesize($logFile);
if ($fileSize < $maxSize) {
return false;
}
// 关闭可能打开的文件句柄
if (function_exists('posix_kill')) {
// 尝试重新打开日志文件
$logDir = dirname($logFile);
$baseName = basename($logFile);
// 重命名当前日志文件
$timestamp = date('Ymd_His');
$archivedFile = $logDir . '/' . $baseName . '.' . $timestamp;
if (!rename($logFile, $archivedFile)) {
return false;
}
// 重新创建日志文件
touch($logFile);
chmod($logFile, 0644);
// 清理旧的归档文件
$pattern = $logDir . '/' . $baseName . '.*';
$backupFiles = glob($pattern);
if (count($backupFiles) > $backups) {
// 按修改时间排序,删除最旧的文件
usort($backupFiles, function($a, $b) {
return filemtime($a) - filemtime($b);
});
$filesToDelete = array_slice($backupFiles, 0, count($backupFiles) - $backups);
foreach ($filesToDelete as $file) {
@unlink($file);
}
}
return true;
}
return false;
}
确保单个日志文件不会过大。对于数据库,定期使用
<?php
function backupDatabase(array $config): array
{
$defaultConfig = [
'host' => 'localhost',
'port' => 3306,
'username' => 'root',
'password' => '',
'database' => '',
'backup_dir' => '/var/backups/mysql',
'compress' => true,
'keep_days' => 30
];
$config = array_merge($defaultConfig, $config);
$result = [
'success' => false,
'backup_file' => '',
'backup_size' => 0,
'error' => '',
'timestamp' => time()
];
// 验证配置
if (empty($config['database'])) {
$result['error'] = '数据库名称不能为空';
return $result;
}
// 创建备份目录
if (!is_dir($config['backup_dir'])) {
if (!mkdir($config['backup_dir'], 0755, true)) {
$result['error'] = "无法创建备份目录: {$config['backup_dir']}";
return $result;
}
}
// 生成备份文件名
$timestamp = date('Ymd_His');
$backupFile = $config['backup_dir'] . '/' . $config['database'] . '_' . $timestamp . '.sql';
if ($config['compress']) {
$backupFile .= '.gz';
}
// 构建mysqldump命令
$command = sprintf(
'mysqldump --host=%s --port=%d --user=%s --password=%s %s',
escapeshellarg($config['host']),
$config['port'],
escapeshellarg($config['username']),
escapeshellarg($config['password']),
escapeshellarg($config['database'])
);
if ($config['compress']) {
$command .= ' | gzip';
}
$command .= ' > ' . escapeshellarg($backupFile);
// 执行备份命令
exec($command, $output, $returnCode);
if ($returnCode === 0) {
// 备份成功
$result['success'] = true;
$result['backup_file'] = $backupFile;
// 获取备份文件大小
if (file_exists($backupFile)) {
$result['backup_size'] = filesize($backupFile);
}
// 清理旧备份文件
$expireTime = time() - ($config['keep_days'] * 24 * 60 * 60);
$pattern = $config['backup_dir'] . '/' . $config['database'] . '_*.sql*';
foreach (glob($pattern) as $oldBackupFile) {
if (filemtime($oldBackupFile) < $expireTime) {
@unlink($oldBackupFile);
}
}
} else {
// 备份失败
$result['error'] = '数据库备份失败,mysqldump命令执行错误';
if (!empty($output)) {
$result['error'] .= ': ' . implode("\n", $output);
}
}
return $result;
}
进行备份是数据安全的基本要求,而
<?php
function optimizeDatabaseTables(PDO $pdo, array $tables = []): array
{
$result = [
'optimized_tables' => [],
'total_size_before' => 0,
'total_size_after' => 0,
'errors' => []
];
try {
// 如果没有指定表,则优化所有表
if (empty($tables)) {
$stmt = $pdo->query("SHOW TABLES");
$tables = $stmt->fetchAll(PDO::FETCH_COLUMN);
}
foreach ($tables as $table) {
try {
// 获取表优化前的状态
$stmt = $pdo->query("SHOW TABLE STATUS LIKE '{$table}'");
$status = $stmt->fetch(PDO::FETCH_ASSOC);
$sizeBefore = $status['Data_length'] + $status['Index_length'];
$result['total_size_before'] += $sizeBefore;
// 执行OPTIMIZE TABLE
$pdo->exec("OPTIMIZE TABLE `{$table}`");
// 获取表优化后的状态
$stmt = $pdo->query("SHOW TABLE STATUS LIKE '{$table}'");
$status = $stmt->fetch(PDO::FETCH_ASSOC);
$sizeAfter = $status['Data_length'] + $status['Index_length'];
$result['total_size_after'] += $sizeAfter;
$result['optimized_tables'][] = [
'table' => $table,
'size_before' => $sizeBefore,
'size_after' => $sizeAfter,
'saved' => $sizeBefore - $sizeAfter
];
} catch (Exception $e) {
$result['errors'][] = "优化表 {$table} 失败: " . $e->getMessage();
}
}
} catch (Exception $e) {
$result['errors'][] = "获取表列表失败: " . $e->getMessage();
}
return $result;
}
` 则能像定期整理书架一样,回收碎片,保持数据库的查询性能。
最终,所有这些离散的数据点和操作,需要被整合成洞察。`
<?php
function getPerformanceMetrics(): array
{
static $startTime = null;
static $requestCount = 0;
// 如果是第一次调用,初始化开始时间
if ($startTime === null) {
$startTime = microtime(true);
}
// 增加请求计数
$requestCount++;
// 计算当前时间
$currentTime = microtime(true);
// 计算运行时长(秒)
$runtime = $currentTime - $startTime;
// 计算吞吐量(请求/秒)
$throughput = 0;
if ($runtime > 0) {
$throughput = $requestCount / $runtime;
}
// 获取内存使用情况
$memoryUsage = memory_get_usage(true);
$peakMemoryUsage = memory_get_peak_usage(true);
// 获取系统负载(如果可用)
$systemLoad = [];
if (function_exists('sys_getloadavg')) {
$systemLoad = sys_getloadavg();
}
// 获取数据库连接数(示例)
$databaseConnections = 0;
if (extension_loaded('mysqli')) {
$databaseConnections = mysqli_num_links();
}
// 构造返回的指标数组
$metrics = [
'runtime_seconds' => round($runtime, 4),
'request_count' => $requestCount,
'throughput_rps' => round($throughput, 2),
'memory_usage_bytes' => $memoryUsage,
'peak_memory_bytes' => $peakMemoryUsage,
'system_load' => $systemLoad,
'database_connections' => $databaseConnections,
'current_timestamp' => date('Y-m-d H:i:s'),
'php_version' => PHP_VERSION,
'script_execution_time' => round(microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'], 4)
];
// 添加错误处理:如果运行时异常,返回空数组
try {
// 可选:记录到日志或监控系统
// file_put_contents('performance.log', json_encode($metrics) . PHP_EOL, FILE_APPEND);
return $metrics;
} catch (Exception $e) {
error_log("获取性能指标失败: " . $e->getMessage());
return [];
}
}
可以帮助你从应用层面追踪响应时间和吞吐量。
<?php
function generateMonitoringReport(string $period = 'daily'): array
{
$report = [
'period' => $period,
'generated_at' => date('Y-m-d H:i:s'),
'summary' => [],
'details' => []
];
// 根据周期确定时间范围
$endTime = time();
switch ($period) {
case 'hourly':
$startTime = $endTime - 3600;
break;
case 'daily':
$startTime = $endTime - 86400;
break;
case 'weekly':
$startTime = $endTime - 604800;
break;
case 'monthly':
$startTime = $endTime - 2592000;
break;
default:
$startTime = $endTime - 86400; // 默认每天
}
// 这里应该从监控数据库或日志文件中获取数据
// 以下为示例数据
$report['summary'] = [
'total_requests' => rand(1000, 10000),
'avg_response_time' => round(rand(100, 500) / 1000, 2), // 转换为秒
'error_rate' => round(rand(1, 50) / 1000, 2), // 转换为百分比
'peak_memory_usage' => round(rand(50, 200), 2), // MB
'alerts_count' => rand(0, 10)
];
// 添加详细数据
$report['details'] = [
'cpu_usage' => [
'avg' => round(rand(20, 80), 2),
'max' => round(rand(70, 95), 2),
'min' => round(rand(10, 30), 2)
],
'memory_usage' => [
'avg' => round(rand(40, 70), 2),
'max' => round(rand(75, 95), 2),
'min' => round(rand(20, 40), 2)
],
'disk_usage' => [
'avg' => round(rand(50, 80), 2),
'max' => round(rand(80, 95), 2),
'min' => round(rand(40, 60), 2)
]
];
return $report;
}
` 则能生成每日或每周的报告,让你清晰地看到性能趋势、资源消耗规律以及告警的分布,为下一轮的优化循环提供决策依据。
因此,将监控、告警与自动化工具嵌入你的运维体系,意味着优化工作从被动救火转向了主动管理和持续改进。你能在用户抱怨之前发现问题,在性能衰退之前采取行动,让每一次优化效果都能长久地保持下去。这才是现代 PHP 应用性能深度优化的完整闭环。
核心概念
性能调优并非一劳永逸的终点,而是一个需要持续观察、反馈和调整的循环过程。一次成功的优化后,代码部署上线,其表现会如何?会不会随着用户量增长、数据量膨胀而再次出现瓶颈?要回答这些问题,依赖于一套坚实的监控、告警与自动化运维体系。这就像是给应用程序装上了仪表盘和自动驾驶系统,让我们能从被动救火转向主动管理。
监控是这一切的基石。 它的核心目标是可观测性:我们需要清晰地看到系统内部正在发生什么。这包括几个关键层面。首先是基础设施层,服务器本身的健康状况至关重要。CPU是否持续高负荷运转?内存是否即将耗尽?磁盘空间还够不够?这些信息是我们的第一道防线。通过调用 `
<?php
function getSystemLoad(): array
{
// 获取系统负载平均值
$load = sys_getloadavg();
// 如果sys_getloadavg不可用,尝试从/proc/loadavg读取
if (!$load) {
if (is_readable('/proc/loadavg')) {
$loadData = file_get_contents('/proc/loadavg');
$load = array_slice(explode(' ', $loadData), 0, 3);
} else {
$load = [0, 0, 0];
}
}
return [
'load_1min' => round($load[0], 2),
'load_5min' => round($load[1], 2),
'load_15min' => round($load[2], 2),
'timestamp' => time()
];
}
、
<?php
function getCpuUsage(): array
{
// 静态变量用于保存上次CPU时间(仅Linux有效)
static $lastCpuTimes = null;
// 初始化返回数组
$result = [
'user' => 0,
'system' => 0,
'idle' => 0,
'percentage' => 0,
'timestamp' => time()
];
// 判断操作系统类型
if (stripos(PHP_OS, 'WIN') !== false) {
// Windows系统:使用WMI命令获取CPU使用率百分比
$command = 'wmic path Win32_PerfFormattedData_PerfOS_Processor get PercentIdleTime,PercentUserTime,PercentPrivilegedTime /format:csv';
$output = shell_exec($command);
if ($output) {
$lines = explode("\n", trim($output));
// 第一行为标题,第二行为数据
if (count($lines) > 1) {
$data = explode(',', $lines[1]);
// 数据格式:Node,PercentIdleTime,PercentPrivilegedTime,PercentUserTime
if (count($data) >= 4) {
$idle = floatval($data[1]); // 空闲时间百分比
$system = floatval($data[2]); // 系统态时间百分比
$user = floatval($data[3]); // 用户态时间百分比
$result['idle'] = $idle;
$result['system'] = $system;
$result['user'] = $user;
$result['percentage'] = 100 - $idle; // 总使用率 = 100 - 空闲率
}
}
} else {
// 命令执行失败,保留默认值
error_log("获取Windows CPU使用率失败");
}
} else {
// 非Windows系统(如Linux):通过/proc/stat获取CPU累计时间
$statFile = '/proc/stat';
if (file_exists($statFile)) {
$contents = file_get_contents($statFile);
$lines = explode("\n", $contents);
if (isset($lines[0])) {
$cpuLine = $lines[0]; // 第一行为总CPU时间信息
$parts = preg_split('/\s+/', trim($cpuLine));
if (count($parts) >= 5) {
// 解析CPU时间:cpu, user, nice, system, idle, iowait, ...
$user = intval($parts[1]) + intval($parts[2]); // user + nice
$system = intval($parts[3]); // system
$idle = intval($parts[4]); // idle
// 可选:将iowait计入空闲时间
if (isset($parts[5])) {
$idle += intval($parts[5]);
}
// 当前CPU累计时间
$currentCpuTimes = [
'user' => $user,
'system' => $system,
'idle' => $idle,
'total' => $user + $system + $idle
];
// 计算使用率百分比(需要与上一次采样比较)
if ($lastCpuTimes !== null) {
$diffUser = $currentCpuTimes['user'] - $lastCpuTimes['user'];
$diffSystem = $currentCpuTimes['system'] - $lastCpuTimes['system'];
$diffIdle = $currentCpuTimes['idle'] - $lastCpuTimes['idle'];
$diffTotal = $currentCpuTimes['total'] - $lastCpuTimes['total'];
if ($diffTotal > 0) {
$usagePercentage = (($diffUser + $diffSystem) / $diffTotal) * 100;
$result['percentage'] = round($usagePercentage, 2);
}
}
// 保存当前时间供下次计算
$lastCpuTimes = $currentCpuTimes;
// 设置累计时间值
$result['user'] = $user;
$result['system'] = $system;
$result['idle'] = $idle;
}
} else {
error_log("/proc/stat文件格式异常");
}
} else {
// 非Linux系统(如macOS)可在此扩展
error_log("不支持的操作系统或/proc/stat不存在");
}
}
return $result;
}
、
<?php
function getMemoryUsage(): array
{
$result = [
'total' => 0,
'used' => 0,
'free' => 0,
'percentage' => 0,
'timestamp' => time()
];
// 不同系统获取内存信息的方式不同
if (stripos(PHP_OS, 'WIN') === 0) {
// Windows系统:使用wmic命令获取内存信息
$output = shell_exec('wmic OS get TotalVisibleMemorySize,FreePhysicalMemory');
if ($output) {
$lines = explode("\n", trim($output));
if (count($lines) >= 2) {
$data = preg_split('/\s+/', trim($lines[1]));
$total = isset($data[0]) ? intval($data[0]) : 0;
$free = isset($data[1]) ? intval($data[1]) : 0;
$result['total'] = $total;
$result['free'] = $free;
$result['used'] = $total - $free;
$result['percentage'] = $total > 0 ? round(($result['used'] / $total) * 100, 2) : 0;
}
}
} else {
// Linux/Unix系统:读取/proc/meminfo文件
$meminfo = @file('/proc/meminfo');
if ($meminfo) {
$memdata = [];
foreach ($meminfo as $line) {
if (preg_match('/^([^:]+):\s+(\d+)\s/', $line, $matches)) {
$memdata[$matches[1]] = intval($matches[2]) * 1024; // 转换为字节
}
}
// 计算内存使用情况
if (isset($memdata['MemTotal'])) {
$total = $memdata['MemTotal'];
$free = $memdata['MemFree'] ?? 0;
$buffers = $memdata['Buffers'] ?? 0;
$cached = $memdata['Cached'] ?? 0;
// 计算实际使用内存(排除缓存和缓冲)
$used = $total - $free - $buffers - $cached;
$result['total'] = $total;
$result['free'] = $free + $buffers + $cached; // 包含缓存和缓冲的空闲内存
$result['used'] = max(0, $used); // 确保非负
$result['percentage'] = $total > 0 ? round(($result['used'] / $total) * 100, 2) : 0;
}
}
}
return $result;
}
和
<?php
function getDiskUsage(string $path = '/'): array
{
$result = [
'total' => 0,
'used' => 0,
'free' => 0,
'percentage' => 0,
'mount_point' => $path,
'timestamp' => time()
];
// 使用disk_total_space和disk_free_space函数
$total = disk_total_space($path);
$free = disk_free_space($path);
if ($total !== false && $free !== false) {
$result['total'] = round($total / (1024 * 1024), 2); // 转换为MB
$result['free'] = round($free / (1024 * 1024), 2); // 转换为MB
$result['used'] = $result['total'] - $result['free'];
$result['percentage'] = $result['total'] > 0 ?
round(($result['used'] / $result['total']) * 100, 2) : 0;
}
return $result;
}
,我们可以轻松获取这些基础指标。其次是应用层,对于PHP而言,脚本自身的内存消耗、数据库的连接与查询效率是关注重点。
<?php
function getPhpMemoryUsage(): array
{
return [
'memory_usage' => round(memory_get_usage() / 1024 / 1024, 2), // 转换为MB
'memory_peak' => round(memory_get_peak_usage() / 1024 / 1024, 2), // 转换为MB
'timestamp' => time()
];
}
能帮你定位内存泄漏的嫌疑脚本,而
<?php
function getDatabaseStats(PDO $pdo): array
{
$result = [
'connections' => 0,
'slow_queries' => 0,
'queries_per_second' => 0,
'timestamp' => time()
];
try {
// 获取当前连接数
$stmt = $pdo->query("SHOW STATUS LIKE 'Threads_connected'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$result['connections'] = intval($row['Value'] ?? 0);
}
// 获取慢查询数量
$stmt = $pdo->query("SHOW STATUS LIKE 'Slow_queries'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$result['slow_queries'] = intval($row['Value'] ?? 0);
}
// 获取每秒查询数(QPS)
$stmt = $pdo->query("SHOW STATUS LIKE 'Queries'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$queries = intval($row['Value'] ?? 0);
// 获取当前时间戳,计算与之前记录的差值
static $lastQueries = 0;
static $lastTimestamp = 0;
if ($lastTimestamp > 0) {
$timeDiff = time() - $lastTimestamp;
if ($timeDiff > 0) {
$result['queries_per_second'] = round(($queries - $lastQueries) / $timeDiff, 2);
}
}
$lastQueries = $queries;
$lastTimestamp = time();
}
} catch (Exception $e) {
// 记录错误但不抛出
error_log("获取数据库统计信息失败: " . $e->getMessage());
}
return $result;
}
则能揭示慢查询或连接池压力。最后是业务层,例如核心接口的响应时间、每秒处理的请求数(吞吐量),这些直接关系到用户体验。
<?php
function getPerformanceMetrics(): array
{
static $startTime = null;
static $requestCount = 0;
// 如果是第一次调用,初始化开始时间
if ($startTime === null) {
$startTime = microtime(true);
}
// 增加请求计数
$requestCount++;
// 计算当前时间
$currentTime = microtime(true);
// 计算运行时长(秒)
$runtime = $currentTime - $startTime;
// 计算吞吐量(请求/秒)
$throughput = 0;
if ($runtime > 0) {
$throughput = $requestCount / $runtime;
}
// 获取内存使用情况
$memoryUsage = memory_get_usage(true);
$peakMemoryUsage = memory_get_peak_usage(true);
// 获取系统负载(如果可用)
$systemLoad = [];
if (function_exists('sys_getloadavg')) {
$systemLoad = sys_getloadavg();
}
// 获取数据库连接数(示例)
$databaseConnections = 0;
if (extension_loaded('mysqli')) {
$databaseConnections = mysqli_num_links();
}
// 构造返回的指标数组
$metrics = [
'runtime_seconds' => round($runtime, 4),
'request_count' => $requestCount,
'throughput_rps' => round($throughput, 2),
'memory_usage_bytes' => $memoryUsage,
'peak_memory_bytes' => $peakMemoryUsage,
'system_load' => $systemLoad,
'database_connections' => $databaseConnections,
'current_timestamp' => date('Y-m-d H:i:s'),
'php_version' => PHP_VERSION,
'script_execution_time' => round(microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'], 4)
];
// 添加错误处理:如果运行时异常,返回空数组
try {
// 可选:记录到日志或监控系统
// file_put_contents('performance.log', json_encode($metrics) . PHP_EOL, FILE_APPEND);
return $metrics;
} catch (Exception $e) {
error_log("获取性能指标失败: " . $e->getMessage());
return [];
}
}
` 函数的设计正是为了收集这类高层指标。有效的监控不仅仅是收集数据,还需要将这些数据以图表等直观形式展现出来,帮助我们追踪趋势、发现规律。
仅有监控画面还不够,当问题发生时,必须有人能第一时间被通知到。这就是告警的角色。告警将监控数据与预设的规则(阈值)进行比对,在异常发生时触发通知。例如,当系统负载连续5分钟超过4.0,或者数据库连接数超过最大限制的80%时,就应立即告警。一个典型的实践是,你可以设定一个涵盖CPU、内存、磁盘的阈值数组,然后定期执行健康检查。`
<?php
function checkSystemHealth(array $thresholds = []): array
{
// 默认阈值
$defaultThresholds = [
'cpu_threshold' => 90, // CPU使用率阈值(%)
'memory_threshold' => 90, // 内存使用率阈值(%)
'disk_threshold' => 90, // 磁盘使用率阈值(%)
'load_threshold' => 5.0, // 系统负载阈值(15分钟平均)
'connections_threshold' => 100 // 数据库连接数阈值
];
$thresholds = array_merge($defaultThresholds, $thresholds);
// 收集各项指标
$healthData = [
'cpu' => getCpuUsage(),
'memory' => getMemoryUsage(),
'disk' => getDiskUsage(),
'load' => getSystemLoad(),
'php_memory' => getPhpMemoryUsage(),
'timestamp' => time(),
'status' => 'healthy',
'issues' => []
];
// 检查CPU使用率
if ($healthData['cpu']['percentage'] > $thresholds['cpu_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'CPU使用率过高: %.2f%% (阈值: %d%%)',
$healthData['cpu']['percentage'],
$thresholds['cpu_threshold']
);
}
// 检查内存使用率
if ($healthData['memory']['percentage'] > $thresholds['memory_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'内存使用率过高: %.2f%% (阈值: %d%%)',
$healthData['memory']['percentage'],
$thresholds['memory_threshold']
);
}
// 检查磁盘使用率
if ($healthData['disk']['percentage'] > $thresholds['disk_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'磁盘使用率过高: %.2f%% (阈值: %d%%)',
$healthData['disk']['percentage'],
$thresholds['disk_threshold']
);
}
// 检查系统负载
if ($healthData['load']['load_15min'] > $thresholds['load_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'系统负载过高: %.2f (阈值: %.1f)',
$healthData['load']['load_15min'],
$thresholds['load_threshold']
);
}
// 检查PHP内存使用
if ($healthData['php_memory']['memory_peak'] > 128) { // 128MB阈值
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'PHP峰值内存使用过高: %.2fMB',
$healthData['php_memory']['memory_peak']
);
}
return $healthData;
}
函数接收这些阈值并返回状态,而
<?php
function checkAndAlert(array $thresholds = []): void
{
$healthData = checkSystemHealth($thresholds);
if ($healthData['status'] !== 'healthy' && !empty($healthData['issues'])) {
$level = 'warning';
$message = '系统健康检查发现问题: ' . implode('; ', $healthData['issues']);
// 如果问题严重,提升告警级别
if (count($healthData['issues']) > 3) {
$level = 'error';
}
// 检查是否有关键指标严重超标
if ($healthData['cpu']['percentage'] > 95 ||
$healthData['memory']['percentage'] > 95 ||
$healthData['disk']['percentage'] > 95) {
$level = 'critical';
}
sendAlert($level, $message, $healthData);
}
// 记录常规监控数据
$monitorLog = sprintf(
"[%s] CPU: %.2f%%, Memory: %.2f%%, Disk: %.2f%%, Load: %.2f\n",
date('Y-m-d H:i:s'),
$healthData['cpu']['percentage'],
$healthData['memory']['percentage'],
$healthData['disk']['percentage'],
$healthData['load']['load_15min']
);
$logFile = '/var/log/php-monitor.log';
if (!is_dir(dirname($logFile))) {
mkdir(dirname($logFile), 0755, true);
}
file_put_contents($logFile, $monitorLog, FILE_APPEND);
}
更进一步,它封装了检查和发送告警的完整逻辑。告警的渠道可以多样化,从邮件、短信到集成Slack、钉钉等协作工具,
<?php
function sendAlert(string $level, string $message, array $data = []): bool
{
// 默认配置
$config = [
'email_enabled' => true,
'email_from' => 'alerts@example.com',
'email_to' => ['admin@example.com'],
'slack_enabled' => false,
'slack_webhook' => '',
'sms_enabled' => false,
'sms_provider' => '',
'log_enabled' => true,
'log_file' => '/var/log/php-alerts.log'
];
// 合并自定义配置
$config = array_merge($config, $data['config'] ?? []);
$result = false;
$timestamp = date('Y-m-d H:i:s');
$fullMessage = "[{$timestamp}] [{$level}] {$message}";
// 添加附加数据
if (!empty($data)) {
$fullMessage .= "\n数据: " . json_encode($data, JSON_PRETTY_PRINT);
}
// 1. 记录到日志文件
if ($config['log_enabled']) {
$logDir = dirname($config['log_file']);
if (!is_dir($logDir)) {
mkdir($logDir, 0755, true);
}
$logResult = file_put_contents($config['log_file'], $fullMessage . "\n", FILE_APPEND) !== false;
$result = $result || $logResult;
}
// 2. 发送邮件
if ($config['email_enabled'] && in_array($level, ['critical', 'error', 'warning'])) {
$subject = "系统告警 [{$level}]: " . substr($message, 0, 50) . '...';
$headers = "From: {$config['email_from']}\r\n";
$headers .= "Content-Type: text/html; charset=UTF-8\r\n";
$htmlMessage = nl2br($fullMessage);
foreach ($config['email_to'] as $email) {
$emailResult = mail($email, $subject, $htmlMessage, $headers);
$result = $result || $emailResult;
}
}
// 3. 发送到Slack
if ($config['slack_enabled'] && !empty($config['slack_webhook'])) {
$slackData = [
'text' => $fullMessage,
'username' => '系统监控机器人',
'icon_emoji' => ':warning:'
];
$ch = curl_init($config['slack_webhook']);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($slackData));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
$slackResult = ($response !== false && $httpCode === 200);
$result = $result || $slackResult;
}
// 4. 发送短信(需要具体提供商实现)
if ($config['sms_enabled'] && !empty($config['sms_provider'])) {
// 这里需要根据具体的短信提供商实现
// 示例:调用外部API发送短信
$smsResult = false;
try {
// 根据短信提供商调用不同的API
switch ($config['sms_provider']) {
case 'aliyun':
// 阿里云短信服务调用
// $smsResult = sendSMSByAliyun($fullMessage);
break;
case 'tencent':
// 腾讯云短信服务调用
// $smsResult = sendSMSByTencent($fullMessage);
break;
default:
// 默认实现或日志记录
error_log("不支持的短信提供商: {$config['sms_provider']}");
}
$result = $result || $smsResult;
} catch (Exception $e) {
error_log("发送短信失败: " . $e->getMessage());
}
}
// 5. 错误处理:记录发送失败的情况
if (!$result) {
error_log("告警发送失败 - 级别: {$level}, 消息: " . substr($message, 0, 200));
}
return $result;
}
` 函数需要具备这种扩展能力。好的告警应该是精准且避免“狼来了”效应的,这意味着阈值设置要合理,并且告警信息应包含足够的问题上下文数据,便于快速定位。
当监控和告警体系稳定运行后,我们会发现许多运维操作是重复且规律的。自动化运维的目的就是将这些操作交由系统自动完成,提升效率并减少人为失误。常见的自动化任务包括:日志管理(如定期轮转以防磁盘写满、清理过期日志)、数据备份(如每天凌晨备份数据库)、以及定期的系统健康检查与报告。例如,你可以使用 `
<?php
function rotateLogFile(string $logFile, int $maxSize = 10485760, int $backups = 10): bool
{
if (!file_exists($logFile)) {
return false;
}
$fileSize = filesize($logFile);
if ($fileSize < $maxSize) {
return false;
}
// 关闭可能打开的文件句柄
if (function_exists('posix_kill')) {
// 尝试重新打开日志文件
$logDir = dirname($logFile);
$baseName = basename($logFile);
// 重命名当前日志文件
$timestamp = date('Ymd_His');
$archivedFile = $logDir . '/' . $baseName . '.' . $timestamp;
if (!rename($logFile, $archivedFile)) {
return false;
}
// 重新创建日志文件
touch($logFile);
chmod($logFile, 0644);
// 清理旧的归档文件
$pattern = $logDir . '/' . $baseName . '.*';
$backupFiles = glob($pattern);
if (count($backupFiles) > $backups) {
// 按修改时间排序,删除最旧的文件
usort($backupFiles, function($a, $b) {
return filemtime($a) - filemtime($b);
});
$filesToDelete = array_slice($backupFiles, 0, count($backupFiles) - $backups);
foreach ($filesToDelete as $file) {
@unlink($file);
}
}
return true;
}
return false;
}
确保单个日志文件不会过大,用
<?php
function cleanupOldLogs(string $logDir, int $daysToKeep = 30): array
{
$result = [
'total_files' => 0,
'deleted_files' => 0,
'deleted_size' => 0,
'errors' => []
];
if (!is_dir($logDir)) {
$result['errors'][] = "目录不存在: {$logDir}";
return $result;
}
$cutoffTime = time() - ($daysToKeep * 24 * 60 * 60);
$iterator = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($logDir, RecursiveDirectoryIterator::SKIP_DOTS),
RecursiveIteratorIterator::SELF_FIRST
);
foreach ($iterator as $file) {
if ($file->isFile()) {
$result['total_files']++;
// 检查文件修改时间
if ($file->getMTime() < $cutoffTime) {
$fileSize = $file->getSize();
if (unlink($file->getPathname())) {
$result['deleted_files']++;
$result['deleted_size'] += $fileSize;
} else {
$result['errors'][] = "无法删除文件: " . $file->getPathname();
}
}
}
}
// 尝试删除空目录
foreach (array_reverse(iterator_to_array($iterator)) as $file) {
if ($file->isDir()) {
@rmdir($file->getPathname());
}
}
return $result;
}
自动清理30天前的历史日志。数据库的定期备份可以通过
<?php
function backupDatabase(array $config): array
{
$defaultConfig = [
'host' => 'localhost',
'port' => 3306,
'username' => 'root',
'password' => '',
'database' => '',
'backup_dir' => '/var/backups/mysql',
'compress' => true,
'keep_days' => 30
];
$config = array_merge($defaultConfig, $config);
$result = [
'success' => false,
'backup_file' => '',
'backup_size' => 0,
'error' => '',
'timestamp' => time()
];
// 验证配置
if (empty($config['database'])) {
$result['error'] = '数据库名称不能为空';
return $result;
}
// 创建备份目录
if (!is_dir($config['backup_dir'])) {
if (!mkdir($config['backup_dir'], 0755, true)) {
$result['error'] = "无法创建备份目录: {$config['backup_dir']}";
return $result;
}
}
// 生成备份文件名
$timestamp = date('Ymd_His');
$backupFile = $config['backup_dir'] . '/' . $config['database'] . '_' . $timestamp . '.sql';
if ($config['compress']) {
$backupFile .= '.gz';
}
// 构建mysqldump命令
$command = sprintf(
'mysqldump --host=%s --port=%d --user=%s --password=%s %s',
escapeshellarg($config['host']),
$config['port'],
escapeshellarg($config['username']),
escapeshellarg($config['password']),
escapeshellarg($config['database'])
);
if ($config['compress']) {
$command .= ' | gzip';
}
$command .= ' > ' . escapeshellarg($backupFile);
// 执行备份命令
exec($command, $output, $returnCode);
if ($returnCode === 0) {
// 备份成功
$result['success'] = true;
$result['backup_file'] = $backupFile;
// 获取备份文件大小
if (file_exists($backupFile)) {
$result['backup_size'] = filesize($backupFile);
}
// 清理旧备份文件
$expireTime = time() - ($config['keep_days'] * 24 * 60 * 60);
$pattern = $config['backup_dir'] . '/' . $config['database'] . '_*.sql*';
foreach (glob($pattern) as $oldBackupFile) {
if (filemtime($oldBackupFile) < $expireTime) {
@unlink($oldBackupFile);
}
}
} else {
// 备份失败
$result['error'] = '数据库备份失败,mysqldump命令执行错误';
if (!empty($output)) {
$result['error'] .= ': ' . implode("\n", $output);
}
}
return $result;
}
来完成,而
<?php
function setupMonitoringCron(int $interval = 300): bool
{
// 检查脚本路径是否存在
$scriptPath = __FILE__;
if (!file_exists($scriptPath)) {
error_log("监控脚本文件不存在: {$scriptPath}");
return false;
}
// 构建监控命令
$command = "php -f \"{$scriptPath}\" --check 2>&1 >> /var/log/php-monitor-cron.log";
// 计算cron表达式(每$interval秒执行一次)
if ($interval >= 60) {
// 大于等于60秒,转换为分钟
$minutes = floor($interval / 60);
if ($minutes < 1) {
$minutes = 1; // 最小为1分钟
}
$cronExpression = "*/{$minutes} * * * *";
} else {
// 如果间隔小于60秒,需要特殊处理(多行cron)
$lines = [];
for ($i = 0; $i < 60; $i += $interval) {
$lines[] = "{$i} * * * * {$command}";
}
$cronExpression = implode("\n", $lines);
}
// 准备要添加的cron行
$cronLine = "# PHP系统监控任务(监控间隔:{$interval}秒)\n";
if ($interval >= 60) {
$cronLine .= "{$cronExpression} {$command}\n";
} else {
$cronLine .= "{$cronExpression}\n";
}
// 获取当前用户的crontab
exec('crontab -l 2>/dev/null', $currentCron, $returnCode);
if ($returnCode !== 0) {
// 如果用户没有crontab,从空数组开始
$currentCron = [];
}
// 移除旧的监控任务
$newCron = [];
$monitorFound = false;
foreach ($currentCron as $line) {
// 跳过注释行和空行
if (trim($line) === '' || strpos($line, '#') === 0) {
// 检查是否是我们的监控任务注释
if (strpos($line, 'PHP系统监控任务') !== false) {
$monitorFound = true;
continue; // 跳过旧注释行
}
// 保留其他注释行
$newCron[] = $line;
continue;
}
// 检查是否是监控任务行
if (strpos($line, basename($scriptPath)) !== false || strpos($line, 'php-monitor-cron.log') !== false) {
$monitorFound = true;
continue; // 跳过旧任务行
}
// 保留其他任务行
$newCron[] = $line;
}
// 添加新的监控任务
$newCron[] = trim($cronLine);
// 创建临时文件保存新的crontab
$tempFile = tempnam(sys_get_temp_dir(), 'cron_');
if ($tempFile === false) {
error_log("无法创建临时文件");
return false;
}
// 将新的crontab内容写入临时文件
$newCronContent = implode("\n", $newCron) . "\n";
if (file_put_contents($tempFile, $newCronContent) === false) {
error_log("无法写入临时文件: {$tempFile}");
unlink($tempFile);
return false;
}
// 应用新的crontab
exec("crontab {$tempFile} 2>&1", $output, $installResult);
// 清理临时文件
unlink($tempFile);
if ($installResult !== 0) {
error_log("安装cron任务失败: " . implode("\n", $output));
return false;
}
// 验证cron任务是否安装成功
exec('crontab -l 2>/dev/null', $verifyCron, $verifyResult);
if ($verifyResult === 0) {
foreach ($verifyCron as $line) {
if (strpos($line, basename($scriptPath)) !== false) {
return true; // 确认任务已添加
}
}
}
error_log("cron任务安装后验证失败");
return false;
}
则可以帮助你将这些检查任务部署为系统的定时任务(Cron Job)。更进一步,当监控发现数据库查询性能缓慢时,可以自动触发
<?php
function optimizeDatabaseTables(PDO $pdo, array $tables = []): array
{
$result = [
'optimized_tables' => [],
'total_size_before' => 0,
'total_size_after' => 0,
'errors' => []
];
try {
// 如果没有指定表,则优化所有表
if (empty($tables)) {
$stmt = $pdo->query("SHOW TABLES");
$tables = $stmt->fetchAll(PDO::FETCH_COLUMN);
}
foreach ($tables as $table) {
try {
// 获取表优化前的状态
$stmt = $pdo->query("SHOW TABLE STATUS LIKE '{$table}'");
$status = $stmt->fetch(PDO::FETCH_ASSOC);
$sizeBefore = $status['Data_length'] + $status['Index_length'];
$result['total_size_before'] += $sizeBefore;
// 执行OPTIMIZE TABLE
$pdo->exec("OPTIMIZE TABLE `{$table}`");
// 获取表优化后的状态
$stmt = $pdo->query("SHOW TABLE STATUS LIKE '{$table}'");
$status = $stmt->fetch(PDO::FETCH_ASSOC);
$sizeAfter = $status['Data_length'] + $status['Index_length'];
$result['total_size_after'] += $sizeAfter;
$result['optimized_tables'][] = [
'table' => $table,
'size_before' => $sizeBefore,
'size_after' => $sizeAfter,
'saved' => $sizeBefore - $sizeAfter
];
} catch (Exception $e) {
$result['errors'][] = "优化表 {$table} 失败: " . $e->getMessage();
}
}
} catch (Exception $e) {
$result['errors'][] = "获取表列表失败: " . $e->getMessage();
}
return $result;
}
来尝试优化表结构。每周或每月,还可以通过
<?php
function generateMonitoringReport(string $period = 'daily'): array
{
$report = [
'period' => $period,
'generated_at' => date('Y-m-d H:i:s'),
'summary' => [],
'details' => []
];
// 根据周期确定时间范围
$endTime = time();
switch ($period) {
case 'hourly':
$startTime = $endTime - 3600;
break;
case 'daily':
$startTime = $endTime - 86400;
break;
case 'weekly':
$startTime = $endTime - 604800;
break;
case 'monthly':
$startTime = $endTime - 2592000;
break;
default:
$startTime = $endTime - 86400; // 默认每天
}
// 这里应该从监控数据库或日志文件中获取数据
// 以下为示例数据
$report['summary'] = [
'total_requests' => rand(1000, 10000),
'avg_response_time' => round(rand(100, 500) / 1000, 2), // 转换为秒
'error_rate' => round(rand(1, 50) / 1000, 2), // 转换为百分比
'peak_memory_usage' => round(rand(50, 200), 2), // MB
'alerts_count' => rand(0, 10)
];
// 添加详细数据
$report['details'] = [
'cpu_usage' => [
'avg' => round(rand(20, 80), 2),
'max' => round(rand(70, 95), 2),
'min' => round(rand(10, 30), 2)
],
'memory_usage' => [
'avg' => round(rand(40, 70), 2),
'max' => round(rand(75, 95), 2),
'min' => round(rand(20, 40), 2)
],
'disk_usage' => [
'avg' => round(rand(50, 80), 2),
'max' => round(rand(80, 95), 2),
'min' => round(rand(40, 60), 2)
]
];
return $report;
}
` 生成一份性能报告,用于回顾和分析。
将监控、告警与自动化连接起来,就构成了一个可持续的优化闭环。监控系统持续采集性能数据;当数据触及告警阈值时,立即通知开发者;而对于已知的、可程序化处理的故障或维护任务,则通过自动化脚本来解决。同时,监控积累的历史数据(无论是 `
<?php
function getPerformanceMetrics(): array
{
static $startTime = null;
static $requestCount = 0;
// 如果是第一次调用,初始化开始时间
if ($startTime === null) {
$startTime = microtime(true);
}
// 增加请求计数
$requestCount++;
// 计算当前时间
$currentTime = microtime(true);
// 计算运行时长(秒)
$runtime = $currentTime - $startTime;
// 计算吞吐量(请求/秒)
$throughput = 0;
if ($runtime > 0) {
$throughput = $requestCount / $runtime;
}
// 获取内存使用情况
$memoryUsage = memory_get_usage(true);
$peakMemoryUsage = memory_get_peak_usage(true);
// 获取系统负载(如果可用)
$systemLoad = [];
if (function_exists('sys_getloadavg')) {
$systemLoad = sys_getloadavg();
}
// 获取数据库连接数(示例)
$databaseConnections = 0;
if (extension_loaded('mysqli')) {
$databaseConnections = mysqli_num_links();
}
// 构造返回的指标数组
$metrics = [
'runtime_seconds' => round($runtime, 4),
'request_count' => $requestCount,
'throughput_rps' => round($throughput, 2),
'memory_usage_bytes' => $memoryUsage,
'peak_memory_bytes' => $peakMemoryUsage,
'system_load' => $systemLoad,
'database_connections' => $databaseConnections,
'current_timestamp' => date('Y-m-d H:i:s'),
'php_version' => PHP_VERSION,
'script_execution_time' => round(microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'], 4)
];
// 添加错误处理:如果运行时异常,返回空数组
try {
// 可选:记录到日志或监控系统
// file_put_contents('performance.log', json_encode($metrics) . PHP_EOL, FILE_APPEND);
return $metrics;
} catch (Exception $e) {
error_log("获取性能指标失败: " . $e->getMessage());
return [];
}
}
` 收集的应用指标,还是基础设施数据)为下一次的深度性能调优提供了宝贵的决策依据。例如,报告显示每周五晚上的数据库负载都很高,这可能引导你去优化周末促销活动的相关查询。这个过程不是线性的,而是一个螺旋上升的循环,确保你的PHP应用能在不断变化的环境中始终保持良好的性能表现。
实践应用
性能优化不是一次性事件,而是贯穿应用生命周期的持续实践。真正的深度优化始于对现状的清晰认知,这依赖于系统化的监控、及时的告警和自动化的运维响应。对于PHP应用而言,这意味着要从服务器、数据库到应用代码本身,建立一个完整的可观测性闭环。
监控是优化的眼睛。首先需要了解应用运行的基础环境是否健康。`
<?php
function getSystemLoad(): array
{
// 获取系统负载平均值
$load = sys_getloadavg();
// 如果sys_getloadavg不可用,尝试从/proc/loadavg读取
if (!$load) {
if (is_readable('/proc/loadavg')) {
$loadData = file_get_contents('/proc/loadavg');
$load = array_slice(explode(' ', $loadData), 0, 3);
} else {
$load = [0, 0, 0];
}
}
return [
'load_1min' => round($load[0], 2),
'load_5min' => round($load[1], 2),
'load_15min' => round($load[2], 2),
'timestamp' => time()
];
}
、
<?php
function getMemoryUsage(): array
{
$result = [
'total' => 0,
'used' => 0,
'free' => 0,
'percentage' => 0,
'timestamp' => time()
];
// 不同系统获取内存信息的方式不同
if (stripos(PHP_OS, 'WIN') === 0) {
// Windows系统:使用wmic命令获取内存信息
$output = shell_exec('wmic OS get TotalVisibleMemorySize,FreePhysicalMemory');
if ($output) {
$lines = explode("\n", trim($output));
if (count($lines) >= 2) {
$data = preg_split('/\s+/', trim($lines[1]));
$total = isset($data[0]) ? intval($data[0]) : 0;
$free = isset($data[1]) ? intval($data[1]) : 0;
$result['total'] = $total;
$result['free'] = $free;
$result['used'] = $total - $free;
$result['percentage'] = $total > 0 ? round(($result['used'] / $total) * 100, 2) : 0;
}
}
} else {
// Linux/Unix系统:读取/proc/meminfo文件
$meminfo = @file('/proc/meminfo');
if ($meminfo) {
$memdata = [];
foreach ($meminfo as $line) {
if (preg_match('/^([^:]+):\s+(\d+)\s/', $line, $matches)) {
$memdata[$matches[1]] = intval($matches[2]) * 1024; // 转换为字节
}
}
// 计算内存使用情况
if (isset($memdata['MemTotal'])) {
$total = $memdata['MemTotal'];
$free = $memdata['MemFree'] ?? 0;
$buffers = $memdata['Buffers'] ?? 0;
$cached = $memdata['Cached'] ?? 0;
// 计算实际使用内存(排除缓存和缓冲)
$used = $total - $free - $buffers - $cached;
$result['total'] = $total;
$result['free'] = $free + $buffers + $cached; // 包含缓存和缓冲的空闲内存
$result['used'] = max(0, $used); // 确保非负
$result['percentage'] = $total > 0 ? round(($result['used'] / $total) * 100, 2) : 0;
}
}
}
return $result;
}
和
<?php
function getCpuUsage(): array
{
// 静态变量用于保存上次CPU时间(仅Linux有效)
static $lastCpuTimes = null;
// 初始化返回数组
$result = [
'user' => 0,
'system' => 0,
'idle' => 0,
'percentage' => 0,
'timestamp' => time()
];
// 判断操作系统类型
if (stripos(PHP_OS, 'WIN') !== false) {
// Windows系统:使用WMI命令获取CPU使用率百分比
$command = 'wmic path Win32_PerfFormattedData_PerfOS_Processor get PercentIdleTime,PercentUserTime,PercentPrivilegedTime /format:csv';
$output = shell_exec($command);
if ($output) {
$lines = explode("\n", trim($output));
// 第一行为标题,第二行为数据
if (count($lines) > 1) {
$data = explode(',', $lines[1]);
// 数据格式:Node,PercentIdleTime,PercentPrivilegedTime,PercentUserTime
if (count($data) >= 4) {
$idle = floatval($data[1]); // 空闲时间百分比
$system = floatval($data[2]); // 系统态时间百分比
$user = floatval($data[3]); // 用户态时间百分比
$result['idle'] = $idle;
$result['system'] = $system;
$result['user'] = $user;
$result['percentage'] = 100 - $idle; // 总使用率 = 100 - 空闲率
}
}
} else {
// 命令执行失败,保留默认值
error_log("获取Windows CPU使用率失败");
}
} else {
// 非Windows系统(如Linux):通过/proc/stat获取CPU累计时间
$statFile = '/proc/stat';
if (file_exists($statFile)) {
$contents = file_get_contents($statFile);
$lines = explode("\n", $contents);
if (isset($lines[0])) {
$cpuLine = $lines[0]; // 第一行为总CPU时间信息
$parts = preg_split('/\s+/', trim($cpuLine));
if (count($parts) >= 5) {
// 解析CPU时间:cpu, user, nice, system, idle, iowait, ...
$user = intval($parts[1]) + intval($parts[2]); // user + nice
$system = intval($parts[3]); // system
$idle = intval($parts[4]); // idle
// 可选:将iowait计入空闲时间
if (isset($parts[5])) {
$idle += intval($parts[5]);
}
// 当前CPU累计时间
$currentCpuTimes = [
'user' => $user,
'system' => $system,
'idle' => $idle,
'total' => $user + $system + $idle
];
// 计算使用率百分比(需要与上一次采样比较)
if ($lastCpuTimes !== null) {
$diffUser = $currentCpuTimes['user'] - $lastCpuTimes['user'];
$diffSystem = $currentCpuTimes['system'] - $lastCpuTimes['system'];
$diffIdle = $currentCpuTimes['idle'] - $lastCpuTimes['idle'];
$diffTotal = $currentCpuTimes['total'] - $lastCpuTimes['total'];
if ($diffTotal > 0) {
$usagePercentage = (($diffUser + $diffSystem) / $diffTotal) * 100;
$result['percentage'] = round($usagePercentage, 2);
}
}
// 保存当前时间供下次计算
$lastCpuTimes = $currentCpuTimes;
// 设置累计时间值
$result['user'] = $user;
$result['system'] = $system;
$result['idle'] = $idle;
}
} else {
error_log("/proc/stat文件格式异常");
}
} else {
// 非Linux系统(如macOS)可在此扩展
error_log("不支持的操作系统或/proc/stat不存在");
}
}
return $result;
}
`提供了服务器资源消耗的快照。如果系统负载持续接近或超过CPU核心数,或者内存使用率长期高于80%,这表明硬件资源可能已成为瓶颈,需要考虑水平扩展或垂直升级。
紧接着,我们需要深入到PHP进程内部。`
<?php
function getPhpMemoryUsage(): array
{
return [
'memory_usage' => round(memory_get_usage() / 1024 / 1024, 2), // 转换为MB
'memory_peak' => round(memory_get_peak_usage() / 1024 / 1024, 2), // 转换为MB
'timestamp' => time()
];
}
能揭示脚本执行过程中的内存消耗模式。一个在循环中不断增长的内存占用曲线,往往是内存泄漏的信号;而异常高的峰值内存使用则提示可能存在一次性加载过大数据集的问题,比如未分页查询整个数据库表。同时,
<?php
function getPerformanceMetrics(): array
{
static $startTime = null;
static $requestCount = 0;
// 如果是第一次调用,初始化开始时间
if ($startTime === null) {
$startTime = microtime(true);
}
// 增加请求计数
$requestCount++;
// 计算当前时间
$currentTime = microtime(true);
// 计算运行时长(秒)
$runtime = $currentTime - $startTime;
// 计算吞吐量(请求/秒)
$throughput = 0;
if ($runtime > 0) {
$throughput = $requestCount / $runtime;
}
// 获取内存使用情况
$memoryUsage = memory_get_usage(true);
$peakMemoryUsage = memory_get_peak_usage(true);
// 获取系统负载(如果可用)
$systemLoad = [];
if (function_exists('sys_getloadavg')) {
$systemLoad = sys_getloadavg();
}
// 获取数据库连接数(示例)
$databaseConnections = 0;
if (extension_loaded('mysqli')) {
$databaseConnections = mysqli_num_links();
}
// 构造返回的指标数组
$metrics = [
'runtime_seconds' => round($runtime, 4),
'request_count' => $requestCount,
'throughput_rps' => round($throughput, 2),
'memory_usage_bytes' => $memoryUsage,
'peak_memory_bytes' => $peakMemoryUsage,
'system_load' => $systemLoad,
'database_connections' => $databaseConnections,
'current_timestamp' => date('Y-m-d H:i:s'),
'php_version' => PHP_VERSION,
'script_execution_time' => round(microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'], 4)
];
// 添加错误处理:如果运行时异常,返回空数组
try {
// 可选:记录到日志或监控系统
// file_put_contents('performance.log', json_encode($metrics) . PHP_EOL, FILE_APPEND);
return $metrics;
} catch (Exception $e) {
error_log("获取性能指标失败: " . $e->getMessage());
return [];
}
}
`可以用来追踪关键业务接口的响应时间和吞吐量,定位慢请求。
数据库通常是性能问题的核心。`
<?php
function getDatabaseStats(PDO $pdo): array
{
$result = [
'connections' => 0,
'slow_queries' => 0,
'queries_per_second' => 0,
'timestamp' => time()
];
try {
// 获取当前连接数
$stmt = $pdo->query("SHOW STATUS LIKE 'Threads_connected'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$result['connections'] = intval($row['Value'] ?? 0);
}
// 获取慢查询数量
$stmt = $pdo->query("SHOW STATUS LIKE 'Slow_queries'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$result['slow_queries'] = intval($row['Value'] ?? 0);
}
// 获取每秒查询数(QPS)
$stmt = $pdo->query("SHOW STATUS LIKE 'Queries'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$queries = intval($row['Value'] ?? 0);
// 获取当前时间戳,计算与之前记录的差值
static $lastQueries = 0;
static $lastTimestamp = 0;
if ($lastTimestamp > 0) {
$timeDiff = time() - $lastTimestamp;
if ($timeDiff > 0) {
$result['queries_per_second'] = round(($queries - $lastQueries) / $timeDiff, 2);
}
}
$lastQueries = $queries;
$lastTimestamp = time();
}
} catch (Exception $e) {
// 记录错误但不抛出
error_log("获取数据库统计信息失败: " . $e->getMessage());
}
return $result;
}
可以帮助你监控数据库连接池的使用情况和查询频率。连接数过高可能意味着连接未正确释放,而某些表的查询频率异常则提示需要检查索引或缓存策略。定期使用
<?php
function optimizeDatabaseTables(PDO $pdo, array $tables = []): array
{
$result = [
'optimized_tables' => [],
'total_size_before' => 0,
'total_size_after' => 0,
'errors' => []
];
try {
// 如果没有指定表,则优化所有表
if (empty($tables)) {
$stmt = $pdo->query("SHOW TABLES");
$tables = $stmt->fetchAll(PDO::FETCH_COLUMN);
}
foreach ($tables as $table) {
try {
// 获取表优化前的状态
$stmt = $pdo->query("SHOW TABLE STATUS LIKE '{$table}'");
$status = $stmt->fetch(PDO::FETCH_ASSOC);
$sizeBefore = $status['Data_length'] + $status['Index_length'];
$result['total_size_before'] += $sizeBefore;
// 执行OPTIMIZE TABLE
$pdo->exec("OPTIMIZE TABLE `{$table}`");
// 获取表优化后的状态
$stmt = $pdo->query("SHOW TABLE STATUS LIKE '{$table}'");
$status = $stmt->fetch(PDO::FETCH_ASSOC);
$sizeAfter = $status['Data_length'] + $status['Index_length'];
$result['total_size_after'] += $sizeAfter;
$result['optimized_tables'][] = [
'table' => $table,
'size_before' => $sizeBefore,
'size_after' => $sizeAfter,
'saved' => $sizeBefore - $sizeAfter
];
} catch (Exception $e) {
$result['errors'][] = "优化表 {$table} 失败: " . $e->getMessage();
}
}
} catch (Exception $e) {
$result['errors'][] = "获取表列表失败: " . $e->getMessage();
}
return $result;
}
对核心表进行优化,可以回收碎片空间,提升查询效率。此外,建立定期的
<?php
function backupDatabase(array $config): array
{
$defaultConfig = [
'host' => 'localhost',
'port' => 3306,
'username' => 'root',
'password' => '',
'database' => '',
'backup_dir' => '/var/backups/mysql',
'compress' => true,
'keep_days' => 30
];
$config = array_merge($defaultConfig, $config);
$result = [
'success' => false,
'backup_file' => '',
'backup_size' => 0,
'error' => '',
'timestamp' => time()
];
// 验证配置
if (empty($config['database'])) {
$result['error'] = '数据库名称不能为空';
return $result;
}
// 创建备份目录
if (!is_dir($config['backup_dir'])) {
if (!mkdir($config['backup_dir'], 0755, true)) {
$result['error'] = "无法创建备份目录: {$config['backup_dir']}";
return $result;
}
}
// 生成备份文件名
$timestamp = date('Ymd_His');
$backupFile = $config['backup_dir'] . '/' . $config['database'] . '_' . $timestamp . '.sql';
if ($config['compress']) {
$backupFile .= '.gz';
}
// 构建mysqldump命令
$command = sprintf(
'mysqldump --host=%s --port=%d --user=%s --password=%s %s',
escapeshellarg($config['host']),
$config['port'],
escapeshellarg($config['username']),
escapeshellarg($config['password']),
escapeshellarg($config['database'])
);
if ($config['compress']) {
$command .= ' | gzip';
}
$command .= ' > ' . escapeshellarg($backupFile);
// 执行备份命令
exec($command, $output, $returnCode);
if ($returnCode === 0) {
// 备份成功
$result['success'] = true;
$result['backup_file'] = $backupFile;
// 获取备份文件大小
if (file_exists($backupFile)) {
$result['backup_size'] = filesize($backupFile);
}
// 清理旧备份文件
$expireTime = time() - ($config['keep_days'] * 24 * 60 * 60);
$pattern = $config['backup_dir'] . '/' . $config['database'] . '_*.sql*';
foreach (glob($pattern) as $oldBackupFile) {
if (filemtime($oldBackupFile) < $expireTime) {
@unlink($oldBackupFile);
}
}
} else {
// 备份失败
$result['error'] = '数据库备份失败,mysqldump命令执行错误';
if (!empty($output)) {
$result['error'] .= ': ' . implode("\n", $output);
}
}
return $result;
}
`机制是数据安全的基本保障,也为一些需要重演数据的性能测试提供了可能。
收集了数据,下一步是设定规则并自动化决策。`
<?php
function checkSystemHealth(array $thresholds = []): array
{
// 默认阈值
$defaultThresholds = [
'cpu_threshold' => 90, // CPU使用率阈值(%)
'memory_threshold' => 90, // 内存使用率阈值(%)
'disk_threshold' => 90, // 磁盘使用率阈值(%)
'load_threshold' => 5.0, // 系统负载阈值(15分钟平均)
'connections_threshold' => 100 // 数据库连接数阈值
];
$thresholds = array_merge($defaultThresholds, $thresholds);
// 收集各项指标
$healthData = [
'cpu' => getCpuUsage(),
'memory' => getMemoryUsage(),
'disk' => getDiskUsage(),
'load' => getSystemLoad(),
'php_memory' => getPhpMemoryUsage(),
'timestamp' => time(),
'status' => 'healthy',
'issues' => []
];
// 检查CPU使用率
if ($healthData['cpu']['percentage'] > $thresholds['cpu_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'CPU使用率过高: %.2f%% (阈值: %d%%)',
$healthData['cpu']['percentage'],
$thresholds['cpu_threshold']
);
}
// 检查内存使用率
if ($healthData['memory']['percentage'] > $thresholds['memory_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'内存使用率过高: %.2f%% (阈值: %d%%)',
$healthData['memory']['percentage'],
$thresholds['memory_threshold']
);
}
// 检查磁盘使用率
if ($healthData['disk']['percentage'] > $thresholds['disk_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'磁盘使用率过高: %.2f%% (阈值: %d%%)',
$healthData['disk']['percentage'],
$thresholds['disk_threshold']
);
}
// 检查系统负载
if ($healthData['load']['load_15min'] > $thresholds['load_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'系统负载过高: %.2f (阈值: %.1f)',
$healthData['load']['load_15min'],
$thresholds['load_threshold']
);
}
// 检查PHP内存使用
if ($healthData['php_memory']['memory_peak'] > 128) { // 128MB阈值
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'PHP峰值内存使用过高: %.2fMB',
$healthData['php_memory']['memory_peak']
);
}
return $healthData;
}
函数是这里的核心。你需要为关键指标定义合理的阈值:例如,CPU使用率超过90%持续5分钟,或应用内存使用率超过80%,或某个API的95分位响应时间超过2秒。当这些阈值被触发时,
<?php
function checkAndAlert(array $thresholds = []): void
{
$healthData = checkSystemHealth($thresholds);
if ($healthData['status'] !== 'healthy' && !empty($healthData['issues'])) {
$level = 'warning';
$message = '系统健康检查发现问题: ' . implode('; ', $healthData['issues']);
// 如果问题严重,提升告警级别
if (count($healthData['issues']) > 3) {
$level = 'error';
}
// 检查是否有关键指标严重超标
if ($healthData['cpu']['percentage'] > 95 ||
$healthData['memory']['percentage'] > 95 ||
$healthData['disk']['percentage'] > 95) {
$level = 'critical';
}
sendAlert($level, $message, $healthData);
}
// 记录常规监控数据
$monitorLog = sprintf(
"[%s] CPU: %.2f%%, Memory: %.2f%%, Disk: %.2f%%, Load: %.2f\n",
date('Y-m-d H:i:s'),
$healthData['cpu']['percentage'],
$healthData['memory']['percentage'],
$healthData['disk']['percentage'],
$healthData['load']['load_15min']
);
$logFile = '/var/log/php-monitor.log';
if (!is_dir(dirname($logFile))) {
mkdir(dirname($logFile), 0755, true);
}
file_put_contents($logFile, $monitorLog, FILE_APPEND);
}
会自动调用
<?php
function sendAlert(string $level, string $message, array $data = []): bool
{
// 默认配置
$config = [
'email_enabled' => true,
'email_from' => 'alerts@example.com',
'email_to' => ['admin@example.com'],
'slack_enabled' => false,
'slack_webhook' => '',
'sms_enabled' => false,
'sms_provider' => '',
'log_enabled' => true,
'log_file' => '/var/log/php-alerts.log'
];
// 合并自定义配置
$config = array_merge($config, $data['config'] ?? []);
$result = false;
$timestamp = date('Y-m-d H:i:s');
$fullMessage = "[{$timestamp}] [{$level}] {$message}";
// 添加附加数据
if (!empty($data)) {
$fullMessage .= "\n数据: " . json_encode($data, JSON_PRETTY_PRINT);
}
// 1. 记录到日志文件
if ($config['log_enabled']) {
$logDir = dirname($config['log_file']);
if (!is_dir($logDir)) {
mkdir($logDir, 0755, true);
}
$logResult = file_put_contents($config['log_file'], $fullMessage . "\n", FILE_APPEND) !== false;
$result = $result || $logResult;
}
// 2. 发送邮件
if ($config['email_enabled'] && in_array($level, ['critical', 'error', 'warning'])) {
$subject = "系统告警 [{$level}]: " . substr($message, 0, 50) . '...';
$headers = "From: {$config['email_from']}\r\n";
$headers .= "Content-Type: text/html; charset=UTF-8\r\n";
$htmlMessage = nl2br($fullMessage);
foreach ($config['email_to'] as $email) {
$emailResult = mail($email, $subject, $htmlMessage, $headers);
$result = $result || $emailResult;
}
}
// 3. 发送到Slack
if ($config['slack_enabled'] && !empty($config['slack_webhook'])) {
$slackData = [
'text' => $fullMessage,
'username' => '系统监控机器人',
'icon_emoji' => ':warning:'
];
$ch = curl_init($config['slack_webhook']);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($slackData));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
$slackResult = ($response !== false && $httpCode === 200);
$result = $result || $slackResult;
}
// 4. 发送短信(需要具体提供商实现)
if ($config['sms_enabled'] && !empty($config['sms_provider'])) {
// 这里需要根据具体的短信提供商实现
// 示例:调用外部API发送短信
$smsResult = false;
try {
// 根据短信提供商调用不同的API
switch ($config['sms_provider']) {
case 'aliyun':
// 阿里云短信服务调用
// $smsResult = sendSMSByAliyun($fullMessage);
break;
case 'tencent':
// 腾讯云短信服务调用
// $smsResult = sendSMSByTencent($fullMessage);
break;
default:
// 默认实现或日志记录
error_log("不支持的短信提供商: {$config['sms_provider']}");
}
$result = $result || $smsResult;
} catch (Exception $e) {
error_log("发送短信失败: " . $e->getMessage());
}
}
// 5. 错误处理:记录发送失败的情况
if (!$result) {
error_log("告警发送失败 - 级别: {$level}, 消息: " . substr($message, 0, 200));
}
return $result;
}
`发出告警。告警应当分级,比如“警告”级别发送到钉钉/飞书群,“紧急”级别则需要短信或电话通知值班人员。这样既不会错过问题,又避免了告警疲劳。
为了让这套监控告警体系持续运行,你需要将其自动化。`
<?php
function setupMonitoringCron(int $interval = 300): bool
{
// 检查脚本路径是否存在
$scriptPath = __FILE__;
if (!file_exists($scriptPath)) {
error_log("监控脚本文件不存在: {$scriptPath}");
return false;
}
// 构建监控命令
$command = "php -f \"{$scriptPath}\" --check 2>&1 >> /var/log/php-monitor-cron.log";
// 计算cron表达式(每$interval秒执行一次)
if ($interval >= 60) {
// 大于等于60秒,转换为分钟
$minutes = floor($interval / 60);
if ($minutes < 1) {
$minutes = 1; // 最小为1分钟
}
$cronExpression = "*/{$minutes} * * * *";
} else {
// 如果间隔小于60秒,需要特殊处理(多行cron)
$lines = [];
for ($i = 0; $i < 60; $i += $interval) {
$lines[] = "{$i} * * * * {$command}";
}
$cronExpression = implode("\n", $lines);
}
// 准备要添加的cron行
$cronLine = "# PHP系统监控任务(监控间隔:{$interval}秒)\n";
if ($interval >= 60) {
$cronLine .= "{$cronExpression} {$command}\n";
} else {
$cronLine .= "{$cronExpression}\n";
}
// 获取当前用户的crontab
exec('crontab -l 2>/dev/null', $currentCron, $returnCode);
if ($returnCode !== 0) {
// 如果用户没有crontab,从空数组开始
$currentCron = [];
}
// 移除旧的监控任务
$newCron = [];
$monitorFound = false;
foreach ($currentCron as $line) {
// 跳过注释行和空行
if (trim($line) === '' || strpos($line, '#') === 0) {
// 检查是否是我们的监控任务注释
if (strpos($line, 'PHP系统监控任务') !== false) {
$monitorFound = true;
continue; // 跳过旧注释行
}
// 保留其他注释行
$newCron[] = $line;
continue;
}
// 检查是否是监控任务行
if (strpos($line, basename($scriptPath)) !== false || strpos($line, 'php-monitor-cron.log') !== false) {
$monitorFound = true;
continue; // 跳过旧任务行
}
// 保留其他任务行
$newCron[] = $line;
}
// 添加新的监控任务
$newCron[] = trim($cronLine);
// 创建临时文件保存新的crontab
$tempFile = tempnam(sys_get_temp_dir(), 'cron_');
if ($tempFile === false) {
error_log("无法创建临时文件");
return false;
}
// 将新的crontab内容写入临时文件
$newCronContent = implode("\n", $newCron) . "\n";
if (file_put_contents($tempFile, $newCronContent) === false) {
error_log("无法写入临时文件: {$tempFile}");
unlink($tempFile);
return false;
}
// 应用新的crontab
exec("crontab {$tempFile} 2>&1", $output, $installResult);
// 清理临时文件
unlink($tempFile);
if ($installResult !== 0) {
error_log("安装cron任务失败: " . implode("\n", $output));
return false;
}
// 验证cron任务是否安装成功
exec('crontab -l 2>/dev/null', $verifyCron, $verifyResult);
if ($verifyResult === 0) {
foreach ($verifyCron as $line) {
if (strpos($line, basename($scriptPath)) !== false) {
return true; // 确认任务已添加
}
}
}
error_log("cron任务安装后验证失败");
return false;
}
可以将健康检查脚本设置为一个每分钟或每五分钟执行的定时任务。这确保了监控的连续性,无需人工干预。对于日志管理这类日常运维工作,同样可以自动化:
<?php
function rotateLogFile(string $logFile, int $maxSize = 10485760, int $backups = 10): bool
{
if (!file_exists($logFile)) {
return false;
}
$fileSize = filesize($logFile);
if ($fileSize < $maxSize) {
return false;
}
// 关闭可能打开的文件句柄
if (function_exists('posix_kill')) {
// 尝试重新打开日志文件
$logDir = dirname($logFile);
$baseName = basename($logFile);
// 重命名当前日志文件
$timestamp = date('Ymd_His');
$archivedFile = $logDir . '/' . $baseName . '.' . $timestamp;
if (!rename($logFile, $archivedFile)) {
return false;
}
// 重新创建日志文件
touch($logFile);
chmod($logFile, 0644);
// 清理旧的归档文件
$pattern = $logDir . '/' . $baseName . '.*';
$backupFiles = glob($pattern);
if (count($backupFiles) > $backups) {
// 按修改时间排序,删除最旧的文件
usort($backupFiles, function($a, $b) {
return filemtime($a) - filemtime($b);
});
$filesToDelete = array_slice($backupFiles, 0, count($backupFiles) - $backups);
foreach ($filesToDelete as $file) {
@unlink($file);
}
}
return true;
}
return false;
}
能确保单个日志文件不会无限增大,影响写入性能;
<?php
function cleanupOldLogs(string $logDir, int $daysToKeep = 30): array
{
$result = [
'total_files' => 0,
'deleted_files' => 0,
'deleted_size' => 0,
'errors' => []
];
if (!is_dir($logDir)) {
$result['errors'][] = "目录不存在: {$logDir}";
return $result;
}
$cutoffTime = time() - ($daysToKeep * 24 * 60 * 60);
$iterator = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($logDir, RecursiveDirectoryIterator::SKIP_DOTS),
RecursiveIteratorIterator::SELF_FIRST
);
foreach ($iterator as $file) {
if ($file->isFile()) {
$result['total_files']++;
// 检查文件修改时间
if ($file->getMTime() < $cutoffTime) {
$fileSize = $file->getSize();
if (unlink($file->getPathname())) {
$result['deleted_files']++;
$result['deleted_size'] += $fileSize;
} else {
$result['errors'][] = "无法删除文件: " . $file->getPathname();
}
}
}
}
// 尝试删除空目录
foreach (array_reverse(iterator_to_array($iterator)) as $file) {
if ($file->isDir()) {
@rmdir($file->getPathname());
}
}
return $result;
}
`则能定期清理过期日志,防止磁盘被占满。
最终,所有这些实践应当形成一个闭环。监控发现数据库查询缓慢,触发告警。你优化了查询并添加了缓存。之后,通过`
<?php
function generateMonitoringReport(string $period = 'daily'): array
{
$report = [
'period' => $period,
'generated_at' => date('Y-m-d H:i:s'),
'summary' => [],
'details' => []
];
// 根据周期确定时间范围
$endTime = time();
switch ($period) {
case 'hourly':
$startTime = $endTime - 3600;
break;
case 'daily':
$startTime = $endTime - 86400;
break;
case 'weekly':
$startTime = $endTime - 604800;
break;
case 'monthly':
$startTime = $endTime - 2592000;
break;
default:
$startTime = $endTime - 86400; // 默认每天
}
// 这里应该从监控数据库或日志文件中获取数据
// 以下为示例数据
$report['summary'] = [
'total_requests' => rand(1000, 10000),
'avg_response_time' => round(rand(100, 500) / 1000, 2), // 转换为秒
'error_rate' => round(rand(1, 50) / 1000, 2), // 转换为百分比
'peak_memory_usage' => round(rand(50, 200), 2), // MB
'alerts_count' => rand(0, 10)
];
// 添加详细数据
$report['details'] = [
'cpu_usage' => [
'avg' => round(rand(20, 80), 2),
'max' => round(rand(70, 95), 2),
'min' => round(rand(10, 30), 2)
],
'memory_usage' => [
'avg' => round(rand(40, 70), 2),
'max' => round(rand(75, 95), 2),
'min' => round(rand(20, 40), 2)
],
'disk_usage' => [
'avg' => round(rand(50, 80), 2),
'max' => round(rand(80, 95), 2),
'min' => round(rand(40, 60), 2)
]
];
return $report;
}
`生成的日报或周报,你可以清晰地看到优化前后响应时间曲线的对比,量化你的工作成果。这不仅是验证,也为下一轮的优化指明了方向。
所以,PHP的深度优化远不止于编写高效的代码。它是一套以监控数据为驱动,以自动化工具为支撑的工程实践。通过主动观测、设定规则、自动响应和复盘改进,你将构建出一个具备韧性和高性能的可持续应用系统。
章节总结
性能调优从来都不是一次性的任务。代码优化了,配置调整了,缓存加上了,但这之后呢?我们如何确保优化效果持续有效,又如何应对流量增长或突发问题?可持续的优化,核心在于建立一套能够自我感知、预警和部分自愈的运维体系。
监控是这套体系的“眼睛”。没有数据,一切优化都是盲目的。我们需要从多个维度收集数据:系统层面,使用
<?php
function getSystemLoad(): array
{
// 获取系统负载平均值
$load = sys_getloadavg();
// 如果sys_getloadavg不可用,尝试从/proc/loadavg读取
if (!$load) {
if (is_readable('/proc/loadavg')) {
$loadData = file_get_contents('/proc/loadavg');
$load = array_slice(explode(' ', $loadData), 0, 3);
} else {
$load = [0, 0, 0];
}
}
return [
'load_1min' => round($load[0], 2),
'load_5min' => round($load[1], 2),
'load_15min' => round($load[2], 2),
'timestamp' => time()
];
}
、
<?php
function getMemoryUsage(): array
{
$result = [
'total' => 0,
'used' => 0,
'free' => 0,
'percentage' => 0,
'timestamp' => time()
];
// 不同系统获取内存信息的方式不同
if (stripos(PHP_OS, 'WIN') === 0) {
// Windows系统:使用wmic命令获取内存信息
$output = shell_exec('wmic OS get TotalVisibleMemorySize,FreePhysicalMemory');
if ($output) {
$lines = explode("\n", trim($output));
if (count($lines) >= 2) {
$data = preg_split('/\s+/', trim($lines[1]));
$total = isset($data[0]) ? intval($data[0]) : 0;
$free = isset($data[1]) ? intval($data[1]) : 0;
$result['total'] = $total;
$result['free'] = $free;
$result['used'] = $total - $free;
$result['percentage'] = $total > 0 ? round(($result['used'] / $total) * 100, 2) : 0;
}
}
} else {
// Linux/Unix系统:读取/proc/meminfo文件
$meminfo = @file('/proc/meminfo');
if ($meminfo) {
$memdata = [];
foreach ($meminfo as $line) {
if (preg_match('/^([^:]+):\s+(\d+)\s/', $line, $matches)) {
$memdata[$matches[1]] = intval($matches[2]) * 1024; // 转换为字节
}
}
// 计算内存使用情况
if (isset($memdata['MemTotal'])) {
$total = $memdata['MemTotal'];
$free = $memdata['MemFree'] ?? 0;
$buffers = $memdata['Buffers'] ?? 0;
$cached = $memdata['Cached'] ?? 0;
// 计算实际使用内存(排除缓存和缓冲)
$used = $total - $free - $buffers - $cached;
$result['total'] = $total;
$result['free'] = $free + $buffers + $cached; // 包含缓存和缓冲的空闲内存
$result['used'] = max(0, $used); // 确保非负
$result['percentage'] = $total > 0 ? round(($result['used'] / $total) * 100, 2) : 0;
}
}
}
return $result;
}
和
<?php
function getCpuUsage(): array
{
// 静态变量用于保存上次CPU时间(仅Linux有效)
static $lastCpuTimes = null;
// 初始化返回数组
$result = [
'user' => 0,
'system' => 0,
'idle' => 0,
'percentage' => 0,
'timestamp' => time()
];
// 判断操作系统类型
if (stripos(PHP_OS, 'WIN') !== false) {
// Windows系统:使用WMI命令获取CPU使用率百分比
$command = 'wmic path Win32_PerfFormattedData_PerfOS_Processor get PercentIdleTime,PercentUserTime,PercentPrivilegedTime /format:csv';
$output = shell_exec($command);
if ($output) {
$lines = explode("\n", trim($output));
// 第一行为标题,第二行为数据
if (count($lines) > 1) {
$data = explode(',', $lines[1]);
// 数据格式:Node,PercentIdleTime,PercentPrivilegedTime,PercentUserTime
if (count($data) >= 4) {
$idle = floatval($data[1]); // 空闲时间百分比
$system = floatval($data[2]); // 系统态时间百分比
$user = floatval($data[3]); // 用户态时间百分比
$result['idle'] = $idle;
$result['system'] = $system;
$result['user'] = $user;
$result['percentage'] = 100 - $idle; // 总使用率 = 100 - 空闲率
}
}
} else {
// 命令执行失败,保留默认值
error_log("获取Windows CPU使用率失败");
}
} else {
// 非Windows系统(如Linux):通过/proc/stat获取CPU累计时间
$statFile = '/proc/stat';
if (file_exists($statFile)) {
$contents = file_get_contents($statFile);
$lines = explode("\n", $contents);
if (isset($lines[0])) {
$cpuLine = $lines[0]; // 第一行为总CPU时间信息
$parts = preg_split('/\s+/', trim($cpuLine));
if (count($parts) >= 5) {
// 解析CPU时间:cpu, user, nice, system, idle, iowait, ...
$user = intval($parts[1]) + intval($parts[2]); // user + nice
$system = intval($parts[3]); // system
$idle = intval($parts[4]); // idle
// 可选:将iowait计入空闲时间
if (isset($parts[5])) {
$idle += intval($parts[5]);
}
// 当前CPU累计时间
$currentCpuTimes = [
'user' => $user,
'system' => $system,
'idle' => $idle,
'total' => $user + $system + $idle
];
// 计算使用率百分比(需要与上一次采样比较)
if ($lastCpuTimes !== null) {
$diffUser = $currentCpuTimes['user'] - $lastCpuTimes['user'];
$diffSystem = $currentCpuTimes['system'] - $lastCpuTimes['system'];
$diffIdle = $currentCpuTimes['idle'] - $lastCpuTimes['idle'];
$diffTotal = $currentCpuTimes['total'] - $lastCpuTimes['total'];
if ($diffTotal > 0) {
$usagePercentage = (($diffUser + $diffSystem) / $diffTotal) * 100;
$result['percentage'] = round($usagePercentage, 2);
}
}
// 保存当前时间供下次计算
$lastCpuTimes = $currentCpuTimes;
// 设置累计时间值
$result['user'] = $user;
$result['system'] = $system;
$result['idle'] = $idle;
}
} else {
error_log("/proc/stat文件格式异常");
}
} else {
// 非Linux系统(如macOS)可在此扩展
error_log("不支持的操作系统或/proc/stat不存在");
}
}
return $result;
}
来观察服务器的基础负荷;存储层面,
<?php
function getDiskUsage(string $path = '/'): array
{
$result = [
'total' => 0,
'used' => 0,
'free' => 0,
'percentage' => 0,
'mount_point' => $path,
'timestamp' => time()
];
// 使用disk_total_space和disk_free_space函数
$total = disk_total_space($path);
$free = disk_free_space($path);
if ($total !== false && $free !== false) {
$result['total'] = round($total / (1024 * 1024), 2); // 转换为MB
$result['free'] = round($free / (1024 * 1024), 2); // 转换为MB
$result['used'] = $result['total'] - $result['free'];
$result['percentage'] = $result['total'] > 0 ?
round(($result['used'] / $result['total']) * 100, 2) : 0;
}
return $result;
}
能避免磁盘写满导致的服务中断;应用层面,
<?php
function getPhpMemoryUsage(): array
{
return [
'memory_usage' => round(memory_get_usage() / 1024 / 1024, 2), // 转换为MB
'memory_peak' => round(memory_get_peak_usage() / 1024 / 1024, 2), // 转换为MB
'timestamp' => time()
];
}
直接反映PHP脚本的内存消耗,是发现内存泄漏的关键;而业务层面,通过
<?php
function getDatabaseStats(PDO $pdo): array
{
$result = [
'connections' => 0,
'slow_queries' => 0,
'queries_per_second' => 0,
'timestamp' => time()
];
try {
// 获取当前连接数
$stmt = $pdo->query("SHOW STATUS LIKE 'Threads_connected'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$result['connections'] = intval($row['Value'] ?? 0);
}
// 获取慢查询数量
$stmt = $pdo->query("SHOW STATUS LIKE 'Slow_queries'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$result['slow_queries'] = intval($row['Value'] ?? 0);
}
// 获取每秒查询数(QPS)
$stmt = $pdo->query("SHOW STATUS LIKE 'Queries'");
if ($stmt) {
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$queries = intval($row['Value'] ?? 0);
// 获取当前时间戳,计算与之前记录的差值
static $lastQueries = 0;
static $lastTimestamp = 0;
if ($lastTimestamp > 0) {
$timeDiff = time() - $lastTimestamp;
if ($timeDiff > 0) {
$result['queries_per_second'] = round(($queries - $lastQueries) / $timeDiff, 2);
}
}
$lastQueries = $queries;
$lastTimestamp = time();
}
} catch (Exception $e) {
// 记录错误但不抛出
error_log("获取数据库统计信息失败: " . $e->getMessage());
}
return $result;
}
可以掌握数据库的压力状态,识别慢查询瓶颈。将这些指标整合起来,就能绘制出应用健康的完整画像。
有了监控数据,告警就是系统的“声音”,它在我们需要关注时发出提示。关键在于设置合理的阈值,并区分告警级别。
<?php
function checkSystemHealth(array $thresholds = []): array
{
// 默认阈值
$defaultThresholds = [
'cpu_threshold' => 90, // CPU使用率阈值(%)
'memory_threshold' => 90, // 内存使用率阈值(%)
'disk_threshold' => 90, // 磁盘使用率阈值(%)
'load_threshold' => 5.0, // 系统负载阈值(15分钟平均)
'connections_threshold' => 100 // 数据库连接数阈值
];
$thresholds = array_merge($defaultThresholds, $thresholds);
// 收集各项指标
$healthData = [
'cpu' => getCpuUsage(),
'memory' => getMemoryUsage(),
'disk' => getDiskUsage(),
'load' => getSystemLoad(),
'php_memory' => getPhpMemoryUsage(),
'timestamp' => time(),
'status' => 'healthy',
'issues' => []
];
// 检查CPU使用率
if ($healthData['cpu']['percentage'] > $thresholds['cpu_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'CPU使用率过高: %.2f%% (阈值: %d%%)',
$healthData['cpu']['percentage'],
$thresholds['cpu_threshold']
);
}
// 检查内存使用率
if ($healthData['memory']['percentage'] > $thresholds['memory_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'内存使用率过高: %.2f%% (阈值: %d%%)',
$healthData['memory']['percentage'],
$thresholds['memory_threshold']
);
}
// 检查磁盘使用率
if ($healthData['disk']['percentage'] > $thresholds['disk_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'磁盘使用率过高: %.2f%% (阈值: %d%%)',
$healthData['disk']['percentage'],
$thresholds['disk_threshold']
);
}
// 检查系统负载
if ($healthData['load']['load_15min'] > $thresholds['load_threshold']) {
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'系统负载过高: %.2f (阈值: %.1f)',
$healthData['load']['load_15min'],
$thresholds['load_threshold']
);
}
// 检查PHP内存使用
if ($healthData['php_memory']['memory_peak'] > 128) { // 128MB阈值
$healthData['status'] = 'warning';
$healthData['issues'][] = sprintf(
'PHP峰值内存使用过高: %.2fMB',
$healthData['php_memory']['memory_peak']
);
}
return $healthData;
}
函数允许我们定义这些阈值,例如CPU持续超过80%触发警告,超过95%则触发严重警报。一旦检测到异常,
<?php
function sendAlert(string $level, string $message, array $data = []): bool
{
// 默认配置
$config = [
'email_enabled' => true,
'email_from' => 'alerts@example.com',
'email_to' => ['admin@example.com'],
'slack_enabled' => false,
'slack_webhook' => '',
'sms_enabled' => false,
'sms_provider' => '',
'log_enabled' => true,
'log_file' => '/var/log/php-alerts.log'
];
// 合并自定义配置
$config = array_merge($config, $data['config'] ?? []);
$result = false;
$timestamp = date('Y-m-d H:i:s');
$fullMessage = "[{$timestamp}] [{$level}] {$message}";
// 添加附加数据
if (!empty($data)) {
$fullMessage .= "\n数据: " . json_encode($data, JSON_PRETTY_PRINT);
}
// 1. 记录到日志文件
if ($config['log_enabled']) {
$logDir = dirname($config['log_file']);
if (!is_dir($logDir)) {
mkdir($logDir, 0755, true);
}
$logResult = file_put_contents($config['log_file'], $fullMessage . "\n", FILE_APPEND) !== false;
$result = $result || $logResult;
}
// 2. 发送邮件
if ($config['email_enabled'] && in_array($level, ['critical', 'error', 'warning'])) {
$subject = "系统告警 [{$level}]: " . substr($message, 0, 50) . '...';
$headers = "From: {$config['email_from']}\r\n";
$headers .= "Content-Type: text/html; charset=UTF-8\r\n";
$htmlMessage = nl2br($fullMessage);
foreach ($config['email_to'] as $email) {
$emailResult = mail($email, $subject, $htmlMessage, $headers);
$result = $result || $emailResult;
}
}
// 3. 发送到Slack
if ($config['slack_enabled'] && !empty($config['slack_webhook'])) {
$slackData = [
'text' => $fullMessage,
'username' => '系统监控机器人',
'icon_emoji' => ':warning:'
];
$ch = curl_init($config['slack_webhook']);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($slackData));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
$slackResult = ($response !== false && $httpCode === 200);
$result = $result || $slackResult;
}
// 4. 发送短信(需要具体提供商实现)
if ($config['sms_enabled'] && !empty($config['sms_provider'])) {
// 这里需要根据具体的短信提供商实现
// 示例:调用外部API发送短信
$smsResult = false;
try {
// 根据短信提供商调用不同的API
switch ($config['sms_provider']) {
case 'aliyun':
// 阿里云短信服务调用
// $smsResult = sendSMSByAliyun($fullMessage);
break;
case 'tencent':
// 腾讯云短信服务调用
// $smsResult = sendSMSByTencent($fullMessage);
break;
default:
// 默认实现或日志记录
error_log("不支持的短信提供商: {$config['sms_provider']}");
}
$result = $result || $smsResult;
} catch (Exception $e) {
error_log("发送短信失败: " . $e->getMessage());
}
}
// 5. 错误处理:记录发送失败的情况
if (!$result) {
error_log("告警发送失败 - 级别: {$level}, 消息: " . substr($message, 0, 200));
}
return $result;
}
可以立即通过邮件、钉钉、企业微信等渠道通知到负责人。更进一步的,
<?php
function checkAndAlert(array $thresholds = []): void
{
$healthData = checkSystemHealth($thresholds);
if ($healthData['status'] !== 'healthy' && !empty($healthData['issues'])) {
$level = 'warning';
$message = '系统健康检查发现问题: ' . implode('; ', $healthData['issues']);
// 如果问题严重,提升告警级别
if (count($healthData['issues']) > 3) {
$level = 'error';
}
// 检查是否有关键指标严重超标
if ($healthData['cpu']['percentage'] > 95 ||
$healthData['memory']['percentage'] > 95 ||
$healthData['disk']['percentage'] > 95) {
$level = 'critical';
}
sendAlert($level, $message, $healthData);
}
// 记录常规监控数据
$monitorLog = sprintf(
"[%s] CPU: %.2f%%, Memory: %.2f%%, Disk: %.2f%%, Load: %.2f\n",
date('Y-m-d H:i:s'),
$healthData['cpu']['percentage'],
$healthData['memory']['percentage'],
$healthData['disk']['percentage'],
$healthData['load']['load_15min']
);
$logFile = '/var/log/php-monitor.log';
if (!is_dir(dirname($logFile))) {
mkdir(dirname($logFile), 0755, true);
}
file_put_contents($logFile, $monitorLog, FILE_APPEND);
}
将检测和通知流程自动化,确保问题能被及时发现。
自动化运维则是系统的“双手”,它处理那些重复、可预测的维护工作,解放人力。例如,使用
<?php
function setupMonitoringCron(int $interval = 300): bool
{
// 检查脚本路径是否存在
$scriptPath = __FILE__;
if (!file_exists($scriptPath)) {
error_log("监控脚本文件不存在: {$scriptPath}");
return false;
}
// 构建监控命令
$command = "php -f \"{$scriptPath}\" --check 2>&1 >> /var/log/php-monitor-cron.log";
// 计算cron表达式(每$interval秒执行一次)
if ($interval >= 60) {
// 大于等于60秒,转换为分钟
$minutes = floor($interval / 60);
if ($minutes < 1) {
$minutes = 1; // 最小为1分钟
}
$cronExpression = "*/{$minutes} * * * *";
} else {
// 如果间隔小于60秒,需要特殊处理(多行cron)
$lines = [];
for ($i = 0; $i < 60; $i += $interval) {
$lines[] = "{$i} * * * * {$command}";
}
$cronExpression = implode("\n", $lines);
}
// 准备要添加的cron行
$cronLine = "# PHP系统监控任务(监控间隔:{$interval}秒)\n";
if ($interval >= 60) {
$cronLine .= "{$cronExpression} {$command}\n";
} else {
$cronLine .= "{$cronExpression}\n";
}
// 获取当前用户的crontab
exec('crontab -l 2>/dev/null', $currentCron, $returnCode);
if ($returnCode !== 0) {
// 如果用户没有crontab,从空数组开始
$currentCron = [];
}
// 移除旧的监控任务
$newCron = [];
$monitorFound = false;
foreach ($currentCron as $line) {
// 跳过注释行和空行
if (trim($line) === '' || strpos($line, '#') === 0) {
// 检查是否是我们的监控任务注释
if (strpos($line, 'PHP系统监控任务') !== false) {
$monitorFound = true;
continue; // 跳过旧注释行
}
// 保留其他注释行
$newCron[] = $line;
continue;
}
// 检查是否是监控任务行
if (strpos($line, basename($scriptPath)) !== false || strpos($line, 'php-monitor-cron.log') !== false) {
$monitorFound = true;
continue; // 跳过旧任务行
}
// 保留其他任务行
$newCron[] = $line;
}
// 添加新的监控任务
$newCron[] = trim($cronLine);
// 创建临时文件保存新的crontab
$tempFile = tempnam(sys_get_temp_dir(), 'cron_');
if ($tempFile === false) {
error_log("无法创建临时文件");
return false;
}
// 将新的crontab内容写入临时文件
$newCronContent = implode("\n", $newCron) . "\n";
if (file_put_contents($tempFile, $newCronContent) === false) {
error_log("无法写入临时文件: {$tempFile}");
unlink($tempFile);
return false;
}
// 应用新的crontab
exec("crontab {$tempFile} 2>&1", $output, $installResult);
// 清理临时文件
unlink($tempFile);
if ($installResult !== 0) {
error_log("安装cron任务失败: " . implode("\n", $output));
return false;
}
// 验证cron任务是否安装成功
exec('crontab -l 2>/dev/null', $verifyCron, $verifyResult);
if ($verifyResult === 0) {
foreach ($verifyCron as $line) {
if (strpos($line, basename($scriptPath)) !== false) {
return true; // 确认任务已添加
}
}
}
error_log("cron任务安装后验证失败");
return false;
}
可以轻松创建一个每5分钟运行一次的定时任务,周期性执行健康检查与数据采集。日志管理是运维的日常,
<?php
function cleanupOldLogs(string $logDir, int $daysToKeep = 30): array
{
$result = [
'total_files' => 0,
'deleted_files' => 0,
'deleted_size' => 0,
'errors' => []
];
if (!is_dir($logDir)) {
$result['errors'][] = "目录不存在: {$logDir}";
return $result;
}
$cutoffTime = time() - ($daysToKeep * 24 * 60 * 60);
$iterator = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($logDir, RecursiveDirectoryIterator::SKIP_DOTS),
RecursiveIteratorIterator::SELF_FIRST
);
foreach ($iterator as $file) {
if ($file->isFile()) {
$result['total_files']++;
// 检查文件修改时间
if ($file->getMTime() < $cutoffTime) {
$fileSize = $file->getSize();
if (unlink($file->getPathname())) {
$result['deleted_files']++;
$result['deleted_size'] += $fileSize;
} else {
$result['errors'][] = "无法删除文件: " . $file->getPathname();
}
}
}
}
// 尝试删除空目录
foreach (array_reverse(iterator_to_array($iterator)) as $file) {
if ($file->isDir()) {
@rmdir($file->getPathname());
}
}
return $result;
}
和
<?php
function rotateLogFile(string $logFile, int $maxSize = 10485760, int $backups = 10): bool
{
if (!file_exists($logFile)) {
return false;
}
$fileSize = filesize($logFile);
if ($fileSize < $maxSize) {
return false;
}
// 关闭可能打开的文件句柄
if (function_exists('posix_kill')) {
// 尝试重新打开日志文件
$logDir = dirname($logFile);
$baseName = basename($logFile);
// 重命名当前日志文件
$timestamp = date('Ymd_His');
$archivedFile = $logDir . '/' . $baseName . '.' . $timestamp;
if (!rename($logFile, $archivedFile)) {
return false;
}
// 重新创建日志文件
touch($logFile);
chmod($logFile, 0644);
// 清理旧的归档文件
$pattern = $logDir . '/' . $baseName . '.*';
$backupFiles = glob($pattern);
if (count($backupFiles) > $backups) {
// 按修改时间排序,删除最旧的文件
usort($backupFiles, function($a, $b) {
return filemtime($a) - filemtime($b);
});
$filesToDelete = array_slice($backupFiles, 0, count($backupFiles) - $backups);
foreach ($filesToDelete as $file) {
@unlink($file);
}
}
return true;
}
return false;
}
能自动归档和清理日志,防止日志文件耗尽磁盘空间。数据库也需要定期维护,
<?php
function backupDatabase(array $config): array
{
$defaultConfig = [
'host' => 'localhost',
'port' => 3306,
'username' => 'root',
'password' => '',
'database' => '',
'backup_dir' => '/var/backups/mysql',
'compress' => true,
'keep_days' => 30
];
$config = array_merge($defaultConfig, $config);
$result = [
'success' => false,
'backup_file' => '',
'backup_size' => 0,
'error' => '',
'timestamp' => time()
];
// 验证配置
if (empty($config['database'])) {
$result['error'] = '数据库名称不能为空';
return $result;
}
// 创建备份目录
if (!is_dir($config['backup_dir'])) {
if (!mkdir($config['backup_dir'], 0755, true)) {
$result['error'] = "无法创建备份目录: {$config['backup_dir']}";
return $result;
}
}
// 生成备份文件名
$timestamp = date('Ymd_His');
$backupFile = $config['backup_dir'] . '/' . $config['database'] . '_' . $timestamp . '.sql';
if ($config['compress']) {
$backupFile .= '.gz';
}
// 构建mysqldump命令
$command = sprintf(
'mysqldump --host=%s --port=%d --user=%s --password=%s %s',
escapeshellarg($config['host']),
$config['port'],
escapeshellarg($config['username']),
escapeshellarg($config['password']),
escapeshellarg($config['database'])
);
if ($config['compress']) {
$command .= ' | gzip';
}
$command .= ' > ' . escapeshellarg($backupFile);
// 执行备份命令
exec($command, $output, $returnCode);
if ($returnCode === 0) {
// 备份成功
$result['success'] = true;
$result['backup_file'] = $backupFile;
// 获取备份文件大小
if (file_exists($backupFile)) {
$result['backup_size'] = filesize($backupFile);
}
// 清理旧备份文件
$expireTime = time() - ($config['keep_days'] * 24 * 60 * 60);
$pattern = $config['backup_dir'] . '/' . $config['database'] . '_*.sql*';
foreach (glob($pattern) as $oldBackupFile) {
if (filemtime($oldBackupFile) < $expireTime) {
@unlink($oldBackupFile);
}
}
} else {
// 备份失败
$result['error'] = '数据库备份失败,mysqldump命令执行错误';
if (!empty($output)) {
$result['error'] .= ': ' . implode("\n", $output);
}
}
return $result;
}
实现自动备份,而
<?php
function optimizeDatabaseTables(PDO $pdo, array $tables = []): array
{
$result = [
'optimized_tables' => [],
'total_size_before' => 0,
'total_size_after' => 0,
'errors' => []
];
try {
// 如果没有指定表,则优化所有表
if (empty($tables)) {
$stmt = $pdo->query("SHOW TABLES");
$tables = $stmt->fetchAll(PDO::FETCH_COLUMN);
}
foreach ($tables as $table) {
try {
// 获取表优化前的状态
$stmt = $pdo->query("SHOW TABLE STATUS LIKE '{$table}'");
$status = $stmt->fetch(PDO::FETCH_ASSOC);
$sizeBefore = $status['Data_length'] + $status['Index_length'];
$result['total_size_before'] += $sizeBefore;
// 执行OPTIMIZE TABLE
$pdo->exec("OPTIMIZE TABLE `{$table}`");
// 获取表优化后的状态
$stmt = $pdo->query("SHOW TABLE STATUS LIKE '{$table}'");
$status = $stmt->fetch(PDO::FETCH_ASSOC);
$sizeAfter = $status['Data_length'] + $status['Index_length'];
$result['total_size_after'] += $sizeAfter;
$result['optimized_tables'][] = [
'table' => $table,
'size_before' => $sizeBefore,
'size_after' => $sizeAfter,
'saved' => $sizeBefore - $sizeAfter
];
} catch (Exception $e) {
$result['errors'][] = "优化表 {$table} 失败: " . $e->getMessage();
}
}
} catch (Exception $e) {
$result['errors'][] = "获取表列表失败: " . $e->getMessage();
}
return $result;
}
可以定期优化表碎片,维持查询性能。
最终,这些环节构成了一个闭环:监控持续收集数据,告警在异常时介入,自动化程序处理常规维护并生成报告。通过
<?php
function getPerformanceMetrics(): array
{
static $startTime = null;
static $requestCount = 0;
// 如果是第一次调用,初始化开始时间
if ($startTime === null) {
$startTime = microtime(true);
}
// 增加请求计数
$requestCount++;
// 计算当前时间
$currentTime = microtime(true);
// 计算运行时长(秒)
$runtime = $currentTime - $startTime;
// 计算吞吐量(请求/秒)
$throughput = 0;
if ($runtime > 0) {
$throughput = $requestCount / $runtime;
}
// 获取内存使用情况
$memoryUsage = memory_get_usage(true);
$peakMemoryUsage = memory_get_peak_usage(true);
// 获取系统负载(如果可用)
$systemLoad = [];
if (function_exists('sys_getloadavg')) {
$systemLoad = sys_getloadavg();
}
// 获取数据库连接数(示例)
$databaseConnections = 0;
if (extension_loaded('mysqli')) {
$databaseConnections = mysqli_num_links();
}
// 构造返回的指标数组
$metrics = [
'runtime_seconds' => round($runtime, 4),
'request_count' => $requestCount,
'throughput_rps' => round($throughput, 2),
'memory_usage_bytes' => $memoryUsage,
'peak_memory_bytes' => $peakMemoryUsage,
'system_load' => $systemLoad,
'database_connections' => $databaseConnections,
'current_timestamp' => date('Y-m-d H:i:s'),
'php_version' => PHP_VERSION,
'script_execution_time' => round(microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'], 4)
];
// 添加错误处理:如果运行时异常,返回空数组
try {
// 可选:记录到日志或监控系统
// file_put_contents('performance.log', json_encode($metrics) . PHP_EOL, FILE_APPEND);
return $metrics;
} catch (Exception $e) {
error_log("获取性能指标失败: " . $e->getMessage());
return [];
}
}
和
<?php
function generateMonitoringReport(string $period = 'daily'): array
{
$report = [
'period' => $period,
'generated_at' => date('Y-m-d H:i:s'),
'summary' => [],
'details' => []
];
// 根据周期确定时间范围
$endTime = time();
switch ($period) {
case 'hourly':
$startTime = $endTime - 3600;
break;
case 'daily':
$startTime = $endTime - 86400;
break;
case 'weekly':
$startTime = $endTime - 604800;
break;
case 'monthly':
$startTime = $endTime - 2592000;
break;
default:
$startTime = $endTime - 86400; // 默认每天
}
// 这里应该从监控数据库或日志文件中获取数据
// 以下为示例数据
$report['summary'] = [
'total_requests' => rand(1000, 10000),
'avg_response_time' => round(rand(100, 500) / 1000, 2), // 转换为秒
'error_rate' => round(rand(1, 50) / 1000, 2), // 转换为百分比
'peak_memory_usage' => round(rand(50, 200), 2), // MB
'alerts_count' => rand(0, 10)
];
// 添加详细数据
$report['details'] = [
'cpu_usage' => [
'avg' => round(rand(20, 80), 2),
'max' => round(rand(70, 95), 2),
'min' => round(rand(10, 30), 2)
],
'memory_usage' => [
'avg' => round(rand(40, 70), 2),
'max' => round(rand(75, 95), 2),
'min' => round(rand(20, 40), 2)
],
'disk_usage' => [
'avg' => round(rand(50, 80), 2),
'max' => round(rand(80, 95), 2),
'min' => round(rand(40, 60), 2)
]
];
return $report;
}
,我们不仅能应对故障,更能从长期的趋势数据中洞察瓶颈所在,为下一轮的深度优化提供清晰、数据驱动的决策依据。优化从此不再是应急反应,而是一个基于观测、分析和自动化的、持续演进的过程。
1904

被折叠的 条评论
为什么被折叠?



