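Architecture Notes: WebSocket Push
Overview
In distributed systems, pushing data from the server to clients in real time is a common requirement. Traditional HTTP polling and long polling waste resources, add latency, and scale poorly. The WebSocket protocol provides full-duplex communication over a single TCP connection, which makes server-to-client push efficient.
Core rule: for server-to-client push, use WebSocket, preferably on a WebSocket server built with Netty; a single well-tuned node can hold on the order of one million connections.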
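Core Principles
WebSocket Protocol Characteristics
WebSocket is a protocol for full-duplex communication over a single TCP connection. It arrived in browsers alongside HTML5 and is standardized as RFC 6455. Its core characteristics:
- Full-duplex communication: client and server can send and receive at the same time
- Persistent connection: once established, the connection stays open, so there is no repeated handshake cost
- Low overhead: frame headers are small (2 to 14 bytes), so transfers are efficient
- Real-time delivery: messages are pushed immediately, with no polling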
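To make the "low overhead" point concrete, the sketch below computes the header size of a single unfragmented frame from the RFC 6455 framing rules. The class and method names are illustrative only and are not part of any library.

```java
// Illustrative helper (hypothetical name): frame header size per RFC 6455.
class WebSocketFrameOverhead {

    /** Returns the header size in bytes of one unfragmented WebSocket frame. */
    static int headerSize(long payloadLength, boolean masked) {
        int size = 2; // FIN/opcode byte plus mask-bit/7-bit length byte
        if (payloadLength > 65535) {
            size += 8; // 64-bit extended payload length
        } else if (payloadLength > 125) {
            size += 2; // 16-bit extended payload length
        }
        if (masked) {
            size += 4; // masking key, required on client-to-server frames
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(headerSize(100, false));    // server to client, small message: 2
        System.out.println(headerSize(100, true));     // client to server, small message: 6
        System.out.println(headerSize(70_000, false)); // server to client, large message: 10
    }
}
```

A pushed 100-byte notification therefore costs 2 bytes of framing, whereas an HTTP polling response typically carries headers of several hundred bytes.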
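Why Netty
Netty is an asynchronous, event-driven network application framework for building maintainable, high-performance network servers and clients. Its advantages for a WebSocket server:
- NIO model: built on Java NIO, so one thread can serve many connections without the per-connection blocking threads of traditional BIO
- Zero-copy: ByteBuf slicing and composite buffers, direct buffers, and FileRegion transfers avoid unnecessary memory copies
- Pooled memory: PooledByteBufAllocator reuses buffers and reduces GC pressure
- High performance: proven in many production systems; with tuning, a single node can sustain connections on the order of a million
- Complete protocol support: built-in WebSocket codecs and handshake handlers work out of the box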
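Prerequisites for One Million Connections
Holding one million connections on a single node requires the following:
| Resource | Requirement | Notes |
|---|---|---|
| File descriptors | > 1,000,000 | OS limit; raise it with ulimit / limits.conf |
| Ports | One listening port is enough | Connections are identified by the (source IP, source port, destination IP, destination port) tuple, so the 65,535-port limit applies per client IP, for example to load generators or proxies in front of the server; those need multiple source IPs |
| Memory | 8 GB+ | Roughly 1-2 KB of application metadata per connection, plus kernel socket buffers |
| Network bandwidth | 10 Gbps+ | Depends on message rate and size |
| CPU | 16+ cores | Needed under high concurrency |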
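A rough sizing calculation helps check the table against a concrete workload. The sketch below uses the 2 KB per-connection figure from the table; the push rate of one 1 KB message per connection per second is an assumed workload, not a measurement, and the class name is hypothetical.

```java
// Back-of-the-envelope sizing (hypothetical helper, assumed workload numbers).
class CapacityEstimate {

    /** Memory needed for per-connection state, in GiB. */
    static double memoryGiB(long connections, long bytesPerConnection) {
        return connections * (double) bytesPerConnection / (1L << 30);
    }

    /** Outbound payload bandwidth in Gbit/s for a given push rate and message size. */
    static double bandwidthGbps(long connections, double messagesPerSecondPerConnection, int bytesPerMessage) {
        double bytesPerSecond = connections * messagesPerSecondPerConnection * bytesPerMessage;
        return bytesPerSecond * 8 / 1e9;
    }

    public static void main(String[] args) {
        // 1,000,000 connections at 2 KB of metadata each: about 1.9 GiB
        System.out.printf("metadata: %.2f GiB%n", memoryGiB(1_000_000, 2048));
        // 1,000,000 connections, one 1 KB message per second each: about 8.2 Gbit/s
        System.out.printf("bandwidth: %.2f Gbit/s%n", bandwidthGbps(1_000_000, 1.0, 1024));
    }
}
```

Under these assumptions the metadata alone fits comfortably in 8 GB, while a one-message-per-second push rate already approaches a 10 Gbps link, so bandwidth rather than memory tends to be the first limit for chatty workloads.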
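Implementation
Netty WebSocket Server Architecture
Core Configuration Parameters
1. Netty thread model
// Boss group: accepts incoming connections
EventLoopGroup bossGroup = new NioEventLoopGroup(1);
// Worker group: handles I/O on accepted connections
// Suggested thread count: CPU cores * 2 (also Netty's default when no count is given)
int workerThreads = Runtime.getRuntime().availableProcessors() * 2;
EventLoopGroup workerGroup = new NioEventLoopGroup(workerThreads);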
2. Channel configuration
ServerBootstrap b = new ServerBootstrap();
b.group(bossGroup, workerGroup)
    .channel(NioServerSocketChannel.class)
    .option(ChannelOption.SO_BACKLOG, 1024)               // accept queue size
    .option(ChannelOption.SO_REUSEADDR, true)             // allow rebinding to an address still in TIME_WAIT
    .childOption(ChannelOption.TCP_NODELAY, true)         // disable Nagle's algorithm
    .childOption(ChannelOption.SO_KEEPALIVE, true)        // enable TCP keep-alive probes
    .childOption(ChannelOption.SO_RCVBUF, 32 * 1024)      // receive buffer size
    .childOption(ChannelOption.SO_SNDBUF, 32 * 1024)      // send buffer size
    .childOption(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT) // pooled buffers
    .childHandler(new WebSocketServerInitializer());
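3. WebSocket handshake configuration
public class WebSocketServerInitializer extends ChannelInitializer<SocketChannel> {
    // Shared by all channels so that every connection lands in the same registry
    private final ConnectionManager connectionManager = new ConnectionManager();

    @Override
    protected void initChannel(SocketChannel ch) {
        ChannelPipeline pipeline = ch.pipeline();
        // HTTP codec
        pipeline.addLast(new HttpServerCodec());
        // Aggregate HTTP message parts; maximum content length is 8 KB (8192 bytes)
        pipeline.addLast(new HttpObjectAggregator(8192));
        // WebSocket protocol handler: performs the handshake on the given path
        pipeline.addLast(new WebSocketServerProtocolHandler("/ws"));
        // Application handler
        pipeline.addLast(new WebSocketServerHandler(connectionManager));
    }
}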
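For readers curious what the handshake step actually computes, the sketch below derives the `Sec-WebSocket-Accept` response header from the client's `Sec-WebSocket-Key`, as specified in RFC 6455. Netty does this internally; the class name here is illustrative and the code is not Netty's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Illustrative helper (hypothetical name): the accept-key derivation from RFC 6455.
class HandshakeAccept {
    // Fixed GUID defined by RFC 6455
    private static final String WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    /** Returns Base64(SHA-1(key + GUID)), the value the server must echo back. */
    static String acceptValue(String secWebSocketKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest((secWebSocketKey + WS_GUID).getBytes(StandardCharsets.US_ASCII));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Sample key from RFC 6455; prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
        System.out.println(acceptValue("dGhlIHNhbXBsZSBub25jZQ=="));
    }
}
```

The server returns this value with a `101 Switching Protocols` response, after which both sides exchange WebSocket frames on the same TCP connection.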
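Optimization Strategies for One Million Connections
1. Operating system tuning
# /etc/sysctl.conf
# Raise the system-wide maximum number of file handles
fs.file-max = 2097152
# TCP connection tuning
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 8192
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
# TCP buffer tuning
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Apply the settings (a shell command, not part of the file)
sysctl -p
# /etc/security/limits.conf
# Per-process file descriptor limit; it must exceed the target connection count,
# because the process also needs descriptors for logs, JARs, and other sockets
* soft nofile 1048576
* hard nofile 1048576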
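2. JVM tuning
# JVM startup flags
# Leak detection is left at Netty's low-overhead default level (simple),
# matching the production recommendation in the leak detection section
-Xms8g -Xmx8g \
-XX:+UseG1GC \
-XX:MaxGCPauseMillis=200 \
-XX:ParallelGCThreads=16 \
-XX:ConcGCThreads=4 \
-XX:+UseStringDeduplication \
-XX:+OptimizeStringConcat \
-Dio.netty.leakDetection.level=simple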
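3. Netty memory management
// Allocate off-heap (direct) memory
ByteBuf directBuffer = ByteBufAllocator.DEFAULT.directBuffer(1024);
// Allocate from the buffer pool
ByteBuf pooledBuffer = PooledByteBufAllocator.DEFAULT.buffer(1024);
// Release each buffer as soon as it is no longer needed
ReferenceCountUtil.release(directBuffer);
ReferenceCountUtil.release(pooledBuffer);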
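4. Connection management
public class ConnectionManager {
    // Track connections by user ID in a ConcurrentHashMap
    private final ConcurrentHashMap<String, Channel> connections = new ConcurrentHashMap<>();
    // Heartbeat interval
    private static final int HEARTBEAT_INTERVAL = 30; // seconds
    // Idle time after which a connection should be cleaned up (two missed heartbeats)
    private static final int CONNECTION_TIMEOUT = 60; // seconds

    public void addConnection(String userId, Channel channel) {
        connections.put(userId, channel);
        // Add idle detection at the head of the pipeline: the handler must see
        // every inbound read, and the IdleStateEvent it fires only reaches
        // handlers that come after it
        channel.pipeline().addFirst(new IdleStateHandler(
                HEARTBEAT_INTERVAL, 0, 0, TimeUnit.SECONDS));
    }

    public Channel getConnection(String userId) {
        return connections.get(userId);
    }

    public void removeConnection(String userId) {
        Channel channel = connections.remove(userId);
        if (channel != null && channel.isActive()) {
            channel.close();
        }
    }
}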
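Later sections look connections up in both directions (by user ID and by connection) and periodically purge dead entries. The sketch below shows one possible shape for such a registry. It is generic over the connection type so that it runs without Netty; with Netty, `C` would be `Channel` and the liveness check `Channel::isActive`. All names are hypothetical, and the two maps are not updated atomically, which is acceptable only if brief inconsistency during reconnects is tolerable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical two-way connection registry; C stands in for the connection type.
class ConnectionRegistry<C> {
    private final Map<String, C> byUser = new ConcurrentHashMap<>();
    private final Map<C, String> byConnection = new ConcurrentHashMap<>();

    /** Registers a connection; returns the connection it replaced, or null. */
    C add(String userId, C connection) {
        C previous = byUser.put(userId, connection);
        if (previous != null) {
            byConnection.remove(previous);
        }
        byConnection.put(connection, userId);
        return previous;
    }

    C getConnection(String userId) {
        return byUser.get(userId);
    }

    String getUserId(C connection) {
        return byConnection.get(connection);
    }

    /** Removes the user only if the mapping still points at this connection. */
    boolean remove(String userId, C connection) {
        byConnection.remove(connection);
        return byUser.remove(userId, connection);
    }

    /** Drops every entry whose connection fails the liveness check; returns the removed user IDs. */
    List<String> removeInactive(Predicate<C> isActive) {
        List<String> removed = new ArrayList<>();
        for (Map.Entry<String, C> entry : byUser.entrySet()) {
            if (!isActive.test(entry.getValue()) && remove(entry.getKey(), entry.getValue())) {
                removed.add(entry.getKey());
            }
        }
        return removed;
    }

    int size() {
        return byUser.size();
    }
}
```

The conditional `remove(userId, connection)` matters when a user reconnects: the close event of the old connection then cannot evict the mapping of the new one.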
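Best Practices
1. Heartbeat mechanism
A heartbeat checks that connections are still alive, so dead ones can be found and cleaned up promptly.
// Must be placed after IdleStateHandler in the pipeline (one instance per channel)
public class HeartbeatHandler extends ChannelInboundHandlerAdapter {
    // Close the connection after this many consecutive idle periods without inbound data
    private static final int MAX_MISSED_HEARTBEATS = 2;
    private int missedHeartbeats;

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
        // Any inbound data proves the peer is alive
        missedHeartbeats = 0;
        super.channelRead(ctx, msg);
    }

    @Override
    public void userEventTriggered(ChannelHandlerContext ctx, Object evt) throws Exception {
        if (evt instanceof IdleStateEvent
                && ((IdleStateEvent) evt).state() == IdleState.READER_IDLE) {
            if (++missedHeartbeats >= MAX_MISSED_HEARTBEATS) {
                // Nothing received for several idle periods: treat the connection as dead
                ctx.close();
            } else {
                // First idle period: probe the client with a WebSocket ping frame.
                // Write through the channel so the frame passes the WebSocket encoder
                // regardless of where this handler sits in the pipeline.
                ctx.channel().writeAndFlush(new PingWebSocketFrame());
            }
            return;
        }
        super.userEventTriggered(ctx, evt);
    }
}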
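The idle-detection decision itself is independent of Netty and can be sketched as a small state holder: remember when data was last read, probe after one interval of silence, and close after a longer one. The class below is a hypothetical illustration using the 30-second and 60-second values from the connection manager; time is passed in explicitly so the logic is easy to test.

```java
// Hypothetical idle monitor: decides what to do with a connection based on read silence.
class IdleMonitor {
    enum Action { NONE, SEND_PING, CLOSE }

    private final long pingAfterMillis;
    private final long closeAfterMillis;
    private long lastReadMillis;

    IdleMonitor(long pingAfterMillis, long closeAfterMillis, long nowMillis) {
        this.pingAfterMillis = pingAfterMillis;
        this.closeAfterMillis = closeAfterMillis;
        this.lastReadMillis = nowMillis;
    }

    /** Call whenever any data arrives from the peer. */
    void onRead(long nowMillis) {
        lastReadMillis = nowMillis;
    }

    /** Call periodically; returns the action the server should take now. */
    Action check(long nowMillis) {
        long idle = nowMillis - lastReadMillis;
        if (idle >= closeAfterMillis) {
            return Action.CLOSE;
        }
        if (idle >= pingAfterMillis) {
            return Action.SEND_PING;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        IdleMonitor monitor = new IdleMonitor(30_000, 60_000, 0);
        System.out.println(monitor.check(10_000)); // NONE
        System.out.println(monitor.check(30_000)); // SEND_PING
        System.out.println(monitor.check(60_000)); // CLOSE
    }
}
```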
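2. Message push strategy
Batch push
public class MessagePusher {
    private final ExecutorService pushExecutor = Executors.newFixedThreadPool(16);
    private final ConnectionManager connectionManager;

    public MessagePusher(ConnectionManager connectionManager) {
        this.connectionManager = connectionManager;
    }

    // Batch push: fan one message out to many users
    public void batchPush(List<String> userIds, String message) {
        List<CompletableFuture<Void>> futures = userIds.stream()
                .map(userId -> CompletableFuture.runAsync(() ->
                        pushToUser(userId, message), pushExecutor))
                .collect(Collectors.toList());
        // writeAndFlush is asynchronous, so this waits only until every write has
        // been handed to its channel, not until it has reached the client
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    }

    // Push to a single user (public so that message consumers can call it)
    public void pushToUser(String userId, String message) {
        Channel channel = connectionManager.getConnection(userId);
        if (channel != null && channel.isActive()) {
            channel.writeAndFlush(new TextWebSocketFrame(message));
        }
    }
}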
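Message queue integration
// MessageListener and Message stand for the types of whichever message queue client is in use
public class WebSocketMessageConsumer implements MessageListener {
    private final MessagePusher messagePusher;

    public WebSocketMessageConsumer(MessagePusher messagePusher) {
        this.messagePusher = messagePusher;
    }

    @Override
    public void onMessage(Message message) {
        String targetUser = message.getTargetUser();
        String content = message.getContent();
        messagePusher.pushToUser(targetUser, content);
    }
}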
3. Connection limiting
Connection limiting keeps malicious connections or abnormal traffic from exhausting server resources.
// Register one shared instance on every channel so that the counter is global
@ChannelHandler.Sharable
public class ConnectionLimiter extends ChannelInboundHandlerAdapter {
    private final AtomicInteger connectionCount = new AtomicInteger(0);
    private final int maxConnections;

    public ConnectionLimiter(int maxConnections) {
        this.maxConnections = maxConnections;
    }

    @Override
    public void channelActive(ChannelHandlerContext ctx) throws Exception {
        if (connectionCount.incrementAndGet() > maxConnections) {
            // The WebSocket handshake has not happened yet, so a close frame cannot
            // be sent; close the TCP connection instead. The channelInactive
            // callback below undoes the increment.
            ctx.close();
            return;
        }
        super.channelActive(ctx);
    }

    @Override
    public void channelInactive(ChannelHandlerContext ctx) throws Exception {
        connectionCount.decrementAndGet();
        super.channelInactive(ctx);
    }
}
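At its core the limiter is a bounded counter. An increment-then-check counter briefly overshoots the limit under contention; a compare-and-set loop avoids that. The sketch below is one way to write such a counter with only the JDK; the class name is hypothetical and it is not tied to Netty.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical bounded permit counter that never exceeds its maximum.
class ConnectionPermits {
    private final int max;
    private final AtomicInteger current = new AtomicInteger();

    ConnectionPermits(int max) {
        this.max = max;
    }

    /** Takes a permit if one is free; returns false when the limit is reached. */
    boolean tryAcquire() {
        while (true) {
            int n = current.get();
            if (n >= max) {
                return false;
            }
            if (current.compareAndSet(n, n + 1)) {
                return true;
            }
            // Another thread changed the counter first; retry with the new value
        }
    }

    /** Returns a permit; call exactly once per successful tryAcquire. */
    void release() {
        current.decrementAndGet();
    }

    int inUse() {
        return current.get();
    }
}
```

In a channel handler, `tryAcquire()` would be called when a connection becomes active and `release()` when an accepted connection closes.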
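4. Cluster deployment
Although a single node can hold a million connections, production systems should still be deployed as a cluster.
Session sharing
Store the user-to-server mapping in Redis:
public class SessionManager {
    private final RedisTemplate<String, String> redisTemplate;
    // MessageQueue and PushMessage are placeholders for the messaging client in use
    private final MessageQueue messageQueue;

    public SessionManager(RedisTemplate<String, String> redisTemplate, MessageQueue messageQueue) {
        this.redisTemplate = redisTemplate;
        this.messageQueue = messageQueue;
    }

    // Register which server holds the user's connection
    public void registerSession(String userId, String serverId) {
        String key = "ws:session:" + userId;
        redisTemplate.opsForValue().set(key, serverId, 24, TimeUnit.HOURS);
    }

    // Look up the server that holds the user's connection
    public String getServerId(String userId) {
        String key = "ws:session:" + userId;
        return redisTemplate.opsForValue().get(key);
    }

    // Push a message to a user connected to another server
    public void pushCrossServer(String userId, String message) {
        String serverId = getServerId(userId);
        if (serverId != null) {
            // Route through the message queue to the target server
            messageQueue.publish("ws:push:" + serverId,
                    new PushMessage(userId, message));
        }
    }
}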
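When one message targets many users, it is usually cheaper to group the recipients by server and publish once per server than to publish once per user. The sketch below shows that grouping step with an in-memory map standing in for the Redis session store; the class name and sample IDs are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical routing helper: groups recipients by the server that holds their connection.
class PushRouter {

    /** sessions maps userId to serverId; users without a session are treated as offline and skipped. */
    static Map<String, List<String>> groupByServer(Map<String, String> sessions, Collection<String> userIds) {
        Map<String, List<String>> result = new TreeMap<>();
        for (String userId : userIds) {
            String serverId = sessions.get(userId);
            if (serverId == null) {
                continue; // offline: nothing to route
            }
            result.computeIfAbsent(serverId, k -> new ArrayList<>()).add(userId);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> sessions = new HashMap<>();
        sessions.put("alice", "node-1");
        sessions.put("bob", "node-2");
        sessions.put("carol", "node-1");
        // Prints {node-1=[alice, carol], node-2=[bob]}
        System.out.println(groupByServer(sessions, Arrays.asList("alice", "bob", "carol", "dave")));
    }
}
```

Each resulting entry corresponds to one publish on that server's queue topic, with the user list carried in the message body.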
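Caveats
1. Security
Authentication
public class WebSocketAuthHandler extends ChannelInboundHandlerAdapter {
    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
        if (msg instanceof FullHttpRequest) {
            FullHttpRequest request = (FullHttpRequest) msg;
            // Note: the browser WebSocket API cannot set custom headers, so browser
            // clients have to pass the token another way (for example a query parameter)
            String token = request.headers().get(HttpHeaderNames.AUTHORIZATION);
            // validateToken is application-specific and not shown here
            if (!validateToken(token)) {
                // This handler consumes the request, so it must release it
                ReferenceCountUtil.release(msg);
                ctx.writeAndFlush(new DefaultFullHttpResponse(
                        HttpVersion.HTTP_1_1, HttpResponseStatus.UNAUTHORIZED
                )).addListener(ChannelFutureListener.CLOSE);
                return;
            }
        }
        super.channelRead(ctx, msg);
    }
}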
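Because browser clients cannot attach an `Authorization` header to a WebSocket upgrade request, servers often accept the token from a query parameter as a fallback. The sketch below extracts a token from either place. The `Bearer` scheme and the `token` parameter name are assumptions for illustration, and the class name is hypothetical. Tokens in URLs can end up in access logs, so short-lived tokens are advisable.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical token extraction: Authorization header first, then a "token" query parameter.
class TokenExtractor {
    private static final String BEARER_PREFIX = "Bearer ";

    /** Returns the token, or null if the request carries none. */
    static String fromRequest(String authorizationHeader, String uri) {
        if (authorizationHeader != null
                && authorizationHeader.regionMatches(true, 0, BEARER_PREFIX, 0, BEARER_PREFIX.length())) {
            String token = authorizationHeader.substring(BEARER_PREFIX.length()).trim();
            if (!token.isEmpty()) {
                return token;
            }
        }
        int queryStart = uri.indexOf('?');
        if (queryStart < 0) {
            return null;
        }
        for (String pair : uri.substring(queryStart + 1).split("&")) {
            if (pair.startsWith("token=") && pair.length() > "token=".length()) {
                try {
                    return URLDecoder.decode(pair.substring("token=".length()), "UTF-8");
                } catch (UnsupportedEncodingException e) {
                    throw new IllegalStateException("UTF-8 is not available", e);
                }
            }
        }
        return null;
    }
}
```

The extracted token would then go through the same validation as a header-supplied one before the handshake is allowed to complete.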
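Message encryption
Use WSS (WebSocket over TLS) in production:
// Build the TLS context once at startup from the server certificate and private key
// (certChainFile and privateKeyFile are java.io.File placeholders for your PEM files)
SslContext sslContext = SslContextBuilder.forServer(certChainFile, privateKeyFile).build();
// In initChannel, add the TLS handler first so that it sits in front of the HTTP codec
pipeline.addFirst(sslContext.newHandler(ch.alloc()));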
2. Preventing resource leaks
ByteBuf leak detection
// Development: track every buffer (highest overhead)
ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
// Production: sample a small fraction of buffers (Netty's default level)
ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.SIMPLE);
Graceful shutdown
public class WebSocketServer {
    private EventLoopGroup bossGroup;
    private EventLoopGroup workerGroup;

    public void shutdown() {
        // Shut down the boss group first so that no new connections are accepted
        bossGroup.shutdownGracefully();
        // Let in-flight work on existing connections finish
        workerGroup.shutdownGracefully();
        // Wait for both groups to terminate
        try {
            bossGroup.terminationFuture().sync();
            workerGroup.terminationFuture().sync();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
3. Monitoring
Key metrics
| Metric | Description | Alert threshold |
|---|---|---|
| Current connections | Connections online right now | > 900,000 |
| Message send rate | Messages sent per second | Abnormal fluctuation |
| Message receive rate | Messages received per second | Abnormal fluctuation |
| Message latency | End-to-end message latency | > 1 s |
| Connection failure rate | Share of handshakes that fail | > 1% |
| Memory usage | JVM heap usage | > 80% |
| GC frequency | GC runs per minute | > 10 |
| CPU usage | Server CPU usage | > 80% |
Metrics implementation
public class WebSocketMetrics {
    private final AtomicLong connectionCount = new AtomicLong(0);
    private final AtomicLong totalMessagesSent = new AtomicLong(0);
    private final AtomicLong totalMessagesReceived = new AtomicLong(0);

    public void recordConnection() {
        connectionCount.incrementAndGet();
    }

    public void recordDisconnection() {
        connectionCount.decrementAndGet();
    }

    public void recordMessageSent() {
        totalMessagesSent.incrementAndGet();
    }

    public void recordMessageReceived() {
        totalMessagesReceived.incrementAndGet();
    }

    public MetricsSnapshot getSnapshot() {
        return new MetricsSnapshot(
                connectionCount.get(),
                totalMessagesSent.get(),
                totalMessagesReceived.get()
        );
    }
}
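The metrics class records cumulative totals, while the table above alerts on per-second rates. A rate can be derived from two snapshots of a counter taken a known interval apart, as in the sketch below. The class name is hypothetical and the sample numbers are made up for illustration.

```java
// Hypothetical helper: turns two readings of a cumulative counter into a per-second rate.
class RateCalculator {

    /** Rate of change between two counter readings taken elapsedMillis apart. */
    static double perSecond(long earlierCount, long laterCount, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return (laterCount - earlierCount) * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // 3,000 additional messages sent over 60 seconds: prints 50.0
        System.out.println(perSecond(1_000, 4_000, 60_000));
    }
}
```

A scheduler that takes a snapshot every minute and feeds consecutive totals into this function yields the send and receive rates listed in the table.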
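4. Common problems
Problem 1: the connection count hits a ceiling
Symptom: the connection count stops growing and new connections are rejected
Causes:
- OS file descriptor limit
- Port exhaustion on the connecting side (load generators or proxies that use a single source IP)
- Insufficient memory
Solutions:
# Raise the file descriptor limit (affects the current shell and its children; persist it in limits.conf)
ulimit -n 1048576
# Widen the ephemeral port range on machines that open outbound connections
sysctl -w net.ipv4.ip_local_port_range="1024 65535"
# Add memory to the server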
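Problem 2: memory keeps growing
Symptom: memory grows steadily over long runs and GC becomes frequent
Causes:
- ByteBufs that are never released
- Connections that are never cleaned up
- Message backlog
Solutions:
// Use try-finally to guarantee that the buffer is released
ByteBuf buffer = ...;
try {
    // Use the buffer
} finally {
    ReferenceCountUtil.release(buffer);
}
// Purge dead connections periodically
scheduler.scheduleAtFixedRate(() -> {
    connectionManager.cleanInactiveConnections();
}, 1, 1, TimeUnit.MINUTES);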
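Problem 3: high message latency
Symptom: push latency is higher than expected
Causes:
- Network congestion
- Server overload
- Message queue backlog
Solutions:
// Send asynchronously without blocking, and log failures
channel.writeAndFlush(message).addListener(future -> {
    if (!future.isSuccess()) {
        log.error("Send failed", future.cause());
    }
});
// Size the message-processing thread pool with a bounded queue;
// CallerRunsPolicy slows producers down instead of dropping work
ExecutorService executor = new ThreadPoolExecutor(
        16, 32, 60, TimeUnit.SECONDS,
        new LinkedBlockingQueue<>(10000),
        new ThreadPoolExecutor.CallerRunsPolicy()
);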
Code Examples
Complete Netty WebSocket Server
public class WebSocketServer {
    private final int port;
    private EventLoopGroup bossGroup;
    private EventLoopGroup workerGroup;

    public WebSocketServer(int port) {
        this.port = port;
    }

    public void start() throws Exception {
        bossGroup = new NioEventLoopGroup(1);
        workerGroup = new NioEventLoopGroup(
                Runtime.getRuntime().availableProcessors() * 2
        );
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
                .channel(NioServerSocketChannel.class)
                .option(ChannelOption.SO_BACKLOG, 1024)
                .option(ChannelOption.SO_REUSEADDR, true)
                .childOption(ChannelOption.TCP_NODELAY, true)
                .childOption(ChannelOption.SO_KEEPALIVE, true)
                .childOption(ChannelOption.SO_RCVBUF, 32 * 1024)
                .childOption(ChannelOption.SO_SNDBUF, 32 * 1024)
                .childOption(ChannelOption.ALLOCATOR,
                        PooledByteBufAllocator.DEFAULT)
                .childHandler(new WebSocketServerInitializer());
            // Bind and block until the server channel is closed
            ChannelFuture f = b.bind(port).sync();
            System.out.println("WebSocket Server started on port: " + port);
            f.channel().closeFuture().sync();
        } finally {
            shutdown();
        }
    }

    public void shutdown() {
        if (bossGroup != null) {
            bossGroup.shutdownGracefully();
        }
        if (workerGroup != null) {
            workerGroup.shutdownGracefully();
        }
    }

    public static void main(String[] args) throws Exception {
        new WebSocketServer(8080).start();
    }
}
WebSocket Business Handler
// Create one instance per channel: the handler keeps per-connection handshake
// state, so it must not be annotated with @ChannelHandler.Sharable.
// This handler performs the handshake itself. If WebSocketServerProtocolHandler
// is also in the pipeline, that handler normally completes the handshake first
// and handleHttpRequest is not reached for upgrade requests, so use one
// approach or the other.
public class WebSocketServerHandler extends SimpleChannelInboundHandler<Object> {
    private static final Logger logger = LoggerFactory.getLogger(WebSocketServerHandler.class);
    private WebSocketServerHandshaker handshaker;
    private final ConnectionManager connectionManager;

    public WebSocketServerHandler(ConnectionManager connectionManager) {
        this.connectionManager = connectionManager;
    }

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, Object msg) {
        // HTTP handshake phase
        if (msg instanceof FullHttpRequest) {
            handleHttpRequest(ctx, (FullHttpRequest) msg);
            return;
        }
        // WebSocket message phase
        if (msg instanceof WebSocketFrame) {
            handleWebSocketFrame(ctx, (WebSocketFrame) msg);
        }
    }

    private void handleHttpRequest(ChannelHandlerContext ctx, FullHttpRequest req) {
        // Reject requests that could not be decoded
        if (!req.decoderResult().isSuccess()) {
            sendHttpResponse(ctx, req, new DefaultFullHttpResponse(
                    HttpVersion.HTTP_1_1, HttpResponseStatus.BAD_REQUEST));
            return;
        }
        // Build the handshake response (maximum frame payload: 8192 bytes)
        WebSocketServerHandshakerFactory wsFactory =
                new WebSocketServerHandshakerFactory(
                        getWebSocketLocation(req), null, true, 8192);
        handshaker = wsFactory.newHandshaker(req);
        if (handshaker == null) {
            WebSocketServerHandshakerFactory.sendUnsupportedVersionResponse(ctx.channel());
        } else {
            handshaker.handshake(ctx.channel(), req);
            // Handshake started successfully: register the connection
            String userId = extractUserId(req);
            connectionManager.addConnection(userId, ctx.channel());
        }
    }

    private void handleWebSocketFrame(ChannelHandlerContext ctx, WebSocketFrame frame) {
        // Close frame: complete the closing handshake
        if (frame instanceof CloseWebSocketFrame) {
            handshaker.close(ctx.channel(),
                    (CloseWebSocketFrame) frame.retain());
            return;
        }
        // Ping frame: answer with a pong and flush it immediately
        if (frame instanceof PingWebSocketFrame) {
            ctx.writeAndFlush(new PongWebSocketFrame(frame.content().retain()));
            return;
        }
        // Only text frames are supported
        if (!(frame instanceof TextWebSocketFrame)) {
            throw new UnsupportedOperationException(
                    String.format("%s frame types not supported",
                            frame.getClass().getName()));
        }
        String request = ((TextWebSocketFrame) frame).text();
        // Application-level heartbeat sent by the client example; it is not JSON,
        // so it must not reach the message parser
        if ("PING".equals(request)) {
            return;
        }
        logger.info("Received message: {}", request);
        // Handle the business message
        processMessage(ctx, request);
    }

    private void processMessage(ChannelHandlerContext ctx, String message) {
        // Parse the message
        Message msg = JSON.parseObject(message, Message.class);
        // Run the business logic
        String response = handleBusiness(msg);
        // Reply to the client
        ctx.channel().writeAndFlush(new TextWebSocketFrame(response));
    }

    private String handleBusiness(Message msg) {
        // Implement the actual business logic here
        return "Response: " + msg.getContent();
    }

    private static void sendHttpResponse(ChannelHandlerContext ctx,
            FullHttpRequest req, FullHttpResponse res) {
        // For error responses, put the status line into the body
        if (res.status().code() != 200) {
            ByteBuf buf = Unpooled.copiedBuffer(res.status().toString(),
                    CharsetUtil.UTF_8);
            res.content().writeBytes(buf);
            buf.release();
            HttpUtil.setContentLength(res, res.content().readableBytes());
        }
        ChannelFuture f = ctx.channel().writeAndFlush(res);
        if (!HttpUtil.isKeepAlive(req) || res.status().code() != 200) {
            f.addListener(ChannelFutureListener.CLOSE);
        }
    }

    private static String getWebSocketLocation(FullHttpRequest req) {
        String location = req.headers().get(HttpHeaderNames.HOST) + "/ws";
        return "ws://" + location;
    }

    private String extractUserId(FullHttpRequest req) {
        // Extract the user ID from the request (TokenUtil is application code, not shown)
        String token = req.headers().get("Authorization");
        return TokenUtil.parseUserId(token);
    }

    @Override
    public void channelInactive(ChannelHandlerContext ctx) throws Exception {
        // getUserIdByChannel is assumed to be a reverse lookup on ConnectionManager
        String userId = connectionManager.getUserIdByChannel(ctx.channel());
        if (userId != null) {
            connectionManager.removeConnection(userId);
            logger.info("User disconnected: {}", userId);
        }
        super.channelInactive(ctx);
    }

    @Override
    public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
        logger.error("WebSocket exception", cause);
        ctx.close();
    }
}
Client Example
// Current socket; reassigned on every reconnect so the heartbeat always uses the live one
let ws;

// Open the connection and attach all handlers
function connect() {
    ws = new WebSocket('ws://localhost:8080/ws');

    // Connection established
    ws.onopen = function() {
        console.log('WebSocket connected');
        // Send the authentication token as the first message
        ws.send(JSON.stringify({
            type: 'auth',
            token: 'user_token_here'
        }));
    };

    // Incoming message
    ws.onmessage = function(event) {
        console.log('Received message:', event.data);
        try {
            handleMessage(JSON.parse(event.data));
        } catch (e) {
            console.warn('Ignoring non-JSON message');
        }
    };

    // Connection closed
    ws.onclose = function(event) {
        console.log('WebSocket disconnected:', event.code, event.reason);
        // Reconnect automatically after 3 seconds
        setTimeout(function() {
            console.log('Reconnecting...');
            connect();
        }, 3000);
    };

    // Error handling
    ws.onerror = function(error) {
        console.error('WebSocket error:', error);
    };
}

connect();

// Heartbeat keep-alive
setInterval(function() {
    if (ws.readyState === WebSocket.OPEN) {
        ws.send('PING');
    }
}, 30000);

// Message dispatch
function handleMessage(message) {
    switch (message.type) {
        case 'notification':
            showNotification(message.content);
            break;
        case 'chat':
            updateChat(message);
            break;
        default:
            console.log('Unknown message type:', message.type);
    }
}
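The client example retries at a fixed 3-second interval. When a server restarts, hundreds of thousands of clients retrying in lockstep can overload it again, so reconnect delays are commonly grown exponentially up to a cap. The sketch below shows the delay calculation in Java, the document's main language; the arithmetic ports directly to JavaScript. The class name and the 60-second cap are assumptions, and real clients usually add random jitter on top.

```java
// Hypothetical reconnect policy: exponential backoff with an upper bound.
class ReconnectBackoff {

    /** Delay before reconnect attempt number `attempt` (0-based): base * 2^attempt, capped at maxMillis. */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        if (attempt < 0 || baseMillis <= 0 || maxMillis < baseMillis) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long delay = baseMillis;
        // Stop doubling once the cap is reached, which also prevents overflow
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt <= 5; attempt++) {
            // Prints 3000, 6000, 12000, 24000, 48000, 60000
            System.out.println(delayMillis(attempt, 3_000, 60_000));
        }
    }
}
```

The attempt counter would be reset to zero once a connection has opened successfully.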
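Summary
Key points of the WebSocket push rule:
- Protocol: use WebSocket instead of HTTP polling for efficient server push
- Framework: build the WebSocket server on Netty to take full advantage of its high-performance NIO design
- Performance: tune the operating system, the JVM, and Netty together to support connections on the order of a million
- Architecture: deploy as a cluster and combine it with a message queue for a scalable push architecture
- Stability: implement heartbeat detection, connection limiting, and graceful shutdown
- Operations: build thorough monitoring so that problems are found and resolved quickly
Following this rule makes it possible to build a high-performance, highly available real-time push system that meets the real-time communication needs of a large user base.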