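Using Redis from Java to write or read data is not hard to get started with. But every exception you hit and resolve teaches you a little more about how the Redis client works.
This case comes from a Java Redis client project that stores data through a pipeline. In testing, with only a few dozen records, it worked fine.
But once it had to store tens of millions or even hundreds of millions of records, every import failed with the same error: Connection reset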
Object pool = SpringUtils.getBean("jedisPool");
jedis = ((RoundRobinJedisPool) pool).getResource();
Pipeline pipeline = jedis.pipelined();
String deviceNum = "";
int count = 0;
while (rs.next()) {
    Map map = new HashMap();
    deviceNum = rs.getString(1);
    System.out.println(deviceNum);
    map.put("CUST_SEX", rs.getString(2));
    map.put("CUST_AGE_SPLIT", rs.getString(3));
    map.put("PROV_ID", rs.getString(4));
    // Queue a SET and an EXPIRE (30 days) for this row; neither reply is read here
    pipeline.set((keyStr+deviceNum).getBytes(), DeserializeUtil.serialize(map));
    pipeline.expire((keyStr+deviceNum).getBytes(), 60*60*24*30);
    count++;
}
// Replies are read only once, after every row has been queued
pipeline.sync();
jedis.close();
Error log:
redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketException: Connection reset
at redis.clients.jedis.Protocol.sendCommand(Protocol.java:98)
at redis.clients.jedis.Protocol.sendCommand(Protocol.java:78)
at redis.clients.jedis.Connection.sendCommand(Connection.java:101)
at redis.clients.jedis.BinaryClient.set(BinaryClient.java:97)
at redis.clients.jedis.PipelineBase.set(PipelineBase.java:503)
at com.bonc.RedisUtil.dataImport(RedisUtil.java:106)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.util.MethodInvoker.invoke(MethodInvoker.java:273)
at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean$MethodInvokingJob.executeInternal(MethodInvokingJobDetailFactoryBean.java:311)
at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:113)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:531)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at redis.clients.util.RedisOutputStream.flushBuffer(RedisOutputStream.java:52)
at redis.clients.util.RedisOutputStream.write(RedisOutputStream.java:74)
at redis.clients.util.RedisOutputStream.write(RedisOutputStream.java:65)
at redis.clients.jedis.Protocol.sendCommand(Protocol.java:94)
... 14 more
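A common explanation online for the Redis connection reset problem is that "a buffer ran out of space, which caused the socket exception".
In the exception above, the failing line is the call to pipeline.set(). Let's follow the source code to track the problem down.
(1) How jedis.set differs from pipeline.set, and why use pipeline.set
Every time the client sends a set, the Redis server sends back a response, and each jedis.set call can only start after the previous response has been received. If a large number of records have to be stored, they are written strictly one after another, which costs a lot of time (one network round trip per command) and ties up memory for longer. A pipeline solves this: it lets you issue set commands back to back without waiting for the previous response, and read all the responses together afterwards, so it is much faster.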
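As a rough illustration of the difference, the following sketch counts network round trips for the two approaches. It is only a back-of-the-envelope model: the class name RoundTripModel, the 0.5 ms round-trip time, and the batch size are assumptions, and it ignores bandwidth and server processing time.

```java
// Rough cost model only: ignores bandwidth, server processing time, and TCP effects.
class RoundTripModel {
    /** One request/response exchange per command: n round trips. */
    static long serialRoundTrips(long commands) {
        return commands;
    }

    /** Commands sent in batches, replies read once per batch: ceil(n / batchSize) round trips. */
    static long pipelinedRoundTrips(long commands, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return (commands + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        long commands = 10_000_000L;
        double rttMillis = 0.5; // assumed network round-trip time
        System.out.printf("serial:    %.0f ms waiting on the network%n",
                serialRoundTrips(commands) * rttMillis);
        System.out.printf("pipelined: %.0f ms waiting on the network%n",
                pipelinedRoundTrips(commands, 10_000L) * rttMillis);
    }
}
```

Under these assumed numbers, ten million serial commands spend about 5,000 seconds just waiting on the network, while the same commands sent in batches of 10,000 wait for about half a second.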
The reason for covering pipelines here is that this very advantage is what leads to the error above.
(2) What pipeline.set does
// PipelineBase.java
public Response<String> set(byte[] key, byte[] value) {
getClient(key).set(key, value);
return getResponse(BuilderFactory.STRING);
}
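The call first goes to the set method inherited from the parent class PipelineBase. Its first line obtains the Client instance, and the Client performs the actual set. With the String overloads, the key and value are first converted to bytes by SafeEncoder, which according to the source exists for compatibility with Java 1.5; the byte[] overload used here goes straight to BinaryClient, as the stack trace shows.
The command is then sent:
// BinaryClient.java (BinaryClient is the parent class of Client)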
public void set(final byte[] key, final byte[] value) {
sendCommand(Command.SET, key, value);
}
// Connection.java
protected Connection sendCommand(final Command cmd, final byte[]... args) {
try {
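connect(); // get the socket connection to the Redis server, along with its input and output streams
Protocol.sendCommand(outputStream, cmd, args); // send the command and its arguments (key, value, ...) to the server
pipelinedCommands++; // count the commands sent, i.e. the number of sets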
return this;
} catch (JedisConnectionException ex) {
/*
* When client send request which formed by invalid protocol, Redis send back error message
* before close connection. We try to read it to provide reason of failure.
*/
try {
String errorMessage = Protocol.readErrorLineIfPossible(inputStream);
if (errorMessage != null && errorMessage.length() > 0) {
ex = new JedisConnectionException(errorMessage, ex.getCause());
}
} catch (Exception e) {
/*
* Catch any IOException or JedisConnectionException occurred from InputStream#read and just
* ignore. This approach is safe because reading error message is optional and connection
* will eventually be closed.
*/
}
// Any other exceptions related to connection?
broken = true;
throw ex;
}
}
Request data is written through RedisOutputStream. The server sends its responses back over the same connection, and the client reads them through RedisInputStream.
What pipeline.sync() does is read those pending responses.
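To make the role of sync concrete, here is a toy model of a pipeline. It is not Jedis code: ToyPipeline, Reply, and the in-memory wire queue are made up for illustration, and the pretend server simply answers every set with OK. It only shows that each queued command gets a placeholder, and that the placeholders are filled in order when sync reads the replies.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model, not Jedis code: replies are matched to commands purely by order.
class ToyPipeline {
    /** Placeholder handed back when a command is queued; filled in later by sync(). */
    static final class Reply {
        private String value;
        private boolean ready;

        String get() {
            if (!ready) {
                throw new IllegalStateException("call sync() before reading the reply");
            }
            return value;
        }
    }

    private final Queue<Reply> pending = new ArrayDeque<>(); // placeholders awaiting replies
    private final Queue<String> wire = new ArrayDeque<>();   // stands in for unread server replies

    /** Queue a SET: the command is "sent" at once, but its reply is not read yet. */
    Reply set(String key, String value) {
        wire.add("OK"); // the pretend server answers every SET with OK
        Reply reply = new Reply();
        pending.add(reply);
        return reply;
    }

    /** Number of replies the server has produced that the client has not read. */
    int unread() {
        return wire.size();
    }

    /** Read every outstanding reply, in order, into its placeholder. */
    void sync() {
        while (!pending.isEmpty()) {
            Reply reply = pending.remove();
            reply.value = wire.remove();
            reply.ready = true;
        }
    }

    public static void main(String[] args) {
        ToyPipeline pipeline = new ToyPipeline();
        List<Reply> replies = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            replies.add(pipeline.set("key" + i, "value" + i));
        }
        System.out.println("unread before sync: " + pipeline.unread());
        pipeline.sync();
        System.out.println("unread after sync:  " + pipeline.unread());
        System.out.println("first reply: " + replies.get(0).get());
    }
}
```

In this model the unread count grows by one with every set and drops to zero only when sync is called.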
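Now for the key point. The code stores tens of millions of records, so set is called tens of millions of times, yet sync is only called after every record has been queued. During all those sets the responses keep piling up and nothing reads them. RedisInputStream is only filled when the client actually reads, so the backlog builds up on the connection itself: in the socket buffers and in the reply queue the server keeps for this client. Once that backlog can no longer be absorbed, the connection is reset, the next set can no longer be written, and the exception is thrown.
Solution:
Add a counter to the loop and call sync once every 10,000 rows (20,000 commands, since each row queues a set and an expire) to read and clear the pending responses. The final batch of fewer than 10,000 rows is handled by the sync after the loop. This keeps the backlog of unread responses bounded and avoids the overflow. It also matches the advice in the Redis pipelining documentation to send large workloads in batches of around 10k commands.
Modified code: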
Object pool = SpringUtils.getBean("jedisPool");
jedis = ((RoundRobinJedisPool) pool).getResource();
Pipeline pipeline = jedis.pipelined();
String deviceNum = "";
int count = 0;
while (rs.next()) {
    Map map = new HashMap();
    deviceNum = rs.getString(1);
    System.out.println(deviceNum);
    map.put("CUST_SEX", rs.getString(2));
    map.put("CUST_AGE_SPLIT", rs.getString(3));
    map.put("PROV_ID", rs.getString(4));
    pipeline.set((keyStr+deviceNum).getBytes(), DeserializeUtil.serialize(map));
    pipeline.expire((keyStr+deviceNum).getBytes(), 60*60*24*30);
    // Every 10000 rows, read the pending replies so they cannot pile up
    if (++count % 10000 == 0) {
        pipeline.sync();
    }
}
// Read the replies for the last, partial batch
pipeline.sync();
jedis.close();
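The same flush-every-N pattern can be pulled out into a small helper so the loop body stays focused on the data. This is only a sketch: BatchedWriter and its method names are hypothetical, and the flush callback stands in for pipeline.sync(). It counts queued commands, flushes when a full batch has accumulated, and records the largest backlog it ever saw.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the flush-every-N pattern; BatchedWriter and its names are hypothetical.
class BatchedWriter {
    private final int batchSize;
    private final Runnable flush; // stands in for pipeline.sync() in the import job
    private int pending;          // commands queued since the last flush
    private int maxPending;       // high-water mark, kept only for demonstration

    BatchedWriter(int batchSize, Runnable flush) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.flush = flush;
    }

    /** Record one queued command and flush once a full batch has accumulated. */
    void commandQueued() {
        pending++;
        maxPending = Math.max(maxPending, pending);
        if (pending >= batchSize) {
            finish();
        }
    }

    /** Flush whatever is left; call once after the loop. Does nothing if nothing is pending. */
    void finish() {
        if (pending > 0) {
            flush.run();
            pending = 0;
        }
    }

    int maxPending() {
        return maxPending;
    }

    public static void main(String[] args) {
        List<String> flushes = new ArrayList<>();
        BatchedWriter writer = new BatchedWriter(10_000, () -> flushes.add("sync"));
        for (int i = 0; i < 25_000; i++) {
            writer.commandQueued(); // in the import job: one call per queued pipeline command
        }
        writer.finish();
        System.out.println("flushes: " + flushes.size() + ", max pending: " + writer.maxPending());
    }
}
```

With 25,000 queued commands and a batch size of 10,000, the helper flushes three times and the backlog never exceeds one batch, which is the property the fix relies on.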
If anything here is wrong, corrections are welcome!