Symptom
The Storm project on a colleague's Windows machine failed to run, with the following log:
lowing topologys is going to re-download the jars, [test-1-1540444367]
- Downloading code for storm id test-1-1540444367 from C:\Users\Magnum\AppData\Local\Temp\\b4ea59e2-eb28-4249-b0b9-83242a88e179
- The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries [20]
- Starting
- Initiating client connection, connectString=localhost:2000/jstorm sessionTimeout=20000 watcher=shade.storm.org.apache.curator.ConnectionState@5e92efee
- Creating new blob store based in C:\Users\Magnum\AppData\Local\Temp\b4ea59e2-eb28-4249-b0b9-83242a88e179\blobs
- Opening socket connection to server 127.0.0.1/127.0.0.1:2000. Will not attempt to authenticate using SASL (unknown error)
- Accepted socket connection from /127.0.0.1:52231
- Socket connection established to 127.0.0.1/127.0.0.1:2000, initiating session
- Client attempting to establish new session at /127.0.0.1:52231
- list blob keys, size:2, cost:26
- list blob keys, size:2, cost:18
- Session establishment complete on server 127.0.0.1/127.0.0.1:2000, sessionid = 0x166a9a3349e0099, negotiated timeout = 20000
- Established session 0x166a9a3349e0099 with negotiated timeout 20000 for client /127.0.0.1:52231
- Processed session termination for sessionid: 0x166a9a3349e0099
- register metrics, topology:__NIMBUS__, size:6, cost:0
- register metrics, topology:__SUPERVISOR__, size:6, cost:0
- Session: 0x166a9a3349e0099 closed
- EventThread shut down
- caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x166a9a3349e0099, likely client has closed socket
at shade.storm.org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at shade.storm.org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
- Closed socket connection for client /127.0.0.1:52231 which had sessionid 0x166a9a3349e0099
- java.io.IOException: Unable to delete file: C:\Users\Magnum\AppData\Local\Temp\0887a5dc-bb32-4890-b6df-df9bb2230bb6\supervisor\tmp\03bc5b95-3d34-4ad5-8f46-a12953ef6b16\stormconf.ser downloadStormCode failed topologyId:test-1-1540444367 masterCodeDir:C:\Users\Magnum\AppData\Local\Temp\\b4ea59e2-eb28-4249-b0b9-83242a88e179
- The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries [20]
- Starting
- Initiating client connection, connectString=localhost:2000/jstorm sessionTimeout=20000 watcher=shade.storm.org.apache.curator.ConnectionState@11226326
- Creating new blob store based in C:\Users\Magnum\AppData\Local\Temp\b4ea59e2-eb28-4249-b0b9-83242a88e179\blobs
- Opening socket connection to server 127.0.0.1/127.0.0.1:2000. Will not attempt to authenticate using SASL (unknown error)
- Socket connection established to 127.0.0.1/127.0.0.1:2000, initiating session
- Accepted socket connection from /127.0.0.1:52234
- Client attempting to establish new session at /127.0.0.1:52234
- list blob keys, size:2, cost:19
- Established session 0x166a9a3349e009a with negotiated timeout 20000 for client /127.0.0.1:52234
- Session establishment complete on server 127.0.0.1/127.0.0.1:2000, sessionid = 0x166a9a3349e009a, negotiated timeout = 20000
- State change: CONNECTED
- list blob keys, size:2, cost:14
- Processed session termination for sessionid: 0x166a9a3349e009a
- Session: 0x166a9a3349e009a closed
- EventThread shut down
- Closed socket connection for client /127.0.0.1:52234 which had sessionid 0x166a9a3349e009a
- Copying resources at jar:file:/C:/Users/Magnum/.m2/repository/com/alibaba/jstorm/jstorm-core/2.2.1/jstorm-core-2.2.1.jar!/resources to C:\Users\Magnum\AppData\Local\Temp\\0887a5dc-bb32-4890-b6df-df9bb2230bb6\supervisor\stormdist\test-1-1540444367/resources
- java.io.FileNotFoundException: Source 'file:\C:\Users\Magnum\.m2\repository\com\alibaba\jstorm\jstorm-core\2.2.1\jstorm-core-2.2.1.jar!\resources' does not exist downloadStormCode failed topologyId:test-1-1540444367 masterCodeDir:C:\Users\Magnum\AppData\Local\Temp\\b4ea59e2-eb28-4249-b0b9-83242a88e179
- The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries [20]
- Starting
- Initiating client connection, connectString=localhost:2000/jstorm sessionTimeout=20000 watcher=shade.storm.org.apache.curator.ConnectionState@29488a6d
- Creating new blob store based in C:\Users\Magnum\AppData\Local\Temp\b4ea59e2-eb28-4249-b0b9-83242a88e179\blobs
- Opening socket connection to server 127.0.0.1/127.0.0.1:2000. Will not attempt to authenticate using SASL (unknown error)
- Socket connection established to 127.0.0.1/127.0.0.1:2000, initiating session
- Accepted socket connection from /127.0.0.1:52237
- Client attempting to establish new session at /127.0.0.1:52237
- list blob keys, size:2, cost:21
- list blob keys, size:2, cost:13
- Established session 0x166a9a3349e009b with negotiated timeout 20000 for client /127.0.0.1:52237
- Session establishment complete on server 127.0.0.1/127.0.0.1:2000, sessionid = 0x166a9a3349e009b, negotiated timeout = 20000
- Processed session termination for sessionid: 0x166a9a3349e009b
- Session: 0x166a9a3349e009b closed
- EventThread shut down
- Closed socket connection for client /127.0.0.1:52237 which had sessionid 0x166a9a3349e009b
- Copying resources at jar:file:/C:/Users/Magnum/.m2/repository/com/alibaba/jstorm/jstorm-core/2.2.1/jstorm-core-2.2.1.jar!/resources to C:\Users\Magnum\AppData\Local\Temp\\0887a5dc-bb32-4890-b6df-df9bb2230bb6\supervisor\stormdist\test-1-1540444367/resources
- java.io.FileNotFoundException: Source 'file:\C:\Users\Magnum\.m2\repository\com\alibaba\jstorm\jstorm-core\2.2.1\jstorm-core-2.2.1.jar!\resources' does not exist downloadStormCode failed topologyId:test-1-1540444367 masterCodeDir:C:\Users\Magnum\AppData\Local\Temp\\b4ea59e2-eb28-4249-b0b9-83242a88e179
- Cann't download code for storm id test-1-1540444367 from C:\Users\Magnum\AppData\Local\Temp\\b4ea59e2-eb28-4249-b0b9-83242a88e179
- Can't start this worker: 6900 about the topology: test-1-1540444367, due to the damaged binary !!
The exception stack trace is:
java.io.FileNotFoundException: Source 'file:\C:\Users\Magnum\.m2\repository\com\alibaba\jstorm\jstorm-core\2.2.1\jstorm-core-2.2.1.jar!\resources' does not exist
at shade.storm.org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1368)
at shade.storm.org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1261)
at shade.storm.org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1230)
at com.alibaba.jstorm.daemon.supervisor.SyncSupervisorEvent.downloadLocalStormCode(SyncSupervisorEvent.java:286)
at com.alibaba.jstorm.daemon.supervisor.SyncSupervisorEvent.downloadStormCode(SyncSupervisorEvent.java:234)
at com.alibaba.jstorm.daemon.supervisor.SyncSupervisorEvent.downloadTopology(SyncSupervisorEvent.java:481)
at com.alibaba.jstorm.daemon.supervisor.SyncSupervisorEvent.run(SyncSupervisorEvent.java:184)
at com.alibaba.jstorm.event.EventManagerImp.run(EventManagerImp.java:71)
at com.alibaba.jstorm.callback.AsyncLoopRunnable.run(AsyncLoopRunnable.java:95)
at java.lang.Thread.run(Thread.java:745)
Investigation
Start with the immediate cause. The supervisor downloads the topology code with the following logic:
private void downloadLocalStormCode(Map conf, String topologyId, String masterCodeDir) throws IOException, TException {
    // STORM_LOCAL_DIR/supervisor/tmp/(UUID)
    String tmproot = StormConfig.supervisorTmpDir(conf) + File.separator + UUID.randomUUID().toString();
    // STORM-LOCAL-DIR/supervisor/stormdist/storm-id
    String stormroot = StormConfig.supervisor_stormdist_root(conf, topologyId);
    BlobStore blobStore = null;
    try {
        blobStore = BlobStoreUtils.getNimbusBlobStore(conf, masterCodeDir, null);
        FileUtils.forceMkdir(new File(tmproot));
        blobStore.readBlobTo(StormConfig.master_stormcode_key(topologyId), new FileOutputStream(StormConfig.stormcode_path(tmproot)));
        blobStore.readBlobTo(StormConfig.master_stormconf_key(topologyId), new FileOutputStream(StormConfig.stormconf_path(tmproot)));
    } finally {
        if (blobStore != null)
            blobStore.shutdown();
    }
    File srcDir = new File(tmproot);
    File destDir = new File(stormroot);
    try {
        FileUtils.moveDirectory(srcDir, destDir);
    } catch (FileExistsException e) {
        FileUtils.copyDirectory(srcDir, destDir);
        FileUtils.deleteQuietly(srcDir);
    }
    ClassLoader classloader = Thread.currentThread().getContextClassLoader();
    String resourcesJar = resourcesJar();
    URL url = classloader.getResource(StormConfig.RESOURCES_SUBDIR);
    String targetDir = stormroot + '/' + StormConfig.RESOURCES_SUBDIR;
    if (resourcesJar != null) {
        // The branch that should be taken: extract the resources from the jar
        LOG.info("Extracting resources from jar at " + resourcesJar + " to " + targetDir);
        JStormUtils.extractDirFromJar(resourcesJar, StormConfig.RESOURCES_SUBDIR, stormroot);// extract dir
        // from jar;;
        // util.clj
    } else if (url != null) {
        // The branch that should not be taken: it tries to read the directory straight from the
        // file system, but the jar has not been unpacked and the path is not a real file path,
        // so the source can never be found
        LOG.info("Copying resources at " + url.toString() + " to " + targetDir);
        FileUtils.copyDirectory(new File(url.getFile()), (new File(targetDir)));
    }
}
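To see why the else-if branch cannot work when the resource lives inside a jar, the small sketch below builds a jar: URL by hand and looks at what url.getFile() returns. The class name JarUrlAsFile and the jar location are made up for illustration; only java.net.URL and java.io.File from the JDK are used.

```java
import java.io.File;
import java.net.URL;

public class JarUrlAsFile {
    // Returns the string that new File(url.getFile()) would be built from.
    // new URL(String) is deprecated in recent JDKs but still works; it is used
    // here because it is the simplest way to get a jar: URL without a real jar.
    @SuppressWarnings("deprecation")
    static String filePart(String spec) throws Exception {
        return new URL(spec).getFile();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical location, shaped like the URL in the log above.
        String spec = "jar:file:/C:/repo/jstorm-core-2.2.1.jar!/resources";
        String file = filePart(spec);
        // The "file part" still carries the inner file: scheme and the "!/" separator,
        // so it is not a path that the file system can resolve.
        System.out.println(file);
        System.out.println(new File(file).exists());
    }
}
```

The value handed to FileUtils.copyDirectory is therefore a string of the form file:...jar!/resources, which matches the "Source ... does not exist" message in the log.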
So the next step is to check why the resources jar was not found:
private String resourcesJar() {
    String path = System.getProperty("java.class.path");
    if (path == null) {
        return null;
    }
    String[] paths = path.split(File.pathSeparator);
    List<String> jarPaths = new ArrayList<String>();
    for (String s : paths) {
        if (s.endsWith(".jar")) {
            jarPaths.add(s);
        }
    }
    /**
     * FIXME, this place seems exist problem
     */
    List<String> rtn = new ArrayList<String>();
    int size = jarPaths.size();
    for (int i = 0; i < size; i++) {
        if (JStormUtils.zipContainsDir(jarPaths.get(i), StormConfig.RESOURCES_SUBDIR)) {
            rtn.add(jarPaths.get(i));
        }
    }
    if (rtn.size() == 0)
        return null;
    return rtn.get(0);
}
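Debugging showed that java.class.path contained just a single classpath file. This is a mechanism that some versions of IDEA use on Windows: because the operating system may limit the maximum length of a launch command, IDEA converts an overly long classpath into a single classpath.jar file. resourcesJar() only scans the jars listed in java.class.path, and that wrapper jar contains no resources directory, so the method returns null and the code falls through to the else-if branch. JStorm was evidently not designed with this case in mind, and its loading mechanism is not compatible with it.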
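As a rough illustration of what a classpath scan would need to do to cope with such a wrapper jar, the sketch below reads the Class-Path attribute from a jar's manifest. It assumes the wrapper jar lists the real entries in that manifest attribute, which is the usual way such jars work but is not confirmed by the logs above. The class ClasspathJarDemo and both method names are hypothetical and are not part of JStorm or IDEA.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ClasspathJarDemo {
    /** Returns the Class-Path manifest entries of a jar, or an empty list if there are none. */
    static List<String> manifestClassPath(File jar) throws IOException {
        List<String> entries = new ArrayList<>();
        try (JarFile jf = new JarFile(jar)) {
            Manifest mf = jf.getManifest();
            if (mf == null) {
                return entries;
            }
            String cp = mf.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
            if (cp == null) {
                return entries;
            }
            // Class-Path entries are separated by whitespace.
            for (String e : cp.trim().split("\\s+")) {
                if (!e.isEmpty()) {
                    entries.add(e);
                }
            }
        }
        return entries;
    }

    /** Writes a temporary jar that holds nothing but a manifest with the given Class-Path. */
    static File writeWrapperJar(String classPath) throws IOException {
        Manifest mf = new Manifest();
        // Without Manifest-Version the main attributes are not written out.
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        mf.getMainAttributes().put(Attributes.Name.CLASS_PATH, classPath);
        File jar = File.createTempFile("classpath", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar), mf)) {
            // No entries besides the manifest.
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical entries standing in for the real dependency jars.
        File wrapper = writeWrapperJar("file:/C:/libs/a.jar file:/C:/libs/b.jar");
        System.out.println(manifestClassPath(wrapper));
    }
}
```

A scan such as resourcesJar() sees only the wrapper jar itself, so unless it follows these manifest entries it never reaches jstorm-core-2.2.1.jar.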
Solution
In IDEA's workspace.xml configuration file, find
<property name="dynamic.classpath" value="true" />
and set the value to false.