Notes on problems encountered at work
Kafka exits abnormally:
Troubleshooting approach:
1. Check the Kafka broker log (the process log written by log4j, as opposed to the data directories configured by log.dirs):
2. The log reads as follows:
# pwd
/data/log/kafka
# ls -lh
total 78M
-rw-r--r-- 1 kafka kafka 78M Sep 18 14:18 kafka-broker-rc-nmg-kfk-rds-woasis1.log
drwxr-xr-x 2 kafka kafka 10 May 18 18:29 stacks
2017-09-16 08:16:00,397 INFO kafka.controller.RequestSendThread: [Controller-115-to-broker-115-send-thread], Controller 115 connected to rc-nmg-kfk-rds-woasis1:9092 (id: 115 rack: null) for sending state change requests
2017-09-16 08:16:28,180 ERROR kafka.network.Acceptor: Error while accepting connection
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241)
at kafka.network.Acceptor.accept(SocketServer.scala:332)
at kafka.network.Acceptor.run(SocketServer.scala:275)
at java.lang.Thread.run(Thread.java:745)
The error indicates too many open files. A Kafka broker keeps a file handle open for every log segment (and its index files) of every partition it hosts, on top of all client and replication sockets, so the usual default limit of 1024 descriptors is easily exhausted.
Fix: check the machine's file-descriptor limit:
# ulimit -n
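If the output is small (commonly 1024), raise it. A minimal sketch, assuming the broker runs as the kafka user and that the pattern kafka.Kafka matches its command line; the value 100000 is illustrative, not taken from the incident above. First see how many files the broker currently holds open:
# lsof -p $(pgrep -f kafka.Kafka) | wc -l
Then make a higher limit persistent and restart the broker so it picks up the new limit:
# echo 'kafka soft nofile 100000' >> /etc/security/limits.conf
# echo 'kafka hard nofile 100000' >> /etc/security/limits.conf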
On a CDH cluster, logged in as root, files on HDFS cannot be deleted:
The cause is a permission problem; the investigation steps and log are as follows:
# hadoop fs -ls /tmp
Found 3 items
drwxrwxrwx - hdfs supergroup 0 2017-09-18 11:07 /tmp/.cloudera_health_monitoring_canary_files
-rw-r--r-- 3 root supergroup 0 2017-09-18 10:55 /tmp/a.txt
drwx--x--x - hbase supergroup 0 2017-08-02 19:05 /tmp/hbase-staging
# hadoop fs -rm -r /tmp/a.txt
17/09/18 11:06:05 WARN fs.TrashPolicyDefault: Can't create trash directory: hdfs://nameservice1/user/root/.Trash/Current/tmp
org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:281)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:262)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:242)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:169)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6632)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6614)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6566)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4359)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4329)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4302)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:869)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:323)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:608)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3104)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:3069)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:957)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:953)
at org.apache.hadoo
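The trace shows what is going on: hadoop fs -rm does not delete outright but first tries to move the file into the invoking user's trash (hdfs://nameservice1/user/root/.Trash/Current/tmp), and creating that directory needs WRITE access on /, which is owned by hdfs:supergroup with mode drwxr-xr-x. The OS root user carries no special privilege inside HDFS, so the mkdirs call is denied.
A sketch of two ways around it, assuming the HDFS superuser account is named hdfs (the CDH default). Either give root a home directory it can write to, then retry the delete:
# sudo -u hdfs hadoop fs -mkdir -p /user/root
# sudo -u hdfs hadoop fs -chown root:root /user/root
# hadoop fs -rm -r /tmp/a.txt
Or bypass the trash for this one deletion:
# hadoop fs -rm -r -skipTrash /tmp/a.txt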