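ai2thor is installed on a server. I ssh into the server and run a program that spawns multiple processes, each of which calls ai2thor for simulation.
The following message appears, and then the program hangs indefinitely.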
/usr/lib/xorg/Xorg.wrap: Only console users are allowed to run the X server
After killing the program with Ctrl+C, the tracebacks show that every process was stuck in the open call in fifo_server.py:
controller = ai2thor.controller.Controller(**controller_kwargs)
File "/home/zzy/.conda/envs/cow/lib/python3.7/site-packages/ai2thor/controller.py", line 498, in __init__
host=host,
File "/home/zzy/.conda/envs/cow/lib/python3.7/site-packages/ai2thor/controller.py", line 1299, in start
self.last_event = self.server.receive()
File "/home/zzy/.conda/envs/cow/lib/python3.7/site-packages/ai2thor/fifo_server.py", line 182, in receive
metadata, files = self._recv_message()
File "/home/zzy/.conda/envs/cow/lib/python3.7/site-packages/ai2thor/fifo_server.py", line 103, in _recv_message
self.server_pipe = open(self.server_pipe_path, "rb")
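A likely reading of this traceback (an assumption, not something confirmed from the ai2thor source): the controller talks to the simulator process through named pipes (FIFOs), and on POSIX systems opening a FIFO for reading blocks until some other process opens it for writing. If the simulator never starts, for example because it has no usable X display, the open call would wait forever. The sketch below reproduces only that blocking behaviour with a throwaway FIFO; the function name and the pipe path are made up for the demonstration.

```python
import os
import shutil
import tempfile
import threading


def demo_fifo_read_blocks(wait_s=0.5):
    """Show that opening a FIFO for reading blocks until a writer appears.

    Returns (blocked_before_writer, data_received).
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo_pipe")  # throwaway path, not ai2thor's
    os.mkfifo(path)
    received = []

    def reader():
        # Same kind of call as in the traceback: blocks until a writer opens.
        with open(path, "rb") as pipe:
            received.append(pipe.read())

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()
    thread.join(wait_s)
    blocked = thread.is_alive()  # still waiting: nobody opened the write end

    # Opening the write end releases the reader; closing it signals EOF.
    with open(path, "wb") as pipe:
        pipe.write(b"ready")
    thread.join(5)

    shutil.rmtree(tmpdir)
    return blocked, (received[0] if received else None)


if __name__ == "__main__":
    print(demo_fifo_read_blocks())
```

In the hang described here there is no writer at all, so the reader would stay in the blocked state until interrupted.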
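Searching for the function it hangs in turned up no solution, so I tried initializing a Controller directly in the Python interpreter, which produced the following error.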
>>> from ai2thor.controller import Controller
>>> c=Controller()
/home/zzy/.conda/envs/cow/lib/python3.7/site-packages/ai2thor/platform.py:155: UserWarning: could not connect to X Display: 0, Can't connect to display ":0": b'No protocol specified\n'
"could not connect to X Display: %s, %s" % (display_str, e)
/home/zzy/.conda/envs/cow/lib/python3.7/site-packages/ai2thor/platform.py:155: UserWarning: could not connect to X Display: 1, Can't connect to display ":1": b'No protocol specified\n'
"could not connect to X Display: %s, %s" % (display_str, e)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/zzy/.conda/envs/cow/lib/python3.7/site-packages/ai2thor/controller.py", line 487, in __init__
self._build = self.find_build(local_build, commit_id, branch, platform)
File "/home/zzy/.conda/envs/cow/lib/python3.7/site-packages/ai2thor/controller.py", line 1214, in find_build
raise Exception("\n".join(error_messages))
Exception: The following builds were found, but had missing dependencies. Only one valid platform is required to run AI2-THOR.
Platform Linux64 failed validation with the following errors: No valid X display found
Linux64 requires a X11 server to be running with GLX. If you have a NVIDIA GPU, please run: sudo ai2thor-xorg start
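What I found suggested that the DISPLAY environment variable needs to be set. echo $DISPLAY confirmed that it was indeed unset, but trying :0, :1, :2 and so on still failed, although with additional error information: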
For :0 and :1, which are the display numbers of graphical desktop sessions already logged in on the server (by other accounts), the message is:
Platform Linux64 failed validation with the following errors: Invalid display: :0. Failed to connect Can't connect to display ":0": b'No protocol specified\n'
For :2, a display number with no graphical desktop behind it, the message is:
Platform Linux64 failed validation with the following errors: Invalid display: :2. Failed to connect Can't connect to display ":2": [Errno 111] Connection refused
Searching for these new messages did not turn up any useful solution either.
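The two messages do differ in a way that helps narrow things down. "No protocol specified" is what a client sees when an X server is running but rejects the connection, typically because the connecting user is not authorized for that session (here the desktops belong to other accounts). "Connection refused" means nothing is listening on that display number at all. A small helper, purely illustrative and not part of ai2thor, can make that distinction explicit in logs:

```python
def diagnose_display_error(message):
    """Map an X connection error message to a coarse, likely cause.

    Hypothetical helper for log triage; the categories are a simplification.
    """
    text = message.lower()
    if "no protocol specified" in text:
        # Server exists but refused this client (authorization problem).
        return "unauthorized"
    if "connection refused" in text:
        # No X server is listening on that display number.
        return "no_server"
    return "unknown"


if __name__ == "__main__":
    samples = [
        "Can't connect to display \":0\": b'No protocol specified\\n'",
        "Can't connect to display \":2\": [Errno 111] Connection refused",
    ]
    for sample in samples:
        print(diagnose_display_error(sample), "<-", sample)
```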
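I suspected that ssh needed X11 forwarding, so that a graphical interface would be available for the Controller to start, but the network would not support forwarding. So, following the ai2thor website, I got ready to try CloudRendering. While waiting for the download, it occurred to me that ai2thor itself might be misconfigured, so I decided to log in directly on the server in the machine room and try again. It did run there. I also checked the $DISPLAY variable and found it was :2.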
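Without logging out of that desktop session, I connected to the server over ssh again and set DISPLAY to :2. This time the Controller initialized successfully, and the ai2thor simulation window showed up on the desktop.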
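PS1: I later found that if neither the DISPLAY environment variable nor the x_display argument is set, initialization automatically iterates over all existing displays. If one of them works, it is selected and used; if none does, an error is raised.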
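The display scan described in PS1 can be approximated by hand when debugging. The sketch below assumes the common Linux convention that each local X server exposes a Unix socket named X<N> under /tmp/.X11-unix; it is not ai2thor's own implementation, and list_x_displays is a hypothetical helper name. A display appearing in the list only means that a server exists, not that the current user is allowed to connect to it.

```python
import os
import re


def list_x_displays(socket_dir="/tmp/.X11-unix"):
    """Return display strings such as ':0' for each X socket in socket_dir."""
    try:
        names = os.listdir(socket_dir)
    except FileNotFoundError:
        # No socket directory: no local X server has been started.
        return []
    numbers = []
    for name in names:
        match = re.fullmatch(r"X(\d+)", name)
        if match:
            numbers.append(int(match.group(1)))
    # Sort numerically so that ':10' comes after ':2'.
    return [":%d" % number for number in sorted(numbers)]


if __name__ == "__main__":
    print(list_x_displays())
```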
PS2: Setting DISPLAY to an empty string also makes it hang in the open call.
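PS3: When running ai2thor in multiple processes, a failed initialization does not seem to crash the process. Instead it stays stuck in the open call with no message at all.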
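Since a failed initialization can hang silently instead of raising, it may help to probe initialization in a separate interpreter with a timeout before launching the worker processes. The following is a minimal sketch using only the standard library; run_python_with_timeout is a hypothetical helper, and the ai2thor probe string in the trailing comment assumes the x_display argument mentioned in this post.

```python
import subprocess
import sys


def run_python_with_timeout(code, timeout_s):
    """Run a Python snippet in a child interpreter.

    Returns (status, output) where status is 'ok', 'error' or 'timeout'.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    status = "ok" if result.returncode == 0 else "error"
    return status, result.stdout


# Possible probe (assumption: adjust the display value to your own setup):
# probe = "from ai2thor.controller import Controller; Controller(x_display=':2').stop()"
# status, output = run_python_with_timeout(probe, 120)
```

A 'timeout' status would then correspond to the silent hang, while 'error' carries the validation message in the captured output. Note that only the direct child is killed on timeout; any process it spawned may need separate cleanup.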
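So, to run ai2thor on a server over ssh, one thing to try is to first open a graphical session on the server and check which display it uses (echo $DISPLAY). Then, after logging in over ssh, either set DISPLAY to that value, or pass it when initializing the Controller, for example x_display=":2" (use the matching display value).
Alternatively, X11 forwarding or CloudRendering may be worth a try.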
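A small sketch of the display selection described above. resolve_display and make_controller are hypothetical helper names; the only ai2thor call used is Controller(x_display=...), as in this post. An empty DISPLAY is treated as unset, because per PS2 an empty string leads to the same hang.

```python
import os


def resolve_display(explicit=None, env=None):
    """Pick the display: explicit argument first, then a non-empty DISPLAY."""
    env = os.environ if env is None else env
    candidate = explicit if explicit else env.get("DISPLAY", "")
    return candidate if candidate else None


def make_controller(display=None, **kwargs):
    """Create a Controller on a known display, failing fast if none is set."""
    # Imported lazily so that resolve_display works without ai2thor installed.
    from ai2thor.controller import Controller

    resolved = resolve_display(display)
    if resolved is None:
        raise RuntimeError(
            "No X display configured; set DISPLAY or pass one, e.g. ':2'"
        )
    return Controller(x_display=resolved, **kwargs)
```

Usage would look like make_controller(":2"), or just make_controller() after export DISPLAY=:2 in the ssh session.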