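While learning Python's select module yesterday I ran into something really strange that I still haven't figured out, so I'm writing it down here.
The server is a simple echo server. The relevant code: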
#coding=utf-8
import select
import socket
import Queue
import time
import os

# Create the listening socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setblocking(False)
# Socket options
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server_address = ('127.0.0.1', 9999)
server.bind(server_address)
server.listen(1000)

inputs = [server]
outputs = []
message_queues = {}
f = open("a.txt", "w")
#timeout = 20
while inputs:
    print "waiting for next event"
    # readable, writable, exceptional = select.select(inputs, outputs, inputs, timeout)
    # The last argument is a timeout: how long select() waits for any socket to become ready
    readable, writable, exceptional = select.select(inputs, outputs, inputs)
    # When the timeout is reached, select returns three empty lists
    if not (readable or writable or exceptional):
        print "Time out ! "
        break
    for s in readable:
        if s is server:
            # The listening socket is readable: a client is waiting to be accepted
            connection, client_address = s.accept()
            print " connection from ", client_address
            connection.setblocking(0)
            inputs.append(connection)
            message_queues[connection] = Queue.Queue()
        else:
            print "read"
            data = s.recv(1024)
            # NOTE: client_address is whatever the last accept() returned, so the
            # address printed in the branches below may belong to a different socket than s
            if data:
                print " received ", data, "from ", s.getpeername()
                message_queues[s].put(data)
                # Add output channel for response
                if s not in outputs:
                    print 'append outputs', client_address
                    outputs.append(s)
            else:
                # Interpret empty result as closed connection
                print " closing", client_address
                if s in outputs:
                    print 'remove outputs', client_address
                    outputs.remove(s)
                print 'remove inputs', client_address
                inputs.remove(s)
                s.close()
                # Drop the queue for this connection
                print "del message_queues", client_address
                del message_queues[s]
    for s in writable:
        print "write:\t", s.getpeername()
        try:
            next_msg = message_queues[s].get_nowait()
        except Queue.Empty:
            print " ", s.getpeername(), 'queue empty remove output'
            outputs.remove(s)
            # (my guess) a client disconnect runs outputs.remove(s), and that triggers select again
        # except KeyError:
        #     print "message_queues[s] removed"
        #     # f.write("%s\n" % len(message_queues))
        else:
            print " sending ", next_msg, " to ", s.getpeername()
            # os.popen('sleep 0.1').read()
            # (my guess) this operation also triggers select again
            s.send(next_msg)
    for s in exceptional:
        print " exception condition on ", s.getpeername()
        # Stop listening for input on the connection
        inputs.remove(s)
        if s in outputs:
            outputs.remove(s)
        s.close()
        # Drop the queue for this connection
        del message_queues[s]
    print len(outputs)
f.close()
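The commented-out timeout variant can be checked in isolation. The following is a minimal sketch in Python 3 (the server above is Python 2), using socket.socketpair() as a stand-in for a real connection; the helper name idle_select is made up for illustration. With nothing to read, select() waits for the timeout and then returns three empty lists. The timeout only bounds the wait and does not close any connection by itself.

```python
import select
import socket


def idle_select(timeout):
    """Wait on an idle socket and return what select() reports (hypothetical helper)."""
    a, b = socket.socketpair()  # stand-in for a connected client/server pair
    try:
        # Nothing has been sent to `a`, so it never becomes readable here.
        return select.select([a], [], [a], timeout)
    finally:
        a.close()
        b.close()


readable, writable, exceptional = idle_select(0.1)
if not (readable or writable or exceptional):
    print("timed out, nothing was ready")
```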
The client hits the server from 200 threads at once. Code:
#!/usr/bin/env python
#-*- encoding:utf-8 -*-
import socket
import threading

sockIndex = 1

def connToServer():
    global sockIndex
    # Connect to 127.0.0.1:9999 and send a message
    conn = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    conn.connect(("127.0.0.1", 9999))
    print "hi,I'm NO." + str(sockIndex)
    conn.send("hi,I'm NO." + str(sockIndex))
    # print sockIndex
    sockIndex = sockIndex + 1
    # Wait for the server's reply, print it, and return
    rev = conn.recv(1024)
    print 'get server msg:' + str(rev)
    # conn is not closed explicitly; CPython closes it when the function returns

threads = []
times = 200
# Start the clients concurrently
for i in range(0, times):
    # Pass the function itself: target=connToServer() would call it right here,
    # in the main thread, one connection at a time
    t = threading.Thread(target=connToServer)
    threads.append(t)
for i in range(0, times):
    threads[i].start()
for i in range(0, times):
    threads[i].join()
connToServer()
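One detail of threading.Thread matters for a client like this: target expects the callable itself. Writing target=func() runs func right away in the calling thread and hands its return value (None here) to Thread, so start() has nothing left to run. A small Python 3 sketch of the difference, with made-up names (ran_in_caller, work):

```python
import threading


def ran_in_caller(pass_callable, count=3):
    """Return, for each call of work(), whether it ran in the thread that called this function."""
    caller = threading.get_ident()
    results = []

    def work():
        # list.append is atomic in CPython, so this sketch needs no lock
        results.append(threading.get_ident() == caller)

    if pass_callable:
        threads = [threading.Thread(target=work) for _ in range(count)]
    else:
        # work() executes here, immediately; Thread receives target=None
        threads = [threading.Thread(target=work()) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(ran_in_caller(pass_callable=False))  # every call ran in the caller's thread
print(ran_in_caller(pass_callable=True))   # every call ran in a worker thread
```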
Running this produces an error. The traceback points to:
Traceback (most recent call last):
  File "select.py", line 62, in <module>
    next_msg = message_queues[s].get_nowait()
KeyError: <socket._socketobject object at 0x7f09cf823910>
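One way a KeyError like this can arise, offered as a sketch rather than a diagnosis of the run above: a single select() call can report the same socket as both readable and writable. If the readable branch sees an empty read, treats it as a disconnect and deletes the queue, the writable branch of that same pass still holds the socket. The Python 3 snippet below reproduces this with socket.socketpair() in place of a TCP client and guards the writable branch; one_pass and the event strings are hypothetical names.

```python
import queue
import select
import socket


def one_pass(sock, message_queues):
    """Run one select pass that tolerates cleanup done earlier in the same pass."""
    readable, writable, _ = select.select([sock], [sock], [], 1.0)
    events = []
    for s in readable:
        if not s.recv(1024):         # b"" means the peer has closed
            del message_queues[s]    # same cleanup as in the server
            events.append("closed")
    for s in writable:
        if s not in message_queues:  # already cleaned up during this pass
            events.append("skipped stale writable")
            continue
        events.append("would send")
    return events


a, b = socket.socketpair()
b.close()                            # the "client" goes away before the pass
queues = {a: queue.Queue()}
print(one_pass(a, queues))
a.close()
```

Without the membership check, the writable branch would look up a key that the readable branch had just deleted.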
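Falling back on the crude approach of adding print statements, I found that under normal conditions, after a send, select reports the socket as writable again and the queue is read once more, until it is empty.
My guess was that connections arrive too fast: after the first send, a new connection comes in before the second pass over the output can happen. Once the new connection has been handled, the server goes back to the queue read it had not got around to, but by then the queue has been deleted, hence the error. Adding an except KeyError clause made the error go away. But then I noticed another problem: when the client connects concurrently from threads, recv returns empty right away, the logic that follows treats that as a disconnect and closes the socket, and yet select fires again immediately afterwards, and only then does recv get the data: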
waiting for next event
connection from ('127.0.0.1', 51852)
read
closing ('127.0.0.1', 51852)
remove inputs ('127.0.0.1', 51852)
del message_queues ('127.0.0.1', 51852)
0
waiting for next event
read
received hi,I'm NO.106 from ('127.0.0.1', 51852)
append outputs ('127.0.0.1', 51852)
1
waiting for next event
write:
sending hi,I'm NO.106 to ('127.0.0.1', 51852)
1
waiting for next event
write:
('127.0.0.1', 51852) queue empty remove output
0
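I read that a non-blocking socket in Python raises an exception immediately if recv has no data to return, but no exception is raised here, so that is probably not the cause. Judging from the code and the execution trace, no exception was thrown; execution simply continued, found no data, closed the socket, select fired again, and then the data was read. But this account has a hole too (see the logic above): the first pass already deleted the queue in message_queues, yet the later lookup of message_queues[s] raised nothing, which leaves the KeyError above unexplained.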
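The two recv outcomes discussed here can be told apart in a few lines. This is a minimal Python 3 sketch (in Python 2 the no-data case surfaces as socket.error with EAGAIN/EWOULDBLOCK rather than BlockingIOError); recv_outcome is a made-up helper and socket.socketpair() stands in for the TCP connection. An exception means nothing has arrived yet, while an empty result means the peer has closed its end.

```python
import socket


def recv_outcome(sock):
    """Classify a single non-blocking recv() call (hypothetical helper)."""
    try:
        data = sock.recv(1024)
    except BlockingIOError:
        return "no data yet"       # the socket is open, nothing has arrived
    if data == b"":
        return "peer closed"       # orderly shutdown by the other side
    return "data"


a, b = socket.socketpair()
a.setblocking(False)
print(recv_outcome(a))             # nothing has been sent so far
b.sendall(b"hi")
print(recv_outcome(a))             # the bytes sent above are waiting
b.close()
print(recv_outcome(a))             # the peer is gone
a.close()
```

In a select loop, recv is normally called only after select has reported the socket as readable, so the exception case is rare there and an empty result is the usual sign of a disconnect.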
Sigh, this is all such a mess. I have no idea what is going on here.