Python Select 解析

金角大王發表於2015-03-27

首先列一下,sellect、poll、epoll三者的區別 
select 
select最早於1983年出現在4.2BSD中,它通過一個select()系統呼叫來監視多個檔案描述符的陣列,當select()返回後,該陣列中就緒的檔案描述符便會被核心修改標誌位,使得程式可以獲得這些檔案描述符從而進行後續的讀寫操作。

select目前幾乎在所有的平臺上支援,其良好跨平臺支援也是它的一個優點,事實上從現在看來,這也是它所剩不多的優點之一。

select的一個缺點在於單個程式能夠監視的檔案描述符的數量存在最大限制,在Linux上一般為1024,不過可以通過修改巨集定義甚至重新編譯核心的方式提升這一限制。

另外,select()所維護的儲存大量檔案描述符的資料結構,隨著檔案描述符數量的增大,其複製的開銷也線性增長。同時,由於網路響應時間的延遲使得大量TCP連線處於非活躍狀態,但呼叫select()會對所有socket進行一次線性掃描,所以這也浪費了一定的開銷。

poll 
poll在1986年誕生於System V Release 3,它和select在本質上沒有多大差別,但是poll沒有最大檔案描述符數量的限制。

poll和select同樣存在一個缺點就是,包含大量檔案描述符的陣列被整體複製於使用者態和核心的地址空間之間,而不論這些檔案描述符是否就緒,它的開銷隨著檔案描述符數量的增加而線性增大。

另外,select()和poll()將就緒的檔案描述符告訴程式後,如果程式沒有對其進行IO操作,那麼下次呼叫select()和poll()的時候將再次報告這些檔案描述符,所以它們一般不會丟失就緒的訊息,這種方式稱為水平觸發(Level Triggered)。

epoll 
直到Linux2.6才出現了由核心直接支援的實現方法,那就是epoll,它幾乎具備了之前所說的一切優點,被公認為Linux2.6下效能最好的多路I/O就緒通知方法。

epoll可以同時支援水平觸發和邊緣觸發(Edge Triggered,只告訴程式哪些檔案描述符剛剛變為就緒狀態,它只說一遍,如果我們沒有采取行動,那麼它將不會再次告知,這種方式稱為邊緣觸發),理論上邊緣觸發的效能要更高一些,但是程式碼實現相當複雜。

epoll同樣只告知那些就緒的檔案描述符,而且當我們呼叫epoll_wait()獲得就緒檔案描述符時,返回的不是實際的描述符,而是一個代表就緒描述符數量的值,你只需要去epoll指定的一個陣列中依次取得相應數量的檔案描述符即可,這裡也使用了記憶體對映(mmap)技術,這樣便徹底省掉了這些檔案描述符在系統呼叫時複製的開銷。

另一個本質的改進在於epoll採用基於事件的就緒通知方式。在select/poll中,程式只有在呼叫一定的方法後,核心才對所有監視的檔案描述符進行掃描,而epoll事先通過epoll_ctl()來註冊一個檔案描述符,一旦基於某個檔案描述符就緒時,核心會採用類似callback的回撥機制,迅速啟用這個檔案描述符,當程式呼叫epoll_wait()時便得到通知。

 

 

Python select 

Python的select()方法直接呼叫作業系統的IO介面,它監控sockets,open files, and pipes(所有帶fileno()方法的檔案控制程式碼)何時變成readable 和writeable, 或者通訊錯誤,select()使得同時監控多個連線變的簡單,並且這比寫一個長迴圈來等待和監控多客戶端連線要高效,因為select直接通過作業系統提供的C的網路介面進行操作,而不是通過Python的直譯器。

注意:Using Python’s file objects with select() works for Unix, but is not supported under Windows.

接下來通過echo server例子要以瞭解select 是如何通過單程式實現同時處理多個非阻塞的socket連線的

import select
import socket
import sys
import Queue

# Create a TCP/IP socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setblocking(0)

# Bind the socket to the port
server_address = ('localhost', 10000)
print >>sys.stderr, 'starting up on %s port %s' % server_address
server.bind(server_address)

# Listen for incoming connections
server.listen(5)

 

select()方法接收並監控3個通訊列表, 第一個是所有的輸入的data,就是指外部發過來的資料,第2個是監控和接收所有要發出去的data(outgoing data),第3個監控錯誤資訊,接下來我們需要建立2個列表來包含輸入和輸出資訊來傳給select().

# Sockets from which we expect to read
inputs = [ server ]

# Sockets to which we expect to write
outputs = [ ]

 

所有客戶端的進來的連線和資料將會被server的主迴圈程式放在上面的list中處理,我們現在的server端需要等待連線可寫(writable)之後才能過來,然後接收資料並返回(因此不是在接收到資料之後就立刻返回),因為每個連線要把輸入或輸出的資料先快取到queue裡,然後再由select取出來再發出去。

Connections are added to and removed from these lists by the server main loop. Since this version of the server is going to wait for a socket to become writable before sending any data (instead of immediately sending the reply), each output connection needs a queue to act as a buffer for the data to be sent through it.

# Outgoing message queues (socket:Queue)
message_queues = {}

The main portion of the server program loops, calling select() to block and wait for network activity.

下面是此程式的主迴圈,呼叫select()時會阻塞和等待直到新的連線和資料進來

while inputs:

    # Wait for at least one of the sockets to be ready for processing
    print >>sys.stderr, '\nwaiting for the next event'
    readable, writable, exceptional = select.select(inputs, outputs, inputs)

 當你把inputs,outputs,exceptional(這裡跟inputs共用)傳給select()後,它返回3個新的list,我們上面將他們分別賦值為readable,writable,exceptional, 所有在readable list中的socket連線代表有資料可接收(recv),所有在writable list中的存放著你可以對其進行傳送(send)操作的socket連線,當連線通訊出現error時會把error寫到exceptional列表中。

select() returns three new lists, containing subsets of the contents of the lists passed in. All of the sockets in the readable list have incoming data buffered and available to be read. All of the sockets in the writable list have free space in their buffer and can be written to. The sockets returned in exceptional have had an error (the actual definition of “exceptional condition” depends on the platform).

 

Readable list 中的socket 可以有3種可能狀態,第一種是如果這個socket是main "server" socket,它負責監聽客戶端的連線,如果這個main server socket出現在readable裡,那代表這是server端已經ready來接收一個新的連線進來了,為了讓這個main server能同時處理多個連線,在下面的程式碼裡,我們把這個main server的socket設定為非阻塞模式。

The “readable” sockets represent three possible cases. If the socket is the main “server” socket, the one being used to listen for connections, then the “readable” condition means it is ready to accept another incoming connection. In addition to adding the new connection to the list of inputs to monitor, this section sets the client socket to not block.

    # Handle inputs
    for s in readable:

        if s is server:
            # A "readable" server socket is ready to accept a connection
            connection, client_address = s.accept()
            print >>sys.stderr, 'new connection from', client_address
            connection.setblocking(0)
            inputs.append(connection)

            # Give the connection a queue for data we want to send
            message_queues[connection] = Queue.Queue()

 

第二種情況是這個socket是已經建立了的連線,它把資料發了過來,這個時候你就可以通過recv()來接收它發過來的資料,然後把接收到的資料放到queue裡,這樣你就可以把接收到的資料再傳回給客戶端了。

The next case is an established connection with a client that has sent data. The data is read with recv(), then placed on the queue so it can be sent through the socket and back to the client.

       else:
            data = s.recv(1024)
            if data:
                # A readable client socket has data
                print >>sys.stderr, 'received "%s" from %s' % (data, s.getpeername())
                message_queues[s].put(data)
                # Add output channel for response
                if s not in outputs:
                    outputs.append(s)

第三種情況就是這個客戶端已經斷開了,所以你再通過recv()接收到的資料就為空了,所以這個時候你就可以把這個跟客戶端的連線關閉了。

A readable socket without data available is from a client that has disconnected, and the stream is ready to be closed.

            else:
                # Interpret empty result as closed connection
                print >>sys.stderr, 'closing', client_address, 'after reading no data'
                # Stop listening for input on the connection
                if s in outputs:
                    outputs.remove(s)  #既然客戶端都斷開了,我就不用再給它返回資料了,所以這時候如果這個客戶端的連線物件還在outputs列表中,就把它刪掉
                inputs.remove(s)    #inputs中也刪除掉
                s.close()           #把這個連線關閉掉

                # Remove message queue
                del message_queues[s]   

  

對於writable list中的socket,也有幾種狀態,如果這個客戶端連線在跟它對應的queue裡有資料,就把這個資料取出來再發回給這個客戶端,否則就把這個連線從output list中移除,這樣下一次迴圈select()呼叫時檢測到outputs list中沒有這個連線,那就會認為這個連線還處於非活動狀態

There are fewer cases for the writable connections. If there is data in the queue for a connection, the next message is sent. Otherwise, the connection is removed from the list of output connections so that the next time through the loop select() does not indicate that the socket is ready to send data.

    # Handle outputs
    for s in writable:
        try:
            next_msg = message_queues[s].get_nowait()
        except Queue.Empty:
            # No messages waiting so stop checking for writability.
            print >>sys.stderr, 'output queue for', s.getpeername(), 'is empty'
            outputs.remove(s)
        else:
            print >>sys.stderr, 'sending "%s" to %s' % (next_msg, s.getpeername())
            s.send(next_msg)

最後,如果在跟某個socket連線通訊過程中出了錯誤,就把這個連線物件在inputs\outputs\message_queue中都刪除,再把連線關閉掉

    # Handle "exceptional conditions"
    for s in exceptional:
        print >>sys.stderr, 'handling exceptional condition for', s.getpeername()
        # Stop listening for input on the connection
        inputs.remove(s)
        if s in outputs:
            outputs.remove(s)
        s.close()

        # Remove message queue
        del message_queues[s]

 

客戶端

下面的這個是客戶端程式展示瞭如何通過select()對socket進行管理並與多個連線同時進行互動,

The example client program uses two sockets to demonstrate how the server with select() manages multiple connections at the same time. The client starts by connecting each TCP/IP socket to the server.

import socket
import sys

messages = [ 'This is the message. ',
             'It will be sent ',
             'in parts.',
             ]
server_address = ('localhost', 10000)

# Create a TCP/IP socket
socks = [ socket.socket(socket.AF_INET, socket.SOCK_STREAM),
          socket.socket(socket.AF_INET, socket.SOCK_STREAM),
          ]

# Connect the socket to the port where the server is listening
print >>sys.stderr, 'connecting to %s port %s' % server_address
for s in socks:
    s.connect(server_address)

接下來通過迴圈通過每個socket連線給server傳送和接收資料。


Then it sends one pieces of the message at a time via each socket, and reads all responses available after writing new data.

for message in messages:

    # Send messages on both sockets
    for s in socks:
        print >>sys.stderr, '%s: sending "%s"' % (s.getsockname(), message)
        s.send(message)

    # Read responses on both sockets
    for s in socks:
        data = s.recv(1024)
        print >>sys.stderr, '%s: received "%s"' % (s.getsockname(), data)
        if not data:
            print >>sys.stderr, 'closing socket', s.getsockname()

 

 

最後伺服器端的完整程式碼如下: 

 

#_*_coding:utf-8_*_
__author__ = 'Alex Li'

import select
import socket
import sys
import queue

# Create a TCP/IP socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setblocking(False)

# Bind the socket to the port
server_address = ('localhost', 10000)
print(sys.stderr, 'starting up on %s port %s' % server_address)
server.bind(server_address)

# Listen for incoming connections
server.listen(5)

# Sockets from which we expect to read
inputs = [ server ]

# Sockets to which we expect to write
outputs = [ ]

message_queues = {}
while inputs:

    # Wait for at least one of the sockets to be ready for processing
    print( '\nwaiting for the next event')
    readable, writable, exceptional = select.select(inputs, outputs, inputs)
    # Handle inputs
    for s in readable:

        if s is server:
            # A "readable" server socket is ready to accept a connection
            connection, client_address = s.accept()
            print('new connection from', client_address)
            connection.setblocking(False)
            inputs.append(connection)

            # Give the connection a queue for data we want to send
            message_queues[connection] = queue.Queue()
        else:
            data = s.recv(1024)
            if data:
                # A readable client socket has data
                print(sys.stderr, 'received "%s" from %s' % (data, s.getpeername()) )
                message_queues[s].put(data)
                # Add output channel for response
                if s not in outputs:
                    outputs.append(s)
            else:
                # Interpret empty result as closed connection
                print('closing', client_address, 'after reading no data')
                # Stop listening for input on the connection
                if s in outputs:
                    outputs.remove(s)  #既然客戶端都斷開了,我就不用再給它返回資料了,所以這時候如果這個客戶端的連線物件還在outputs列表中,就把它刪掉
                inputs.remove(s)    #inputs中也刪除掉
                s.close()           #把這個連線關閉掉

                # Remove message queue
                del message_queues[s]
    # Handle outputs
    for s in writable:
        try:
            next_msg = message_queues[s].get_nowait()
        except queue.Empty:
            # No messages waiting so stop checking for writability.
            print('output queue for', s.getpeername(), 'is empty')
            outputs.remove(s)
        else:
            print( 'sending "%s" to %s' % (next_msg, s.getpeername()))
            s.send(next_msg)
    # Handle "exceptional conditions"
    for s in exceptional:
        print('handling exceptional condition for', s.getpeername() )
        # Stop listening for input on the connection
        inputs.remove(s)
        if s in outputs:
            outputs.remove(s)
        s.close()

        # Remove message queue
        del message_queues[s]

 

  

 

 

 

客戶端的完整程式碼如下:
__author__ = 'jieli'
import socket
import sys

messages = [ 'This is the message. ',
             'It will be sent ',
             'in parts.',
             ]
server_address = ('localhost', 10000)

# Create a TCP/IP socket
socks = [ socket.socket(socket.AF_INET, socket.SOCK_STREAM),
          socket.socket(socket.AF_INET, socket.SOCK_STREAM),
          ]

# Connect the socket to the port where the server is listening
print >>sys.stderr, 'connecting to %s port %s' % server_address
for s in socks:
    s.connect(server_address)

for message in messages:

    # Send messages on both sockets
    for s in socks:
        print >>sys.stderr, '%s: sending "%s"' % (s.getsockname(), message)
        s.send(message)

    # Read responses on both sockets
    for s in socks:
        data = s.recv(1024)
        print >>sys.stderr, '%s: received "%s"' % (s.getsockname(), data)
        if not data:
            print >>sys.stderr, 'closing socket', s.getsockname()
            s.close()

Run the server in one window and the client in another. The output will look like this, with different port numbers.

$ python ./select_echo_server.py
starting up on localhost port 10000

waiting for the next event
new connection from ('127.0.0.1', 55821)

waiting for the next event
new connection from ('127.0.0.1', 55822)
received "This is the message. " from ('127.0.0.1', 55821)

waiting for the next event
sending "This is the message. " to ('127.0.0.1', 55821)

waiting for the next event
output queue for ('127.0.0.1', 55821) is empty

waiting for the next event
received "This is the message. " from ('127.0.0.1', 55822)

waiting for the next event
sending "This is the message. " to ('127.0.0.1', 55822)

waiting for the next event
output queue for ('127.0.0.1', 55822) is empty

waiting for the next event
received "It will be sent " from ('127.0.0.1', 55821)
received "It will be sent " from ('127.0.0.1', 55822)

waiting for the next event
sending "It will be sent " to ('127.0.0.1', 55821)
sending "It will be sent " to ('127.0.0.1', 55822)

waiting for the next event
output queue for ('127.0.0.1', 55821) is empty
output queue for ('127.0.0.1', 55822) is empty

waiting for the next event
received "in parts." from ('127.0.0.1', 55821)
received "in parts." from ('127.0.0.1', 55822)

waiting for the next event
sending "in parts." to ('127.0.0.1', 55821)
sending "in parts." to ('127.0.0.1', 55822)

waiting for the next event
output queue for ('127.0.0.1', 55821) is empty
output queue for ('127.0.0.1', 55822) is empty

waiting for the next event
closing ('127.0.0.1', 55822) after reading no data
closing ('127.0.0.1', 55822) after reading no data

waiting for the next event

The client output shows the data being sent and received using both sockets.

$ python ./select_echo_multiclient.py
connecting to localhost port 10000
('127.0.0.1', 55821): sending "This is the message. "
('127.0.0.1', 55822): sending "This is the message. "
('127.0.0.1', 55821): received "This is the message. "
('127.0.0.1', 55822): received "This is the message. "
('127.0.0.1', 55821): sending "It will be sent "
('127.0.0.1', 55822): sending "It will be sent "
('127.0.0.1', 55821): received "It will be sent "
('127.0.0.1', 55822): received "It will be sent "
('127.0.0.1', 55821): sending "in parts."
('127.0.0.1', 55822): sending "in parts."
('127.0.0.1', 55821): received "in parts."
('127.0.0.1', 55822): received "in parts."

 

 

 

 

  

原文:http://pymotw.com/2/select/  

相關文章