Why are only 1024 bytes read in the socketserver example?

I am reading the examples in the Python socketserver documentation at https://docs.python.org/2/library/socketserver.html

Why is the size specified as 1024 in the call self.request.recv(1024) inside the handler method? What happens if the data sent by the client is more than 1024 bytes? Is it better to have a loop that reads 1024 bytes at a time until the socket is empty? I copied the example here:

import SocketServer

class MyTCPHandler(SocketServer.BaseRequestHandler):
    """
    The RequestHandler class for our server.

    It is instantiated once per connection to the server, and must
    override the handle() method to implement communication to the
    client.
    """

    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip() # why only 1024 bytes ?
        print "{} wrote:".format(self.client_address[0])
        print self.data
        # just send back the same data, but upper-cased
        self.request.sendall(self.data.upper())

if __name__ == "__main__":
    HOST, PORT = "localhost", 9999

    # Create the server, binding to localhost on port 9999
    server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()

      



2 answers


When reading from a socket, a loop is always required.

The reason is that even if the sender writes 300 bytes to the network in one call, the receiver may get the data as two separate chunks, for example 200 bytes followed by 100 bytes.

For this reason, the buffer size you pass to recv is only the maximum amount you are willing to receive in one call; the amount of data actually returned may be less.
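As a small illustration of that point, here is a sketch (written for Python 3, unlike the Python 2 example in the question) that uses socket.socketpair() as a stand-in for a real client and server. The variable names are made up for the demonstration.

```python
import socket

# A connected pair of sockets stands in for client and server (illustration only).
sender, receiver = socket.socketpair()

sender.sendall(b"a" * 300)      # the "client" sends 300 bytes
chunk = receiver.recv(1024)     # ask for AT MOST 1024 bytes

# recv returns as soon as some data is available;
# it does not wait until 1024 bytes have arrived.
print(len(chunk))               # somewhere between 1 and 300

sender.close()
receiver.close()
```

On a local socket pair the whole 300 bytes will usually come back in one call, but nothing guarantees this over a real network.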

There is no way to implement "read to end of message" at this level in Python, because send and recv are just thin wrappers around the TCP socket interface, and TCP is a byte stream with no message boundaries (so there is no way to know whether all the data from the sender has been received).



This also means that in many cases you will need to add your own boundaries if you want to exchange messages (or you need to use a higher-level, message-based network transport such as 0MQ).
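One common way to add such boundaries is a length prefix. The following is only a sketch under assumptions: the 4-byte big-endian header and the helper names send_msg, recv_exact and recv_msg are invented for this illustration (they are not part of the socket module), and the code targets Python 3.

```python
import socket
import struct

# Assumed convention: every message is preceded by a 4-byte big-endian length.
HEADER = struct.Struct("!I")

def recv_exact(sock, n):
    """Loop on recv until exactly n bytes have arrived."""
    parts = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(min(remaining, 1024))
        if not chunk:  # an empty result means the peer closed the connection
            raise ConnectionError("socket closed before the full message arrived")
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)

def send_msg(sock, payload):
    """Send the length header followed by the payload."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_msg(sock):
    """Read one length-prefixed message, however many recv calls it takes."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

With this in place, a handler could call recv_msg(self.request) instead of self.request.recv(1024), provided the client uses the same framing.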

Note that the "blocking mode" of a socket only determines what happens when the network layer of the operating system has not yet received any data: a blocking socket makes the program wait until some data arrives, while a non-blocking socket returns immediately without waiting (in Python, by raising an error). If any data has already been received by the computer, the call to recv returns immediately with that data, even if the buffer size you passed is larger, regardless of the blocking / non-blocking option.

Blocking mode does not mean that the call to recv will wait for the buffer to fill.
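The difference between the two modes can be sketched as follows. This is a Python 3 illustration using socket.socketpair(); the variable names and the 5-second select timeout are arbitrary choices for the demo.

```python
import select
import socket

sender, receiver = socket.socketpair()
receiver.setblocking(False)     # switch the receiving side to non-blocking mode

# Nothing has been sent yet: a non-blocking recv does not wait, it raises.
try:
    receiver.recv(1024)
    got_data_early = True
except BlockingIOError:
    got_data_early = False

sender.sendall(b"hi")
# Wait until the operating system reports that data is readable.
select.select([receiver], [], [], 5.0)

# Data is available now, so recv returns it at once,
# even though 2 bytes is far less than the 1024 requested.
data = receiver.recv(1024)
print(got_data_early, data)

sender.close()
receiver.close()
```

A blocking socket would behave the same way in the second recv call; only the first call differs, because it would wait for data instead of raising.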

NOTE: The Python documentation is really misleading about the behavior of recv and will hopefully be fixed soon.



A TCP socket is just a stream of bytes. Think of it like reading a file: is it better to read the file in 1024-byte chunks? It depends on the content. Often a socket, like a file, is wrapped in a buffer so that only complete elements are fetched (lines, records, whatever fits the protocol). It's up to the implementer.



In this case, at most 1024 bytes are read per call. If more is sent, it will arrive split across several reads. Since this code defines no message boundary, it really doesn't matter. If you want to get only complete lines, keep reading in a loop until the message boundary is found. For example, read until a carriage return (or other line terminator) is detected and then process the full line of text.
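Such a loop might look like the sketch below. It assumes Python 3 and a newline (b"\n") as the line terminator; the function name recv_line and its parameters are hypothetical, chosen only for this example.

```python
import socket

def recv_line(sock, buffer=b"", chunk_size=1024):
    """Read from sock until a newline arrives.

    Returns (line, leftover): the line without its terminator, and any
    bytes already received that belong to the next line.
    """
    while b"\n" not in buffer:
        chunk = sock.recv(chunk_size)
        if not chunk:
            # Peer closed the connection before sending a newline:
            # return whatever arrived as a final, unterminated line.
            return buffer, b""
        buffer += chunk
    line, _, leftover = buffer.partition(b"\n")
    return line, leftover
```

Pass the leftover back in as buffer on the next call so no bytes are lost. The socketserver module also offers StreamRequestHandler, whose self.rfile.readline() does this kind of buffering for you.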
