Can the model be kept in memory to reduce load times?

I want to use spaCy for NLP in an online service. Every time a user makes a request, I call the script my_script.py, which starts with:

from spacy.en import English
nlp = English()


The problem I am running into is that these two lines take more than 10 seconds. Is it possible to keep English() in RAM, or is there some other option to reduce this load time to under a second?



5 answers


You said that you want to run a standalone script (my_script.py) whenever a request comes in, and to use the capabilities of spacy.en without the overhead of loading spacy.en each time. With this approach, the operating system creates a new process every time the script starts, so there is only one way to avoid loading spacy.en on every request: have a separate process already running with spacy.en loaded, and have your script communicate with that process. The code below shows one way to do it. However, as others have said, you will probably benefit from changing your server architecture so that spacy.en is loaded inside your web server (for example, by using a Python-based web server).

The most common form of interprocess communication is over TCP/IP sockets. The code below implements a small server that keeps spacy.en loaded and processes requests from a client, plus a client that sends requests to that server and receives the results back. It is up to you to decide what to put in these transmissions.

There is also a third script: since the send and receive functions are needed by both the client and the server, they live in a shared script called comm.py. (Note that the client and the server each load a separate copy of comm.py; they do not communicate through a single module loaded into shared memory.)

I am assuming that both scripts run on the same computer. If not, you will need to put a copy of comm.py on both machines and change comm.server_host to the server's hostname or IP address.

nlp_server.py — run this as a background process (or just in another terminal window for testing). It waits for requests, processes them, and sends the results back:

import comm
import socket
from spacy.en import English
nlp = English()

def process_connection(sock):
    print "processing transmission from client..."
    # receive data from the client
    data = comm.receive_data(sock)
    # do something with the data
    result = {"data received": data}
    # send the result back to the client
    comm.send_data(result, sock)
    # close the socket with this particular client
    sock.close()
    print "finished processing transmission from client..."

server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# open socket even if it was used recently (e.g., server restart)
server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server_sock.bind((comm.server_host, comm.server_port))
# queue up to 5 connections
server_sock.listen(5)
print "listening on port {}...".format(comm.server_port)
try:
    while True:
        # accept connections from clients
        (client_sock, address) = server_sock.accept()
        # process this connection 
        # (this could be launched in a separate thread or process)
        process_connection(client_sock)
except KeyboardInterrupt:
    print "Server process terminated."
finally:
    server_sock.close()


my_script.py — run this as a quick script that requests a result from the NLP server, called for example as:

python my_script.py here are some arguments



import socket, sys
import comm

# data can be whatever you want (even just sys.argv)
data = sys.argv

print "sending to server:"
print data

# send data to the server and receive a result
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# disable Nagle algorithm (probably only needed over a network) 
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, True)
sock.connect((comm.server_host, comm.server_port))
comm.send_data(data, sock)
result = comm.receive_data(sock)
sock.close()

# do something with the result...
print "result from server:"
print result


comm.py contains the code used by both the client and the server:

import sys, struct
import cPickle as pickle

# pick a port that is not used by any other process
server_port = 17001
server_host = '127.0.0.1' # localhost
message_size = 8192
# code to use with struct.pack to convert transmission size (int) 
# to a byte string
header_pack_code = '>I'
# number of bytes used to represent size of each transmission
# (corresponds to header_pack_code)
header_size = 4  

def send_data(data_object, sock):
    # serialize the data so it can be sent through a socket
    data_string = pickle.dumps(data_object, -1)
    data_len = len(data_string)
    # send a header showing the length, packed into 4 bytes
    sock.sendall(struct.pack(header_pack_code, data_len))
    # send the data
    sock.sendall(data_string)

def receive_data(sock):
    """ Receive a transmission via a socket, and convert it back into a binary object. """
    # This runs as a loop because the message may be broken into arbitrary-size chunks.
    # This assumes each transmission starts with a 4-byte binary header showing the size of the transmission.
    # See https://docs.python.org/3/howto/sockets.html
    # and http://code.activestate.com/recipes/408859-socketrecv-three-ways-to-turn-it-into-recvall/

    header_data = ''
    header_done = False
    # set dummy values to start the loop
    received_len = 0
    transmission_size = sys.maxint

    while received_len < transmission_size:
        sock_data = sock.recv(message_size)
        if not sock_data:
            # the other side closed the socket before the full transmission
            # arrived; without this check the loop would spin forever
            raise EOFError("socket closed mid-transmission")
        if not header_done:
            # still receiving header info
            header_data += sock_data
            if len(header_data) >= header_size:
                header_done = True
                # split the already-received data between header and body
                messages = [header_data[header_size:]]
                received_len = len(messages[0])
                header_data = header_data[:header_size]
                # find actual size of transmission
                transmission_size = struct.unpack(header_pack_code, header_data)[0]
        else:
            # already receiving data
            received_len += len(sock_data)
            messages.append(sock_data)

    # combine messages into a single string
    data_string = ''.join(messages)
    # convert to an object
    data_object = pickle.loads(data_string)
    return data_object
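
As a quick illustration of the framing comm.py uses, here is the 4-byte length header on its own (a standalone sketch, separate from the code above):

```python
import struct

# comm.py prefixes every transmission with a 4-byte big-endian length
# header (pack code '>I'); this shows the pack/unpack round-trip in isolation.
header_pack_code = ">I"
payload = b"hello"

# build a frame: 4 header bytes, then the payload
frame = struct.pack(header_pack_code, len(payload)) + payload

# the receiver reads the header first, then knows how many bytes follow
(size,) = struct.unpack(header_pack_code, frame[:4])
body = frame[4:4 + size]
```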


Note: you have to make sure that the result sent back from the server uses only plain data structures (dicts, lists, strings, etc.). If the result includes an object defined in spacy.en, then the client will automatically import spacy.en when it unpickles the result, in order to provide the object's methods.

This setup is very similar to the HTTP protocol (the server waits for connections, the client connects, the client sends a request, the server sends a response, both sides disconnect). So you might be better off using a standard HTTP server and client instead of this custom code. That would be a "RESTful API", which is a popular term these days (for no good reason). Using standard HTTP packages saves you the trouble of managing your own client/server code, and you could even call your data server directly from your existing web server instead of running my_script.py. However, you would have to translate your request into something HTTP-compatible, such as a GET or POST request, or perhaps just a specially formatted URL.

Another option is to use a standard interprocess communication package like PyZMQ, redis, mpi4py, or perhaps zmq_object_exchanger. See this question for some ideas: Efficient Python IPC.

Or you could save a copy of the spacy.en object to disk using the dill package ( https://pypi.python.org/pypi/dill ) and restore it at the start of my_script.py. That may be faster than importing and rebuilding it every time, and it is simpler than interprocess communication.

---

Your goal should be to initialize the spaCy model only once. Use a class and make the model a class attribute; every use then refers to the same instance.



from spacy.en import English

class Spacy(object):
    nlp = English()
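
To see why this helps, here is the same pattern with a counting stand-in for English() (ExpensiveModel is invented for the demonstration): the constructor runs once, when the class body is executed, and every later access returns that same object.

```python
# Stand-in for English() that records how many times it is constructed.
class ExpensiveModel(object):
    instances = 0

    def __init__(self):
        ExpensiveModel.instances += 1

class Spacy(object):
    nlp = ExpensiveModel()   # evaluated once, at class-definition time

# Both of these refer to the single shared instance.
a = Spacy.nlp
b = Spacy().nlp
```

Note that this only pays off inside one long-running process; a script that exits after each request would still rebuild the model on every run.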


---


So here's a hack for this (personally I would refactor my code rather than do this, but since your requirements don't elaborate, I'll suggest it).

Have a daemon that starts the online service. Import spacy in the daemon and pass it as a parameter to the file that does the NLP work.

I would refactor my code to use the class mentioned in the solution from @dhruv, which is much cleaner.

The next example is a rough sketch of how to go about it (and a very bad programming practice):

File1.py

def caller(a,np):
    return np.array(a)


File2.py

import numpy as np 
from File1 import caller

z=caller(10,np)
print z


With the above approach, the load cost is paid once, when the daemon starts; after that it is just a function call. Hope this helps!

---


Your main problem is starting a new script for every request. Instead of running the script once per request, run a function inside an already-running script for each request.

There are many ways to handle user requests. The simplest is to poll for requests periodically and add them to a queue. An asynchronous framework is also useful for this kind of work.

This talk by Raymond Hettinger is a great introduction to concurrency in Python.
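
A minimal sketch of the load-once, poll-a-queue idea; the fake model and the request format are assumptions for illustration:

```python
import queue

def make_handler():
    model = {"loaded": True}   # stand-in for the one-time nlp = English()

    def handle(text):
        # per-request work is now just a function call on the resident model
        return {"tokens": text.split(), "model_loaded": model["loaded"]}

    return handle

def drain(requests, handle):
    # process everything currently queued; a real service would loop forever,
    # blocking on requests.get()
    results = []
    while not requests.empty():
        results.append(handle(requests.get()))
    return results
```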

---


Since you are using Python, you can set up worker processes (at some point you will probably need to scale your application anyway) in which this initialization is done only once. We tried Gearman for a similar use case and it works well.

Regards
