MySQL: forking INSERT processes (InnoDB)

I am trying to insert about 500 million rows of garbage data into a database for testing. Right now I have a PHP script looping through several SELECT/INSERT statements, each one inside a TRANSACTION - obviously not the best solution. The tables are InnoDB (row-level locking).

I am wondering: if I fork the process (correctly), will it speed up the INSERTs? At the current rate, it will take 140 hours. I am concerned about a few things:

  • If the INSERT statements must acquire a write lock, is forking pointless because multiple processes cannot write to the same table at the same time?

  • I am using SELECT ... LAST_INSERT_ID() (inside a TRANSACTION). Will this logic break when multiple processes are INSERTing into the database? I could create a new database connection for each fork, so I hope that avoids the problem.

  • How many processes should I use? The queries themselves are simple, and I have a regular dual-core box with 2 GB of RAM. I have set InnoDB to use 8 threads (innodb_thread_concurrency=8), but I'm not sure whether I should use 8 processes, or whether this is even the right way to think about the mapping.

Thank you for your help!

+2




2 answers


1) Yes, there will be lock contention, but InnoDB is designed to handle multiple threads trying to insert. Of course they won't all insert at exactly the same moment, but InnoDB will handle serializing the inserts for you. Just make sure you commit your transactions explicitly, and do it as soon as possible; this will give you maximum insert throughput.

2) No, this logic will not break, assuming you have one connection per process, since LAST_INSERT_ID() is connection-specific.
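A quick way to see this connection-local behavior is with SQLite as a stand-in (its `lastrowid` is tracked per connection, much as MySQL's LAST_INSERT_ID() is); this sketch is illustrative only, not MySQL-specific:

```python
import os
import sqlite3
import tempfile

# Use a file-backed database so two independent connections can share it.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn_a = sqlite3.connect(path)
conn_b = sqlite3.connect(path)

conn_a.execute("CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT)")
conn_a.commit()

cur_a = conn_a.execute("INSERT INTO t (v) VALUES ('from connection A')")
conn_a.commit()
cur_b = conn_b.execute("INSERT INTO t (v) VALUES ('from connection B')")
conn_b.commit()

# Each connection reports the id of ITS OWN last insert, so one fork's
# insert does not clobber the id seen by another fork.
print(cur_a.lastrowid)  # 1
print(cur_b.lastrowid)  # 2
```

The same guarantee is why one connection per forked PHP process is enough: the forks never see each other's last-insert ids.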



3) This is one of those things you just have to test to find out. Actually, I would make the program self-tuning: run 100 inserts with 8 processes and record the execution time, then try again with half and with double that count. Whichever is fastest, compare more process counts around that number.

In general, you should just go ahead and benchmark things like this to find out what is faster; in the time it takes to reason about it on paper, you could already have preliminary numbers.
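The self-tuning idea above can be sketched like this (Python's multiprocessing is used in place of PHP forks, and `insert_batch` is a placeholder for a batch of real INSERTs over a per-process connection):

```python
import time
from multiprocessing import Pool

def insert_batch(batch_id):
    # Placeholder: replace with a batch of real INSERT statements
    # executed over this process's own database connection.
    time.sleep(0.001)
    return batch_id

def time_run(workers, batches=100):
    """Time how long `batches` insert jobs take with `workers` processes."""
    start = time.perf_counter()
    with Pool(workers) as pool:
        pool.map(insert_batch, range(batches))
    return time.perf_counter() - start

def autotune(initial=8):
    """Try the initial worker count, half it, and double it; return the fastest."""
    candidates = sorted({max(1, initial // 2), initial, initial * 2})
    timings = {w: time_run(w) for w in candidates}
    return min(timings, key=timings.get)
```

From the winner you would then probe neighboring process counts (winner ± 1 or ± 2) until the timing stops improving.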

+4




There is a discussion in the MySQL documentation about inserting large numbers of records efficiently. The clear winner there is the LOAD DATA INFILE command, followed by INSERT statements that supply multiple VALUES lists per statement.
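For the multi-row variant, the idea is simply to batch many rows into one statement so the per-statement overhead is paid once. A minimal sketch of building such a statement (hypothetical helper; values here are assumed to be pre-formatted SQL literals, and production code should use the driver's parameter binding instead of string concatenation):

```python
def build_multirow_insert(table, columns, rows):
    """Build one INSERT ... VALUES (...), (...), ... statement.

    Values are assumed to be safe, pre-formatted SQL literals;
    real code should use the driver's placeholder/escaping mechanism.
    """
    cols = ", ".join(columns)
    tuples = ", ".join(
        "(" + ", ".join(str(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} ({cols}) VALUES {tuples}"

sql = build_multirow_insert("garbage", ["a", "b"], [(1, 2), (3, 4), (5, 6)])
print(sql)
# INSERT INTO garbage (a, b) VALUES (1, 2), (3, 4), (5, 6)
```

Sending rows in batches of a few hundred to a few thousand per statement typically cuts round-trip and parse overhead dramatically compared to one INSERT per row.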



+7








