Symfony2 / Doctrine: make $statement->execute() not "buffer" all values

I have a basic set of codes like this (inside a controller):

$sql = 'select * from someLargeTable limit 1000';
$em = $this->getDoctrine()->getManager();
$conn = $em->getConnection();
$statement = $conn->prepare($sql);
$statement->execute();

      

My difficulty is that when there are only a few records in the result set, the memory usage is not that bad. But I dumped some debug information before and after the $statement->execute() call and found the following for my implementation:

pre-execute... rowCount :: 0 memory: 49.614 MB
post-execute... rowCount :: 1000 memory: 50.917 MB

      

Moving from 1,000 records to 10,000, the difference in memory usage grows to about 13 MB:

pre-execute... rowCount :: 0 memory: 49.614 MB
post-execute... rowCount :: 10000 memory: 62.521 MB

      

In the end, fetching about 50K records, I get closer to the maximum memory allocation:

pre-execute... rowCount :: 0 memory: 49.614 MB
post-execute... rowCount :: 50000 memory: 114.096 MB

      

With this implementation, I cannot write a controller (or even a command, for that matter) that lets me export the data as CSV. Sure, 50k+ records sounds like a lot and begs the question "why", but that's not the issue here.

My real question is: is it possible to tell DBAL\Connection or DBAL\Statement, when executed, to buffer the data inside SQL rather than in PHP? For example, if I have 10 million rows, buffer only the first 10k rows in PHP, let me walk through them with $statement->fetch(), and when the cursor reaches the end of those 10k, discard them and fetch the next 10k from the database?

+3




4 answers


NOTE: This answer is incorrect. I tried to delete it, but I can't because it was accepted. I flagged it for moderator review, but it won't be removed. See the other answers for better solutions. - lxg


Assuming your query can be represented in DQL, you can use an iterated query with DQL:



$i = 0;
$batchSize = 100; // try different values, like 20 <= $batchSize <= 1000
$q = $em->createQuery('SELECT e FROM YourWhateverBundle:Entity');
$iterableResult = $q->iterate();

foreach($iterableResult as $row)
{
    $entity = $row[0];

    // do whatever you need with the entity object

    if (($i % $batchSize) == 0)
    {
        $em->flush(); // if you need to update something
        $em->clear(); // frees memory. BEWARE: former entity references are lost.
    }
    ++$i;

    // if you only want to process a certain number of elements,
    // you can of course break the loop here.
}

      

http://docs.doctrine-project.org/en/2.0.x/reference/batch-processing.html

If you need to use a native query, you can work with LIMIT/OFFSET to fetch chunks of 1000 records, and try clearing the EntityManager in between. (I haven't tested this, and I would avoid native queries with Doctrine altogether.)
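A minimal sketch of that chunked LIMIT/OFFSET approach, assuming a DBAL 2.x connection; the table name is taken from the question, and the processing step is a placeholder:

```php
<?php
// Sketch: stream a large table in fixed-size chunks via LIMIT/OFFSET,
// so only one chunk is held in PHP memory at a time.
$conn = $em->getConnection(); // Doctrine\DBAL\Connection
$chunkSize = 1000;
$offset = 0;

do {
    $rows = $conn->executeQuery(
        'SELECT * FROM someLargeTable LIMIT ? OFFSET ?',
        array($chunkSize, $offset),
        array(\PDO::PARAM_INT, \PDO::PARAM_INT)
    )->fetchAll();

    foreach ($rows as $row) {
        // e.g. write $row to your CSV output here
    }

    $offset += $chunkSize;
    $em->clear(); // detach any hydrated entities to free memory
} while (count($rows) === $chunkSize);
```

Note that deep OFFSET values get progressively slower on MySQL, since the server still has to scan past the skipped rows; for very large tables, paginating on an indexed key (WHERE id > :lastId) is usually faster.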

-2




I faced the same problem and want to share a possible solution. Most likely your DBAL connection uses the PDO library, whose PDO::MYSQL_ATTR_USE_BUFFERED_QUERY attribute defaults to true. That means all results of your query are fetched from the MySQL side and buffered in memory by PDO, even if you never call $statement->fetchAll(). To fix this, we just need to set PDO::MYSQL_ATTR_USE_BUFFERED_QUERY to false, but DBAL doesn't give us a way to do that: its PDO connection object is protected, there is no public method to get it, and it exposes no way to call setAttribute on the PDO connection.

So, in situations like this, I just use my own PDO connection to save memory and speed things up. You can easily instantiate one with your Doctrine DBAL connection parameters, like this:



$dbal_conn = $this->getDoctrine()->getManager()->getConnection();
$params = $dbal_conn->getParams();
$pdo_conn = new \PDO(
  'mysql:dbname='.$dbal_conn->getDatabase().';unix_socket='.$params['unix_socket'],
  $dbal_conn->getUsername(),
  $dbal_conn->getPassword()
);
$pdo_conn->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

      

I am using unix sockets here, but a host IP address can easily be used instead.
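For completeness, a hedged sketch of how the unbuffered connection above could then be used to stream rows to CSV without ever holding the full result set in PHP; the table name comes from the question, everything else is illustrative:

```php
<?php
// Sketch: with buffering disabled, PDO hands rows over as the server
// sends them, so memory use stays flat regardless of result-set size.
$stmt = $pdo_conn->query('SELECT * FROM someLargeTable');

$out = fopen('php://output', 'w'); // stream straight to the HTTP response
while ($row = $stmt->fetch(\PDO::FETCH_ASSOC)) {
    fputcsv($out, $row); // only one row in PHP memory at a time
}
fclose($out);

// Caveat: while an unbuffered result set is open, no other query can be
// sent over this connection until every row has been fetched.
```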

+7




The selected answer is not correct; @kroky's answer should be accepted instead.

The problem is buffered versus unbuffered queries.

Now, it is not recommended to change this behavior for all queries, because:

If the full result set has not been fetched from the server, no further queries can be sent over the same connection.

Therefore, it should only be used when needed. Below is a complete working example with >200k entities:

    $qb = ...->createQueryBuilder('p');

    $this
        ->em
        ->getConnection()
        ->getWrappedConnection()
        ->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

    $query = $qb->getQuery();
    $result = $query->iterate();
    $batchSize = 20;
    $i = 0;
    foreach ($result as $product)
    {
        $i++;

        var_dump($product[0]->getSku());

        if (($i % $batchSize) === 0) {
            $this->em->flush();
            $this->em->clear(); // Detaches all objects from Doctrine!
        }
    }

      

This will most likely need some tweaking for your case.

+7




You can disable the query buffer via the Doctrine config options:

doctrine:
    dbal:
        # configure these for your database server
        driver: 'pdo_mysql'
        ...
        options:
            # 1000 is the integer value of PDO::MYSQL_ATTR_USE_BUFFERED_QUERY
            1000: false

      

0








