Import CSV in batches of lines in Rails?

I am using FasterCSV to import the uploaded file into the model and it works great for small files. However, when I try to import a large dataset (21,000 rows) it takes a long time and I get browser timeouts on the real server.

This is my current working code:

  logcount=0
  Attendee.transaction do
    FCSV.new(file, :headers => true).each do |row|
      row[1] = Date.strptime(row[1], '%m/%d/%Y')
      record = @event.attendees.new(:union_id => row[0], :dob => row[1], :gender => row[2])
      if record.save
        logcount += 1
      end
    end
  end

      

I would like to use a background process, but the user needs to see how many rows have been imported before they can proceed to the next step of the system.

So I thought I should chunk the action: read only a limited number of lines, store a counter, refresh the view with some progress, then run the method again using the previous counter as the starting point.

I can't see how to get FasterCSV to read only a certain number of lines, or how to set an offset for the starting point.

Does anyone know how to do this? Or is there a better way to handle this?



3 answers


Have you tried using AR Extensions for bulk import? You get impressive performance improvements when inserting 1000s of rows into the database. Visit their website for more details.
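As a rough illustration of the batching idea, here is a minimal sketch that parses the file with Ruby's standard `csv` library (which replaced FasterCSV from Ruby 1.9 on) and hands rows over in slices. The method name `import_in_batches` and the block interface are made up for this example; the block stands in for whatever bulk-insert call you end up using, and the column layout is taken from the question.

```ruby
require 'csv'
require 'date'

# Parse the CSV and yield the rows in batches of batch_size.
# The block is a placeholder for a bulk-insert call (hypothetical here).
# Returns the total number of rows handed to the block.
def import_in_batches(io, batch_size = 1000)
  total = 0
  CSV.new(io, headers: true).each_slice(batch_size) do |rows|
    # Same columns as in the question: union_id, dob, gender.
    values = rows.map do |row|
      [row[0], Date.strptime(row[1], '%m/%d/%Y'), row[2]]
    end
    yield values
    total += values.size
  end
  total
end
```

Inside the block you would issue one multi-row insert per batch instead of one `save` per row, which is where the speedup is expected to come from.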





I would rather create a prepared statement, read each line from the file, and execute the prepared statement with that line's values. That should be faster than going through the model.





If you have access to the database, why not run the import through a Rake task? Are your users going to be importing datasets this large?

If they are, a Rake task won't cut it.

FCSV.new accepts an IO object, so you can seek to a specific byte before handing it over. Unfortunately, FCSV does not make it easy to stop partway or to access the underlying IO object to find out where you left off. Resuming in the middle of the file also complicates the use of the header row.
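One way around those limitations is to do the line reading yourself and use the CSV library only to parse individual lines. The following is a sketch, not a tested production approach: `read_chunk` is a hypothetical helper, it uses the standard `csv` library rather than FasterCSV, and it assumes no quoted field contains a line break, so that one physical line is one record.

```ruby
require 'csv'

# Read up to `limit` data rows starting at byte `offset`.
# offset = nil means "start right after the header line".
# Returns [rows, next_offset]; next_offset is nil once the file is exhausted.
# Assumption: no quoted field spans several lines.
def read_chunk(io, offset, limit)
  io.rewind
  header_line = io.gets
  return [[], nil] unless header_line
  headers = CSV.parse_line(header_line) # header is re-read on every call
  io.seek(offset) if offset
  rows = []
  while rows.size < limit && (line = io.gets)
    fields = CSV.parse_line(line)
    next if fields.nil? || fields.empty? # skip blank lines
    rows << headers.zip(fields).to_h
  end
  [rows, io.eof? ? nil : io.pos]
end
```

Each request would call `read_chunk` with the offset returned by the previous one (kept in the session or on the import record), save the rows, and render the running count.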

Really, I believe the best solution is to hand your CSV import off to DRb, which periodically reports its progress in a way a controller action can pick up. Then call that controller action every so often with some AJAX running on the client.
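The shape of that arrangement can be sketched without DRb. The example below swaps in a plain Ruby thread for the separate daemon, purely to show the worker/progress split; `ImportProgress` and `start_import` are hypothetical names, and a thread inside the web process does not survive restarts the way a DRb daemon would.

```ruby
# Hypothetical progress holder: the worker increments it, and a controller
# action polled via AJAX would read it and render the numbers.
class ImportProgress
  def initialize(total)
    @total = total
    @done = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @done += 1 }
  end

  def to_h
    @lock.synchronize { { done: @done, total: @total, finished: @done >= @total } }
  end
end

# Run the import off the request thread; the block handles a single row.
# Stand-in for the DRb worker described above.
def start_import(rows, progress)
  Thread.new do
    rows.each do |row|
      yield row
      progress.increment
    end
  end
end
```

The polling action would return `progress.to_h` as JSON, and the client would stop polling and enable the next step once `finished` is true.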

I have had success with BackgrounDRb in the past. Its setup and usage are too detailed for me to reproduce here. A little searching will turn up other plugins and gems.

DRb caveat: most DRb solutions require an additional daemon to run on your server. Some web hosts prohibit this on their more basic plans. Check your TOS.
