Efficient way to serve a ton of JSON on Heroku

I have built a simple API with one endpoint. It scrapes files and currently holds about 30,000 records. Ideally, I would like to fetch all of these records as JSON with a single HTTP call.

Here is my Sinatra code:

require 'sinatra'
require 'json'
require 'mongoid'

Mongoid.identity_map_enabled = false

get '/' do
  content_type :json
  Book.all.to_json
end


I have also tried the following, using multi_json with yajl:

require './require.rb'
require 'sinatra'
require 'multi_json'
MultiJson.engine = :yajl

Mongoid.identity_map_enabled = false

get '/' do
  content_type :json
  MultiJson.encode(Book.all)
end


The problem with this approach is that I get an R14 error (memory quota exceeded). I get the same error when I try the oj gem.

I would just stick the whole long JSON string in Redis, but the Heroku Redis service costs $30/month for the instance size I would need (> 10 MB).

My current solution is a background task that creates documents and stuffs them with jsonified records up to near the MongoDB document size limit (16 MB). The problems with this approach: it still takes nearly 30 seconds to render, and I have to do post-processing in the receiving app to properly extract the JSON from the documents.
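
For reference, here is a rough sketch of that workaround. The JsonChunk model name and the 15 MB cutoff are made-up placeholders, not part of my actual setup:

require 'mongoid'
require 'json'

# Hypothetical model holding one pre-rendered slice of the payload
class JsonChunk
  include Mongoid::Document
  field :payload, type: String
  field :position, type: Integer
end

# Stay safely under MongoDB's 16 MB document cap
CHUNK_LIMIT = 15 * 1024 * 1024

# Background task: concatenate jsonified books into chunk documents
def rebuild_chunks
  JsonChunk.delete_all
  buffer = ''
  position = 0
  Book.all.each do |book|
    fragment = book.attributes.to_json
    # flush the buffer to a new chunk before it would overflow
    if buffer.bytesize + fragment.bytesize > CHUNK_LIMIT
      JsonChunk.create!(payload: buffer, position: position)
      buffer = ''
      position += 1
    end
    buffer << fragment << "\n"
  end
  JsonChunk.create!(payload: buffer, position: position) unless buffer.empty?
end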

Does anyone have a better idea of how I can serve JSON for 30k records in one call without moving off Heroku?


1 answer


It sounds like you want to stream the JSON directly to the client instead of building it all up in memory. That is probably the best way to cut your memory usage. For example, you could use yajl to encode JSON directly to a stream.

Edit: I rewrote all the code for yajl, because its API is much nicer and allows for much cleaner code. I have also included an example for reading the response in chunks. Here is the streaming JSON array helper I wrote:

require 'yajl'

module JsonArray
  class StreamWriter
    def initialize(out)
      @out = out
      @encoder = Yajl::Encoder.new
      @first = true
    end

    # Append one object to the stream, prefixing a comma
    # for every element after the first
    def <<(object)
      @out << ',' unless @first
      @out << @encoder.encode(object)
      @out << "\n"
      @first = false
    end
  end

  # Wraps the streamed elements in '[' ... ']' so the
  # client receives one valid JSON array
  def self.write_stream(app, &block)
    app.stream do |out|
      out << '['
      block.call StreamWriter.new(out)
      out << ']'
    end
  end
end


Usage:

require 'sinatra'
require 'mongoid'

Mongoid.identity_map_enabled = false

# use a server that supports streaming
set :server, :thin

get '/' do
  content_type :json
  JsonArray.write_stream(self) do |json|
    Book.all.each do |book|
      json << book.attributes
    end
  end
end


To decode on the client side, you can read and parse the response in chunks, for example with em-http. Note that this solution requires the client to have enough memory to hold the entire array of objects. Here is the corresponding streaming parser helper:

require 'yajl'

module JsonArray
  class StreamParser
    def initialize(&callback)
      @parser = Yajl::Parser.new
      # yajl invokes the callback once the top-level
      # JSON value (here: the whole array) is complete
      @parser.on_parse_complete = callback
    end

    def <<(str)
      @parser << str
    end
  end

  def self.parse_stream(&callback)
    StreamParser.new(&callback)
  end
end


Usage:

require 'em-http'

parser = JsonArray.parse_stream do |object|
  # block is called when we are done parsing the
  # entire array; now we can handle the data
  p object
end

EventMachine.run do
  http = EventMachine::HttpRequest.new('http://localhost:4567').get
  http.stream do |chunk|
    parser << chunk
  end
  http.callback do
    EventMachine.stop
  end
end


Alternative solution



You can simplify the whole thing by dropping the requirement to generate a "proper" JSON array. The code above produces JSON of this form:

[{ ... book_1 ... }
,{ ... book_2 ... }
,{ ... book_3 ... }
...
,{ ... book_n ... }
]


However, we could transmit each book as a separate JSON document instead, reducing the format to the following:

{ ... book_1 ... }
{ ... book_2 ... }
{ ... book_3 ... }
...
{ ... book_n ... }


The server code then becomes much simpler:

require 'sinatra'
require 'mongoid'
require 'yajl'

Mongoid.identity_map_enabled = false
set :server, :thin

get '/' do
  content_type :json
  encoder = Yajl::Encoder.new
  stream do |out|
    Book.all.each do |book|
      out << encoder.encode(book.attributes) << "\n"
    end
  end
end


And the client:

require 'em-http'
require 'yajl'

parser = Yajl::Parser.new
parser.on_parse_complete = Proc.new do |book|
  # this will now be called separately for every book
  p book
end

EventMachine.run do
  http = EventMachine::HttpRequest.new('http://localhost:4567').get
  http.stream do |chunk|
    parser << chunk
  end
  http.callback do
    EventMachine.stop
  end
end


The great thing about this is that the client no longer has to wait for the complete response and instead parses each book separately. However, this will not work if one of your clients expects one single, big JSON array.
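
If a client can't use EventMachine, the newline-delimited format is also easy to consume with plain Net::HTTP from the standard library. A rough sketch (the buffering is needed because chunk boundaries won't line up with line breaks):

require 'net/http'
require 'json'

uri = URI('http://localhost:4567/')
buffer = ''

Net::HTTP.start(uri.host, uri.port) do |http|
  http.request_get(uri.path) do |response|
    response.read_body do |chunk|
      buffer << chunk
      # a chunk can end mid-line, so only parse complete lines
      # and keep the trailing partial line in the buffer
      while (newline = buffer.index("\n"))
        line = buffer.slice!(0..newline).strip
        p JSON.parse(line) unless line.empty?
      end
    end
  end
end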
