How do I populate an Elasticsearch index from a text file?

I am planning to use an Elasticsearch index to store a huge city database with ~2.9 million records and use it as a search engine in my Laravel application.

The problem is that I have the cities in my MySQL database and in a CSV file; the file is ~300 MB.

How can I import it into the index faster?

2 answers


I solved this import using Logstash.

My import script looks like this:

input {
    file {
        path => ["/home/user/location_cities.txt"]
        type => "city"
        start_position => "beginning"
    }
}

filter {
    csv {
        columns => ["region", "subregion", "ufi", "uni", "dsg", "cc_fips", "cc_iso", "full_name", "full_name_nd", "sort_name", "adm1", "adm1_full_name", "adm2", "adm2_full_name"]
        # the separator is a literal tab character (the source file is tab-delimited)
        separator => "	"
        remove_field => [ "host", "message", "path" ]
    }
}

output {
    # note: protocol/host/port match older (1.x) versions of the elasticsearch
    # output plugin; newer Logstash versions use hosts => ["127.0.0.1:9200"] instead
    elasticsearch {
        action => "index"
        protocol => "http"
        host => "127.0.0.1"
        port => "9200"
        index => "location"
        workers => 4
    }
}

This script will import the tab-delimited file into an index named location with the type city.

To run the script, execute bin/logstash -f import_script_file from the folder where you installed/extracted Logstash.
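
Once the import finishes, a quick sanity check is to compare the document count in the index with the number of lines in the source file. Below is a minimal sketch using the official elasticsearch Python client; the index name location and the local node address are assumptions taken from the config above.

from elasticsearch import Elasticsearch

# connect to the same local node the Logstash output above writes to
es = Elasticsearch("http://127.0.0.1:9200")

# for ~2.9 million cities this should roughly match the line count of the file
print(es.count(index="location")["count"])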



For efficiency, you should use the bulk API and experiment with the chunk size for your data.

See the Elasticsearch documentation on bulk indexing for importing large volumes of documents.



If you are using Python, take a look at https://pypi.python.org/pypi/esimport/0.1.9
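
Alternatively, here is a minimal sketch using the bulk helper from the official elasticsearch Python client. The file path, tab separator and column names are assumptions carried over from the Logstash config in the other answer; chunk_size is the knob to experiment with.

import csv
from elasticsearch import Elasticsearch, helpers

# column names copied from the Logstash csv filter above (adjust to your file)
COLUMNS = ["region", "subregion", "ufi", "uni", "dsg", "cc_fips", "cc_iso",
           "full_name", "full_name_nd", "sort_name", "adm1",
           "adm1_full_name", "adm2", "adm2_full_name"]

es = Elasticsearch("http://127.0.0.1:9200")

def actions(path):
    # stream the tab-delimited file row by row so the ~300 MB file
    # never has to be loaded into memory at once
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, fieldnames=COLUMNS, delimiter="\t"):
            yield {"_index": "location", "_source": row}

# chunk_size controls how many documents go into each bulk request:
# larger chunks mean fewer HTTP round trips but bigger request bodies,
# so measure a few sizes against your cluster
helpers.bulk(es, actions("/home/user/location_cities.txt"), chunk_size=5000)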
