How do I populate an Elasticsearch index from a text file?
I am planning to use an Elasticsearch index to store a huge city database (~2.9 million records) and use it as a search engine in my Laravel application.
The problem is that I have the cities both in my MySQL database and in a CSV file. The file is about 300 MB.
How can I import it into the index quickly?
I solved this import using Logstash.
My import script looks like this:
input {
  # Read the source file from the beginning, not only lines appended after startup
  file {
    path => ["/home/user/location_cities.txt"]
    type => "city"
    start_position => "beginning"
  }
}
filter {
  # Parse each line as delimited text and map the fields to named columns
  csv {
    columns => ["region", "subregion", "ufi", "uni", "dsg", "cc_fips", "cc_iso", "full_name", "full_name_nd", "sort_name", "adm1", "adm1_full_name", "adm2", "adm2_full_name"]
    separator => " "
    remove_field => [ "host", "message", "path" ]
  }
}
output {
  # Index each parsed event into the local Elasticsearch instance
  elasticsearch {
    action => "index"
    protocol => "http"
    host => "127.0.0.1"
    port => "9200"
    index => "location"
    workers => 4
  }
}
This script imports the delimited file into an index named location with the type city. Make sure the separator option in the csv filter matches the delimiter your file actually uses.
To run the script, execute bin/logstash -f import_script_file from the folder where you installed/extracted Logstash.
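Once the import finishes, you can sanity-check it with a count query (assuming Elasticsearch is listening on 127.0.0.1:9200, as in the output block above), for example:

curl "http://127.0.0.1:9200/location/_count?pretty"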
For efficiency, you should use the bulk API and experiment with the batch (chunk) size for your data.
See the Elasticsearch documentation on bulk indexing large volumes of documents.
If you are using Python, take a look at https://pypi.python.org/pypi/esimport/0.1.9
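As a rough sketch of the bulk approach in Python (using the official elasticsearch client and its helpers.bulk helper rather than esimport; the delimiter, chunk_size value, and connection URL are assumptions to adjust for your setup):

import csv
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

# Column names taken from the Logstash csv filter above
COLUMNS = ["region", "subregion", "ufi", "uni", "dsg", "cc_fips", "cc_iso",
           "full_name", "full_name_nd", "sort_name", "adm1", "adm1_full_name",
           "adm2", "adm2_full_name"]

es = Elasticsearch("http://127.0.0.1:9200")  # assumed local node, as in the Logstash output

def actions(path):
    # Stream the file row by row so the ~300 MB file never sits in memory at once
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, fieldnames=COLUMNS, delimiter="\t")  # assumption: match your file's delimiter
        for row in reader:
            # Older Elasticsearch versions also expect a "_type" key (e.g. "city")
            yield {"_index": "location", "_source": row}

# chunk_size is the number of documents per bulk request; experiment with it for your data
bulk(es, actions("/home/user/location_cities.txt"), chunk_size=5000)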