Difference between field name and .raw field name in ELK?
I have been experimenting with the ELK stack for a while, following several resources on the internet. But I haven't found a single resource that clearly explains the difference between fieldname and fieldname.raw for a field named fieldname.
There is nothing special about my setup; I searched for an explanation but had no luck. The only hint I have comes from a Kibana window (which, unfortunately, I do not know how to reproduce) that described fieldname as the analyzed field. It gave no such information about fieldname.raw.
Another thing I noticed: when I search for fieldname.raw: "value" in Discover in Kibana 4, it shows slightly more results than fieldname: "value" does (559 versus 554). I couldn't tell which ones were missing.
I'm guessing the .raw suffix means what it says: the field as it appears in the logs themselves, without any Logstash intervention. But I want to make sure that this is what it means. If so, how (and more importantly, why) did I get fewer results on the analyzed field? Is Logstash doing something incorrectly, or is it some kind of misconfiguration? Any pointers are appreciated.
Each field in Elasticsearch has a mapping that describes its type and how it is analyzed for indexing.
By default, fields are strings and are analyzed (punctuation is removed, words are tokenized, etc.). For example, a field named "path" with the value:
/var/log/messages
will become
["var", "log", "messages"]
which means you can no longer search for the original string, and any meaning carried by the punctuation is lost.
This is a side effect of using a text engine for the log data.
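To make the tokenization concrete, here is a minimal Python sketch. It is only a rough stand-in for an analyzer (lowercase, then split on anything that is not a letter or digit); the function name rough_analyze is made up for illustration, and the real Elasticsearch analyzers follow more elaborate rules.

```python
import re


def rough_analyze(value):
    """Approximate an analyzer: lowercase, then split on non-alphanumerics.

    Assumption: this only imitates the effect described above; it is not
    the algorithm Elasticsearch actually runs.
    """
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]


tokens = rough_analyze("/var/log/messages")
print(tokens)

# The original string is not among the indexed tokens, which is why an
# exact match on the full path fails against the analyzed field.
print("/var/log/messages" in tokens)
```

The slashes never reach the index, so only the individual words remain searchable.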
Since every logstash user hits this almost immediately, the logstash team created a template that sets up a mapping for any index named "logstash-*".
This template defines a multi-field called "raw", which is specified as "not_analyzed". So, you have two entries in your index:
path: ["var", "log", "messages"]
path.raw: "/var/log/messages"
This is very useful, especially for first-time users, who previously would have been stuck at this point. You can use "path.raw" in kibana or other queries.
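For reference, the relevant part of such a template looks roughly like the structure below, written here as a Python dict so it can be printed as JSON. This is a sketch of the legacy string-type syntax from memory, not a copy of the shipped file, so check the template bundled with your logstash version. Newer Elasticsearch releases replaced this with the keyword type and a ".keyword" sub-field.

```python
import json

# Assumption: approximate shape of the legacy logstash index template.
raw_multifield_template = {
    "template": "logstash-*",  # applies to every index matching this pattern
    "mappings": {
        "_default_": {
            "dynamic_templates": [
                {
                    "string_fields": {
                        "match": "*",
                        "match_mapping_type": "string",
                        "mapping": {
                            # The main field stays analyzed (tokenized).
                            "type": "string",
                            "index": "analyzed",
                            # The "raw" sub-field keeps the value untouched.
                            "fields": {
                                "raw": {
                                    "type": "string",
                                    "index": "not_analyzed",
                                }
                            },
                        },
                    }
                }
            ]
        }
    },
}

print(json.dumps(raw_multifield_template, indent=2))
```

Every string field therefore gets indexed twice: once tokenized under its own name, and once verbatim under the ".raw" suffix.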
EDIT: a short note on kibana: if you use an analyzed field, it will create an item for each token, so you get a pie chart with slices for "var", "log", and "messages".
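The effect on a chart can be imitated in a few lines of Python. The sample paths and the rough_analyze helper are invented for illustration; the counting mimics how a terms-style chart groups documents by indexed term.

```python
import re
from collections import Counter


def rough_analyze(value):
    """Rough stand-in for an analyzer (an assumption, not the real one)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]


# Hypothetical documents, each with one "path" value.
paths = ["/var/log/messages", "/var/log/syslog", "/var/log/messages"]

# Analyzed field: one bucket per token, counting each document once per token.
analyzed_buckets = Counter(
    tok for path in paths for tok in set(rough_analyze(path))
)

# Raw field: one bucket per complete, untouched value.
raw_buckets = Counter(paths)

print(analyzed_buckets)
print(raw_buckets)
```

The analyzed field yields slices such as "var" and "log" that say little on their own, while the raw field yields one slice per full path.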
Once you become more familiar with mappings and templates, you might consider making your main fields not_analyzed, thereby eliminating the need for ".raw" altogether. This will also allow you to use doc_values, which is another fun topic.
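As a sketch of that idea, a mapping that makes "path" itself not_analyzed and enables doc_values could look like the dict below. The type name "logs" and the helper not_analyzed_string are hypothetical, and the syntax assumes the same legacy string type as above.

```python
import json


def not_analyzed_string(doc_values=True):
    """Build a legacy-style mapping for an exact-value string field."""
    return {
        "type": "string",
        "index": "not_analyzed",  # index the value verbatim, no tokenizing
        "doc_values": doc_values,  # keep column data on disk, not in heap
    }


# Assumption: "logs" is a made-up type name; "path" is the example field.
mapping = {"mappings": {"logs": {"properties": {"path": not_analyzed_string()}}}}

print(json.dumps(mapping, indent=2))
```

With a mapping like this, "path" can be used directly in queries and charts, and no ".raw" sub-field is needed for it.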
Good luck!