Difference between field name and .raw field name in ELK?

I have been experimenting with the ELK stack for a while, following several resources on the internet. But I haven't found a definitive resource that clearly explains the difference between fieldname and fieldname.raw for a field named fieldname.

There is nothing special about my setup; I searched for an explanation but had no luck. The only understanding I have comes from a Kibana window (which, unfortunately, I do not know how to reproduce) that described fieldname as the analyzed field. There was no such information regarding fieldname.raw.

Another thing I noticed is that when I search for fieldname.raw: "value" in Discover in Kibana 4, it shows slightly more results than what I see for fieldname: "value". I couldn't tell which ones were missing, as I had 559 and 554 results respectively.

I'm guessing the suffix .raw means what it says: the field as it appears in the logs themselves, without any Logstash intervention. But I want to make sure that this is what it means. If so, how (and more importantly, why) did I get fewer results on the analyzed field? Is Logstash doing something incorrectly, or is it some kind of misconfiguration? Any pointers are appreciated.


1 answer


Each field in Elasticsearch has a mapping that describes its type and how it is analyzed for indexing.

By default, fields are strings and are analyzed (punctuation removed, words tokenized, etc.). For example, a field named "path" with:

/var/log/messages

      

will become

["var", "log", "messages"]

      

which means you can no longer search for the original string, and any meaning carried by the punctuation is lost.

This is a side effect of using a full-text search engine for log data.
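To make the effect concrete, here is a rough Python sketch of what analysis does to that value. It is only an approximation (lowercasing and splitting on non-word characters), not Elasticsearch's actual analyzer, and the function name `rough_analyze` is made up for illustration.

```python
import re

def rough_analyze(value):
    """Approximate analysis: lowercase, then keep runs of word characters."""
    return re.findall(r"\w+", value.lower())

tokens = rough_analyze("/var/log/messages")
print(tokens)  # ['var', 'log', 'messages']

# The slashes are gone, so the original string is not among the indexed terms.
print("/var/log/messages" in tokens)  # False
```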



Since every logstash user runs into this almost immediately, the logstash team has created a template that sets up a mapping for any index named "logstash-*".

This template defines a multi-field called "raw", which is set to "not_analyzed". So, you have two items in your index:

path: ["var", "log", "messages"]
path.raw: "/var/log/messages"
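For reference, the relevant part of such a template looks roughly like the following, written here as a Python dict so it can be dumped to JSON. This is a trimmed sketch modeled on the default Logstash template for Elasticsearch 1.x/2.x; the exact contents vary by version, so check the template your Logstash actually installed.

```python
import json

# Trimmed sketch of an index template; real templates contain more settings.
template = {
    "template": "logstash-*",  # applies to any index whose name matches
    "mappings": {
        "_default_": {
            "dynamic_templates": [{
                "string_fields": {
                    "match": "*",
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "string",
                        "index": "analyzed",  # the main field is tokenized
                        "fields": {
                            # the ".raw" sub-field keeps the original string
                            "raw": {"type": "string", "index": "not_analyzed"}
                        },
                    },
                }
            }]
        }
    },
}

print(json.dumps(template, indent=2))
```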

      

Very helpful, especially for the first-time users mentioned earlier. You can use "path.raw" in Kibana or in other queries.
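As an illustration, the two fields are typically queried differently. The request bodies below are a sketch in Elasticsearch query DSL, built as Python dicts; the field name `path` follows the example above, and actually sending them to a cluster is left out.

```python
import json

# Exact match against the untouched string stored in the .raw sub-field.
exact_query = {"query": {"term": {"path.raw": "/var/log/messages"}}}

# Full-text match against the analyzed field; the query text is analyzed too,
# so a single word such as "messages" is enough to match.
text_query = {"query": {"match": {"path": "messages"}}}

for body in (exact_query, text_query):
    print(json.dumps(body))
```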

EDIT: a short note on Kibana: if you use an analyzed field, it will create an entry for each token, so you get a pie chart with slices for "var", "log", and "messages".
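The difference is easy to simulate. The sketch below counts terms roughly the way a terms aggregation would bucket them, first on tokens and then on whole values; the sample paths and the regex tokenizer are assumptions made up for the illustration.

```python
import re
from collections import Counter

paths = ["/var/log/messages", "/var/log/messages", "/var/log/syslog"]

# Analyzed field: one bucket per token.
token_buckets = Counter(
    token for p in paths for token in re.findall(r"\w+", p.lower())
)

# Not-analyzed (.raw) field: one bucket per original value.
raw_buckets = Counter(paths)

print(token_buckets)
print(raw_buckets)
```

With the token buckets, "var" and "log" dominate the chart even though they say little about which file a line came from; the raw buckets keep each path whole.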

Once you become more familiar with mappings and templates, you might consider making your main fields not_analyzed, thereby eliminating the need for ".raw" altogether. This will also allow you to use doc_values, which is another fun topic.
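A mapping along those lines might look like this sketch, again as a Python dict. It uses the Elasticsearch 1.x/2.x string mapping syntax that matches the "not_analyzed" terminology here; the type name `logs` and the field name `path` are placeholders.

```python
import json

mapping = {
    "mappings": {
        "logs": {  # hypothetical type name
            "properties": {
                "path": {
                    "type": "string",
                    "index": "not_analyzed",  # index the value as a single term
                    "doc_values": True,       # keep per-document values on disk
                }
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```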

Good luck!
