How to count Korean Word block using Unix / Linux commands?

Question

Korean is written in syllable blocks (for example, 가, 나, 다, 라). I need a way to count these blocks. For example, the word 바다 (sea) should return 2, but

wc -w

will return 1

wc -c

will return 7

Thus, these options will not work for me. I would appreciate your help.

linux unix bash wc

Eungi kim May 05 '15 at 0:56

1 answer

Blender · Answer 1 · 2015-05-05T01:08:00+0000

바다 encoded as UTF-8 is 6 bytes long. If you want to count characters rather than bytes, use wc -m:

$ printf "바다" | wc -c
       6
$ printf "바다" | wc -m
       2
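
Note that wc -m decides what a character is from the current locale, so under the C/POSIX locale it may report 6 instead of 2. As a rough sketch of a locale-independent alternative, the helper below (count_chars is a made-up name, not a standard command) deletes UTF-8 continuation bytes and counts what is left, which gives one byte per code point. It assumes the input is valid UTF-8 with precomposed syllables, as in 바다; text stored as decomposed jamo would count higher.

```shell
# count_chars: hypothetical helper, not part of any standard tool.
# Counts UTF-8 code points read from stdin.
# Every code point has exactly one lead byte; continuation bytes are
# 0x80-0xBF (octal 200-277). Deleting them leaves one byte per code point.
count_chars() {
    # LC_ALL=C makes tr work on raw bytes; the final tr strips the
    # padding some wc implementations put in front of the number.
    LC_ALL=C tr -d '\200-\277' | wc -c | tr -d ' '
}

printf '바다' | count_chars
```

Use printf rather than echo when piping, since echo appends a newline that is counted as one more character.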
