I need to count the number of different lines where a word appears

Question

So far I have:

SELECT
    word, count(*)
FROM
    (SELECT
            regexp_split_to_table(ColDescription, '\s') as word
    FROM tblCollection
    ) a
GROUP BY word
ORDER BY count(*) desc

This gives a good list of all words and how many times each one appears in the entire description column.

I also need a way to show how many rows each word appears in at least once.

For example, if my data was:

hello hello test 
hello test test test
test hi

it would show:

word    count   # of rows it appears in
hello     3        2
test      5        3
hi        1        1

I am a complete beginner with databases; any help is appreciated!

Example table:

CREATE TABLE tblCollection ( ColDescription varchar(500) NOT NULL PRIMARY KEY);

Sample data:

"hello hello test"
"hello test test test"
"test hi"

Each line is its own row.

count postgresql

knames 15 oct. 14 at 10:43

1 answer

Nick barnes · Accepted Answer · 2014-10-15T11:46:46+0000

The main obstacle is that your subquery does not store any information about where it found each instance of the word. This is easily fixed:

SELECT
  regexp_split_to_table(ColDescription, '\s') as word,
  ColDescription
FROM tblCollection

You now have the source row listed alongside each word, so it's just a matter of counting them:

SELECT
  word, count(*), count(distinct ColDescription)
FROM
...
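
For reference, here is one way the two pieces above might fit together into a single query. This is only a sketch: the column aliases total_count and row_count and the WHERE filter on empty strings are my own additions, not part of the answer. The filter is there because splitting on '\s' can produce empty "words" when a description has trailing or repeated whitespace.

```sql
SELECT
  word,
  count(*)                       AS total_count,  -- every occurrence of the word
  count(DISTINCT ColDescription) AS row_count     -- rows containing it at least once
FROM
    (SELECT
            regexp_split_to_table(ColDescription, '\s') AS word,
            ColDescription
    FROM tblCollection
    ) a
WHERE word <> ''   -- assumption: skip empty strings left by extra whitespace
GROUP BY word
ORDER BY total_count DESC;
```

Note that count(DISTINCT ColDescription) only counts rows correctly because ColDescription is the primary key in the example table, so every row has a distinct description. If duplicate descriptions were allowed, the subquery would need to carry some other unique row identifier instead.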
