JavaScript: counting the frequency of emojis in text

I am trying to count the frequency of emojis in a block of text. For example:

"I love 🚀🚀🚀 so much 😍 " -> [{🚀:3}, {😍:1}]

      

To calculate the frequency of characters in a block of text, I use

function getFrequency(string) {
    var freq = {};
    for (var i=0; i<string.length;i++) {
        var character = string.charAt(i);
        if (freq[character]) {
           freq[character]++;
        } else {
           freq[character] = 1;
        }
    }

    return freq;
};

      

source: stackoverflow question/691984 / ...

The above code works fine for ordinary characters, but it doesn't recognize emoji characters:

{ : 1,   : 3,   : 2}

      

Also, I would prefer the output to be a list of single-key JSON objects rather than one long JSON object.



2 answers


You can use String.replace with a callback function and a Unicode-aware RegExp (the u flag) that matches everything from the Unicode block "Miscellaneous Symbols and Pictographs" through "Transport and Map Symbols" (0x1F300 to 0x1F6FF):

let str = "I love 🚀🚀🚀 so much 😍 ";

let freq = {};
str.replace(/[\u{1F300}-\u{1F6FF}]/gu, char => freq[char] = (freq[char] || 0) + 1);

console.log(freq);
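The fixed range above misses emojis in other blocks, such as 🤔 (U+1F914). As a hedged alternative, engines that support Unicode property escapes can match on the Extended_Pictographic property instead of a hard-coded range. A minimal sketch, where countEmoji is a hypothetical helper name and not part of the answer:

```javascript
// Hypothetical helper: counts pictographic code points via a Unicode property escape.
// Assumes a runtime that supports \p{Extended_Pictographic} and String.prototype.matchAll.
function countEmoji(text) {
  const freq = {};
  for (const match of text.matchAll(/\p{Extended_Pictographic}/gu)) {
    const emoji = match[0];
    freq[emoji] = (freq[emoji] || 0) + 1;
  }
  return freq;
}

console.log(countEmoji("I love 🚀🚀🚀 so much 😍 🤔"));
```

This still works one code point at a time, so multi-code-point sequences such as flags, skin-tone variants, or ZWJ families are not counted as a single emoji.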
      





If you prefer to avoid RegExp or String.replace, you can spread the string into an array of characters and reduce it to frequencies like this:

let str = "I love 🚀🚀🚀 so much 😍 ";

let freq = [...str].reduce((freq, char) => {
  if (char >= '\u{1F300}' && char < '\u{1F700}') freq[char] = (freq[char] || 0) + 1;
  return freq;
}, {});

console.log(freq);
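The comparison above relies on JavaScript comparing strings by UTF-16 code units, which happens to give the right answer for this range. Comparing numeric code points states the intent more directly. A small sketch under that assumption; inEmojiRange and countInRange are hypothetical names introduced only for illustration:

```javascript
// Hypothetical helper: true when the first code point of `char`
// lies in the range 0x1F300..0x1F6FF used by the answer.
function inEmojiRange(char) {
  const codePoint = char.codePointAt(0);
  return codePoint >= 0x1F300 && codePoint <= 0x1F6FF;
}

// Hypothetical helper: counts only the characters accepted by inEmojiRange.
function countInRange(text) {
  const freq = {};
  // for...of yields whole code points, so surrogate pairs stay together
  for (const char of text) {
    if (inEmojiRange(char)) freq[char] = (freq[char] || 0) + 1;
  }
  return freq;
}

console.log(countInRange("I love 🚀🚀🚀 so much 😍 "));
```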
      





charAt won't help you, because it returns a single UTF-16 code unit. for...of correctly iterates over a string by Unicode code point, including code points in the astral planes. We use character.length to determine whether a character is an astral-plane character (it then has length 2). If you really want to know whether it is an emoji, you'll need a stricter test.

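const input = "I love 🚀🚀🚀 so much 😍 ";

function getFrequency(string) {
  const freq = {};
  // for...of iterates by code point, so each emoji arrives as one string
  for (const character of string) {
    // BMP characters occupy a single UTF-16 code unit; skip them
    if (character.length === 1) continue;
    if (freq[character]) {
      freq[character]++;
    } else {
      freq[character] = 1;
    }
  }
  return freq;
}

console.log(getFrequency(input));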
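To illustrate why charAt splits an emoji while for...of and the spread operator keep it whole, here is a short sketch using only standard String methods. The variable name rocket is just for illustration:

```javascript
const rocket = "🚀"; // U+1F680, outside the Basic Multilingual Plane

// The emoji is stored as a surrogate pair, i.e. two UTF-16 code units.
console.log(rocket.length);                      // 2
// charAt returns only the first (high) surrogate, which is not a valid character on its own.
console.log(rocket.charAt(0) === "\uD83D");      // true
// Spreading (like for...of) iterates by code point, so the emoji stays in one piece.
console.log([...rocket].length);                 // 1
console.log(rocket.codePointAt(0).toString(16)); // 1f680
```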
      





To create an array of single-property objects, pass the result through this:

function breakProperties(obj) {
  return Object.keys(obj).map(function(key) {
    var result = {};
    result[key] = obj[key];
    return result;
  });
}
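For completeness, the counting step and the conversion step can be combined to produce the list format requested in the question. This is only a sketch: emojiFrequencyList is a hypothetical name, and it keeps the answer's simplification of treating every astral-plane character as an emoji.

```javascript
// Hypothetical helper: returns an array of single-property objects, one per emoji.
// Assumption carried over from the answer: any character longer than one
// UTF-16 code unit is treated as an emoji.
function emojiFrequencyList(text) {
  const freq = {};
  for (const character of text) {
    if (character.length === 1) continue; // skip BMP characters
    freq[character] = (freq[character] || 0) + 1;
  }
  // Turn { a: 1, b: 2 } into [{ a: 1 }, { b: 2 }], keeping first-seen order.
  return Object.entries(freq).map(([emoji, count]) => ({ [emoji]: count }));
}

console.log(emojiFrequencyList("I love 🚀🚀🚀 so much 😍 "));
```

For the sample sentence this yields a two-element array, with 🚀 mapped to 3 and 😍 mapped to 1.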

      
