Regex needed to match all dynamic category elements in Node.js

I am working with regular expressions in Node.js. My current regex matches the following lines:

 Category 1.2
 Category 1.3 and 1.4
 Category 1.3 to 1.4
 CATEGORY 1.3

      

Regular expression

((cat|Cat|CAT)(?:s\.|s|S|egory|EGORY|\.)?)( |\s)?((\w+)?([.-]|(–)|(—))?(\w+))(\s+(to|and)\s(\w+)([.-]|(–)|(—))(\w+))?

      

However, I need the regex to match the following lines as well:

 Category 1.2, 1.3 and 1.5
 Category 1.2, 4.5, 2.3 and 1.6
 Category 1.2, 4.5, 2.3, 4.5 and 1.6
 Figure 1.2 and 1.4     - should not match

      

How can I find every number in a category list (for example 1.2, 4.5, 2.3, 4.5 and 1.6) dynamically? The list can grow to any length, depending on how many categories are referenced.

Note: lines such as "Figure 1.2" must not be matched.

Can anyone help me? Thanks in advance.



2 answers


I suggest using a simplified version of the regex:

/cat(?:s?\.?|egory)?[ ]*(?:[ ]*(?:,|and|to)?[ ]*\d(?:\.\d+)?)*/gi

      

See the demo

If you also need to allow non-breaking (hard) spaces and en and em dashes between the numbers, add them to the character classes where necessary, for example:

/cat(?:s?\.?|egory)?[ —–\xA0]*(?:[ —–\xA0]*(?:,|and|to)?[  —–\xA0]*\d(?:\.\d+)?)*/gi

      



See another demo

Sample code:

var re = /cat(?:s?\.?|egory)?[ —–\xA0]*(?:[ —–\xA0]*(?:,|and|to)?[  —–\xA0]*\d(?:\.\d+)?)*/gi;
var str = 'Figure 1.2. Category 1.2 Figure 1.2. \nFigure 1.2.  Category 1.3 and 1.4 Figure 1.2. \nFigure 1.2.  Category 1.3 to 1.4 Figure 1.2. \nFigure 1.2.  CATEGORY 1.3 Figure 1.2. \n\nFigure 1.2.  Category 1.2, 1.3 and 1.5 Figure 1.2. \nFigure 1.2.  Category 1.2, 4.5, 2.3 and 1.6 Figure 1.2. \nFigure 1.2. Category 1.2, 4.5, 2.3, 4.5 and 1.6 Figure 1.2. \nFigure 1.2.  Category 1.3 — 1.4 Figure 1.2. \nFigure 1.2.  Category 1.3 – 1.4 Figure 1.2. \nFigure 1.2.  Category\xA01.3 – 1.4 Figure 1.2. (with hard space)';
var m;

while ((m = re.exec(str)) !== null) {
    // Guard against an endless loop on zero-length matches
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    // console.log works in Node.js; in a browser snippet, document.write would do
    console.log(m[0]);
}
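
If you need the individual numbers rather than the whole matched string, one possible approach is to capture the number list and split it with a second regex. This is only a sketch: the pattern below is a stricter variant of the regex above (it requires at least one number, and a separator between numbers), and `extractCategoryNumbers` is a made-up helper name, not something from the question.

```javascript
// Sketch: return one array of numbers per "Category ..." reference.
// extractCategoryNumbers is a hypothetical helper name used for illustration.
function extractCategoryNumbers(text) {
    // Group 1 captures the whole number list, e.g. "1.2, 4.5 and 1.6".
    // Assumed separators: comma, "and", "to", hyphen, en dash, em dash.
    var re = /\bcat(?:egory|s?\.?)[ \xA0]*(\d+(?:\.\d+)?(?:[ \xA0]*(?:,|and|to|[—–-])[ \xA0]*\d+(?:\.\d+)?)*)/gi;
    var results = [];
    var m;
    while ((m = re.exec(text)) !== null) {
        // Split the captured list into its individual numbers
        results.push(m[1].match(/\d+(?:\.\d+)?/g));
    }
    return results;
}

var text =
    'Figure 1.2 and 1.4\n' +
    'Category 1.2, 4.5, 2.3 and 1.6\n' +
    'CATEGORY 1.3 to 1.4';

console.log(extractCategoryNumbers(text));
```

Lines starting with "Figure" are skipped because the pattern only starts matching at the word "cat", "cats", "cat." or "category".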
      



I stopped working on this when I saw that stsibichev had already solved it for you, but I still want to share where I was heading:

var lines = 
 'Category 1.2\n'+
 'Category 1.3 and 1.4\n'+
 'Category 1.3 to 1.4\n'+
 'CATEGORY 1.3\n'+
 'Category 1.2, 1.3 and 1.5\n'+
 'Category 1.2, 4.5, 2.3 and 1.6\n'+
 'Category 1.2, 4.5, 2.3, 4.5 and 1.6\n'+
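 'Figure 1.2 and 1.4\n';

// Blank out every line that does not start with "category", then collect all "digits.digits" numbers.
// console.log is used because document.write does not exist in Node.js.
console.log(lines.replace(/^(?!category).*$/igm, '').match(/(\d+\.\d+)/gm));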

      

This snippet first uses replace to blank out every line that does not start with the word "category" (lines like "Figure ..."), and then uses match to collect all category numbers (digits, a full stop, digits) into a single array.
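
If you would rather keep the numbers grouped by line instead of getting one flat array, the same idea can be sketched line by line. This is only an illustration under the same assumption that each category reference sits on its own line; `numbersPerCategoryLine` is a hypothetical name.

```javascript
// Sketch: keep only lines that start with "category" and list their numbers.
// numbersPerCategoryLine is a hypothetical helper name used for illustration.
function numbersPerCategoryLine(text) {
    return text
        .split('\n')
        // Keep lines beginning with "category" (case-insensitive)
        .filter(function (line) { return /^\s*category\b/i.test(line); })
        // Collect every "digits.digits" number on the line
        .map(function (line) { return line.match(/\d+\.\d+/g) || []; });
}

var sample =
    'Category 1.2, 1.3 and 1.5\n' +
    'Figure 1.2 and 1.4\n' +
    'CATEGORY 1.3\n';

console.log(numbersPerCategoryLine(sample));
```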



I know your regex is much more complex than this, but this seems to do what you asked and it is very simple. Just sharing ;)

Hello
