java - Regex: Find first occurrence and map to canonical value


I have input data like this:

1996 caterpiller d6 dozer sale (john deere , komatsu too!)

I want to match the first brand name found and map it to its canonical value.

Here's the map:

canonical  regex
komatsu    \bkomatsu\b
cat        \bcat(erpill[ae]r)?\b
deere      \b(john )?deere?\b

I can test whether a brand is in the string:

/\b(cat(erpill[ae]r)?|(john )?deere?|komatsu)\b/i.exec(...) != null 

Or find what the first match was:

/\b(cat(erpill[ae]r)?|(john )?deere?|komatsu)\b/i.exec(...)[0]; //caterpiller 

But is there a fast or convenient way to map the first match to the real value I want?

caterpiller => cat 

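Do I need to find the first match, then test it against the patterns in the map?

I need to run 10,000+ inputs against 10,000+ brands :D

I could loop over the map, testing each pattern against the input value, but that would find the first value that appears in the map, not the first one that appears in the input.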
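A minimal sketch of that problem, assuming the map is held as an array of entries (the names `brandMap`, `firstInMapOrder`, and `earliestInInput` are hypothetical, not from any library). The naive loop answers in map order; tracking each pattern's match position fixes the result, at the cost of one regex execution per brand per input:

```javascript
// Hypothetical in-memory representation of the map above.
const brandMap = [
  { canonical: 'komatsu', pattern: /\bkomatsu\b/i },
  { canonical: 'cat',     pattern: /\bcat(erpill[ae]r)?\b/i },
  { canonical: 'deere',   pattern: /\b(john )?deere?\b/i },
];

// Naive loop: returns the first brand in MAP order that matches anywhere.
function firstInMapOrder(text) {
  for (const { canonical, pattern } of brandMap) {
    if (pattern.test(text)) return canonical;
  }
  return null;
}

// Position-aware loop: keeps the brand whose match starts earliest in the input.
function earliestInInput(text) {
  let best = null;
  let bestIndex = Infinity;
  for (const { canonical, pattern } of brandMap) {
    const m = pattern.exec(text);
    if (m !== null && m.index < bestIndex) {
      best = canonical;
      bestIndex = m.index;
    }
  }
  return best;
}

const text = '1996 caterpiller d6 dozer sale (john deere , komatsu too!)';
console.log(firstInMapOrder(text)); // komatsu
console.log(earliestInInput(text)); // cat
```

With 10,000+ brands the second function still runs every pattern against every input, so it is correct but probably not the fast way.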

One idea is to associate each capture group number with an index in an array of canonical names. Each different brand must have its own number:

var can = ['', 'komatsu', 'cat', 'deere'];
//             ^idx1      ^idx2  ^idx3
var re = /\b(?:(komatsu)|(cat(?:erpill[ae]r)?)|((?:john )?deere))\b/ig;
//             ^ 1st grp ^ 2nd grp             ^ 3rd grp
var text = '1996 caterpiller d6 dozer sale (john deere , komatsu too!)';
var res;

while ((res = re.exec(text)) !== null) {
    for (var i = 1; i < can.length; i++) { // test each group until one is defined
        if (res[i] !== undefined) {
            console.log(can[i] + "\t" + res[0]);
            break;
        }
    }
}

// result:
// cat      caterpiller
// deere    john deere
// komatsu  komatsu
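With 10,000+ brands, writing the combined pattern by hand is impractical, and any capture group inside a brand's own regex would shift the group numbers. The following is a minimal sketch of one way around both issues, not the code from the answer above: it builds the pattern from the map and wraps each brand in a named group. It assumes a runtime with ES2018 named capture groups; `brands`, `combined`, and `firstBrand` are hypothetical names. Without the `g` flag, `exec` returns only the leftmost match, which is the first brand in the input.

```javascript
// Hypothetical map: [canonical name, regex source without the \b anchors].
const brands = [
  ['komatsu', 'komatsu'],
  ['cat', 'cat(erpill[ae]r)?'],
  ['deere', '(john )?deere?'],
];

// One named group per brand: \b(?:(?<b0>...)|(?<b1>...)|...)\b
// Inner unnamed groups are still numbered, but they do not appear in m.groups.
const combined = new RegExp(
  '\\b(?:' + brands.map(([, src], i) => `(?<b${i}>${src})`).join('|') + ')\\b',
  'i'
);

// Returns the canonical name and matched text of the first brand in the input,
// or null when no brand matches.
function firstBrand(text) {
  const m = combined.exec(text);
  if (m === null) return null;
  for (let i = 0; i < brands.length; i++) {
    if (m.groups['b' + i] !== undefined) {
      return { canonical: brands[i][0], matched: m[0] };
    }
  }
  return null;
}

const sample = '1996 caterpiller d6 dozer sale (john deere , komatsu too!)';
console.log(firstBrand(sample)); // { canonical: 'cat', matched: 'caterpiller' }
```

The regex is compiled once and reused for every input. If two brands could match at the same position, the one listed earlier in the map wins, so the order of the alternatives still matters in that case.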
