python - Matching 2 short descriptions and returning a confidence level -


I have data from banks obtained through Yodlee, and the corresponding transaction messages on mobile. Both contain short descriptions.

For example:

string1 = "tatasky_tpsl mumba ind"
string2 = "tatasky_tpsl"

These can be matched if one string is contained in the other. However, strings like

string1 = "t.g.i friday's"
string2 = "tgi friday's mumba mah"

still need to be matched. Is there any algorithm that gives a confidence level when matching two descriptions?

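You might want to use a normalized edit distance, also called Levenshtein distance (see the Wikipedia article on Levenshtein distance). After getting the Levenshtein distance between the two strings, you can normalize it by dividing by the length of the longer string (or by the average length of the two strings). The normalized score can act as the confidence: since a smaller distance means a closer match, one minus the normalized distance gives a value where 1 is an exact match. You can find four or five Python packages for calculating Levenshtein distance, and you can also try an online edit distance calculator.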
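A minimal pure-Python sketch of the normalized edit distance idea, using the strings from the question. The function names `levenshtein` and `confidence` are illustrative and not taken from any particular package, and the score is assumed here to be one minus the distance divided by the length of the longer string, so that 1.0 means identical strings.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def confidence(a: str, b: str) -> float:
    """Assumed score: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


print(confidence("tatasky_tpsl mumba ind", "tatasky_tpsl"))
print(confidence("t.g.i friday's", "tgi friday's mumba mah"))
```

Because the distance is divided by the longer length, a short description that is fully contained in a long one still scores well below 1.0, so it may be worth lowercasing and stripping punctuation before comparing.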

Alternatively, one simple solution is the longest common subsequence algorithm, which can also be used here.
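A short sketch of how a longest common subsequence could be turned into a score. The names `lcs_length` and `lcs_confidence` are hypothetical, and dividing by the length of the shorter string is an assumption made here so that a description fully contained in the other scores 1.0; other normalizations are equally possible.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    previous = [0] * (len(b) + 1)
    for ca in a:
        current = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                # extend the best subsequence ending before both characters
                current.append(previous[j - 1] + 1)
            else:
                # drop one character from either string, keep the better result
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def lcs_confidence(a: str, b: str) -> float:
    """Assumed score: LCS length divided by the length of the shorter string."""
    shortest = min(len(a), len(b))
    if shortest == 0:
        return 0.0  # nothing to compare against
    return lcs_length(a, b) / shortest


print(lcs_confidence("tatasky_tpsl mumba ind", "tatasky_tpsl"))
print(lcs_confidence("t.g.i friday's", "tgi friday's mumba mah"))
```

With this normalization the extra location text ("mumba ind", "mumba mah") does not lower the score, and only the two dots in "t.g.i" count against the second pair.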

