python - Matching 2 short descriptions and returning a confidence level
I have data from banks obtained through Yodlee, and the corresponding transaction messages on mobile. Both contain short descriptions.
For example:
string1 = "tatasky_tpsl mumba ind"
string2 = "tatasky_tpsl"
These can be matched by checking whether one string is contained in the other. However, strings like
string1 = "t.g.i friday's"
string2 = "tgi friday's mumba mah"
still need to be matched. Is there any algorithm that gives a confidence level when matching two descriptions?
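You might want to use a normalized edit distance, known as the Levenshtein distance (see the Levenshtein distance article on Wikipedia). After computing the Levenshtein distance between the two strings, you can normalize it by dividing by the length of the longer string (or by the average length of the two strings). The normalized score can then act as a confidence measure: when dividing by the longer length it falls between 0 (identical strings) and 1 (nothing in common), so one minus the score gives a similarity. You can find 4-5 Python packages for calculating the Levenshtein distance, and you can also try an online edit distance calculator.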
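As a rough illustration of the normalized edit distance idea, here is a minimal pure-Python sketch that avoids depending on any particular package. The function names `levenshtein` and `levenshtein_confidence` are made up for this example, and turning the normalized distance into a confidence via `1 - distance / longest` is one reasonable choice, not the only one.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insertions, deletions, substitutions) between a and b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def levenshtein_confidence(a: str, b: str) -> float:
    """Assumed scoring: 1.0 for identical strings, 0.0 for nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as a perfect match
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(levenshtein_confidence("tatasky_tpsl mumba ind", "tatasky_tpsl"))
    print(levenshtein_confidence("t.g.i friday's", "tgi friday's mumba mah"))
```

Note that with this normalization the first pair from the question scores only about 0.55, because the ten extra trailing characters all count as edits. You would probably want to lowercase the strings and strip punctuation first, and pick the acceptance threshold empirically on your own data.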
Alternatively, one simple solution is the longest common subsequence (LCS) algorithm, which can be used here.
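A minimal sketch of the longest common subsequence approach might look like the following. The names `lcs_length` and `lcs_confidence` are hypothetical, and dividing by the length of the shorter string is an assumption made here so that a description fully contained in a longer one scores 1.0; the answer above does not say how to normalize.

```python
def lcs_length(a: str, b: str) -> int:
    """Return the length of the longest common subsequence of a and b."""
    previous = [0] * (len(b) + 1)  # LCS lengths for the previous prefix of a
    for ca in a:
        current = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                # Extend the best subsequence that ends before both characters.
                current.append(previous[j - 1] + 1)
            else:
                # Drop one character from either string, keep the better result.
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def lcs_confidence(a: str, b: str) -> float:
    """Assumed scoring: share of the shorter string covered by the LCS."""
    shortest = min(len(a), len(b))
    if shortest == 0:
        return 0.0  # nothing to compare against
    return lcs_length(a, b) / shortest


if __name__ == "__main__":
    print(lcs_confidence("tatasky_tpsl mumba ind", "tatasky_tpsl"))
    print(lcs_confidence("t.g.i friday's", "tgi friday's mumba mah"))
```

With this normalization the first pair from the question scores 1.0 and the second about 0.86 (12 of the 14 characters in "t.g.i friday's" are matched). The trade-off is that very short strings can score high against almost anything, so a minimum-length check may be worth adding.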