Return a list of all the unique words in a file in Python
Write a function that takes 3 parameters: a filename and 2 substrings. It returns a list of the unique words in the file that contain both substrings, in the order they first appear in the file.

For example, the unique words in the previous sentence containing the substrings 'th' and 'at' are ['that']. The function should pass the following doctests:
    def words_contain2(filename, substring1, substring2):
        """
        >>> words_contain2('words_tst.txt', 're', 'cu')
        ['recursively', 'recursive.']
        >>> words_contain2('words_tst.txt', 'th', 'at')
        ['that']
        >>> words_contain2('/usr/share/dict/words', 'ng', 'warm')
        ['afterswarming', 'hearthwarming', 'housewarming', 'inswarming', 'swarming', 'unswarming', 'unwarming', 'warming', 'warmonger', 'warmongering']
        """

    if __name__ == '__main__':
        import doctest
        doctest.testmod(verbose=True)
Actually, I've tried this:
    def words_contain2(filename, substring1, substring2):
        files = open(filename, "r")
        files_read = files.read()
        filelist = files_read.split()
        sub1 = substring1
        sub2 = substring2
        count = 0
        result = ""
        while count < len(filelist):
            if sub1 in filelist[count] and sub2 in filelist[count]:
                result = result + filelist[count] + ","
            count += 1
        print result
But it returns the result: recursively, recursively, recursive, recursively
In my opinion, there are 2 mistakes:

- I got a string, not a list, in result.
- The example doctest in the question prints each word in the result list only once, but in the file the same word might appear more than once.
Unfortunately, I lost the original file words_tst.txt.
Filtering a list of strings that contain a substring, without maintaining uniqueness or order, is easy with the filter function:
    not_unique = filter(lambda x: str(x).__contains__(substring1) and str(x).__contains__(substring2), content.split())
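For instance, a minimal sketch of that filter call on an in-memory word list (the sample words are assumed, not from the lost test file):

```python
# sample word list standing in for content.split() (assumed data)
words = ["recursively", "the", "recursive.", "there", "recursively"]

# keep only words containing both substrings; order and duplicates are preserved
matches = list(filter(lambda x: "re" in x and "cu" in x, words))
print(matches)  # -> ['recursively', 'recursive.', 'recursively']
```

Note that the duplicate 'recursively' survives the filter, which is exactly why a deduplication step is still needed.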
But we need to create a unique list where order is maintained:
    def words_contain2(filename, substring1, substring2):
        file_ = open(filename, "r")
        content = file_.read()
        not_unique = filter(lambda x: str(x).__contains__(substring1) and str(x).__contains__(substring2), content.split())
        seen = set()
        return [x for x in not_unique if not (x in seen or seen.add(x))]
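Since the original words_tst.txt is lost, here is a self-contained sketch (Python 3) that exercises the same approach against a small temporary file; the file contents below are assumed for illustration only:

```python
import os
import tempfile

def words_contain2(filename, substring1, substring2):
    # unique words containing both substrings, in order of first appearance
    with open(filename) as f:
        words = f.read().split()
    seen = set()
    return [w for w in words
            if substring1 in w and substring2 in w
            and not (w in seen or seen.add(w))]

# small stand-in for the lost words_tst.txt (contents assumed)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("the function calls itself recursively because it is "
              "recursive. recursively that")
    path = tmp.name

print(words_contain2(path, "re", "cu"))  # -> ['recursively', 'recursive.']
print(words_contain2(path, "th", "at"))  # -> ['that']
os.remove(path)
```

The `x in seen or seen.add(x)` trick works because `set.add` returns `None` (falsy), so the word is added to `seen` as a side effect the first time it is encountered, and skipped on every later occurrence.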