Text Mining Cleanup with Ruby & Regex

January 15, 2014

I have a word count hash like the following:

words = {
  "love"      => 10,
  "hate"      => 12,
  "lovely"    => 3,
  "loving"    => 2,
  "loved"     => 1,
  "peace"     => 14,
  "thanks"    => 3,
  "wonderful" => 10,
  "grateful"  => 10
  # there are more, but you get the idea
}

I want to make sure that "love", "loved" and "loving" are all counted as "love". That means adding their counts to the count for "love" and removing the other variations of "love". However, at the same time, I don't want "lovely" to be counted as "love", so it should be preserved as is.

So I'll get this in the end:

words = {
  "love"      => 13,
  "hate"      => 12,
  "lovely"    => 3,
  "peace"     => 14,
  "thanks"    => 3,
  "wonderful" => 10,
  "grateful"  => 10
  # there are more, but you get the idea
}

I have some code that sort of works, but I think the logic of the last line is wrong. I wonder if someone can help me fix it or suggest a better way of doing this.

words.select { |k| /\Alov[a-z]*/.match(k) }

# The "- 1" is for "lovely". I tried not to hard-code it by using words["lovely"],
# but that messed things up completely, so I had to do this.
words["love"] = purgedwordcount.select { |k| /\Alov[a-z]*/.match(k) }.map(&:last).reduce(:+) - 1

words.delete_if { |k| /\Alov[a-z]*/.match(k) && k != "love" && k != "lovely" }

Thanks!

words = {
  "love"      => 10,
  "hate"      => 12,
  "lovely"    => 3,
  "loving"    => 2,
  "loved"     => 1,
  "peace"     => 14,
  "thanks"    => 3,
  "wonderful" => 10,
  "grateful"  => 10
  # there are more, but you get the idea
}

# Build a new hash, folding every "lov..." word except "lovely" into "love".
aggregated_words = words.inject({}) do |memo, (word, count)|
  key = word =~ /\Alov.+/ && word != "lovely" ? "love" : word
  memo[key] = memo[key].to_i + count  # nil.to_i is 0, so a new key starts at 0
  memo
end

# => {"love"=>13, "hate"=>12, "lovely"=>3, "peace"=>14, "thanks"=>3, "wonderful"=>10, "grateful"=>10}
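If more than one word needs to be excluded, the same idea can be written without hard-coding "lovely" into the condition. The following is only a sketch under some assumptions: `merge_variants` and its `prefix:`, `into:` and `keep:` parameters are hypothetical names that do not appear in the original post, and it uses `String#match?`, which requires Ruby 2.4 or later.

```ruby
# Hypothetical helper (not from the original post): add the count of every
# word starting with `prefix` to the `into` key, except words listed in `keep`.
def merge_variants(counts, prefix:, into:, keep: [])
  pattern = /\A#{Regexp.escape(prefix)}/
  # Hash.new(0) makes missing keys start at 0, so += works on first use.
  counts.each_with_object(Hash.new(0)) do |(word, count), merged|
    key = (word.match?(pattern) && !keep.include?(word)) ? into : word
    merged[key] += count
  end
end

words = {
  "love"   => 10,
  "hate"   => 12,
  "lovely" => 3,
  "loving" => 2,
  "loved"  => 1,
  "peace"  => 14
}

merged = merge_variants(words, prefix: "lov", into: "love", keep: ["lovely"])
puts merged.inspect
```

With this shape, excluding another word only means adding it to the `keep:` array; the original `words` hash is left unmodified.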
