python - Find most common sub-string pattern in a file

You are given a string:

  input_string = """HIYourName=this is not true HIYourName=a good day HIYourName=not HIYourName=goodbye!"""

Find the most common sub-string in the file. Here the answer is "HIYourName=". Note that the challenging part is that "HIYourName=" is not a "word" in the string, i.e. it is not delimited by spaces around it.

So, just to be clear, this is not the most-common-word problem.
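To illustrate why word counting does not help here, the following is a minimal sketch. It assumes the sample string is written with `HIYourName=` attached directly to the text that follows it; the variable names are made up for the illustration. Splitting on whitespace never produces the pattern as a token of its own, even though it occurs several times as a raw substring.

```python
from collections import Counter

# Sample string from the question; the exact spacing is an assumption.
text = ("HIYourName=this is not true HIYourName=a good day "
        "HIYourName=not HIYourName=goodbye!")

# Word-level counting: split on whitespace and count the tokens.
word_counts = Counter(text.split())

# The pattern never shows up as a token of its own
# (a Counter returns 0 for keys it has not seen) ...
pattern_as_word = word_counts["HIYourName="]

# ... although it occurs several times as a plain substring.
pattern_as_substring = text.count("HIYourName=")

print(pattern_as_word, pattern_as_substring)  # prints: 0 4
```

Each occurrence ends up fused into a different token such as `HIYourName=this`, so any solution has to look at substrings rather than words.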

Here is a simple brute-force solution:

  from collections import Counter

  s = " HIYourName=this is not true HIYourName=a good day HIYourName=not HIYourName=goodbye!"
  for n in range(1, len(s)):
      # Count every substring of length n (sliding window over s).
      substr_counter = Counter(s[i:i + n] for i in range(len(s) - n + 1))
      phrase, count = substr_counter.most_common(1)[0]
      if count == 1:  # early out: no substring of this size repeats
          break
      print('Size: %3d:  Occurrences: %3d  Phrase: %r' % (n, count, phrase))

The output for your sample string is:

  Size:   1:  Occurrences:  10  Phrase: ' '
  Size:   2:  Occurrences:   4  Phrase: 'Na'
  Size:   3:  Occurrences:   4  Phrase: 'Nam'
  Size:   4:  Occurrences:   4  Phrase: 'ourN'
  Size:   5:  Occurrences:   4  Phrase: 'HIYou'
  Size:   6:  Occurrences:   4  Phrase: 'IYourN'
  Size:   7:  Occurrences:   4  Phrase: 'urName='
  Size:   8:  Occurrences:   4  Phrase: ' HIYourN'
  Size:   9:  Occurrences:   4  Phrase: 'HIYourNam'
  Size:  10:  Occurrences:   4  Phrase: ' HIYourNam'
  Size:  11:  Occurrences:   4  Phrase: ' HIYourName'
  Size:  12:  Occurrences:   4  Phrase: ' HIYourName='
  Size:  13:  Occurrences:   2  Phrase: 'e HIYourName='
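The listing gives one winner per substring size, so a rule is still needed to pick a single answer. One possible rule, which is an assumption on my part and not part of the solution above, is to ignore very short substrings and return the longest phrase that still reaches the highest repeat count. The function name `longest_most_repeated` and the `min_len` parameter are hypothetical.

```python
from collections import Counter


def longest_most_repeated(s, min_len=2):
    """Return (phrase, count) for the longest substring that still reaches
    the highest occurrence count among substrings of at least min_len
    characters. Returns (None, 0) if nothing repeats."""
    best_phrase, best_count = None, 0
    for n in range(min_len, len(s)):
        # Count every substring of length n (sliding window over s).
        counts = Counter(s[i:i + n] for i in range(len(s) - n + 1))
        phrase, count = counts.most_common(1)[0]
        # The top count can only stay equal or drop as n grows, so stop
        # once nothing repeats or the count falls below the best so far.
        if count == 1 or count < best_count:
            break
        best_phrase, best_count = phrase, count
    return best_phrase, best_count


# Sample string without a leading space (spacing is an assumption).
sample = ("HIYourName=this is not true HIYourName=a good day "
          "HIYourName=not HIYourName=goodbye!")
result = longest_most_repeated(sample)
print(result)
```

Like the loop it is based on, this is still brute force: it builds a full counter for every substring size, so it is only practical for short inputs. The `min_len=2` default is there to skip single characters such as the space, which would otherwise win on raw frequency.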
