-
Ukkonen's Suffix Tree Algorithm Implemented in Python
I implemented Suffix Tree algorithm in Python in the past few days, you can refer to my github repository if you’re interested in. The famous tutorial on stackoverflow is a good start. If you pay enough attention to some details like state updating or suffix links managing, you can write the code by yourself for sure. When a lot of small operations are entangling together, the ancient trick flowchart is always a wise choice for assisting. Also a tree visualizer might help a lot with debugging for sure. In order to make the intervene of the processes easy to understand, we demonstrate the flowchart in this post.
-
TwEater: A Python bot for scraping more tweets and comments
Today, I submitted my way of scraping tweets on Twitter - TwEater on GitHub.
-
String algorithm: Suffix Tries
Trie
A trie is a tree representing a collection of strings with one node per common prefix. It is a natural tool to manage either a set or a map where keys are strings. In a trie tree:
-
String Algorithm: Bloom Filter
Just like hash table with multiple \((k)\) hash functions. The searching time complexity of bloom filter is \(O(k)\), just as that of hash table \(O(1)\). For hash table, in order to keep the the false positive rate sufficiently low, the array must be very large to make sure that only a small fraction of bits are set to 1. Whereas for Bloom filter, the fraction could be relatively big (~50%) if the number \(k\) (the number of hash functions) and \(m\) (the size of bit array) are properly set.
-
String Algorithms: KMP in python
KMP is fantastic. There are some practical ideology in its design.