Science Fair Project Encyclopedia
CRM114 is a program that filters email spam using a statistical approach. Whereas other statistical (Bayesian) filters score a message by the frequency of single words, CRM114 also matches phrases of up to five words in length, which lets it achieve a higher rate of spam recognition. This additional context makes it one of the more accurate spam filters available; the author claims recognition rates as high as 99.87%.
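The difference between single-word and phrase-based scoring can be illustrated with a small sketch. This is not CRM114's actual algorithm or code: it is a simplified naive Bayes classifier, assuming add-one smoothing, equal class priors, and whitespace tokenization, and the names `phrase_features` and `PhraseFilter` are hypothetical. It only shows how every phrase of one to five consecutive words can be counted as a feature in its own right.

```python
import math
from collections import Counter

MAX_PHRASE_LEN = 5  # phrases of up to five words, as described above


def phrase_features(text, max_len=MAX_PHRASE_LEN):
    """Return every contiguous phrase of 1..max_len words found in text."""
    words = text.lower().split()
    features = []
    for start in range(len(words)):
        for length in range(1, max_len + 1):
            if start + length > len(words):
                break
            features.append(" ".join(words[start:start + length]))
    return features


class PhraseFilter:
    """Toy naive Bayes classifier over phrase features (illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Count the phrases of one training message under 'spam' or 'ham'."""
        feats = phrase_features(text)
        self.counts[label].update(feats)
        self.totals[label] += len(feats)

    def spam_score(self, text):
        """Return a log-odds score; positive means more spam-like."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"]))
        score = 0.0
        for feat in phrase_features(text):
            # Add-one smoothing so an unseen phrase never zeroes out a class.
            p_spam = (self.counts["spam"][feat] + 1) / (self.totals["spam"] + vocab + 1)
            p_ham = (self.counts["ham"][feat] + 1) / (self.totals["ham"] + vocab + 1)
            score += math.log(p_spam / p_ham)
        return score
```

After training on a few labeled messages with `train(text, "spam")` and `train(text, "ham")`, a positive `spam_score` suggests spam and a negative one suggests legitimate mail. Because multi-word phrases are counted separately, a phrase such as "cheap pills online" adds evidence beyond what its individual words contribute.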
CRM114 is a good example of pattern recognition software in which machine learning is accomplished with a reasonably simple algorithm. Source code in C is available through the external link.
At a deeper level, CRM114 is also a string pattern matching language, similar to grep. In this role it can serve many applications other than detecting spam.
The name CRM114 is taken from the movie Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb. In the film, the CRM-114 discriminator is a radio device designed not to receive messages unless they are preceded by the correct three-letter code.
The content of this article is licensed from www.wikipedia.org under the GNU Free Documentation License.