IsMyTextCreepy.com is an experiment designed to test whether an Artificial Neural Network (ANN) can learn subtleties of human language that even humans have trouble defining and identifying, such as "creepiness".
The problem formulation is binary classification, where we train the network to classify texts as either "creepy" or "normal". "Creepy" training examples were collected from /r/creepyPMs, an online community where users submit screenshots of creepy private messages. Submissions were scraped and converted from screenshots into text. "Normal" training examples were collected from non-spam examples in the SMS Spam Dataset.
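To make the binary formulation concrete, here is a minimal sketch of how the two sources could be merged into one labeled dataset. The function names, the 1/0 label convention, and the 20% validation split are illustrative assumptions, not a description of the code behind the site.

```python
import random

CREEPY = 1  # assumed label convention: 1 = "creepy"
NORMAL = 0  # assumed label convention: 0 = "normal"


def build_dataset(creepy_texts, normal_texts, seed=0):
    """Pair every text with its class label and shuffle the result."""
    examples = [(text, CREEPY) for text in creepy_texts]
    examples += [(text, NORMAL) for text in normal_texts]
    # A fixed seed keeps the shuffle reproducible between runs.
    random.Random(seed).shuffle(examples)
    return examples


def split_dataset(examples, validation_fraction=0.2):
    """Hold out the last part of the shuffled examples for validation."""
    n_validation = int(len(examples) * validation_fraction)
    cutoff = len(examples) - n_validation
    return examples[:cutoff], examples[cutoff:]
```

Shuffling before splitting matters here because the two classes come from separate sources; without it, the held-out portion would contain only "normal" texts.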
The specific type of ANN used is a Convolutional Neural Network (CNN). CNNs are typically used for machine learning tasks involving images. However, in "Character-level Convolutional Networks for Text Classification," Zhang et al. use them to classify text. What's interesting is that the input is at the character level instead of the word level (as in most approaches to natural language processing). This has the advantage that "abnormal character combinations such as misspellings and emoticons may be naturally learnt." That suits our data, in which misspellings and emoticons are common.
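To illustrate what character-level input means in practice, here is a minimal sketch of one-hot character encoding, the general idea behind this kind of model. The alphabet, the fixed length of 128, and the name encode_text are assumptions chosen for illustration; the settings actually used on the site are not stated here.

```python
import string

# Assumed alphabet: lowercase letters, digits, and ASCII punctuation.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
MAX_LENGTH = 128  # assumed fixed input length


def encode_text(text, max_length=MAX_LENGTH):
    """Return a max_length x len(ALPHABET) matrix of 0/1 values.

    Row i is the one-hot vector for character i of the lowercased text.
    Characters outside the alphabet (spaces included) and positions past
    the end of the text stay all-zero; longer texts are truncated.
    """
    matrix = [[0] * len(ALPHABET) for _ in range(max_length)]
    for position, ch in enumerate(text.lower()[:max_length]):
        index = CHAR_TO_INDEX.get(ch)
        if index is not None:
            matrix[position][index] = 1
    return matrix


encoded = encode_text("heyyy :)")
```

Because each character gets its own row, a misspelling like "heyyy" or an emoticon like ":)" reaches the network as an ordinary pattern of characters rather than as an unknown word.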