Bot_U GitHub: https://github.com/ermrk/Bot_U

I have recently been diving even deeper into chatbot technologies thanks to the Alexa Prize. I would like to share a fun side project I created. It is called Bot_U. It is a chatbot built from your Facebook chat history, so you can chat with a copy of yourself. This project will basically give you immortality. Anybody will be able to talk with you even after your death and understand your ideas, plans, desires and dreams…

You can download the code from my GitHub https://github.com/ermrk/Bot_U. Follow the steps in the readme to get it running. In a nutshell, you will have to download an archive of your Facebook chat logs and run one Python script.

How does it work? Bot_U uses Python 3 with scikit-learn’s TF-IDF. The Python script parses all messages sent to you from the Facebook logs, pairs them with your answers, converts the messages to a vector representation and saves it in the bot’s memory. This happens before chatting with the chatbot.
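The pairing step can be sketched roughly as follows. This is only an illustration and not the code from the repository: the log format (a chronological list of sender and text tuples), the function name pair_messages and the sample conversation are all assumptions made for the example.

```python
from typing import List, Tuple


def pair_messages(log: List[Tuple[str, str]], me: str) -> List[Tuple[str, str]]:
    """Pair each message sent to `me` with the next reply written by `me`.

    `log` is a hypothetical, already-parsed chat log in chronological order.
    """
    pairs = []
    pending = None  # latest message from someone else that is still unanswered
    for sender, text in log:
        if sender != me:
            # A newer incoming message replaces an older unanswered one.
            pending = text
        elif pending is not None:
            # The first reply after an incoming message becomes its answer.
            pairs.append((pending, text))
            pending = None
    return pairs


# Made-up conversation, only for illustration.
log = [
    ("Alice", "Are you coming tonight?"),
    ("Me", "Yes, around eight."),
    ("Me", "Save me a seat."),
    ("Alice", "What should I bring?"),
    ("Alice", "Pizza or salad?"),
    ("Me", "Pizza, please."),
]
pairs = pair_messages(log, me="Me")
print(pairs)
```

In this sketch, consecutive incoming messages keep only the last one and follow-up replies are dropped. That is a design choice of the example; other pairing rules, such as joining consecutive messages, are just as reasonable.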

What happens when the chatbot receives a message from a user? The user’s message is also converted to a vector representation and compared to all messages in the bot’s memory by cosine similarity. The answer to the most similar message is returned.
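Here is a minimal sketch of that lookup with scikit-learn, assuming the memory is a small list of message and answer pairs. The sample pairs and the reply function are made up for illustration, and the actual script in the repository may be organised differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical memory: messages sent to you, paired with your answers.
pairs = [
    ("Are you coming tonight?", "Yes, around eight."),
    ("What should I bring?", "Pizza, please."),
    ("How was the exam?", "Better than expected."),
]
messages = [message for message, _ in pairs]
answers = [answer for _, answer in pairs]

# Done once, before chatting: learn the vocabulary and vectorize the memory.
vectorizer = TfidfVectorizer()
memory = vectorizer.fit_transform(messages)


def reply(user_message: str) -> str:
    """Return the answer paired with the most similar stored message."""
    # Reuse the fitted vocabulary so the query lives in the same vector space.
    query = vectorizer.transform([user_message])
    scores = cosine_similarity(query, memory)[0]
    return answers[scores.argmax()]


print(reply("how did the exam go"))
```

If the user's message shares no words with the memory, every score is zero and this sketch falls back to the first stored answer. A real bot would probably want a similarity threshold and a default reply for that case.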

I recommend the TF-IDF tutorial by Christian S. Perone: http://blog.christianperone.com/2011/09/machine-learning-text-feature-extraction-tf-idf-part-i/

2 responses

  1. Hi,
    Thanks so much for this code! I’m just getting started with chatbots, and it turns out you had exactly the same idea as me!
    I’m trying to use your code to make my own chatbot built from message logs.
    The only problem is that when I try to run the code, Python throws this error.

    Traceback (most recent call last):
    Loading dataset of size 0
    File "/Users/liampower/Desktop/Liam/facebook-eyeslikebutter/html/main.py", line 36, in <module>
    sklearn_representation = sklearn_tfidf.fit_transform(messages)
    File "/Users/liampower/PycharmProjects/test/venv/lib/python3.6/site-packages/sklearn/feature_extraction/text.py", line 1352, in fit_transform
    X = super(TfidfVectorizer, self).fit_transform(raw_documents)
    File "/Users/liampower/PycharmProjects/test/venv/lib/python3.6/site-packages/sklearn/feature_extraction/text.py", line 839, in fit_transform
    self.fixed_vocabulary_)
    File "/Users/liampower/PycharmProjects/test/venv/lib/python3.6/site-packages/sklearn/feature_extraction/text.py", line 781, in _count_vocab
    raise ValueError("empty vocabulary; perhaps the documents only"
    ValueError: empty vocabulary; perhaps the documents only contain stop words

    It seems like the parser can’t find any of the messages and is trying to use an empty JSON array.
    Do you have any tips for me?
    Thanks
    Liam
