Bot_U GitHub: https://github.com/ermrk/Bot_U
Thanks to the Alexa Prize, I have recently been diving even deeper into chatbot technologies. I would like to share a fun side project I created. It is called Bot_U, and it is a chatbot built from your Facebook chat history, so you can chat with a copy of yourself. This project will basically give you immortality: anybody will be able to talk with you even after your death and understand your ideas, plans, desires and dreams…
You can download the code from my GitHub: https://github.com/ermrk/Bot_U. Follow the steps in the readme to get it running. In a nutshell, you have to download an archive of your Facebook chat logs and run one Python script.
How does it work? Bot_U uses Python 3 with scikit-learn’s TF-IDF. The Python script parses all messages sent to you from the Facebook logs, pairs them with your answers, converts the messages to a vector representation, and saves them in the bot’s memory. This happens before you start chatting with the chatbot.
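As a rough illustration of this preparation step, here is a minimal sketch. It is not the actual Bot_U code: the pairs list and the build_memory helper are hypothetical names made up for this example, and the real script reads the pairs from the parsed Facebook logs instead of a hard-coded list.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical (message sent to you, your answer) pairs; the real
# project extracts these from your Facebook chat logs.
pairs = [
    ("hi, how are you?", "Fine, thanks!"),
    ("what are you doing tonight?", "Going to the cinema."),
    ("do you like pizza?", "I love pizza."),
]


def build_memory(pairs):
    """Fit TF-IDF on the incoming messages and keep the answers aligned by index."""
    messages = [message for message, _ in pairs]
    answers = [answer for _, answer in pairs]
    if not messages:
        # TfidfVectorizer cannot build a vocabulary from an empty list,
        # so fail early with a clearer message.
        raise ValueError("no message/response pairs to learn from")
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(messages)  # one sparse row per message
    return vectorizer, matrix, answers


vectorizer, matrix, answers = build_memory(pairs)
print(matrix.shape[0], "messages stored in memory")
```

Row i of the matrix and answers[i] belong to the same pair, which is what makes the later lookup possible.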
What happens when the chatbot receives a message from a user? The user’s message is also converted to a vector representation and compared with every message in the bot’s memory using cosine similarity. The answer to the most similar message is returned.
I recommend Christian S. Perone’s tutorial on TF-IDF: http://blog.christianperone.com/2011/09/machine-learning-text-feature-extraction-tf-idf-part-i/
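The lookup can be sketched in a few lines. Again, this is only an illustration under assumptions, not the code from the repository: the messages, the answers and the reply function are hypothetical and exist only to keep the example self-contained.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical memory: messages[i] was answered with answers[i].
messages = [
    "hi, how are you?",
    "what are you doing tonight?",
    "do you like pizza?",
]
answers = [
    "Fine, thanks!",
    "Going to the cinema.",
    "I love pizza.",
]

vectorizer = TfidfVectorizer()
memory = vectorizer.fit_transform(messages)


def reply(user_message):
    """Return the stored answer whose message is most similar to user_message."""
    # Reuse the fitted vocabulary so the query lives in the same vector space.
    query = vectorizer.transform([user_message])
    scores = cosine_similarity(query, memory)[0]
    best = scores.argmax()
    return answers[best]


print(reply("do you like pizza"))  # prints "I love pizza."
```

One thing to keep in mind with this sketch: if the user’s message shares no words with anything in memory, all similarities are zero and the first stored answer is returned, so a real bot may want a fallback reply for that case.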
Hi,
Thanks so much for this code! I’m just getting started with chatbots, and it turns out you had exactly the same idea as me!
I’m trying to use your code to make my own chatbot built from message logs.
The only problem is that when I try to run the code, Python throws this error:
Traceback (most recent call last):
Loading dataset of size 0
File "/Users/liampower/Desktop/Liam/facebook-eyeslikebutter/html/main.py", line 36, in
sklearn_representation = sklearn_tfidf.fit_transform(messages)
File "/Users/liampower/PycharmProjects/test/venv/lib/python3.6/site-packages/sklearn/feature_extraction/text.py", line 1352, in fit_transform
X = super(TfidfVectorizer, self).fit_transform(raw_documents)
File "/Users/liampower/PycharmProjects/test/venv/lib/python3.6/site-packages/sklearn/feature_extraction/text.py", line 839, in fit_transform
self.fixed_vocabulary_)
File "/Users/liampower/PycharmProjects/test/venv/lib/python3.6/site-packages/sklearn/feature_extraction/text.py", line 781, in _count_vocab
raise ValueError("empty vocabulary; perhaps the documents only"
ValueError: empty vocabulary; perhaps the documents only contain stop words
It seems like the parser can’t find any of the messages and is trying to use an empty JSON array.
Do you have any tips for me?
Thanks
Liam
Hello. There is a new issue on GitHub that might be related: https://github.com/thePetrMarek/Bot_U/issues/1
However, I would bet that there is some problem with chat_parser.py. Is there anything in the file data.json? It should contain message/response pairs.