To build a conversation-analyzing AI, we needed to fit a form to several functions. Something that could cling. Something that could both utilize and contain 2 cameras, a Matrix Voice, and speakers. Something cute and unobtrusive you wouldn’t mind sitting on your shoulder all day. What better than a chameleon? (They’ve already got a judgy expression, why not put that to use?) Thus Chameo was born. After many design iterations, we decided on a streamlined chameleon-based look that retained 100% of that classic chameleon charisma.
Assembly
- Tetris who? We’re playing a whole new game now. We’ve got batteries. We’ve got the Matrix Voice. We’ve got Raspberrys of the Pi variety (Zero W and 3 Model A+). Make a beautiful sandwich of AI components.
Front view of the innards, with the Matrix Voice on top and servo facing forward
- Spot weld the batteries together. No one needs loose innards wiggling around. And to that end, glue those Pis and circuit boards together.
- Where should Chameo talk? Out of its butt of course. Connect a speaker to the audio amplifier board, which is connected to the Pi 3 A+.
- To keep your companion robot happy and charged, install a convenient micro USB port on the side.
Right side view of the innards, with batteries visible on the bottom and charge port facing forward
- Chameo's body was built from 5mm thick EVA foam. Patterns for the tail, body, and head were drawn up and cut from a foam sheet, then heated (with the ol' heat gun) to retain their shape, and rubber cemented.
The pattern used for Chameo's chunky little body case
- A base to fit Chameo onto the user's shoulder was made in Fusion 360 and 3D printed. With the help of a hot glue gun, magnets were affixed to the base, giving Chameo the staying power a true friend should have.
- The beauty of this chameleon buddy lies in its listening skills. In order to give the Matrix Voice a chance to hear, holes were cut from the sides of the body. This has the added effect of revealing the Matrix Voice's LEDs in action. Beautiful indeed.
Right side view of Chameo's body, complete with stylish holes to allow the passage of sound
Back view of the elegant speaker hole
- Once paint was applied to the foam, the electronic guts were placed inside the body and head shells.
Clingy in the best way, with the help of magnets
- At this point, Chameo is ready to put its sweet coding to use and listen to your conversations with a discerning ear. It picks up on key phrases to let you know what kinds of tones you're using and how you're coming across. No more wondering how a conversation went or fretting days later about how you sounded during that meeting, phone call, or casual chat.
In this enticing link you will find:
- app.py, a server that hosts the sentiment endpoint, where the Snips Assistant goes to get the ratings of phrases Chameo picks up (mentioned in more detail below)
- formConvo.py, a program that configures the dialogue into data that is usable for the sentiment endpoint (As you will be installing this app directly from the console, this program serves as the source code.)
- readme.md, a handy guide to the order in which Chameo's dandy software should be set up
- sockTest.py, an experiment for interfacing with the direction-of-sound service
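For a rough idea of what a server like app.py does, here is a minimal sketch using only Python's standard library. It assumes the sentiment endpoint is plain HTTP, which this write-up does not actually state. The `/sentiment` path, the `speaker` query parameter, the port, and the helper names are all made up for illustration and are not taken from the real app.py.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def latest_rating(speaker: str) -> str:
    """Hypothetical stand-in for the real analysis; always answers 'neutral'."""
    return "neutral"


def build_response(path: str):
    """Map a request path to a (status code, JSON body) pair."""
    parsed = urlparse(path)
    if parsed.path != "/sentiment":  # assumed route name
        return 404, json.dumps({"error": "not found"})
    # Default to the wearer when no speaker is named (an assumption).
    speaker = parse_qs(parsed.query).get("speaker", ["user"])[0]
    return 200, json.dumps({"speaker": speaker, "rating": latest_rating(speaker)})


class SentimentHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_response(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host: str = "0.0.0.0", port: int = 5000) -> None:
    """Block forever, answering GET requests. Call this to start the server."""
    HTTPServer((host, port), SentimentHandler).serve_forever()
```

Calling `serve()` starts the server. The voice assistant side would then request `/sentiment?speaker=user` and read the `rating` field out loud.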
How it Works
- Asking Chameo, "How did I come across?" will activate the Snips Assistant. Responding to the keywords "I" and "come across", Snips will query the sentiment endpoint.
- From there, the sentiment endpoint program will pull the last 1,000 or so words from the database and separate them by the direction they came from to determine who was speaking. In this case, only the user's words will be analyzed because of the "I" keyword.
- Using the TextBlob library, phrases are rated on a scale from -1 (most negative) to 1 (most positive). If/else statements sort each score into negative, positive, or neutral, and then the Snips Assistant conveys that to the user verbally. Thus Chameo will be able to provide you with the answers you need.
- In the scope of this tutorial, the cameras were not put into play, but in future iterations they would be used to track the speakers in a conversation and potentially to analyze emotional information from faces.
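The direction filtering and the if/else rating described above can be sketched roughly like this. It is an illustration under assumptions, not Chameo's actual code: the `Utterance` record, the idea that the wearer sits at 0 degrees, the 45 degree tolerance, and the 0.1 neutrality threshold are all invented for the example. The only real library call is TextBlob's `sentiment.polarity`, which returns a float between -1.0 and 1.0.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utterance:
    """One transcribed phrase plus the angle it was heard from (hypothetical schema)."""
    text: str
    direction_deg: float


def angular_distance(a: float, b: float) -> float:
    """Smallest angle between two bearings, in degrees (0 to 180)."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def user_utterances(
    utterances: List[Utterance],
    user_direction_deg: float = 0.0,  # assumed bearing of the wearer
    tolerance_deg: float = 45.0,      # assumed width of the "that's me" cone
) -> List[Utterance]:
    """Keep only the phrases that arrived from roughly the wearer's direction."""
    return [
        u for u in utterances
        if angular_distance(u.direction_deg, user_direction_deg) <= tolerance_deg
    ]


def textblob_polarity(text: str) -> float:
    """Score text with TextBlob; imported lazily so the rest runs without it."""
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


def classify(polarity: float, threshold: float = 0.1) -> str:
    """Turn a -1..1 polarity score into a spoken-friendly label."""
    if polarity > threshold:
        return "positive"
    elif polarity < -threshold:
        return "negative"
    else:
        return "neutral"


def rate_user(
    utterances: List[Utterance],
    score: Callable[[str], float] = textblob_polarity,
) -> str:
    """Rate only the wearer's side of the conversation."""
    mine = user_utterances(utterances)
    if not mine:
        return "neutral"  # nothing heard from the wearer, so nothing to judge
    return classify(score(" ".join(u.text for u in mine)))
```

With TextBlob installed, `rate_user(conversation)` scores the wearer's phrases for real. Passing a different `score` function makes it easy to try the thresholds without the library.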