Video Posted on Updated on
Ever since reading Cynthia Breazeal’s book, “Designing Sociable Robots“, I’ve had this constant itch to implement her visual attention model on a robot, mainly the Nao as there’re four of them laying around in the lab these days. So, suffice to say that I’ve finally gotten around to scratching this particular itch, and boy does it feel good! 🙂
So, if you haven’t already read this book (and if you work in social robotics, shame on you), I highly recommend it! It’s full of lots in interesting insights and thoughts, and it is a sure read for any new MSc/PhD students that might be embarking on their research journeys.
To get to the point, in one of the chapters, Breazeal describes the vision system running on Kismet. This is actually something that was developed by Brian Scassellati (whilst working on “Cog”, if I recall), and I must say, I think that it is a little gem (hence why I wanted to see it run on the Nao). The model is intended to make the robot attend to things that it can see in the environment (e.g. things that move, people, objects, colours, ect) using basic visual features. Basically a bottom-up approach to visual processing: take lots of basic, simple features, and combine “upwards” to something that is more complex.
I’ve finally implemented the model, from scratch and made it run using either a Desktop webcam, or using it with an Aldebaran Nao. This little personal project also holds a more serious utility. I’m now beginning to make an online portfolio of my coding skills as I have seen some employers request example code recently (and I’m currently on a job hunt). I’ve made two YouTube videos of the model. The first is it running on my Desktop machine in the lab, where I talk through the model and the parameters that drive it. In the second video I show the slightly adapted version running with a Nao. Here are those two videos:
I have to admit that there is certainly room for improvement and fine tuning in the parameter settings, as well as some nice extensions. For example I had a bit of trouble as there is quite a lot of red in our office and the robot was immediately drawn to this. Either I need to change the method for attention point selection, or I need to take distance into account in some way (but there isn’t and RGBD sensor on the Nao at the moment). Currently for attention point selection I am finding all the pixels that share the same max value in the Saliency Map and finding the Center of Mass of the largest connected region of these. Alas in the videos this was sometimes background items…
Talking about possible extensions, I certainly see alot of room to have an adaptive mechanism that provides the “Top Down” task orientated control of the feature weights (at least) as was done with Kismet. There are a small subset of the different parameters driving the model and finding values that work can be a little tricky. Furthermore, I suspect as soon as you change setting, you will need to tweak parameters again.
Coding this system up also made me think about the blog post I wrote a about what a robot should do out of the box. I recall that the Nao was doing at least face detection and tracking. I pondered the idea of whether this kind of model would work as on out of the box program. Rather than having fixed weights, the robot could have some pre-set modes (as Kistmet did) and just cycle through these at different intervals. Perhaps the biggest problem will be the onboard processing that would need to happen. My program is multi-threaded (each feature map is computed in it’s own thread, as is the Nao motor control) and isn’t exactly computationally cheap, and so I can see it using quite a bit of the processing resources.
Anyway, there are lots of possibilities with this model both with respect to tweaking it, extending it, and merging it with other “modules” that do other things. As such, I’ve made the code available to download:
Desktop + webcam version (needs Qt SDK, OpenCV libs and ArUco libs): Link
Version for the Nao (needs Qt SDK, OpenCV libs, ArUco libs and NaoQi C++ SDK, v 1.14.5 in my case): Link
Note: With the NaoQi SDK, this isn’t free. You need to be a Developer and I have access through the Research Projects at Plymouth University. I can’t provide you with the SDK as this would go against the agreement we have with Aldebaran… Sorry… 😦