Pan Tilt Camera

I have decided to build a new head project encompassing a pan and tilt camera along with stereo microphones for directional hearing. The whole thing will also be wireless so it can be attached to any android project in the future.

I will be addressing the issue of getting the video and audio away to a remote computer for processing and also getting the control signal back to the device to move the camera etc. The remote computer could be replaced by a person for teleoperation, but I am also going to attempt to write some machine vision software to autonomously control the camera to follow movement or different coloured objects.

Firstly I have built a simple pan/tilt assembly for my camera from two servos. The camera appears to be a CMOS device which has power in and composite video out. The brackets are made from aluminium:


If you wanted to build something similar but not wireless, any USB PC webcam would do. As I am making this wireless I need some way of getting the video through the air. So, I bought this ‘video sender’ pair, which consists of a 2.4GHz transmitter and a receiver. They will transmit and receive composite video and also stereo audio:


This is a consumer device which you would buy if you wanted to watch cable / satellite in your spare bedroom, so the picture quality is pretty good. As I’m not going to use it for watching TV I need a way of getting the video into my PC. I found this USB video-in device in PC-World for 30GBP, it has composite (and S-Video) in as well as stereo audio:


So, next I made the actual head ‘frame’ which will hold everything at the android end. It will need to have the video sender, camera assembly, and some electronics mounted in it, as well as the ears at a later date. Here’s the video sender in its place:


the top on:


and the camera mounted in place:



The whole thing is plywood and 6mm threaded steel. There is plenty of space for the electronics at the back (where the brain is in my head), and also space for the camera to move around:


So how will it work?

The handy thing about this video sender pair is that it allows the signal from your remote control to be sent ‘back the other way’ from the receiver to the transmitter. This allows you to pause/play/change channel on whatever device you are watching (from your spare bedroom). I will be using this handy feature to control the servos (and anything else) remotely from the computer.

The finished system will look something like this, the yellow arrows are signals from the sensors towards the computer, green arrows are the control signals to move the camera etc:


Now, the video sender manual says that the ‘sending IR’ feature doesn’t work with certain cable remote controls (for some reason), so I’m assuming that it doesn’t just relay whatever is received to the other end and transmit it again. To be safe I’m going to stick to using Sony IR remote control codes.

Sending Sony IR codes is easy using a ‘picaxe’ microcontroller, and it is also easy to read and write RS232 with one. Converting data from my computer’s serial port and transmitting it as IR should be a few lines of code. The other end will be similar but the other way around – reading IR and converting to RS232.

The best resource for this is where there is documentation and very helpful forums. I may detail the code I used later once I’ve done it but the pseudo code is something like:

-read RS232 string to variable
-transmit variable as IR
-go back to the beginning

and then the other way around at the other end… simple, right? And the picaxe supports a full Sony IR set of codes from 0 to 128 or something.

Now, it’s all very well transmitting numbers one after the other, but how will the servo controller know what to do with them when they get there? After all, there are two (and potentially more) servos to control, so I need to be able to address each servo and also send the desired position for it to turn to.

A bit more code

Ok, so on my receiving picaxe I need a bit more code as well as the loop chucking out RS232 values. It will look something like this:


Using this method, a different subroutine will get called depending on what servo number is sent first of all (1 or 2). Then it will wait for the position value to turn the servo to, and output it to the relevant servo. If at any point a zero is sent it will go back to the beginning – like a ‘global reset’. Oh, and did I mention that the picaxe can control standard hobby R/C servos directly?
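To make the addressing scheme concrete, here is a minimal sketch of it in Python rather than picaxe BASIC. It is only an illustration of the idea described above, not the code running on the chip: the class and function names are made up, the servo output is replaced by a dictionary, and it is written for current Python 3.

```python
class ServoReceiver:
    """Hypothetical model of the receiving end: servo number, then position."""

    def __init__(self, servo_ids=(1, 2)):
        self.servo_ids = set(servo_ids)
        self.positions = {}    # stand-in for actually driving the servos
        self.selected = None   # servo currently waiting for its position

    def feed(self, value):
        if value == 0:
            # Zero is the 'global reset': drop any half-finished command.
            self.selected = None
        elif self.selected is None:
            # Waiting for a servo number; anything else is ignored.
            if value in self.servo_ids:
                self.selected = value
        else:
            # Waiting for the position of the selected servo.
            self.positions[self.selected] = value
            self.selected = None


def encode_move(servo, position):
    """Sender side: servo number first, then the position (never zero)."""
    if position == 0:
        raise ValueError("0 is reserved for the reset signal")
    return [servo, position]


rx = ServoReceiver()
# Move servo 1, start a command for servo 2, reset it, then send it properly.
for value in encode_move(1, 90) + [2, 0] + encode_move(2, 45):
    rx.feed(value)
print(rx.positions)
```

Because zero always means reset, a position of zero can never be sent, so the usable position range has to start at 1 under this scheme.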

So it’s looking good, the only real problem to solve is writing some vision software…

Some machine vision software – the easy way

Firstly, this project will use a Windows computer, although I will be using the Python programming language which will also run on Linux. However, I’m using DirectX to get the video in, so if you want it to run on Linux instead you’ll need to investigate ‘Video for Linux’.

Now here is an important link: Python webcam fun

I came across this link while trying to solve this problem. I have modified some of the code to make it do what I want, and I’ve also written some other examples to demonstrate other things like just displaying the webcam image on the screen and also colour detection.

In case the linked article disappears for any reason, the things you will need to install to make it work are as follows. All of them are open source / free / GPL so nothing to pay:

The Python programming language
VideoCapture for Python
The Python imaging library
Pygame – for displaying the graphics

All are pretty easy to install on Windows. The latest version as of writing is Python 2.4. There are plenty of Python books and tutorials out there, just google for ‘python tutorial’.

Some example code

First of all, these examples only work if the webcam is set to 320×240 by default; any bigger or smaller and they will fail. The Python Imaging Library (PIL) has a scale function built in, so you could modify the code. I have tried my ‘Creative USB Webcam Pro NX’ and also the video-in device pictured above, which both work. I tried a Logitech Quickcam but I couldn’t get it to stay at 320×240 through the driver so it failed. Of course the webcam / video device needs its driver installed and needs to be seen by Windows etc.

Just paste the following code into a test file called ‘’ and double click it…

So, in this first example the webcam image is just displayed on the screen. The code is here:

In the next one I nicked the code from the other article linked above, but I made extra sections to detect movement so I ended up with the image divided into 9 sections. The results appear like this, each square is coloured in red when movement is detected:


The code is here:
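For readers who just want the gist, the 3×3 idea can be sketched as follows. This is not the linked code, only a minimal illustration: it assumes each frame has already been turned into a 2D list of greyscale values, the function name and both thresholds are invented for the example, and it is written for current Python 3.

```python
def motion_grid(prev, curr, rows=3, cols=3, pixel_delta=30, min_changed=0.05):
    """Return a rows x cols grid of booleans, True where a cell saw movement.

    prev and curr are equally sized 2D lists of greyscale values (0-255).
    pixel_delta and min_changed are illustrative guesses, not tuned values.
    """
    height, width = len(curr), len(curr[0])
    grid = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        row_flags = []
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            # Count pixels in this cell that changed noticeably between frames.
            changed = sum(
                1
                for y in range(y0, y1)
                for x in range(x0, x1)
                if abs(curr[y][x] - prev[y][x]) > pixel_delta
            )
            area = (y1 - y0) * (x1 - x0)
            row_flags.append(changed > min_changed * area)
        grid.append(row_flags)
    return grid


# Tiny 6x6 example: only the top-left corner changes between frames.
before = [[0] * 6 for _ in range(6)]
after = [row[:] for row in before]
for y in range(2):
    for x in range(2):
        after[y][x] = 255
for row in motion_grid(before, after):
    print(row)
```

Each True cell corresponds to one of the squares that gets coloured in red on screen.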

Colour detection

The problem with detecting motion is that lots of things may move within the ‘scene’ as people / objects move around. So I decided to narrow it down to only detecting red objects. This way, I can wave something red in front of the camera and have it only pay attention to that, rather than my arm and body moving at the same time. Obviously it could be any colour but I happened to have something red nearby.

First I tried just removing the green and blue from the image and using just the red for the motion detection. But then I remembered that white and purple contain red too. I tried thresholding to detect only large amounts of red, but white contains large amounts of red, green and blue, so the results looked as follows:


You can see the wall reflecting white(ish) light. The code, if you want it is here:

At first I was quite confused because the red thing (my USB key) appeared black in the image – it wasn’t red enough to go above the threshold. So, I wrote a ‘colour counting’ program that counts how many pixels are red in the image. As I waved my USB key in front of the cam the number got higher, hooray!


The code for that is here:

But then I realised that I didn’t actually have to display the filtered image. I just had to count the red and replace the original motion detection algorithm with the ‘colour counting’ algorithm. This made things easier, and also meant it wouldn’t actually have to detect motion – it just has to detect red objects and colour in the square where it sees red. As the object moves from square to square the camera can follow whichever square is flagged.

However, white still contains red so I had to think about what makes things look red. Answer: more red than green and blue by about 50%.

So I just counted up the red, green and blue in each square and said if red is greater than green x1.5 and greater than blue x1.5 then ‘this is red’. This works well, below you can see me waving the red USB key again – only it’s covered by the coloured in red squares:


The final code is here:
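The ‘more red than green and blue by about 50%’ rule is simple enough to sketch on its own. The following is not the final listing, just a minimal illustration of the test applied to each square. It assumes the frame is already a 2D list of (red, green, blue) tuples, the function name is made up, and it is written for current Python 3.

```python
def red_grid(pixels, rows=3, cols=3, ratio=1.5):
    """Return a rows x cols grid of booleans, True where a cell looks red.

    pixels is a 2D list of (r, g, b) tuples. A cell counts as red when its
    summed red is more than ratio times both its summed green and blue.
    """
    height, width = len(pixels), len(pixels[0])
    grid = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        row_flags = []
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            red = green = blue = 0
            for y in range(y0, y1):
                for x in range(x0, x1):
                    pr, pg, pb = pixels[y][x]
                    red += pr
                    green += pg
                    blue += pb
            row_flags.append(red > ratio * green and red > ratio * blue)
        grid.append(row_flags)
    return grid


# 3x3 test image: white everywhere except a reddish centre pixel.
white = (255, 255, 255)
image = [[white] * 3 for _ in range(3)]
image[1][1] = (200, 50, 40)
for row in red_grid(image):
    print(row)
```

White fails the test because its red total is no larger than its green or blue, which is exactly the case that defeated the plain red threshold.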

Next steps

Next I will need to output the data of whichever square is red to the serial port as RS232. This can be done with PySerial.

I’ll also need to build the picaxe circuits at each end, but as far as the Python software goes it will just output the relevant data as well as colouring the square in red. Then the camera can move a ‘little bit’ left, right, up or down unless only the centre square detects red (or no squares detect red).
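One way the ‘little bit’ decision could work is sketched below. This is a guess at the logic rather than finished code: the function name is hypothetical, averaging the flagged squares when several light up is an assumption of the sketch, and it is written for current Python 3.

```python
def camera_step(grid):
    """Return (pan, tilt) nudges, each -1, 0 or +1, from a 3x3 grid of flags.

    pan: -1 = left, +1 = right. tilt: -1 = up, +1 = down (image coordinates).
    """
    flagged = [(r, c) for r in range(3) for c in range(3) if grid[r][c]]
    if not flagged or flagged == [(1, 1)]:
        # Nothing seen, or the target is already centred: stay put.
        return (0, 0)
    # Assumption: with several squares flagged, steer towards their average.
    mean_row = sum(r for r, _ in flagged) / len(flagged)
    mean_col = sum(c for _, c in flagged) / len(flagged)
    pan = (mean_col > 1) - (mean_col < 1)
    tilt = (mean_row > 1) - (mean_row < 1)
    return (pan, tilt)


empty = [[False] * 3 for _ in range(3)]
top_left = [row[:] for row in empty]
top_left[0][0] = True
print(camera_step(empty))
print(camera_step(top_left))
```

The returned pair would then be turned into servo commands and written to the serial port.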

I will also write software to read the audio from the stereo ears and then some loosely termed ‘AI’ software to decide what to do with all the data. For instance, it will have to decide what to do in the event a sound is heard in one direction but an object moves in the other direction.


A change of direction

I have now fitted the head to Mr Stick Legs, he also now has a portable PC running Linux inside him. So, I no longer need to use the ‘IR remote control extending’ feature of the video sender to transmit data to the android. I still plan to have a remote control for some features of Mr Stick Legs but the majority of the control will be over IP with Wi-Fi for the android. Here’s the head which now has various electronics added to it as detailed in the main Mr Stick Legs article:


So that’s pretty much all I have to write about the head on its own. I may add ears, in which case I will update this article. The programming will also be detailed in a separate article…