It depends of what error is acceptable for you.
We have similar setup where we have camera which looks at some plane with object on it that can be moved.
We assume that the image and plane are parallel.
First lets calculate the rotation. Put the tool in such position that you see it on the center of the image, move it on one axis select the point on the image that is corresponding to tool position.
Those two points will give you a vector in the image coordinate system.
The angle between this vector and original image axis will give the rotation.
The scale may be calculated in the similar way, knowing the vector length (in pixels) and the distance between the tool positions(in mm or cm) will give you the scale factor between the image and real world axis.
If this method won't provide enough accuracy you may calibrate the camera for distortion and relative position to the plane using computer vision techniques. Which is more complicated.
See the following links
http://opencv.willowgarage.com/documentation/camera_calibration_and_3d_reconstruction.html
http://dasl.mem.drexel.edu/~noahKuntz/openCVTut10.html