I am currently working on a head-motion tracking project using a normal webcam. Face detection and Lucas-Kanade (LK) tracking work fine, but I hope to build something more comprehensive. Is it possible to use the coordinates of the feature points from one frame to the next to estimate the 6-DOF (3 rotations + 3 translations) motion of the subject? Intuitively I think it is possible, because we can assume that the feature points in all frames lie on the same plane.

I am looking to measure the pitch, yaw, roll and scale of the subject. Scale is easy if we assume the subject undergoes only translation (no pitch/yaw/roll): just look at the change in distance between any two given feature points. Pitch, yaw, roll and scale become tricky when they are all combined.

I am also trying to find out how the Kinect captures motion, because my guess is that I can treat the feature points like the Kinect's infrared dot pattern.
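On the feasibility question: if the tracked points really do lie on one plane, then their image positions in two frames are related exactly by a 3x3 homography, and that homography can later be decomposed into a rotation (pitch/yaw/roll) plus a scaled translation. Below is a minimal NumPy sketch of the first step, estimating the homography from point correspondences with the Direct Linear Transform; the function name and the synthetic test data are my own illustration, not something from this post:

```python
import numpy as np

def estimate_homography(pts_a, pts_b):
    """Direct Linear Transform: fit H so that pts_b ~ H * pts_a (homogeneous)."""
    A = []
    for (x, y), (u, v) in zip(pts_a, pts_b):
        # Each correspondence contributes two linear constraints on the 9 entries of H.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The homography is the null vector of A (last right-singular vector).
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the arbitrary scale

# Synthetic check: planar feature points moved by a known roll + scale + translation,
# with small projective terms standing in for the pitch/yaw effect.
rng = np.random.default_rng(0)
pts_a = rng.uniform(0, 100, size=(8, 2))
theta, s = np.deg2rad(10), 1.2
H_true = np.array([[s * np.cos(theta), -s * np.sin(theta),  5.0],
                   [s * np.sin(theta),  s * np.cos(theta), -3.0],
                   [1e-4,               2e-4,               1.0]])
ph = np.hstack([pts_a, np.ones((8, 1))]) @ H_true.T
pts_b = ph[:, :2] / ph[:, 2:]   # perspective divide
H_est = estimate_homography(pts_a, pts_b)
print("max abs error:", np.abs(H_est - H_true).max())
```

In practice OpenCV already provides this pipeline: `cv2.findHomography` (with RANSAC, which also rejects bad LK tracks) estimates H, `cv2.decomposeHomographyMat` recovers the rotation and scaled translation from it, and `cv2.solvePnP` is the usual alternative if you can assume a rough 3D face model instead of a plane.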
Please advise on this. I need to determine the feasibility of this method before I try something else.
Thanks, Kelvin