
I have a stereo camera made of two webcams that I use in Matlab. I calibrated the cameras and obtained stereoParams.

I then want the user to be able to select a point in the picture and obtain the real-world location of that point. I know that for this I need the baseline, the focal length, and the pixel disparity. I have the pixel disparity, but how do I get the baseline and focal length? Can the baseline be computed from stereoParams?


3 Answers


I am not familiar with Matlab's stereo camera calibration functionality, but generally speaking, once you have calibrated each camera and found the fundamental matrix, you should be able to do the following:

  1. Set one of the images as the reference and rectify the other, so that the disparity search runs along horizontal lines in the image.
  2. Given the pixel disparity, you can compute real-world depth via the relation z = fB/d, where f is the focal length, B is the baseline, and d is the disparity. Note that units matter! If d is in pixels, then f must also be in pixels if you want z in the same units as the baseline (e.g. centimeters).
  3. The baseline is the distance between the cameras' optical centers. It should be available from Matlab's stereoParameters.TranslationOfCamera2.
  4. The focal length is an intrinsic parameter of each camera. I assumed above that the focal lengths are equal, but with webcams this is not guaranteed. You should be able to extract the focal length from Matlab's cameraParameters.IntrinsicMatrix; it is related to the alpha parameters in the intrinsic matrix (see the Wikipedia entry on the camera matrix for an explanation). A MATLAB sketch of extracting these quantities follows this list.
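
A minimal sketch under those assumptions, using the stereoParameters property names from MATLAB's stereo calibration (TranslationOfCamera2, CameraParameters1, IntrinsicMatrix) and a pixel disparity d you already have:

    % Minimal sketch: depth from z = f*B/d using stereoParams.
    % MATLAB stores the intrinsic matrix transposed, so fx sits at K(1,1).
    K  = stereoParams.CameraParameters1.IntrinsicMatrix;
    fx = K(1,1);                                  % focal length of camera 1, in pixels
    B  = norm(stereoParams.TranslationOfCamera2); % baseline, in calibration units (e.g. mm)
    d  = 12.5;                                    % example pixel disparity (replace with yours)
    z  = fx * B / d;                              % depth, in the same units as the baseline

If the two focal lengths differ noticeably, compare CameraParameters1 and CameraParameters2 before trusting a single f.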
Answered 2015-02-23T00:10:36.210

The "pixel" disparity is defined in rectified image coordinates. However, as your real cameras will not normally be exactly parallel and row-aligned, there is a non-identity transformation that rectifies your input camera images. Therefore you need to "undo" the rectification in order to find the pixel in the other image corresponding to a given one. The procedure is as follows:

  1. User selects a point in, say, the left image, giving you a pair of image coordinates (xl, yl).
  2. Apply the left rectification transform to them, obtaining their corresponding left rectified image coordinates. If you are using one of the common linear rectification methods, this is (xlr, ylr, wlr)' = Hlr * (xl, yl, 1)' , where Hlr is the left rectification homography.
  3. Look up the disparity map at (xlr / wlr, ylr / wlr), obtaining the pixel's disparity value d (here I assume that your stereo algorithm yields a left-to-right disparity map for the X coordinate).
  4. The matching point in the right rectified image is then (xrr, yrr) = (d + xlr / wlr, ylr / wlr)
  5. Apply the inverse of the right rectification transform to get the corresponding pixel in right image coordinates (xr, yr, wr)' = Hrr^-1 * (xrr, yrr, 1)'
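
A minimal MATLAB sketch of the five steps; Hlr and Hrr (the rectification homographies) and D (the rectified left-to-right disparity map) are assumed names, not outputs of any specific toolbox call:

    % Hypothetical inputs: Hlr/Hrr are the left/right rectification homographies,
    % D is the rectified left-to-right disparity map, (xl, yl) the selected pixel.
    p   = Hlr * [xl; yl; 1];              % step 2: rectify the left point
    xlr = p(1)/p(3);  ylr = p(2)/p(3);
    d   = D(round(ylr), round(xlr));      % step 3: look up its disparity
    q   = Hrr \ [xlr + d; ylr; 1];        % steps 4-5: shift by d, undo right rectification
    xr  = q(1)/q(3);  yr  = q(2)/q(3);    % matching pixel in right image coordinates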

Note that all of these operations need to be performed only once per pixel and can be cached. In other words, you can pre-compute a "rectified" 2-channel disparity map that, for each pixel, yields the offset from its coordinates in one image to the corresponding pixel in the other image (sketched below). The map itself can be stored as an image whose channel type depends on the disparity range; usually a short integer will be enough, as it can represent offsets of +/- 32K pixels.
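
A sketch of that caching idea under the same assumed names, vectorized over all pixels:

    % Run the per-pixel chain once for every left-image pixel and cache the
    % result as a 2-channel int16 offset map (same assumed names as above).
    [h, w] = size(D);
    [X, Y] = meshgrid(1:w, 1:h);
    P  = Hlr * [X(:)'; Y(:)'; ones(1, h*w)];        % rectify all pixels at once
    Xr = P(1,:) ./ P(3,:);   Yr = P(2,:) ./ P(3,:);
    ix = sub2ind([h w], min(max(round(Yr), 1), h), ...
                        min(max(round(Xr), 1), w)); % clamp, then index into D
    Q  = Hrr \ [Xr + double(D(ix)); Yr; ones(1, h*w)];
    dx = Q(1,:) ./ Q(3,:) - X(:)';                  % per-pixel offset to the match
    dy = Q(2,:) ./ Q(3,:) - Y(:)';
    offsets = int16(cat(3, reshape(dx, h, w), reshape(dy, h, w)));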

Answered 2015-02-23T21:28:22.243

You can use the reconstructScene function, which gives you the 3-D world coordinates of every pixel with a valid disparity. In this example, you look up the 3-D coordinates of the centroid of a detected person. A minimal sketch of the workflow follows.
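
A minimal sketch of that workflow, assuming stereoParams from calibration and an image pair I1, I2; rectifyStereoImages, disparity, and reconstructScene are Computer Vision System Toolbox functions (the disparity function of that era; newer releases split it into disparityBM/disparitySGM):

    % Rectify, compute a disparity map, then reconstruct 3-D points per pixel.
    [J1, J2]     = rectifyStereoImages(I1, I2, stereoParams);
    disparityMap = disparity(rgb2gray(J1), rgb2gray(J2));
    points3D     = reconstructScene(disparityMap, stereoParams); % h x w x 3, world units
    xyz = squeeze(points3D(y, x, :));  % (x, y) in rectified left-image coordinates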

Answered 2015-02-24T20:29:25.010