
I'm trying to do 3D scene reconstruction and camera pose estimation on a video input, but the estimated camera positions don't match what I see in the video.

Here is the code I wrote to recover the camera poses and landmark positions:

    def SfM(self, points1, points2):
        x = 800 / 2
        y = 600 / 2

        fov = 80 * (math.pi / 180)
        f_x = x / math.tan(fov / 2)
        f_y = y / math.tan(fov / 2)

        # intrinsic camera matrix
        K = np.array([[f_x, 0, x],
                      [0, f_y, y],
                      [0, 0, 1]])

        #find fundamental matrix
        E, mask = cv2.findFundamentalMat(np.float32(points2), np.float32(points1), cv2.FM_8POINT)
        #get rotation matrix and translation vector
        points, R, t, mask = cv2.recoverPose(E, np.float32(points2), np.float32(points1), K, 500)
        
        #calculate the new camera position based on the translation; camPose is the previous camera position
        self.cam_xyz.append([self.camPose[0] + t[0], self.camPose[1] + t[1], self.camPose[2] + t[2]])

        #calculate the extrinsic matrix
        C = np.hstack((R, t))

        #calculate the landmark positions
        for i in range(len(points2)):
            #convert coordinates into a 3x1 array
            pts2d = np.asmatrix([points2[i][0], points2[i][1], 1]).T
            #calculate camera matrix
            P = np.asmatrix(K) * np.asmatrix(C)
            #find 3d coordinate
            pts3d = np.asmatrix(P).I * pts2d
            #add to list of landmarks
            self.lm_xyz.append([pts3d[0][0] * self.scale + self.camPose[0],
                                pts3d[1][0] * self.scale + self.camPose[1],
                                pts3d[2][0] * self.scale + self.camPose[2]])

        #update the previous camera position
        self.camPose = [self.camPose[0] + t[0], self.camPose[1] + t[1], self.camPose[2] + t[2]]

When I run my program on this video, this is the output I get.

I don't understand why the camera turns to the right when it only moves straight ahead in the video. I suspect I'm using cv2.recoverPose incorrectly, but I don't know what else I can do to make it better. I've put the full code on PasteBin in case anyone wants to reproduce the program. Any help would be greatly appreciated. Thank you so much!


1 Answer


Shouldn't you compute the essential matrix E with cv2.findEssentialMat instead? As written, you have calculated the fundamental matrix F, but to recover the pose you must pass the essential matrix E = K^T * F * K, where K is the camera intrinsic matrix.
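
A minimal sketch of that change, assuming the same K and the same matched point arrays as in the question (the recover_relative_pose wrapper here is only for illustration and is not part of the original code):

    import cv2
    import numpy as np

    def recover_relative_pose(points1, points2, K):
        pts1 = np.float32(points1)
        pts2 = np.float32(points2)

        #estimate the essential matrix directly, using the known intrinsics K
        E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                       prob=0.999, threshold=1.0)

        #equivalent route starting from the fundamental matrix:
        #F, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_8POINT)
        #E = K.T @ F @ K

        #decompose E into the relative rotation and a unit-length translation
        _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
        return R, t

Note that recoverPose only returns the direction of the translation (unit norm), so accumulating t without some scale estimate still won't give metric camera positions.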

answered 2021-07-07T12:43:18.930