cv2.remap in PyTorch?
Code written with NumPy + cv2 usually ports to PyTorch with very little change: in most places you can simply swap np for torch. cv2.remap() is a notable exception. It resamples an image through two coordinate arrays (x and y) and is commonly used to implement projective warps (warp, projection). Its counterpart in torch is torch.nn.functional.grid_sample(), but the usage differs in a few ways.
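Before the full example, here is a minimal sketch of my own (not from the original post) showing that the two functions agree once the pixel coordinates are normalized to [-1, 1]; align_corners=True is assumed so that the normalization used later in this post is exact:

import cv2
import numpy as np
import torch
import torch.nn.functional as F

h, w = 4, 6
img = np.random.rand(h, w).astype(np.float32)

# pixel-coordinate sampling maps; the output reads the input shifted by one pixel in x
x_map, y_map = np.meshgrid(np.arange(w, dtype=np.float32), np.arange(h, dtype=np.float32))
x_map = x_map + 1.0

out_cv = cv2.remap(img, x_map, y_map, interpolation=cv2.INTER_LINEAR)

# grid_sample wants a [B, H, W, 2] grid of (x, y) pairs normalized to [-1, 1]
grid = torch.stack((torch.from_numpy(x_map) / ((w - 1) / 2) - 1,
                    torch.from_numpy(y_map) / ((h - 1) / 2) - 1), dim=-1).unsqueeze(0)
out_pt = F.grid_sample(torch.from_numpy(img)[None, None], grid,
                       mode='bilinear', padding_mode='zeros', align_corners=True)

# identical except for border handling; compare only the columns that stay in range
print(np.allclose(out_cv[:, :-1], out_pt[0, 0].numpy()[:, :-1], atol=1e-5))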
Below, a snippet from my own code is used to walk through the torch rewrite. The task is to project ref_img and ref_depth from the reference (ref) view into another source (src) view.
NumPy- and cv2-style code:
import cv2
import numpy as np

def reproject_with_depth(img_ref, depth_ref, intrinsics_ref, extrinsics_ref, intrinsics_src, extrinsics_src):
    width, height = depth_ref.shape[1], depth_ref.shape[0]
    # pixel grid of the reference view
    x_ref, y_ref = np.meshgrid(np.arange(0, width), np.arange(0, height))
    x_ref, y_ref = x_ref.reshape([-1]), y_ref.reshape([-1])
    # lift the pixel grid to 3D camera space using the depth map
    xyz_ref = np.matmul(np.linalg.inv(intrinsics_src),
                        np.vstack((x_ref, y_ref, np.ones_like(x_ref))) * depth_ref.reshape([-1]))
    # transform the points into the other camera's coordinate frame
    xyz_src = np.matmul(np.matmul(extrinsics_ref, np.linalg.inv(extrinsics_src)),
                        np.vstack((xyz_ref, np.ones_like(x_ref))))[:3]
    # perspective projection back to pixel coordinates
    K_xyz_src = np.matmul(intrinsics_ref, xyz_src)
    xy_src = K_xyz_src[:2] / K_xyz_src[2:3]
    x_src = xy_src[0].reshape([height, width]).astype(np.float32)
    y_src = xy_src[1].reshape([height, width]).astype(np.float32)
    # resample with the computed pixel-space maps
    sampled_depth_src = cv2.remap(depth_ref, x_src, y_src, interpolation=cv2.INTER_LINEAR)
    sampled_img_src = cv2.remap(img_ref, x_src, y_src, interpolation=cv2.INTER_LINEAR)
    return sampled_depth_src, sampled_img_src
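For reference, a hypothetical call of the function above; the shapes (an H x W depth map, an H x W x 3 image, 3x3 intrinsics, 4x4 extrinsics) and all values are my assumptions, inferred from the code rather than stated in the post:

H, W = 480, 640
depth_ref = np.ones((H, W), dtype=np.float32)
img_ref = np.zeros((H, W, 3), dtype=np.float32)
K = np.array([[500., 0., W / 2], [0., 500., H / 2], [0., 0., 1.]], dtype=np.float32)
E_ref = np.eye(4, dtype=np.float32)
E_src = np.eye(4, dtype=np.float32)
E_src[0, 3] = 0.1  # src camera shifted along x
depth_s, img_s = reproject_with_depth(img_ref, depth_ref, K, E_ref, K, E_src)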
The same function translated into torch style:
import torch
import torch.nn.functional as F

def reproject_with_depth(img_ref, depth_ref, intrinsics_ref, extrinsics_ref, intrinsics_src, extrinsics_src):
    B, width, height = depth_ref.shape[0], depth_ref.shape[2], depth_ref.shape[1]
    # pixel grid of the reference view, created on the input's device
    y_ref, x_ref = torch.meshgrid([torch.arange(0, height, dtype=torch.float32, device=depth_ref.device),
                                   torch.arange(0, width, dtype=torch.float32, device=depth_ref.device)])
    y_ref, x_ref = y_ref.contiguous(), x_ref.contiguous()
    x_ref, y_ref = x_ref.reshape([-1]), y_ref.reshape([-1])
    # lift the pixel grid to reference 3D space using the depth map
    xyz_ref = torch.matmul(torch.inverse(intrinsics_src),
                           torch.stack((x_ref, y_ref, torch.ones_like(x_ref))).unsqueeze(0).repeat(B, 1, 1)
                           * depth_ref.reshape([B, 1, -1]))
    # transform the points into the other camera's coordinate frame
    xyz_src = torch.matmul(torch.matmul(extrinsics_ref, torch.inverse(extrinsics_src)),
                           torch.cat([xyz_ref, torch.ones_like(x_ref).unsqueeze(0).repeat(B, 1, 1)], dim=1))[:, :3]
    # perspective projection back to pixel coordinates
    K_xyz_src = torch.matmul(intrinsics_ref, xyz_src)
    xy_src = K_xyz_src[:, :2] / K_xyz_src[:, 2:3]
    x_src = xy_src[:, 0].reshape([B, height, width]).float()
    y_src = xy_src[:, 1].reshape([B, height, width]).float()
    # normalize the pixel coordinates to [-1, 1] for grid_sample
    grid = torch.stack((x_src / ((width - 1) / 2) - 1, y_src / ((height - 1) / 2) - 1), dim=3)
    sampled_depth_src = F.grid_sample(depth_ref.unsqueeze(1), grid.view(B, height, width, 2),
                                      mode='bilinear', padding_mode='zeros').squeeze(1)
    sampled_img_src = F.grid_sample(img_ref, grid.view(B, height, width, 2),
                                    mode='bilinear', padding_mode='zeros')
    return sampled_depth_src, sampled_img_src
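And a hypothetical batched call of the torch version; again the shapes (depth [B, H, W], image [B, C, H, W], intrinsics [B, 3, 3], extrinsics [B, 4, 4]) are my reading of the code, not something the post states:

B, C, H, W = 2, 3, 480, 640
depth_ref = torch.ones(B, H, W)
img_ref = torch.zeros(B, C, H, W)
K = torch.eye(3).unsqueeze(0).repeat(B, 1, 1)
K[:, 0, 0] = 500.0
K[:, 1, 1] = 500.0
K[:, 0, 2] = W / 2
K[:, 1, 2] = H / 2
E_ref = torch.eye(4).unsqueeze(0).repeat(B, 1, 1)
E_src = torch.eye(4).unsqueeze(0).repeat(B, 1, 1)
E_src[:, 0, 3] = 0.1  # src camera shifted along x
depth_s, img_s = reproject_with_depth(img_ref, depth_ref, K, E_ref, K, E_src)
# depth_s: [B, H, W], img_s: [B, C, H, W]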
A few core translation rules to keep in mind (these are the ones most likely to trip you up):
- The torch version carries a batch dimension B, so indexing that was [idx] in the NumPy code becomes [:, idx], and reshapes gain a leading B.
- Newly created tensors (such as the meshgrid coordinates) must be placed on the same device as the input; it is usually fine to give them the same dtype as the input arguments as well.
- F.grid_sample() samples in normalized coordinates in [-1, 1], whereas cv2.remap() works directly in pixel coordinates, so the x coordinates have to be rescaled as x/((width-1)/2)-1 and the y coordinates as y/((height-1)/2)-1 (see the small helper sketched below).
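The last point can be wrapped in a small helper; this is my own sketch, not part of the original code. One caveat worth stating explicitly: the (width-1)/2 form corresponds to align_corners=True, which was grid_sample's implicit behavior before PyTorch 1.3; newer versions default to align_corners=False, so pass align_corners=True explicitly when pairing grid_sample with this normalization.

def pixel_to_normalized(x_src, y_src, height, width):
    # map pixel coordinates to grid_sample's [-1, 1] range (align_corners=True convention)
    gx = x_src / ((width - 1) / 2) - 1
    gy = y_src / ((height - 1) / 2) - 1
    return torch.stack((gx, gy), dim=-1)  # (..., 2) in (x, y) order, as grid_sample expects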