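I am building a mushroom-themed project on Detectron2. Prediction works fine, and I am now trying to generate COCO-like annotations for images, containing the predicted regions (all XY coordinates of each region). To do that, I need two things: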

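  • Retrieve the XY coordinates of the predicted shape/region and "shrink" them so that only the main edge points are saved (to avoid storing too many data points)
  • Redraw the "saved" points onto the main image so that the user can judge whether enough points were saved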

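Unfortunately, I am failing at both. On the first point, I (think I) have the masks as a binary numpy object, but I am surprised by its size and I have not managed to convert it into a set of XY coordinates.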

On the second point, I get an error that I cannot figure out how to debug:

/usr/local/lib/python3.6/dist-packages/google/colab/patches/__init__.py in cv2_imshow(a)
     20       image.
     21   """
---> 22   a = a.clip(0, 255).astype('uint8')
     23   # cv2 stores colors as BGR; convert to RGB
     24   if a.ndim == 3:

AttributeError: 'cv2.UMat' object has no attribute 'clip'

The relevant part of the code I am using is here:

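import random

import cv2
import numpy as np
from detectron2.utils.visualizer import ColorMode, Visualizer
from google.colab.patches import cv2_imshow

## Predict on a random validation image
dataset_dicts = get_all_mushroom_dicts(mushroom_categories, "mushroom_dataset_small/val")
d = random.sample(dataset_dicts, 1)[0]  # random.sample returns a list, so take its only element
im = cv2.imread(d["file_name"])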
mushroom_outputs = mushroom_predictor(im)
v = Visualizer(im[:, :, ::-1],
               metadata=mushroom_metadata, 
               scale=0.8, 
               instance_mode=ColorMode.IMAGE_BW   # remove the colors of unsegmented pixels
)

instances = mushroom_outputs["instances"].to("cpu")
mush_out = v.draw_instance_predictions(instances)
image = mush_out.get_image()[:, :, ::-1]

masks = np.asarray(instances.pred_masks)
print("NP array shape", masks.shape)
print("Image array shape", image.shape)

print("Type before", type(image))
cv2_imshow(image) ## works fine

## ?? How to get the coordinates of the boundaries? And then take only some of them

## ??

## Assuming that (128, 128) is one of these coordinates for now
image2 = cv2.circle(image, (128, 128), 10, (255, 0, 0), 20)
print("Type after", type(image2))
cv2_imshow(image2) ## crashes

FYI, I also tried finding and drawing contours, but that did not seem to work (see my post at https://github.com/facebookresearch/detectron2/issues/1702#event-3501434732).

Do you have any clue?

Thanks!

1 Answer

I managed to find a working solution, though part of it is not elegant:

Retrieving the XY coordinates:

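masks = np.asarray(instances.pred_masks)
# The two-value unpacking matches OpenCV 4.x; OpenCV 3.x returns three values
contours, hierarchy = cv2.findContours(masks[0].astype("uint8"), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
border = contours[0]  # shape (N, 1, 2); each point is stored as (x, y)
gap = 20  # keep only every 20th point
scale = 0.8  # same scale as passed to the Visualizer
for i in range(0, border.shape[0], gap):
  x = int(border[i][0][0] * scale)
  y = int(border[i][0][1] * scale)
  print("New XY coordinates", (x, y))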
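
Since the goal in the question is a COCO-like annotation, the same subsampling can be wrapped in a small helper that returns the flat [x1, y1, x2, y2, ...] list used by COCO polygon segmentations. This is only a sketch: the function name `contour_to_coco_polygon` and its parameters are hypothetical, and the synthetic contour merely stands in for one contour returned by cv2.findContours. It also assumes that annotations should stay in original-image coordinates, so the 0.8 factor would only be applied when drawing on the scaled visualization.

```python
import numpy as np


def contour_to_coco_polygon(contour, gap=20, scale=1.0):
    """Subsample a contour and flatten it to [x1, y1, x2, y2, ...].

    contour: array of shape (N, 1, 2) holding (x, y) points, the layout
             cv2.findContours uses for each contour.
    gap:     keep every gap-th point (hypothetical default, same as above).
    scale:   1.0 keeps original-image coordinates; pass the Visualizer
             scale only when the points are drawn on the scaled image.
    """
    points = np.asarray(contour).reshape(-1, 2)[::gap]
    points = (points * scale).astype(int)
    return points.flatten().tolist()


# Synthetic contour standing in for one findContours result: 100 points
demo_contour = np.stack(
    [np.arange(100), np.arange(100) * 2], axis=1
).reshape(-1, 1, 2)

polygon = contour_to_coco_polygon(demo_contour, gap=20)
print(polygon)
```

Taking every n-th point is the simplest option. If the outline has long straight stretches, cv2.approxPolyDP may keep fewer points for a similar shape, at the cost of tuning its epsilon tolerance.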

Drawing on the picture:

image_old = mush_out.get_image()[:, :, ::-1]
cv2.imwrite("img.png", image_old)
image = cv2.imread("img.png", cv2.IMREAD_COLOR)  # still no clue why this is needed, but it does not work without saving and re-reading
radius = 3
color = (255, 0, 0)
cv2.circle(image, (x, y), radius, color, -1)  # my previous code was wrong; cv2.circle edits the image in place
cv2_imshow(image)
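
A likely explanation for the save-and-re-read step: the `[:, :, ::-1]` channel flip returns a negative-stride view rather than a contiguous array, and OpenCV drawing calls generally need a contiguous buffer to edit in place. Writing to disk and reading back happens to produce such a buffer. The sketch below shows a way to get one without touching the disk; the helper name `make_drawable` is hypothetical, and the small zero image only stands in for `mush_out.get_image()`.

```python
import numpy as np


def make_drawable(image):
    """Return a C-contiguous uint8 array with the same pixel values."""
    return np.ascontiguousarray(image, dtype=np.uint8)


# Stand-in for mush_out.get_image(): a tiny RGB image with a red channel
rgb = np.zeros((4, 5, 3), dtype=np.uint8)
rgb[..., 0] = 255

bgr_view = rgb[:, :, ::-1]       # channel flip: a view with a negative stride
bgr = make_drawable(bgr_view)    # contiguous copy with identical pixel values

print(bgr_view.flags["C_CONTIGUOUS"], bgr.flags["C_CONTIGUOUS"])
```

Under that assumption, `image = make_drawable(mush_out.get_image()[:, :, ::-1])` could replace the cv2.imwrite / cv2.imread round trip before calling cv2.circle.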
answered 2020-07-02T07:53:34.203