Faster R-CNN: Down the rabbit hole of modern object detection (tryolabs.com)

Does anyone try to get accurate bounding boxes (rotation, correct angle) with these object detection models? Or does that greatly harden the problem?
That’s exactly what Faster-RCNN does. Edit: Except for rotation — they are axis aligned bounding boxes.
Mask-RCNN (more recent) takes it a step further and also generates a per-object pixel segmentation mask, which is obviously even better than a bounding box. For that reason, Mask-RCNN is much more exciting to me, and incredibly impressive if you look at examples of what it can do.
That said, “under the hood” of Mask-RCNN are still axis aligned 2D bounding boxes for every object (and this occasionally creates artifacts when a box is erroneously too small and crops off part of an object). IMO we need to somehow get away from these AABBs, but right now methods that use them simply work the best.
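To make the "mask inside an AABB" relationship concrete, here's a minimal sketch (not Mask-RCNN's actual code, just an illustration with a toy binary mask): the tight axis-aligned box around a mask, and how an undersized box crops off part of the object.

```python
# Illustrative only: how a per-pixel mask relates to the axis-aligned
# bounding box (AABB) it lives inside, and the cropping artifact an
# undersized box causes. Shapes and values are made up for the example.
import numpy as np

def mask_to_aabb(mask):
    """Tight axis-aligned box (x0, y0, x1, y1) around a binary mask."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# A diagonal "object" occupying 6 pixels of a 6x6 image.
mask = np.eye(6, dtype=bool)
print(mask_to_aabb(mask))  # (0, 0, 5, 5)

# If the predicted box is too small, the mask is cropped along with it:
x0, y0, x1, y1 = 0, 0, 3, 3              # undersized box
cropped = mask[y0:y1 + 1, x0:x1 + 1]
print(int(mask.sum() - cropped.sum()))   # 2 pixels of the object are lost
```

The last two lines are the artifact the comment above describes: since the mask is predicted only inside the box, any pixels of the object outside the box are unrecoverable.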
Object detection is an interesting failure for deep learning. Systems such as these perform well but whenever you have something like non max suppression at the end you are bound to get hard to fix errors. I'm more optimistic about deep mask and similar pixel wise approaches as well as using RNNs to generate a list of objects from an image.
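For readers unfamiliar with the step being criticized: greedy non-max suppression keeps the highest-scoring box and discards any box that overlaps it too much, then repeats. A minimal sketch (box format and the 0.5 threshold are illustrative, not tied to any particular detector):

```python
# Sketch of greedy non-max suppression (NMS). Boxes are (x0, y0, x1, y1).
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thresh=0.5):
    """Keep the highest-scoring box, drop boxes overlapping it, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

# Two near-duplicate detections of one object, plus a second object:
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]: the duplicate (index 1) is suppressed
```

The hard-to-fix errors the comment mentions come from this greediness: two genuinely distinct but heavily overlapping objects can be merged into one detection, and no downstream step can undo that.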
I saw this today: https://github.com/facebookresearch/Detectron
Wasn't R-CNN already superseded by YOLO[1]? I didn't read the article, but it doesn't mention YOLO as a comparison, so maybe it's outdated.
Has anyone had the time to dig deeper into this?
Tradeoffs: R-CNN has better accuracy; YOLO is faster.
R-CNN is a two-stage detector; SSD is single-stage.
Take a look at SSD instead; it seems to be more precise than YOLO and a bit faster. R-CNN variants are usually 10x slower than either of these two.
Is this what they use for self driving cars?
Faster R-CNN gives you only about 5fps on a high-end GPU, so the answer is no.