Thanks again for your research! The position relation idea is very smart. I've had success improving DINO / DDQ / Align DETR baselines with Relation DETR on images with <300 objects, similar to COCO.
I work on counting trees in aerial images, and am having trouble training models for dense object detection, where image chips can have >1500 objects, e.g. below
In these cases, Relation DETR causes OOM errors, since an attention matrix over every pair of objects has to be constructed. Any thoughts on how to improve training for dense objects?
Hi @JohnMBrandt, thanks for your question.
For dense object detection, the relation computation consumes too much memory. So far I haven't found a good way to optimize it, but as a workaround you can wrap position_relation_embedding with torch.utils.checkpoint in PyTorch. It saves memory during training by recomputing the intermediate activations in the backward pass instead of storing them.
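A minimal sketch of the checkpointing workaround. The `position_relation_embedding` below is a simplified placeholder (a pairwise box-delta computation), not the actual Relation DETR implementation; the point is only how `torch.utils.checkpoint.checkpoint` wraps the O(N²) step so its activations are recomputed during backprop rather than kept in memory:

```python
import torch
import torch.utils.checkpoint as checkpoint

def position_relation_embedding(boxes_a, boxes_b):
    # Placeholder for the real relation computation: builds an (N, N, 4)
    # pairwise tensor, which is the memory bottleneck for dense scenes.
    return boxes_a.unsqueeze(1) - boxes_b.unsqueeze(0)

boxes = torch.rand(1500, 4, requires_grad=True)

# Wrap the expensive call with gradient checkpointing: intermediate
# activations inside the wrapped function are not stored for backward;
# they are recomputed when gradients are needed (compute-for-memory trade).
rel = checkpoint.checkpoint(
    position_relation_embedding, boxes, boxes, use_reentrant=False
)

loss = rel.sum()
loss.backward()  # recomputes the relation tensor internally
```

Gradients still flow to `boxes` as usual; only the peak memory during the forward pass is reduced, at the cost of one extra forward computation of the wrapped function in the backward pass.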