Computer Vision

Grounding Dino L

Tsinghua University, International Digital Economy Academy, Hong Kong University of Science and Technology (HKUST), Chinese University of Hong Kong (CUHK), Microsoft Research, South China University of Technology
Object detection, Image captioning

Grounding Dino L combines object detection with deep natural-language understanding. This AI model can locate arbitrary objects in images simply by following the user's textual instructions or complex descriptions. It is a genuine breakthrough in multimodal AI, blurring the line between computer vision and linguistics.

In this paper, we present an open-set object detector, called Grounding DINO, by marrying the Transformer-based detector DINO with grounded pre-training, which can detect arbitrary objects given human inputs such as category names or referring expressions. The key to open-set object detection is introducing language into a closed-set detector for open-set concept generalization. To effectively fuse the language and vision modalities, we conceptually divide a closed-set detector into three phases and propose a tight fusion solution, which includes a feature enhancer, a language-guided query selection mechanism, and a cross-modality decoder. While previous works mainly evaluate open-set object detection on novel categories, we propose to also perform evaluations on referring expression comprehension for objects specified with attributes. Grounding DINO performs remarkably well in all three settings, including benchmarks on COCO, LVIS, ODinW, and RefCOCO/+/g. Grounding DINO achieves a strong AP on the COCO detection zero-shot transfer benchmark, i.e., without any training data from COCO, and sets a new record in mean AP on the ODinW zero-shot benchmark. Code will be available at \url{https://github.com/IDEA-Research/GroundingDINO}.
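The language-guided query selection described above can be illustrated with a small sketch. The function name, shapes, and scoring rule below are simplified assumptions for illustration, not the official implementation: each image token is scored by its maximum similarity to any text token, and the top-scoring tokens are taken as initial decoder queries.

```python
import numpy as np

def language_guided_query_selection(image_feats, text_feats, num_queries):
    """Illustrative sketch (hypothetical helper, not the repo's code):
    pick the image tokens most relevant to the text prompt to
    initialize the cross-modality decoder queries."""
    # (num_image_tokens, num_text_tokens) dot-product similarity
    sim = image_feats @ text_feats.T
    # score each image token by its best-matching text token
    scores = sim.max(axis=1)
    # indices of the most text-relevant image tokens, highest first
    top = np.argsort(scores)[::-1][:num_queries]
    return image_feats[top], top

rng = np.random.default_rng(0)
image_feats = rng.normal(size=(16, 4))  # 16 image tokens, feature dim 4
text_feats = rng.normal(size=(3, 4))    # 3 text tokens (e.g., a short phrase)
queries, idx = language_guided_query_selection(image_feats, text_feats, num_queries=5)
print(queries.shape)  # (5, 4)
```

In the real model the selected queries then attend to both image and text features in the cross-modality decoder, so the final boxes are conditioned on the prompt rather than on a fixed category list.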
