  1. VQA: Visual Question Answering

    VQA is a new dataset containing open-ended questions about images. These questions require an understanding of vision, language and commonsense knowledge to answer.

  2. VQA: Visual Question Answering

    Please follow the instructions in the README to download and set up the VQA data (annotations and images). By downloading this dataset, you agree to our Terms of Use.

  4. VQA: Visual Question Answering

    The VQA v2.0 train, validation and test sets, containing more than 250K images and 1.1M questions, are available on the download page. All questions are annotated with 10 concise, open-ended answers …

  5. VQA: Visual Question Answering

    The easy-VQA dataset is a beginner-friendly way to get started — a “Hello World” for VQA. It contains 5k simple, geometric images and 48k questions with only 13 possible answers.

  6. VQA: Visual Question Answering

    TextVQA: This track is the 3rd challenge on the TextVQA dataset introduced in Singh et al., CVPR 2019. TextVQA requires models to read and reason about text in an image to answer questions based on …

  8. VQA: Visual Question Answering

    Results Format Overview: This page describes the results format used by the VQA evaluation code.

  9. VQA: Visual Question Answering

    This workshop will provide an opportunity to benchmark algorithms on VQA v2.0 and to identify state-of-the-art algorithms that need to truly understand the image content in order to perform well on this …

  10. We collected a new dataset of “realistic” abstract scenes to enable research focused only on the high-level reasoning required for VQA by removing the need to parse real images.