Toronto Metropolitan University

An Attempt at Defining And Quantifying Image Describability Through Semantic Connection Between Visual and Language

thesis
posted on 2024-05-06, 19:12 authored by Mikhail Korchevskiy

One of the most challenging tasks for modern artificial intelligence systems is image captioning, which requires a machine to adequately comprehend the semantic content of visual data and correctly map it to a description in the language domain. To achieve acceptable performance, a learning system is generally presented with human-generated ground truth captions as its target. While significant progress has been made in creating highly functional image captioning systems, little research has focused on the nature of the ground truth itself. In this thesis, such ground truth captions are analyzed in an attempt to find the semantic connection between visual data and the language data describing it, revealing potential insights into human judgement and moving closer to defining and quantifying the abstract notion of image “describability”: the extent to which an image can be adequately described using language.

Language

English

Degree

  • Master of Science

Program

  • Computer Science

Granting Institution

Ryerson University

LAC Thesis Type

  • Thesis

Thesis Advisor

Dr. Nariman Farsad and Dr. Neil Bruce

Year

2022
