
    Generalizable No-Reference Image Quality Assessment: Multi-Modal Models and Human Preference Analysis for AI Generated Images

    Author
    Totate, Sanjot Sagar
    Abstract
    One of the major challenges in no-reference (NR) image quality assessment (IQA) is generalizing to diverse quality assessment applications. Recently, multi-modal vision-language models have shown considerable promise in this direction and are beginning to form a part of several state-of-the-art NR IQA methods. Meanwhile, multi-modal large language models (LLMs) are increasingly being studied for various computer vision applications, including IQA. In this work, we perform a thorough study of the ability of multi-modal LLMs to perform NR IQA by training some of their components and testing their generalizability. In particular, we keep the LLM frozen and learn the parameters of the querying transformer, the LLM prompt, and some layers that process the embedding output by the LLM. We observe that some of these components offer generalization performance far superior to that of any existing NR IQA algorithm. With the rapid emergence of artificial intelligence (AI)-generated images, there is also a need to understand human preferences for these images. We explore the fundamental dimensions of AI-generated image quality assessment, particularly the relationship between alignment (how well images match their text prompts) and quality (both low-level artifacts and high-level structural coherence). We analyze how these dimensions interact and contribute to overall perceived quality, examining whether assessing alignment and quality separately yields better results than holistic evaluation. Through comparative analysis of existing and novel assessment models, we provide insights into effective strategies for evaluating AI-generated images.
    URI
    https://etd.iisc.ac.in/handle/2005/6967
    Collections
    • Electrical Communication Engineering (ECE) [404]

    etd@IISc is a joint service of SERC & J R D Tata Memorial (JRDTML) Library || Powered by DSpace software || DuraSpace
    Contact Us | Send Feedback | Thesis Templates
    Theme by Atmire NV
     

     
