"AI Systems and How They Help Blind Users"

    +++ SAVE THE DATE: Wednesday, 2 April 2025 +++ Marburg University Library +++

    In April, Dr. Kyle Keane of the University of Bristol will visit our teams in Karlsruhe and Marburg. On Wednesday, 2 April 2025, starting at 2 p.m., he will give the talk "AI Systems and How They Help Blind Users" in room B013 of the University Library (Universitätsbibliothek) of Philipps-Universität Marburg (Deutschhausstraße 9, 35037 Marburg).

    We look forward to the visit and warmly invite anyone interested to attend. The talk will be given in English:

    The Importance of Spatial Reasoning in AI Systems to Help Blind Users and a Proposed Approach to Get There

    Recent advancements in AI include real-time video question-answering systems, that is, systems in which a user can point a camera and ask an AI system to describe what it is looking at and to give advice about how to interact with what it is interpreting. These systems have incredible potential as an augmentation tool for users who have visual impairments or are blind. However, present limits in their capabilities make them very dangerous and unreliable. One specific capability that present-day AI systems lack is spatial reasoning: the ability to understand the position of objects within a visual frame relative to other objects in that frame.

    There are communities of blind and low-vision individuals who have learned to use remote assistance from humans by showing them the world through their cell phone camera, getting important information spoken to them, and asking for interpretations of that information when needed. For certain classes of activities, it is essential that spatial information and the positions of objects are described accurately relative to the camera frame, for example when asking whether a doorway is to the left or the right of the camera frame.

    There are ways to augment the performance of these AI systems using sophisticated prompt engineering that supplies spatial information in a format the AI system can reason with. Systems working from the plain video stream may eventually have spatial reasoning built in, but at present they do not, although they will answer questions as if they did.

    In this talk, we will explore current practical applications of modern multimodal AI tools that are being actively used by blind and low-vision individuals, examining their strengths and limitations in real-world scenarios. We will then cover some of these topics in more depth and outline the research needed to identify the types of activities that require spatial reasoning and can be impactful for blind and low-vision users, and to develop trustworthy methods to augment their performance of such tasks.

    Dr. Kyle Keane, Kyle Keane's website