Design of an Automatic Adaptive Video Caption Generation Technology for Users in Need of Linguistic Accommodation

This study aims to design and validate an automatic adaptive caption generation system that addresses the diverse linguistic needs of users who are excluded from conventional video caption services, which typically provide only direct speech-to-text conversion. These users include sign language users with hearing impairments, foreigners learning Korean, and individuals with language development disorders: groups whose language use is shaped by physical, sociocultural, or cognitive challenges. We built a hash-table-based ‘lexical mapping database’ containing evidence-based, objectively verified data for each user group. Using prompt engineering and fine-tuning techniques with transformer-based large language models (LLMs), we developed a system that automatically generates adaptive captions. Demonstrations with actual YouTube™ video content confirmed the technology's potential for universal application in generating adaptive captions for a variety of users in need of linguistic accommodation.
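As a rough illustration of the pipeline the abstract describes, the minimal sketch below pairs a hash-table lookup (a plain Python dict standing in for the ‘lexical mapping database’) with a prompt template for an LLM rewrite step. All names, mapping entries, and prompt wording here are hypothetical placeholders, not the study's verified data or actual implementation.

```python
# Hypothetical stand-in for the hash-table 'lexical mapping database':
# keyed by user group, each entry maps a source term to a group-appropriate
# alternative. The entries are illustrative, not the study's verified data.
LEXICAL_MAP = {
    "korean_learner": {"commence": "start", "purchase": "buy"},
    "sign_language_user": {"approximately": "about", "prior to": "before"},
}

def adapt_caption(caption: str, user_group: str) -> str:
    """Apply the group's lexical mappings to a raw caption (O(1) dict lookups)."""
    for source, target in LEXICAL_MAP.get(user_group, {}).items():
        caption = caption.replace(source, target)
    return caption

def build_prompt(caption: str, user_group: str) -> str:
    """Assemble a prompt asking an LLM to rewrite the pre-mapped caption."""
    mapped = adapt_caption(caption, user_group)
    return (
        f"Rewrite the following caption for a {user_group.replace('_', ' ')} "
        f"using short, simple sentences while preserving the meaning:\n{mapped}"
    )

if __name__ == "__main__":
    print(build_prompt("Commence playback prior to the lecture.", "sign_language_user"))
```

In this reading, the database handles deterministic, evidence-backed substitutions, while the LLM handles the open-ended rewriting that substitution alone cannot cover.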