DOI: 10.36922/AC026140017
ARTICLE

Talking slide avatars: Open-source multimodal communication approach for teaching

Xinxing Wu1*
1 School of Mathematics and Computer Science, College of Business, Engineering and Technology, Kentucky State University, Frankfort, Kentucky, United States of America
Received: 30 March 2026 | Revised: 13 May 2026 | Accepted: 15 May 2026 | Published online: 5 June 2026
© 2026 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
Abstract

Slide-based teaching is widely used in higher education. Yet, in online, hybrid, and asynchronous contexts, slides often lose instructor presence, narrative continuity, and expressive framing that help learners connect with course content. A full lecture video can partly restore these qualities, but it is time-consuming to record, revise, and reuse. This study presents a practice-based implementation and analytic reflection of an open-source workflow for creating talking slide avatars. The workflow integrates OpenVoice for text-to-speech and authorized voice-style conversion with Ditto-TalkingHead for audio-driven talking-image synthesis, enabling instructors to transform a short script and an authorized or synthetic portrait image into a narrated video for slide decks or HyperText Markup Language-based lecture materials. Rather than treating this workflow only as a technical solution, the study frames talking slide avatars as multimodal communication artifacts at the intersection of digital pedagogy, aesthetic education, and art–technology practice. The paper documents the production pipeline, analyzes communicative and aesthetic affordances, and proposes practical guidelines for script length, image selection, pacing, disclosure, accessibility, consent, and ethical use. Its contribution is not a validated learning intervention but an educator-oriented open-source production model and communication design framework. The study concludes that short, transparent, and carefully designed avatars may provide a reusable communication layer for introductions, transitions, reminders, and recaps when used selectively and with appropriate ethical safeguards.

Keywords
Artificial intelligence avatar
Multimodal communication
Instructional video
Art and technology
Higher education
Talking head synthesis
Funding
None.
Conflict of interest
The author declares no competing interests.
Arts & Communication, Electronic ISSN: 2972-4090 Published by AccScience Publishing