[4th July, 11:30] Dr. Ramazan S. Aygun: Unsupervised Speaker Identification for TV News

Place: TH:A-1455

Television (TV) networks produce a tremendous amount of information every day. Identifying the speakers throughout a video helps to analyze and understand its content. Previous research has usually identified speakers in TV shows and movies by matching them against pre-trained faces of famous people.
News videos are challenging because new faces (or people) often appear. This paper proposes an unsupervised method that labels speakers using only the information available in the news video, without external information. The proposed framework segments the audio by speaker, parses closed captions for speaker names, identifies talking persons, and performs optical character recognition to extract speaker names.
The presentation will show

  • how speaker diarization, face recognition, face landmarking, natural language processing, and optical character recognition tools can be used effectively for speaker identification,
  • how speakers who are not famous can be recognized using different modalities, and
  • results for identifying speakers in CNN news, with an overall accuracy of 63.6%, including speakers who appear only once.

About the speaker

Dr. Ramazan S. Aygun is an Associate Professor in the Computer Science Department at the University of Alabama in Huntsville. 

He received his Ph.D. degree in computer science and engineering from the University at Buffalo in 2003.
He has published or presented around 80 refereed international journal and conference papers in the areas of multimedia information retrieval,
multimedia systems, data analytics for protein crystallization, spatio-temporal modeling and retrieval, video panorama generation, computer vision,
video & image processing, and bioinformatics.
Dr. Aygun has served as a program co-chair of a leading conference in multimedia systems (IEEE International Symposium on Multimedia 2012),
guest editor for a special issue of the International Journal of Semantic Computing,
member of the editorial review board of the International Journal of Multimedia Data Engineering and Management, and
organizer of a workshop on video panorama.
He served on the organizing committees of 16 conferences/workshops and program committees of 34 conferences/workshops.

Copyright (c) Data Science Laboratory @ FIT CTU 2014–2016. All rights reserved.