ORCID
https://orcid.org/0000-0003-0247-6830
Access Type
Open Access Thesis
Document Type
thesis
Degree Program
Electrical & Computer Engineering
Degree Type
Master of Science in Electrical and Computer Engineering (M.S.E.C.E.)
Year Degree Awarded
2021
Month Degree Awarded
May
Abstract
Lecture videos are a valuable learning resource, and students commonly use online videos to explore a wide range of domains. However, some recorded videos are posted to online platforms without post-processing because of technology and resource limitations. In this work, we develop an intelligent system that uses modern deep learning techniques to automatically extract the essential information in a lecture video, namely the main instructor and the screen, across several scenarios. The thesis combines the extracted information to re-render each video in a new layout with a smaller file size than the original. The approach can also save users video post-processing time and cost. The system combines state-of-the-art object detection models, a screen display correction algorithm, instructor tracking, and other deep learning techniques to detect both the main instructor and the screen in a given video without a heavy computational burden.
There are four main contributions:
1. We built an intelligent video analysis and post-processing system to extract and reframe detected objects from lecture videos.
2. We proposed a post-processing algorithm to localize the frontal human torso across a sequence of video frames.
3. We proposed a novel deep learning approach to distinguish the main instructor from other instructors or audience members in several complex situations.
4. We proposed an algorithm to extract the four edge points of a screen at the pixel level and correct the screen display in various scenarios.
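To illustrate the screen correction idea in the fourth contribution, here is a minimal sketch that assumes the four screen corner points are already known and that the correction can be modeled as a planar homography. The abstract does not describe the thesis's actual method, so the function names `homography_from_corners` and `apply_homography`, the direct linear transform (DLT) formulation, and the sample coordinates are all illustrative assumptions.

```python
import numpy as np


def homography_from_corners(src, dst):
    """Estimate the 3x3 homography mapping 4 src points onto 4 dst points.

    Assumption: a standard direct linear transform (DLT), not necessarily
    the algorithm used in the thesis.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each point pair contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the null vector of A: the last right-singular vector.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # Normalize so H[2, 2] == 1 (assumes a non-degenerate configuration).
    return H / H[2, 2]


def apply_homography(H, points):
    """Map 2D points through H using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical corners of a skewed screen in a frame (pixel coordinates),
# ordered top-left, top-right, bottom-right, bottom-left.
screen_corners = [(120, 80), (520, 110), (500, 400), (100, 360)]
# Target: an upright 640x360 rectangle.
target_corners = [(0, 0), (640, 0), (640, 360), (0, 360)]

H = homography_from_corners(screen_corners, target_corners)
corrected = apply_homography(H, screen_corners)
```

In a full pipeline, the estimated matrix would be used to resample every pixel inside the detected screen region rather than only the four corners.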
DOI
https://doi.org/10.7275/22450152.0
First Advisor
Lixin Gao
Second Advisor
Russell Tessier
Third Advisor
Michael Zink
Recommended Citation
Wang, Xi, "Lecture Video Transformation through An Intelligent Analysis and Post-processing System" (2021). Masters Theses. 1078.
https://doi.org/10.7275/22450152.0
https://scholarworks.umass.edu/masters_theses_2/1078