Access Type

Open Access Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Electrical and Computer Engineering

First Advisor

Aura Ganz

Second Advisor

C. Mani Krishna

Third Advisor

Marco F. Duarte

Subject Categories

Computer Engineering | Electrical and Computer Engineering | Engineering


In this dissertation, we introduce a hybrid radio frequency (RF) and video framework that enables identity-aware augmented perception. Identity-aware augmented perception enhances users' perception of their surroundings by collecting and analyzing information about each identifiable or trackable target nearby, aggregated from various sensors, and presenting it visually or audibly to augment users' own sensory perceptions. We target two application areas: disaster management and assistive technologies. Incident commanders and first responders can use the technology to perceive information specific to a victim, e.g., triage level and critical conditions, visually superimposed on third-person or first-person video. The blind and visually impaired can use the technology to perceive the direction and distance of static landmarks and moving people nearby, along with target-specific information, e.g., a store's name and opening hours, or a friend's status on social networks.

Identifying who is who in video is an important yet challenging problem whose solution can greatly benefit existing video analytics and augmented reality applications. Identity information can be used to improve the presentation of target information on a graphical user interface, enable role-based target analytics over the long term, and achieve more efficient and accurate surveillance video indexing and querying. Instead of relying on target appearance, we propose a hybrid approach that combines complementary radio frequency (RF) signals with video to identify targets. Recovering target identities in video using RF is not only useful in its own right, but also provides an alternative formulation that helps solve difficult problems in the individual video and RF domains, e.g., persistent video tracking, accurate target localization using RF signals, anchorless target localization, multi-camera target association, and automatic RF and video calibration.
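The intuition behind this hybrid approach can be sketched in a few lines: RF ranging yields a coarse distance-over-time signature per identity, video tracking yields a distance-over-time signature per anonymous track, and identities are recovered by matching the two. The function below is a minimal illustration using Pearson correlation; the names, data layout, and correlation criterion are illustrative assumptions, not the dissertation's actual algorithm.

```python
import numpy as np

def match_tracks_to_rf(video_dists, rf_dists):
    """Associate anonymous video tracks with RF identities.

    video_dists: dict mapping track_id -> (T,) camera-derived distance series.
    rf_dists: dict mapping rf_id -> (T,) RF-derived range series.
    Returns a dict mapping each track_id to the rf_id whose range series
    correlates best with the track's distance series (a toy criterion).
    """
    matches = {}
    for tid, v in video_dists.items():
        best_id, best_corr = None, -np.inf
        for rid, r in rf_dists.items():
            # Pearson correlation between the two time series.
            corr = np.corrcoef(v, r)[0, 1]
            if corr > best_corr:
                best_id, best_corr = rid, corr
        matches[tid] = best_id
    return matches
```

A track moving away from the camera will correlate with the RF identity whose range estimates are also increasing, even under noisy RF measurements, which is why appearance is not needed.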

We provide a comprehensive RF and video fusion framework that enables identity-aware augmented perception in a variety of scenarios. We propose a two-stage data fusion scheme based on tracklets, and formulate the tracklet identification problem under different RF and camera measurement models using network flow or graphical models. We start from a basic calibrated configuration with a single fixed camera and fixed RF readers. We then consider anchorless target identification using pairwise measurements between mobile RF devices to reduce deployment complexity. Next, we incorporate multiple cameras to improve coverage, camera deployment flexibility, and identification accuracy, and to enable multi-view augmented perception.
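In the simplest calibrated single-camera setting, the tracklet identification stage reduces to a bipartite assignment: each video tracklet is matched to the RF identity whose location estimates it best agrees with. The sketch below uses a one-shot Hungarian-algorithm assignment as a stand-in for the network-flow and graphical-model formulations developed in the dissertation; the data layout and the mean-distance cost are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def identify_tracklets(tracklets, rf_tracks):
    """Assign video tracklets to RF identities by minimum total cost.

    tracklets: (N, T, 2) tracklet positions recovered from video.
    rf_tracks: (M, T, 2) coarse RF location estimates, with M >= N.
    Returns an array of length N mapping tracklet i -> RF identity index.
    """
    # Cost of pairing tracklet i with identity j: mean Euclidean distance
    # between the video trajectory and the RF trajectory over time.
    diffs = tracklets[:, None, :, :] - rf_tracks[None, :, :, :]
    cost = np.linalg.norm(diffs, axis=-1).mean(axis=-1)  # shape (N, M)
    rows, cols = linear_sum_assignment(cost)  # min-cost bipartite matching
    return cols[np.argsort(rows)]
```

The full two-stage scheme generalizes this: tracklets are first formed within video, then jointly identified across time and sensors, which a network-flow formulation captures more faithfully than a single assignment.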

We propose a self-calibrating identification algorithm that simplifies manual calibration and improves identification accuracy in environments with obstructions. Finally, we solve the problem of annotating video taken by mobile cameras to provide first-person perception, taking advantage of the target appearance, location, and identity information provided by the fixed-camera hybrid system.