Recent Submissions
Publication EXPLOITING PERVASIVE LEAKED EM SIGNALS FOR COMMUNICATION, CHARGING AND SENSING (2025-02) Cui, Minhao

Wireless technologies are becoming increasingly important in our daily lives. As we use 4G and 5G services, researchers are also working on the development of future 6G networks. In addition to traditional communication functions like Wi-Fi, wireless signals are now being used for localization, sensing, and even charging. However, one significant challenge with these new applications is that they require dedicated signal transmissions, which can interfere with the original communication functions. In my Ph.D. thesis, I aim to leverage pervasive ambient leakage signals, which are typically seen as detrimental, to enhance wireless communication performance and enable new functions like sensing and charging. My approach is based on the observation that there is a significant amount of leaked RF signal in our environment. For instance, powerlines continuously emit 50/60 Hz electromagnetic (EM) signals due to the alternating current flowing through them. In my thesis, I implement innovative designs in both hardware and software to transform these ambient leakages from adversaries into valuable assets. We first show that during VLC transmission, the transmitter not only emits visible light signals but also leaks RF signals due to the intensity modulation scheme. More interestingly, the leaked RF signal contains a copy of the data transmitted in the light signals, a finding that renders VLC, generally believed to be the most secure wireless technology, no longer secure. Building upon this finding, we further demonstrate a novel use of these leakage signals to carry extra data and double the data rate of existing VLC systems. Following the successful use of VLC leakage signals for communication, we further view the leakage signals as a form of wasted energy and devote our effort to harvesting it with the help of a wearable bracelet antenna. Last but not least, we utilize the leakage signals for sensing purposes. This electromagnetic leakage, stemming from alternating current in electric appliances, is governed by Maxwell's equations, and we can infer body motions by analyzing such leakage signals as received by the human body. We therefore leverage the EM leakage from electric vehicles (EVs) to enable in-vehicle sensing. We observe that numerous components within an EV, including the battery, powerlines, and power inverter, emit EM signals during operation, and such leakage can be utilized to sense the body motions of the driver or passengers without any dedicated signal transmitter.

Publication Towards reliable black-box variational inference (2025-02) Agrawal, Abhinav

Probabilistic models are essential for understanding complex systems across various scientific fields. Inferring the posterior distribution over unknown quantities is central to generating insights from these models, but the posterior is often analytically intractable and must be approximated. Black-box variational inference (BBVI) is a leading approximate inference framework that uses optimization to find a tractable posterior approximation. BBVI methods apply to a wide variety of models and offer practitioners a faster alternative to expensive Markov chain Monte Carlo methods. While BBVI is promising, it suffers from several issues that hinder its widespread adoption.
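To ground the optimization view just described, here is a minimal sketch of reparameterization-based BBVI with a mean-field Gaussian family; the toy target density and every name in it are our own illustration, not code from the thesis.

```python
import torch

def log_p(z):
    # Toy unnormalized target: a Gaussian with mean (0, -1) and scales (2, 1),
    # standing in for an intractable model posterior.
    return -0.5 * ((z[..., 0] / 2.0) ** 2 + (z[..., 1] + 1.0) ** 2)

# Variational family q(z) = N(mu, diag(exp(log_sigma))^2); its parameters are optimized.
mu = torch.zeros(2, requires_grad=True)
log_sigma = torch.zeros(2, requires_grad=True)
opt = torch.optim.Adam([mu, log_sigma], lr=0.05)

for step in range(2000):
    opt.zero_grad()
    eps = torch.randn(64, 2)               # Monte Carlo batch of standard normals
    z = mu + eps * log_sigma.exp()         # reparameterization trick
    # ELBO = E_q[log p(z)] + H[q]; Gaussian entropy is sum(log_sigma) + const.
    elbo = log_p(z).mean() + log_sigma.sum()
    (-elbo).backward()                     # maximize the ELBO via gradient ascent
    opt.step()

print(mu.detach(), log_sigma.exp().detach())  # approaches mean (0, -1), scales (2, 1)
```

Every ingredient of this loop (initialization, gradient estimator, objective, variational family) is a choice point, which is where the robustness issues discussed next arise.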
We identify four such challenges: robustness, scalability, evaluation, and accuracy, and address them in this thesis to make BBVI a reliable inference tool. The first chapter addresses robustness. Naive BBVI approaches are often not robust enough to work out of the box and fail to converge without model-specific tuning. We improve this by integrating key algorithmic components such as initialization strategies, gradient estimators, variational objectives, and advanced variational families. Extensive empirical studies demonstrate that our proposed scheme outperforms competing baselines without requiring model-specific tuning. The second chapter improves scalability. When models are large, BBVI methods fail to scale and suffer from slow convergence. We address this by proposing a structured and amortized BBVI scheme that maintains accuracy while offering faster and more scalable inference than existing approaches. The third chapter improves evaluation. The posterior predictive distribution is used to make predictions on unseen data; however, posterior predictive evaluations can sometimes be extremely noisy. We identify the conditions under which the simple Monte Carlo estimator of the posterior predictive distribution can exhibit an extremely low signal-to-noise ratio. Based on this analysis, we introduce an adaptive importance sampling approach that significantly improves evaluation accuracy. The fourth chapter improves accuracy. Normalizing flow-based variational inference (flow VI) is a promising class of BBVI methods, but its performance is mixed, with some works reporting success and others reporting optimization challenges. We conduct an empirical analysis to disentangle the impact of factors such as representational capacity, objectives, gradient estimators, batch sizes, and step sizes. Our analysis leads to practical recommendations for each factor, culminating in a flow VI approach that matches or surpasses leading turnkey Hamiltonian Monte Carlo methods on a wide variety of targets. Collectively, the advances in this thesis make BBVI a reliable inference method.

Publication EXPLORING REPRESENTATIONS FOR 3D RECONSTRUCTION FROM IMPAIRED REAL-WORLD DATA (2025-02) Selvaraju, Pratheba

3D reconstruction from real-world data is essential in applications like augmented reality, robotics, medical imaging, and autonomous navigation. However, this data is often noisy, incomplete, occluded, or corrupted. Despite these imperfections, utilizing such data is necessary to develop reconstruction methods that can be applied in real-world scenarios. Each application comes with unique requirements and constraints, making it important to select representations tailored to its specific characteristics. Recognizing that our world is primarily composed of two types of objects, static (rigid) and dynamic (non-rigid) body structures, this thesis focuses on reconstruction tasks by exploring representations best suited to each type, ensuring adaptability to applications with similar characteristics rather than reinventing the wheel for each case. We focus on static structure reconstruction in fabrication industries that produce real-world products, which often deal with non-malleable materials that have the zero-Gaussian-curvature property. To address reconstruction under this property constraint, we introduce Developability Approximation for Neural Implicits through Rank Minimization, a neural network model that represents surfaces as piecewise zero-Gaussian-curvature patches.
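For context, a surface is developable precisely where its Gaussian curvature vanishes; in standard differential-geometry notation (ours, not the thesis's):

```latex
K(p) \;=\; \kappa_1(p)\,\kappa_2(p)
     \;=\; \frac{\det \mathrm{II}_p}{\det \mathrm{I}_p} \;=\; 0
\quad\Longleftrightarrow\quad
\operatorname{rank} S_p \le 1,
```

where \kappa_1, \kappa_2 are the principal curvatures, \mathrm{I} and \mathrm{II} are the first and second fundamental forms, and S_p is the shape operator at point p. Enforcing that the shape operator has rank at most one everywhere is equivalent to piecewise developability, which is the link a rank-minimization formulation exploits.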
The model encodes data implicitly, offering an advantage over prior explicit methods that struggle with high tessellation and shape fidelity. Applying this method to large-scale urban planning requires understanding building structures, which are made of several different components with different non-malleable materials. Thus, automatically identifying these components becomes essential. To this end, we created BuildingNet, a large-scale dataset of 2,000 diverse building exteriors (e.g., residential, commercial, stadium). Using this dataset, we developed a Graph Neural Network (GNN) model to automatically label building components. Next, we explore dynamic object reconstruction, focusing on human faces, by introducing OFER: Occluded Face Expression Reconstruction. OFER reconstructs expressive human faces from occluded images. Occlusions introduce new sources of ambiguity in hidden regions, requiring a multi-hypothesis solution. Toward this, OFER employs a parametric face model and trains hybrid UNet-attention diffusion models to generate diverse expression coefficients. This representation ensures smooth, plausible reconstructions that remain faithful to the visible parts and are easy to animate through simple parameter adjustments. In facial animation, real-time performance is crucial for applications like gaming and augmented reality, which require computational efficiency while preserving high quality. Traditional UNet-based diffusion models often struggle with temporal coherence and long-range sequences, while attention computation incurs overhead and slows inference. To tackle this, we explore efficient computational representations and introduce FORA: Fast-Forward Caching for Diffusion Transformer Acceleration. FORA employs a caching mechanism that reuses intermediate outputs, thereby minimizing computational overhead without requiring model retraining and enabling faster processing with minimal trade-offs in quality.

Publication ADVANCING PRECISION HEALTH WITH CLINICAL FOUNDATION MODELS (2025-02) Yang, Zhichao

The integration of Artificial Intelligence (AI) in healthcare promises unprecedented improvements in patient care, yet its full potential, especially in precision health, remains underutilized due to significant challenges in transforming real-world data into real-world evidence. This dissertation explores the development and application of clinical foundation models (FMs), specifically Clinical Language Models (CLaMs) and Foundation Models for Electronic Health Records (FEHRs), which are pretrained on extensive clinical narratives and structured records. The research presents innovations in several key areas. Accurate information extraction from clinical notes: We analyzed the performance of baseline CLaMs on the task of extracting diagnostic code information from clinical notes. We identified two key issues in these CLaMs: imprecision in extracting rare diseases due to the lack of training data, and difficulty recognizing synonyms due to the models' inadequate medical knowledge. To address these issues, we developed a generative knowledge-injected prompt-based fine-tuned transformer, achieving state-of-the-art accuracy. Accurate information extraction paves the road for better-informed decisions during clinical diagnoses. Enhanced quality of patient health assessments: Inferring a clinical diagnosis to generate an assessment is a crucial step during the patient encounter.
However, there is limited research on generating clinical diagnoses in free-text format. Hence, we propose a new task of generating full-length patient health assessments. We applied a CLaM to this task and found that it tends to generate factually incorrect responses. To improve the quality of the generated assessments, we combined the CLaM with a medical knowledge graph. By reducing the incidence of misleading information generated during the assessment process, our CLaM supports clinicians in making better-informed decisions. Predictive modeling of complex disease interrelations: We developed TransformEHR, a generative transformer FEHR pretrained on a vast dataset of 6.5 million patient electronic health records. With a visit-level pretraining objective, TransformEHR is designed for predicting complex interrelations among diseases. Its high performance in predicting intentional self-harm shows the potential of TransformEHR for building effective clinical intervention systems. TransformEHR is also generalizable and can be easily fine-tuned for clinical prediction tasks with limited data. HealthMamba, a multi-purpose long-context extraction and prediction engine: Traditional transformer-based FMs have limited performance on long EHRs and struggle to extract information from locations distant from the end of the document (position bias). To mitigate this issue, we developed HealthMamba, a Mamba-based FM that addresses the issue of long medical histories. HealthMamba uses a selective scan algorithm that allows the model to selectively propagate or forget information along the sequence depending on the input token. With a prefix prompt, HealthMamba significantly outperforms transformer-based FMs on 7 clinical information extraction and patient outcome prediction tasks. Notably, it demonstrated less position bias than GPT-4, maintaining effectiveness across all parts of EHRs. By training advanced generative clinical FMs on large-scale healthcare data, this dissertation demonstrates AI's role in enhancing precision health for more personalized and effective healthcare solutions. The findings underscore the potential of AI to transform medical data analysis and patient care, setting a path towards a future where healthcare is increasingly driven by intelligent and automated systems that support healthcare providers.

Publication SLO-Aware Power Management for Elastic Cloud Applications (2025-02) Savasci, Mehmet

The enormous power consumption of cloud data centers poses severe financial and environmental concerns. Server consolidation, server throttling, and power capping are several of the numerous approaches proposed to manage data center power consumption. While these methods notably increase data center power efficiency by reducing the power consumption of data center servers, they also negatively impact the performance SLOs of hosted applications. Thus, data center operators must grasp and navigate the tradeoffs between power and performance. The motivation behind this thesis is to design techniques for managing power-performance tradeoffs in cloud data centers. To that end, this thesis investigates the connections between the power usage of cloud data center servers and the performance SLOs of the applications hosted on them, designs models that capture these connections, and develops controllers that control the power usage of data center servers while considering application performance SLO targets. This thesis presents the following three key contributions.
First, I design techniques to automate the generation of power-performance controllers for latency-sensitive web applications, ensuring that the generated controllers control power allocation while meeting application performance SLOs and offer improved performance compared to state-of-the-art techniques. I achieve this by designing DDPC, an autonomous data-driven controller generation system for power-performance management. Second, I introduce SLO-Power, a framework for effectively coordinating elastic resource provisioning and power management techniques under performance constraints. SLO-Power improves on standalone state-of-the-art elastic resource provisioning and power management methods. Finally, I design PADS, a hardware-agnostic power capping technique that integrates horizontal and vertical scaling of CPU resources, termed diagonal scaling, with the power-performance models of applications to keep the total power consumption of servers under a power cap while respecting application SLOs. I show that PADS outperforms the state-of-the-art power capping solution in the literature in dealing with power cap violations and keeping application performance under control.

Publication Uncertainty-Aware Computer Vision in Resource-Constrained Environments (2025-02) Samplawski, Colin

Recent breakthroughs in deep learning techniques have led to staggering performance improvements in many domains. This has made autonomous systems a critical component of many real-world use cases, including Internet of Things (IoT) environments. This domain is especially challenging, as it imposes considerable environmental, networking, and hardware constraints on the models. Further compounding this challenge are the high-stakes decisions that are necessary in many real-world deployments. This thesis explores how to leverage uncertainty awareness to create more robust vision models for use in constrained environments. By estimating and communicating the uncertainty a model has, we can generate more reliable and trustworthy predictions. We first consider the problem of zero-shot image classification, where no labeled data is available for some classes. By utilizing a textual class hierarchy, we expose an accuracy-specificity trade-off that lets systems make more accurate, albeit less specific, predictions under uncertainty due to resource constraints. We then address the distributed execution of image classifiers. We split a neural network between an edge device and the cloud by performing a partial execution on the edge and sending latent features to the cloud for completion. We find that this approach demonstrates superior bandwidth utilization over conventional methods. Merging these strategies, we craft a distributed, hierarchical object detector validated via a prototype on ultra-low-power edge hardware. We next evaluate the edge runtime of recent transformer-based object detectors. We additionally show how their unique characteristics simplify reasoning about bounding box uncertainty compared to earlier methods. We approximate the Bayesian parameter uncertainty using a simple deep ensemble. Due to the high cost of this approach, we present a more efficient uncertainty quantification method that ensembles only a subset of the detector's parameters. However, reasoning about uncertainty over bounding boxes remains challenging and makes multi-camera fusion less straightforward.
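As a rough sketch of the subset-ensembling approach just described (module sizes and names are hypothetical, not the detector from the thesis): share one backbone, ensemble only a lightweight box-regression head, and read box uncertainty off the spread of the heads' outputs.

```python
import torch
import torch.nn as nn

class PartialEnsembleDetector(nn.Module):
    """Shared backbone + K independent box heads (a toy stand-in for a real detector)."""
    def __init__(self, feat_dim=256, num_heads=5):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(512, feat_dim), nn.ReLU())
        self.box_heads = nn.ModuleList(
            [nn.Linear(feat_dim, 4) for _ in range(num_heads)]  # each predicts (cx, cy, w, h)
        )

    def forward(self, x):
        feats = self.backbone(x)                                 # computed once, shared
        boxes = torch.stack([h(feats) for h in self.box_heads])  # (K, N, 4)
        return boxes.mean(dim=0), boxes.std(dim=0)               # box estimate and spread

model = PartialEnsembleDetector()
mean_box, box_std = model(torch.randn(8, 512))  # box_std serves as an uncertainty proxy
```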
For these reasons (box-level uncertainty and fusion difficulty), we consider geospatial tracking, where 3D points in a shared world space are predicted rather than boxes in the image plane. With the support of multi-camera datasets with geospatial ground truth, we train a deep probabilistic model of an object's position. The predictions are then fused using multi-observation Kalman trackers. We demonstrate how modeling the geometric transformation between the image plane and the world coordinate frame allows us to train geospatial detectors for tracking using much less data than end-to-end deep learning approaches. Furthermore, we are able to output intuitive geospatial uncertainty estimates, generalize to unseen viewpoints, and provide straightforward support for multi-object tracking.

Publication Learning-Augmented Online Algorithms for Energy Optimization (2025-02) Lee, Russell

In the competitive analysis of online algorithms, an online algorithm knows only current and past inputs and must make decisions sequentially as inputs arrive. The primary performance metric, the competitive ratio (the maximum ratio of the online algorithm's performance against that of the optimal offline algorithm with full knowledge of future inputs), is calculated over all possible problem instances. This framework is highly suited to energy optimization problems that face uncertainty in future variables such as price fluctuations, electricity demand, and renewable energy generation. Online algorithms are then able to provide theoretical performance guarantees on the cost of procuring energy, even when facing worst-case future inputs. In this thesis, we present optimal online algorithms for energy optimization in the competitive analysis setting. First, we consider online linear optimization with inventory management constraints, where the clearing price of electricity is determined by bids submitted by market participants. We propose algorithms whose competitive ratios approach those of optimal online algorithms in the basic setting without bids. Competitive analysis fares well when future inputs are adversarial, but in practice, predictions of future inputs are available through machine learning or other predictive models. This has sparked a growing area of research on learning-augmented online algorithms that can use predictions while maintaining provable performance guarantees. This setting is well suited to the nature of predictions in energy optimization problems, where data-driven models for price and demand can leverage some degree of seasonality. However, underlying factors such as weather variation still cause unpredictable spikes in price and demand. We then present energy optimization problems in the learning-augmented setting. First, we analyze the peak-aware energy scheduling problem and propose Pareto-optimal algorithms that can utilize a trust parameter for the predictions. Second, we consider the k-min search problem with predictions. We design our algorithm with both robustness (when prediction error is arbitrary) and consistency (when predictions are accurate) guarantees, such that it achieves the Pareto-optimal tradeoff between robustness and consistency.

Publication FAST, SCALABLE, WARM-START SEMIDEFINITE PROGRAMMING WITH APPLICATION TO KNOWLEDGE EDITING AND MIXED INTEGER SEMIDEFINITE OPTIMIZATION (2025-02) Angell, Rico

Semidefinite programming (SDP) has traditionally been limited to moderate-sized problems.
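For reference, the problem class in question, written in standard form with generic notation (not notation from the thesis), is

```latex
\min_{X \in \mathbb{S}^n} \; \langle C, X \rangle
\quad \text{subject to} \quad
\langle A_i, X \rangle = b_i, \;\; i = 1, \dots, m,
\qquad X \succeq 0,
```

where the matrix variable X ranges over symmetric n-by-n matrices constrained to the positive semidefinite cone (the method introduced below also handles inequality constraints). The number of scalar decision variables grows quadratically with n, which is what makes scale so punishing.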
Methods for scaling to larger problems have sacrificed the convexity of the original problem for scalability. More recently, algorithms augmented with matrix sketching techniques have enabled solving larger SDPs. However, these methods achieve scalability at the cost of an increase in the number of necessary iterations, resulting in slower convergence as the problem size grows. Furthermore, they require iteration-dependent parameter schedules that prohibit effective use of warm-start initializations, which are important in practical applications with incrementally arriving data or constraints. We present Unified Spectral Bundling with Sketching (USBS), a fast and scalable algorithm for solving massive SDPs that can leverage a warm-start initialization to further accelerate convergence. Our proposed algorithm is a spectral bundle method for solving general SDPs containing both equality and inequality constraints. Moreover, when augmented with an optional matrix sketching technique, our algorithm achieves the dramatically improved scalability of previous work while sustaining convergence speed. We empirically demonstrate the effectiveness of our method across multiple applications, with and without warm-starting. For example, USBS provides a 500x speed-up over the state-of-the-art scalable SDP solver on an instance with over 2 billion decision variables. The speed and scalability of USBS enable the use of SDPs in novel applications. First, we present a new paradigm of interactive feedback, existential cluster constraints, for correcting entity resolution predictions, and present a novel SDP relaxation at the core of the proposed inference algorithm. We demonstrate empirically that our proposed framework facilitates more accurate entity resolution with dramatically fewer user feedback inputs. We show that USBS is not only faster than the previous state-of-the-art scalable SDP solver, but can also effectively leverage a warm-start initialization to improve empirical convergence. Finally, we provide evidence that USBS could be used as part of a mixed-integer semidefinite program solver. There are many applications in which we want to optimize a semidefinite program where a subset of the decision variables is subject to additional integrality constraints. One of the barriers to applying standard branch-and-bound techniques to mixed-integer semidefinite programs is the lack of a fast semidefinite program solver that can be warm-started. Given existing evidence that USBS can effectively utilize a warm-start initialization, we explore the possibility of using USBS as part of a branch-and-bound solver for mixed-integer semidefinite programs.

Publication Data Driven Expert Assignment (2025-02) Payan, Justin

The modern knowledge economy relies on expertise. In important technocratic tasks such as scientific peer review and community question answering, knowledge workers can only fulfill requests they have the expertise, interest, and availability to complete. We develop multiple novel approaches to assigning experts to requests, addressing questions of fairness, scalability, assignment quality, and robustness to uncertainty. We use peer review as the primary case study, though Chapter 3 highlights the domain of community question answering. Our algorithms can be applied to other domains where resource-constrained experts are assigned to complete complex requests, such as crowd-sourced editing of knowledge repositories or corporate staff assignment.
Expert assignments must be both fair and welfare-efficient, so that all requests receive a reasonably well-qualified set of experts. We first present a set of simple mechanisms that fairly distribute expertise across requests, with welfare guarantees. Our algorithms, Greedy Expert Round Robin and FairSequence, assign experts in such a way that no request "envies" another request's assigned experts. Although fairness and welfare criteria ensure evenly distributed, high-quality expertise, both depend on the method used to quantify expert performance. In automated reviewer assignment systems, existing methods for estimating the benefit of assigning each reviewer to each paper can be noisy and ineffective. We take a data-driven perspective on the expert assignment problem, demonstrating how to more accurately estimate the benefits of assigning experts to requests. We train a variety of models to predict answer quality on StackExchange, then compare the results when using these models to produce constrained assignments of users to questions. This study demonstrates the benefits of fully predictive expert assignment. No matter how accurate our predictive model, we are always uncertain when we assign experts to requests. Distribution shift can cause our models to make errors, or experts may be unable to perform due to unforeseen circumstances. We discuss two main solutions for hedging against the worst outcomes. The robust optimization framework optimizes over a region containing the true matching scores with high probability. The stochastic optimization framework assigns experts using a percentile criterion over the assignment objective. We study both the robust and stochastic approaches for utilitarian and egalitarian welfare objectives, and we detail applications in reviewer assignment and community question answering. Expert assignment is a rich problem that needs to be addressed through both a data analysis and an algorithmic lens. Our work improves the end-to-end expert assignment pipeline, which will result in less wasted time and greater productivity for knowledge workers.

Publication Data Generation for Weakly Supervised Neural Retrieval (2025-02) Lien, Yen-Chieh

To address data limitations in neural retrieval, weak supervision leverages existing ranking methods to automatically generate pseudo relevance judgments. However, several limitations remain. First, the size and accessibility of the query collection pose challenges for existing methods in generating a sufficiently large and diverse set of weak signals. Second, while empirical and theoretical evidence demonstrates that a weakly supervised neural ranking model can outperform the original ranker, the quality of the information provided by the original ranking models highly correlates with, and constrains, the overall performance of weakly supervised models. To overcome these limitations, this dissertation focuses on employing a neural generative approach for data generation in weak supervision settings, incorporating effective techniques. To address query scarcity, a query augmentation framework is designed that utilizes GAN-based methods to expand an insufficient query set. The evaluation results indicate that augmentation enhances ranking performance, particularly when the original query set is inadequate to support weakly supervised training. In the context of e-commerce applications, an ensemble approach is devised to generate pseudo queries from customer reviews.
These generated queries exhibit similarity to real customer queries and effectively enhance ranking performance within the weak supervision framework. To tackle the challenge of weak labeler quality, we propose a framework called generalized weak supervision (GWS). This framework extends the definition of a weak labeler to include the weakly supervised model itself. Through iterative re-labeling, the quality of pseudo relevance judgments is improved without the need for additional data. We present four implementations of the GWS framework, demonstrating significant improvements in ranking tasks compared to weak supervision methods. Finally, we extend weak signals generated from large language models (LLMs) to explanations in natural language. Through this extended form, the ranking ability of LLMs is transferred to smaller models more effectively.

Publication TOWARD UNIFIED EXPERTISE: ONE MODEL FOR ALL TASKS (2025-02) Chen, Zitian

Understanding the real visual world involves processing diverse forms of perception and learning the intrinsic connections among different perceptions. Humans exhibit a remarkable ability to adapt and respond appropriately to various types of visual stimuli, whether a glimpse of the 3D real world, a 2D black-and-white image, or a blurry video clip. In contrast, visual recognition systems often encounter challenges when learning from multiple sources. One such challenge is gradient conflict, where gradients from different tasks contradict each other; this conflict can lead to breakdowns in a system's ability to learn across multiple tasks simultaneously. Another challenge is catastrophic forgetting, where a neural network trained sequentially on different tasks overwrites what it previously learned, which can occur at any point between training iterations. This dissertation aims to endow visual recognition systems with multi-task learning (MTL) ability, enabling them to transfer knowledge inductively between tasks. Deep learning naturally clusters similar concepts while maintaining separation between unrelated ones in the data and feature space. Here, the objective is to replicate this effect but optimize within the parameter and task spaces. Challenges such as gradient conflict and catastrophic forgetting can be mitigated by strategically assigning parameters to the most suitable set of tasks, allowing better performance across tasks without one task interfering with another. With this motivation, this dissertation seeks to identify the most effective neural network architectures for MTL. These architectures should support a scaling law, where increasing the model size, the amount of data, and the number of tasks leads to improved performance across a broad range of tasks, though with diminishing returns as the scale continues to grow. We begin by addressing fundamental visual tasks such as object localization and object categorization. As an initial step, a unified framework was designed to incorporate these basic perceptual capabilities and enable knowledge transfer between tasks. To achieve this, the transformative dynamics between localization and categorization were parameterized and directly modeled. This approach involves designing the architecture and allocating model parameters for various purposes based on human understanding.
Beyond manual efforts in architecture design, we explore methods for automatically allocating model parameters to specific tasks. This involves creating a framework in which different parts of the model can specialize in learning distinct tasks. To achieve this, we introduce the concept of the mixture of experts (MoE), where each expert represents a fundamental building block of the model. These experts can either be shared across a set of tasks or dedicated to a single task, depending on what the system needs. By structuring the model in this way, we avoid the limitations of sharing the entire backbone for every task while still enabling knowledge transfer between tasks. We further extend this approach to manage a large number of tasks efficiently. Our strategy focuses on dynamically allocating resources to ensure that, as the system scales to more tasks, it maintains high performance and efficiency, allowing the model to grow gracefully without overwhelming computational resources. Furthermore, the strategy of optimizing the parameter and task spaces can extend beyond efficient upstream pre-training to accommodate the diverse needs of downstream applications. In line with this approach, we delve into dynamic structured optimization techniques for adaptable and efficient downstream learning. We explore how the adaptive nature of MoE layers can enable fine-tuning, support continual learning, and provide effective control over model capacity and computational cost. Just as humans have specialized body parts (hands, brains, and more), each suited to specific functions, neural networks are composed of parameters that serve as the AI's specialized components for different tasks. Our goal is to teach AI how to coordinate these diverse elements, much as the human body seamlessly orchestrates its parts, allowing it to manage a wide range of tasks. By optimizing how these components are allocated and adapted to different tasks, we aim to build AI systems that can handle complex and varied applications efficiently while scaling gracefully.

Publication Accelerating Sustainability of the Electricity Grid using Distributed Energy Resources (2025-02) Bovornkeeratiroj, Phuthipong

In recent years, the impacts of climate change have become more visible, raising concern and prompting active movement in sustainability efforts. For instance, the energy transition, which focuses on shifting from traditional fossil fuels towards renewable energy sources, is a critical strategy for mitigating the effects of climate change. The electrical grid is an important part of the energy transition, since in many locations it still relies heavily on dirty sources such as coal, oil, and natural gas. Moreover, other sectors such as transportation, industry, and agriculture are transitioning to an all-electric economy to reduce their emissions, which leads to higher demand on, and emissions from, the grid. The advance of the Internet of Things (IoT) and the proliferation of high-capacity networked energy devices at the household level, such as electric vehicles (EVs), batteries, and heating, ventilation, and air conditioning (HVAC) systems, have introduced opportunities for transforming electric demand at the house level and for coordinated control of residential loads at large scale. This provides a new and powerful form of demand response, from both environmental and consumer perspectives, to accelerate the sustainability of the grid.
This thesis puts forth a central focus on sustainability in electricity grids in the presence of distributed energy resources. At the same time, it takes a human-centric design approach that considers the environmental, economic, convenience, and privacy aspects of electricity consumers in the design of its systems and algorithms. To address these challenges, first, I propose a grid peak shaving framework consisting of peak prediction and a control algorithm that utilizes a distributed and heterogeneous pool of energy resources to perform flexible grid peak shaving; the algorithm can take homeowners' preferences into consideration. Second, I examine electricity grid peak patterns and present peak prediction algorithms that can predict the peak time of the day, the peak day of the month, and the peak day of the year, respectively. I also provide reference datasets for peak forecasting in energy systems. Third, to prevent privacy leakage from electricity consumption data, I introduce an algorithm that shifts electricity demand with household batteries to prevent occupancy leakage while preserving other useful information. Finally, I analyze the potential conflict between electricity prices and carbon emissions and the resulting trade-offs in carbon-aware and cost-aware load scheduling, and I present a control algorithm that balances reducing carbon and cost while still respecting user and grid constraints.

Publication From LoRa Sensing to Coexistence of LoRa Sensing and Communication (2024-09) Xie, Binbin

Wireless sensing is an exciting new research area that can benefit a large spectrum of disciplines, including elderly care, HCI, environment monitoring, and disaster response. The key difference between wireless sensing and traditional sensor-based sensing is that the target does not need to be equipped with any sensors; the wireless signal itself is utilized to sense the context information of humans. The rationale behind wireless sensing is that wireless signals vary with human movement. For instance, when a person moves in a room covered by WiFi, the WiFi signal reflected from this person varies with his/her movement. By analyzing the signal variation, motion information such as target moving speed and respiration rate can be obtained. The contact-free and sensor-free nature of wireless sensing makes it particularly appealing in challenging scenarios such as pandemics and disaster survivor detection. During the COVID-19 pandemic, it was preferred that patients' respiration rates could be monitored in a contact-free manner through walls. In disasters such as building collapses, where survivors do not have any sensors with them, wireless sensing can be crucial in detecting their presence and saving lives. While promising in many aspects, several critical issues hinder wireless sensing from being widely deployed in real-life scenarios: (1) very limited sensing range, due to the intrinsic nature of employing weak reflection signals for sensing; (2) strong interference from other objects in the environment; and (3) severe degradation of sensing performance in the presence of the ongoing communication function of wireless technologies. This thesis explores the exciting opportunity of employing LoRa, the emerging wireless protocol designed for IoT device connections, to realize long-range wide-area wireless sensing. This thesis addresses these fundamental issues by making the following contributions.
First, we adopt a chirp concentration scheme that fully exploits the properties of the LoRa chirp to improve signal power and accordingly boost the sensing range. Second, to mitigate the impact of interference, we propose the concept of a "virtual fence" to constrain sensing to the area of interest. The location and size of the virtual fence can be flexibly controlled in software to meet the requirements of different applications. Finally, to make LoRa-based wireless sensing work in the presence of ongoing communication, we propose to employ the reversed chirp, i.e., the downchirp, for sensing, and keep the original upchirp for communication. This design smartly leverages the orthogonality between downchirp and upchirp to address the issue of communication interference on sensing. While the upchirp-downchirp design can remove most of the interference, we further adopt a novel chirp rotation method to deal with the remaining power leakage interference from upchirp to downchirp, enhancing the sensing performance.

Publication LTE-Based Pervasive Sensing across Indoor and Outdoor (2024-09) Feng, Yuda

Besides their communication function, wireless signals such as WiFi, Bluetooth, and UWB have recently been exploited for sensing purposes. However, designing a wireless sensing system that provides truly pervasive coverage at city or even national scale remains challenging. In this dissertation, we propose to bring pervasive LTE signals into the ecosystem of wireless sensing, enabling various sensing applications for humans, vehicles, and agriculture. In the first part, we exploit the unique advantages of downlink LTE sensing for movement detection across different scales and resolve the corresponding challenges. We demonstrate the advantages of LTE sensing using two typical applications: fine-grained indoor respiration monitoring and large-scale outdoor car speed estimation. The proposed system achieves highly accurate respiration sensing while greatly mitigating the common blind-spot and orientation-sensitivity issues. In the second part, we combat the inherent limitations of downlink LTE sensing, i.e., the low signal quality due to long propagation and significant variation across different areas. We found the key insight in the unique asymmetry of downlink and uplink transmissions. Accordingly, we propose to leverage the complementary features of LTE uplink and downlink signals in signal power, bandwidth, and sensing rate. We propose noise-resistant combination algorithms and develop robust LTE sensing, expanding the sensing coverage more than 4 times and extending to general dynamic movement detection. In the last part, we enable LTE sensing of a non-movement physical property: soil moisture in agriculture. Soil moisture sensing is a basic function in modern precision irrigation. Multiple wireless soil moisture sensing solutions, such as WiFi and RFID, have been proposed; however, they can hardly support large-scale deployment in farm-field environments. LTE signals provide a unique opportunity for soil moisture sensing thanks to ubiquitously deployed base stations. We, for the first time, propose low-cost and low-power LTE-based soil moisture sensing.
Our low-cost sensing system ($55) achieves a high accuracy (3.15%) comparable to high-end soil sensors ($850), wide coverage (2.4 km from the base station), and low power consumption (lasting 16 months on batteries).

Publication Resource Management for Edge AI (2024-09) Liang, Qianlin

With the proliferation of IoT devices and the continuous advancement of AI algorithms, edge AI, which represents the synergy of edge computing and artificial intelligence, has garnered increasing attention from both academia and industry. By pushing the AI frontier to the edge ecosystem, which is closer to users, edge AI provides substantial benefits such as low-latency inference, reduced network bandwidth usage, and enhanced user privacy. However, deploying compute-intensive AI models on resource-constrained edge platforms presents substantial challenges for resource management, which plays a key role in realizing these benefits and ensuring the success of edge systems. It is imperative to efficiently schedule and share the heterogeneous and limited edge resources, including emerging specialized AI accelerators such as GPUs and TPUs, to adapt to dynamic edge workloads and satisfy their low-latency requirements. Additionally, energy, particularly for battery-powered edge devices, must be treated as a scarce resource, necessitating efficient operation to support the long-term execution of workloads. This thesis addresses pivotal challenges of resource management in edge AI. By optimizing resource and energy efficiency for AI applications within the constraints of edge computing environments, it aims to enhance hardware utilization, reduce costs, and improve application performance and reliability.

Publication Developing Digital Biomarkers of Early Childhood Mental Health using Multimodal Sensor Data (2024-09) Kalanadhabhatta, Manasa

Pediatric mental health is a growing concern around the world, with mental, emotional, and behavioral disorders affecting children's social-emotional development and increasing the risk of adverse behavioral outcomes later in life. However, diagnosing mental health disorders in early childhood remains challenging. Caregivers are often unable to accurately identify signs of problematic behavior, and many lack access to specialized screening services. Digital biomarkers from passively sensed signals collected using smartphones and wearable devices have shown remarkable promise for mental health screening at scale. Nevertheless, such digital mental health tools have yet to make a significant mark in pediatric settings. While this may partly be driven by caregivers' perspectives toward such tools, the fact that children rarely tend to be independent users of mobile and wearable devices is also a key deterrent to developing scalable digital biomarkers of mental health in younger populations. In this thesis, I attempt to bridge this pediatric mental health diagnosis gap by developing novel digital tools that enable screening for problem behaviors in a convenient and scalable manner. These screening tools leverage multimodal signals that can be recorded using ubiquitous devices in the home while children are engaged in brief, clinically validated play-based interactions. I establish the technical feasibility of developing machine learning models to detect interaction-based biomarkers of attention-deficit/hyperactivity, disruptive behavior, and other externalizing disorders using behavioral (audio, video) and physiological (heart rate, electrodermal activity) signals.
I incorporate these biomarkers into three new home-based assessments that can be realized using off-the-shelf mobile and wearable devices to predict not just behavioral symptoms but also their neurophysiological underpinnings, thus providing richer insight into the trajectories of early problem behaviors. To facilitate the integration of these next-generation screening tools into existing mental healthcare ecosystems, I further outline design recommendations for such tools by distilling findings from stakeholder studies involving parents and child mental health practitioners. This work thus sets the stage for ubiquitous technologies that can obtain rich, multidimensional data in the wild and enable screening for early childhood mental health concerns at scale.

Publication Leveraging Explanations for Information Retrieval Systems under Data Scarcity (2024-09) Yu, Puxuan

The importance of explanations in the advance of information retrieval (IR) systems is on the rise. On one hand, this is driven by the increasing complexity of IR systems and the demand for transparency and interpretability from users; on the other hand, explanations can inherently improve the effectiveness of IR systems without necessarily being displayed to users. However, data scarcity poses significant challenges in developing these explanations, as acquiring high-quality explanations for relevance judgments is prohibitively expensive yet crucial for training neural network-based IR models and explanation generation models. To overcome these challenges, we utilize open-domain knowledge and generative language models to facilitate the generation of user-oriented explanations for various IR tasks limited by data availability. We start by introducing a novel model-agnostic task for search result explanation that emphasizes context-aware summaries, detailing each document's relevance to the query and to other documents. To address this task, we design a novel Transformer-based encoder-decoder architecture. Next, we develop an inherently explainable IR model specifically designed to provide diversified reranking of retrieved documents. This model is pre-trained on open-domain data using explanation tasks, achieving state-of-the-art results in search result diversification with minimal domain-specific data. Additionally, we explore how natural language explanations can enhance the capabilities of generative language models to augment IR datasets through synthetic query generation, achieved by automatically identifying similarities and differences between document pairs. Finally, we utilize zero-shot generative language models to directly elicit natural language explanations of the relevance between search queries and candidate documents, providing crucial auxiliary information for the calibration of neural ranking models and thus enhancing their ability to generate meaningful scores.

Publication Context-Aware Query and Document Representation in Information Retrieval Systems (2024-09) Naseri, Shahrzad

Input representation has a major impact on the effectiveness of Information Retrieval (IR) systems, and developing context-aware input representations for IR systems is crucial to answering users' complicated information needs. The goal of this work is to take advantage of contextual features in representing the query and the document to enhance IR system performance. We focus on three sources of contextual features: 1. Entities, defined as things or concepts that exist in the world; 2.
Context within pseudo-relevance feedback documents in IR systems; and 3. Context within example documents provided by the user as the IR system's input. We first introduce a dense entity representation based on the relationships between an entity and the other entities described within its summary. We explore its use in the entity ranking task by representing both queries and documents using this model. By integrating this ranking methodology with a term-based ranking method, we achieve statistically significant improvements over the term-based ranking approach. Further, we develop a retrieval model that merges term-based language model retrieval, word-embedding-based ranking, and entity-embedding-based ranking, resulting in the best performance. Additionally, we introduce an entity-based query expansion framework employing local and global entity knowledge sources, i.e., corpus-indexed entities and the summary-expanded entity embedding. Our results demonstrate that our entity-based expansion framework outperforms the learned combination of word-based expansion techniques. We then focus on leveraging the context of pseudo-relevance feedback (PRF) documents to rank terms relevant to the user's query. To achieve this, we utilize transformer models, which excel at capturing context through their attention mechanisms, and expand the query with the top-ranked terms. We propose both unsupervised and supervised frameworks. Our unsupervised model employs transformer-generated embeddings to calculate the similarity between a term (from a PRF document) and the query while considering the term's context within the document. Our results demonstrate that this unsupervised approach outperforms static embedding-based expansion models and performs competitively with state-of-the-art word-based feedback models (relevance model variants) across multiple collections. The supervised framework approaches query expansion as a binary classification task, aiming to identify terms within the PRF documents that are relevant to the query. We utilize transformer models in a cross-attention architecture to predict relevancy scores for candidate terms. This supervised approach yields performance comparable to term-frequency-based feedback models such as the relevance model variants. Moreover, combining it with the relevance model results in greater improvement than either model used independently. Finally, we concentrate on leveraging the context of example documents provided by the user in the query-by-example retrieval problem to formulate a latent query that represents the user's information need. We construct three query-by-example datasets and develop several transformer-based re-ranking architectures. Our Passage Relevancy Representation by Multiple Examples (PRRIME) model overcomes BERT's context-window limitations by segmenting query examples and candidate documents into passages. It then trains an end-to-end neural ranking architecture to aggregate passage-level relevance representations, demonstrating improvement over the first-stage ranking framework. Additionally, we explore a cross-encoder reranking architecture using the Longformer transformer model for query-by-example retrieval, aiming to capture cross-text relationships, particularly aligning or linking matching information elements across documents.
This approach shows statistically significant improvement on the test set of the dataset on which it is trained, but it performs worse than the baseline on the other two datasets, which have limited fine-tuning data, indicating limited knowledge transferability. Finally, we investigate a dual-encoder reranking architecture that learns query and document representations through an auxiliary training paradigm: query prediction serves as an auxiliary task alongside the main ranking objective. It outperforms both the initial retrieval stage and the single-loss training method, i.e., training the dual encoders solely with a ranking objective.

Publication Advancing Acoustic Sensing from the Laboratory to Real World: Theories, Applications, and Practical Considerations (2024-09) Li, Dong

With the proliferation of voice assistants, speakers and microphones are essential components of billions of smart devices that people interact with on a daily basis, such as smartphones, smart watches, smart speakers, and home appliances. This dissertation explores the transformation of these devices from simple audio tools into sophisticated acoustic radars, expanding their applications beyond basic audio playback and voice interaction to include gesture tracking, vital sign monitoring, and eye blink detection. We address fundamental technical challenges and practical considerations, which not only resolve existing system limitations but also facilitate the creation of new applications. One major challenge in acoustic sensing is tracking multiple targets simultaneously, due to the inherent nature of contact-free tracking: signals reflected from multiple targets are mixed at the microphone, making it difficult to separate them and obtain the context information of each individual target. FM-Track pioneers contact-free multi-target tracking using acoustic signals. A signal model is introduced to characterize the location and motion status of targets by fusing information from multiple dimensions (i.e., the range, velocity, and angle of targets). A series of techniques is then developed to separate the signals reflected from multiple targets and accurately track each individual target. FM-Track can successfully differentiate two targets with a spacing as small as 1 cm. Another significant challenge for acoustic sensing is the extremely limited sensing range, particularly for fine-grained activities, due to weak signal reflections. LASense dramatically increases the sensing range for fine-grained human activities by introducing a virtual transceiver idea that purely leverages delicate signal processing techniques in software. LASense significantly increases the sensing range of respiration monitoring from the state-of-the-art 2 m to 6 m, and enhances the sensing range of finger tapping and eye blink detection by 150% and 80%, respectively. Additionally, this dissertation demonstrates how to apply acoustic sensing techniques to enable new applications, i.e., "listening" to your hand gestures using smart speakers. In SpeakerGesture, we develop a series of novel signal processing techniques and implement our system on two commodity smart speaker prototypes. SpeakerGesture achieves over 90% accuracy in gesture recognition even when the user is 4 m away from the smart speaker and there is strong interference. Finally, this dissertation shares experience and findings from transitioning acoustic sensing systems from laboratory settings to real-world environments.
We identify multiple practical considerations that have received little attention in the research community and propose corresponding solutions. The challenges include: (i) annoying audible sound leakage caused by acoustic sensing; (ii) acoustic sensing interfering with music playback and voice calls; (iii) acoustic sensing consuming a significant amount of power, degrading battery life; and (iv) real-world device mobility causing acoustic sensing to fail.

Publication Modeling Cross-Lingual Knowledge in Multilingual Information Retrieval Systems (2024-09) Huang, Zhiqi

In many search scenarios, language can become a barrier to comprehensively fulfilling users' information needs. An Information Retrieval (IR) system equipped with an extra language translation component is capable of mapping words across languages, enabling it to retrieve documents according to the user's query regardless of the language in which the query and documents are expressed. Effectively incorporating multilingual knowledge is the key to building the translation component. Such knowledge can be obtained from dictionaries, machine translation modules, or multilingual pre-trained language models. For these different forms of multilingual knowledge, we present cross-lingual knowledge injection, transfer, and language debiasing techniques to enhance the effectiveness of Cross-lingual Information Retrieval (CLIR) and Multilingual Information Retrieval (MLIR). Specifically, by utilizing multilingual knowledge at various levels, from individual word translations to parallel and non-parallel corpora, we develop new model architectures and training objectives tailored to information retrieval tasks across diverse linguistic settings. First, we introduce a mixed attention Transformer layer, which augments the attention matrix with mutually translated words between query and document, and investigate its effectiveness on CLIR tasks. Next, we study cross-lingual transfer in IR models and demonstrate a knowledge distillation framework that addresses the data scarcity problem in model training and improves retrieval effectiveness for low-resource languages. Then, we focus on a special MLIR setting in which the query is in one language and the collection is a mixture of languages. To address the problem of inconsistent ranking results across languages, we design an encoder-decoder model that maps document representations from different languages into the same embedding space. We also present a decomposable soft prompt to capture unique and shared properties across languages. Finally, we introduce a language debiasing method to identify and remove linguistic features from a multilingual embedding space. This approach significantly diminishes the need for parallel data in constructing MLIR models, allowing the use of non-parallel data instead. By reducing language-specific factors in the training process, we improve retrieval effectiveness across all linguistic settings (e.g., monolingual, cross-lingual, and multilingual), thereby facilitating language-agnostic information retrieval.
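As a minimal illustration of the debiasing idea (a toy construction under our own assumptions; the method developed in the thesis may differ), one can estimate a low-rank "language subspace" from per-language mean embeddings and project every embedding onto its orthogonal complement:

```python
import numpy as np

def debias(embeddings: np.ndarray, lang_ids: np.ndarray, rank: int = 2) -> np.ndarray:
    """Project out a low-rank language subspace (toy sketch, not the thesis method)."""
    # Per-language mean offsets from the global mean span language-specific variation.
    global_mean = embeddings.mean(axis=0)
    lang_means = np.stack([embeddings[lang_ids == l].mean(axis=0) - global_mean
                           for l in np.unique(lang_ids)])
    # Top principal directions of those offsets form the "language subspace".
    _, _, vt = np.linalg.svd(lang_means, full_matrices=False)
    basis = vt[:rank]                                   # (rank, dim)
    # Remove each embedding's component inside that subspace.
    return embeddings - (embeddings @ basis.T) @ basis

emb = np.random.randn(1000, 128).astype(np.float32)     # toy multilingual embeddings
langs = np.random.randint(0, 4, size=1000)              # toy labels for 4 languages
emb_debiased = debias(emb, langs)
```

Representations cleaned this way retain semantic structure while suppressing language identity, which is the property that language-agnostic retrieval relies on.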