Funded Projects & Fellows

2024 Awards

Faculty Research Projects

PI: Luca Carloni (Computer Science)

The primary goal of this project is to develop and disseminate a novel research platform for the design, optimization, and programming of artificial intelligence (AI) hardware accelerators and their integration into a complete system-on-chip (SoC) architecture. A platform is defined as the combination of a computer architecture and a companion computer-aided design (CAD) methodology. The proposed platform leverages FPGA technology for SoC emulation and prototyping. A complementary goal is to make the platform portable from “local FPGAs” in the lab to “remote FPGAs” in the cloud by demonstrating its deployment on Amazon EC2 F1 instances. As an educational component, the platform will be used in future offerings of the course “System-on-Chip Platforms,” which the PI has taught regularly at Columbia over the past decade. In terms of broader impact, the platform will be released as an open-source project, promoting collaboration among researchers across universities, industry, and government labs and supporting the development of similar courses at other institutions.

PI: Josh Alman (Computer Science)

The time to perform attention computations is the bottleneck of large language model operations. This project aims to design faster algorithms for this problem and, where speedups are impossible, to give complexity-theoretic proofs of that impossibility. The starting point is recent work by the PI showing that an algorithmic technique called the polynomial method is worst-case optimal for attention over long sequences. The PI plans to investigate important generalizations and practical implications: whether the inputs that arise in practice can be handled more quickly, how to apply other algorithmic techniques such as Fast Multipole Methods to model very long sequences, and other applications of these techniques for implicitly manipulating matrices in machine learning.
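For context, the computation in question is standard softmax attention, whose naive evaluation forms an n-by-n score matrix and therefore scales quadratically in sequence length. A minimal NumPy sketch of that baseline (our illustration of the bottleneck, not code or an algorithm from this project):

```python
import numpy as np

def naive_attention(Q, K, V):
    """Baseline softmax attention for a length-n sequence.

    Q, K, V are (n, d) arrays. The (n, n) score matrix makes time and
    memory grow as O(n^2 * d); this quadratic cost is the bottleneck
    that faster-attention algorithms and lower bounds both target."""
    scores = Q @ K.T / np.sqrt(Q.shape[1])         # (n, n): the quadratic step
    scores -= scores.max(axis=1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ V                             # (n, d) output

rng = np.random.default_rng(0)
n, d = 1024, 64
out = naive_attention(rng.normal(size=(n, d)), rng.normal(size=(n, d)),
                      rng.normal(size=(n, d)))
```

Doubling n quadruples the work, which is why sub-quadratic algorithms (or proofs that none exist) matter at long context lengths.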

PIs: Gamze Gursoy, Noemie Elhadad (Biomedical Informatics)

Large language models (LLMs) like GPT (Generative Pre-trained Transformer) and open-source models like Mistral promise to revolutionize healthcare. We argue that it is critical to investigate the privacy implications of LLMs in the clinical domain, and in particular to quantify the hidden leakage that can occur from these models. We aim to enhance privacy measures for clinical data used in LLMs by developing a novel exposure metric that quantifies privacy leakage and reflects the unique aspects of clinical data. This involves a data-driven approach that leverages information theory to incorporate the co-occurrence and frequency of specific data points (tokens and medical entities) observed in a patient population. We will integrate information-theoretic measures into the exposure metric, providing a more nuanced understanding of data-exposure risks, and leverage state-of-the-art NLP metrics that account for the high paraphrasing power of clinical language.
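The proposal does not spell out the metric's final form; as a loose, hypothetical illustration of one ingredient it names (token frequency in a patient population), the sketch below scores a record by the summed surprisal of its tokens, so rarer and therefore more identifying tokens contribute more exposure. Co-occurrence statistics and paraphrase-aware NLP metrics, which the proposal also calls for, are omitted here, and all names are ours:

```python
import math
from collections import Counter

def exposure_score(record_tokens, population_counts, total_tokens):
    """Hypothetical surprisal-based exposure score: sum of -log2 p(token)
    under population frequencies. Rare tokens (e.g., an unusual diagnosis)
    are treated as leaking more identifying information."""
    vocab = len(population_counts)
    score = 0.0
    for tok in record_tokens:
        # Laplace smoothing so unseen tokens get finite, high surprisal.
        p = (population_counts.get(tok, 0) + 1) / (total_tokens + vocab)
        score += -math.log2(p)
    return score

corpus = ["fever", "cough", "fever", "cough", "rare_finding"]
counts = Counter(corpus)
print(exposure_score(["fever"], counts, len(corpus)))         # common: low exposure
print(exposure_score(["rare_finding"], counts, len(corpus)))  # rare: higher exposure
```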

PI: Junfeng Yang (Computer Science)

Generative code models have transformed the software development landscape. Built on sophisticated generative large language models (LLMs), these platforms have garnered widespread adoption, assisting millions of developers globally in coding. Unfortunately, writing code constitutes only about 40% of software development effort. Researchers and practitioners have long understood that the bulk of the work lies in software maintenance, much of which consists of critical code-editing tasks such as vulnerability patching, performance optimization, paraphrasing to circumvent licensing issues, and refactoring. These tasks, integral to the software development lifecycle, are more constrained than code generation: they must preserve the original version’s correct semantics. This preservation requirement often conflicts with objectives that necessitate code alterations. For instance, performance optimization aims to change code to improve its efficiency without impacting its intended functionality. Vulnerability patching seeks to correct flaws without disrupting the existing correct logic. Paraphrasing, similarly, involves rewriting code to avoid licensing conflicts without altering its intended semantics. Compounding these challenges, developers may need to optimize code, patch vulnerabilities, and paraphrase to avoid license infringement concurrently, all while preserving the original code’s correct semantics.

The goal of this project is to create EDITGUARDIAN, a novel LLM framework for code editing. Unlike existing methods, EDITGUARDIAN prioritizes semantic preservation alongside other editing objectives and harnesses the universal capabilities of cutting-edge LLMs without depending on task-specific adjustments.
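The abstract does not describe EDITGUARDIAN's internals, but "semantic preservation" has a familiar operational check that may help ground the idea: differential testing, which runs the original and edited code on shared inputs and flags any behavioral divergence. A minimal sketch under that assumption (our illustration, not the project's method):

```python
import random

def differential_test(original_fn, edited_fn, gen_input, trials=1000):
    """Compare two implementations on random inputs; any mismatch is a
    witness that the edit changed observable semantics."""
    for _ in range(trials):
        x = gen_input()
        if original_fn(x) != edited_fn(x):
            return False, x        # counterexample found
    return True, None              # no divergence observed (not a proof)

# Example: a performance edit that replaces a loop with a closed form.
slow = lambda n: sum(range(n + 1))
fast = lambda n: n * (n + 1) // 2
print(differential_test(slow, fast, lambda: random.randint(0, 10**6)))
```

Testing can only provide evidence, not proof, of equivalence, which is part of what makes semantics-preserving editing a hard constraint for generative models.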

CAIT PhD Fellows

Project Summary: Large language models (LLMs) have demonstrated remarkable capabilities, yet their internal workings remain largely opaque. This lack of transparency hinders our understanding of how these models represent knowledge and execute complex tasks. This project aims to illuminate the internal representations of LLMs by applying a novel approach: causal representation learning. By adapting techniques designed to uncover causal mechanisms in real-world systems, we seek to identify: 1. Concept Encoding: where and how LLMs encode various concepts, including those with significant implications such as bias and toxicity. 2. Mechanism Specialization: which neural network components (e.g., attention heads) specialize in different tasks. This research will advance foundational LLM science by exploring how LLMs represent knowledge (e.g., whether representations are modular or linear). Furthermore, it could enable novel internal checks for safe LLM deployment, reducing reliance on output text alone.

Project Summary: Autoregressive large language models (LLMs) now dominate the landscape of text generation. In few-shot and even zero-shot settings, LLMs like GPT-4 exhibit a remarkable aptitude for rewriting fluent text in various styles or with specific attributes. These large models, however, require extensive training and inference-time resources, which restricts broader adoption. In parallel to these advances, other lines of work have explored alternative controllable text generation approaches, including text diffusion models. Unlike LLMs, these lighter-weight methods rely on fewer parameters, are not limited to text-based prompting, and demonstrate increased sample diversity. However, this comes at a cost: non-LLM controllable text approaches are substantially more inconsistent and regularly produce disfluent outputs. In our work, we will first explore whether text diffusion models, along with other efficient controllable text generation alternatives, can be made competitive with GPT-4 through inference-time procedures like re-ranking. We are motivated by recent advances in controllable approaches and the increasing availability of strong discriminative ranking functions for most text generation tasks, such as style and fluency classifiers. Additionally, these efficient approaches are highly parallelizable, enabling diverse candidates for ranking. Second, we will investigate the potentially synergistic role of LLMs as ranking functions alongside existing controllable approaches. As part of our work, we would like to re-examine the conventional evaluation procedure for text generation, which prioritizes mean performance, and ask whether the existence of powerful text classifiers warrants a renewed focus on sample diversity and upper-decile performance in controllable text generation systems.
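As a concrete picture of the inference-time re-ranking idea, the sketch below (hypothetical names and toy scorers, not the project's system) selects the final output by scoring each candidate with a weighted combination of discriminative functions such as a style classifier and a fluency model:

```python
def rerank(candidates, scorers, weights):
    """Return the candidate with the best weighted score; candidates can
    be scored independently, which suits highly parallelizable generators."""
    def combined(text):
        return sum(w * s(text) for s, w in zip(scorers, weights))
    return max(candidates, key=combined)

# Toy stand-ins for trained classifiers.
style = lambda t: 1.0 if t.isupper() else 0.0        # pretend the target style is all-caps
fluency = lambda t: min(len(t.split()) / 10.0, 1.0)  # pretend longer means more fluent
best = rerank(["hello there", "HELLO THERE FRIEND"], [style, fluency], [0.7, 0.3])
print(best)  # "HELLO THERE FRIEND"
```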

2023 Awards

Faculty Research Projects

PI: Elias Bareinboim

Artificial intelligence (AI) plays an increasingly prominent role in society as decisions that were once made by humans are now delegated to automated systems. These systems currently decide bank loans, incarceration, and the hiring of new employees, and it is not hard to envision that they will in the future underpin most of the decisions in society. Despite the growing concern about issues of transparency and fairness, and the high complexity entailed by this task, there is still not much understanding of the basic properties of such systems. For instance, we currently cannot detect whether an AI system is operating fairly (i.e., abiding by the decision constraints agreed upon by society) or whether it is reinforcing biases and perpetuating a preceding prejudicial practice. Additionally, there is no clear understanding of the various metrics used to evaluate the different types of influence that a protected attribute (such as race, gender, or religion) exerts on outcomes and predictions. In practice, this translates into the current state of affairs, where a decision is almost invariably made without much discussion or justification. To assist AI designers in developing systems that are ethical and fair, we will build on recent advances in causal inference to develop a principled and general causal framework for capturing and disentangling the different causal mechanisms that may be simultaneously present. Besides providing a causal formalization for fairness analysis, we will investigate the admissibility conditions, decomposability, and power of the proposed fine-grained causal measures of fairness. This will allow us to quantitatively explain the total observed disparity in decisions through the different underlying causal mechanisms commonly found in real-world decision-making settings.
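In schematic form (a sketch following the spirit of the PI's prior work on causal explanation formulas; exact definitions and sign conventions live in that literature), the quantity to be explained is the total variation between groups $x_0$ and $x_1$,

$$\mathrm{TV}_{x_0,x_1}(y) = P(y \mid x_1) - P(y \mid x_0),$$

and the framework's goal is to attribute it to distinct mechanisms: a direct component (the path $X \to Y$), an indirect component (mediated paths such as $X \to W \to Y$), and a spurious component (back-door paths through confounders), each of which society may judge acceptable or objectionable separately.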

PI: Hod Lipson

About 55% of human face-to-face communication is nonverbal. Whereas ongoing progress in language models such as ChatGPT and BingChat is radically advancing the verbal component of conversations, the nonverbal portion of human-machine interaction is not keeping pace. Animatronic robotic faces are stiff and slow. Robots cannot express a natural smile, let alone more sophisticated expressions. Robots cannot sync their lip motion to properly match enunciated phonemes. They cannot match their facial expression to their speech tone or content. This growing chasm between advancing verbal content and poor nonverbal ability will prevent AI from reaching its potential for full human engagement.

The goal of this research pilot is to explore architectures that will allow robots to begin to learn the subtle but critical art of physical facial expressions. Our lab has developed a soft animatronic face platform containing 26 soft actuators, most of them around critical expression zones such as lips and eyes. We aim to study two key communication pathways: The first is learning what facial expression to make (and when) based on conversational context, and the second is learning how to physically articulate these expressions on a given soft face.

PI: Kathy McKeown, Co-PI: Carl Vondrick

The ubiquity of art on the internet demands better ways of organizing and making sense of visual art. We propose an investigation into unified methods for representing and describing artistic images. Building on our preliminary work in this area, we will begin by studying the representations produced by large pre-trained vision and language models to understand the kinds of aesthetic information they encode (e.g., color, form, style, emotion, subject matter). Using that knowledge, we propose a follow-up study of greater difficulty: generating descriptive captions (i.e., describing only what a sighted person sees) and interpretative captions (i.e., incorporating contextual information about the art, artist, time period, and movement). We believe this line of inquiry has the potential to drive social good and commercial value, expanding access for the visually impaired while simultaneously enabling better tools for a range of commercial scenarios.

PI: Eugene Wu, Co-PI: Rachel Cummings

We propose to develop a scalable and differentially private data market system and to deploy a version for the Columbia campus. The data market system allows anyone who has a machine learning task to upload their training dataset (in a differentially private form) and search for other datasets that could augment their training data to produce a more accurate model. At the same time, data providers can upload differentially private summaries of their datasets to be indexed by the platform. Differential privacy, the gold standard in data privacy, guarantees anonymity for individuals; it allows sensitive data such as medical records or student grade information to be shared for research without leaking individually identifiable information. In a Columbia deployment, researchers and teams throughout the university would be able to register the data they have available and benefit from the collective capacity of the whole university.
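As a small illustration of the kind of summary a provider might upload, the sketch below applies the standard Laplace mechanism to a histogram (our hypothetical example, not the system's actual interface). Adding or removing one person's record changes each count by at most 1, so noise with scale 1/epsilon yields epsilon-differential privacy for the released counts:

```python
import numpy as np

def dp_histogram(values, bins, epsilon, seed=0):
    """Release a histogram with Laplace noise calibrated to sensitivity 1."""
    rng = np.random.default_rng(seed)
    counts, edges = np.histogram(values, bins=bins)
    noisy = counts + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=counts.shape)
    return noisy, edges

# Hypothetical provider data: 500 student grades.
grades = np.random.default_rng(1).normal(80, 10, size=500)
noisy_counts, bin_edges = dp_histogram(grades, bins=10, epsilon=0.5)
```

Smaller epsilon means stronger privacy and noisier summaries; the platform's search quality would degrade gracefully with that choice.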

CAIT PhD Fellows

Project Summary: Chen says, “One of my goals in applying for the CAIT fellowship was to develop scalable machine learning models or algorithms that can efficiently process massive amounts of data in industrial applications, with the support of the CAIT fellowship. Besides, I aim to enhance various business operations within Amazon.”

“Throughout this period, I developed a Pseudo-Bayesian optimization framework tailored for scalable black-box optimization. Within this framework, we constructed several algorithms based on kernel regression and randomized prior, yielding superior performance in benchmarking with state-of-the-art Bayesian optimization variants in certain numerical tasks, including neural network tuning and robotic tasks.”

Project Impact: “The impacts of our project on the field of Bayesian optimization (BO) are extensive, not only in theory but also in applications:

  1.  The Pseudo-Bayesian Optimization framework generalizes Bayesian optimization so that other non-Bayesian methods can be considered. It gives users a broadened horizon to reconsider and customize algorithms with a wide range of surrogate models, uncertainty quantifiers, and acquisition functions with cheap computation. Moreover, it decouples surrogate prediction (SP) and uncertainty quantification (UQ) to allow flexibility (illustrated in the sketch after this list), in contrast to Bayesian optimization and most of its variants, which tie the surrogate mean and variability together (e.g., a Gaussian process's posterior mean and posterior standard deviation).
  2. In theory, our framework characterizes the limiting behavior of the surrogate model, uncertainty quantifier, and acquisition function for a reasonable black-box optimization. Our framework therefore gives an automatic convergence guarantee and, to the best of our knowledge, is unique in the Bayesian optimization literature in using a "top-down" approach that derives algorithmic design from theoretical convergence, not vice versa.
  3. In practice, the framework gives algorithms with superior performance compared to many state-of-the-art benchmarks in both runtime and iteration-based convergence to the optimum, leveraging kernel regression and the randomized prior method.”
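A minimal sketch of the decoupling described in item 1 (our illustration, with toy plug-ins; not the authors' implementation): the surrogate predictor and the uncertainty quantifier are independent, swappable components, and the acquisition function combines them:

```python
import numpy as np

def propose_next(X, y, candidates, surrogate, uncertainty, kappa=2.0):
    """One pseudo-Bayesian optimization step: SP and UQ are separate
    plug-ins, unlike a Gaussian process where both come from one posterior."""
    mu = surrogate(X, y, candidates)       # any point predictor
    sigma = uncertainty(X, y, candidates)  # any uncertainty estimate
    return candidates[np.argmax(mu + kappa * sigma)]  # UCB-style acquisition

# Toy plug-ins: nearest-neighbor mean as SP, distance-to-data as UQ.
sp = lambda X, y, C: np.array([y[np.argmin(np.abs(X - c))] for c in C])
uq = lambda X, y, C: np.array([np.min(np.abs(X - c)) for c in C])
X, y = np.array([0.0, 1.0]), np.array([0.2, 0.9])
print(propose_next(X, y, np.linspace(0.0, 1.0, 11), sp, uq))
```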

Publications & Presentations

  • Presentation at the 2023 CAIT Showcase, May 2023
  • Haoxian Chen & Henry Lam (in draft). Pseudo-Bayesian Optimization. [Full Paper]
  • Haoxian Chen & Henry Lam (2023). Pseudo-Bayesian Optimization. Presented at INFORMS 2023. [Presentation Deck]

Project Summary: Kumar seeks “to develop algorithms for data-driven revenue management that are robust to demand uncertainty and perform well in both these settings. Our robust algorithms can be modified based on the confidence one has in the demand forecast and guarantee good performance when the forecast is accurate while maintaining a minimum level of worst-case performance.”

Project Impact: "Single-Leg Revenue Management with Advice" was one of the first ones to apply the Algorithms-with-Predictions framework to a revenue management problem, and it has led to numerous follow up works that attempt to do so for other problems in the field. "Robust Budget Pacing with a Single Sample" was selected for an oral presentation at ICML'23 (less than 3% acceptance rate); it drastically improved the sample complexity for a very important and practical problem in online ad auction markets.

Publications & Presentations

  • Presentation at the 2023 CAIT Showcase, May 2023
  • Santiago Balseiro, Christian Kroer, Rachitesh Kumar (2023). “Single-Leg Revenue Management with Advice”. Under revision at Operations Research. [Full Paper]
  • Santiago Balseiro, Christian Kroer, Rachitesh Kumar (2023). “Single-Leg Revenue Management with Advice”. Proceedings of the 24th ACM Conference on Economics and Computation. [Extended Abstract]
  • Santiago Balseiro, Christian Kroer, Rachitesh Kumar (2023). “Online Resource Allocation under Horizon Uncertainty”. Proceedings of ACM SIGMETRICS ‘23. [Full Paper]
    • Also presented at INFORMS MSOM'23, APS'23, and RMP'23 conferences.
  • Santiago Balseiro, Rachitesh Kumar, Vahab Mirrokni, Balasubramanian Sivan, Di Wan (2023). “Robust Budget Pacing with a Single Sample”. Proceedings of the 40th International Conference on Machine Learning. [Full Paper]

Project Summary: Xia is conducting research “focused on machine learning, specifically causal inference. In particular, I hope to answer two research questions: (1) How can deep learning models be used to perform causal inference and, conversely, (2) how can causal information guide deep learning?”

Xia says, “I was able to tackle a more ambitious project, and I feel that it is nearing completion. For career, I was able to meet many Amazon representatives and speak with people from internship programs. I feel that the fellowship has given me footing to find interesting future opportunities aligned with my goals.”

“My earlier work on Neural Causal Models (NCMs) developed the theory around utilizing practical deep learning approaches for solving causal inference tasks. The approach worked on small toy datasets but scaled poorly. With my latest work on causal abstractions, these issues have largely been resolved. On the theoretical side, the abstraction framework is novel and unifies many topics such as existing abstraction works, causal generative modeling, and representation learning. Experiments from this project have shown strong results on image data. These results show significant promise in eventually incorporating causal elements into large-scale AI models in real-world usage.”

Publications & Presentations

  • Presentation at the 2023 CAIT Showcase, May 2023
  • Kevin Xia, Yushu Pan, Elias Bareinboim (2023). “Neural Causal Models for Counterfactual Identification and Estimation”. In Proceedings of the Eleventh International Conference on Learning Representations (ICLR-23), February 2023. [PDF]
  • Kevin Xia, Elias Bareinboim (2023). “Neural Causal Abstractions”. In submission at AAAI-24, August 2023. [PDF]

Project Summary: Subbiah is investigating “automatic summarization of narrative, a problem at the intersection of artificial intelligence and the humanities and social sciences. Much of the previous work in summarization has focused on single-document news. Our lab has shown, however, that while this domain is important, the task is limited due to the nature of how factual news articles are written — the critical information is almost always contained in the first couple sentences of the article. Focusing on narrative therefore poses a much more interesting challenge for summarization.”

Publications & Presentations

  • Presentation at the 2023 CAIT Showcase, May 2023
  • Preparing three additional papers for publication by December
  • Completed Admission to PhD Candidacy [Candidacy talk deck]

2022 Awards

Faculty Research Projects

PI: Eric Balkanski (Columbia Engineering)

This project develops fast optimization techniques for fundamental problems in machine learning. In a wide variety of domains, such as computer vision, recommender systems, and immunology, objectives we care to optimize exhibit a natural diminishing returns property called submodularity. Off-the-shelf tools have been developed to exploit the common structure of these problems and have been used to optimize complex objectives. However, the main obstacle to the widespread use of these optimization techniques is that they are inherently sequential and too slow for problems on large data sets. Consequently, the existing toolbox for submodular optimization is not adequate to solve large scale optimization problems in ML.
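For concreteness, submodularity has a crisp formal statement: a set function $f : 2^N \to \mathbb{R}$ is submodular if, for all $S \subseteq T \subseteq N$ and every $x \in N \setminus T$,

$$f(S \cup \{x\}) - f(S) \;\ge\; f(T \cup \{x\}) - f(T),$$

i.e., an element's marginal gain shrinks as the set grows. For example, adding one more movie to a recommendation slate helps less when the slate already covers the user's tastes.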

This project considers developing novel parallel optimization techniques for problems whose current state-of-the-art algorithms are inherently sequential and hence cannot be parallelized. In a recent line of work, we developed algorithms that achieve an exponential speedup in parallel running time for problems that satisfy a diminishing-returns property. These algorithms use new techniques that have shown promising results for problems such as movie recommendation and maximizing influence in social networks. They also open exciting possibilities for further speedups, as well as for applications in computer vision and public health, where important challenges remain.

PI: Julia Hirschberg (Columbia Engineering)

Much research has been done in the past 15 years on creating empathetic responses in text, facial expression, and gesture in conversational systems. However, almost none has been done to identify the speech features that can create an empathetic-sounding voice. Empathy is the ability to understand another’s feelings as if we were having those feelings ourselves; compassionate empathy adds the ability to take action to mitigate another’s problems. This capability has been found to be especially useful in dialogue systems, avatars, and robots, since empathetic behavior can encourage users to like a speaker more, to believe the speaker is more intelligent, to actually take the speaker’s advice, to trust the speaker more, and to want to speak with the speaker longer and more often. We will identify the acoustic/prosodic and lexical features that produce empathetic speech by collecting the first corpus of empathetic podcasts and videos, crowdsourcing labels for empathy, and building machine learning models to identify empathetic speech and the speech, language, and visual features that can be used to generate it.

PI: Will Ma (Columbia Graduate School of Business)

Fueled by the insatiable customer desire for faster delivery, e-tailers have begun deploying "forward" distribution centers close to city centers, where space is very limited. We propose to develop scalable optimization algorithms that allow e-tailers to systematically determine the SKU variety and inventory that should be placed in these precious spaces. Our model accounts for demand that depends endogenously on the SKU selection, inventory-pooling effects, and the interplay between different categories of SKUs. The model is designed to yield insights about the relationship between demand variability and SKU fragmentation, sorting rules for selecting a few SKUs within a given category, and the marginal value of capacity to different categories.

PI: Matei Ciocarlie, Shuran Song (Columbia Engineering)

This project develops novel methods for leveraging human assistance in reinforcement learning (RL). The sparse-reward problem has been one of the biggest challenges in RL, often leading to inefficient exploration and learning. While real-time immediate feedback from a human could resolve this issue, it is often impractical for complex tasks that require a large number of training steps. To address this problem, we aim to develop new confidence measures, which the agent computes during both training and deployment. In this paradigm, a deep RL policy trains autonomously but stops and requests assistance when its confidence in the ultimate success of the task is too low to continue. We aim to show that expert assistance can speed up learning and/or increase performance while minimizing the number of calls for assistance made to the expert.
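A minimal sketch of this paradigm (the function names and the fixed threshold are our assumptions; the project's contribution is precisely the learned confidence measure):

```python
def act_or_ask(policy, confidence, state, expert, threshold=0.6):
    """Act autonomously while estimated task-success confidence is high;
    otherwise pause and request (costly) expert assistance."""
    if confidence(state) >= threshold:
        return policy(state), False   # autonomous action, no expert call
    return expert(state), True        # assistance requested
```

The interesting trade-off is then how to set or learn the threshold so that expert queries are spent only where they most improve learning or task success.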

PIs: Asaf Cidon, Junfeng Yang (Columbia Engineering)

Full-precision deep learning models are often too large or costly to deploy on edge devices such as Amazon Echo, Ring, and Fire devices. To accommodate the limited hardware resources, models are often quantized, compressed, or pruned. While such techniques often have a negligible impact on top-line accuracy, the adapted models exhibit subtle differences in output compared to the full-precision model from which they are derived.

We propose a new attack termed Adversarial Deviation Attack, or ADA, that exploits the differences in model quantization, compression and pruning, by adding adversarial noise to input data that maximizes the output difference between the original and the edge model. It will construct malicious inputs that will trick the edge model but will be virtually undetectable by the original model. Such an attack is particularly dangerous: even after extensive robust training on the original model, quantization, compression or pruning will always introduce subtle differences, providing ample vulnerabilities for the attackers. Moreover, data scientists may not even be able to notice such attacks because the original model typically serves as the authoritative model version, used for validation, debugging and retraining. We will also investigate how new or existing defenses can fend off ADA attacks, greatly improving the security of edge devices.
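For intuition, the core optimization can be sketched as projected gradient ascent on an input perturbation that maximizes the gap between the two models' outputs (a hypothetical PGD-style sketch in PyTorch, ours rather than the authors' ADA construction):

```python
import torch

def deviation_attack(full_model, edge_model, x, eps=0.03, steps=10, alpha=0.01):
    """Perturb x within an L-infinity ball of radius eps to maximize the
    output difference between the full-precision and edge models."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        gap = torch.norm(full_model(x + delta) - edge_model(x + delta))
        gap.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()  # ascend the output gap
            delta.clamp_(-eps, eps)             # keep the change imperceptible
            delta.grad.zero_()
    return (x + delta).detach()
```

Because the perturbed input still looks correct to the full-precision model, validating against that authoritative copy would not reveal the attack.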

CAIT PhD Fellows

Project Summary: It is well known that the problem of computing bounds for the expected values or probabilities of outcome variables under interventions in causal graphs with unobserved confounders can be formulated as a linear program (LP). However, the size of the LP is exponential in the number of edges in the causal graph, and consequently these LPs can be solved only for very small causal graphs. Madhu’s research has developed methods to significantly reduce the size of the LP and efficiently construct the smaller LP, increasing the size of causal graphs for which these bounds can be computed. As a corollary, she has also developed a method for computing the bounds in closed form for a particular class of causal graphs. This work was published at the International Conference on Machine Learning 2022 (ICML 2022).

Next, she developed methods for computing personalized predictive bounds using observations of a particular unit's response to earlier interventions. For example, consider a system which makes sequential decisions about which products to recommend to a user. Once we obtain a few data points from the user from initial stages of experimentation, how can we augment this limited data with historical data and causal mechanisms to make the best decisions? This problem has been avoided in prior work, since incorporating data from observations to construct bounds requires solving a non-linear optimization problem. Madhu showed that this nonlinear problem has a very special structure: it is a fractional linear program with a constraint qualification that allows it to be reformulated as a linear program. She also showed how to efficiently compute the reformulated program. This extension has been published in the Journal of Machine Learning Research (JMLR).
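The reformulation step follows a classical template, the Charnes-Cooper transformation (shown generically here as an illustration; the paper's construction additionally handles the causal structure and the constraint qualification). A fractional linear program

$$\max_x \; \frac{c^\top x + \alpha}{d^\top x + \beta} \quad \text{s.t.} \quad Ax \le b,$$

with $d^\top x + \beta > 0$ on the feasible set, becomes a linear program under the change of variables $t = 1/(d^\top x + \beta)$ and $y = t\,x$:

$$\max_{y,\,t} \; c^\top y + \alpha t \quad \text{s.t.} \quad Ay \le b\,t, \quad d^\top y + \beta t = 1, \quad t \ge 0.$$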

Publications & Presentations

  • Presentation at the 2023 CAIT Showcase, May 2023
  • Madhumitha Shridharan, Garud Iyengar (2022). “Scalable Computation of Causal Bounds”. Proceedings of the 39th International Conference on Machine Learning (ICML 2022).
  • Madhumitha Shridharan, Garud Iyengar (2023). “Causal Bounds in Quasi-Markovian Graphs”. Proceedings of the 40th International Conference on Machine Learning, PMLR.
  • Madhumitha Shridharan, Garud Iyengar (2023). “Scalable Computation of Causal Bounds”. Journal of Machine Learning Research, 24(237), 2023.

Project Summary: Large-scale language models based on transformer architectures, such as GPT-3, have advanced the state of the art in natural language understanding and generation. However, even though these models have shown impressive performance in zero-shot, few-shot, and supervised settings for a variety of tasks, they often struggle with implicit or non-compositional meaning. In prior work, Chakrabarty’s research group showed how language models fail to reason about implicit knowledge of the physical and visual world. To handle these limitations effectively, Chakrabarty’s research combines commonsense knowledge with the power of transfer learning from large-scale pre-trained language models to improve the capabilities of current language models. Results could impact challenging tasks such as figurative language generation and the interpretation of idioms.

Publications & Presentations

  • Presentation at the 2023 CAIT Showcase, May 2023

2021 Awards

Faculty Research Projects

PIs: Christos Papadimitriou (Columbia Engineering), Tim Roughgarden (Columbia Engineering)

Christos Papadimitriou and Tim Roughgarden, in collaboration with their Amazon Research contacts Michael Kearns and Aaron Roth, use techniques from machine learning, algorithms, and social science to explore, through analysis and experiment, ways in which the tremendous power of machine learning can be applied to make machine learning itself more fair. Can deep nets be trained through synthetic fairness criticism to treat their data more equitably, and can the unfair treatment of subpopulations be discovered automatically? How can one predict and mitigate the detrimental effect a classifier can have on people by incentivizing them to modify their behavior in order to "game" the classifier? And what is the precise nature of the incentives and learning behavior involved in the interaction of users with online software platforms?

PIs: Awi Federgruen (CBS), Daniel C. Guetta (CBS), Garud Iyengar (Engineering)

Inventory management is as old as retail: keeping too much inventory on hand locks up capital and incurs high storage costs; keeping too little risks stockouts, lost revenue, and customer dissatisfaction. Retail has changed in significant and dramatic ways over the last two decades: demands are now fulfilled from complex fulfillment networks, facilities are often located in increasingly urban areas with very limited storage capacity, and an enormous variety of products compete for space in these facilities. In this project, we build upon a long line of research on this problem and extend it to cope with the myriad new faces of retail and fulfillment in the 21st century.

PIs: Zoran Kostic (Engineering), Maxim Topaz (Nursing), Maryam Zolnoori (Nursing)

This study is the first step in exploring an emerging and previously understudied data stream: verbal communication between healthcare providers and patients. In a partnership between Columbia Engineering, the School of Nursing, Amazon, and the largest home healthcare agency in the US, the study will investigate how to use audio-recorded routine communications between patients and nurses to help identify patients at risk of hospitalization or emergency department visits. The study will combine speech recognition, machine learning, and natural language processing to achieve its goals.

PI: Kathy McKeown (Engineering)

Most research in text summarization today focuses on the summarization of news articles, and for this genre, much of the wording in the summary is copied directly from the summarized article. In contrast, in many other genres, the summary uses language that is very different from the input: it may use extensive paraphrasing, large amounts of compression, syntactic rewriting at the sentence level, and fusion of phrases from different parts of the input document. This kind of summarization is quite difficult for today's deep learning systems. In this project, we develop methods to enable generation of three forms of abstraction: paraphrasing, compression, and fusion; we aim to develop separate models for each and compare them with a joint learning approach. This work will be done within a controllable generation paradigm, where the system can determine the abstractive technique that is most appropriate depending on context.

CAIT PhD Fellows

Noemie Perivier, IEOR (advisor: Vineet Goyal)

Interest: Sequential decision-making under uncertainty and the design of online algorithms in data-rich environments, with applications to revenue management problems.

Mia Chiquier, CS (advisor: Carl Vondrick)

Interest: Computational frameworks that integrate sound and vision, aiming to improve current machine perception systems through a more integrated understanding of agents in their environments.