Rogerio Schmidt Feris
Principal Scientist and Manager
MIT-IBM Watson AI Lab
IBM Research
email: rsferis-at-us.ibm.com
I am a principal scientist and manager at the MIT-IBM Watson AI Lab. I am broadly interested in teaching machines to see, listen, and read with little or no supervision, as humans do. In particular, my recent work has centered on augmenting large language models with memory and multiple modalities (vision, sound, speech, ...), and on specializing LLMs for enterprise domains.
I am passionate about doing fundamental research as well as developing systems that make a real-world impact. My work has been published in top AI conferences, integrated into multiple products, and covered by media outlets such as the New York Times, ABC News, and CBS 60 Minutes. See my bio for more information about me.
News:
- Our work on AI/ML for auto-curation of sports highlights has been selected for a Technology and Engineering Emmy award! Our system has been used to automatically produce the official highlights of the US Open, Wimbledon, and Masters tournaments. We received the award in Las Vegas, as part of the National Association of Broadcasters (NAB) ceremony, on April 16, 2023.
- We received a best paper award honorable mention at WACV 2023.
- Check our CVPR 2024 Workshop on "What is Next in Multimodal Foundation Models?"
Area Chair: ECCV 2024, ICLR 2024, NeurIPS 2023, ICCV 2023, ICLR 2023, CVPR 2023, NeurIPS 2022, ICLR 2022, NeurIPS 2021, CVPR 2021, ICML 2021, ICLR 2021, NeurIPS 2020, ECCV 2020, CVPR 2020, NeurIPS 2019, NeurIPS 2018, ACM MM 2017, CVPR 2017, CVPR 2016, CVPR 2015, ISVC 2015, ICCV 2015
Associate Editor: IEEE Transactions on Pattern Analysis and Machine Intelligence (2018-2023)
See more Professional Activities
Principal Investigator:
- DARPA Learning with Less Labels (2019-2023)
- IARPA Deep Intermodal Video Analytics (2017-2021)
CAMELoT: Towards Large Language Models with Training-Free Associative Memory
Zexue He, Leonid Karlinsky, Donghyun Kim, Julian McAuley, Dmitry Krotov, Rogerio Feris
[arXiv Preprint] [Code]
AI Commentary: Generative AI for Sports and Entertainment
We worked closely with the IBM Consulting team to create a system that generated AI commentary for all official highlights of the 2023 US Open and Wimbledon tournaments.
[Project Page] [IBM Blog] [CNN] [Fox News] [ESPN] [Forbes] [NBC News]
Self-Specialization: Uncovering Latent Expertise within Large Language Models
Junmo Kang, Hongyin Luo, Yada Zhu, James Glass, David Cox, Alan Ritter, Rogerio Feris, Leonid Karlinsky
ACL 2024
[Paper]
LATERAL: Learning Automatic, Transfer-Enhanced, and Relation-Aware Labels
Assaf Arbelle, Graeme Blackwood, Leonid Karlinsky, Aadarsh Sahoo, Joseph Shtok, Rogerio Feris
Learning Human Action Recognition Representations Without Real Humans
Howard Zhong, Samarth Mishra, Donghyun Kim, SouYoung Jin, Rameswar Panda, Hilde Kuehne, Leonid Karlinsky, Venkatesh Saligrama, Aude Oliva, Rogerio Feris
NeurIPS 2023
[Paper] [Code and Data]
Synthetic Pre-Training Tasks for Neural Machine Translation
Zexue He, Graeme Blackwood, Rameswar Panda, Julian McAuley, Rogerio Feris
ACL 2023
[Paper]
Learning to Grow Pretrained Models for Efficient Transformer Training
Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Zhangyang Wang, Yoon Kim
ICLR 2023 (notable-top-25%)
[Paper] [Project Page] [Code]
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning
Zhen Wang, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Huan Sun, Yoon Kim
ICLR 2023
[Paper] [Project Page] [Code]
Aadarsh Sahoo, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das
WACV 2023 (Best Paper Award Honorable Mention)
[Paper] [Project Page] [Code]
Procedural Image Programs for Representation Learning
Manel Baradad, Chun-Fu Chen, Jonas Wulff, Tongzhou Wang, Rogerio Feris, Antonio Torralba, Phillip Isola
NeurIPS 2022
[Paper] [Project Page] [Code]
How Transferable are Video Representations Based on Synthetic Data?
Yo-whan Kim, Samarth Mishra, SouYoung Jin, Rameswar Panda, Hilde Kuehne, Leonid Karlinsky, Venkatesh Saligrama, Kate Saenko, Aude Oliva, Rogerio Feris
NeurIPS 2022
FETA: Towards Specializing Foundation Models for Expert Task Applications
Amit Alfassy, Assaf Arbelle, Oshri Halimi, Sivan Harary, Roei Herzig, Eli Schwartz, Rameswar Panda, Michele Dolfi, Christoph Auer, Peter W. J. Staar, Kate Saenko, Rogerio Feris, Leonid Karlinsky
NeurIPS 2022
Task2Sim: Towards Effective Pre-training and Transfer from Synthetic Data
Samarth Mishra, Rameswar Panda, Cheng Perng Phoo, Chun-Fu Chen, Leonid Karlinsky, Kate Saenko, Venkatesh Saligrama, Rogerio Feris
CVPR 2022
[Paper] [Project Page] [Code]
Unsupervised Domain Generalization by Learning a Bridge Across Domains
Sivan Harary, Eli Schwartz, Assaf Arbelle, Peter Staar, Shady Abu-Hussein, Elad Amrani, Roei Herzig, Amit Alfassy, Raja Giryes, Hilde Kuehne, Dina Katabi, Kate Saenko, Rogerio Feris, Leonid Karlinsky
CVPR 2022, Oral
SimVQA: Exploring Simulated Environments for Visual Question Answering
Paola Cascante-Bonilla, Hui Wu, Letao Wang, Rogerio Feris, Vicente Ordonez
CVPR 2022
[Paper] [Project Page] [Code]
VALHALLA: Visual Hallucination for Machine Translation
Yi Li, Rameswar Panda, Yoon Kim, Chun-Fu Chen, Rogerio Feris, David Cox, Nuno Vasconcelos
CVPR 2022
[Paper] [Project Page] [Code]
Everything at Once – Multi-modal Fusion Transformer for Video Retrieval
N. Shvetsova, B. Chen, A. Rouditchenko, S. Thomas, B. Kingsbury, R. Feris, D. Harwath, J. Glass, and H. Kuehne
CVPR 2022
[Paper]
Targeted Supervised Contrastive Learning for Long-Tailed Recognition
Tianhong Li, Peng Cao, Yuan Yuan, Lijie Fan, Yuzhe Yang, Rogerio Feris, Piotr Indyk, Dina Katabi
CVPR 2022
[Paper] [Code]
IA-RED^2: Interpretability-Aware Redundancy Reduction for Vision Transformers
Bowen Pan, Rameswar Panda, Yifan Jiang, Zhangyang Wang, Rogerio Feris, Aude Oliva
NeurIPS 2021
[Paper] [Project Page] [Code]
Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data
Ashraful Islam, Chun-Fu Chen, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Richard J. Radke
NeurIPS 2021
Dynamic Network Quantization for Efficient Video Inference
Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Aude Oliva, Rogerio Feris, Kate Saenko
ICCV 2021
[Paper] [Project Page] [Code]
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
Rameswar Panda, Chun-Fu Chen, Quanfu Fan, Ximeng Sun, Kate Saenko, Aude Oliva, Rogerio Feris
ICCV 2021
[Paper] [Project Page] [Code]
A Broad Study on the Transferability of Visual Representations with Contrastive Learning
Ashraful Islam, Chun-Fu Chen, Rameswar Panda, Leonid Karlinsky, Richard Radke, Rogerio Feris
ICCV 2021
Detector-Free Weakly Supervised Grounding by Separation
Assaf Arbelle, Sivan Doveh, Amit Alfassy, Joseph Shtok, Guy Lev, Eli Schwartz, Hilde Kuehne, Hila Barak Levi, Prasanna Sattigeri, Rameswar Panda, Chun-Fu Chen, Alex Bronstein, Kate Saenko, Shimon Ullman, Raja Giryes, Rogerio Feris, Leonid Karlinsky
ICCV 2021, Oral
[Paper]
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos
Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie Boggust, Rameswar Panda, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Michael Picheny, Shih-Fu Chang
ICCV 2021
[Paper]
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
Andrew Rouditchenko, Angie Boggust, David Harwath, Brian Chen, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Hilde Kuehne, Rameswar Panda, Rogerio Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James Glass
Interspeech 2021
[Paper] [Project Page] [Video Demo] [Code]
Cascaded Multilingual Audio-Visual Learning from Videos
Andrew Rouditchenko, Angie Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogerio Feris, Brian Kingsbury, Michael Picheny, James Glass
Interspeech 2021
[Paper] [Project Page] [Code]
Fine-grained Angular Contrastive Learning with Coarse Labels
Guy Bukchin, Eli Schwartz, Kate Saenko, Ori Shahar, Rogerio Feris, Raja Giryes, Leonid Karlinsky
CVPR 2021, Oral
[Paper]
Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
Hui Wu, Yupeng Gao, Xiaoxiao Guo, Ziad Al-Halah, Steven Rennie, Kristen Grauman, Rogerio Feris
CVPR 2021
Separating Skills and Concepts for Novel Visual Question Answering
Spencer Whitehead, Hui Wu, Heng Ji, Rogerio Feris, Kate Saenko
CVPR 2021
[Paper]
Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions
Mathew Monfort, SouYoung Jin, Alexander Liu, David Harwath, Rogerio Feris, James Glass, Aude Oliva
CVPR 2021
[Project Page] [Paper] [Data]
Semi-Supervised Action Recognition with Temporal Contrastive Learning
Ankit Singh, Omprakash Chakraborty, Ashutosh Varshney, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das
CVPR 2021
[Paper]
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
Chun-Fu Chen, Rameswar Panda, Kandan Ramakrishnan, Rogerio Feris, John Cohn, Aude Oliva, Quanfu Fan
CVPR 2021
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition
Yue Meng, Rameswar Panda, Chung-Ching Lin, Prasanna Sattigeri, Leonid Karlinsky, Kate Saenko, Aude Oliva, Rogerio Feris
ICLR 2021
[Paper] [Project Page] [Code]
VA-RED^2: Video Adaptive Redundancy Reduction
Bowen Pan, Rameswar Panda, Camilo Fosco, Chung-Ching Lin, Alex Andonian, Yue Meng, Kate Saenko, Aude Oliva, Rogerio Feris
ICLR 2021
[Paper] [Project Page] [Code]
StarNet: Towards Weakly Supervised Few-Shot Object Detection
Leonid Karlinsky, Joseph Shtok, Amit Alfassy, Moshe Lichtenstein, Sivan Harary, Eli Schwartz, Sivan Doveh, Prasanna Sattigeri, Rogerio Feris, Alexander Bronstein, Raja Giryes
AAAI 2021
NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search
Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee
AAAI 2021
[Paper]
AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
Ximeng Sun, Rameswar Panda, Kate Saenko, Rogerio Feris
NeurIPS 2020
[Paper] [Project Page] [Code]
A Broader Study of Cross-Domain Few-Shot Learning
Yunhui Guo, Noel C. Codella, Leonid Karlinsky, James V. Codella, John R. Smith, Kate Saenko, Tajana Rosing, Rogerio Feris
ECCV 2020
See also: CVPR VL3 Workshop and the challenge associated with our benchmark
[Paper] [Code and Data]
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition
Yue Meng, Chung-Ching Lin, Rameswar Panda, Prasanna Sattigeri, Leonid Karlinsky, Aude Oliva, Kate Saenko, Rogerio Feris
ECCV 2020
[Paper] [Project Page] [Code] [MIT News]
OnlineAugment: Online Data Augmentation with Less Domain Knowledge
Zhiqiang Tang, Yunhe Gao, Leonid Karlinsky, Prasanna Sattigeri, Rogerio Feris, Dimitris Metaxas
ECCV 2020
TAFSSL: Task-Adaptive Feature Sub-Space Learning for Few-shot Classification
Moshe Lichtenstein, Prasanna Sattigeri, Rogerio Feris, Raja Giryes, Leonid Karlinsky
ECCV 2020
We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos
Camilo Fosco, Mathew Monfort, Allen Lee, Rogerio Feris, Carl Vondrick, Aude Oliva
ECCV 2020
[Paper] [Code] [Project Page] [MIT News]
Video Instance Segmentation Tracking
Chung-Ching Lin, Ying Hung, Rogerio Feris, Linglin He
CVPR 2020
[Paper]
Zhonghao Wang, Mo Yu, Yunchao Wei, Rogerio Feris, Jinjun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi
CVPR 2020
Learning from Lexical Perturbations for Consistent Visual Question Answering
Spencer Whitehead, Hui Wu, Yi Fung, Heng Ji, Rogerio Feris, Kate Saenko
arXiv 2020
[Paper]
Khoi-Nguyen C. Mac, Dhiraj Joshi, Raymond A. Yeh, Jinjun Xiong, Rogerio S. Feris, Minh N. Do
ICCV 2019, Oral
[Paper] [Code] [Project Page]
LaSO: Label-Set Operations Networks for Multi-label Few-shot Learning
Amit Alfassy, Leonid Karlinsky, Amit Aides, Joseph Shtok, Sivan Harary, Rogerio Feris, Raja Giryes, Alex M. Bronstein
CVPR 2019, Oral
SpotTune: Transfer Learning through Adaptive Fine-tuning
Yunhui Guo, Honghui Shi, Abhishek Kumar, Kristen Grauman, Tajana Rosing, Rogerio Feris
CVPR 2019
Top results on the Visual Decathlon challenge (2019)
RepMet: Representative-based Metric Learning for Classification and One-shot Object Detection
Leonid Karlinsky, Joseph Shtok, Sivan Harary, Eli Schwartz, Amit Aides, Rogerio Feris, Raja Giryes, Alex M. Bronstein
CVPR 2019
Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition
Chun-Fu Chen, Quanfu Fan, Neil Mallinar, Tom Sercu, Rogerio Feris
ICLR 2019
Automatic Curation of Sports Highlights using Multimodal Excitement Features
Michele Merler, Khoi Nguyen C. Mac, Dhiraj Joshi, Quoc Bao Nguyen, Stephen Hammer, John Kent, Jinjun Xiong, Minh N. Do, John R. Smith, Rogerio Feris
IEEE Transactions on Multimedia (TMM), 2019
Our system was used to produce the official highlights of the US Open, Wimbledon, and Masters tournaments, which were watched by millions of fans worldwide.
[Paper] [Blog] [Video Demo 1] [Video Demo 2] [New York Times] [Fortune] [Newsweek] [Engadget] [NBC News] [Behind the Code]
The Excitement of Sports: Automatic Highlights using Audio-Visual Cues
Michele Merler, D. Joshi, Khoi-Nguyen C. Mac, Q. Nguyen, Stephen Hammer, John Kent, Jinjun Xiong, Minh Do, John Smith, Rogerio Feris
CVPR Workshop on Sight and Sound, 2018
[Paper] [Slides] [Video Demo 1] [Video Demo 2] [Blog] [Venturebeat] [ZDNet]
Delta-Encoder: an Effective Sample Synthesis Method for Few-shot Object Recognition
Eli Schwartz, Leonid Karlinsky, Joseph Shtok, Sivan Harary, Mattias Marder, Rogerio Feris, Abhishek Kumar, Raja Giryes, Alex M. Bronstein
NeurIPS 2018, Spotlight
Dialog-based Interactive Image Retrieval
Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, Rogerio Feris
NeurIPS 2018
Co-regularized Alignment for Unsupervised Domain Adaptation
Abhishek Kumar, Prasanna Sattigeri, Kahini Wadhawan, Leonid Karlinsky, Rogerio Feris, William T. Freeman, Gregory Wornell
NeurIPS 2018
[Paper]
Learning to Separate Object Sounds by Watching Unlabeled Video
Ruohan Gao, Rogerio Feris, Kristen Grauman
ECCV 2018, Oral
[Paper] [Project Page] [Code]
Revisiting RCNN: On Awakening the Classification Power of Faster RCNN
Bowen Cheng, Yunchao Wei, Honghui Shi, Rogerio Feris, Jinjun Xiong, Thomas Huang
ECCV 2018
DCR achieved state-of-the-art results on Pascal VOC and MS-COCO
BlockDrop: Dynamic Inference Paths in Residual Networks
Zuxuan Wu, Tushar Nagarajan, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, Rogerio Feris
CVPR 2018, Spotlight
Xi Peng, Zhiqiang Tang, Fei Yang, Rogerio Feris, Dimitris Metaxas
CVPR 2018
Yongxi Lu, Abhishek Kumar, Shuangfei Zhai, Yu Cheng, Tara Javidi, Rogerio Feris
CVPR 2017, Spotlight
[Paper]
S3Pool: Pooling with Stochastic Spatial Sampling
Shuangfei Zhai, Hui Wu, Abhishek Kumar, Yu Cheng, Yongxi Lu, Zhongfei Zhang, Rogerio Feris
CVPR 2017
A Unified Multi Scale Deep Convolutional Neural Network for Fast Object Detection
Zhaowei Cai, Quanfu Fan, Rogerio S. Feris, Nuno Vasconcelos
ECCV 2016
MS-CNN achieved state-of-the-art results on the popular KITTI dataset
[Paper] [Code] [Demo] [KITTI results] [Project Page]
A Recurrent Encoder-Decoder Network for Sequential Face Alignment
Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas
ECCV 2016
[Paper] [Code] [Project Page] [Video Demo]
Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data
Jing Wang, Yu Cheng, Rogerio Feris
CVPR 2016, Oral
[Paper]
Rogerio Feris, Christoph Lampert, Devi Parikh
Advances in Computer Vision and Pattern Recognition, Springer, 2016
An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections
Yu Cheng, Felix X. Yu, Rogerio S. Feris, Sanjiv Kumar, Alok Choudhary, Shih-Fu Chang
ICCV 2015
[Paper]
Cross-domain Image Retrieval with a Dual Attribute-aware Ranking Network
Junshi Huang, Rogerio S. Feris, Qiang Chen, Shuicheng Yan
ICCV 2015
Deep Domain Adaptation for Describing People Based on Fine-Grained Clothing Attributes
Qiang Chen, Junshi Huang, Rogerio Feris, Lisa M Brown, Jian Dong, Shuicheng Yan
CVPR 2015
[Paper]
Rogerio Feris, Russ Bobbit, Lisa Brown, Sharath Pankanti
ICMR 2014
See also:
- Walk and Learn (CVPR 2016 Oral)
[Paper] [Video Demo]
Fast Face Detector Training Using Tailored Views
Kristina Scherbaum, James Petterson, Rogerio Feris, Volker Blanz, Hans-Peter Seidel
ICCV 2013
[Paper]
Efficient Maximum Appearance Search for Large-Scale Object Detection
Qiang Chen, Zheng Song, Rogerio Feris, Ankur Datta, Liangliang Cao, Zhongyang Huang, Shuicheng Yan
CVPR 2013
[Paper]
Designing Category-level Attributes for Discriminative Visual Recognition
Felix X. Yu, Liangliang Cao, Rogerio Feris, John R. Smith, Shih-Fu Chang
CVPR 2013
[Paper]
Large-Scale Vehicle Detection, Indexing, and Search in Urban Surveillance Videos
Rogerio Feris, Behjat Siddiquie, James Petterson, Yun Zhai, Ankur Datta, Lisa Brown, Sharath Pankanti
IEEE Transactions on Multimedia, 2012
See also: Feris et al, Attribute-based Vehicle Search in Crowded Surveillance Videos, ICMR 2011
[Paper] [Video Demos]
Image Ranking and Retrieval Based on Multi-Attribute Queries
Behjat Siddiquie, Rogerio Feris, Larry Davis
CVPR 2011, Oral
[Paper]
Shape Classification Through Structured Learning of Matching Measures
Longbin Chen, Julian McAuley, Rogerio Feris, Tiberio Caetano, Matthew Turk
CVPR 2009
A Projector-Camera Setup for Geometry-Invariant Frequency Demultiplexing
Daniel Vaquero, Ramesh Raskar, Rogerio Feris, Matthew Turk
CVPR 2009
[Paper]
Characterizing the Shadow Space of Camera-Light Pairs
Daniel Vaquero, Rogerio Feris, Matthew Turk, Ramesh Raskar
CVPR 2008
[Paper]
Manifold-based Analysis of Facial Expression
Ya Chang, Changbo Hu, Rogerio Feris, Matthew Turk
Image and Vision Computing, 2006
[Paper] [Video Demo]
Discontinuity Preserving Stereo with Small Baseline Multi-Flash Illumination
Rogerio Feris, Longbin Chen, Matthew Turk, Ramesh Raskar, Karhan Tan
ICCV 2005, Oral
See also: Feris et al, TPAMI 2007
[Paper] [Project Page] [Code] [Data]
Automatic Human Facial Illustrations with Variable Illumination
Rogerio Feris and Alex Olwal
SIGGRAPH Emerging Technologies, 2005 (Interactive Fogscreen)
[Project Page] [Code]
Non-photorealistic Camera: Depth Edge Detection and Stylized Rendering using Multi-Flash Imaging
Ramesh Raskar, Karhan Tan, Rogerio Feris, Jingyi Yu, Matthew Turk
SIGGRAPH 2004
[Paper] [Code] [Project Page] [Video Demo]
Specular Reflection Reduction with Multi-Flash Imaging
Rogerio Feris, Ramesh Raskar, Karhan Tan, Matthew Turk
SIGGRAPH 2004
[Paper]
Shape Enhanced Surgical Visualizations and Medical Illustrations with Multi-flash Imaging
Karhan Tan, James Kobler, Rogerio Feris, Paul Dietz, Ramesh Raskar
MICCAI 2004
[Paper]
Exploiting Depth Discontinuities for Vision-based Fingerspelling Recognition
Rogerio Feris, Matthew Turk, Ramesh Raskar, Karhan Tan, Gosuke Ohashi
CVPR RTV4HCI Workshop 2004
[Paper]
Hierarchical Wavelet Networks for Facial Feature Localization
Rogerio Feris, James Gemmell, Kentaro Toyama, Volker Krueger
Face and Gesture Recognition 2002
Developed as part of the GazeMaster project for videoconferencing. I did this work during my internship at Microsoft Research in 2001.
Efficient Real-Time Face Tracking in Wavelet Subspace
Rogerio Feris, Volker Krueger, Roberto Cesar
ICCV RATFG-RTS Workshop, 2001
[Paper] [Video Demo 1] [Video Demo 2]
Recent Invited/Keynote Talks
- "Learning with Real and Unreal Data in the Era of Foundation Models" [pdf]
  - Conference on Graphics, Patterns and Images (Sibgrapi 2023), Brazil, 2023
- "Learning Trusted Models with Less Data" [pdf]
  - IJCAI Workshop on Generalizing from Limited Resources in the Open World, 2023
- "Representation Learning based on Synthetic Data" [pdf]
  - Google Research India, 2022
  - Flatiron Institute, 2022
  - Binghamton University, 2022
  - University of British Columbia, 2021
- "Dynamic Neural Networks for Efficient Multimodal Video Understanding" [Video]
  - Boston University AIR Distinguished Speaker Series, 2021
  - CVPR 2021 L2ID Workshop
  - CVPR 2021 MULA Workshop
  - CVPR 2021 LatinX Workshop
- AI We Can Scale, "Learning to Learn" [Video]
  - What's Next in AI event, 2020
  - ICML 2020 LatinX in AI Workshop
  - CVPR 2020 DIRA Workshop
  - NeurIPS 2019 EMC^2 Workshop
- "Is it All Relative? Interactive Fashion Search based on Relative Natural Language Feedback" [pdf]
  - CVPR 2019 FFSS-USAD Workshop
- "Speeding Up Deep Neural Networks with Adaptive Computation and Efficient Multi-Scale Architectures" [pdf]
  - CVPR 2019 EMC^2 Workshop
- "Learning More from Less: Weak Supervision and Beyond" [pdf]
  - CVPR 2019 Workshop on Learning from Imperfect Data
Media Press
- A simpler path to better computer vision. MIT News, 2022.
- A safer, lower-cost alternative to real data for pretraining computer vision models. IBM Research blog, 2022.
- Hallucinating to Better Text Translation. Communications of the ACM / MIT News, 2022.
- IBM’s StarNet brings explainable AI to image classification. VentureBeat, 2020.
- Shrinking deep learning’s carbon footprint. MIT News, 2020.
- IBM’s AI creates new labeled image sets using semantic content. VentureBeat, 2019.
- IBM researchers develop a pair of low-power, high-performance computer vision systems. VentureBeat, 2018.
- Coffee delivery drone patented by IBM. BBC News, 2018.
- Enjoy Those U.S. Open Highlights. A Computer Picked Them for You. New York Times, 2017.
- How an IBM Computer Picks U.S. Open Highlights. Fortune, 2017.
- IBM’s Watson Serves Up This Year’s U.S. Open Highlights. NBC News, 2017.
- IBM's Watson is creating US Open tennis highlight videos. Engadget, 2017.
- IBM uses AI to serve up Wimbledon highlights. CNET, 2017.
- IBM To Provide Wimbledon Highlights Using Artificial Intelligence. SportTechie, 2017.
- IBM Watson is creating highlight reels at the Masters. ZDNet, 2017.
- Fighting terrorism in New York City. CBS 60 Minutes, 2011. (starting at minute 7:20)
- ABC7 puts video analytics to the test. ABC News, 2010.
- Cameras help confirm Scott suicide ruling. ABC News, 2009.
- Intelligent Iris. ABC News, 2008.
- The Nonphotorealistic camera. Slashdot, 2004.
The postings on this site are my own and don't necessarily represent IBM's positions.