Rogerio Schmidt Feris

Principal Scientist and Manager

MIT-IBM Watson AI Lab

IBM Research


I am a principal scientist and manager at the MIT-IBM Watson AI Lab. I am broadly interested in teaching machines to see, listen, and read with little or no supervision, as humans do. In particular, my recent work has centered on augmenting large language models (LLMs) with memory and multiple modalities (vision, sound, speech, ...), and on specializing LLMs for enterprise domains.


I am passionate about doing fundamental research as well as developing systems that make a real-world impact. My work has been published in top AI conferences, integrated into multiple products, and covered by media outlets such as the New York Times, ABC News, and CBS 60 Minutes. See my bio for more information about me.


Our work on AI/ML for auto-curation of sports highlights has been selected for a Technology and Engineering Emmy award! Our system has been used to automatically produce the official highlights of the US Open, Wimbledon, and Masters tournaments.

We received the award in Las Vegas on April 16, 2023, as part of the National Association of Broadcasters (NAB) ceremony.


Area Chair: ECCV 2024, ICLR 2024, NeurIPS 2023, ICCV 2023, ICLR 2023, CVPR 2023, NeurIPS 2022, ICLR 2022, NeurIPS 2021, CVPR 2021, ICML 2021, ICLR 2021, NeurIPS 2020, ECCV 2020, CVPR 2020, NeurIPS 2019, NeurIPS 2018, ACM MM 2017, CVPR 2017, CVPR 2016, CVPR 2015, ISVC 2015, ICCV 2015 

Associate Editor: IEEE Transactions on Pattern Analysis and Machine Intelligence (2018-2023)

See more Professional Activities

Principal Investigator: 

- DARPA Learning with Less Labels (2019-2023)

- IARPA Deep Intermodal Video Analytics (2017-2021)

Google Scholar Citations


Selected Publications (see full list by date or by topic)

CAMELoT: Towards Large Language Models with Training-Free Associative Memory


Zexue He, Leonid Karlinsky, Donghyun Kim, Julian McAuley, Dmitry Krotov, Rogerio Feris


[Arxiv Preprint] [Code]

AI Commentary: Generative AI for Sports and Entertainment


We worked closely with the IBM Consulting team to create a system that generated AI commentary for all official highlights of the 2023 US Open and Wimbledon tournaments.


[Project Page] [IBM Blog] [CNN] [Fox News] [ESPN] [Forbes] [NBC News]

Self-Specialization: Uncovering Latent Expertise within Large Language Models


Junmo Kang, Hongyin Luo, Yada Zhu, James Glass, David Cox, Alan Ritter, Rogerio Feris, Leonid Karlinsky

ACL 2024



LATERAL: Learning Automatic, Transfer-Enhanced, and Relation-Aware Labels

Assaf Arbelle, Graeme Blackwood, Leonid Karlinsky, Aadarsh Sahoo, Joseph Shtok, Rogerio Feris


[DARPA LwLL Final Technical Report]

Learning Human Action Recognition Representations Without Real Humans

Howard Zhong, Samarth Mishra, Donghyun Kim, SouYoung Jin, Rameswar Panda, Hilde Kuehne, Leonid Karlinsky, Venkatesh Saligrama, Aude Oliva, Rogerio Feris

NeurIPS 2023


[Paper] [Code and Data]

Synthetic Pre-Training Tasks for Neural Machine Translation

Zexue He, Graeme Blackwood, Rameswar Panda, Julian McAuley, Rogerio Feris

ACL 2023




Learning to Grow Pretrained Models for Efficient Transformer Training


Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Zhangyang Wang, Yoon Kim

ICLR 2023 (notable-top-25%)


[Paper] [Project Page] [Code]


Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning


Zhen Wang, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Huan Sun, Yoon Kim

ICLR 2023


[Paper] [Project Page] [Code]


Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation


Aadarsh Sahoo, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das

WACV 2023 (Best Paper Award Honorable Mention)


[Paper] [Project Page] [Code]


Procedural Image Programs for Representation Learning


Manel Baradad, Chun-Fu Chen, Jonas Wulff, Tongzhou Wang, Rogerio Feris, Antonio Torralba, Phillip Isola

NeurIPS 2022


[Paper] [Project Page] [Code]


How Transferable are Video Representations Based on Synthetic Data?


Yo-whan Kim, Samarth Mishra, SouYoung Jin, Rameswar Panda, Hilde Kuehne, Leonid Karlinsky, Venkatesh Saligrama, Kate Saenko, Aude Oliva, Rogerio Feris

NeurIPS 2022


[Paper] [Dataset]


FETA: Towards Specializing Foundation Models for Expert Task Applications


Amit Alfassy, Assaf Arbelle, Oshri Halimi, Sivan Harary, Roei Herzig, Eli Schwartz, Rameswar Panda, Michele Dolfi, Christoph Auer, Peter W. J. Staar, Kate Saenko, Rogerio Feris, Leonid Karlinsky

NeurIPS 2022


[Paper] [Code]


Task2Sim: Towards Effective Pre-training and Transfer from Synthetic Data

Samarth Mishra, Rameswar Panda, Cheng Perng Phoo, Chun-Fu Chen, Leonid Karlinsky, Kate Saenko, Venkatesh Saligrama, Rogerio Feris


CVPR 2022


[Paper] [Project Page] [Code]


Unsupervised Domain Generalization by Learning a Bridge Across Domains

Sivan Harary, Eli Schwartz, Assaf Arbelle, Peter Staar, Shady Abu-Hussein, Elad Amrani, Roei Herzig, Amit Alfassy, Raja Giryes, Hilde Kuehne, Dina Katabi, Kate Saenko, Rogerio Feris, Leonid Karlinsky


CVPR 2022, Oral


[Paper] [Demo] [Code]


SimVQA: Exploring Simulated Environments for Visual Question Answering

Paola Cascante-Bonilla, Hui Wu, Letao Wang, Rogerio Feris, Vicente Ordonez


CVPR 2022


[Paper] [Project Page] [Code]


VALHALLA: Visual Hallucination for Machine Translation

Yi Li, Rameswar Panda, Yoon Kim, Chun-Fu Chen, Rogerio Feris, David Cox, Nuno Vasconcelos

CVPR 2022


[Paper] [Project Page] [Code]


Everything at Once – Multi-modal Fusion Transformer for Video Retrieval

N. Shvetsova, B. Chen, A. Rouditchenko, S. Thomas, B. Kingsbury, R. Feris, D. Harwath, J. Glass, and H. Kuehne

CVPR 2022




Targeted Supervised Contrastive Learning for Long-Tailed Recognition

Tianhong Li, Peng Cao, Yuan Yuan, Lijie Fan, Yuzhe Yang, Rogerio Feris, Piotr Indyk, Dina Katabi


CVPR 2022


[Paper] [Code]


IA-RED^2: Interpretability-Aware Redundancy Reduction for Vision Transformers


Bowen Pan, Rameswar Panda, Yifan Jiang, Zhangyang Wang, Rogerio Feris, Aude Oliva

NeurIPS 2021


[Paper] [Project Page] [Code]


Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data


Ashraful Islam, Chun-Fu Chen, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Richard J. Radke

NeurIPS 2021


[Paper] [Code]


Dynamic Network Quantization for Efficient Video Inference


Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Aude Oliva, Rogerio Feris, Kate Saenko

ICCV 2021


[Paper] [Project Page] [Code]


AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition


Rameswar Panda, Chun-Fu Chen, Quanfu Fan, Ximeng Sun, Kate Saenko, Aude Oliva, Rogerio Feris

ICCV 2021


[Paper] [Project Page] [Code]


A Broad Study on the Transferability of Visual Representations with Contrastive Learning

Ashraful Islam, Chun-Fu Chen, Rameswar Panda, Leonid Karlinsky, Richard Radke, Rogerio Feris


ICCV 2021


[Paper] [Code]


Detector-Free Weakly Supervised Grounding by Separation

Assaf Arbelle, Sivan Doveh, Amit Alfassy, Joseph Shtok, Guy Lev, Eli Schwartz, Hilde Kuehne, Hila Barak Levi, Prasanna Sattigeri, Rameswar Panda, Chun-Fu Chen, Alex Bronstein, Kate Saenko, Shimon Ullman, Raja Giryes, Rogerio Feris, Leonid Karlinsky


ICCV 2021, Oral




Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos

Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie Boggust, Rameswar Panda, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Michael Picheny, Shih-Fu Chang

ICCV 2021



AVLnet: Learning Audio-Visual Language Representations from Instructional Videos

Andrew Rouditchenko, Angie Boggust, David Harwath, Brian Chen, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Hilde Kuehne, Rameswar Panda, Rogerio Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James Glass

Interspeech 2021


[Paper] [Project Page] [Video Demo] [Code]


Cascaded Multilingual Audio-Visual Learning from Videos

Andrew Rouditchenko, Angie Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogerio Feris, Brian Kingsbury, Michael Picheny, James Glass

Interspeech 2021

[Paper] [Project Page] [Code]


Fine-grained Angular Contrastive Learning with Coarse Labels


Guy Bukchin, Eli Schwartz, Kate Saenko, Ori Shahar, Rogerio Feris, Raja Giryes, Leonid Karlinsky

CVPR 2021, Oral




Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback

Hui Wu, Yupeng Gao, Xiaoxiao Guo, Ziad Al-Halah, Steven Rennie, Kristen Grauman, Rogerio Feris

CVPR 2021


[Paper] [Data]


Separating Skills and Concepts for Novel Visual Question Answering

Spencer Whitehead, Hui Wu, Heng Ji, Rogerio Feris, Kate Saenko

CVPR 2021




Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions

Mathew Monfort, SouYoung Jin, Alexander Liu, David Harwath, Rogerio Feris, James Glass, Aude Oliva

CVPR 2021


[Project Page] [Paper] [Data]


Semi-Supervised Action Recognition with Temporal Contrastive Learning

Ankit Singh, Omprakash Chakraborty, Ashutosh Varshney, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das

CVPR 2021




Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition

Chun-Fu Chen, Rameswar Panda, Kandan Ramakrishnan, Rogerio Feris, John Cohn, Aude Oliva, Quanfu Fan

CVPR 2021


[Paper] [Code]


AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition


Yue Meng, Rameswar Panda, Chung-Ching Lin, Prasanna Sattigeri, Leonid Karlinsky, Kate Saenko, Aude Oliva, Rogerio Feris

ICLR 2021


[Paper] [Project Page] [Code]


VA-RED^2: Video Adaptive Redundancy Reduction


Bowen Pan, Rameswar Panda, Camilo Fosco, Chung-Ching Lin, Alex Andonian, Yue Meng, Kate Saenko, Aude Oliva, Rogerio Feris

ICLR 2021


[Paper] [Project Page] [Code]


StarNet: towards Weakly Supervised Few-Shot Object Detection


Leonid Karlinsky, Joseph Shtok, Amit Alfassy, Moshe Lichtenstein, Sivan Harary, Eli Schwartz, Sivan Doveh, Prasanna Sattigeri, Rogerio Feris, Alexander Bronstein, Raja Giryes

AAAI 2021


[Paper] [Code]


NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search


Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee

AAAI 2021




AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning 


Ximeng Sun, Rameswar Panda, Kate Saenko, Rogerio Feris

NeurIPS 2020

[Paper] [Project Page] [Code]


A Broader Study of Cross-Domain Few-Shot Learning


Yunhui Guo, Noel C. Codella, Leonid Karlinsky, James V. Codella, John R. Smith, Kate Saenko, Tajana Rosing, Rogerio Feris

ECCV 2020


See also: CVPR VL3 Workshop and the challenge associated with our benchmark

[Paper] [Code and Data]


AR-Net: Adaptive Frame Resolution for Efficient Action Recognition


Yue Meng, Chung-Ching Lin, Rameswar Panda, Prasanna Sattigeri, Leonid Karlinsky, Aude Oliva, Kate Saenko, Rogerio Feris


ECCV 2020


[Paper] [Project Page] [Code] [MIT News]


OnlineAugment: Online Data Augmentation with Less Domain Knowledge

Zhiqiang Tang, Yunhe Gao, Leonid Karlinsky, Prasanna Sattigeri, Rogerio Feris, Dimitris Metaxas

ECCV 2020


[Paper] [Code]


TAFSSL: Task-Adaptive Feature Sub-Space Learning for Few-shot Classification


Moshe Lichtenstein, Prasanna Sattigeri, Rogerio Feris, Raja Giryes, Leonid Karlinsky

ECCV 2020


[Paper] [Code]


We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos

Camilo Fosco, Mathew Monfort, Allen Lee, Rogerio Feris, Carl Vondrick, Aude Oliva

ECCV 2020


[Paper] [Code] [Project Page] [MIT News]


Video Instance Segmentation Tracking

Chung-Ching Lin, Ying Hung, Rogerio Feris, Linglin He

CVPR 2020




Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation


Zhonghao Wang, Mo Yu, Yunchao Wei, Rogerio Feris, Jinjun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi

CVPR 2020


[Paper] [Code]


Learning from Lexical Perturbations for Consistent Visual Question Answering

Spencer Whitehead, Hui Wu, Yi Fung, Heng Ji, Rogerio Feris, Kate Saenko

Arxiv 2020




Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine-Grained Action Detection

Khoi-Nguyen C. Mac, Dhiraj Joshi, Raymond A. Yeh, Jinjun Xiong, Rogerio S. Feris, Minh N. Do

ICCV 2019, Oral


[Paper] [Code] [Project Page]


LaSO: Label-Set Operations Networks for Multi-label Few-shot Learning

Amit Alfassy, Leonid Karlinsky, Amit Aides, Joseph Shtok, Sivan Harary, Rogerio Feris, Raja Giryes, Alex M. Bronstein

CVPR 2019, Oral


[Paper] [Code]


SpotTune: Transfer Learning through Adaptive Fine-tuning

Yunhui Guo, Honghui Shi, Abhishek Kumar, Kristen Grauman, Tajana Rosing, Rogerio Feris

CVPR 2019

Top results on the Visual Decathlon challenge (2019)

[Paper] [Code]


RepMet: Representative-based Metric Learning for Classification and One-shot Object Detection

Leonid Karlinsky, Joseph Shtok, Sivan Harary, Eli Schwartz, Amit Aides, Rogerio Feris, Raja Giryes, Alex M. Bronstein

CVPR 2019

[Paper] [Code]


Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition

Chun-Fu Chen, Quanfu Fan, Neil Mallinar, Tom Sercu, Rogerio Feris

ICLR 2019


[Paper] [Code]


Automatic Curation of Sports Highlights using Multimodal Excitement Features

Michele Merler, Khoi Nguyen C. Mac, Dhiraj Joshi, Quoc Bao Nguyen, Stephen Hammer, John Kent, Jinjun Xiong, Minh N. Do, John R. Smith, Rogerio Feris

IEEE Transactions on Multimedia (TMM) 2019

Our system was used to produce the official highlights of the US Open, Wimbledon, and Masters tournaments (watched by millions of fans worldwide)


[Paper] [Blog] [Video Demo 1] [Video Demo 2] [New York Times] [Fortune] [Newsweek] [Engadget] [NBC News] [Behind the Code]


The Excitement of Sports: Automatic Highlights using Audio-Visual Cues

Michele Merler, D. Joshi, Khoi-Nguyen C. Mac, Q. Nguyen, Stephen Hammer, John Kent, Jinjun Xiong, Minh Do, John Smith, Rogerio Feris

CVPR Workshop on Sight and Sound, 2018


[Paper] [Slides] [Video Demo 1] [Video Demo 2] [Blog] [Venturebeat] [ZDNet]


Delta-Encoder: an Effective Sample Synthesis Method for Few-shot Object Recognition

Eli Schwartz, Leonid Karlinsky, Joseph Shtok, Sivan Harary, Mattias Marder, Rogerio Feris, Abhishek Kumar, Raja Giryes, Alex M. Bronstein

NeurIPS 2018, Spotlight


[Paper] [Code]


Dialog-based Interactive Image Retrieval 

Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, Rogerio Feris

NeurIPS 2018 


[Paper] [Code] [Video Demo]


Co-regularized Alignment for Unsupervised Domain Adaptation

Abhishek Kumar, Prasanna Sattigeri, Kahini Wadhawan, Leonid Karlinsky, Rogerio Feris, William T. Freeman, Gregory Wornell

NeurIPS 2018




Learning to Separate Object Sounds by Watching Unlabeled Video

Ruohan Gao, Rogerio Feris, Kristen Grauman

ECCV 2018, Oral


[Paper] [Project Page] [Code]


Revisiting RCNN: On Awakening the Classification Power of Faster RCNN

Bowen Cheng, Yunchao Wei, Honghui Shi, Rogerio Feris, Jinjun Xiong, Thomas Huang

ECCV 2018

DCR achieved state-of-the-art results on Pascal VOC and MS-COCO


[Paper] [Code]


BlockDrop: Dynamic Inference Paths in Residual Networks


Zuxuan Wu, Tushar Nagarajan, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, Rogerio Feris

CVPR 2018, Spotlight


[Paper] [Code]


Jointly Optimize Data Augmentation and Network Training: Adversarial Data Augmentation in Human Pose Estimation

Xi Peng, Zhiqiang Tang, Fei Yang, Rogerio Feris, Dimitris Metaxas

CVPR 2018


[Paper] [Code]


Fully-adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification

Yongxi Lu, Abhishek Kumar, Shuangfei Zhai, Yu Cheng, Tara Javidi, Rogerio Feris

CVPR 2017, Spotlight




S3Pool: Pooling with Stochastic Spatial Sampling


Shuangfei Zhai, Hui Wu, Abhishek Kumar, Yu Cheng, Yongxi Lu, Zhongfei Zhang, Rogerio Feris

CVPR 2017


[Paper] [Code]


A Unified Multi Scale Deep Convolutional Neural Network for Fast Object Detection

Zhaowei Cai, Quanfu Fan, Rogerio S. Feris, Nuno Vasconcelos

ECCV 2016


MS-CNN achieved state-of-the-art results on the popular KITTI dataset

[Paper] [Code] [Demo] [KITTI results] [Project Page]


A Recurrent Encoder-Decoder Network for Sequential Face Alignment

Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas

ECCV 2016

[Paper] [Code] [Project Page] [Video Demo]


Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data

Jing Wang, Yu Cheng, Rogerio Feris 

CVPR 2016, Oral




Visual Attributes

Rogerio Feris, Christoph Lampert, Devi Parikh

Advances in Computer Vision and Pattern Recognition, Springer, 2016

[Book Link]


An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections

Yu Cheng, Felix X. Yu, Rogerio S. Feris, Sanjiv Kumar, Alok Choudhary, Shih-Fu Chang

ICCV 2015




Cross-domain Image Retrieval with a Dual Attribute-aware Ranking Network

Junshi Huang, Rogerio S. Feris, Qiang Chen, Shuicheng Yan

ICCV 2015


[Paper] [Data]


Deep Domain Adaptation for Describing People Based on Fine-Grained Clothing Attributes

Qiang Chen, Junshi Huang, Rogerio Feris, Lisa M Brown, Jian Dong, Shuicheng Yan

CVPR 2015




Attribute-based People Search

Rogerio Feris, Russ Bobbit, Lisa Brown, Sharath Pankanti

ICMR 2014

See also:

[Paper] [Video Demo]


Fast Face Detector Training Using Tailored Views

Kristina Scherbaum, James Petterson, Rogerio Feris, Volker Blanz, Hans-Peter Seidel

ICCV 2013



Efficient Maximum Appearance Search for Large-Scale Object Detection

Qiang Chen, Zheng Song, Rogerio Feris, Ankur Datta, Liangliang Cao, Zhongyang Huang, Shuicheng Yan

CVPR 2013



Designing Category-level Attributes for Discriminative Visual Recognition

Felix X. Yu, Liangliang Cao, Rogerio Feris, John R. Smith, Shih-Fu Chang

CVPR 2013



Large-Scale Vehicle Detection, Indexing, and Search in Urban Surveillance Videos

Rogerio Feris, Behjat Siddiquie, James Petterson, Yun Zhai, Ankur Datta, Lisa Brown, Sharath Pankanti

IEEE Transactions on Multimedia, 2012

See also: Feris et al, Attribute-based Vehicle Search in Crowded Surveillance Videos, ICMR 2011

[Paper] [Video Demos]


Image Ranking and Retrieval Based on Multi-Attribute Queries

Behjat Siddiquie, Rogerio Feris, Larry Davis

CVPR 2011, Oral



Shape Classification Through Structured Learning of Matching Measures

Longbin Chen, Julian McAuley, Rogerio Feris, Tiberio Caetano, Matthew Turk

CVPR 2009

[Paper] [Code]


A Projector-Camera Setup for Geometry-Invariant Frequency Demultiplexing

Daniel Vaquero, Ramesh Raskar, Rogerio Feris, Matthew Turk

CVPR 2009



Characterizing the Shadow Space of Camera-Light Pairs

Daniel Vaquero, Rogerio Feris, Matthew Turk, Ramesh Raskar

CVPR 2008



Manifold-based Analysis of Facial Expression

Ya Chang, Changbo Hu, Rogerio Feris, Matthew Turk

Image and Vision Computing, 2006

[Paper] [Video Demo]


Discontinuity Preserving Stereo with Small Baseline Multi-Flash Illumination

Rogerio Feris, Longbin Chen, Matthew Turk, Ramesh Raskar, Karhan Tan

ICCV 2005, Oral 

See also: Feris et al, TPAMI 2007

[Paper] [Project Page] [Code] [Data]


Automatic Human Facial Illustrations with Variable Illumination

Rogerio Feris and Alex Olwal

SIGGRAPH Emerging Technologies, 2005 (Interactive Fogscreen)

[Project Page] [Code]


Non-photorealistic Camera: Depth Edge Detection and Stylized Rendering using Multi-Flash Imaging

Ramesh Raskar, Karhan Tan, Rogerio Feris, Jingyi Yu, Matthew Turk


[Paper] [Code] [Project Page] [Video Demo]


Specular Reflection Reduction with Multi-Flash Imaging

Rogerio Feris, Ramesh Raskar, Karhan Tan, Matthew Turk




Shape Enhanced Surgical Visualizations and Medical Illustrations with Multi-flash Imaging

Karhan Tan, James Kobler, Rogerio Feris, Paul Dietz, Ramesh Raskar




Exploiting Depth Discontinuities for Vision-based Fingerspelling Recognition

Rogerio Feris, Matthew Turk, Ramesh Raskar, Karhan Tan, Gosuke Ohashi

CVPR RTV4HCI Workshop 2004



Hierarchical Wavelet Networks for Facial Feature Localization

Rogerio Feris, James Gemmell, Kentaro Toyama, Volker Krueger

Face and Gesture Recognition 2002

Developed as part of the GazeMaster project for videoconferencing. I did this work during my internship at Microsoft Research in 2001.

[Paper] [IFA Head Pose Tracking Demo]


Efficient Real-Time Face Tracking in Wavelet Subspace

Rogerio Feris, Volker Krueger, Roberto Cesar

ICCV RATFG-RTS Workshop, 2001

[Paper] [Video Demo 1] [Video Demo 2]


Recent Invited/Keynote Talks

  • “Learning with Real and Unreal Data in the Era of Foundation Models” [pdf]

- Conference on Graphics, Patterns and Images (Sibgrapi 2023), Brazil, 2023.

  • “Learning Trusted Models with Less Data” [pdf]

- IJCAI Workshop on Generalizing from Limited Resources in the Open World, 2023.

  • "Representation Learning based on Synthetic Data" [pdf]

- Google Research India, 2022

  • "Computational Visual Pathways for Multi-Task Learning and Simulation" [Video] [pdf]

- Flatiron Institute, 2022

- Binghamton University, 2022

- University of British Columbia, 2021

  • "Dynamic Neural Networks for Efficient Multimodal Video Understanding" [Video]

- Boston University AIR Distinguished Speaker Series, 2021

  • "How Transferable are Contrastive Representations?" [Video] [pdf]

- CVPR 2021 L2ID Workshop

  • "Adaptive Multimodal Learning for Efficient Video Understanding" [Video] [pdf]

- CVPR 2021 MULA Workshop.

- CVPR 2021 LatinX Workshop

  • AI We Can Scale, "Learning to Learn" [Video]

- What's Next in AI event, 2020 

  • "Dynamic Neural Networks for Efficient Image and Video Classification" [Video] [pdf]

- ICML 2020 LatinX in AI Workshop

  • "Visual Learning Beyond Natural Images" [Video] [pdf]

- CVPR 2020 DIRA Workshop

  • "Dynamic Neural Networks for Efficient Inference" [Video] [pdf]

- NeurIPS 2019 EMC^2 Workshop

  • "Is it All Relative? Interactive Fashion Search based on Relative Natural Language Feedback" [pdf]

- CVPR 2019 FFSS-USAD Workshop

  • “Speeding Up Deep Neural Networks with Adaptive Computation and Efficient Multi-Scale Architectures” [pdf]

- CVPR 2019 EMC^2 Workshop

  • "Learning More from Less: Weak Supervision and Beyond" [pdf]

- CVPR 2019 Workshop on Learning from Imperfect Data

Media Press


The postings on this site are my own and don't necessarily represent IBM's positions.
