Rogerio Schmidt Feris
Principal Scientist and Manager
MIT-IBM Watson AI Lab
I am a principal scientist and manager at the MIT-IBM Watson AI Lab. My current work is particularly focused on deep learning methods that are label-efficient (learning with limited labels), sample-efficient (learning with less data), and computationally efficient. I am also interested in multimodal perception methods that combine vision, sound/speech, and language.
I am passionate about doing fundamental research as well as developing systems that make a real-world impact. My work has not only been published in top AI conferences, but has also been integrated into multiple products and covered by media outlets such as the New York Times, ABC News, and CBS 60 Minutes. See my bio for more information about me.
Six papers accepted at CVPR 2022
Two papers accepted at NeurIPS 2021, five papers at ICCV 2021, six papers at CVPR 2021, two papers at ICLR 2021, and two papers at AAAI 2021
I'm an Area Chair of ICLR 2021, CVPR 2021, ICML 2021, and NeurIPS 2021
Five papers accepted at ECCV 2020, two papers at CVPR 2020, and one paper at NeurIPS 2020
I'm an Area Chair of NeurIPS 2020, ECCV 2020, and CVPR 2020
Dynamic Neural Networks
with applications in Efficient Inference, Video Understanding, Transfer / Multi-Task Learning, and Adaptive Data Generation
Instead of relying on one-size-fits-all models, we are investigating dynamic neural networks that adaptively change computation depending on the input.
S. Mishra, R. Panda, C. Phoo, C. Chen, L. Karlinsky, K. Saenko, V. Saligrama, and R. Feris
B. Pan, R. Panda, Y. Jiang, Z. Wang, R. Feris, and A. Oliva
Y. Meng, R. Panda, C. Lin, P. Sattigeri, L. Karlinsky, K. Saenko, A. Oliva, and R. Feris
Y. Meng, C. Lin, R. Panda, P. Sattigeri, L. Karlinsky, A. Oliva, K. Saenko, and R. Feris
X. Sun, R. Panda, R. Feris, and K. Saenko
See also: Fully-adaptive Feature Sharing in Multi-Task Networks (CVPR 2017)
Y. Guo, H. Shi, A. Kumar, K. Grauman, T. Rosing, and R. Feris
Top results on the Visual Decathlon challenge (2019)
Deep Learning with Limited Labeled Data
L. Karlinsky, J. Shtok, S. Harary, E. Schwartz, A. Aides, R. Feris, R. Giryes, and A. Bronstein
See also: Our StarNet paper (AAAI 2021)
Transfer Learning and Adaptation
Z. Wang, M. Yu, Y. Wei, R. Feris, J. Xiong, W. Hwu, T. Huang, and H. Shi
E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, A. Kumar, R. Feris, R. Giryes, and A. Bronstein
NeurIPS 2018, Spotlight
Multimodal Learning (Vision, Audio, Speech, Language) and Applications
N. Shvetsova, B. Chen, A. Rouditchenko, S. Thomas, B. Kingsbury, R. Feris, D. Harwath, J. Glass, and H. Kuehne
B. Chen, A. Rouditchenko, K. Duarte, H. Kuehne, S. Thomas, A. Boggust, R. Panda, B. Kingsbury, R. Feris, D. Harwath, J. Glass, M. Picheny, and S.F. Chang
A. Rouditchenko, A. Boggust, D. Harwath, S. Thomas, H. Kuehne, B. Chen, R. Panda, R. Feris, B. Kingsbury, M. Picheny, and J. Glass
A. Rouditchenko, A. Boggust, D. Harwath, D. Joshi, S. Thomas, K. Audhkhasi, R. Feris, B. Kingsbury, M. Picheny, A. Torralba, and J. Glass
M. Merler, D. Joshi, Q. Nguyen, S. Hammer, J. Kent, J. Xiong, M. Do, J. Smith, and R. Feris
IEEE Transactions on Multimedia (TMM), 2019
Our system was used to produce the official highlights of the US Open, Wimbledon, and Masters tournaments, which were watched by millions of fans worldwide.
Vision and Language for Fashion
Egocentric Video + Geo-location + Weather
Model Compression and Acceleration
More on Video: Action Recognition and Tracking
A. Andonian, C. Fosco, M. Monfort, A. Lee, R. Feris, C. Vondrick, and A. Oliva
M. Khoi-Nguyen, D. Joshi, R. Yeh, J. Xiong, R. Feris, and M. Do
ICCV 2019, Oral
Object Detection and Matching
B. Cheng, Y. Wei, H. She, R. Feris, J. Xiong, and T. Huang
DCR achieved state-of-the-art results on Pascal VOC and MS-COCO
R. Feris, B. Siddiquie, J. Petterson, Y. Zhai, A. Datta, L. Brown, and S. Pankanti
IEEE Transactions on Multimedia, 2012
See also: Feris et al., Attribute-based Vehicle Search in Crowded Surveillance Videos, ICMR 2011
R. Raskar, K. Tan, R. Feris, J. Yu, and M. Turk
R. Feris, J. Gemmell, K. Toyama, and V. Krueger
Face and Gesture Recognition 2002
Developed as part of the GazeMaster project for videoconferencing. I did this work during my internship at Microsoft Research in 2001.
CVPR 2019 FFSS-USAD Workshop. "Is it All Relative? Interactive Fashion Search based on Relative Natural Language Feedback" [pdf]
CVPR 2019 EMC^2 Workshop. "Speeding Up Deep Neural Networks with Adaptive Computation and Efficient Multi-Scale Architectures" [pdf]
CVPR 2019 Workshop on Learning from Imperfect Data. "Learning More from Less: Weak Supervision and Beyond" [pdf]
Shrinking deep learning’s carbon footprint. MIT News, 2020.
Coffee delivery drone patented by IBM. BBC News, 2018.
Enjoy Those U.S. Open Highlights. A Computer Picked Them for You. New York Times, 2017.
How an IBM Computer Picks U.S. Open Highlights. Fortune, 2017.
IBM’s Watson Serves Up This Year’s U.S. Open Highlights. NBC News, 2017.
IBM's Watson is creating US Open tennis highlight videos. Engadget, 2017.
IBM uses AI to serve up Wimbledon highlights. CNet, 2017.
IBM To Provide Wimbledon Highlights Using Artificial Intelligence. SportTechie, 2017.
IBM Watson is creating highlight reels at the Masters. ZDNet, 2017.
Fighting terrorism in New York City. CBS 60 Minutes, 2011. (starting at minute 7:20)
ABC7 puts video analytics to the test. ABC News, 2010.
Cameras help confirm Scott suicide ruling. ABC News, 2009.
Intelligent Iris. ABC News, 2008.
The Nonphotorealistic camera. SlashDot, 2004.
The postings on this site are my own and don't necessarily represent IBM's positions.