Publications | Rogerio Feris

Books

R. Feris, C. Lampert and D. Parikh (Eds.). Visual Attributes. Advances in Computer Vision and Pattern Recognition, Springer, 2016. [Book Link]

Conference Publications

L. Karlinsky, A. Arbelle, A. Daniels, A. Nassar, A. Alfassi, B. Wu, D. Joshi, E. Schwartz, J. Kondic, N. Shabtay, P. Li, R. Herzig, S. Abedin, S. Perek, S. Harary, U. Barzelay, A. Goldfarb, A. Oliva, B. Wieles, B. Bhattacharjee, B. Huang, C. Auer, D. Gutfreund, D. Beymer, D. Wood, H. Kuehne, J. Hansen, J. Shtok, K. Wong, L. Bathen, M. Mishra, M. Lysak, M. Dolfi, M. Yurochkin, N. Livathinos, N. Harel, O. Azulai, O. Naparstek, R. Teixeira de Lima, R. Panda, S. Doveh, S. Gupta, S. Das, S. Zawad, Y. Kim, Z. He, A. Brooks, G. Goodhart, A. Govindjee, D. Leist, I. Ibrahim, A. Soffer, D. Cox, K. Soule, L. Lastras, N. Desai, S. Ofek-Koifman, S. Raghavan, T. Syeda-Mahmood, P. Staar, T. Drory, and R. Feris. Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence. IBM Technical Report, 2025. [pdf]

E. Araujo, A. Rouditchenko, Y. Gong, S. Bhati, S. Thomas, B. Kingsbury, L. Karlinsky, R. Feris, J. Glass, and H. Kuehne. CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment. Conference on Computer Vision and Pattern Recognition (CVPR 2025). [pdf]

Y. Wang, D. Krotov, Y. Hu, Y. Gao, W. Zhou, J. McAuley, D. Gutfreund, R. Feris, and Z. He. M+: Extending MemoryLLM with Scalable Long-Term Memory. International Conference on Machine Learning (ICML 2025). [pdf]

J. Kang, L. Karlinsky, H. Luo, Z. Wang, J. Hansen, J. Glass, D. Cox, R. Panda, R. Feris, and A. Ritter. Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts. International Conference on Learning Representations (ICLR 2025). [pdf]

I. Huang, W. Lin, M. Mirza, J. Hansen, S. Doveh, V. Butoi, R. Herzig, A. Arbelle, H. Kuehne, T. Darrell, C. Gan, A. Oliva, R. Feris, and L. Karlinsky. ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs. Conference on Neural Information Processing Systems (NeurIPS 2024). [pdf]

R. Wang, S. Ghosh, D. Cox, D. Antognini, A. Oliva, R. Feris, and L. Karlinsky. Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning. Conference on Neural Information Processing Systems (NeurIPS 2024). [pdf]

Z. He, L. Karlinsky, D. Kim, J. McAuley, D. Krotov, and R. Feris. CAMELoT: Towards Large Language Models with Training-Free Associative Memory. ICML 2024 Workshop on Long-Context Foundation Models, 2024. [pdf]

J. Li, S. Das, A. Oliva, D. Krotov, L. Karlinsky, and R. Feris. Long Context Understanding using Self-Generated Synthetic Data. ICML 2024 Workshop on Long-Context Foundation Models, 2024. [pdf]

M. Stallone et al. Scaling Granite Code Models to 128K Context. IBM Technical Report, 2024. [pdf]

B. Chen, N. Shvetsova, A. Rouditchenko, D. Kondermann, S. Thomas, S. Chang, R. Feris, J. Glass, and H. Kuehne. What, when, and where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. Conference on Computer Vision and Pattern Recognition (CVPR 2024). [pdf]

J. Smith, L. Valkov, S. Halbe, V. Gutta, R. Feris, Z. Kira, and L. Karlinsky. Adaptive Memory Replay for Continual Learning. CVPR Workshop on Efficient Large Vision Models, 2024. [pdf]

A. Rouditchenko, Y. Gong, S. Thomas, L. Karlinsky, H. Kuehne, R. Feris, and J. Glass. Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation (Interspeech 2024). [pdf]

J. Kang, H. Luo, Y. Zhu, J. Hansen, J. Glass, D. Cox, A. Ritter, R. Feris, L. Karlinsky. Self-Specialization: Uncovering Latent Expertise within Large Language Models (ACL 2024, Findings). [pdf]

B. Pan, R. Panda, S. Jin, R. Feris, A. Oliva, P. Isola, and Y. Kim. LangNav: Language as a Perceptual Representation for Navigation. North American Chapter of the Association for Computational Linguistics (NAACL 2024). [pdf]

A. Baughman, S. Hammer, R. Agarwal, R. Feris, G. Akay, E. Morales, L. Karlinsky, and T. Johnson. Large Scale Generative AI Text Applied to Sports and Music. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024). [pdf]

S. Bhati, Y. Gong, L. Karlinsky, H. Kuehne, R. Feris, and J. Glass. DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners. IEEE Spoken Language Technology Workshop, 2024. [pdf]

X. Sun, R. Panda, C. Chen, N. Wang, B. Pan, K. Gopalakrishnan, A. Oliva, R. Feris, and K. Saenko. Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths. IEEE Winter Conference on Applications of Computer Vision (WACV 2024). [pdf]

Z. He, G. Blackwood, R. Panda, J. McAuley, and R. Feris. Synthetic Pre-trained Tasks for Neural Machine Translation (ACL 2023, Findings). [pdf]

H. Zhong, S. Mishra, D. Kim, S. Jin, R. Panda, H. Kuehne, L. Karlinsky, V. Saligrama, A. Oliva, and R. Feris. Learning Human Action Recognition Representations Without Real Humans. Conference on Neural Information Processing Systems (NeurIPS 2023 Datasets Track) [pdf]

S. Doveh, A. Arbelle, S. Harary, R. Herzig, D. Kim, P. Cascante-Bonilla, A. Alfassy, R. Panda, R. Giryes, R. Feris, S. Ullman, and L. Karlinsky. Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models. Conference on Neural Information Processing Systems (NeurIPS 2023, Spotlight). [pdf]

M. Mirza, L. Karlinsky, W. Lin, H. Possegger, M. Kozinski, R. Feris, and H. Bischof. LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections. Conference on Neural Information Processing Systems (NeurIPS 2023). [pdf]

R. Herzig, A. Mendelson, L. Karlinsky, A. Arbelle, R. Feris, T. Darrell, A. Globerson. Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs. Conference on Empirical Methods in Natural Language Processing (EMNLP 2023). [pdf]

A. Rouditchenko, S. Khurana, S. Thomas, R. Feris, L. Karlinsky, H. Kuehne, D. Harwath, B. Kingsbury, and J. Glass. Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages (Interspeech 2023). [pdf]

K. Wang, D. Kim, R. Feris, and M. Betke. CDAC: Cross-domain Attention Consistency in Transformer for Domain Adaptive Semantic Segmentation. International Conference on Computer Vision (ICCV 2023). [pdf]

P. Cascante-Bonilla, K. Shehada, J. Smith, S. Doveh, D. Kim, R. Panda, G. Varol, A. Oliva, V. Ordonez, R. Feris, and L. Karlinsky. Going Beyond Nouns With Vision & Language Models Using Synthetic Data. International Conference on Computer Vision (ICCV 2023) [pdf]

W. Lin, L. Karlinsky, N. Shvetsova, H. Possegger, M. Kozinski, R. Panda, R. Feris, H. Kuehne, and H. Bischof. MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge. International Conference on Computer Vision (ICCV 2023) [pdf]

J. Smith, P. Cascante-Bonilla, A. Arbelle, D. Kim, R. Panda, D. Cox, D. Yang, Z. Kira, R. Feris, and L. Karlinsky. ConStruct-VL: Data-Free Continual Structured VL Concepts Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2023). [pdf]

J. Smith, L. Karlinsky, V. Gutta, P. Cascante-Bonilla, D. Kim, A. Arbelle, R. Panda, R. Feris, and Z. Kira. CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2023). [pdf]

S. Doveh, A. Arbelle, S. Harary, R. Panda, R. Herzig, E. Schwartz, R. Giryes, R. Feris, S. Ullman, and L. Karlinsky. Teaching Structured Vision & Language Concepts to Vision & Language Models. Conference on Computer Vision and Pattern Recognition (CVPR 2023). [pdf]

P. Wang, R. Panda, L. Hennigen, P. Greengard, L. Karlinsky, R. Feris, D. Cox, Z. Wang, and Y. Kim. Learning to Grow Pretrained Models for Efficient Transformer Training. International Conference on Learning Representations (ICLR 2023, notable-top-25%). [pdf]

Z. Wang, R. Panda, L. Karlinsky, R. Feris, H. Sun, and Y. Kim. Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning. International Conference on Learning Representations (ICLR 2023). [pdf]

A. Rouditchenko, Y. Chuang, N. Shvetsova, S. Thomas, R. Feris, B. Kingsbury, L. Karlinsky, D. Harwath, H. Kuehne, and J. Glass. C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023). [pdf]

A. Sahoo, R. Panda, R. Feris, K. Saenko, and A. Das. Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation. IEEE Winter Conference on Applications of Computer Vision (WACV 2023, Best Paper Award Honorable Mention). [pdf]

T. Li, L. Fan, Y. Yuan, H. He, Y. Tian, R. Feris, P. Indyk, and D. Katabi. Addressing Feature Suppression in Unsupervised Visual Representations. IEEE Winter Conference on Applications of Computer Vision (WACV 2023). [pdf]

M. Baradad, R. Chen, J. Wulff, T. Wang, R. Feris, A. Torralba, and P. Isola. Procedural Image Programs for Representation Learning. Conference on Neural Information Processing Systems (NeurIPS 2022). [pdf]

Y. Kim, S. Mishra, S. Jin, R. Panda, H. Kuehne, L. Karlinsky, V. Saligrama, K. Saenko, A. Oliva, and R. Feris. How Transferable are Video Representations Based on Synthetic Data? Conference on Neural Information Processing Systems (NeurIPS 2022 Dataset Track). [pdf]

A. Alfassy, A. Arbelle, O. Halimi, S. Harary, R. Herzig, E. Schwartz, R. Panda, M. Dolfi, C. Auer, P. Staar, K. Saenko, R. Feris, L. Karlinsky. FETA: Towards Specializing Foundational Models for Expert Task Applications. Conference on Neural Information Processing Systems (NeurIPS 2022 Dataset Track). [pdf]

S. Mishra, R. Panda, C. Phoo, C. Chen, L. Karlinsky, K. Saenko, V. Saligrama, and R. Feris. Task2Sim: Towards Effective Pre-training and Transfer from Synthetic Data. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

S. Harary, E. Schwartz, A. Arbelle, P. Staar, S. Abu-Hussein, E. Amrani, R. Herzig, A. Alfassy, R. Giryes, H. Kuehne, D. Katabi, K. Saenko, R. Feris, and L. Karlinsky. Unsupervised Domain Generalization by Learning a Bridge Across Domains. Conference on Computer Vision and Pattern Recognition (CVPR 2022, Oral). [pdf]

P. Cascante-Bonilla, H. Wu, L. Wang, R. Feris, and V. Ordonez. SimVQA: Exploring Simulated Environments for Visual Question Answering. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

Y. Li, R. Panda, Y. Kim, C. Chen, R. Feris, D. Cox, and N. Vasconcelos. VALHALLA: Visual Hallucination for Machine Translation. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

N. Shvetsova, B. Chen, A. Rouditchenko, S. Thomas, B. Kingsbury, R. Feris, D. Harwath, J. Glass, and H. Kuehne. Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

T. Li, P. Cao, Y. Yuan, L. Fan, Y. Yang, R. Feris, P. Indyk, and D. Katabi. Targeted Supervised Contrastive Learning for Long-Tailed Recognition. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

B. Pan, Y. Jiang, R. Panda, Z. Wang, R. Feris, and A. Oliva. IA-RED^2: Interpretability-Aware Redundancy Reduction for Vision Transformers. Conference on Neural Information Processing Systems (NeurIPS 2021). [pdf]

A. Islam, C. Chen, R. Panda, L. Karlinsky, R. Feris, and R. Radke. Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data. Conference on Neural Information Processing Systems (NeurIPS 2021). [pdf]

R. Panda, R. Chen, Q. Fan, X. Sun, K. Saenko, A. Oliva, and R. Feris. AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition. International Conference on Computer Vision (ICCV 2021). [pdf]

A. Islam, R. Chen, R. Panda, L. Karlinsky, R. Radke, and R. Feris. A Broad Study on the Transferability of Visual Representations with Contrastive Learning. International Conference on Computer Vision (ICCV 2021). [pdf]

A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Levi, P. Sattigeri, R. Panda, R. Chen, A. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, and L. Karlinsky. Detector-Free Weakly Supervised Grounding by Separation. International Conference on Computer Vision (ICCV 2021, Oral). [pdf]

X. Sun, R. Panda, R. Chen, A. Oliva, R. Feris, and K. Saenko. Dynamic Network Quantization for Efficient Video Inference. International Conference on Computer Vision (ICCV 2021). [pdf]

B. Chen, A. Rouditchenko, K. Duarte, H. Kuehne, S. Thomas, A. Boggust, R. Panda, B. Kingsbury, R. Feris, D. Harwath, J. Glass, M. Picheny, and S. F. Chang. Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. International Conference on Computer Vision (ICCV 2021). [pdf]

A. Rouditchenko, A. Boggust, D. Harwath, B. Chen, D. Joshi, S. Thomas, K. Audhkhasi, H. Kuehne, R. Panda, R. Feris, B. Kingsbury, M. Picheny, A. Torralba, and J. Glass. AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. Interspeech 2021 [pdf]

A. Rouditchenko, A. Boggust, D. Harwath, S. Thomas, H. Kuehne, B. Chen, R. Panda, R. Feris, B. Kingsbury, M. Picheny and J. Glass. Cascaded Multilingual Audio-Visual Learning from Videos. Interspeech 2021

G. Bukchin, E. Schwartz, K. Saenko, O. Shahar, R. Feris, R. Giryes, and L. Karlinsky. Fine-grained Angular Contrastive Learning with Coarse Labels. Conference on Computer Vision and Pattern Recognition (CVPR 2021, Oral). [pdf]

H. Wu, Y. Gao, X. Guo, Z. Al-Halah, S. Rennie, K. Grauman, and R. Feris. Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

S. Whitehead, H. Wu, H. Ji, R. Feris, and K. Saenko. Separating Skills and Concepts for Novel Visual Question Answering. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

M. Monfort, S. Jin, D. Harwath, R. Feris, J. Glass, and A. Oliva. Spoken Moments: A Large Scale Dataset of Audio Descriptions of Dynamic Events in Video. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

A. Singh, O. Chakraborty, A. Varshney, R. Panda, R. Feris, K. Saenko, and A. Das. Semi-Supervised Action Recognition with Temporal Contrastive Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

C. Chen, R. Panda, K. Ramakrishnan, R. Feris, J. Cohn, A. Oliva, and Q. Fan. Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

B. Pan, R. Panda, C. Fosco, C. Lin, A. Andonian, Y. Meng, K. Saenko, A. Oliva, and R. Feris. VA-RED^2: Video Adaptive Redundancy Reduction. International Conference on Learning Representations (ICLR 2021). [pdf]

Y. Meng, R. Panda, C. Lin, P. Sattigeri, L. Karlinsky, K. Saenko, A. Oliva, and R. Feris. AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition. International Conference on Learning Representations (ICLR 2021). [pdf]

L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. Bronstein, and R. Giryes. StarNet: towards Weakly Supervised Few-Shot Object Detection. AAAI Conference on Artificial Intelligence (AAAI 2021). [pdf]

R. Panda, M. Merler, M. Jaiswal, H. Wu, K. Ramakrishnan, U. Finkler, C. Chen, M. Cho, R. Feris, D. Kung, and B. Bhattacharjee. NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search. AAAI Conference on Artificial Intelligence (AAAI 2021). [pdf]

U. Finkler, M. Merler, R. Panda, M. Jaiswal, H. Wu, K. Ramakrishnan, C. Chen, M. Cho, D. Kung, R. Feris, and B. Bhattacharjee. Large Scale Neural Architecture Search with Polyharmonic Splines. AAAI Workshop on Meta-Learning for Computer Vision, 2021. [pdf]

X. Sun, R. Panda, R. Feris, and K. Saenko. AdaShare: Learning What to Share for Efficient Deep Multi-Task Learning. Conference on Neural Information Processing Systems (NeurIPS 2020). [pdf]

Y. Guo, N. Codella, L. Karlinsky, J. Codella, J. Smith, K. Saenko, T. Rosing, and R. Feris. A Broader Study of Cross-Domain Few-Shot Learning. European Conference on Computer Vision (ECCV 2020). [pdf]

Y. Meng, C. Lin, R. Panda, P. Sattigeri, L. Karlinsky, K. Saenko, A. Oliva, and R. Feris. AR-Net: Adaptive Frame Resolution for Efficient Action Recognition. European Conference on Computer Vision (ECCV 2020). [pdf]

Z. Tang, Y. Gao, P. Sattigeri, L. Karlinsky, R. Feris, and D. Metaxas. OnlineAugment: Online Data Augmentation with Less Domain Knowledge. European Conference on Computer Vision (ECCV 2020). [pdf]

M. Lichtenstein, P. Sattigeri, R. Feris, R. Giryes, and L. Karlinsky. TAFSSL: Task-Adaptive Feature Sub-Space Learning for Few-shot Classification. European Conference on Computer Vision (ECCV 2020). [pdf]

A. Andonian, C. Fosco, M. Monfort, A. Lee, R. Feris, C. Vondrick, and A. Oliva. We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos. European Conference on Computer Vision (ECCV 2020). [pdf]

A. Sahoo, A. Singh, R. Panda, R. Feris, and A. Das. Mitigating Dataset Imbalance via Joint Generation and Classification. ECCV Workshop on Imbalance Problems in Computer Vision, 2020. [pdf]

C. Lin, Y. Hung, R. Feris, and L. He. Video Instance Segmentation Tracking. Conference on Computer Vision and Pattern Recognition (CVPR 2020). [pdf]

Z. Wang, M. Yu, Y. Wei, R. Feris, J. Xiong, W. Hwu, T. Huang, and H. Shi. Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation. Conference on Computer Vision and Pattern Recognition (CVPR 2020). [pdf]

K. Ramakrishnan, R. Panda, Q. Fan, J. Henning, A. Oliva, and R. Feris. Relationship Matters: Relation Guided Knowledge Transfer for Incremental Learning of Object Detectors. CVPR Workshop on Continual Learning in Computer Vision, 2020. [pdf]

M. Khoi-Nguyen, D. Joshi, R. Yeh, J. Xiong, R. Feris, and M. Do. Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine Grained Action Detection. International Conference on Computer Vision (ICCV 2019, Oral), Seoul, Korea, 2019. [pdf]

A. Alfassy, L. Karlinsky, A. Aides, J. Shtok, S. Harary, R. Feris, R. Giryes, and A. Bronstein. LaSO: Label-Set Operations Networks for Multi-label Few-shot Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2019, Oral), Long Beach, California, 2019. [pdf]

Y. Guo, H. Shi, A. Kumar, K. Grauman, T. Rosing, and R. Feris. SpotTune: Transfer Learning through Adaptive Fine-tuning. Conference on Computer Vision and Pattern Recognition (CVPR 2019), Long Beach, California, 2019. [pdf]

L. Karlinsky, J. Shtok, S. Harary, E. Schwartz, A. Aides, R. Feris, R. Giryes, and A. Bronstein. RepMet: Representative-based Metric Learning for Classification and One-shot Object Detection. Conference on Computer Vision and Pattern Recognition (CVPR 2019), Long Beach, California, 2019. [pdf]

C. Chen, Q. Fan, N. Mallinar, T. Sercu, and R. Feris. Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition. International Conference on Learning Representations (ICLR 2019), New Orleans, Louisiana, 2019. [pdf]

A. Boggust, K. Audhkhasi, D. Joshi, D. Harwath, S. Thomas, R. Feris, D. Gutfreund, Y. Zhang, A. Torralba, M. Picheny, and J. Glass. Grounding Spoken Words in Unlabeled Video. CVPR Workshop on Sight and Sound, 2019. [pdf]

K. Ramakrishnan, M. Monfort, B. McNamara, A. Lascelles, D. Gutfreund, R. Feris, and A. Oliva. Identifying Interpretable Action Concepts in Deep Networks. CVPR Workshop on Explainable AI, 2019. [pdf]

M. Jaiswal et al. Video-Text Compliance: Activity Verification Based on Natural Language Instructions. ICCV Workshop on Large Scale Holistic Video Understanding, 2019. [pdf]

Z. Shen, H. Shi, J. Yu, H. Phan, R. Feris, L. Cao, D. Liu, X. Wang, T. Huang, M. Savvides. Learning Object Detection from Scratch via Gated Feature Reuse. British Machine Vision Conference (BMVC), 2019. [pdf]

E. Schwartz*, L. Karlinsky*, J. Shtok, S. Harary, M. Marder, A. Kumar, R. Feris, R. Giryes, and A. Bronstein. Delta-Encoder: an Effective Sample Synthesis Method for Few-shot Object Recognition. Neural Information Processing Systems (NeurIPS 2018, Spotlight), Montreal, Canada, 2018. [pdf] (* equal contribution)

X. Guo*, H. Wu*, Y. Cheng, S. Rennie, G. Tesauro, and R. Feris. Dialog-based Interactive Image Retrieval. Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada, 2018. [pdf] [video demo] (* equal contribution)

A. Kumar, P. Sattigeri, K. Wadhawan, L. Karlinsky, R. Feris, W. T. Freeman, and G. Wornell. Co-regularized Alignment for Unsupervised Domain Adaptation. Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada, 2018. [pdf]

R. Gao, R. Feris and K. Grauman. Learning to Separate Object Sounds by Watching Unlabeled Video. European Conference on Computer Vision (ECCV 2018, Oral), Munich, Germany, 2018. [pdf] [project page]

B. Cheng*, Y. Wei*, H. Shi, R. Feris, J. Xiong, and T. Huang. Revisiting RCNN: On Awakening the Classification Power of Faster RCNN. European Conference on Computer Vision (ECCV 2018), Munich, Germany, 2018. [pdf] (* equal contribution)

Z. Wu*, T. Nagarajan*, A. Kumar, S. Rennie, L. Davis, K. Grauman, and R. Feris. BlockDrop: Dynamic Inference Paths in Residual Networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018, Spotlight), Salt Lake City, Utah, June 2018. [pdf] (* equal contribution)

X. Peng, Z. Tang, F. Yang, R. Feris, and D. Metaxas. Jointly optimize data augmentation and network training: Adversarial data augmentation in human pose estimation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, Utah, June 2018. [pdf]

M. Merler, D. Joshi, K. Mac, Q. Nguyen, S. Hammer, J. Kent, J. Xiong, M. Do, J. Smith, and R. Feris. The Excitement of Sports: Automatic Highlights using Audio-Visual Cues. CVPR Workshop on Sight and Sound, 2018. [pdf]

N. Codella, C. Lin, A. Halpern, M. Hind, R. Feris, and J. Smith. Collaborative Human-AI (CHAI): Evidence-Based Interpretable Melanoma Classification in Dermoscopic Images. MICCAI Workshop on Interpretability of Machine Intelligence in Medical Image Computing, 2018. [pdf]

N. Codella, D. Anderson, T. Philips, A. Porto, K. Massey, J. Snowdon, R. Feris, and J. Smith. Segmentation of both Diseased and Healthy Skin from Clinical Photographs in a Primary Care Setting. International Engineering in Medicine and Biology Conference, 2018. [pdf]

M. Beigi, L. M Brown, Q. Fan, J. Henning, C. Lin, H. Shi, C. Shu, and R. Feris. Object-Centric Spatio-Temporal Activity Detection and Recognition. NIST TRECVID Workshop, 2018. [pdf]

Y. Lu, A. Kumar, S. Zhai, Y. Cheng, T. Javidi, and R. Feris. Fully Adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017, Spotlight), Honolulu, Hawaii, July 2017. [pdf]

S. Zhai, H. Wu, A. Kumar, Y. Cheng, Y. Lu, Z. Zhang, and R. Feris. S3Pool: Pooling with Stochastic Spatial Sampling. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Honolulu, Hawaii, July 2017. [pdf] [code]

M. Merler, D. Joshi, Q. Nguyen, S. Hammer, J. Kent, J. R Smith, and R. Feris. Automatic Curation of Golf Highlights using Multimodal Excitement Features. Workshop of Computer Vision in Sports (in conjunction with CVPR), 2017. [pdf]

Z. Cai, Q. Fan, R. Feris, and N. Vasconcelos. A Unified Multi Scale Deep Convolutional Neural Network for Fast Object Detection. European Conference on Computer Vision (ECCV 2016), Amsterdam, Netherlands, 2016. [pdf] [code] [demo] [KITTI results]

X. Peng, R. Feris, X. Wang, and D. Metaxas. A Recurrent Encoder-Decoder Network for Sequential Face Alignment. European Conference on Computer Vision (ECCV 2016, Oral), Amsterdam, Netherlands, 2016. [pdf] [project page]

J. Wang, Y. Cheng, and R. Feris. Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016, Oral), Las Vegas, Nevada, June 2016. [pdf]

Y. Cheng, F. Yu, R. Feris, S. Kumar, A. Choudhary, and S. F. Chang. An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections. IEEE International Conference on Computer Vision (ICCV 2015), Santiago, Chile, December 2015. [pdf] [code]

J. Huang, R. Feris, Q. Chen, and S. Yan. Cross-domain Image Retrieval with a Dual Attribute-aware Ranking Network. IEEE International Conference on Computer Vision (ICCV 2015), Santiago, Chile, December 2015. [pdf] [data]

Q. Chen, J. Huang, R. Feris, L. Brown, J. Dong, and S. Yan. Deep Domain Adaptation for Describing People Based on Fine-Grained Clothing Attributes. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), Boston, Massachusetts, June 2015. [pdf]

J. Bowler, R. Feris, L. Cao, J. Wang, and M. Zhou. Automated Axon Segmentation from Highly Noisy Microscopic Videos. Winter Conference on Applications of Computer Vision (WACV 2015), Kona, Hawaii, 2015. [pdf]

R. Feris, R. Bobbitt, and S. Pankanti. Efficient 24/7 Object Detection in Surveillance Videos. IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2015), Germany, August 2015. [pdf]

Y. Cheng, L. Brown, Q. Fan, R. Feris, S. Pankanti, and T. Zhang. RiskWheel: Interactive Visual Analytics for Surveillance Event Detection. IEEE International Conference on Multimedia and Expo (ICME 2014, Oral), Chengdu, China, 2014. [pdf]

J. Xie, L. Chou, R. Feris, and M.T. Sun. Single Depth Image Super resolution and Denoising via Coupled Dictionary Learning with Local Constraints and Shock Filtering. IEEE International Conference on Multimedia and Expo (ICME 2014, Oral), 2014. [pdf]

J. Xie, R. Feris, and M.T. Sun. Edge Guided Single Depth Image Super Resolution. IEEE International Conference on Image Processing (ICIP 2014), 2014. [source code and paper]

Y. Cui, Y. Xiang, K. Rong, L. Cao, and R. Feris. A Spatial-Color Layout Feature for Representing Galaxy Images. IEEE Winter Conference on Applications of Computer Vision (WACV 2014), Steamboat Springs, Colorado, 2014. [pdf]

L. Brown, R. Feris, and S. Pankanti. Temporal Non-Maximum Suppression for Pedestrian Detection Using Scene Context. International Conference on Pattern Recognition (ICPR 2014), Stockholm, Sweden, 2014.

R. Feris, L. Brown, S. Pankanti, and M.T. Sun. Appearance-based Object Detection under Varying Environmental Conditions. International Conference on Pattern Recognition (ICPR 2014, Oral), Stockholm, Sweden, 2014.

A. Mattos and R. Feris. Fusing Well-Crafted Feature Descriptors for Efficient Fine-Grained Classification. IEEE International Conference on Image Processing (ICIP 2014), Paris, France, 2014.

A. Mattos, R. Herrmann, K. Shigeno, and R. Feris. A Mission-Oriented Citizen Science Platform for Efficient Flower Classification Based on Combination of Feature Descriptors. ICMR Workshop on Environmental Multimedia Retrieval, Glasgow, UK, 2014. [pdf]

A. Mattos, R. Herrmann, K. Shigeno, and R. Feris. Flower Classification for a Citizen Science Mobile App. International Conference on Multimedia Retrieval (ICMR 2014), Glasgow, UK, 2014.

R. Feris, R. Bobbitt, L. Brown, and S. Pankanti. Attribute-based People Search: Lessons Learnt from a Practical Surveillance System. ACM International Conference on Multimedia Retrieval (ICMR 2014), Oral Presentation, Glasgow, UK, 2014. [pdf]

K. Scherbaum, R. Feris, J. Petterson, V. Blanz, and H. Seidel. Fast Face Detector Training Using Tailored Views. IEEE International Conference on Computer Vision (ICCV 2013), Sydney, Australia, December 2013. [pdf]

Q. Chen, Z. Song, R. Feris, A. Datta, L. Cao, Z. Huang, and S. Yan. Efficient Maximum Appearance Search for Large-Scale Object Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), Portland, Oregon, June 2013. [pdf]

F. Yu, L. Cao, R. Feris, J. Smith, and S. Chang. Designing Category-Level Attributes for Discriminative Visual Recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), Portland, Oregon, June 2013. [pdf]

Q. Chen et al. Spatio-Temporal Fisher Vector Coding for Surveillance Event Detection. ACM International Conference on Multimedia (ACM MM 2013), Barcelona, Spain, 2013.

J. Xie, J. Hsu, R. Feris, and M. Sun. Fine Registration of 3D Point Clouds with ICP Using an RGB-D Camera. IEEE International Symposium on Circuits and Systems (ISCAS 2013), Beijing, China, 2013. [pdf]

X. Wang, L. Cao, R. Feris, A. Datta, and T. X. Han. Hierarchical Feature Pooling with Structure Learning: A new method for Pedestrian Detection. CVPR Workshop on Structured Prediction: Tractability, Learning, and Inference, Portland, Oregon, 2013.

J. Leandro, R. Cesar, and R. Feris. Shape Analysis using the Spectral Graph Wavelet Transform. IEEE eScience Conference, Beijing, China, 2013.

R. Feris, A. Datta, M. T. Sun, and S. Pankanti. Boosting Object Detection Performance in Crowded Surveillance Videos. Winter Conference on Applications of Computer Vision (WACV 2013), Florida, USA, 2013. [pdf]

F. Rashed, B. Siddiquie, R. Feris, and L. Davis. Domain Adaptive Object Detection. Winter Conference on Applications of Computer Vision (WACV 2013), Florida, USA, 2013. [pdf]

Y. Cai et al. CMU-IBM-NUS@TRECVID 2012: Surveillance Event Detection. NIST Technical Report, 2012. First place in the retrospective surveillance event detection task. [pdf]

R. Feris, B. Siddiquie, and S. Pankanti. Learning Detectors from Large Datasets for Object Retrieval in Video Surveillance. International Conference on Multimedia and Expo (ICME 2012), Melbourne, Australia, 2012. [pdf]

B. Siddiquie, R. Feris, A. Datta, and L. Davis. Unsupervised Model Selection for View-Invariant Object Detection in Surveillance Environments. International Conference on Pattern Recognition (ICPR 2012, Oral), Tsukuba City, Japan, 2012. [pdf]

A. Datta, L. Brown, R. Feris, and S. Pankanti. Appearance Modeling for Person Re-Identification using Weighted Brightness Transfer Functions. International Conference on Pattern Recognition (ICPR 2012), Tsukuba City, Japan, 2012. [pdf]

B. Siddiquie, R. Feris, and L. Davis. Image Ranking and Retrieval Based on Multi-Attribute Queries. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011, Oral Presentation), Colorado Springs, USA, 2011. [pdf]

R. Feris, B. Siddiquie, Y. Zhai, J. Petterson, L. Brown, and S. Pankanti. Attribute-based Vehicle Search in Crowded Surveillance Videos. ACM International Conference on Multimedia Retrieval (ICMR 2011, Oral Presentation), Trento, Italy, 2011. [pdf]

A. Datta, R. Feris, and D. Vaquero. Hierarchical Ranking of Facial Attributes. IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), Santa Barbara, California, 2011. [pdf]

R. Feris, J. Petterson, B. Siddiquie, L. Brown, and S. Pankanti. Large-Scale Vehicle Detection in Challenging Urban Surveillance Environments. Winter Conference on Applications of Computer Vision (WACV 2011), Kona, Hawaii, 2011. [pdf]

H. Liu, R. Feris, V. Krueger, and M.T. Sun. Unsupervised Action Classification Using Space-Time Link Analysis. IEEE International Symposium on Circuits and Systems (ISCAS 2010), Paris, France, 2010. [pdf]

L. Chen, J. McAuley, R. Feris, T. Caetano, and M. Turk. Shape Classification Through Structured Learning of Matching Measures. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, Florida, June 2009. [pdf]

D. Vaquero, R. Raskar, R. Feris, and M. Turk. A Projector-Camera Setup for Geometry-Invariant Frequency Demultiplexing. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, Florida, June 2009. [pdf]

D. Vaquero, R. Feris, L. Brown, and A. Hampapur. Attribute-based People Search in Surveillance Environments. Winter Conference on Applications of Computer Vision (WACV 2009, Oral Presentation), Snowbird, Utah, December 2009. [pdf]

A. Hampapur, R. Bobbitt, L. Brown, M. Desimone, R. Feris, R. Kjeldsen, M. Lu, C. Mercier, C. Milite, S. Russo, C. Shu, and Y. Zhai. Video Analytics in Urban Environments. IEEE International Conference on Advanced Video and Signal Based Surveillance, Genova, Italy, 2009.

R. Feris, Y. Tian, Y. Zhai, and A. Hampapur. Facial Image Analysis Using Local Feature Adaptation Prior to Learning. IEEE International Conference on Automatic Face and Gesture Recognition, Amsterdam, Netherlands, September 2008. [pdf]

Y. Tian, R. Feris, and A. Hampapur. Real-Time Detection of Abandoned and Removed Objects in Complex Environments. IEEE International Workshop on Visual Surveillance (in conjunction with ECCV 2008), Marseille, France, 2008. [pdf]

D. Vaquero, R. Feris, M. Turk, and R. Raskar. Characterizing the Shadow Space of Camera-Light Pairs. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), Anchorage, Alaska, June 2008. [pdf]

L. Chen, R. Feris, Y. Zhai, L. Brown, and A. Hampapur. An Integrated System for Moving Object Classification in Surveillance Videos. IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2008), Santa Fe, New Mexico, September 2008. [pdf]

L. Chen, R. Feris, and M. Turk. Efficient Partial Shape Matching Using the Smith-Waterman Algorithm. CVPR Workshop on Non-Rigid Shape Analysis and Deformable Image Alignment, Anchorage, Alaska, June 2008. [pdf]

A. Hampapur, L. Brown, R. Feris, A. Senior, C. Shu, Y. Tian, Y. Zhai, and M. Lu. Searching Surveillance Video. IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2007), London, UK, September 2007. [pdf]

A. Senior, L. Brown, A. Hampapur, C. Shu, Y. Zhai, R. Feris, Y. Tian, S. Borger, and C. Carlson. Video Analytics for Retail: the IBM Smart Surveillance Retail Solution. IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2007), London, UK, September 2007. [pdf]

R. Feris, Y. Tian, and A. Hampapur. Capturing People in Surveillance Video. IEEE International Workshop on Visual Surveillance, 2007. [pdf]

R. Feris, R. Raskar, and M. Turk. Dealing with Multi-scale Depth Changes and Motion in Depth Edge Detection. Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI 2006), Manaus, Brazil, October 2006. Awarded "One of the Best Image Processing and Computer Vision Papers". [pdf]

H. Guan, J. Chang, L. Chen, R. Feris, and M. Turk. Multi-view Appearance-based 3D Hand Pose Estimation. IEEE Workshop on Vision for Human Computer Interaction (in conjunction with CVPR 2006), New York, NY, June 2006. [pdf]

H. Guan, R. Feris, and M. Turk. The Isometric Self-Organizing Map for Hand Pose Estimation. International Conference on Face and Gesture Recognition, Southampton, UK, 2006. [pdf]

R. Feris, R. Raskar, L. Chen, K. Tan, and M. Turk. Discontinuity Preserving Stereo with Small Baseline Multi-Flash Illumination. International Conference on Computer Vision (ICCV 2005, Oral Presentation), Beijing, China, 2005. [pdf]

Y. Zana, R. Cesar, R. Feris, and M. Turk. Face Verification in Polar Frequency Domain: a Biologically Motivated Approach. International Symposium on Visual Computing, Lake Tahoe, NV, 2005. [pdf]

R. Feris, R. Raskar, K. Tan, and M. Turk. Specular Reflection Reduction with Multi-Flash Imaging. Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI 2004), Curitiba, Brazil, October 2004. Also accepted as a poster in SIGGRAPH 2004. [pdf]

R. Feris, M. Turk, R. Raskar, K. Tan, and G. Ohashi. Exploiting Depth Discontinuities for Vision-based Fingerspelling Recognition. IEEE Workshop on Real-Time Vision for Human-Computer Interaction (in conjunction with CVPR 2004), Washington DC, USA, June 2004. [pdf]

K. Tan, J. Kobler, R. Feris, P. Dietz, and R. Raskar. Shape Enhanced Surgical Visualizations and Medical Illustrations with Multi-flash Imaging. International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2004), Rennes, France, 2004. [pdf]

C. Hu, Y. Chang, R. Feris, and M. Turk. Manifold Based Analysis of Facial Expression. IEEE Workshop on Face Processing in Video (in conjunction with CVPR 2004), Washington DC, USA, June 2004. [pdf]

C. Hu, R. Feris, and M. Turk. Real-time View-Based Face Alignment Using Active Wavelet Networks. Workshop on Analysis and Modeling of Faces and Gestures (in conjunction with ICCV 2003), Nice, France, October 2003. [pdf]

C. Hu, R. Feris, and M. Turk. Active Wavelet Networks for Face Alignment. British Machine Vision Conference (BMVC 2003), Norwich, 2003. [pdf]

M. Turk, C. Hu, R. Feris, F. Lashkari, and A. Beall. TLA Based Face Tracking. International Conference on Vision Interfaces, Calgary, Canada, 2002. [pdf]

R. Feris, J. Gemmell, K. Toyama, and V. Krueger. Hierarchical Wavelet Networks for Facial Feature Localization. International Conference on Automatic Face and Gesture Recognition, Washington D.C., USA, May 20-21, 2002. [pdf]

R. Feris, V. Krueger, and R. M. Cesar Jr. Efficient Real-Time Face Tracking in Wavelet Subspace. Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-Time Systems (in conjunction with ICCV 2001), Vancouver, BC, 2001. [pdf]

R. Feris and R. M. Cesar Jr. Locating and Tracking Facial Landmarks Using Gabor Wavelet Networks. Lecture Notes in Computer Science, vol. 2013, pp. 311-320. International Conference on Advances in Pattern Recognition, Rio de Janeiro, Brazil, May 2001. [pdf]

V. Krueger and R. Feris. Wavelet Subspace Method for Real-time Face Tracking. Proc. Pattern Recognition. DAGM Symposium, Munich, Germany, 2001.

R. Feris, T. E. Campos, and R. M. Cesar Jr. A Project for Face Recognition from Video Sequences Using GWN and Eigenfeature Selection. Workshop in Artificial Intelligence and Computer Vision, Atibaia, Brazil, November 2000. [pdf]

R. Feris and R. M. Cesar Jr. Tracking Facial Features Using Gabor Wavelet Networks. Symposium on Computer Graphics and Image Processing (SIBGRAPI 2000), Gramado, Brazil, October 2000. [pdf]

T. Campos, R. Feris, and R. M. Cesar Jr. Improved Face versus Non-Face Discrimination Using Fourier Descriptors through Feature Selection. Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI 2000), Gramado, Brazil, October 2000. [pdf]

R. Feris, T. Campos, and R. M. Cesar Jr. Detection and Tracking of Facial Features in Video Sequences. Lecture Notes on Artificial Intelligence, vol. 1793, pp. 127-135. Proceedings of MICAI-2000, Acapulco, Mexico, April 2000. [pdf]

T. Campos, R. Feris, and R. M. Cesar Jr. Eigenfaces versus Eigeneyes: First Steps Towards Performance Assessment of Representations for Face Recognition. Lecture Notes on Artificial Intelligence, vol. 1793, pp. 193-201. Proceedings of MICAI-2000, Acapulco, Mexico, April 2000. [pdf]

R. Feris and W. Lages. Stereo Image Matching Using Correlation and Relaxation Labeling. IV Congress for Scientific Initiation and Postgraduation at Aeronautics Institute of Technology, pp. 193-199, Sao Jose dos Campos-SP, Brazil, October 1998 (in Portuguese).

Journal Publications

A. Rouditchenko, S. Thomas, H. Kuehne, R. Feris, and J. Glass. mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition. IEEE Signal Processing Letters, 2025. [pdf]

M. Monfort, K. Ramakrishnan, A. Andonian, B. McNamara, A. Lascelles, B. Pan, Q. Fan, D. Gutfreund, R. Feris, and A. Oliva. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. [pdf]

J. Lee, Y. Bu, P. Sattigeri, R. Panda, G. Wornell, L. Karlinsky, and R. Feris. A Maximal Correlation Framework for Fair Machine Learning. Entropy 24(4), 461, 2022. [pdf]

E. Schwartz, L. Karlinsky, R. Feris, R. Giryes, and A. Bronstein. Baby steps towards few-shot learning with multiple semantics. Pattern Recognition Letters, 2022. [pdf]

S. Doveh, E. Schwartz, C. Xue, R. Feris, A. Bronstein, R. Giryes, and L. Karlinsky. MetAdapt: Meta-Learned Task-Adaptive Architecture for Few-Shot Classification. Pattern Recognition Letters, 2021. [pdf]

M. Merler, D. Joshi, Q. Nguyen, S. Hammer, J. Kent, J. Xiong, M. Do, J. Smith, and R. Feris. Automatic Curation of Sports Highlights using Multimodal Excitement Features. IEEE Transactions on Multimedia (TMM), 2019. [pdf]

X. Peng, R. Feris, X. Wang, and D. Metaxas. Red-net: A recurrent encoder-decoder network for video-based face alignment. International Journal of Computer Vision (IJCV), 2018. [pdf] [code]

J. Xie, R. Feris, and M.T. Sun. Edge-Guided Single Depth Image Super Resolution. IEEE Transactions on Image Processing (TIP), vol. 25, no. 1, pp. 428-438, 2016. [source code and paper]

J. Xie, Y. Hsu, R. Feris, and M.T. Sun. Fine registration of 3D point clouds fusing structural and photometric information using an RGB-D camera. Journal of Visual Communication and Image Representation (JVCI), vol. 32, pp. 194-204, 2015. [pdf] [source code]

J. Xie, R. Feris, S. Yu, and M.T. Sun. Joint Super Resolution and De-noising from a Single Depth Image. IEEE Transactions on Multimedia (TMM), vol. 17, no. 9, pp. 1525-1537, 2015.

R. Feris, B. Siddiquie, J. Petterson, Y. Zhai, A. Datta, L. Brown, and S. Pankanti. Large-Scale Vehicle Detection, Indexing, and Search in Urban Surveillance Videos. IEEE Transactions on Multimedia, 2012. [pdf]

S. Pankanti, L. Brown, J. Connell, A. Datta, Q. Fan, R. Feris, N. Haas, Y. Li, N. Ratha, and H. Thinh. Practical Computer Vision: Example Techniques and Challenges. IBM Journal of Research and Development, 201. [pdf]

Y. Tian, R. Feris, H. Liu, A. Hampapur, and M. Sun. Robust Detection of Abandoned and Removed Objects in Complex Surveillance Videos. IEEE Transactions on Systems, Man, and Cybernetics--Part C: Applications and Reviews, 2010. [pdf]

H. Liu, R. Feris, V. Krueger, and M.T. Sun. Unsupervised Action Classification Using Space-Time Link Analysis. EURASIP Journal on Advances in Signal Processing, 2010.

A. Hampapur et al. Analytics Driven Asset Management. IBM Smart Cities Journal, 2010.

R. Feris, R. Raskar, L. Chen, K. Tan, and M. Turk. Multi-Flash Stereopsis: Depth Edge Preserving Stereo with Small Baseline Illumination. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 30, no. 1, pp. 147-159, 2008. [pdf]

Y. Zana, R. Cesar, R. Feris and M. Turk. Local Approach for Face Verification in Polar Frequency Domain. Image and Vision Computing Journal, vol. 24, no. 8, pp. 904-913, 2006. [pdf]

R. Feris, M. Turk, R. Raskar, and K. Tan. Specular Highlights Detection and Reduction with Multi-Flash Photography. International Journal of the Brazilian Computer Society, vol. 1, no. 12, pp. 35-42, 2006. [pdf]

Y. Chang, C. Hu, R. Feris, and M. Turk. Manifold-Based Analysis of Facial Expressions. Image and Vision Computing Journal, vol. 24, no. 6, pp. 605-614, 2006. [pdf]

R. Raskar, K. Tan, R. Feris, J. Kobler, J. Yu, and M. Turk. Harnessing Real-World Depth Edges with Multi-Flash Imaging. IEEE Computer Graphics and Applications (IEEE CG&A), vol. 25, no. 1, pp. 32-38, January 2005. [pdf]

R. Feris, V. Krueger, and R. Cesar. A Wavelet Subspace Method for Real-time Face Tracking. Journal of Real-time Imaging, vol. 10, pp. 339-350, 2004. [pdf]

R. Raskar, K. Tan, R. Feris, J. Yu, and M. Turk. Non-photorealistic Camera: Depth Edge Detection and Stylized Rendering using Multi-Flash Imaging. ACM Transactions on Graphics (SIGGRAPH 2004), Vol. 23, Issue 3, August 2004. Also accepted in SIGGRAPH Emergent Technologies, 2004. [pdf]

Book Chapters

R. Feris, C. Lampert and D. Parikh. Introduction to Visual Attributes. In Visual Attributes (eds: R. S. Feris, C. Lampert, and D. Parikh), Springer, 2017. [pdf]

H. Liu, R. Feris, and M.T. Sun. Benchmarking Datasets for Human Activity Recognition. Visual Analysis of Humans – Looking at People, Springer 2011. [Book Link] [Slides]

Y. Zhai, R. Feris, A. Hampapur, S. Russo, and S. Pankanti. Parsing Object Events in Heavy Urban Traffic. Object Tracking, Intech, 2011.

H. Liu, M.T. Sun, and R. Feris. Video Activity Recognition. Multimedia Analysis, Processing and Communication, Z. Li, J. Kacprzyk, D. Tao, E. Izquierdo, W. Lin, and H. Wang ed., Springer, 2010. [Book Link]

R. Feris, A. Hampapur, Y. Zhai, R. Bobbitt, L. Brown, D. Vaquero, Y-L. Tian, H. Liu, and M-T. Sun. Case Study: IBM Smart Surveillance System. Intelligent Video Surveillance: Systems and Technology, by Taylor & Francis Group, LLC, 2009. [pdf, Book Link]

Y. Zhai, R. Feris, L. Brown, R. Bobbitt, A. Hampapur, S. Pankanti, Q. Fan, A. Yanagawa, Y. Tian, and S. Velipasalar. Composite Event Detection in Multi-Camera and Multi-Sensor Surveillance Networks. Multi-Camera Networks: Concepts and Applications, by Elsevier, 2009. [Book Link]

Y. Tian, R. Feris, L. Brown, D. Vaquero, Y. Zhai, and A. Hampapur. Multi-Scale People Detection and Motion Analysis for Video Surveillance. Machine Learning for Human Motion Analysis: Theory and Practice, IGI Global, 2009. [Book Link]

D. Vaquero, R. Feris, L. Brown, A. Hampapur, and M. Turk. Attribute-based People Search. Intelligent Video Surveillance: Systems and Technology, by Taylor & Francis Group, LLC, 2009. [Book Link]

Y. Tian, A. Hampapur, L. Brown, R. Feris, M. Lu, A. Senior, C. Shu, and Y. Zhai. Event Detection, Query, and Retrieval for Video Surveillance. Artificial Intelligence for Maximizing Content Based Image Retrieval, 2008. [Book Link]

R. Feris, M. Turk, R. Raskar, K. Tan, and G. Ohashi. Recognition of Isolated Fingerspelling Gestures Using Depth Edges. B. Kisacanin, V. Pavlovic and T. Huang (eds.), Real-time Vision for Human-Computer Interaction, Springer-Verlag, 2005. [pdf, Book Link]

Theses and Reports

A. Arbelle, G. Blackwood, L. Karlinsky, A. Sahoo, J. Shtok, and R. Feris. LATERAL: Learning Automatic, Transfer-Enhanced, and Relation-Aware Labels. Technical Report, DARPA Learning with Less Labels, 2023. [pdf]

M. Merler, N. Ratha, R. Feris, and J. Smith. Diversity in Faces. arXiv 2019. [pdf]

B. Cheng, Y. Wei, R. Feris, J. Xiong, W. Hwu, T. Huang, and H. Shi. Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection. arXiv 2018. [pdf] [code]

R. Feris. Detection and Analysis of Depth Discontinuities with Lighting and Viewpoint Variation. PhD thesis, University of California, Santa Barbara, 2006. [pdf]

R. Feris, J. Gemmell, K. Toyama, and V. Krueger. Facial Feature Detection using a Hierarchical Wavelet Face Database. Microsoft Research Technical Report MSR-TR-2002-05, January 2002. [pdf]

R. Feris. Efficient Real-time Face Tracking in Wavelet Subspace. MSc. thesis, University of Sao Paulo, May 2001. Awarded as the Second Best Computer Science MSc. Thesis in Brazil (Nationwide Competition, 2002). [pdf]

Demos and Posters

D. Joshi, M. Merler, Q. Nguyen, S. Hammer, J. Kent, J. Smith, and R. Feris. IBM High-Five: Highlights from Intelligent Video Engine. ACM Multimedia, 2017. [pdf]

F. Yu, L. Cao, R. Feris, J. Smith, and S. F. Chang. Designing Category-level Attributes for Discriminative Visual Recognition. Greater New York Area Workshop on Multimedia and Vision, 2013. (Best Poster Award)

IBM Centennial Demo – Traffic Analysis Using Video Analytics. Lincoln Center, Manhattan, New York, 2011.

R. Feris, L. Brown, Q. Fan, A. Datta, S. Pankanti, A. Hampapur, Y. Zhai, R. Bobbitt, R. Kjeldsen, M. Lu, C. Shu, S. Russo, and C. Milite. IBM Smart Surveillance System: Searchable Video Analytics for Urban Environments and Retail Stores. CVPR Demo, 2010.

A. Agrawal, V. Branzoi, R. Chellappa, R. Feris, R. Raskar, K. Tan, and M. Turk. Depth Edges in Real-Time Using a Multi-Flash Camera. CVPR Demo, 2005.

R. Raskar, K. Tan, R. Feris, J. Yu, and M. Turk. Non-photorealistic Camera: Automatic Stylization with Multi-Flash Imaging. SIGGRAPH Emergent Technologies, 2004.

R. Feris, R. Raskar, K. Tan, and M. Turk. Specular Reflection Reduction Using a Multi-Flash Camera. SIGGRAPH poster, 2004.

Issued Patents

11,263,488 - System and method for augmenting few-shot object classification with semantic information from multiple sources
11,176,587 - Method, a system, and a computer readable storage medium for auto recommendations for an online shopping cart
11,100,145 - Dialog-based image retrieval with contextual information
11,079,924 - Cognitive graphical control element
10,977,303 - Image retrieval using interactive natural language dialog
10,679,047 - System and method for pose-aware feature learning
10,607,089 - Re-identifying an object in a test image
10,595,101 - Auto-curation and personalization of sports highlights
10,424,342 - Facilitating people search in video surveillance
10,390,986 - Control device for controlling a rigidity of an orthosis and method of controlling a rigidity of an orthosis
10,275,608 - Object-centric video redaction
10,169,661 - Filtering methods for visual object detection
10,163,355 - Dynamic management system, method, and recording medium for cognitive drone-swarms
10,163,042 - Finding missing persons by learning features for person attribute classification based on deep learning
10,089,551 - Self-optimized object detection using online detector selection
10,040,551 - Drone delivery of coffee based on a cognitive state of an individual
9,495,599 - Determination of train presence and motion state in railway environments
9,477,890 - Object detection using limited learned attribute ranges
9,471,852 - User-configurable settings for content obfuscation
9,460,361 - Foreground analysis based on tracking information
9,460,349 - Background understanding in video data
9,443,148 - Visual monitoring of queues using auxiliary devices
9,430,874 - Estimation of object properties in 3D world
9,424,659 - Real time processing of video frames
9,396,548 - Multi-cue object detection and analysis
9,342,594 - Indexing and searching according to attributes of a person
9,330,312 - Multispectral detection of personal attributes for video surveillance
9,330,111 - Hierarchical ranking of facial attributes
9,322,647 - Determining camera height using distributions of object heights and object image heights
9,299,162 - Multi-mode video event indexing
9,280,833 - Topology determination for non-overlapping camera network
9,262,445 - Image ranking based on attribute correlation
9,251,425 - Object retrieval in video data using complementary detectors
9,245,186 - Semantic parsing of objects in video
9,224,049 - Detection of static object on thoroughfare crossings
9,224,046 - Multi-view object detection using appearance model transfer from similar scenes
9,165,375 - Automatically determining field of view overlap among multiple cameras
9,134,399 - Attribute-based person tracking across multiple cameras
9,104,919 - Multi-cue object association
9,082,201 - Surface contamination determination system
9,069,104 - Pathway management using model analysis and forecasting
9,058,669 - Incorporating video meta-data in 3D models
8,948,454 - Boosting object detection performance in videos
8,934,670 - Real time processing of video frames for triggering an alert
8,837,776 - Rule-based combination of a hierarchy of classifiers for occlusion detection
8,824,791 - Color correction for static cameras
8,811,663 - Object detection in crowded scenes
8,774,532 - Calibration of video object classification
8,675,917 - Abandoned object recognition using pedestrian detection
8,620,026 - Video-based detection of multiple object types under varying poses
8,488,881 - Object segmentation at a self-checkout
8,483,481 - Foreground analysis based on tracking information
8,249,301 - Video object classification
8,170,276 - Object detection system based on a pool of adaptive features
8,107,678 - Detection of abandoned and removed objects in a video stream
7,738,725 - Stylized rendering using a multi-flash camera