PapersByTopic | Rogerio Feris

Publications by Topic:

Multimodal Learning (Vision, Speech, Sound, and Language)

Vision and Language

L. Karlinsky, A. Arbelle, A. Daniels, A. Nassar, A. Alfassi, B. Wu, D. Joshi, E. Schwartz, J. Kondic, N. Shabtay, P. Li, R. Herzig, S. Abedin, S. Perek, S. Harary, U. Barzelay, A. Goldfarb, A. Oliva, B. Wieles, B. Bhattacharjee, B. Huang, C. Auer, D. Gutfreund, D. Beymer, D. Wood, H. Kuehne, J. Hansen, J. Shtok, K. Wong, L. Bathen, M. Mishra, M. Lysak, M. Dolfi, M. Yurochkin, N. Livathinos, N. Harel, O. Azulai, O. Naparstek, R. Teixeira de Lima, R. Panda, S. Doveh, S. Gupta, S. Das, S. Zawad, Y. Kim, Z. He, A. Brooks, G. Goodhart, A. Govindjee, D. Leist, I. Ibrahim, A. Soffer, D. Cox, K. Soule, L. Lastras, N. Desai, S. Ofek-koifman, S. Raghavan, T. Syeda-Mahmood, P. Staar, T. Drory, and R. Feris. Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence. IBM Technical Report, 2025. [pdf]

I. Huang, W. Lin, M. Mirza, J. Hansen, S. Doveh, V. Butoi, R. Herzig, A. Arbelle, H. Kuehne, T. Darrell, C. Gan, A. Oliva, R. Feris, and L. Karlinsky. ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs. Conference on Neural Information Processing Systems (NeurIPS 2024). [pdf]

B. Chen, N. Shvetsova, A. Rouditchenko, D. Kondermann, S. Thomas, S. Chang, R. Feris, J. Glass, and H. Kuehne. What, when, and where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. Conference on Computer Vision and Pattern Recognition (CVPR 2024). [pdf]

B. Pan, R. Panda, S. Jin, R. Feris, A. Oliva, P. Isola, and Y. Kim. LangNav: Language as a Perceptual Representation for Navigation. North American Chapter of the Association for Computational Linguistics (NAACL 2024). [pdf]

S. Doveh, A. Arbelle, S. Harary, R. Herzig, D. Kim, P. Cascante-Bonilla, A. Alfassy, R. Panda, R. Giryes, R. Feris, S. Ullman, and L. Karlinsky. Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models. Conference on Neural Information Processing Systems (NeurIPS 2023, Spotlight). [pdf]

M. Mirza, L. Karlinsky, W. Lin, H. Possegger, M. Kozinski, R. Feris, and H. Bischof. LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections. Conference on Neural Information Processing Systems (NeurIPS 2023). [pdf]

R. Herzig, A. Mendelson, L. Karlinsky, A. Arbelle, R. Feris, T. Darrell, A. Globerson. Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs. Conference on Empirical Methods in Natural Language Processing (EMNLP 2023). [pdf]

P. Cascante-Bonilla, K. Shehada, J. Smith, S. Doveh, D. Kim, R. Panda, G. Varol, A. Oliva, V. Ordonez, R. Feris, and L. Karlinsky. Going Beyond Nouns With Vision & Language Models Using Synthetic Data. International Conference on Computer Vision (ICCV 2023) [pdf]

W. Lin, L. Karlinsky, N. Shvetsova, H. Possegger, M. Kozinski, R. Panda, R. Feris, H. Kuehne, and H. Bischof. MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge. International Conference on Computer Vision (ICCV 2023) [pdf]

J. Smith, P. Cascante-Bonilla, A. Arbelle, D. Kim, R. Panda, D. Cox, D. Yang, Z. Kira, R. Feris, and L. Karlinsky. ConStruct-VL: Data-Free Continual Structured VL Concepts Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2023). [pdf]

S. Doveh, A. Arbelle, S. Harary, R. Panda, R. Herzig, E. Schwartz, R. Giryes, R. Feris, S. Ullman, and L. Karlinsky. Teaching Structured Vision & Language Concepts to Vision & Language Models. Conference on Computer Vision and Pattern Recognition (CVPR 2023). [pdf]

A. Rouditchenko, Y. Chuang, N. Shvetsova, S. Thomas, R. Feris, B. Kingsbury, L. Karlinsky, D. Harwath, H. Kuehne, and J. Glass. C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023). [pdf]

A. Alfassy, A. Arbelle, O. Halimi, S. Harary, R. Herzig, E. Schwartz, R. Panda, M. Dolfi, C. Auer, P. Staar, K. Saenko, R. Feris, L. Karlinsky. FETA: Towards Specializing Foundational Models for Expert Task Applications. Conference on Neural Information Processing Systems (NeurIPS 2022 Dataset Track). [pdf]

P. Cascante-Bonilla, H. Wu, L. Wang, R. Feris, and V. Ordonez. SimVQA: Exploring Simulated Environments for Visual Question Answering. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

Y. Li, R. Panda, Y. Kim, C. Chen, R. Feris, D. Cox, and N. Vasconcelos. VALHALLA: Visual Hallucination for Machine Translation. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Levi, P. Sattigeri, R. Panda, R. Chen, A. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, and L. Karlinsky. Detector-Free Weakly Supervised Grounding by Separation. International Conference on Computer Vision (ICCV 2021, Oral). [pdf]

H. Wu, Y. Gao, X. Guo, Z. Al-Halah, S. Rennie, K. Grauman, and R. Feris. Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

S. Whitehead, H. Wu, H. Ji, R. Feris, and K. Saenko. Separating Skills and Concepts for Novel Visual Question Answering. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

M. Jaiswal et al. Video-Text Compliance: Activity Verification Based on Natural Language Instructions. ICCV Workshop on Large Scale Holistic Video Understanding, 2019. [pdf]

X. Guo*, H. Wu*, Y. Cheng, S. Rennie, G. Tesauro, and R. Feris. Dialog-based Interactive Image Retrieval. Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada, 2018. [pdf] [video demo] (*equal contribution)

Audio-Visual Learning

E. Araujo, A. Rouditchenko, Y. Gong, S. Bhati, S. Thomas, B. Kingsbury, L. Karlinsky, R. Feris, J. Glass, and H. Kuehne. CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment. Conference on Computer Vision and Pattern Recognition (CVPR 2025). [pdf]

A. Rouditchenko, S. Thomas, H. Kuehne, R. Feris, and J. Glass. mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition. IEEE Signal Processing Letters, 2025. [pdf]

A. Rouditchenko, Y. Gong, S. Thomas, L. Karlinsky, H. Kuehne, R. Feris, and J. Glass. Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation (Interspeech 2024). [pdf]

S. Bhati, Y. Gong, L. Karlinsky, H. Kuehne, R. Feris, and J. Glass. DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners. IEEE Spoken Language Technology Workshop, 2024. [pdf]

A. Rouditchenko, S. Khurana, S. Thomas, R. Feris, L. Karlinsky, H. Kuehne, D. Harwath, B. Kingsbury, and J. Glass. Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages (Interspeech 2023). [pdf]

N. Shvetsova, B. Chen, A. Rouditchenko, S. Thomas, B. Kingsbury, R. Feris, D. Harwath, J. Glass, and H. Kuehne. Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

R. Panda, R. Chen, Q. Fan, X. Sun, K. Saenko, A. Oliva, and R. Feris. AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition. International Conference on Computer Vision (ICCV 2021). [pdf]

B. Chen, A. Rouditchenko, K. Duarte, H. Kuehne, S. Thomas, A. Boggust, R. Panda, B. Kingsbury, R. Feris, D. Harwath, J. Glass, M. Picheny, and S. F. Chang. Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. International Conference on Computer Vision (ICCV 2021). [pdf]

A. Rouditchenko, A. Boggust, D. Harwath, B. Chen, D. Joshi, S. Thomas, K. Audhkhasi, H. Kuehne, R. Panda, R. Feris, B. Kingsbury, M. Picheny, A. Torralba, and J. Glass. AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. Interspeech 2021 [pdf]

A. Rouditchenko, A. Boggust, D. Harwath, S. Thomas, H. Kuehne, B. Chen, R. Panda, R. Feris, B. Kingsbury, M. Picheny and J. Glass. Cascaded Multilingual Audio-Visual Learning from Videos. Interspeech 2021

M. Monfort, S. Jin, D. Harwath, R. Feris, J. Glass, and A. Oliva. Spoken Moments: A Large Scale Dataset of Audio Descriptions of Dynamic Events in Video. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

A. Boggust, K. Audhkhasi, D. Joshi, D. Harwath, S. Thomas, R. Feris, D. Gutfreund, Y. Zhang, A. Torralba, M. Picheny, and J. Glass. Grounding Spoken Words in Unlabeled Video. CVPR Workshop on Sight and Sound, 2019. [pdf]

M. Merler, D. Joshi, Q. Nguyen, S. Hammer, J. Kent, J. Xiong, M. Do, J. Smith, and R. Feris. Automatic Curation of Sports Highlights using Multimodal Excitement Features. IEEE Transactions on MultiMedia (TMM) 2019. [pdf]

R. Gao, R. Feris and K. Grauman. Learning to Separate Object Sounds by Watching Unlabeled Video. European Conference on Computer Vision (ECCV 2018, Oral), Munich, Germany, 2018. [pdf] [project page]

M. Merler, D. Joshi, K. Mac, Q. Nguyen, S. Hammer, J. Kent, J. Xiong, M. Do, J. Smith, and R. Feris. The Excitement of Sports: Automatic Highlights using Audio-Visual Cues. CVPR Workshop on Sight and Sound, 2018. [pdf]

M. Merler, D. Joshi, Q. Nguyen, S. Hammer, J. Kent, J. R. Smith, and R. Feris. Automatic Curation of Golf Highlights using Multimodal Excitement Features. Workshop of Computer Vision in Sports (in conjunction with CVPR), 2017. [pdf]

Multimodal Fashion Image Analysis

H. Wu, Y. Gao, X. Guo, Z. Al-Halah, S. Rennie, K. Grauman, and R. Feris. Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

X. Guo*, H. Wu*, Y. Cheng, S. Rennie, G. Tesauro, and R. Feris. Dialog-based Interactive Image Retrieval. Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada, 2018. [pdf] [video demo] (*equal contribution)

J. Huang, R. Feris, Q. Chen, and S. Yan. Cross-domain Image Retrieval with a Dual Attribute-aware Ranking Network. IEEE International Conference on Computer Vision (ICCV 2015), Santiago, Chile, December 2015. [pdf] [data]

Q. Chen, J. Huang, R. Feris, L. Brown, J. Dong, and S. Yan. Deep Domain Adaptation for Describing People Based on Fine-Grained Clothing Attributes. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), Boston, Massachusetts, June 2015. [pdf]

Egocentric Video + Geo-location + Weather

J. Wang, Y. Cheng, and R. Feris. Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016, Oral), Las Vegas, Nevada, June 2016. [pdf]

Data Efficiency: Learning More From Less

Pre-training and Transfer from Synthetic Data

R. Wang, S. Ghosh, D. Cox, D. Antognini, A. Oliva, R. Feris, and L. Karlinsky. Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning. Conference on Neural Information Processing Systems (NeurIPS 2024). [pdf]

J. Kang, H. Luo, Y. Zhu, J. Hansen, J. Glass, D. Cox, A. Ritter, R. Feris, L. Karlinsky. Self-Specialization: Uncovering Latent Expertise within Large Language Models (ACL 2024, Findings). [pdf]

H. Zhong, S. Mishra, D. Kim, S. Jin, R. Panda, H. Kuehne, L. Karlinsky, V. Saligrama, A. Oliva, and R. Feris. Learning Human Action Recognition Representations Without Real Humans. Conference on Neural Information Processing Systems (NeurIPS 2023 Datasets Track) [pdf]

Z. He, G. Blackwood, R. Panda, J. McAuley, and R. Feris. Synthetic Pre-trained Tasks for Neural Machine Translation (ACL 2023, Findings). [pdf]

P. Cascante-Bonilla, K. Shehada, J. Smith, S. Doveh, D. Kim, R. Panda, G. Varol, A. Oliva, V. Ordonez, R. Feris, and L. Karlinsky. Going Beyond Nouns With Vision & Language Models Using Synthetic Data. International Conference on Computer Vision (ICCV 2023) [pdf]

M. Baradad, R. Chen, J. Wulff, T. Wang, R. Feris, A. Torralba, and P. Isola. Procedural Image Programs for Representation Learning. Conference on Neural Information Processing Systems (NeurIPS 2022). [pdf]

Y. Kim, S. Mishra, S. Jin, R. Panda, H. Kuehne, L. Karlinsky, V. Saligrama, K. Saenko, A. Oliva, and R. Feris. How Transferable are Video Representations Based on Synthetic Data? Conference on Neural Information Processing Systems (NeurIPS 2022 Dataset Track). [pdf]

S. Mishra, R. Panda, C. Phoo, C. Chen, L. Karlinsky, K. Saenko, V. Saligrama, and R. Feris. Task2Sim: Towards Effective Pre-training and Transfer from Synthetic Data. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

P. Cascante-Bonilla, H. Wu, L. Wang, R. Feris, and V. Ordonez. SimVQA: Exploring Simulated Environments for Visual Question Answering. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

Y. Li, R. Panda, Y. Kim, C. Chen, R. Feris, D. Cox, and N. Vasconcelos. VALHALLA: Visual Hallucination for Machine Translation. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

Transfer, Multi-Task, and Continual Learning

J. Smith, L. Valkov, S. Halbe, V. Gutta, R. Feris, Z. Kira, and L. Karlinsky. Adaptive Memory Replay for Continual Learning. CVPR Workshop on Efficient Large Vision Models, 2024. [pdf]

A. Arbelle, G. Blackwood, L. Karlinsky, A. Sahoo, J. Shtok, and R. Feris. LATERAL: Learning Automatic, Transfer-Enhanced, and Relation-Aware Labels. Technical Report, DARPA Learning with Less Labels, 2023. [pdf]

J. Smith, L. Karlinsky, V. Gutta, P. Cascante-Bonilla, D. Kim, A. Arbelle, R. Panda, R. Feris, and Z. Kira. CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2023). [pdf]

J. Smith, P. Cascante-Bonilla, A. Arbelle, D. Kim, R. Panda, D. Cox, D. Yang, Z. Kira, R. Feris, and L. Karlinsky. ConStruct-VL: Data-Free Continual Structured VL Concepts Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2023). [pdf]

Z. Wang, R. Panda, L. Karlinsky, R. Feris, H. Sun, and Y. Kim. Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning. International Conference on Learning Representations (ICLR 2023). [pdf]

A. Alfassy, A. Arbelle, O. Halimi, S. Harary, R. Herzig, E. Schwartz, R. Panda, M. Dolfi, C. Auer, P. Staar, K. Saenko, R. Feris, L. Karlinsky. FETA: Towards Specializing Foundational Models for Expert Task Applications. Conference on Neural Information Processing Systems (NeurIPS 2022 Dataset Track). [pdf]

A. Islam, R. Chen, R. Panda, L. Karlinsky, R. Radke, and R. Feris. A Broad Study on the Transferability of Visual Representations with Contrastive Learning. International Conference on Computer Vision (ICCV 2021). [pdf]

R. Panda, M. Merler, M. Jaiswal, H. Wu, K. Ramakrishnan, U. Finkler, C. Chen, M. Cho, R. Feris, D. Kung, and B. Bhattacharjee. NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search. AAAI Conference on Artificial Intelligence (AAAI 2021). [pdf]

X. Sun, R. Panda, R. Feris, and K. Saenko. AdaShare: Learning What to Share for Efficient Deep Multi-Task Learning. Conference on Neural Information Processing Systems (NeurIPS 2020). [pdf]

Y. Guo, N. Codella, L. Karlinsky, J. Codella, J. Smith, K. Saenko, T. Rosing, and R. Feris. A Broader Study of Cross-Domain Few-Shot Learning. European Conference on Computer Vision (ECCV 2020). [pdf]

K. Ramakrishnan, R. Panda, Q. Fan, J. Henning, A. Oliva, and R. Feris. Relationship Matters: Relation Guided Knowledge Transfer for Incremental Learning of Object Detectors. CVPR Workshop on Continual Learning in Computer Vision, 2020. [pdf]

Y. Guo, H. Shi, A. Kumar, K. Grauman, T. Rosing, and R. Feris. SpotTune: Transfer Learning through Adaptive Fine-tuning. Conference on Computer Vision and Pattern Recognition (CVPR 2019), Long Beach, California, 2019. [pdf]

Y. Lu, A. Kumar, S. Zhai, Y. Cheng, T. Javidi, and R. Feris. Fully Adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017, Spotlight), Honolulu, Hawaii, July 2017. [pdf]

Self-Supervised Representation Learning

T. Li, L. Fan, Y. Yuan, H. He, Y. Tian, R. Feris, P. Indyk, and D. Katabi. Addressing Feature Suppression in Unsupervised Visual Representations. IEEE Winter Conference on Applications of Computer Vision (WACV 2023). [pdf]

M. Baradad, R. Chen, J. Wulff, T. Wang, R. Feris, A. Torralba, and P. Isola. Procedural Image Programs for Representation Learning. Conference on Neural Information Processing Systems (NeurIPS 2022). [pdf]

S. Harary, E. Schwartz, A. Arbelle, P. Staar, S. Abu-Hussein, E. Amrani, R. Herzig, A. Alfassy, R. Giryes, H. Kuehne, D. Katabi, K. Saenko, R. Feris, and L. Karlinsky. Unsupervised Domain Generalization by Learning a Bridge Across Domains. Conference on Computer Vision and Pattern Recognition (CVPR 2022, Oral). [pdf]

N. Shvetsova, B. Chen, A. Rouditchenko, S. Thomas, B. Kingsbury, R. Feris, D. Harwath, J. Glass, and H. Kuehne. Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

T. Li, P. Cao, Y. Yuan, L. Fan, Y. Yang, R. Feris, P. Indyk, and D. Katabi. Targeted Supervised Contrastive Learning for Long-Tailed Recognition. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

A. Islam, C. Chen, R. Panda, L. Karlinsky, R. Feris, and R. Radke. Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data. Conference on Neural Information Processing Systems (NeurIPS 2021). [pdf]

A. Islam, R. Chen, R. Panda, L. Karlinsky, R. Radke, and R. Feris. A Broad Study on the Transferability of Visual Representations with Contrastive Learning. International Conference on Computer Vision (ICCV 2021). [pdf]

B. Chen, A. Rouditchenko, K. Duarte, H. Kuehne, S. Thomas, A. Boggust, R. Panda, B. Kingsbury, R. Feris, D. Harwath, J. Glass, M. Picheny, and S. F. Chang. Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. International Conference on Computer Vision (ICCV 2021). [pdf]

A. Rouditchenko, A. Boggust, D. Harwath, B. Chen, D. Joshi, S. Thomas, K. Audhkhasi, H. Kuehne, R. Panda, R. Feris, B. Kingsbury, M. Picheny, A. Torralba, and J. Glass. AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. Interspeech 2021 [pdf]

G. Bukchin, E. Schwartz, K. Saenko, O. Shahar, R. Feris, R. Giryes, and L. Karlinsky. Fine-grained Angular Contrastive Learning with Coarse Labels. Conference on Computer Vision and Pattern Recognition (CVPR 2021, Oral). [pdf]

A. Singh, O. Chakraborty, A. Varshney, R. Panda, R. Feris, K. Saenko, and A. Das. Semi-Supervised Action Recognition with Temporal Contrastive Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

Few-shot Learning

E. Schwartz, L. Karlinsky, R. Feris, R. Giryes, and A. Bronstein. Baby steps towards few-shot learning with multiple semantics. Pattern Recognition Letters, 2022. [pdf]

A. Islam, C. Chen, R. Panda, L. Karlinsky, R. Feris, and R. Radke. Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data. Conference on Neural Information Processing Systems (NeurIPS 2021). [pdf]

L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. Bronstein, and R. Giryes. StarNet: towards Weakly Supervised Few-Shot Object Detection. AAAI Conference on Artificial Intelligence (AAAI 2021). [pdf]

S. Doveh, E. Schwartz, C. Xue, R. Feris, A. Bronstein, R. Giryes, and L. Karlinsky. MetAdapt: Meta-Learned Task-Adaptive Architecture for Few-Shot Classification. Pattern Recognition Letters, 2021. [pdf]

Y. Guo, N. Codella, L. Karlinsky, J. Codella, J. Smith, K. Saenko, T. Rosing, and R. Feris. A Broader Study of Cross-Domain Few-Shot Learning. European Conference on Computer Vision (ECCV 2020). [pdf]

M. Lichtenstein, P. Sattigeri, R. Feris, R. Giryes, and L. Karlinsky. TAFSSL: Task-Adaptive Feature Sub-Space Learning for Few-shot Classification. European Conference on Computer Vision (ECCV 2020). [pdf]

A. Alfassy, L. Karlinsky, A. Aides, J. Shtok, S. Harary, R. Feris, R. Giryes, and A. Bronstein. LaSO: Label-Set Operations Networks for Multi-label Few-shot Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2019, Oral), Long Beach, California, 2019. [pdf]

L. Karlinsky, J. Shtok, S. Harary, E. Schwartz, A. Aides, R. Feris, R. Giryes, and A. Bronstein. RepMet: Representative-based Metric Learning for Classification and One-shot Object Detection. Conference on Computer Vision and Pattern Recognition (CVPR 2019), Long Beach, California, 2019. [pdf]

E. Schwartz*, L. Karlinsky*, J. Shtok, S. Harary, M. Marder, A. Kumar, R. Feris, R. Giryes, and A. Bronstein. Delta-Encoder: an Effective Sample Synthesis Method for Few-shot Object Recognition. Neural Information Processing Systems (NeurIPS 2018, Spotlight), Montreal, Canada, 2018. [pdf] (* equal contribution)

Domain Adaptation and Generalization

A. Sahoo, R. Panda, R. Feris, K. Saenko, and A. Das. Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation. IEEE Winter Conference on Applications of Computer Vision (WACV 2023, Best Paper Award Honorable Mention). [pdf]

S. Harary, E. Schwartz, A. Arbelle, P. Staar, S. Abu-Hussein, E. Amrani, R. Herzig, A. Alfassy, R. Giryes, H. Kuehne, D. Katabi, K. Saenko, R. Feris, and L. Karlinsky. Unsupervised Domain Generalization by Learning a Bridge Across Domains. Conference on Computer Vision and Pattern Recognition (CVPR 2022, Oral). [pdf]

Z. Wang, M. Yu, Y. Wei, R. Feris, J. Xiong, W. Hwu, T. Huang, and H. Shi. Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation. Conference on Computer Vision and Pattern Recognition (CVPR 2020). [pdf]

A. Kumar, P. Sattigeri, K. Wadhawan, L. Karlinsky, R. Feris, W. T. Freeman, and G. Wornell. Co-regularized Alignment for Unsupervised Domain Adaptation. Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada, 2018. [pdf]

F. Rashed, B. Siddiquie, R. Feris, and L. Davis. Domain Adaptive Object Detection. Winter Conference on Applications of Computer Vision (WACV 2013), Florida, USA, 2013. [pdf]

Generative Data Augmentation

Z. Tang, Y. Gao, P. Sattigeri, L. Karlinsky, R. Feris, and D. Metaxas. OnlineAugment: Online Data Augmentation with Less Domain Knowledge. European Conference on Computer Vision (ECCV 2020). [pdf]

A. Sahoo, A. Singh, R. Panda, R. Feris, and A. Das. Mitigating Dataset Imbalance via Joint Generation and Classification. ECCV Workshop on Imbalance Problems in Computer Vision, 2020. [pdf]

A. Alfassy, L. Karlinsky, A. Aides, J. Shtok, S. Harary, R. Feris, R. Giryes, and A. Bronstein. LaSO: Label-Set Operations Networks for Multi-label Few-shot Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2019, Oral), Long Beach, California, 2019. [pdf]

E. Schwartz*, L. Karlinsky*, J. Shtok, S. Harary, M. Marder, A. Kumar, R. Feris, R. Giryes, and A. Bronstein. Delta-Encoder: an Effective Sample Synthesis Method for Few-shot Object Recognition. Neural Information Processing Systems (NeurIPS 2018, Spotlight), Montreal, Canada, 2018. [pdf] (* equal contribution)

X. Peng, Z. Tang, F. Yang, R. Feris, and D. Metaxas. Jointly optimize data augmentation and network training: Adversarial data augmentation in human pose estimation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, Utah, June 2018. [pdf]

S. Zhai, H. Wu, A. Kumar, Y. Cheng, Y. Lu, Z. Zhang, and R. Feris. S3Pool: Pooling with Stochastic Spatial Sampling. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Honolulu, Hawaii, July 2017. [pdf] [code]

Model Efficiency: Dynamic Neural Networks and Beyond

Dynamic Neural Networks

X. Sun, R. Panda, C. Chen, N. Wang, B. Pan, K. Gopalakrishnan, A. Oliva, R. Feris, and K. Saenko. Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths. IEEE Winter Conference on Applications of Computer Vision (WACV 2024). [pdf]

B. Pan, Y. Jiang, R. Panda, Z. Wang, R. Feris, and A. Oliva. IA-RED^2: Interpretability-Aware Redundancy Reduction for Vision Transformers. Conference on Neural Information Processing Systems (NeurIPS 2021). [pdf]

R. Panda, R. Chen, Q. Fan, X. Sun, K. Saenko, A. Oliva, and R. Feris. AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition. International Conference on Computer Vision (ICCV 2021). [pdf]

X. Sun, R. Panda, R. Chen, A. Oliva, R. Feris, and K. Saenko. Dynamic Network Quantization for Efficient Video Inference. International Conference on Computer Vision (ICCV 2021). [pdf]

B. Pan, R. Panda, C. Fosco, C. Lin, A. Andonian, Y. Meng, K. Saenko, A. Oliva, and R. Feris. VA-RED^2: Video Adaptive Redundancy Reduction. International Conference on Learning Representations (ICLR 2021). [pdf]

Y. Meng, R. Panda, C. Lin, P. Sattigeri, L. Karlinsky, K. Saenko, A. Oliva, and R. Feris. AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition. International Conference on Learning Representations (ICLR 2021). [pdf]

X. Sun, R. Panda, R. Feris, and K. Saenko. AdaShare: Learning What to Share for Efficient Deep Multi-Task Learning. Conference on Neural Information Processing Systems (NeurIPS 2020). [pdf]

Y. Meng, C. Lin, R. Panda, P. Sattigeri, L. Karlinsky, K. Saenko, A. Oliva, and R. Feris. AR-Net: Adaptive Frame Resolution for Efficient Action Recognition. European Conference on Computer Vision (ECCV 2020). [pdf]

Y. Guo, H. Shi, A. Kumar, K. Grauman, T. Rosing, and R. Feris. SpotTune: Transfer Learning through Adaptive Fine-tuning. Conference on Computer Vision and Pattern Recognition (CVPR 2019), Long Beach, California, 2019. [pdf]

Z. Wu*, T. Nagarajan*, A. Kumar, S. Rennie, L. Davis, K. Grauman, and R. Feris. BlockDrop: Dynamic Inference Paths in Residual Networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018, Spotlight), Salt Lake City, Utah, June 2018. [pdf] (* equal contribution)

Efficient Architectures

P. Wang, R. Panda, L. Hennigen, P. Greengard, L. Karlinsky, R. Feris, D. Cox, Z. Wang, and Y. Kim. Learning to Grow Pretrained Models for Efficient Transformer Training. International Conference on Learning Representations (ICLR 2023, notable-top-25%). [pdf]

Z. Wang, R. Panda, L. Karlinsky, R. Feris, H. Sun, and Y. Kim. Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning. International Conference on Learning Representations (ICLR 2023). [pdf]

C. Chen, Q. Fan, N. Mallinar, T. Sercu, and R. Feris. Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition. International Conference on Learning Representations (ICLR 2019), New Orleans, Louisiana, 2019. [pdf]

Y. Lu, A. Kumar, S. Zhai, Y. Cheng, T. Javidi, and R. Feris. Fully Adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017, Spotlight), Honolulu, Hawaii, July 2017. [pdf]

Z. Cai, Q. Fan, R. Feris, and N. Vasconcelos. A Unified Multi Scale Deep Convolutional Neural Network for Fast Object Detection. European Conference on Computer Vision (ECCV 2016), Amsterdam, Netherlands, 2016. [pdf] [code] [demo] [KITTI results]

Y. Cheng, F. Yu, R. Feris, S. Kumar, A. Choudhary, and S. F. Chang. An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections. IEEE International Conference on Computer Vision (ICCV 2015), Santiago, Chile, December 2015. [pdf] [code]

Natural Language Processing

Large Language Models: Memory, Long-context Modeling, and Self-specialization

Y. Wang, D. Krotov, Y. Hu, Y. Gao, W. Zhou, J. McAuley, D. Gutfreund, R. Feris, and Z. He. M+: Extending MemoryLLM with Scalable Long-Term Memory. International Conference on Machine Learning (ICML 2025). [pdf]

J. Kang, L. Karlinsky, H. Luo, Z. Wang, J. Hansen, J. Glass, D. Cox, R. Panda, R. Feris, and A. Ritter. Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts. International Conference on Learning Representations (ICLR 2025). [pdf]

R. Wang, S. Ghosh, D. Cox, D. Antognini, A. Oliva, R. Feris, and L. Karlinsky. Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning. Conference on Neural Information Processing Systems (NeurIPS 2024). [pdf]

Z. He, L. Karlinsky, D. Kim, J. McAuley, D. Krotov, and R. Feris. CAMELoT: Towards Large Language Models with Training-Free Associative Memory. ICML 2024 Workshop on Long-Context Foundation Models, 2024. [pdf]

M. Stallone et al. Scaling Granite Code Models to 128K Context. IBM Technical Report, 2024. [pdf]

J. Li, S. Das, A. Oliva, D. Krotov, L. Karlinsky, and R. Feris. Long Context Understanding using Self-Generated Synthetic Data. ICML 2024 Workshop on Long-Context Foundation Models, 2024. [pdf]

J. Kang, H. Luo, Y. Zhu, J. Hansen, J. Glass, D. Cox, A. Ritter, R. Feris, L. Karlinsky. Self-Specialization: Uncovering Latent Expertise within Large Language Models (ACL 2024, Findings). [pdf]

Data and Model Efficiency in NLP

Z. He, G. Blackwood, R. Panda, J. McAuley, and R. Feris. Synthetic Pre-trained Tasks for Neural Machine Translation (ACL 2023, Findings). [pdf]

P. Wang, R. Panda, L. Hennigen, P. Greengard, L. Karlinsky, R. Feris, D. Cox, Z. Wang, and Y. Kim. Learning to Grow Pretrained Models for Efficient Transformer Training. International Conference on Learning Representations (ICLR 2023, notable-top-25%). [pdf]

Z. Wang, R. Panda, L. Karlinsky, R. Feris, H. Sun, and Y. Kim. Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning. International Conference on Learning Representations (ICLR 2023). [pdf]

Multimodal NLP

R. Herzig, A. Mendelson, L. Karlinsky, A. Arbelle, R. Feris, T. Darrell, A. Globerson. Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs. Conference on Empirical Methods in Natural Language Processing (EMNLP 2023). [pdf]

S. Doveh, A. Arbelle, S. Harary, R. Herzig, D. Kim, P. Cascante-Bonilla, A. Alfassy, R. Panda, R. Giryes, R. Feris, S. Ullman, and L. Karlinsky. Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models. Conference on Neural Information Processing Systems (NeurIPS 2023, Spotlight). [pdf]

S. Doveh, A. Arbelle, S. Harary, R. Panda, R. Herzig, E. Schwartz, R. Giryes, R. Feris, S. Ullman, and L. Karlinsky. Teaching Structured Vision & Language Concepts to Vision & Language Models. Conference on Computer Vision and Pattern Recognition (CVPR 2023). [pdf]

Y. Li, R. Panda, Y. Kim, C. Chen, R. Feris, D. Cox, and N. Vasconcelos. VALHALLA: Visual Hallucination for Machine Translation. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

A. Alfassy, A. Arbelle, O. Halimi, S. Harary, R. Herzig, E. Schwartz, R. Panda, M. Dolfi, C. Auer, P. Staar, K. Saenko, R. Feris, L. Karlinsky. FETA: Towards Specializing Foundational Models for Expert Task Applications. Conference on Neural Information Processing Systems (NeurIPS 2022 Dataset Track). [pdf]

P. Cascante-Bonilla, H. Wu, L. Wang, R. Feris, and V. Ordonez. SimVQA: Exploring Simulated Environments for Visual Question Answering. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Levi, P. Sattigeri, R. Panda, R. Chen, A. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, and L. Karlinsky. Detector-Free Weakly Supervised Grounding by Separation. International Conference on Computer Vision (ICCV 2021, Oral). [pdf]

H. Wu, Y. Gao, X. Guo, Z. Al-Halah, S. Rennie, K. Grauman, and R. Feris. Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

S. Whitehead, H. Wu, H. Ji, R. Feris, and K. Saenko. Separating Skills and Concepts for Novel Visual Question Answering. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

S. Whitehead, H. Wu, Y. Fung, H. Ji, R. Feris, and K. Saenko. Learning from Lexical Perturbations for Consistent Visual Question Answering (arXiv 2020). [pdf]

M. Jaiswal et al. Video-Text Compliance: Activity Verification Based on Natural Language Instructions. ICCV Workshop on Large Scale Holistic Video Understanding, 2019. [pdf]

X. Guo*, H. Wu*, Y. Cheng, S. Rennie, G. Tesauro, and R. Feris. Dialog-based Interactive Image Retrieval. Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada, 2018. [pdf] [video demo] (*equal contribution)

Video Understanding, Action Recognition, and Tracking

Action Recognition and Multimodal Video Understanding

E. Araujo, A. Rouditchenko, Y. Gong, S. Bhati, S. Thomas, B. Kingsbury, L. Karlinsky, R. Feris, J. Glass, and H. Kuehne. CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment. Conference on Computer Vision and Pattern Recognition (CVPR 2025). [pdf]

B. Chen, N. Shvetsova, A. Rouditchenko, D. Kondermann, S. Thomas, S. Chang, R. Feris, J. Glass, and H. Kuehne. What, when, and where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. Conference on Computer Vision and Pattern Recognition (CVPR 2024). [pdf]

A. Baughman, S. Hammer, R. Agarwal, R. Feris, G. Akay, E. Morales, L. Karlinsky, and T. Johnson. Large Scale Generative AI Text Applied to Sports and Music. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024). [pdf]

H. Zhong, S. Mishra, D. Kim, S. Jin, R. Panda, H. Kuehne, L. Karlinsky, V. Saligrama, A. Oliva, and R. Feris. Learning Human Action Recognition Representations Without Real Humans. Conference on Neural Information Processing Systems (NeurIPS 2023 Datasets Track). [pdf]

W. Lin, L. Karlinsky, N. Shvetsova, H. Possegger, M. Kozinski, R. Panda, R. Feris, H. Kuehne, and H. Bischof. MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge. International Conference on Computer Vision (ICCV 2023). [pdf]

Y. Kim, S. Mishra, S. Jin, R. Panda, H. Kuehne, L. Karlinsky, V. Saligrama, K. Saenko, A. Oliva, and R. Feris. How Transferable are Video Representations Based on Synthetic Data? Conference on Neural Information Processing Systems (NeurIPS 2022 Dataset Track). [pdf]

N. Shvetsova, B. Chen, A. Rouditchenko, S. Thomas, B. Kingsbury, R. Feris, D. Harwath, J. Glass, and H. Kuehne. Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval. Conference on Computer Vision and Pattern Recognition (CVPR 2022). [pdf]

M. Monfort, K. Ramakrishnan, A. Andonian, B. McNamara, A. Lascelles, B. Pan, Q. Fan, D. Gutfreund, R. Feris, and A. Oliva. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. [pdf]

B. Pan, Y. Jiang, R. Panda, Z. Wang, R. Feris, and A. Oliva. IA-RED^2: Interpretability-Aware Redundancy Reduction for Vision Transformers. Conference on Neural Information Processing Systems (NeurIPS 2021). [pdf]

R. Panda, R. Chen, Q. Fan, X. Sun, K. Saenko, A. Oliva, and R. Feris. AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition. International Conference on Computer Vision (ICCV 2021). [pdf]

X. Sun, R. Panda, R. Chen, A. Oliva, R. Feris, and K. Saenko. Dynamic Network Quantization for Efficient Video Inference. International Conference on Computer Vision (ICCV 2021). [pdf]

B. Chen, A. Rouditchenko, K. Duarte, H. Kuehne, S. Thomas, A. Boggust, R. Panda, B. Kingsbury, R. Feris, D. Harwath, J. Glass, M. Picheny, and S. F. Chang. Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. International Conference on Computer Vision (ICCV 2021). [pdf]

A. Rouditchenko, A. Boggust, D. Harwath, B. Chen, D. Joshi, S. Thomas, K. Audhkhasi, H. Kuehne, R. Panda, R. Feris, B. Kingsbury, M. Picheny, A. Torralba, and J. Glass. AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. Interspeech 2021. [pdf]

A. Rouditchenko, A. Boggust, D. Harwath, S. Thomas, H. Kuehne, B. Chen, R. Panda, R. Feris, B. Kingsbury, M. Picheny, and J. Glass. Cascaded Multilingual Audio-Visual Learning from Videos. Interspeech 2021.

M. Monfort, S. Jin, D. Harwath, R. Feris, J. Glass, and A. Oliva. Spoken Moments: A Large Scale Dataset of Audio Descriptions of Dynamic Events in Video. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

A. Singh, O. Chakraborty, A. Varshney, R. Panda, R. Feris, K. Saenko, and A. Das. Semi-Supervised Action Recognition with Temporal Contrastive Learning. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

C. Chen, R. Panda, K. Ramakrishnan, R. Feris, J. Cohn, A. Oliva, and Q. Fan. Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

B. Pan, R. Panda, C. Fosco, C. Lin, A. Andonian, Y. Meng, K. Saenko, A. Oliva, and R. Feris. VA-RED^2: Video Adaptive Redundancy Reduction. International Conference on Learning Representations (ICLR 2021). [pdf]

Y. Meng, R. Panda, C. Lin, P. Sattigeri, L. Karlinsky, K. Saenko, A. Oliva, and R. Feris. AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition. International Conference on Learning Representations (ICLR 2021). [pdf]

Y. Meng, C. Lin, R. Panda, P. Sattigeri, L. Karlinsky, K. Saenko, A. Oliva, and R. Feris. AR-Net: Adaptive Frame Resolution for Efficient Action Recognition. European Conference on Computer Vision (ECCV 2020). [pdf]

A. Andonian, C. Fosco, M. Monfort, A. Lee, R. Feris, C. Vondrick, and A. Oliva. We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos. European Conference on Computer Vision (ECCV 2020). [pdf]

M. Khoi-Nguyen, D. Joshi, R. Yeh, J. Xiong, R. Feris, and M. Do. Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine Grained Action Detection. International Conference on Computer Vision (ICCV 2019, Oral), Seoul, Korea, 2019. [pdf]

A. Boggust, K. Audhkhasi, D. Joshi, D. Harwath, S. Thomas, R. Feris, D. Gutfreund, Y. Zhang, A. Torralba, M. Picheny, and J. Glass. Grounding Spoken Words in Unlabeled Video. CVPR Workshop on Sight and Sound, 2019. [pdf]

K. Ramakrishnan, M. Monfort, B. McNamara, A. Lascelles, D. Gutfreund, R. Feris, and A. Oliva. Identifying Interpretable Action Concepts in Deep Networks. CVPR Workshop on Explainable AI, 2019. [pdf]

M. Jaiswal et al. Video-Text Compliance: Activity Verification Based on Natural Language Instructions. ICCV Workshop on Large Scale Holistic Video Understanding, 2019. [pdf]

M. Merler, D. Joshi, Q. Nguyen, S. Hammer, J. Kent, J. Xiong, M. Do, J. Smith, and R. Feris. Automatic Curation of Sports Highlights using Multimodal Excitement Features. IEEE Transactions on Multimedia (TMM), 2019. [pdf]

R. Gao, R. Feris and K. Grauman. Learning to Separate Object Sounds by Watching Unlabeled Video. European Conference on Computer Vision (ECCV 2018, Oral), Munich, Germany, 2018. [pdf] [project page]

M. Merler, D. Joshi, K. Mac, Q. Nguyen, S. Hammer, J. Kent, J. Xiong, M. Do, J. Smith, and R. Feris. The Excitement of Sports: Automatic Highlights using Audio-Visual Cues. CVPR Workshop on Sight and Sound, 2018. [pdf]

M. Beigi, L. M. Brown, Q. Fan, J. Henning, C. Lin, H. Shi, C. Shu, and R. Feris. Object-Centric Spatio-Temporal Activity Detection and Recognition. NIST TRECVID Workshop, 2018. [pdf]

M. Merler, D. Joshi, Q. Nguyen, S. Hammer, J. Kent, J. R. Smith, and R. Feris. Automatic Curation of Golf Highlights using Multimodal Excitement Features. Workshop of Computer Vision in Sports (in conjunction with CVPR), 2017. [pdf]

H. Liu, R. Feris, and M.T. Sun. Benchmarking Datasets for Human Activity Recognition. Visual Analysis of Humans – Looking at People, Springer 2011. [Book Link] [Slides]

H. Liu, M.T. Sun, and R. Feris. Video Activity Recognition. Multimedia Analysis, Processing and Communication, Z. Li, J. Kacprzyk, D. Tao, E. Izquierdo, W. Lin, and H. Wang (eds.), Springer, 2010. [Book Link]

H. Liu, R. Feris, V. Krueger, and M.T. Sun. Unsupervised Action Classification Using Space-Time Link Analysis. Eurasip Journal on Advances in Signal Processing, 2010.

H. Liu, R. Feris, V. Krueger, and M.T. Sun. Unsupervised Action Classification Using Space-Time Link Analysis. IEEE International Symposium on Circuits and Systems (ISCAS 2010), Paris, France, 2010. [pdf]

Video Analytics for Safety and Security

J. Wang, Y. Cheng, and R. Feris. Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016, Oral), Las Vegas, Nevada, June 2016. [pdf]

R. Feris, R. Bobbitt, and S. Pankanti. Efficient 24/7 Object Detection in Surveillance Videos. IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2015), Germany, August 2015. [pdf]

Y. Cheng, L. Brown, Q. Fan, R. Feris, S. Pankanti, and T. Zhang. RiskWheel: Interactive Visual Analytics for Surveillance Event Detection. IEEE International Conference on Multimedia and Expo (ICME 2014, Oral), Chengdu, China, 2014. [pdf]

L. Brown, R. Feris, and S. Pankanti. Temporal Non-Maximum Suppression for Pedestrian Detection Using Scene Context. International Conference on Pattern Recognition (ICPR 2014), Stockholm, Sweden, 2014.

Q. Chen et al. Spatio-Temporal Fisher Vector Coding for Surveillance Event Detection. ACM International Conference on Multimedia (ACM MM 2013), Barcelona, Spain, 2013.

R. Feris, A. Datta, M. T. Sun, and S. Pankanti. Boosting Object Detection Performance in Crowded Surveillance Videos. Winter Conference on Applications of Computer Vision (WACV 2013), Florida, USA, 2013. [pdf]

Y. Cai et al. CMU-IBM-NUS@TRECVID 2012: Surveillance Event Detection. NIST Technical Report, 2012. First place in the retrospective surveillance event detection task. [pdf]

R. Feris, B. Siddiquie, and S. Pankanti. Learning Detectors from Large Datasets for Object Retrieval in Video Surveillance. International Conference on Multimedia and Expo (ICME 2012), Melbourne, Australia, 2012. [pdf]

B. Siddiquie, R. Feris, A. Datta, and L. Davis. Unsupervised Model Selection for View-Invariant Object Detection in Surveillance Environments. International Conference on Pattern Recognition (ICPR 2012, Oral), Tsukuba City, Japan, 2012. [pdf]

R. Feris, B. Siddiquie, Y. Zhai, J. Petterson, L. Brown, and S. Pankanti. Attribute-based Vehicle Search in Crowded Surveillance Videos. ACM International Conference on Multimedia Retrieval (ICMR 2011, Oral Presentation), Trento, Italy, 2011. [pdf]

R. Feris, J. Petterson, B. Siddiquie, L. Brown, and S. Pankanti. Large-Scale Vehicle Detection in Challenging Urban Surveillance Environments. Winter Conference on Applications of Computer Vision (WACV 2011), Kona, Hawaii, 2011. [pdf]

Y. Zhai, R. Feris, A. Hampapur, S. Russo, and S. Pankanti. Parsing Object Events in Heavy Urban Traffic. Object Tracking, Intech, 2011.

Y. Tian, R. Feris, H. Liu, A. Hampapur, and M. Sun. Robust Detection of Abandoned and Removed Objects in Complex Surveillance Videos. IEEE Transactions on Systems, Man, and Cybernetics--Part C: Applications and Reviews, 2010. [pdf]

R. Feris, A. Hampapur, Y. Zhai, R. Bobbitt, L. Brown, D. Vaquero, Y-L. Tian, H. Liu, and M-T. Sun. Case Study: IBM Smart Surveillance System. Intelligent Video Surveillance: Systems and Technology, by Taylor & Francis Group, LLC, 2009. [pdf, Book Link]

Y. Zhai, R. Feris, L. Brown, R. Bobbitt, A. Hampapur, S. Pankanti, Q. Fan, A. Yanagawa, Y. Tian, and S. Velipasalar. Composite Event Detection in Multi-Camera and Multi-Sensor Surveillance Networks. Multi-Camera Networks: Concepts and Applications, by Elsevier, 2009. [Book Link]

Y. Tian, R. Feris, L. Brown, D. Vaquero, Y. Zhai, and A. Hampapur. Multi-Scale People Detection and Motion Analysis for Video Surveillance. Machine Learning for Human Motion Analysis: Theory and Practice, IGI Global, 2009. [Book Link]

D. Vaquero, R. Feris, L. Brown, and A. Hampapur. Attribute-based People Search in Surveillance Environments. Winter Conference on Applications of Computer Vision (WACV 2009, Oral Presentation), Snowbird, Utah, December 2009. [pdf]

A. Hampapur, R. Bobbitt, L. Brown, M. Desimone, R. Feris, R. Kjeldsen, M. Lu, C. Mercier, C. Milite, S. Russo, C. Shu, Y. Zhai. Video Analytics in Urban Environments. IEEE International Conference on Advanced Video and Signal Based Surveillance, Genova, Italy, 2009.

D. Vaquero, R. Feris, L. Brown, A. Hampapur, and M. Turk. Attribute-based People Search. Intelligent Video Surveillance: Systems and Technology, by Taylor & Francis Group, LLC, 2009. [Book Link]

Y. Tian, A. Hampapur, L. Brown, R. Feris, M. Lu, A. Senior, C. Shu, and Y. Zhai. Event Detection, Query, and Retrieval for Video Surveillance. Artificial Intelligence for Maximizing Content Based Image Retrieval, 2008. [Book Link]

Y. Tian, R. Feris, and A. Hampapur. Real-Time Detection of Abandoned and Removed Objects in Complex Environments. IEEE International Workshop on Visual Surveillance (in conjunction with ECCV 2008), Marseille, France, 2008. [pdf]

L. Chen, R. Feris, Y. Zhai, L. Brown, and A. Hampapur. An Integrated System for Moving Object Classification in Surveillance Videos. IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2008), Santa Fe, New Mexico, September 2008. [pdf]

A. Hampapur, L. Brown, R. Feris, A. Senior, C. Shu, Y. Tian, Y. Zhai, and M. Lu. Searching Surveillance Video. IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2007), London, UK, September 2007. [pdf]

A. Senior, L. Brown, A. Hampapur, C. Shu, Y. Zhai, R. Feris, Y. Tian, S. Borger, and C. Carlson. Video Analytics for Retail: the IBM Smart Surveillance Retail Solution. IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2007), London, UK, September 2007. [pdf]

R. Feris, Y. Tian, and A. Hampapur. Capturing People in Surveillance Video. IEEE International Workshop on Visual Surveillance, 2007. [pdf]

R. Feris, T. E. Campos, and R. M. Cesar Jr. A Project for Face Recognition from Video Sequences Using GWN and Eigenfeature Selection. Workshop in Artificial Intelligence and Computer Vision, Atibaia, Brazil, November 2000. [pdf]

Object Tracking

C. Lin, Y. Hung, R. Feris, and L. He. Video Instance Segmentation Tracking. Conference on Computer Vision and Pattern Recognition (CVPR 2020). [pdf]

R. Feris, V. Krueger, and R. Cesar. A Wavelet Subspace Method for Real-time Face Tracking. Journal of Real-time Imaging, vol. 10, pp. 339-350, 2004. [pdf]

M. Turk, C. Hu, R. Feris, F. Lashkari, and A. Beall. TLA Based Face Tracking. International Conference on Vision Interfaces, Calgary, Canada, 2002. [pdf]

R. Feris, V. Krueger, and R. M. Cesar Jr. Efficient Real-Time Face Tracking in Wavelet Subspace. Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-Time Systems (in conjunction with ICCV 2001), Vancouver, BC, 2001. [pdf]

R. Feris and R. M. Cesar Jr. Locating and Tracking Facial Landmarks Using Gabor Wavelet Networks. Lecture Notes in Computer Science, vol. 2013, pp. 311-320. International Conference on Advances in Pattern Recognition, Rio de Janeiro, Brazil, May 2001. [pdf]

V. Krueger and R. Feris. Wavelet Subspace Method for Real-time Face Tracking. Proc. Pattern Recognition. DAGM Symposium, Munich, Germany 2001.

R. Feris and R. M. Cesar Jr. Tracking Facial Features Using Gabor Wavelet Networks. Symposium on Computer Graphics and Image Processing (SIBGRAPI 2000), Gramado, Brazil, October 2000. [pdf]

R. Feris, T. Campos, and R. M. Cesar Jr. Detection and Tracking of Facial Features in Video Sequences. Lecture Notes on Artificial Intelligence, vol. 1793, pp. 127-135. Proceedings of MICAI-2000, Acapulco, Mexico, April 2000. [pdf]

Object Detection and Segmentation

Other

K. Wang, D. Kim, R. Feris, and M. Betke. CDAC: Cross-domain Attention Consistency in Transformer for Domain Adaptive Semantic Segmentation. International Conference on Computer Vision (ICCV 2023). [pdf]

L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. Bronstein, and R. Giryes. StarNet: towards Weakly Supervised Few-Shot Object Detection. AAAI Conference on Artificial Intelligence (AAAI 2021). [pdf]

C. Lin, Y. Hung, R. Feris, and L. He. Video Instance Segmentation Tracking. Conference on Computer Vision and Pattern Recognition (CVPR 2020). [pdf]

Z. Wang, M. Yu, Y. Wei, R. Feris, J. Xiong, W. Hwu, T. Huang, and H. Shi. Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation. Conference on Computer Vision and Pattern Recognition (CVPR 2020). [pdf]

K. Ramakrishnan, R. Panda, Q. Fan, J. Henning, A. Oliva, and R. Feris. Relationship Matters: Relation Guided Knowledge Transfer for Incremental Learning of Object Detectors. CVPR Workshop on Continual Learning in Computer Vision, 2020. [pdf]

L. Karlinsky, J. Shtok, S. Harary, E. Schwartz, A. Aides, R. Feris, R. Giryes, and A. Bronstein. RepMet: Representative-based Metric Learning for Classification and One-shot Object Detection. Conference on Computer Vision and Pattern Recognition (CVPR 2019), Long Beach, California, 2019. [pdf]

Z. Shen, H. Shi, J. Yu, H. Phan, R. Feris, L. Cao, D. Liu, X. Wang, T. Huang, M. Savvides. Learning Object Detection from Scratch via Gated Feature Reuse. British Machine Vision Conference (BMVC), 2019. [pdf]

B. Cheng*, Y. Wei*, H. Shi, R. Feris, J. Xiong, and T. Huang. Revisiting RCNN: On Awakening the Classification Power of Faster RCNN. European Conference on Computer Vision (ECCV 2018), Munich, Germany, 2018. [pdf] (* equal contribution)

N. Codella, D. Anderson, T. Philips, A. Porto, K. Massey, J. Snowdon, R. Feris, and J. Smith. Segmentation of both Diseased and Healthy Skin from Clinical Photographs in a Primary Care Setting. International Engineering in Medicine and Biology Conference, 2018. [pdf]

J. Bowler, R. Feris, L. Cao, J. Wang, and M. Zhou. Automated Axon Segmentation from Highly Noisy Microscopic Videos. Winter Conference on Applications of Computer Vision (WACV 2015), Kona, Hawaii, 2015. [pdf]

R. Feris, R. Bobbitt, and S. Pankanti. Efficient 24/7 Object Detection in Surveillance Videos. IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2015), Germany, August 2015. [pdf]

L. Brown, R. Feris, and S. Pankanti. Temporal Non-Maximum Suppression for Pedestrian Detection Using Scene Context. International Conference on Pattern Recognition (ICPR 2014), Stockholm, Sweden, 2014.

R. Feris, L. Brown, S. Pankanti, and M.T. Sun. Appearance-based Object Detection under Varying Environmental Conditions. International Conference on Pattern Recognition (ICPR 2014, Oral), Stockholm, Sweden, 2014.

Q. Chen, Z. Song, R. Feris, A. Datta, L. Cao, Z. Huang, and S. Yan. Efficient Maximum Appearance Search for Large-Scale Object Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), Portland, Oregon, June 2013. [pdf]

X. Wang, L. Cao, R. Feris, A. Datta, and T. Xan. Hierarchical Feature Pooling with Structure Learning: A new method for Pedestrian Detection. CVPR Workshop on Structured Prediction: Tractability, Learning, and Inference, Portland, Oregon, 2013.

R. Feris, A. Datta, M. T. Sun, and S. Pankanti. Boosting Object Detection Performance in Crowded Surveillance Videos. Winter Conference on Applications of Computer Vision (WACV 2013), Florida, USA, 2013. [pdf]

F. Rashed, B. Siddiquie, R. Feris, and L. Davis. Domain Adaptive Object Detection. Winter Conference on Applications of Computer Vision (WACV 2013), Florida, USA, 2013. [pdf]

R. Feris, B. Siddiquie, J. Petterson, Y. Zhai, A. Datta, L. Brown, and S. Pankanti. Large-Scale Vehicle Detection, Indexing, and Search in Urban Surveillance Videos. IEEE Transactions on Multimedia, 2012. [pdf]

R. Feris, B. Siddiquie, and S. Pankanti. Learning Detectors from Large Datasets for Object Retrieval in Video Surveillance. International Conference on Multimedia and Expo (ICME 2012), Melbourne, Australia, 2012. [pdf]

B. Siddiquie, R. Feris, A. Datta, and L. Davis. Unsupervised Model Selection for View-Invariant Object Detection in Surveillance Environments. International Conference on Pattern Recognition (ICPR 2012, Oral), Tsukuba City, Japan, 2012. [pdf]

R. Feris, J. Petterson, B. Siddiquie, L. Brown, and S. Pankanti. Large-Scale Vehicle Detection in Challenging Urban Surveillance Environments. Winter Conference on Applications of Computer Vision (WACV 2011), Kona, Hawaii, 2011. [pdf]

Visual Attributes

H. Wu, Y. Gao, X. Guo, Z. Al-Halah, S. Rennie, K. Grauman, and R. Feris. Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback. Conference on Computer Vision and Pattern Recognition (CVPR 2021). [pdf]

R. Feris, C. Lampert and D. Parikh. Introduction to Visual Attributes. In Visual Attributes (eds: R. S. Feris, C. Lampert, and D. Parikh), Springer, 2017. [pdf]

J. Wang, Y. Cheng, and R. Feris. Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016, Oral), Las Vegas, Nevada, June 2016. [pdf]

R. Feris, C. Lampert and D. Parikh (Eds.). Visual Attributes. Advances in Computer Vision and Pattern Recognition, Springer, 2016. [Book Link]

J. Huang, R. Feris, Q. Chen, and S. Yan. Cross-domain Image Retrieval with a Dual Attribute-aware Ranking Network. IEEE International Conference on Computer Vision (ICCV 2015), Santiago, Chile, December 2015. [pdf] [data]

Q. Chen, J. Huang, R. Feris, L. Brown, J. Dong, and S. Yan. Deep Domain Adaptation for Describing People Based on Fine-Grained Clothing Attributes. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), Boston, Massachusetts, June 2015. [pdf]

R. Feris, R. Bobbitt, L. Brown, and S. Pankanti. Attribute-based People Search: Lessons Learnt from a Practical Surveillance System. ACM International Conference on Multimedia Retrieval (ICMR 2014), Oral Presentation, Glasgow, UK, 2014. [pdf]

F. Yu, L. Cao, R. Feris, J. Smith, and S. Chang. Designing Category-Level Attributes for Discriminative Visual Recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), Portland, Oregon, June 2013. [pdf]

R. Feris, B. Siddiquie, J. Petterson, Y. Zhai, A. Datta, L. Brown, and S. Pankanti. Large-Scale Vehicle Detection, Indexing, and Search in Urban Surveillance Videos. IEEE Transactions on Multimedia, 2012. [pdf]

B. Siddiquie, R. Feris, and L. Davis. Image Ranking and Retrieval Based on Multi-Attribute Queries. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011, Oral Presentation), Colorado Springs, USA, 2011. [pdf]

R. Feris, B. Siddiquie, Y. Zhai, J. Petterson, L. Brown, and S. Pankanti. Attribute-based Vehicle Search in Crowded Surveillance Videos. ACM International Conference on Multimedia Retrieval (ICMR 2011, Oral Presentation), Trento, Italy, 2011. [pdf]

A. Datta, R. Feris, and D. Vaquero. Hierarchical Ranking of Facial Attributes. IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), Santa Barbara, California, 2011. [pdf]

D. Vaquero, R. Feris, L. Brown, A. Hampapur, and M. Turk. Attribute-based People Search. Intelligent Video Surveillance: Systems and Technology, by Taylor & Francis Group, LLC, 2009. [Book Link]

D. Vaquero, R. Feris, L. Brown, and A. Hampapur. Attribute-based People Search in Surveillance Environments. Winter Conference on Applications of Computer Vision (WACV 2009, Oral Presentation), Snowbird, Utah, December 2009. [pdf]

Computational Photography

J. Xie, R. Feris, and M.T. Sun. Edge-Guided Single Depth Image Super Resolution. IEEE Transactions on Image Processing (TIP), vol. 25, no. 1, pp. 428-438, 2016. [source code and paper]

J. Xie, R. Feris, S. Yu, and M.T. Sun. Joint Super Resolution and De-noising from a Single Depth Image. IEEE Transactions on Multimedia (TMM), vol. 17, no. 9, pp. 1525-1537, 2015.

J. Xie, L. Chou, R. Feris, and M.T. Sun. Single Depth Image Super Resolution and Denoising via Coupled Dictionary Learning with Local Constraints and Shock Filtering. IEEE International Conference on Multimedia and Expo (ICME 2014, Oral), 2014. [pdf]

J. Xie, R. Feris, and M.T. Sun. Edge Guided Single Depth Image Super Resolution. IEEE International Conference on Image Processing (ICIP 2014), 2014. [source code and paper]

D. Vaquero, R. Raskar, R. Feris, and M. Turk. A Projector-Camera Setup for Geometry-Invariant Frequency Demultiplexing. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, Florida, June 2009. [pdf]

D. Vaquero, R. Feris, M. Turk, and R. Raskar. Characterizing the Shadow Space of Camera-Light Pairs. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), Anchorage, Alaska, June 2008. [pdf]

R. Feris, R. Raskar, L. Chen, K. Tan, and M. Turk. Multi-Flash Stereopsis: Depth Edge Preserving Stereo with Small Baseline Illumination. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 30, no. 1, pp. 147-159, 2008. [pdf]

R. Feris, M. Turk, R. Raskar, and K. Tan. Specular Highlights Detection and Reduction with Multi-Flash Photography. International Journal of the Brazilian Computer Society, vol. 1, no. 12, pp. 35-42, 2006. [pdf]

R. Feris, R. Raskar, and M. Turk. Dealing with Multi-scale Depth Changes and Motion in Depth Edge Detection. Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI 2006), Manaus, Brazil, October 2006. Awarded "One of the Best Image Processing and Computer Vision Papers". [pdf]

R. Feris. Detection and Analysis of Depth Discontinuities with Lighting and Viewpoint Variation. PhD thesis, University of California, Santa Barbara, 2006. [pdf]

R. Feris, R. Raskar, L. Chen, K. Tan, and M. Turk. Discontinuity Preserving Stereo with Small Baseline Multi-Flash Illumination. International Conference on Computer Vision (ICCV 2005, Oral Presentation), Beijing, China, 2005. [pdf]

R. Raskar, K. Tan, R. Feris, J. Kobler, J. Yu, and M. Turk. Harnessing Real-World Depth Edges with Multi-Flash Imaging. IEEE Computer Graphics and Applications (IEEE CG&A), vol. 25, no. 1, pp. 32-38, January 2005. [pdf]

R. Feris, M. Turk, R. Raskar, K. Tan, and G. Ohashi. Recognition of Isolated Fingerspelling Gestures Using Depth Edges. B. Kisacanin, V. Pavlovic and T. Huang (eds.), Real-time Vision for Human-Computer Interaction, Springer-Verlag, 2005. [pdf, Book Link]

R. Feris, R. Raskar, K. Tan, and M. Turk. Specular Reflection Reduction with Multi-Flash Imaging. Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI 2004), Curitiba, Brazil, October 2004. Also accepted as a poster in SIGGRAPH 2004. [pdf]

R. Feris, M. Turk, R. Raskar, K. Tan, and G. Ohashi. Exploiting Depth Discontinuities for Vision-based Fingerspelling Recognition. IEEE Workshop on Real-Time Vision for Human-Computer Interaction (in conjunction with CVPR 2004), Washington DC, USA, June 2004. [pdf]

K. Tan, J. Kobler, R. Feris, P. Dietz, and R. Raskar. Shape Enhanced Surgical Visualizations and Medical Illustrations with Multi-flash Imaging. International Conference on Medical Imaging Computing and Computer Assisted Intervention (MICCAI 2004), Rennes, France 2004. [pdf]

R. Raskar, K. Tan, R. Feris, J. Yu, and M. Turk. Non-photorealistic Camera: Depth Edge Detection and Stylized Rendering using Multi-Flash Imaging. ACM Transactions on Graphics (SIGGRAPH 2004), Vol. 23, Issue 3, August 2004. [pdf]

R. Raskar, K. Tan, R. Feris, J. Yu, and M. Turk. Non-photorealistic Camera: Automatic Stylization with Multi-Flash Imaging. SIGGRAPH Emergent Technologies, 2004.

R. Feris, R. Raskar, K. Tan, and M. Turk. Specular Reflection Reduction Using a Multi-Flash Camera. SIGGRAPH poster, 2004.

Human Sensing

X. Peng, R. Feris, X. Wang, and D. Metaxas. RED-Net: A Recurrent Encoder-Decoder Network for Video-based Face Alignment. International Journal of Computer Vision (IJCV), 2018. [pdf] [code]

X. Peng, Z. Tang, F. Yang, R. Feris, and D. Metaxas. Jointly Optimize Data Augmentation and Network Training: Adversarial Data Augmentation in Human Pose Estimation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, Utah, June 2018. [pdf]

X. Peng, R. Feris, X. Wang, and D. Metaxas. A Recurrent Encoder-Decoder Network for Sequential Face Alignment. European Conference on Computer Vision (ECCV 2016, Oral), Amsterdam, Netherlands, 2016. [pdf] [project page]

J. Wang, Y. Cheng, and R. Feris. Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016, Oral), Las Vegas, Nevada, June 2016. [pdf]

R. Feris, R. Bobbitt, L. Brown, and S. Pankanti. Attribute-based People Search: Lessons Learnt from a Practical Surveillance System. ACM International Conference on Multimedia Retrieval (ICMR 2014), Oral Presentation, Glasgow, UK, 2014. [pdf]

K. Scherbaum, R. Feris, J. Petterson, V. Blanz, and H. Seidel. Fast Face Detector Training Using Tailored Views. IEEE International Conference on Computer Vision (ICCV 2013), Sydney, Australia, December 2013. [pdf]

A. Datta, L. Brown, R. Feris, and S. Pankanti. Appearance Modeling for Person Re-Identification using Weighted Brightness Transfer Functions. International Conference on Pattern Recognition (ICPR 2012), Tsukuba City, Japan, 2012. [pdf]

B. Siddiquie, R. Feris, and L. Davis. Image Ranking and Retrieval Based on Multi-Attribute Queries. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011, Oral Presentation), Colorado Springs, USA, 2011. [pdf]

D. Vaquero, R. Feris, L. Brown, and A. Hampapur. Attribute-based People Search in Surveillance Environments. Winter Conference on Applications of Computer Vision (WACV 2009, Oral Presentation), Snowbird, Utah, December 2009. [pdf]

R. Feris, Y. Tian, Y. Zhai, and A. Hampapur. Facial Image Analysis Using Local Feature Adaptation Prior to Learning. IEEE International Conference on Automatic Face and Gesture Recognition, Amsterdam, Netherlands, September 2008. [pdf]

R. Feris, Y. Tian, and A. Hampapur. Capturing People in Surveillance Video. IEEE International Workshop on Visual Surveillance, 2007. [pdf]

H. Guan, J. Chang, L. Chen, R. Feris, and M. Turk. Multi-view Appearance-based 3D Hand Pose Estimation. IEEE Workshop on Vision for Human Computer Interaction (in conjunction with CVPR 2006), New York, NY, June 2006. [pdf]

H. Guan, R. Feris, and M. Turk. The Isometric Self-Organizing Map for Hand Pose Estimation. International Conference on Face and Gesture Recognition, Southampton, UK, 2006. [pdf]

Y. Zana, R. Cesar, R. Feris and M. Turk. Local Approach for Face Verification in Polar Frequency Domain. Image and Vision Computing Journal, vol. 24, no. 8, pp. 904-913, 2006. [pdf]

Y. Chang, C. Hu, R. Feris, and M. Turk. Manifold-Based Analysis of Facial Expressions. Image and Vision Computing Journal, vol. 24, no. 6, pp. 605-614, 2006. [pdf]

Y. Zana, R. Cesar, R. Feris, and M. Turk. Face Verification in Polar Frequency Domain: a Biologically Motivated Approach. International Symposium on Visual Computing, Lake Tahoe, NV, 2005. [pdf]

C. Hu, Y. Chang, R. Feris, and M. Turk. Manifold Based Analysis of Facial Expression. IEEE Workshop on Face Processing in Video (in conjunction with CVPR 2004), Washington DC, USA, June 2004. [pdf]

R. Feris, V. Krueger, and R. Cesar. A Wavelet Subspace Method for Real-time Face Tracking. Journal of Real-time Imaging, vol. 10, pp. 339-350, 2004. [pdf]

C. Hu, R. Feris, and M. Turk. Real-time View-Based Face Alignment Using Active Wavelet Networks. Workshop on Analysis and Modeling of Faces and Gestures (in conjunction with ICCV 2003), Nice, France, October 2003. [pdf]

C. Hu, R. Feris, and M. Turk. Active Wavelet Networks for Face Alignment. British Machine Vision Conference (BMVC 2003), Norwich, 2003. [pdf]

M. Turk, C. Hu, R. Feris, F. Lashkari, and A. Beall. TLA Based Face Tracking. International Conference on Vision Interfaces, Calgary, Canada, 2002. [pdf]

R. Feris, J. Gemmell, K. Toyama, and V. Krueger. Hierarchical Wavelet Networks for Facial Feature Localization. International Conference on Automatic Face and Gesture Recognition, Washington D.C., USA, May 20-21, 2002. [pdf]

R. Feris, V. Krueger, and R. M. Cesar Jr. Efficient Real-Time Face Tracking in Wavelet Subspace. Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-Time Systems (in conjunction with ICCV 2001), Vancouver, BC, 2001. [pdf]

R. Feris and R. M. Cesar Jr. Locating and Tracking Facial Landmarks Using Gabor Wavelet Networks. Lecture Notes in Computer Science, vol. 2013, pp. 311-320. International Conference on Advances in Pattern Recognition, Rio de Janeiro, Brazil, May 2001. [pdf]

R. Feris. Efficient Real-time Face Tracking in Wavelet Subspace. MSc. thesis, University of Sao Paulo, May 2001. Awarded as the Second Best Computer Science MSc. Thesis in Brazil (Nationwide Competition). [pdf]

V. Krueger and R. Feris. Wavelet Subspace Method for Real-time Face Tracking. Proc. Pattern Recognition. DAGM Symposium, Munich, Germany, 2001.

R. Feris, T. E. Campos, and R. M. Cesar Jr. A Project for Face Recognition from Video Sequences Using GWN and Eigenfeature Selection. Workshop in Artificial Intelligence and Computer Vision, Atibaia, Brazil, November 2000. [pdf]

R. Feris and R. M. Cesar Jr. Tracking Facial Features Using Gabor Wavelet Networks. Symposium on Computer Graphics and Image Processing (SIBGRAPI 2000), Gramado, Brazil, October 2000. [pdf]

T. Campos, R. Feris, and R. M. Cesar Jr. Improved Face versus Non-Face Discrimination Using Fourier Descriptors through Feature Selection. Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI 2000), Gramado, Brazil, October 2000. [pdf]

R. Feris, T. Campos, and R. M. Cesar Jr. Detection and Tracking of Facial Features in Video Sequences. Lecture Notes on Artificial Intelligence, vol. 1793, pp. 127-135. Proceedings of MICAI-2000, Acapulco, Mexico, April 2000. [pdf]

T. Campos, R. Feris, and R. M. Cesar Jr. Eigenfaces versus Eigeneyes: First Steps Towards Performance Assessment of Representations for Face Recognition. Lecture Notes on Artificial Intelligence, vol. 1793, pp. 193-201. Proceedings of MICAI-2000, Acapulco, Mexico, April 2000. [pdf]

Other

J. Lee, Y. Bu, P. Sattigeri, R. Panda, G. Wornell, L. Karlinsky, and R. Feris. A Maximal Correlation Framework for Fair Machine Learning. Entropy 24(4), 461, 2022. [pdf]

U. Finkler, M. Merler, R. Panda, M. Jaiswal, H. Wu, K. Ramakrishnan, C. Chen, M. Cho, D. Kung, R. Feris, and B. Bhattacharjee. Large Scale Neural Architecture Search with Polyharmonic Splines. AAAI Workshop on Meta-Learning for Computer Vision, 2021. [pdf]

N. Codella, C. Lin, A. Halpern, M. Hind, R. Feris, and J. Smith. Collaborative Human-AI (CHAI): Evidence-Based Interpretable Melanoma Classification in Dermoscopic Images. MICCAI Workshop on Interpretability of Machine Intelligence in Medical Image Computing, 2018. [pdf]

S. Zhai, H. Wu, A. Kumar, Y. Cheng, Y. Lu, Z. Zhang, and R. Feris. S3Pool: Pooling with Stochastic Spatial Sampling. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Honolulu, Hawaii, July 2017. [pdf] [code]

J. Xie, Y. Hsu, R. Feris, and M.T. Sun. Fine registration of 3D point clouds fusing structural and photometric information using an RGB-D camera. Journal of Visual Communication and Image Representation (JVCI), vol. 32, pp. 194-204, 2015. [pdf] [source code]

Y. Cui, Y. Xiang, K. Rong, L. Cao, and R. Feris. A Spatial-Color Layout Feature for Representing Galaxy Images. IEEE Winter Conference on Applications of Computer Vision (WACV 2014), Steamboat Springs, Colorado, 2014. [pdf]

A. Mattos and R. Feris. Fusing Well-Crafted Feature Descriptors for Efficient Fine-Grained Classification. IEEE International Conference on Image Processing (ICIP 2014), Paris, France, 2014.

A. Mattos, R. Herrmann, K. Shigeno, and R. Feris. A Mission-Oriented Citizen Science Platform for Efficient Flower Classification Based on Combination of Feature Descriptors. ICMR Workshop on Environmental Multimedia Retrieval, Glasgow, UK, 2014. [pdf]

A. Mattos, R. Herrmann, K. Shigeno, and R. Feris. Flower Classification for a Citizen Science Mobile App. International Conference on Multimedia Retrieval (ICMR 2014), Glasgow, UK, 2014.

J. Xie, J. Hsu, R. Feris, and M. Sun. Fine Registration of 3D Point Clouds with ICP Using an RGB-D Camera. IEEE International Symposium on Circuits and Systems (ISCAS 2013), Beijing, China, 2013. [pdf]

J. Leandro, R. Cesar, and R. Feris. Shape Analysis using the Spectral Graph Wavelet Transform. IEEE eScience Conference, Beijing, China, 2013.

S. Pankanti, L. Brown, J. Connell, A. Datta, Q. Fan, R. Feris, N. Haas, Y. Li, N. Ratha, and H. Thinh. Practical Computer Vision: Example Techniques and Challenges. IBM Journal of Research and Development, 2011. [pdf]

A. Hampapur et al. Analytics Driven Asset Management. IBM Smart Cities Journal, 2010.

L. Chen, J. McAuley, R. Feris, T. Caetano, and M. Turk. Shape Classification Through Structured Learning of Matching Measures. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, Florida, June 2009. [pdf]

L. Chen, R. Feris, and M. Turk. Efficient Partial Shape Matching Using the Smith-Waterman Algorithm. CVPR Workshop on Non-Rigid Shape Analysis and Deformable Image Alignment, Anchorage, Alaska, June 2008. [pdf]

R. Feris and W. Lages. Stereo Image Matching Using Correlation and Relaxation Labeling. IV Congress for Scientific Initiation and Postgraduation at Aeronautics Institute of Technology, pp. 193-199, Sao Jose dos Campos-SP, Brazil, October 1998 (in Portuguese).