At the AIR lab, we conduct research in the emerging field of computer audition, i.e., designing computational systems that are able to analyze and understand sounds including music, speech, and environmental sounds. We address fundamental issues such as parsing polyphonic auditory scenes (the cocktail party effect), as well as designing novel applications such as sound retrieval and music information retrieval. We also combine sound analysis with the analysis of other signal modalities such as text and video towards multi-modal scene analysis. Various projects that we have been working on include audio source separation, automatic music transcription, audio-score alignment, speech enhancement, speech diarization and emotion recognition, sound retrieval, sound event detection, and audio-visual scene understanding.
Our work is funded by the National Science Foundation under grants No. 1617107, titled "III: Small: Collaborative Research: Algorithms for Query by Example of Audio Databases" (project website), No. 1741472, titled "BIGDATA: F: Audio-Visual Scene Understanding" (project website), and No. 1846174, titled "CAREER: Human-Computer Collaborative Music Making". Our work is also funded by the University of Rochester internal pilot awards on AR/VR and health analytics.
We are looking for highly motivated students to join the AIR lab. Students are expected to have a solid background in mathematics, programming, and academic writing. Experiences in music activities will be a plus. Most importantly, students should be fascinated by humans' ability in perceiving and understanding sounds, and are willing to make computers to achieve this capability! If you are interested, please apply to the ECE Ph.D. program, and mention Prof. Zhiyao Duan in your application. If you are a master's or undergrad student at UR and want to do a project/thesis in the AIR lab, please send Dr. Duan an email or stop by his office.