2024
Retrieval guided music captioning via multimodal prefixes
Nikita Srivatsan, Ke Chen, Shlomo Dubnov, Taylor Berg-Kirkpatrick
IJCAI 2024 (Special Track on AI, the Arts, and Creativity)
[PDF]
TeaserGen: Generating Teasers for Long Documentaries
Weihan Xu, Paul Pu Liang, Haven Kim, Julian McAuley, Taylor Berg-Kirkpatrick, Hao-Wen Dong
Presto! Distilling Steps and Layers for Accelerating Music Generation
Zachary Novack, Ge Zhu, Jonah Casebeer, Julian McAuley, Taylor Berg-Kirkpatrick, Nicholas J Bryan
CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure Augmentation
Junda Wu, Warren Li, Zachary Novack, Amit Namburi, Carol Chen, Julian McAuley
Generating Symbolic Music from Natural Language Prompts using an LLM-Enhanced Dataset
Weihan Xu, Julian McAuley, Taylor Berg-Kirkpatrick, Shlomo Dubnov, Hao-Wen Dong
PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing
Phillip Long, Zachary Novack, Taylor Berg-Kirkpatrick, Julian McAuley
Creativity and Visual Communication from Machine to Musician: Sharing a Score through a Robotic Camera
Ross Greer, Laura Fleig, Shlomo Dubnov
Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model
Tornike Karchkhadze, Mohammad Rasool Izadi, Ke Chen, Gerard Assayag, Shlomo Dubnov
Improving Generalization of Speech Separation in Real-World Scenarios: Strategies in Simulation, Optimization, and Evaluation
Ke Chen, Jiaqi Su, Taylor Berg-Kirkpatrick, Shlomo Dubnov, Zeyu Jin
LyCon: Lyrics Reconstruction from the Bag-of-Words Using Large Language Models
Haven Kim, Kahyun Choi
FUTGA: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation
Junda Wu, Zachary Novack, Amit Namburi, Jiaheng Dai, Hao-Wen Dong, Zhouhang Xie, Carol Chen, Julian McAuley
Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation
Jiwoo Ryu, Hao-Wen Dong, Jongmin Jung, Dasaem Jeong
International Society for Music Information Retrieval (ISMIR) 2024
[PDF]
Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional Representation
Jingyue Huang, Ke Chen, Yi-Hsuan Yang
International Society for Music Information Retrieval (ISMIR) 2024
[PDF]
Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning
Fang-Duo Tsai, Shih-Lun Wu, Haven Kim, Bo-Yu Chen, Hao-Chung Cheng, Yi-Hsuan Yang
International Society for Music Information Retrieval (ISMIR) 2024
[PDF]
DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation
Zachary Novack, Julian McAuley, Taylor Berg-Kirkpatrick, Nicholas J Bryan
International Society for Music Information Retrieval (ISMIR) 2024
[PDF]
MusicLDM: Enhancing novelty in text-to-music generation using beat-synchronous mixup strategies
Ke Chen, Yusong Wu, Haohe Liu, Marianna Nezhurina, Taylor Berg-Kirkpatrick, Shlomo Dubnov
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
[PDF]
Latent CLAP Loss for Better Foley Sound Synthesis
Tornike Karchkhadze, Hassan Salami Kavaki, Mohammad Rasool Izadi, Bryce Irvin, Mikolaj Kegler, Ari Hertz, Shuo Zhang, Marko Stamenovic
European Signal Processing Conference (EUSIPCO) 2024
[PDF]
DITTO: Diffusion inference-time t-optimization for music generation
Zachary Novack, Julian McAuley, Taylor Berg-Kirkpatrick, Nicholas J Bryan
International Conference on Machine Learning (ICML) 2024
[PDF]
2023
Equipping Pretrained Unconditional Music Transformers with Instrument and Genre Controls
Weihan Xu, Julian McAuley, Shlomo Dubnov, Hao-Wen Dong
International Conference on Big Data (BigData) 2023
[PDF]
CLIPSonic: Text-to-audio synthesis with unlabeled videos and pretrained language-vision models
Hao-Wen Dong, Xiaoyu Liu, Jordi Pons, Gautam Bhattacharya, Santiago Pascual, Joan Serrà, Taylor Berg-Kirkpatrick, Julian McAuley
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2023
[PDF]
Unsupervised Lead Sheet Generation via Semantic Compression
Zachary Novack, Nikita Srivatsan, Taylor Berg-Kirkpatrick, Julian McAuley
Towards improving harmonic sensitivity and prediction stability for singing melody extraction
Keren Shao, Ke Chen, Taylor Berg-Kirkpatrick, Shlomo Dubnov
International Society for Music Information Retrieval (ISMIR) 2023
[PDF]
Large-scale contrastive language-audio pretraining with feature fusion and keyword-to-caption augmentation
Yusong Wu, Ke Chen, Tianyu Zhang, Yuchen Hui, Taylor Berg-Kirkpatrick, Shlomo Dubnov
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023
[PDF]
Multitrack music transformer
Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, Taylor Berg-Kirkpatrick
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023
[PDF]
Universal source separation with weakly labelled data
Qiuqiang Kong, Ke Chen, Haohe Liu, Xingjian Du, Taylor Berg-Kirkpatrick, Shlomo Dubnov, Mark D Plumbley
2022
ClipSep: Learning text-queried sound separation with noisy unlabeled videos
Hao-Wen Dong, Naoya Takahashi, Yuki Mitsufuji, Julian McAuley, Taylor Berg-Kirkpatrick
International Conference on Learning Representations (ICLR) 2023
[PDF]
Checklist models for improved output fluency in piano fingering prediction
Nikita Srivatsan, Taylor Berg-Kirkpatrick
International Society for Music Information Retrieval (ISMIR) 2022
[PDF]
Improving choral music separation through expressive synthesized data from sampled instruments
Ke Chen, Hao-Wen Dong, Yi Luo, Julian McAuley, Taylor Berg-Kirkpatrick, Miller Puckette, Shlomo Dubnov
International Society for Music Information Retrieval (ISMIR) 2022
[PDF]
Zero-shot audio source separation through query-based learning from weakly-labeled data
Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov
Association for the Advancement of Artificial Intelligence Conference (AAAI) 2022
[PDF]
HTS-AT: A hierarchical token-semantic audio transformer for sound classification and detection
Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
[PDF]
Deep performer: Score-to-audio music performance synthesis
Hao-Wen Dong, Cong Zhou, Taylor Berg-Kirkpatrick, Julian McAuley
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
[PDF]
TONet: Tone-octave network for singing melody extraction from polyphonic music
Ke Chen, Shuai Yu, Cheng-i Wang, Wei Li, Taylor Berg-Kirkpatrick, Shlomo Dubnov
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
[PDF]
2021
An empirical evaluation of end-to-end polyphonic optical music recognition
Sachinda Edirisooriya, Hao-Wen Dong, Julian McAuley, Taylor Berg-Kirkpatrick
International Society for Music Information Retrieval (ISMIR) 2021
[PDF]
Towards automatic instrumentation by learning to separate parts in symbolic multitrack music
Hao-Wen Dong, Chris Donahue, Taylor Berg-Kirkpatrick, Julian McAuley
International Society for Music Information Retrieval (ISMIR) 2021
[PDF]
2020
Discovering music relations with sequential attention
Junyan Jiang, Gus Xia, Taylor Berg-Kirkpatrick
Proceedings of the 1st Workshop on NLP for Music and Audio (NLP4MusA) 2020
[PDF]
MusPy: A toolkit for symbolic music generation
Hao-Wen Dong, Ke Chen, Julian McAuley, Taylor Berg-Kirkpatrick
International Society for Music Information Retrieval (ISMIR) 2020
[PDF]
Music SketchNet: Controllable music generation via factorized representations of pitch and rhythm
Ke Chen, Cheng-i Wang, Taylor Berg-Kirkpatrick, Shlomo Dubnov
International Society for Music Information Retrieval (ISMIR) 2020
[PDF]
Continuous Melody Generation via Disentangled Short-Term Representations and Structural Conditions
Ke Chen, Gus Xia, Shlomo Dubnov
International Conference on Semantic Computing (ICSC) 2020
[PDF]
2019
LakhNES: Improving multi-instrumental music generation with cross-domain pre-training
Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W Cottrell, Julian McAuley
International Society for Music Information Retrieval (ISMIR) 2019
[PDF]
The effect of explicit structure encoding of deep neural networks for symbolic music generation
Ke Chen, Weilin Zhang, Shlomo Dubnov, Gus Xia, Wei Li
International Workshop on Multilayer Music Representation and Processing (MMRP) 2019
[PDF]
Adversarial audio synthesis
Chris Donahue, Julian McAuley, Miller Puckette
International Conference on Learning Representations (ICLR) 2019
[PDF]
2018
The NES music database: A multi-instrumental dataset with expressive performance attributes
Chris Donahue, Huanru Henry Mao, Julian McAuley
International Society for Music Information Retrieval (ISMIR) 2018
[PDF]
2017
Dance dance convolution
Chris Donahue, Zachary C Lipton, Julian McAuley
International Conference on Machine Learning (ICML) 2017
[PDF]