Zihao Zhu

zihaozhu@link.cuhk.edu.cn

Hi, this is Zihao Zhu (朱梓豪). I am currently a Ph.D. student in Data Science at The Chinese University of Hong Kong, Shenzhen, under the supervision of Prof. Baoyuan Wu. Previously, I received my Master’s degree from the Institute of Information Engineering at the University of Chinese Academy of Sciences in 2021.

My research interests broadly span the field of AI security, with a particular focus on the following areas:

Safety of Large Language Models: I study the safety challenges associated with large language models (LLMs), including jailbreak attacks and safety alignment, aiming to enhance the robustness and reliability of LLMs without compromising their utility.
Data Safety in AI Systems: Data is the fuel of AI. I investigate various aspects of data safety in Data-centric AI (DCAI), with particular emphasis on backdoor attacks and data quality assessment.
Safety in Embodied AI: I explore safety concerns in embodied AI systems, focusing on risk assessment for embodied AI agents. This emerging area is crucial as AI systems become more integrated into physical environments.

I am currently on the job market and seeking full-time opportunities in academia or industry. I would be delighted to connect if you have relevant openings or suggestions.

news

Sep 25, 2025	An open-source project I contributed to, “Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers”, has been accepted to a NeurIPS 2025 workshop!
Sep 25, 2025	Our work “To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models” has been accepted to a NeurIPS 2025 workshop!
Sep 10, 2025	Our survey “Defenses in adversarial machine learning: A survey” has been accepted to IEEE TPAMI!
Sep 01, 2025	Our survey “Attacks in adversarial machine learning: A systematic survey from the life-cycle perspective” has been accepted to IJCV!
May 19, 2025	Our paper “BlackboxBench: A Comprehensive Benchmark of Black-box Adversarial Attacks” has been accepted to IEEE TPAMI!

selected publications

arXiv

AdvChain: Adversarial Chain-of-Thought Tuning for Robust Safety Alignment of Large Reasoning Models

Zihao Zhu, Xinyu Wu, Gehan Hu, Siwei Lyu, Ke Xu, and Baoyuan Wu

arXiv preprint, 2025

@article{zhu2025advchain,
  title = {AdvChain: Adversarial Chain-of-Thought Tuning for Robust Safety Alignment of Large Reasoning Models},
  author = {Zhu, Zihao and Wu, Xinyu and Hu, Gehan and Lyu, Siwei and Xu, Ke and Wu, Baoyuan},
  journal = {arXiv preprint},
  year = {2025},
}

NeurIPS

To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models

Zihao Zhu, Hongbao Zhang, Ruotong Wang, Ke Xu, Siwei Lyu, and Baoyuan Wu

In NeurIPS 2025 Workshop on Foundations of Reasoning in Language Models, 2025

@inproceedings{zhu2025unthinking,
  title = {To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models},
  author = {Zhu, Zihao and Zhang, Hongbao and Wang, Ruotong and Xu, Ke and Lyu, Siwei and Wu, Baoyuan},
  booktitle = {NeurIPS 2025 Workshop on Foundations of Reasoning in Language Models},
  year = {2025},
}

TPAMI

Defenses in adversarial machine learning: A survey

Baoyuan Wu, Shaokui Wei, Mingli Zhu, Meixi Zheng, Zihao Zhu, Mingda Zhang, Hongrui Chen, Danni Yuan, Li Liu, and Qingshan Liu

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

@article{wu2025defenses,
  title = {Defenses in adversarial machine learning: A survey},
  author = {Wu, Baoyuan and Wei, Shaokui and Zhu, Mingli and Zheng, Meixi and Zhu, Zihao and Zhang, Mingda and Chen, Hongrui and Yuan, Danni and Liu, Li and Liu, Qingshan},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year = {2025},
}

NeurIPS

Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers

Xingyue Huang, Rishabh, Gregor Franke, Ziyi Yang, Jiamu Bai, Weijie Bai, Jinhe Bi, Zifeng Ding, Yiqun Duan, Chengyu Fan, Wendong Fan, Xin Gao, Ruohao Guo, Yuan He, Zhuangzhuang He, Xianglong Hu, Neil Johnson, Bowen Li, Fangru Lin, Siyu Lin, Tong Liu, Yunpu Ma, Hao Shen, Hao Sun, Beibei Wang, Fangyijie Wang, Hao Wang, Haoran Wang, Yang Wang, Yifeng Wang, Zhaowei Wang, Ziyang Wang, Yifan Wu, Zikai Xiao, Chengxing Xie, Fan Yang, Junxiao Yang, Qianshuo Ye, Ziyu Ye, Guangtao Zeng, Yuwen Ebony Zhang, Zeyu Zhang, Zihao Zhu, Bernard Ghanem, Philip Torr, and Guohao Li

In NeurIPS 2025 Workshop LAW, 2025

@inproceedings{huang2025loongsynthesizelongchainofthoughts,
  title = {Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers},
  author = {Huang, Xingyue and Rishabh and Franke, Gregor and Yang, Ziyi and Bai, Jiamu and Bai, Weijie and Bi, Jinhe and Ding, Zifeng and Duan, Yiqun and Fan, Chengyu and Fan, Wendong and Gao, Xin and Guo, Ruohao and He, Yuan and He, Zhuangzhuang and Hu, Xianglong and Johnson, Neil and Li, Bowen and Lin, Fangru and Lin, Siyu and Liu, Tong and Ma, Yunpu and Shen, Hao and Sun, Hao and Wang, Beibei and Wang, Fangyijie and Wang, Hao and Wang, Haoran and Wang, Yang and Wang, Yifeng and Wang, Zhaowei and Wang, Ziyang and Wu, Yifan and Xiao, Zikai and Xie, Chengxing and Yang, Fan and Yang, Junxiao and Ye, Qianshuo and Ye, Ziyu and Zeng, Guangtao and Zhang, Yuwen Ebony and Zhang, Zeyu and Zhu, Zihao and Ghanem, Bernard and Torr, Philip and Li, Guohao},
  year = {2025},
  booktitle = {NeurIPS 2025 Workshop LAW},
  eprint = {2509.03059},
  dataset = {https://huggingface.co/datasets/camel-ai/loong},
}

TPAMI

BlackboxBench: A Comprehensive Benchmark of Black-box Adversarial Attacks

Meixi Zheng, Xuanchen Yan, Zihao Zhu, Hongrui Chen, and Baoyuan Wu

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

@article{zheng2025blackboxbench,
  title = {BlackboxBench: A Comprehensive Benchmark of Black-box Adversarial Attacks},
  author = {Zheng, Meixi and Yan, Xuanchen and Zhu, Zihao and Chen, Hongrui and Wu, Baoyuan},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year = {2025},
}

IJCV

BackdoorBench: A Comprehensive Benchmark and Analysis of Backdoor Learning

Baoyuan Wu, Hongrui Chen, Mingda Zhang, Zihao Zhu, Shaokui Wei, Danni Yuan, Mingli Zhu, Ruotong Wang, Li Liu, and Chao Shen

International Journal of Computer Vision, 2025

@article{wu2025backdoorbench,
  title = {BackdoorBench: A Comprehensive Benchmark and Analysis of Backdoor Learning},
  author = {Wu, Baoyuan and Chen, Hongrui and Zhang, Mingda and Zhu, Zihao and Wei, Shaokui and Yuan, Danni and Zhu, Mingli and Wang, Ruotong and Liu, Li and Shen, Chao},
  journal = {International Journal of Computer Vision},
  year = {2025},
}

arXiv

HMGIE: Hierarchical and Multi-Grained Inconsistency Evaluation for Vision-Language Data Cleansing

Zihao Zhu, Hongbao Zhang, Guanzong Wu, Siwei Lyu, and Baoyuan Wu

arXiv preprint, 2024

@article{zhu2024hmgie,
  title = {HMGIE: Hierarchical and Multi-Grained Inconsistency Evaluation for Vision-Language Data Cleansing},
  author = {Zhu, Zihao and Zhang, Hongbao and Wu, Guanzong and Lyu, Siwei and Wu, Baoyuan},
  journal = {arXiv preprint},
  year = {2024},
}

arXiv

Reliable Poisoned Sample Detection against Backdoor Attacks Enhanced by Sharpness Aware Minimization

Mingda Zhang, Mingli Zhu, Zihao Zhu, and Baoyuan Wu

arXiv preprint, 2024

@article{zhang2024reliable,
  title = {Reliable Poisoned Sample Detection against Backdoor Attacks Enhanced by Sharpness Aware Minimization},
  author = {Zhang, Mingda and Zhu, Mingli and Zhu, Zihao and Wu, Baoyuan},
  journal = {arXiv preprint},
  year = {2024},
}

arXiv

EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents

Zihao Zhu, Bingzhe Wu, Zhengyou Zhang, Lei Han, Qingshan Liu, and Baoyuan Wu

arXiv preprint, 2024

@article{zhu2024earbench,
  title = {EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents},
  author = {Zhu, Zihao and Wu, Bingzhe and Zhang, Zhengyou and Han, Lei and Liu, Qingshan and Wu, Baoyuan},
  year = {2024},
  journal = {arXiv preprint},
}

ICLR

VDC: Versatile Data Cleanser based on Visual-Linguistic Inconsistency by Multimodal Large Language Models

Zihao Zhu, Mingda Zhang, Shaokui Wei, Bingzhe Wu, and Baoyuan Wu

In International Conference on Learning Representations, 2024

@inproceedings{zhu2024vdc,
  title = {VDC: Versatile Data Cleanser based on Visual-Linguistic Inconsistency by Multimodal Large Language Models},
  author = {Zhu, Zihao and Zhang, Mingda and Wei, Shaokui and Wu, Bingzhe and Wu, Baoyuan},
  booktitle = {International Conference on Learning Representations},
  year = {2024},
}

arXiv

Boosting backdoor attack with a learnable poisoning sample selection strategy

Zihao Zhu, Mingda Zhang, Shaokui Wei, Li Shen, Yanbo Fan, and Baoyuan Wu

arXiv preprint, 2023

@article{zhu2023boosting,
  title = {Boosting backdoor attack with a learnable poisoning sample selection strategy},
  author = {Zhu, Zihao and Zhang, Mingda and Wei, Shaokui and Shen, Li and Fan, Yanbo and Wu, Baoyuan},
  journal = {arXiv preprint},
  year = {2023},
}

arXiv

Versatile Backdoor Attack with Visible, Semantic, Sample-Specific, and Compatible Triggers

Ruotong Wang, Hongrui Chen, Zihao Zhu, Li Liu, and Baoyuan Wu

arXiv preprint, 2024

@article{wang2024versatile,
  title = {Versatile Backdoor Attack with Visible, Semantic, Sample-Specific, and Compatible Triggers},
  author = {Wang, Ruotong and Chen, Hongrui and Zhu, Zihao and Liu, Li and Wu, Baoyuan},
  journal = {arXiv preprint},
  year = {2024},
}

AAAI

Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning

Longkang Li, Siyuan Liang, Zihao Zhu, Xiaochun Cao, Chris Ding, Hongyuan Zha, and Baoyuan Wu

In AAAI Conference on Artificial Intelligence, 2024

@inproceedings{li2024learning,
  title = {Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning},
  author = {Li, Longkang and Liang, Siyuan and Zhu, Zihao and Cao, Xiaochun and Ding, Chris and Zha, Hongyuan and Wu, Baoyuan},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year = {2024},
}

arXiv

Attacks in adversarial machine learning: A systematic survey from the life-cycle perspective

Baoyuan Wu, Zihao Zhu, Li Liu, Qingshan Liu, Zhaofeng He, and Siwei Lyu

arXiv preprint, 2023

@article{wu2023attacks,
  title = {Attacks in adversarial machine learning: A systematic survey from the life-cycle perspective},
  author = {Wu, Baoyuan and Zhu, Zihao and Liu, Li and Liu, Qingshan and He, Zhaofeng and Lyu, Siwei},
  journal = {arXiv preprint},
  year = {2023},
}

NeurIPS

BackdoorBench: A Comprehensive Benchmark of Backdoor Learning

Baoyuan Wu, Hongrui Chen, Mingda Zhang, Zihao Zhu, Shaokui Wei, Danni Yuan, and Hongyuan Zha

In Advances in Neural Information Processing Systems, 2022

@inproceedings{wu2022backdoorbench,
  title = {BackdoorBench: A Comprehensive Benchmark of Backdoor Learning},
  author = {Wu, Baoyuan and Chen, Hongrui and Zhang, Mingda and Zhu, Zihao and Wei, Shaokui and Yuan, Danni and Zha, Hongyuan},
  booktitle = {Advances in Neural Information Processing Systems},
  year = {2022},
}

ICASSP

From Shallow to Deep: Compositional Reasoning over Graphs for Visual Question Answering

Zihao Zhu

In IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

@inproceedings{zhu2022shallow,
  title = {From Shallow to Deep: Compositional Reasoning over Graphs for Visual Question Answering},
  author = {Zhu, Zihao},
  booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing},
  year = {2022},
}

Cross-Modal Knowledge Reasoning for Knowledge-based Visual Question Answering

Jing Yu, Zihao Zhu, Yujing Wang, Weifeng Zhang, Yue Hu, and Jianlong Tan

Pattern Recognition, 2020

@article{yu2020cross,
  title = {Cross-Modal Knowledge Reasoning for Knowledge-based Visual Question Answering},
  author = {Yu, Jing and Zhu, Zihao and Wang, Yujing and Zhang, Weifeng and Hu, Yue and Tan, Jianlong},
  journal = {Pattern Recognition},
  year = {2020},
}

IJCAI

Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering

Zihao Zhu, Jing Yu, Yujing Wang, Yajing Sun, Yue Hu, and Qi Wu

In Proceedings of the International Joint Conference on Artificial Intelligence, 2020

@inproceedings{zhu2020mucko,
  title = {Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering},
  author = {Zhu, Zihao and Yu, Jing and Wang, Yujing and Sun, Yajing and Hu, Yue and Wu, Qi},
  booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence},
  year = {2020},
}

IJCAI

DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses

Xiaoze Jiang, Jing Yu, Zengchang Qin, Zihao Zhu, and Qi Wu

In Proceedings of the International Joint Conference on Artificial Intelligence, 2020

@inproceedings{jiang2020dam,
  title = {DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses},
  author = {Jiang, Xiaoze and Yu, Jing and Qin, Zengchang and Zhu, Zihao and Wu, Qi},
  booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence},
  year = {2020},
}