Publications

You can also find my recent publications on my Google Scholar profile.

* denotes equal contribution.

Zhang, S., Zhang, W., & Gu, Q. (2025). Energy-Weighted Flow Matching for Offline Reinforcement Learning. International Conference on Learning Representations.
Zhou, Y., Wang, Z., Wang, T., Xing, S., Xia, P., Li, B., Zheng, K., Zhang, Z., Chen, Z., Zheng, W., & others. (2025). AnyPrefer: An Automatic Framework for Preference Data Synthesis. International Conference on Learning Representations.
Zhao, L., Deng, Y., Zhang, W., & Gu, Q. (2025). Mitigating Object Hallucination in Large Vision-Language Models via Image-Grounded Guidance. Forty-Second International Conference on Machine Learning (Spotlight). https://openreview.net/forum?id=w0xYx9CJhY
Wang, Z., He, W., Liang, Z., Zhang, X., Bansal, C., Wei, Y., Zhang, W., & Yao, H. (2025). CREAM: Consistency Regularized Self-Rewarding Language Models. International Conference on Learning Representations.
Sun, J., Zhang, W., Chen, Y., Hoar, B. B., Sheng, H., Yang, J. Y., Gu, Q., & Liu, C. (2025). Inquiry into the Appropriate Data Preprocessing of Electrochemical Impedance Spectroscopy for Machine Learning. The Journal of Physical Chemistry C.
Zhang, J., Zhang, W., Zhou, D., & Gu, Q. (2024). Uncertainty-Aware Reward-Free Exploration with General Function Approximation. Forty-First International Conference on Machine Learning.
Zhang, W., Fan, Z., He, J., & Gu, Q. (2024). Achieving Constant Regret in Linear Markov Decision Processes. The Thirty-Eighth Annual Conference on Neural Information Processing Systems.
Zheng, W., Chen, Y., Zhang, W., Kundu, S., Li, Y., Liu, Z., Xing, E. P., Wang, H., & Yao, H. (2024). CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing. Adaptive Foundation Models: Evolving AI for Personalized and Efficient Learning.
Huang, Z., Hwang, J., Zhang, J., Baik, J., Zhang, W., Wodarz, D., Sun, Y., Gu, Q., & Wang, W. (2024). Causal Graph ODE: Continuous Treatment Effect Modeling in Multi-agent Dynamical Systems. Proceedings of the ACM on Web Conference 2024, 4607–4617.
Hoar, B. B., Zhang, W., Chen, Y., Sun, J., Sheng, H., Zhang, Y., Chen, Y., Yang, J. Y., Costentin, C., Gu, Q., & others. (2024). Redox-Detecting Deep Learning for Mechanism Discernment in Cyclic Voltammograms of Multiple Redox Events. ACS Electrochemistry, 1(1), 52–62.
Sheng, H., Sun, J., Rodríguez, O., Hoar, B. B., Zhang, W., Xiang, D., Tang, T., Hazra, A., Min, D. S., Doyle, A. G., & others. (2024). Autonomous closed-loop mechanistic investigation of molecular electrochemistry via automation. Nature Communications, 15(1), 2781.
Lopez, V. K., Cramer, E. Y., Pagano, R., Drake, J. M., O’Dea, E. B., Adee, M., Ayer, T., Chhatwal, J., Dalgic, O. O., Ladd, M. A., & others. (2024). Challenges of COVID-19 Case Forecasting in the US, 2020–2021. PLoS Computational Biology, 20(5), e1011200.
Zhang, J., Zhang, W., & Gu, Q. (2023). Optimal horizon-free reward-free exploration for linear mixture MDPs. International Conference on Machine Learning, 41902–41930.
Shea, K., Borchering, R. K., Probert, W. J. M., Howerton, E., Bogich, T. L., Li, S.-L., van Panhuis, W. G., Viboud, C., Aguás, R., Belov, A. A., & others. (2023). Multiple models for outbreak decision support in the face of uncertainty. Proceedings of the National Academy of Sciences, 120(18), e2207537120.
Ji, K., Zhao, Q., He, J., Zhang, W., & Gu, Q. (2023). Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs. The Twelfth International Conference on Learning Representations.
Zhang, W., He, J., Fan, Z., & Gu, Q. (2023). On the interplay between misspecification and sub-optimality gap in linear contextual bandits. International Conference on Machine Learning, 41111–41132.
Zhang, W., He, J., Zhou, D., Gu, Q., & Zhang, A. (2023). Provably efficient representation selection in low-rank Markov decision processes: from online to offline RL. Uncertainty in Artificial Intelligence, 2488–2497.
Zhang, W., Wang, X., Smith, J., Eaton, J., Rees, B., & Gu, Q. (2023). DiffMol: 3D structured molecule generation with discrete denoising diffusion probabilistic models. ICML 2023 Workshop on Structured Probabilistic Inference & Generative Modeling.
Deng, Y., Zhang, W., Chen, Z., & Gu, Q. (2023). Rephrase and respond: Let large language models ask better questions for themselves. arXiv preprint arXiv:2311.04205.
Zhang, W., Wang, X., Nie, W., Eaton, J., Rees, B., & Gu, Q. (2023). MoleculeGPT: Instruction Following Large Language Models for Molecular Property Prediction. NeurIPS 2023 Workshop on New Frontiers of AI for Drug Discovery and Development.
Cramer, E. Y., Ray, E. L., Lopez, V. K., Bracher, J., Brennen, A., Castro Rivadeneira, A. J., Gerding, A., Gneiting, T., House, K. H., Huang, Y., & others. (2022). Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the United States. Proceedings of the National Academy of Sciences, 119(15), e2113561119.
Hoar, B. B., Zhang, W., Xu, S., Deeba, R., Costentin, C., Gu, Q., & Liu, C. (2022). Electrochemical mechanistic analysis from cyclic voltammograms based on deep learning. ACS Measurement Science Au, 2(6), 595–604.
Jia, Y., Zhang, W., Zhou, D., Gu, Q., & Wang, H. (2021). Learning Neural Contextual Bandits through Perturbed Rewards. International Conference on Learning Representations.
Zhang, W., Zhou, D., & Gu, Q. (2021). Reward-free model-based reinforcement learning with linear function approximation. Advances in Neural Information Processing Systems, 34, 1582–1593.
Bracher, J., Wolffram, D., Deuschel, J., Görgen, K., Ketterer, J. L., Ullrich, A., Abbott, S., Barbarossa, M. V., Bertsimas, D., Bhatia, S., & others. (2021). A pre-registered short-term forecasting study of COVID-19 in Germany and Poland during the second wave. Nature Communications, 12(1), 5173.
Zhang, W., Zhou, D., Li, L., & Gu, Q. (2020). Neural Thompson Sampling. International Conference on Learning Representations.
Ray, E. L., Wattanachit, N., Niemi, J., Kanji, A. H., House, K., Cramer, E. Y., Bracher, J., Zheng, A., Yamana, T. K., Xiong, X., & others. (2020). Ensemble forecasts of coronavirus disease 2019 (COVID-19) in the US. medRxiv, 2020–2008.
Wu, Y. F., Zhang, W., Xu, P., & Gu, Q. (2020). A finite-time analysis of two time-scale actor-critic methods. Advances in Neural Information Processing Systems, 33, 17617–17628.
Zou, D., Wang, L., Xu, P., Chen, J., Zhang, W., & Gu, Q. (2020). Epidemic model guided machine learning for COVID-19 forecasts in the United States. medRxiv, 2020–2005.
Liu, S., Zhang, W., Wu, X., Feng, S., Pei, X., & Yao, D. (2018). A simulation system and speed guidance algorithms for intersection traffic control using connected vehicle technology. Tsinghua Science and Technology, 24(2), 160–170.