I earned my bachelor’s degree from the School of Software Engineering at Harbin Institute of Technology (HIT) in 2022. The same year, I was admitted to the School of Computer Science and Technology at the University of Science and Technology of China (USTC) to pursue a master’s degree, and in 2024 I transitioned to the Ph.D. program. I am currently a first-year Ph.D. candidate jointly supervised by Prof. Xike Xie, Prof. S. Kevin Zhou, and Prof. MingJun Xiao, and I am affiliated with the Data Darkness Lab (DDL) under the MIRACLE Center.

My research began with understanding and leveraging the memory within neural networks to accomplish intriguing tasks. I am also studying the KV cache memory mechanisms of LLMs, aiming to understand how they work and to contribute to applications such as more efficient inference.

Please don’t hesitate to reach out for any discussion. You can contact me via email at yfung@mail dot ustc dot edu dot cn. I am always open to engaging in intriguing research endeavors! 😊


Research Interests

🎯 Teaching neural networks to memorize data streams & trying to learn an algorithm!

  1. Meta-sketch: A neural data structure for estimating item frequencies of data streams. (CCF-A, AAAI23 Oral)
  2. Mayfly: a Neural Data Structure for Graph Stream Summarization. (Top AI Conf., ICLR24 Spotlight)
  3. Learning to Sketch: A Neural Approach to Item Frequency Estimation in Streaming Data. (CCF-A, TPAMI24)

🎯 Understanding KV Cache Memory in LLMs and Compressing It!

  1. Ada-KV: Optimizing KV cache eviction by adaptive budget allocation for efficient LLM inference. (Available on arXiv)

    We introduced the first head-wise adaptive cache compression method and open-sourced our code. Cloudflare has integrated our Ada-KV algorithm into vLLM, providing crucial support for real-world deployment, and released a technical report with a detailed evaluation. We have also been invited by Huawei and OPPO to share our insights with their teams to support practical applications. We are actively collaborating with the community and hope to assist future research in this area. Explore our GitHub repo, and feel free to raise issues or contact us by email with any questions!

🎯 Broader Research Horizons.

  1. Interpretable Memory in LLMs
  2. Multimodal Large Models
  3. Test-time Scaling Laws

Publications

  1. [arXiv] Yuan Feng, Junlin Lv, Yukun Cao, Xike Xie, and S. Kevin Zhou. “Ada-KV: Optimizing KV cache eviction by adaptive budget allocation for efficient LLM inference.” arXiv preprint arXiv:2407.11550 (2024) (paper, code).

  2. [arXiv] Junlin Lv, Yuan Feng, Xike Xie, Xin Jia, Qirong Peng, and Guiming Xie. “CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs.” arXiv preprint arXiv:2409.12490 (2024) (paper, code).

  3. [ICLR24 Spotlight Paper (Top AI Conf.)] Yuan Feng, Yukun Cao, Hairu Wang, Xike Xie, and S. Kevin Zhou. “Mayfly: a Neural Data Structure for Graph Stream Summarization.” In The Twelfth International Conference on Learning Representations, 2024, Spotlight (paper, code).

  4. [TPAMI 24-Regular Paper (CCF-A)] Yukun Cao, Yuan Feng (Co-First), Hairu Wang, Xike Xie, and S. Kevin Zhou. “Learning to Sketch: A Neural Approach to Item Frequency Estimation in Streaming Data.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024 (paper, code).

  5. [AAAI 23-Oral Paper (CCF-A)] Yukun Cao, Yuan Feng, and Xike Xie. “Meta-sketch: A neural data structure for estimating item frequencies of data streams.” In Proceedings of the AAAI Conference on Artificial Intelligence, 2023, Oral (paper, code).


Honors

  • 2024 Ph.D. First Prize Scholarship, USTC

  • 2022 and 2023 Master’s First Prize Scholarship, USTC (awarded twice)

  • 2021 Outstanding Student Award, HIT

  • 2020 and 2021 National Scholarship, HIT (awarded twice)


Education

  • 2024.09 - present, Ph.D. candidate in Computer Science and Technology, University of Science and Technology of China (USTC).

  • 2022.09 - 2024.06, Master’s student in Computer Science and Technology, University of Science and Technology of China (USTC).

  • 2018.09 - 2022.06, Bachelor’s degree in Software Engineering, Harbin Institute of Technology (HIT).