Rong Gu

Ph.D., Research Assistant Professor
Department of Computer Science and Technology
Nanjing University

Nanjing University, Xianlin Campus
163 Xianlin Avenue, Nanjing, Jiangsu, China 210023

Short Bio:

I'm currently a research assistant professor in the Department of Computer Science and Technology at Nanjing University. Before that, I received my Ph.D. degree from Nanjing University in 2016, supervised by Prof. Yihua Huang. I received my B.Sc. degree from Nanjing University of Aeronautics and Astronautics and entered graduate studies at Nanjing University, exempt from the entrance exam, in June 2011.


My research interests include Big Data parallel processing systems and distributed machine learning.

Selected Publications


  • Rong Gu, Yufa Zhou, Zhaokang Wang, Chunfeng Yuan, and Yihua Huang. Penguin: Efficient Query-based Framework for Replaying Large Scale Historical Data. IEEE Transactions on Parallel and Distributed Systems (TPDS). Vol.29(10), 2018, pp. 2333-2345.

  • Rong Gu, Yun Tang, Chen Tian, Hucheng Zhou, Guanru Li, Xudong Zheng, and Yihua Huang. Improving Execution Concurrency of Large-Scale Matrix Multiplication on Distributed Data-Parallel Platforms. IEEE Transactions on Parallel and Distributed Systems (TPDS). Vol.28(9), 2017, pp. 2539-2552.

  • Rong Gu, Xiaoliang Yang, Jinshuang Yan, Yuanhao Sun, Bing Wang, Chunfeng Yuan, and Yihua Huang. SHadoop: Improving MapReduce Performance By Optimizing Job Execution Mechanism in Hadoop Clusters. Journal of Parallel and Distributed Computing (JPDC). Vol.74(3), 2014, pp. 2166-2179.

  • Rong Gu, Hongjian Qiu, Wenjia Yang, Wei Hu, Chunfeng Yuan, and Yihua Huang. Goldfish: A Large Scale Semantic Data Store and Query System Based on Boolean Matrix Factorization (in Chinese). Chinese Journal of Computers (《计算机学报》). Vol.40(10), 2017, pp. 2212-2230.

  • Rong Gu, Fangfang Wang, Chunfeng Yuan, and Yihua Huang. YARM: Efficient and Scalable Semantic Reasoning Engine Using MapReduce (in Chinese). Chinese Journal of Computers (《计算机学报》). Vol.38(1), 2015, pp. 74-85.

  • Rong Gu, Jinshuang Yan, Xiaoliang Yang, Chunfeng Yuan, and Yihua Huang. Performance Optimization for Short Job Execution in Hadoop MapReduce (in Chinese). Journal of Computer Research and Development (《计算机研究与发展》). Vol.51(6), 2014, pp. 1270-1280.


  • Rong Gu, Shanyong Wang, Fangfang Wang, Chunfeng Yuan, and Yihua Huang. Cichlid: Efficient Large Scale RDFS/OWL Reasoning with Spark. Proc. of the IEEE 29th International Parallel & Distributed Processing Symposium (IEEE IPDPS 2015), pp. 700 - 709, Hyderabad, India, May 25-29, 2015. (acceptance rate: 21.8%)

  • Rong Gu, Furao Shen, and Yihua Huang. A Parallel Computing Platform for Training Large Scale Neural Networks. Proc. of the IEEE International Conference on Big Data (IEEE BigData 2013), pp. 376 - 384, Santa Clara, CA, USA, Oct. 6-9, 2013. (accepted 45 out of 259 submissions, acceptance rate: 17.37%).(Student Travel Award)

  • Rong Gu, Qianhao Dong, Haoyuan Li, Joseph Gonzalez, Zhao Zhang, Shuai Wang, Yihua Huang, Scott Shenker, Ion Stoica, and Patrick P. C. Lee. DFS-Perf: A Scalable and Unified Benchmarking Framework for Distributed File Systems. EECS Department, University of California, Berkeley, Technical Report No. UCB/EECS-2016-133, July 27, 2016.

  • Rong Gu, Lei Jin, Yongwei Wu, Jingying Qu, Tao Wang, Xiaojun Wang, Chunfeng Yuan and Yihua Huang. Parallel Training GBRT Based on KMeans Histogram Approximation for Big Data. Proc. of the 15th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2015), pp. 52-65, Zhangjiajie, China, Nov. 18-20, 2015.

  • Rong Gu, Wei Hu, and Yihua Huang. Rainbow: A Distributed and Hierarchical RDF Triple Store with Dynamic Scalability. Proc. of the IEEE International Conference on Big Data (IEEE BigData 2014), pp. 561 - 566, Washington, D.C., USA, Oct. 27-30, 2014. (short paper).

  • Rong Gu, Yun Tang, Qianhao Dong, Zhaokang Wang, Zhiqiang Liu, Shuai Wang, Chunfeng Yuan, and Yihua Huang. Unified Programming Model and Software Framework for Big Data Machine Learning and Data Analytics. Proc. of the 39th IEEE Computer Software and Applications Conference (COMPSAC 2015), DSA Workshop, pp. 562-567, Taichung, Taiwan, July 1-5, 2015.

  • Peng Shu, Rong Gu, Qianhao Dong, Chunfeng Yuan, and Yihua Huang. Accelerating Big Data Applications on Tiered Storage System with Various Eviction Policies. Proc. of the IEEE International Symposium on Parallel and Distributed Processing with Applications (IEEE ISPA 2016), pp. 1350 - 1357, Tianjin, China, August 23-26, 2016.

Book Chapters:

  • Several chapters of the book Understanding Big Data-Big Data Processing and Programming (in Chinese; published by China Machine Press (HZ Books), 2014, ISBN: 9787111473251).

  • Chapter 11 of the book Hadoop in Practice: Open the Shortcut to the Cloud Computing (in Chinese; published by the Electronic Industry Press, 2011, ISBN: 9787121144752).

Research Projects & Grants

  • Research on Big Media Data Content Analysis and Associated Semantic Mining, National Science Foundation Special Research Grant (No.61223003), 2013~2017 (Participant).

  • Research on Key Technologies of Unified Programming Model for Big Data Machine Learning, National Science Foundation Grant (No.61572250), 2015~2019 (Participant).

Teaching


  • Synthesis Experiments of Big Data Processing, (for undergraduate students, Spring Semester), with Prof. Yihua Huang and Dr. Hao Wang.

  • Big Data Parallel Processing with MapReduce, (for graduate students, Fall Semester), with Prof. Yihua Huang.

Professional Activities and Services

  • Open Source Software Community Services: I am a founding PMC member and maintainer of the Alluxio (formerly Tachyon) project, and an Apache Spark contributor.

  • Academic Services: I am a reviewer for the Chinese Journal of Computers (《计算机学报》) and the Journal of Circuits, Systems and Computers (JCSC).

  • Technology Community Services: I am the organizer of the Nanjing Big Data Technology Meetup. Our activities have attracted hundreds of attendees, including Big Data researchers and engineers from universities and companies.

  • Talks in Professional Activities (updated before 2016.09):
    2016.08.04 An Introduction to Alluxio (formerly Tachyon), an Open Source Memory-Centric Virtual Distributed Storage System. Strata+Hadoop World. Beijing, China.
    2016.07.31 Alluxio Cache Policy Optimization and Large-Scale Performance Evaluation. Shanghai Streaming Processing Tech Meetup. Shanghai, China.
    2016.05.21 Features and Case Studies of Alluxio, a Memory-Centric Big Data Storage System. Nanjing Big Data Tech Meetup. Nanjing, China.
    2016.05.14 Features and Use Cases of Alluxio, an Open Source Virtual Big Data Storage System. Database Technology Conference China. Beijing, China.
    2016.03.18 Features and Case Studies of Alluxio, a Memory-Centric Big Data Storage System. China Hadoop Summit. Beijing, China.
    2015.10.25 Basic Principles of the Tachyon Storage System and Its Use with Spark. Apache Roadshow 2015-China. Beijing, China.
    2015.07.25 A High-level Unified Programming Model and Platform for Big Data Analytics. Nanjing Big Data Tech Meetup. Nanjing, China.
    2015.06.27 The new features and case studies of Tachyon. Spark Meetup Beijing. Beijing, China.
    2014.12.11 The principles of Tachyon and its performance evaluation platform. Microsoft Research Asia (MSRA). Beijing, China.
    2014.11.09 Marlin: Efficient Large-Scale Distributed Matrix Computation with Spark. Spark Meetup Beijing, hosted by Intel Labs. Beijing, China.
    2014.07.26 Training Large Scale Deep Neural Networks on the Intel Xeon Phi Many-core Coprocessor. Accurate Marketing and Platforms Building in Big Data, hosted by InfoQ. Nanjing, China.
    2014.04.19 Fast Frequent Itemset Mining Algorithm with Spark. Spark Summit China. Beijing, China.

  • Work Experience: During my Ph.D. program, I held internships at several technology companies, including Microsoft Research Asia, Intel, Baidu, and Transwarp.

Awards and Honours


  • Awarded the 2015 China National Scholarship (the highest-level scholarship established by the Chinese government).
  • Awarded the 2015 Principal Scholarship of Nanjing University (awarded to only two CS Ph.D. students).
  • Awarded the 2014 Tung OOCL Scholarship.
  • Awarded the 2013 Google Excellence Scholarship.

Contest Awards

  • Won the CloudSort championship in the 2016 SortBenchmark competition.
  • Won first prize in the 2015 National Cloud Application and Innovation Contest (first place in the technique track).
  • Runner-up in the large scale route planning track of the 2012 Internet Contest for Cloud & Mobile Computing.
  • Third place in the short text classification track of the 2012 Internet Contest for Cloud & Mobile Computing.
  • Fourth place in the image search track of the 2012 Internet Contest for Cloud & Mobile Computing.
  • Received the Commendation of Effort Award in the 2011 Microsoft-Morgan Stanley Cup Finance High Performance Computing Contest.