I graduated from the University of Delaware with a Ph.D. in Computer Science in summer 2016.

My Ph.D. advisor was Dr. John Cavazos.

My research focused on energy-efficient high-performance computing (HPC), in particular on machine learning methods that optimize HPC applications for performance and energy efficiency. I have extensive experience with energy management techniques involving DVFS and CPU clock modulation. I also applied polyhedral compilation and machine learning techniques to optimize and auto-tune applications for energy. In addition, my research involved GPGPU computing and sampling-based search techniques on large file systems.

I joined Intel Corporation in September 2016 after graduation.


Current Research Topics

Energy Optimization for HPC Applications

HPC applications differ in how they respond to frequency changes. We propose executing applications in a hybrid fashion, in which frequencies are changed adaptively with low overhead, resulting in higher energy efficiency. We combine concurrency throttling and power management techniques to further improve an application's power signature with low performance impact.

Predictive Energy Modeling of Loop Optimizations

We study the power impact of loop optimizations in the polyhedral optimization space. Based on this study, we build predictive energy models that predict the best auto-tuned code version for energy efficiency. We are also working on making large applications amenable to polyhedral optimization.


  • [WORKSHOP] Wei Wang, Allan Porterfield, John Cavazos, Sridutt Bhalachandra, "Compiler Transformations Meet CPU Clock Modulation and Power Capping," the 5th International Workshop on Power-aware Algorithms, Systems, and Architectures (PASA'16), in conjunction with ICPP'16.
  • [WORKSHOP] Allan Porterfield, Sridutt Bhalachandra, Wei Wang, Rob Fowler, "Variability: A Tuning Headache," 1st International Workshop on Variability in Parallel and Distributed Systems (VarSys), in conjunction with IPDPS'16.
  • [POSTER] Wei Wang, Edgar A. León, "Evaluating DVFS and Concurrency Throttling on IBM's Power8 Architecture," in IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC 15), November 2015.
  • [CONFERENCE] Wei Wang, Allan Porterfield, John Cavazos, Sridutt Bhalachandra, "Using Per-Loop CPU Clock Modulation for Energy Efficiency in OpenMP Applications," at 2015 International Conference on Parallel Processing (ICPP 2015), Beijing, China, 2015.
  • [POSTER] William Killian, Wei Wang, Eunjung Park, John Cavazos, "Energy Tuning of Polyhedral Kernels on Multicore and Many-Core Architectures," at SEAK: DAC Workshop on Suite of Embedded Applications and Kernels, SEAK 2014, San Francisco, CA, USA, 2014.
  • [E-JOURNAL] Wei Wang, Lifan Xu, John Cavazos, Howie H. Huang, Matthew Kay, "Fast Acceleration of 2D Wave Propagation Simulations Using Modern Computational Accelerators," PLoS ONE 9(1): e86484.
  • [WORKSHOP] Wei Wang, John Cavazos, Allan Porterfield, "Energy Autotuning using the Polyhedral Approach," 4th International Workshop on Polyhedral Compilation Techniques (IMPACT 2014), in conjunction with HiPEAC'14, Vienna, Austria, 2014.
  • [WORKSHOP] Lifan Xu, Wei Wang, Marco A. Alvarez, John Cavazos, Dongping Zhang, "Parallelization of Shortest Path Graph Kernels on Multi-Core CPUs and GPUs," Programmability Issues for Heterogeneous Multicores (MultiProg 14), in conjunction with HiPEAC'14, Vienna, Austria, 2014.
  • [WORKSHOP] Allan Porterfield, Rob Fowler, Sridutt Bhalachandra, Wei Wang, "OpenMP and MPI Application Energy Measurement Variation," 1st International Workshop on Energy Efficient Super Computing (E2SC 2013), in conjunction with SC'13.
  • [JOURNAL] Howie H. Huang, Nan Zhang, Wei Wang, Gautam Das, Alex Szalay, "Just-In-Time Analytics on Large File Systems," IEEE Transactions on Computers, vol. 61, no. 11, pp. 1651-1664, Nov. 2012.
  • [CONFERENCE] Wei Wang, Howie H. Huang, Matthew Kay, John Cavazos, "GPGPU Accelerated Cardiac Arrhythmia Simulations," 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC'11).
  • [CONFERENCE] Howie H. Huang, Nan Zhang, Wei Wang, Gautam Das, Alex Szalay, "Just-In-Time Analytics on Large File Systems," 9th USENIX Conference on File and Storage Technologies (FAST'11).


GPU Accelerated Cardiac Arrhythmia Simulations

We parallelized a cardiac arrhythmia simulation application using CUDA, OpenCL, OpenMP, and OpenACC, and achieved speedups of hundreds of times over the sequential version. Using OpenACC, we achieved almost the same speedup with minimal modification to the application code.

An Encoding/Decoding Algorithm for Calculating Tree Similarity

I modified a well-known graph kernel (the Weisfeiler-Lehman subtree kernel) to encode a forest of directed trees into a set of numbers, and then derived a histogram of those numbers for similarity calculation. Each number can be decoded back into the original tree structure.
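To make the idea concrete, here is a minimal Python sketch of one way such an encoding could work. It is an illustration only, not the project's actual algorithm: the `TreeEncoder` class, the nested-tuple tree representation, and the histogram-intersection similarity are all assumptions made for this example. Each distinct unordered subtree shape receives one integer, and a reverse table allows every integer to be decoded back into its tree.

```python
from collections import Counter


class TreeEncoder:
    """Hypothetical encoder: one integer per distinct unordered subtree shape.

    A tree is a nested tuple of its child subtrees; a leaf is the empty tuple ().
    """

    def __init__(self):
        self._code_of = {}   # sorted tuple of child codes -> integer code
        self._shape_of = []  # integer code -> sorted tuple of child codes

    def encode(self, tree):
        """Return the integer code of the subtree rooted at `tree`."""
        key = tuple(sorted(self.encode(child) for child in tree))
        if key not in self._code_of:
            self._code_of[key] = len(self._shape_of)
            self._shape_of.append(key)
        return self._code_of[key]

    def decode(self, code):
        """Rebuild a tree (children in canonical order) from its code."""
        return tuple(self.decode(c) for c in self._shape_of[code])

    def histogram(self, tree):
        """Count the codes of the subtrees rooted at every node of `tree`."""
        counts = Counter()

        def visit(node):
            counts[self.encode(node)] += 1
            for child in node:
                visit(child)

        visit(tree)
        return counts


def similarity(hist_a, hist_b):
    """Histogram intersection: number of subtree codes the two trees share."""
    return sum((hist_a & hist_b).values())
```

For example, with `enc = TreeEncoder()`, the trees `((), ((), ()))` and `(((), ()), ())` receive the same code because child order is ignored, and `enc.decode(enc.encode(t))` returns the canonical form of `t`. The same encoder instance has to be shared across all trees being compared so that their codes are consistent.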

Polyhedral Optimization for Large Applications

We provide two applications larger than those in PolyBench that can go through a polyhedral compiler: LULESH and Brdr2d. The latter is also the application used in the project above.

Just-In-Time File System Analyzer: GLANCE

GLANCE performs just-in-time analytics on large file systems. It was tested on a file system with more than 1 billion inodes. It applies random sampling techniques and returns the approximate number of files and directories in the file system without traversing the entire directory tree.
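The sampling idea can be sketched with a random-descent estimator. The Python code below is a simplified illustration, not GLANCE's implementation: the in-memory directory representation (a dict with `"files"` and `"dirs"` keys) and the function names are assumptions made for this example. Each descent follows one random path from the root and weights the files it sees by the inverse of the probability of reaching that directory, which yields an unbiased estimate of the total file count. Averaging many descents reduces the variance.

```python
import random


def one_descent(root, rng):
    """Walk one random root-to-leaf path and return a file-count estimate.

    `root` is a hypothetical in-memory directory:
    {"files": <int>, "dirs": [<subdirectory>, ...]}.
    """
    estimate = 0.0
    weight = 1.0  # inverse probability of reaching the current directory
    node = root
    while True:
        estimate += weight * node["files"]
        subdirs = node["dirs"]
        if not subdirs:
            return estimate
        weight *= len(subdirs)      # each subdirectory is chosen with prob. 1/k
        node = rng.choice(subdirs)  # descend into one subdirectory only


def estimate_file_count(root, descents=1000, seed=0):
    """Average several independent descents to reduce variance."""
    rng = random.Random(seed)
    return sum(one_descent(root, rng) for _ in range(descents)) / descents
```

On a perfectly balanced tree every descent returns the exact count. On skewed trees individual descents vary widely, so more descents are needed for a stable estimate. A real tool would read directory entries from the file system instead of an in-memory structure.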

I have a GitHub page that hosts many other interesting projects.



University of Delaware

Ph.D. Computer Science — 2016

M.S. Computer Science — 2011

Huazhong University of Science and Technology

B.E. Computer Science — 2009



I welcome you to contact me through one of the methods below.

If you need to reach me via traditional mail, please consult my CV.