Publications

Publications in reverse chronological order.

2023

  1. ASPLOS
    Exploiting the Regular Structure of Modern Quantum Architectures for Compiling and Optimizing Programs with Permutable Operators
    Yuwei Jin, Fei Hua, Yanhao Chen, and 3 more authors
    In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4, 2023
  2. HPCA
    A Pulse Generation Framework with Augmented Program-aware Basis Gates and Criticality Analysis
    Yanhao Chen, Yuwei Jin, Fei Hua, and 4 more authors
    In 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2023
  3. ASPLOS
    CaQR: A Compiler-Assisted Approach for Qubit Reuse through Dynamic Circuit
    Fei Hua, Yuwei Jin, Yanhao Chen, and 6 more authors
    In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, 2023

2021

  1. ASPLOS
    Time-Optimal Qubit Mapping
    Chi Zhang, Ari B. Hayes, Longfei Qiu, and 3 more authors
    In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
  2. ICPP
    BGPQ: A Heap-Based Priority Queue Design for GPUs
    Yanhao Chen, Fei Hua, Yuwei Jin, and 1 more author
    In Proceedings of the 50th International Conference on Parallel Processing, 2021
  3. MICRO
    AutoBraid: A Framework for Enabling Efficient Surface Code Communication in Quantum Computing
    Fei Hua, Yanhao Chen, Yuwei Jin, and 4 more authors
    In MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

2019

  1. CGO
    Decoding CUDA Binary
    Ari B. Hayes, Fei Hua, Jin Huang, and 2 more authors
    In 2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2019

2018

  1. USENIX ATC
    Locality-Aware Software Throttling for Sparse Matrix Operation on GPUs
    Yanhao Chen, Ari B. Hayes, Chi Zhang, and 2 more authors
    In Proceedings of the 2018 USENIX Conference on Usenix Annual Technical Conference, 2018

2017

  1. TACO
    LD: Low-Overhead GPU Race Detection Without Access Monitoring
    Pengcheng Li, Xiaoyu Hu, Dong Chen, and 4 more authors
    ACM Trans. Archit. Code Optim., Mar 2017
  2. POMACS
    A Simple Yet Effective Balanced Edge Partition Model for Parallel Computing
    Lingda Li, Robel Geda, Ari B. Hayes, and 4 more authors
    Proc. ACM Meas. Anal. Comput. Syst., Jun 2017
  3. SIGMETRICS
    A Simple Yet Effective Balanced Edge Partition Model for Parallel Computing
    Lingda Li, Robel Geda, Ari B. Hayes, and 4 more authors
    In Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, Jun 2017
  4. USENIX ATC
    GPU Taint Tracking
    Ari B. Hayes, Lingda Li, Mohammad Hedayati, and 3 more authors
    In Proceedings of the 2017 USENIX Conference on Usenix Annual Technical Conference, Jun 2017

2016

  1. Middleware
    Orion: A Framework for GPU Occupancy Tuning
    Ari B. Hayes, Lingda Li, Daniel Chavarría-Miranda, and 2 more authors
    In Proceedings of the 17th International Middleware Conference, Jun 2016
  2. DATE
    Critical points based register-concurrency autotuning for GPUs
    Ang Li, Shuaiwen Leon Song, Akash Kumar, and 3 more authors
    In 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), Jun 2016
  3. HPDC
    New-Sum: A Novel Online ABFT Scheme For General Iterative Methods
    Dingwen Tao, Shuaiwen Leon Song, Sriram Krishnamoorthy, and 5 more authors
    In Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, Jun 2016
  4. ICS
    Tag-Split Cache for Efficient GPGPU Cache Utilization
    Lingda Li, Ari B. Hayes, Shuaiwen Leon Song, and 1 more author
    In Proceedings of the 2016 International Conference on Supercomputing, Jun 2016

2014

  1. ICS
    Unified On-Chip Memory Allocation for SIMT Architecture
    Ari B. Hayes and Eddy Z. Zhang
    In Proceedings of the 28th ACM International Conference on Supercomputing, Jun 2014
  2. ISMM
    Massive Atomics for Massive Parallelism on GPUs
    Ian J. Egielski, Jesse Huang, and Eddy Z. Zhang
    In Proceedings of the 2014 International Symposium on Memory Management, Jun 2014

2013

  1. JParallel
    An Infrastructure for Tackling Input-Sensitivity of GPU Program Optimizations
    Xipeng Shen, Yixun Liu, Eddy Z. Zhang, and 1 more author
    Int. J. Parallel Program., Dec 2013
  2. PPoPP
    Complexity Analysis and Algorithm Design for Reorganizing Data to Minimize Non-Coalesced Memory Accesses on GPU
    Bo Wu, Zhijia Zhao, Eddy Zheng Zhang, and 2 more authors
    In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Dec 2013

2012

  1. TPDS
    The Significance of CMP Cache Sharing on Contemporary Multithreaded Applications
    Eddy Z. Zhang, Yunlian Jiang, and Xipeng Shen
    IEEE Transactions on Parallel and Distributed Systems, Dec 2012

2011

  1. ASPLOS
    On-the-Fly Elimination of Dynamic Irregularities for GPU Computing
    Eddy Z. Zhang, Yunlian Jiang, Ziyu Guo, and 2 more authors
    In Proceedings of the Sixteenth International Conference on Architectural Support for Programming Languages and Operating Systems, Dec 2011
  2. OOPSLA
    A Step towards Transparent Integration of Input-Consciousness into Dynamic Program Optimizations
    Kai Tian, Eddy Zhang, and Xipeng Shen
    In Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications, Dec 2011
  3. PACT
    Correctly Treating Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU
    Ziyu Guo, Eddy Zheng Zhang, and Xipeng Shen
    In 2011 International Conference on Parallel Architectures and Compilation Techniques, Dec 2011
  4. PACT
    Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control
    Bo Wu, Eddy Z. Zhang, and Xipeng Shen
    In 2011 International Conference on Parallel Architectures and Compilation Techniques, Dec 2011

2010

  1. JPEVA
    Trace data characterization and fitting for Markov modeling
    Giuliano Casale, Eddy Z. Zhang, and Evgenia Smirni
    Performance Evaluation, Dec 2010
  2. JPEVA
    KPC-Toolbox: Best Recipes for Automatic Trace Fitting Using Markovian Arrival Processes
    Giuliano Casale, Eddy Z. Zhang, and Evgenia Smirni
    Perform. Eval., Sep 2010
  3. PPoPP
    Does Cache Sharing on Modern CMP Matter to the Performance of Contemporary Multithreaded Programs?
    Eddy Z. Zhang, Yunlian Jiang, and Xipeng Shen
    In Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Sep 2010
  4. CGO
    Exploiting Statistical Correlations for Proactive Prediction of Program Behaviors
    Yunlian Jiang, Eddy Z. Zhang, Kai Tian, and 4 more authors
    In Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization, Sep 2010
  5. CC
    Is Reuse Distance Applicable to Data Locality Analysis on Chip Multiprocessors?
    Yunlian Jiang, Eddy Z. Zhang, Kai Tian, and 1 more author
    In Compiler Construction, Sep 2010
  6. ICS
    Streamlining GPU Applications on the Fly: Thread Divergence Elimination through Runtime Thread-Data Remapping
    Eddy Z. Zhang, Yunlian Jiang, Ziyu Guo, and 1 more author
    In Proceedings of the 24th ACM International Conference on Supercomputing, Sep 2010
  7. OOPSLA
    An Input-Centric Paradigm for Program Dynamic Optimizations
    Kai Tian, Yunlian Jiang, Eddy Z. Zhang, and 1 more author
    In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, Sep 2010

2009

  1. The Study and Handling of Program Inputs in the Selection of Garbage Collectors
    Xipeng Shen, Feng Mao, Kai Tian, and 1 more author
    SIGOPS Oper. Syst. Rev., Jul 2009
  2. IPDPS
    A cross-input adaptive framework for GPU program optimizations
    Yixun Liu, Eddy Z. Zhang, and Xipeng Shen
    In 2009 IEEE International Symposium on Parallel & Distributed Processing, Jul 2009
  3. VEE
    Influence of Program Inputs on the Selection of Garbage Collectors
    Feng Mao, Eddy Z. Zhang, and Xipeng Shen
    In Proceedings of the 2009 ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, Jul 2009

2008

  1. QEST
    KPC-Toolbox: Simple Yet Effective Trace Fitting Using Markovian Arrival Processes
    Giuliano Casale, Eddy Z. Zhang, and Evgenia Smirni
    In 2008 Fifth International Conference on Quantitative Evaluation of Systems, Jul 2008