- Warp Scheduling for Fine-Grained Synchronization
- GPGPU Power Modeling for Multi-domain Voltage-Frequency Scaling
- Amdahl's Law in the Datacenter Era: A Market for Fair Processor Allocation
- Amdahl's Law in Big Data Analytics: Alive and Kicking in TPCx-BB (BigBench)
- A Novel Register Renaming Technique for Out-of-Order Processors
- KPart: A Hybrid Cache Partitioning-Sharing Technique for Commodity Multicores
- Routerless Network-on-Chip
- SmarCo: An Efficient Many-Core Processor for High-Throughput Applications in Datacenters
- Enabling Efficient Network Service Function Chain Deployment on Heterogeneous Server Platform
- GraphR: Accelerating Graph Processing Using ReRAM
- Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective
- GraphP: Reducing Communication for PIM-Based Graph Processing with Efficient Data Partition
- Memory System Design for Ultra Low Power, Computationally Error Resilient Processor Microarchitectures
- WIR: Warp Instruction Reuse to Minimize Repeated Computations in GPUs