About me
I am a Senior Researcher at Microsoft, where I work on hardware-software co-design for machine learning and high-performance computing applications in the Systems Innovation Group. Before joining Microsoft, I worked as a Silicon Design Engineer at AMD Research for about 4.5 years, where my research charter spanned GPU architecture, Networks-on-Chip (NoCs), and cryogenic computing. Prior to joining AMD Research, I was part of the Synergy Lab at Georgia Institute of Technology, Atlanta, GA, where I worked on TLBs and NoCs and was also part of the DARPA CHIPS program under the supervision of Prof. Tushar Krishna. I received my Master’s in Electrical and Computer Engineering from Georgia Institute of Technology in 2017 with a thesis on distributed TLB architectures. I later pursued my PhD under Prof. Tushar Krishna on the design of interconnection systems for multi-chiplet GPUs.
I graduated with a Bachelor's in Electrical and Electronics Engineering from BITS Pilani, India, in 2014. I also spent some time working as a full-time engineer with the Oracle Solaris team and the NVIDIA iGPU team in Bangalore, India. Some of my work from Oracle is also published as blog posts.
I am also an active developer within the gem5 community. You can find more info about the tools I developed and maintain in the tools tab.
News
- May 2024: Paper on efficient execution of attention kernels for long context lengths. Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers
- Apr 2024: I presented our work on Fine-Grain DVFS at ASPLOS 2024. Predict; Do not React for Enabling Efficient Fine Grain DVFS in GPUs
- Jun 2023: Paper on Fine-Grain DVFS accepted at ASPLOS 2024. Predict; Do not React for Enabling Efficient Fine Grain DVFS in GPUs
- Sep 2022: Documentation for HeteroGarnet is now available on the gem5 documentation website.
- Apr 2022: I joined Microsoft Research as a Senior Researcher!
- Apr 2022: Paper on Fine-Grain DVFS released on arXiv: Predict; Do not React for Enabling Efficient Fine Grain DVFS in GPUs
- May 2021: Paper on underclocking accepted in NOCS 2021: DUB: Dynamic Underclocking and Bypassing in NoCs for Heterogeneous GPU Workloads
- Apr 2021: My first book is now available on several e-stores: Network-on-Chip Security and Privacy (Amazon, Springer)
- Oct 2020: New paper describing gem5 version-20: The gem5 Simulator: Version 20.0+
- Jun 2020: Paper on chiplet topologies accepted in DAC 2020: Kite: A Family of Heterogeneous Interposer Topologies Enabled via Accurate Interconnect Modeling
- Jun 2020: Released a new simulation tool - HeteroGarnet.
- Jun 2019: Paper on simulating ML applications accepted in IISWC 2019: Optimizing GPU cache policies for MI workloads [https://ieeexplore.ieee.org/document/9041977]
- Jun 2018: Paper on TLB architectures accepted in MICRO 2018: Scalable Distributed Last-Level TLBs Using Low-Latency Interconnects