Ankush Mandal

Ph.D. in Computer Science

Georgia Institute of Technology

About

I recently completed my Ph.D. in Computer Science at Georgia Tech, where I worked with Dr. Vivek Sarkar in the Habanero Extreme Scale Software Research Lab and Dr. Anshumali Shrivastava in the RUSH Lab. My research focuses on the intersection of Parallel Computing, Performance Optimization, and Machine Learning for Big Data.

Quick links: CV | Resume

Interests

  • Parallel Computing
  • Parallel Randomized Algorithms for Big Data
  • Compiler Optimizations
  • Performance Optimization of Approximate Algorithms on Modern Architectures (e.g. Multi-core, Many-core, SIMD, GPU processors)
  • High Performance Libraries for Machine Learning Kernels

Education

  • Ph.D. in Computer Science, 2020

    Georgia Institute of Technology

  • M.S. (Thesis) in Computer Science, 2017

    Rice University

  • B.E. in Electronics and Telecommunication Engineering, 2012

    Jadavpur University, India

Professional Experience


R&D Intern, Energy and Performance Analysis

Intel

May 2018 – Jul 2018 | Austin, TX, USA
Energy and performance analysis of convolutions in popular Convolutional Neural Networks on x86 CPUs (Broadwell and Skylake architectures).
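As a rough illustration of this kind of measurement (the actual analysis relied on hardware counters and vendor profiling tools; this is only a hedged sketch with made-up layer sizes), the snippet below times one convolution layer lowered to im2col + GEMM and converts elapsed time to achieved GFLOP/s:

```python
import time
import numpy as np

def conv_gflops(C=64, K=64, H=56, W=56, R=3, S=3, trials=5):
    """Achieved GFLOP/s for one conv layer, lowered to im2col + GEMM.

    Direct convolution performs 2*K*C*R*S*P*Q floating-point ops,
    where P x Q is the output size. Layer sizes here are illustrative.
    """
    P, Q = H - R + 1, W - S + 1
    inp = np.random.rand(C, H, W).astype(np.float32)
    wgt = np.random.rand(K, C * R * S).astype(np.float32)
    # im2col: unfold every receptive field into one column of a matrix
    cols = np.empty((C * R * S, P * Q), dtype=np.float32)
    row = 0
    for c in range(C):
        for r in range(R):
            for s in range(S):
                cols[row] = inp[c, r:r + P, s:s + Q].reshape(-1)
                row += 1
    flops = 2 * K * C * R * S * P * Q
    best = float("inf")
    for _ in range(trials):
        start = time.perf_counter()
        wgt @ cols  # the GEMM that performs the convolution arithmetic
        best = min(best, time.perf_counter() - start)
    return flops / best / 1e9

print(f"{conv_gflops():.1f} GFLOP/s")
```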

Graduate Intern

Intel Labs

Jan 2017 – May 2017 | Santa Clara, CA, USA
Performance optimization of the direct convolution kernel in the open-source LIBXSMM library, targeting convolutions in popular Convolutional Neural Networks on x86 HPC processors, specifically the Intel Xeon Phi Knights Landing CPU.
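For context, a direct convolution is the loop nest below; the optimization work is about blocking and vectorizing exactly this kind of nest for the target microarchitecture. This naive NumPy version is only an illustrative reference, not the LIBXSMM kernel:

```python
import numpy as np

def direct_conv2d(inp, weights, stride=1):
    """Naive direct convolution, single image, NCHW-style layout.

    out[k, p, q] = sum over (c, r, s) of
        inp[c, p*stride + r, q*stride + s] * weights[k, c, r, s]

    Optimized kernels block these loops for cache reuse and vectorize
    over channels; this version only states the computation.
    """
    C, H, W = inp.shape          # input channels, height, width
    K, _, R, S = weights.shape   # output channels, filter height/width
    P = (H - R) // stride + 1    # output height
    Q = (W - S) // stride + 1    # output width
    out = np.zeros((K, P, Q), dtype=inp.dtype)
    for k in range(K):
        for c in range(C):
            for r in range(R):
                for s in range(S):
                    # strided window of the input, broadcast over (P, Q)
                    out[k] += (weights[k, c, r, s]
                               * inp[c, r:r + P * stride:stride,
                                        s:s + Q * stride:stride])
    return out
```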

Intern

AMD

Jun 2016 – Aug 2016 | Austin, TX, USA
Analysis and performance improvement of an auto-tuning GEMM framework for Caffe (a popular deep learning framework) workloads on GPU architectures.
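A toy version of the auto-tuning idea (hypothetical parameter space, and NumPy on the CPU rather than a GPU kernel): time each candidate tile size for a blocked GEMM and keep the fastest one for the machine and problem size at hand.

```python
import time
import numpy as np

def blocked_gemm(A, B, tile):
    """Tiled matrix multiply; the tile size is the tunable knob."""
    n = A.shape[0]
    C = np.zeros((n, n), dtype=A.dtype)
    for i in range(0, n, tile):
        for k in range(0, n, tile):
            for j in range(0, n, tile):
                C[i:i + tile, j:j + tile] += (A[i:i + tile, k:k + tile]
                                              @ B[k:k + tile, j:j + tile])
    return C

def autotune_tile(n=512, candidates=(32, 64, 128, 256)):
    """Empirically pick the fastest tile size on this machine."""
    A = np.random.rand(n, n).astype(np.float32)
    B = np.random.rand(n, n).astype(np.float32)
    timings = {}
    for tile in candidates:
        start = time.perf_counter()
        blocked_gemm(A, B, tile)
        timings[tile] = time.perf_counter() - start
    return min(timings, key=timings.get)

print("best tile:", autotune_tile())
```

Real auto-tuners search a much larger space (tile shapes, unroll factors, work-group sizes) and cache the winners per problem size; the structure of the search loop is the same.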

Recent Publications

NinjaVec: Learning Word Embeddings with Word2Vec at Lightning Speed. (In Submission), 2020.

Matryoshka: a Nested Sketching Strategy for GPU-scale Parallelism and Skewed Data. (In Submission), 2020.

Topkapi: Parallel and Fast Sketches for Finding Top-K Frequent Elements. Advances in Neural Information Processing Systems (NeurIPS), 2018.

PDF Code
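Very roughly, the core idea in Topkapi combines a Count-Min-style hashed bucket array with a Frequent-style (majority-vote) counter per bucket, so updates are cheap and per-thread sketches can be merged. A deliberately simplified, single-row Python sketch of that idea (not the paper's implementation):

```python
import hashlib

class TopkapiLikeSketch:
    """Simplified single-row illustration: hash items into buckets,
    and in each bucket run a majority-vote counter that tracks one
    candidate heavy hitter. Mergeability across threads is what makes
    this style of sketch attractive for parallel top-K.
    """
    def __init__(self, width=1024):
        self.width = width
        self.candidate = [None] * width  # current heavy-hitter guess per bucket
        self.count = [0] * width         # its vote count

    def _bucket(self, item):
        h = hashlib.md5(item.encode()).digest()
        return int.from_bytes(h[:8], "little") % self.width

    def add(self, item):
        b = self._bucket(item)
        if self.candidate[b] == item:
            self.count[b] += 1
        elif self.count[b] == 0:
            self.candidate[b], self.count[b] = item, 1
        else:
            self.count[b] -= 1  # majority-vote decrement

    def top_k(self, k):
        pairs = [(c, n) for c, n in zip(self.candidate, self.count) if c]
        return sorted(pairs, key=lambda p: -p[1])[:k]

stream = ["a"] * 50 + ["b"] * 30 + list("cdefgh") * 2
sk = TopkapiLikeSketch(width=256)
for x in stream:
    sk.add(x)
print(sk.top_k(2))  # likely [('a', 50), ('b', 30)] for this skewed stream
```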

Using Dynamic Compilation to Achieve Ninja Performance for CNN Training on Many-Core Processors. European Conference on Parallel Processing (Euro-Par), 2018.

PDF Code

An Adaptive Differential Evolution Algorithm for Global Optimization in Dynamic Environments. IEEE Transactions on Cybernetics, 2013.

PDF