

Alexey Tumanov


Assistant Professor
School of Computer Science
College of Computing
Georgia Institute of Technology

KACB 3346
atumanov[at]gatech[dot]edu

[Google Scholar] [arXiv] [DBLP] [GitHub] [Twitter] [ACM]

Brief Bio

I have been a tenure-track Assistant Professor in the School of Computer Science at Georgia Tech since August 2019. I completed my postdoc at the University of California, Berkeley, working with Ion Stoica and collaborating closely with Joseph Gonzalez. I completed my Ph.D. at Carnegie Mellon University, advised by Gregory Ganger. At Carnegie Mellon, I was honored with the prestigious NSERC Alexander Graham Bell Canada Graduate Scholarship (NSERC CGS-D3) and partially funded by the Intel Science and Technology Centre for Cloud Computing and the Parallel Data Lab industry consortium. I came to Carnegie Mellon from the University of Toronto, where I worked with Eyal de Lara on agile stateful VM replication with para-virtualization. My interest in cloud computing, datacenter operating systems, and programming the cloud brought me to the University of Toronto from industry, where I had been developing cluster middleware for distributed datacenter resource management. I build novel distributed systems and contribute to open source. My earliest open source contributions were to Intel Cluster Ready Open Cluster Stack, where I was one of the key contributors.

Research Synopsis

My current research interest is in systems support and resource management for distributed machine learning frameworks and applications. Specifically, I am working on distributed systems and scheduling algorithms for soft real-time Machine Learning inference and for co-scheduling ML inference and online training. This builds on the body of research and development at Carnegie Mellon on modeling, designing, and developing abstractions, primitives, algorithms, and systems for a general resource management framework with support for static and dynamic heterogeneity, hard and soft placement constraints, time-varying resource capacity guarantees, and combinatorial constraints in heterogeneous resource contexts. Cost- and latency-efficient resource management is fundamental to commoditizing Machine Learning.

SAIL: Systems for Artificial Intelligence Lab @ GT

CURRENT

PhD Students
Alind Khare : SCS PhD student
Payman Behnam : ECE PhD student
Amey Agrawal : SCS PhD student
Debopam Sanyal : SCS PhD student
Dhruv Garg : SCS PhD student, co-advised with Ada Gavrilovska
Jin Heo : SCS PhD student, co-advised with Ada Gavrilovska

Masters Students
Aditya Annavajjala : MSCS GRA
Animesh Agrawal : BSMS CS student
Monish Ramadoss : MSCS student
Sameer Reddy : MSCS student
Pranav Gadikar : MSCS student
Anshul Ahluwalia : MSCS student
Rohit Das : MSCS student
Srihas Yarlagadda : MSCS student
Brian Model : BSMS CS student

Undergraduate Students
Aditi Arun : Undergraduate Research Assistant, ML 4 Health
Varun Mehrotra : Undergraduate Research Assistant

ALUMNI

Yanbo Xu : CSE PhD student co-advised with Chao Zhang, now at Microsoft
Elton Pinto : MSCS GRA with SAIL, now on a startup tour with Sutter Hill
Snigdha Grandhi : MSCS with SAIL, now SDE-3, cloud tech division with Adobe
Pranavi Bajjuri : MSCS GRA with SAIL, now SWE@Confluent
Irene Lee : Software Engineering Intern at AnyScale
Meghavarnika Budati : MSCS, now a Software Engineer at Snowflake
Manas Sahni : MSCS, now at NVIDIA, Systems for ML Software Engineer
Shreya Varshini : MSCS, now at Facebook, Hardware Insights Engineer
Luis Pastrana : MSCS, now at Citadel Securities
Khang Vu : Undergrad, now EW/Avionics Software Researcher at Georgia Tech Research Institute
Shiva Devarajan : Undergrad
Shreyas Casturi : Undergrad

Publications

[OSDI24] Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve
Amey Agrawal, Nitin Kedia, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, Bhargav Gulavani, Alexey Tumanov, Ramachandran Ramjee
Conditionally accepted to the 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI'24), Santa Clara, USA, July 2024.
[Abstract] [PDF] [BibTeX]

[MLSYS24] Vidur: A Large-scale Simulation Framework for LLM Inference
Amey Agrawal, Nitin Kedia, Jayashree Mohan, Ashish Panwar, Nipun Kwatra, Bhargav S. Gulavani, Ramachandran Ramjee, Alexey Tumanov
To appear in the 7th Annual Conference on Machine Learning Systems (MLSys'24), Santa Clara, USA, May 2024.
[Abstract] [PDF] [BibTeX]

[IPDPS24] Harmonica: Hybrid Accelerator to Overcome Imperfections of Mixed-signal DNN Accelerators
P. Behnam, U. Kamal, A. Shafiee, Alexey Tumanov, Saibal Mukhopadhyay
To appear in the 38th IEEE International Parallel and Distributed Processing Symposium (IPDPS), USA, 2024.
[Abstract] [PDF] [BibTeX]

[ML4H23] TransEHR: Self-Supervised Transformer for Clinical Time Series Data
Yanbo Xu, Shangqing Xu, Manav Ramprasad, Alexey Tumanov, Chao Zhang
To appear in Proc. of Machine Learning for Health (ML4H'23), Dec 10, 2023.
[Abstract] [PDF] [Code] [BibTeX]
@inproceedings{transehr-ml4h23,
    title={TransEHR: Self-Supervised Transformer for Clinical Time Series Data},
    author={Yanbo Xu and Shangqing Xu and Manav Ramprasad and
            Alexey Tumanov and Chao Zhang},
    year = {2023},
    booktitle = {Proceedings of the 3rd Machine Learning for Health Symposium},
    series = {ML4H'23},
    location = {New Orleans, United States}
}

[IEEEMicro23] Hardware-Software Co-design for Real-time Latency-accuracy Navigation in TinyML Applications
Payman Behnam*, Jianming Tong*, Alind Khare, Yangyu Chen, Pranav Gadikar, Abhimanyu Bambhaniya, Tushar Krishna, Alexey Tumanov
IEEE Micro Special Issue on TinyML, 2023.
[Abstract] [PDF] [BibTeX]
@article{sushi-micro23,
  author = {P. Behnam and J. Tong and A. Khare and Y. Chen and Y. Pan and P. Gadikar and A. Bambhaniya 
            and T. Krishna and A. Tumanov},
  journal = {IEEE Micro},
  title = {Hardware-Software co-design for real-time latency-accuracy navigation in tinyML applications},
  month = {sep},
  year = {2023},
  number = {01},
  issn = {1937-4143},
  pages = {1-7},
  keywords = {kernel;training;real-time systems;optimization;neural networks;system-on-chip;software},
  doi = {10.1109/MM.2023.3317243},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}

[arXiv23] Pareto-Secure Machine Learning (PSML): Fingerprinting and Securing Inference Serving Systems
Debopam Sanyal, Jui-Tse Hung, Manav Agrawal, Prahlad Jasti, Shahab Nikkhoo, Somesh Jha, Tianhao Wang, Sibin Mohan, Alexey Tumanov
arXiv, Aug 2023.
[Abstract] [PDF] [BibTeX]
  @misc{psml-arxiv23,
      title={Pareto-Secure Machine Learning (PSML): Fingerprinting and Securing Inference Serving Systems}, 
      author={Debopam Sanyal and Jui-Tse Hung and Manav Agrawal and Prahlad Jasti and Shahab Nikkhoo and 
              Somesh Jha and Tianhao Wang and Sibin Mohan and Alexey Tumanov},
      year={2023},
      eprint={2307.01292},
      archivePrefix={arXiv},
      primaryClass={cs.CR}
}

[arXiv23] DynaQuant: Compressing Deep Learning Training Checkpoints via Dynamic Quantization
Amey Agrawal, Sameer Reddy, Satwik Bhattamishra, Venkata Prabhakara Sarath Nookala, Vidushi Vashishth, Kexin Rong, Alexey Tumanov
arXiv, Jun 2023.
[Abstract] [PDF] [BibTeX]
  @misc{dynaquant-arxiv23,
      title={DynaQuant: Compressing Deep Learning Training Checkpoints via Dynamic Quantization}, 
      author={Amey Agrawal and Sameer Reddy and Satwik Bhattamishra and Venkata Prabhakara Sarath Nookala 
              and Vidushi Vashishth and Kexin Rong and Alexey Tumanov},
      year={2023},
      eprint={2306.11800},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

[MLSYS23] SubGraph Stationary Hardware-Software Inference Co-design
Payman Behnam*, Jianming Tong*, Alind Khare, Yangyu Chen, Yue Pan, Pranav Gadikar, Abhimanyu Bambhaniya, Tushar Krishna, Alexey Tumanov
In Proc. of 6th Conference on Machine Learning and Systems (MLSys'23), Jun 2023.
[Abstract] [PDF] [Slides] [Poster] [BibTeX]
@inproceedings{sushi-mlsys23,
    author = {Behnam, Payman and Tong, Jianming and Khare, Alind and Chen, Yangyu and
    Pan, Yue and Gadikar, Pranav and Bambhaniya, Abhimanyu and Krishna, Tushar and Tumanov, Alexey},
    title = {SubGraph Stationary Hardware-Software Inference Co-design},
    year = {2023},
    url = {https://proceedings.mlsys.org/paper_files/paper/2023/hash/0e65cd5d77a5e8b2d1f9ab31ba50b49b-Abstract-mlsys2023.html},
    booktitle = {Proceedings of the 6th Conference on Machine Learning and Systems},
    location = {Miami, Florida},
    series = {MLSys'23}
}

[ODIW@MLSYS23] Signed-Binary Networks: Improving Efficiency of Binary Networks by Exploiting Sparsity
Sachit Kuhar, Alexey Tumanov, Judy Hoffman
In Proc. of 3rd On-Device Intelligence Workshop, Machine Learning and Systems (MLSys'23), Jun 2023.
[Abstract] [PDF] [BibTeX]
@misc{kuhar2022signed,
  title={Signed-Binary Networks: 
	Improving Efficiency of Binary Networks by Exploiting Sparsity}, 
  author={Sachit Kuhar and Alexey Tumanov and Judy Hoffman},
  year={2022},
  eprint={2211.13838},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

[arXiv23] SuperFed: Weight Shared Federated Learning
Alind Khare, Animesh Agrawal, Myungjin Lee, Alexey Tumanov
arXiv, Jan 26, 2023.
[Abstract] [PDF] [BibTeX]
@misc{superfed-arxiv23,
  doi = {10.48550/ARXIV.2301.10879},
  url = {https://arxiv.org/abs/2301.10879},
  author = {Khare, Alind and Agrawal, Animesh and Lee, Myungjin and Tumanov, Alexey},
  title = {SuperFed: Weight Shared Federated Learning},
  publisher = {arXiv},
  year = {2023},
  month = {Jan},
}

[NeurIPS22] UnfoldML: A Cost-Aware 2-D Dynamic Prediction Pipeline for Multi-Stage Classification
Yanbo Xu, Alind Khare, Glenn Matlin, Monish Ramadoss, Rishi Kamaleswaran, Chao Zhang, Alexey Tumanov
In Proc. of 36th Conference on Neural Information Processing Systems (NeurIPS), Nov 2022.
[Abstract] [PDF] [Poster] [OpenReview] [BibTeX]
@inproceedings{unfoldml-neurips22,
 author = {Xu, Yanbo and Khare, Alind and Matlin, Glenn and Ramadoss, Monish and 
           Kamaleswaran, Rishikesan and Zhang, Chao and Tumanov, Alexey},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},
 pages = {4598--4611},
 publisher = {Curran Associates, Inc.},
 title = {UnfoldML: Cost-Aware and Uncertainty-Based Dynamic 2D Prediction for Multi-Stage Classification},
 url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/1d8f05e4da49a4e1e1b052a3046bceac-Paper-Conference.pdf},
 volume = {35},
 year = {2022}
}

[arXiv22] Signed Binary Weight Networks: Improving Efficiency of Binary Weight Networks by Exploiting Sparsity
Sachit Kuhar, Alexey Tumanov, Judy Hoffman
arXiv, Nov 25, 2022.
[Abstract] [PDF] [BibTeX]
@misc{https://doi.org/10.48550/arxiv.2211.13838,
  doi = {10.48550/ARXIV.2211.13838},
  url = {https://arxiv.org/abs/2211.13838},
  author = {Kuhar, Sachit and Tumanov, Alexey and Hoffman, Judy},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), 
	      Distributed, Parallel, and Cluster Computing (cs.DC), Performance (cs.PF), 
	      FOS: Computer and information sciences},
  title = {Signed Binary Weight Networks: Improving Efficiency of 
	   Binary Weight Networks by Exploiting Sparsity},
  publisher = {arXiv},
  year = {2022}
}

[SOCC22] ESCHER: Expressive Scheduling with Ephemeral Resources
Romil Bhardwaj, Alexey Tumanov, Stephanie Wang, Richard Liaw, Philipp Moritz, Robert Nishihara, Ion Stoica
In Proc. of 13th ACM Symposium on Cloud Computing (SoCC), Nov 2022.
[Abstract] [PDF] [BibTeX]
@inproceedings{escher-socc22,
    author = {Bhardwaj, Romil and Tumanov, Alexey and Wang, Stephanie and 
    Liaw, Richard and Moritz, Philipp and Nishihara, Robert and Stoica, Ion},
    title = {ESCHER: Expressive Scheduling with Ephemeral Resources},
    year = {2022},
    isbn = {9781450394147},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3542929.3563498},
    doi = {10.1145/3542929.3563498},
    booktitle = {Proceedings of the 13th Symposium on Cloud Computing},
    pages = {47--62},
    numpages = {16},
    location = {San Francisco, California},
    series = {SoCC '22}
}

[ICCD22] CoDG-ReRAM: An Algorithm-Hardware Co-design to Accelerate Semi-Structured GNNs on ReRAM
Yixuan Luo*, Payman Behnam*, Kiran Thorat, Zhuo Liu, Hongwu Peng, Shaoyi Huang, Shu Zhou, Omer Khan, Alexey Tumanov, Caiwen Ding, Tong Geng
In Proc. of 40th IEEE International Conference on Computer Design (ICCD), Oct 23-26 2022.
[Abstract] [PDF] [BibTeX]
@inproceedings{behnamgnn-iccd22,
  title = {CoDG-ReRAM: An Algorithm-Hardware Co-design 
           to Accelerate Semi-Structured GNNs on ReRAM},
  author = {Luo*, Yixuan  and Behnam*, Payman  and Thorat, Kiran and Liu, Zhuo and
            Peng, Hongwu  and Huang, Shaoyi and Zhou, Shu  and  Khan, Omer and 
            Tumanov, Alexey  and Ding, Caiwen and Geng, Tong},
  booktitle={40th IEEE International Conference on Computer Design (ICCD)},
  year={2022},
  organization={IEEE}
}

[ACSMD@ISCA22] Enabling Real-time DNN Switching via Weight-Sharing
Jianming Tong, Yangyu Chen, Yue Pan, Abhimanyu Bambhaniya, Alind Khare, Taekyung Heo, Alexey Tumanov, Tushar Krishna
In Proc. of 2nd Architecture, Compiler, and System Support for Multi-model DNN Workloads Workshop at ISCA'22 (ACSMD@ISCA'22), June 2022.
[Abstract] [PDF] [BibTeX]

[EuroPar22] Automatic Parallelization of Python Programs for Distributed Heterogeneous Computing
Jun Shirako, Akihiro Hayashi, Sri Raj Paul, Alexey Tumanov, Vivek Sarkar
In Proc. of 28th International European Conference on Parallel and Distributed Computing (EuroPar'22), Aug 2022.
[Abstract] [PDF] [arXiv] [BibTeX]
@InProceedings{automphc-europar22,
  author="Shirako, Jun
    and Hayashi, Akihiro
    and Paul, Sri Raj
    and Tumanov, Alexey
    and Sarkar, Vivek",
  editor="Cano, Jos{\'e}
    and Trinder, Phil",
  title="Automatic Parallelization of Python Programs for Distributed Heterogeneous Computing",
  booktitle="Euro-Par 2022: Parallel Processing",
  year="2022",
  publisher="Springer International Publishing",
  pages="350--366",
  isbn="978-3-031-12597-3"
}

[ICLR21] CompOFA – Compound Once-For-All Networks for Faster Multi-Platform Deployment
Manas Sahni, Shreya Varshini, Alind Khare, Alexey Tumanov
In Proc. of International Conference on Learning Representations (ICLR'21), May 4 2021.
[Video] [Abstract] [PDF] [BibTeX]
@inproceedings{compofa-iclr21,
  author    = {Manas Sahni and Shreya Varshini and Alind Khare and
               Alexey Tumanov},
  title     = {{C}omp{OFA}: Compound Once-For-All Networks for Faster Multi-Platform Deployment},
  month     = {May},
  booktitle = {Proc. of the 9th International Conference on Learning Representations},
  series = {ICLR '21},
  year = {2021},
  url       = {https://openreview.net/forum?id=IgIk8RRT-Z}
}

[EuroSys21] Rubberband: Cloud-based Hyperparameter Tuning
R. Liaw, U. Misra, L. Dunlap, J. Gonzalez, I. Stoica, Alexey Tumanov, K. Kandasamy, R. Bhardwaj
In Proc. of EuroSys'21, Apr 26-29, 2021.
[Abstract] [PDF] [BibTeX]
@inbook{rubberband-eusys21-liaw,
author = {Misra, Ujval and Liaw, Richard and Dunlap, Lisa and Bhardwaj, Romil and Kandasamy, Kirthevasan 
          and Gonzalez, Joseph E. and Stoica, Ion and Tumanov, Alexey},
title = {RubberBand: Cloud-Based Hyperparameter Tuning},
year = {2021},
isbn = {9781450383349},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3447786.3456245},
booktitle = {Proceedings of the 16th European Conference on Computer Systems},
pages = {327--342},
numpages = {16}
}

[SOCC20] InferLine: Latency-Aware Provisioning and Scaling for Prediction Serving Pipelines
Daniel Crankshaw, Gur-Eyal Sela, Xiangxi Mo, Corey Zumar, Ion Stoica, Joseph E. Gonzalez, Alexey Tumanov
In Proc. of Symposium on Cloud Computing (SoCC'20), Nov 2020.
[Abstract] [PDF] [BibTeX]
@inproceedings{inferline-socc20,
  author    = {Daniel Crankshaw and
               Gur-Eyal Sela and
               Xiangxi Mo and
               Corey Zumar and
               Ion Stoica and
               Joseph E. Gonzalez and
               Alexey Tumanov},
  title     = {{I}nfer{L}ine: {ML} Inference Pipeline Composition Framework},
  month     = {Nov},
  booktitle = {Proc. of the 11th ACM Symposium on Cloud Computing},
  series = {SOCC '20},
  year = {2020},
  Publisher = {ACM},
  url       = {http://arxiv.org/abs/1812.01776}
}

[PVLDB20] Cloudburst: Stateful Functions-as-a-Service
Vikram Sreekanti, Chenggang Wu, Charles Lin, Johann Schleier-Smith, Joseph Gonzalez, Joseph Hellerstein, Alexey Tumanov
In Proc. of PVLDB, 13(11):2438-2452, July 2020.
[Abstract] [PDF] [BibTeX]
@article{cloudburst2020,
  author    = {Vikram Sreekanti and  Chenggang Wu and Xiayue Charles Lin and Johann Schleier-Smith and
               Joseph E. Gonzalez and Joseph M. Hellerstein and Alexey Tumanov
               },
  title     = {Cloudburst: Stateful Functions-as-a-Service},
  journal   = {PVLDB},
  volume    = {13},
  year      = {2020},
  url       = {https://arxiv.org/abs/2001.04592},
}

[KDD20] HOLMES: Health OnLine Model Ensemble Serving for Deep Learning Models in Intensive Care Units
Shenda Hong, Yanbo Xu, Alind Khare, Satria Priambada, Kevin Maher, Alaa Aljiffry, Jimeng Sun, Alexey Tumanov.
In Proc. of Knowledge Discovery and Data Mining (KDD'20), Aug 2020.
[Video] [arXiv] [PDF] [BibTeX]
@inproceedings{holmes-kdd20,
author = {Hong, Shenda and Xu, Yanbo and Khare, Alind and Priambada, Satria and 
          Maher, Kevin and Aljiffry, Alaa and Sun, Jimeng and Tumanov, Alexey},
title = {HOLMES: Health OnLine Model Ensemble Serving for 
         Deep Learning Models in Intensive Care Units},
year = {2020},
isbn = {9781450379984},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3394486.3403212},
doi = {10.1145/3394486.3403212},
booktitle = {Proceedings of the 26th ACM SIGKDD International Conference 
             on Knowledge Discovery & Data Mining},
pages = {1614--1624},
numpages = {11},
keywords = {health informatics, data mining system, healthcare, software},
series = {KDD '20}
}

[SoCC19] HyperSched: Dynamic Resource Reallocation for Model Development on a Deadline.
Richard Liaw, Romil Bhardwaj, Lisa Dunlap, Yitian Zou, Joseph Gonzalez, Ion Stoica, Alexey Tumanov.
In Proc. of the 10th ACM Symposium on Cloud Computing, SoCC'19, Nov 2019.
[arXiv] [PDF] [BibTeX]
@inproceedings{Liaw:2019:HDR:3357223.3362719,
 author = {Liaw, Richard and Bhardwaj, Romil and Dunlap, Lisa and Zou, Yitian and 
           Gonzalez, Joseph E. and Stoica, Ion and Tumanov, Alexey},
 title = {HyperSched: Dynamic Resource Reallocation for Model Development on a Deadline},
 booktitle = {Proceedings of the ACM Symposium on Cloud Computing},
 series = {SoCC '19},
 year = {2019},
 isbn = {978-1-4503-6973-2},
 location = {Santa Cruz, CA, USA},
 pages = {61--73},
 numpages = {13},
 url = {http://doi.acm.org/10.1145/3357223.3362719},
 doi = {10.1145/3357223.3362719},
 acmid = {3362719},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Distributed Machine Learning, Hyperparameter Optimization, Machine Learning Scheduling},
}

[SoCC19] Cirrus: A Serverless Framework for End-to-end ML Workflows.
Joao Carreira, Pedro Fonseca, Alexey Tumanov, Andrew Zhang, Randy Katz.
In Proc. of the 10th ACM Symposium on Cloud Computing, SoCC'19, Nov 2019.
[Abstract] [PDF] [BibTeX]
@inproceedings{serverlessml-socc19,
    author = {Joao Carreira and Pedro Fonseca and Alexey Tumanov
              and Andrew Zhang and Randy Katz},
    title = {Cirrus: A Serverless Framework for End-to-end ML Workflows},
    booktitle = {Proc. of the 10th ACM Symposium on Cloud Computing},
    series = {SOCC '19},
    year = {2019},
    Publisher = {ACM},
}

[SOSP19] Lineage Stash: Fault Tolerance Off the Critical Path.
Stephanie Wang, John Liagouris, Robert Nishihara, Philipp Moritz, Ujval Misra, Alexey Tumanov, Ion Stoica.
In Proc. of the Symposium on Operating Systems Principles, SOSP'19, Oct 2019.
[Abstract] [PDF] [BibTeX]
@inproceedings{lineage-sosp19,
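    title = {Lineage Stash: Fault Tolerance Off the Critical Path},
    author = {Stephanie Wang and John Liagouris and Robert Nishihara and Philipp Moritz and
              Ujval Misra and Alexey Tumanov and Ion Stoica},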
    booktitle = {Proc. of SOSP'19},
    series = {SOSP '19},
    year = {2019},
    Publisher = {ACM},
}

[CIDR19] Serverless Computing: One Step Forward, Two Steps Back
Joseph M. Hellerstein, Jose Faleiro, Joseph E. Gonzalez, Johann Schleier-Smith, Vikram Sreekanti, Alexey Tumanov, Chenggang Wu.
In Proc. of Conference on Innovative Data Systems Research (CIDR'19), Jan, 2019.
[arXiv] [PDF] [HackerNews] [BibTeX]

[MLSYS18] A Case for Serverless Machine Learning
Joao Carreira, Pedro Fonseca, Alexey Tumanov, Andrew Zhang, Randy Katz.
In Proc. of Workshop on Systems for ML at NIPS 2018 (MLSYS'18), Dec 2018.
[Abstract] [PDF] [BibTeX]

[MLSYS18] Dynamic Space-Time Scheduling for GPU Inference
Paras Jain, Xiangxi Mo, Ajay Jain, Harikaran Subbaraj, Rehan Sohail Durrani, Alexey Tumanov, Joseph Gonzalez, Ion Stoica.
In Proc. of Workshop on Systems for ML at NIPS 2018 (MLSYS'18), Dec 2018.
[arXiv] [PDF] [BibTeX]
@article{gpusched-mlsys18,
  author    = {Paras Jain and Xiangxi Mo and Ajay Jain and 
               Harikaran Subbaraj and Rehan Sohail Durrani and
               Alexey Tumanov and Joseph Gonzalez and Ion Stoica},
  title     = {Dynamic Space-Time Scheduling for GPU Inference},
  journal   = {CoRR},
  volume    = {abs/1901.00041},
  year      = {2018},
  url       = {https://arxiv.org/abs/1901.00041},
  month     = {Dec},
  archivePrefix = {arXiv}
}

[OSDI18] Ray: A Distributed Framework for Emerging AI Applications
Philipp Moritz*, Robert Nishihara*, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I. Jordan, Ion Stoica.
In Proc. of Usenix OSDI'18, Oct, 2018.
[Abstract] [arXiv] [PDF] [BibTeX]
@inproceedings{ray-osdi18,
  title = {{Ray}: A Distributed Framework for Emerging {AI} Applications},
  author = {Philipp Moritz and Robert Nishihara and Stephanie Wang and Alexey Tumanov and 
            Richard Liaw and Eric Liang and Melih Elibol and Zongheng Yang and 
            William Paul and Michael I. Jordan and Ion Stoica},
  booktitle = {13th {USENIX} Symposium on Operating Systems Design and Implementation 
             ({OSDI} 18)},
  year = {2018},
  address = {Carlsbad, CA},
  url = {https://www.usenix.org/conference/osdi18/presentation/nishihara},
  publisher = {{USENIX} Association},
}

[UAI18] IDK Cascades: Fast Deep Learning by Learning not to Overthink
Xin Wang, Yujia Luo, Daniel Crankshaw, Alexey Tumanov, Fisher Yu, Joseph E. Gonzalez.
In Proc. of Uncertainty in Artificial Intelligence (UAI'18), August, 2018.
[arXiv] [PDF]

[ATC18] Tributary: spot-dancing for elastic services with latency SLOs
Aaron Harlap, Andrew Chung, Alexey Tumanov, Gregory R. Ganger, Phillip B. Gibbons
In Proc. of USENIX ATC'18, Boston, MA, July 2018.
[Abstract] [PDF] [BibTeX]
@inproceedings{tributary-atc18,
author = {Aaron Harlap and Andrew Chung and Alexey Tumanov and 
          Gregory R. Ganger and Phillip B. Gibbons},
title = {Tributary: spot-dancing for elastic services with latency SLOs},
booktitle = {2018 {USENIX} Annual Technical Conference ({USENIX} {ATC} 18)},
year = {2018},
isbn = {978-1-931971-44-7},
address = {Boston, MA},
pages = {1--14},
url = {https://www.usenix.org/conference/atc18/presentation/harlap},
publisher = {{USENIX} Association},
}

[EuroSys18] 3Sigma: distribution-based cluster scheduling for runtime uncertainty
Jun Woo Park, Alexey Tumanov, Angela Jiang, Michael A. Kozuch, Gregory R. Ganger.
In Proc. of EuroSys'18, Porto, Portugal, April 2018.
[Abstract] [PDF] [BibTeX]
@inproceedings{3sigma-eurosys18,
 author = {Park, Jun Woo and Tumanov, Alexey and Jiang, Angela and 
           Kozuch, Michael A. and Ganger, Gregory R.},
 title = {3Sigma: Distribution-based Cluster Scheduling for Runtime Uncertainty},
 booktitle = {Proceedings of the Thirteenth EuroSys Conference},
 series = {EuroSys '18},
 year = {2018},
 isbn = {978-1-4503-5584-1},
 location = {Porto, Portugal},
 pages = {2:1--2:17},
 articleno = {2},
 numpages = {17},
 url = {http://doi.acm.org/10.1145/3190508.3190515},
 doi = {10.1145/3190508.3190515},
 acmid = {3190515},
 publisher = {ACM},
 address = {New York, NY, USA},
}

[HotOS17] Real-Time Machine Learning: the Missing Pieces
Robert Nishihara, Philipp Moritz, Stephanie Wang, Alexey Tumanov, William Paul, Johann Schleier-Smith, Richard Liaw, Michael I. Jordan, Ion Stoica
In Proc. of HotOS XVI, May 2017.
[Abstract] [PDF] [BibTeX]
@inproceedings{ray-hotos17,
 author = {Nishihara, Robert and Moritz, Philipp and Wang, Stephanie and Tumanov, Alexey 
           and Paul, William and Schleier-Smith, Johann and Liaw, Richard 
           and Niknami, Mehrdad and Jordan, Michael I. and Stoica, Ion},
 title = {Real-Time Machine Learning: The Missing Pieces},
 booktitle = {Proceedings of the 16th Workshop on Hot Topics in Operating Systems},
 series = {HotOS '17},
 year = {2017},
 isbn = {978-1-4503-5068-6},
 location = {Whistler, BC, Canada},
 pages = {106--110},
 numpages = {5},
 url = {http://doi.acm.org/10.1145/3102980.3102998},
 doi = {10.1145/3102980.3102998},
 acmid = {3102998},
 publisher = {ACM},
 address = {New York, NY, USA},
}

[EuroSys17] Proteus: agile ML elasticity through tiered reliability in dynamic resource markets
Aaron Harlap, Alexey Tumanov, Andrew Chung, Gregory R. Ganger, Phil Gibbons
In Proc. of EuroSys'17, Apr 2017.
[Abstract] [PDF] [BibTeX]
@inproceedings{proteus-eurosys17,
 author = {Harlap, Aaron and Tumanov, Alexey and Chung, Andrew and 
           Ganger, Gregory R. and Gibbons, Phillip B.},
 title = {Proteus: Agile ML Elasticity Through Tiered Reliability 
          in Dynamic Resource Markets},
 booktitle = {Proceedings of the Twelfth European Conference on Computer Systems},
 series = {EuroSys '17},
 year = {2017},
 isbn = {978-1-4503-4938-3},
 location = {Belgrade, Serbia},
 pages = {589--604},
 numpages = {16},
 url = {http://doi.acm.org/10.1145/3064176.3064182},
 doi = {10.1145/3064176.3064182},
 acmid = {3064182},
 publisher = {ACM},
 address = {New York, NY, USA},
}

[OSDI16] Morpheus: Towards Automated SLOs for Enterprise Clusters
S. Jyothi, C. Curino, I. Menache, S. Narayanamurthy, A. Tumanov, J. Yaniv, R. Mavlyutov, I. Goiri, S. Krishnan, J. Kulkarni, S. Rao
In Proc. of USENIX OSDI'16, Nov 2016.
[Abstract] [PDF] [Slides] [BibTeX]
@inproceedings{morpheus-osdi16,
  author = {Sangeetha Abdu Jyothi and Carlo Curino 
and Ishai Menache and Shravan Matthur Narayanamurthy 
and Alexey Tumanov and Jonathan Yaniv and Ruslan Mavlyutov 
and Inigo Goiri and Subru Krishnan and Janardhan Kulkarni and Sriram Rao},
  title = {Morpheus: Towards Automated SLOs for Enterprise Clusters},
  booktitle = {Proc. of the 12th USENIX OSDI (OSDI'16)},
  year = {2016},
  address = {GA},
  url = {https://www.usenix.org/conference/osdi16/technical-sessions/presentation/jyothi},
  publisher = {USENIX Association},
}

[EuroSys16] TetriSched: global rescheduling with adaptive plan-ahead in dynamic heterogeneous clusters. [best student paper]
Alexey Tumanov, Timothy Zhu, Jun Woo Park, Michael A. Kozuch, Mor Harchol-Balter, Gregory R. Ganger.
In Proc. of EuroSys'16, London, UK, April 2016.
[Abstract] [PDF] [Slides] [BibTeX]
@inproceedings{tetrisched,
 author = {Alexey Tumanov and Timothy Zhu and Jun Woo Park and Michael A. Kozuch
and Mor Harchol-Balter and Gregory R. Ganger},
 title = {{T}etri{S}ched: global rescheduling with adaptive plan-ahead in dynamic
heterogeneous clusters},
 booktitle = {Proc. of the 11th European Conference on Computer Systems},
 series = {EuroSys '16},
 year = {2016},
 month = {Apr},
 location = {London, UK},
 Publisher = {ACM},
}

[SoCC14] PriorityMeister: Tail Latency QoS for Shared Networked Storage.
Timothy Zhu, Alexey Tumanov, Michael A. Kozuch, Mor Harchol-Balter, Gregory R. Ganger.
In Proc. of the 5th ACM Symposium on Cloud Computing, SoCC'14, Nov 2014.
[Abstract] [PDF] [BibTeX]
@inproceedings{pm-socc14,
    author = {Timothy Zhu and Alexey Tumanov and Michael A. Kozuch and
              Mor Harchol-Balter and Gregory R. Ganger},
    title = {{P}riority{M}eister: Tail Latency QoS for Shared Networked Storage},
    booktitle = {Proc. of the 5th ACM Symposium on Cloud Computing},
    series = {SOCC '14},
    year = {2014},
    location = {Seattle, WA},
    Publisher = {ACM},
}

[SoCC14] Exploiting iterative-ness for parallel ML computations.
Henggang Cui, Alexey Tumanov, Jinliang Wei, Lianghong Xu, Wei Dai, Jesse Haber-Kucharsky, Qirong Ho, Gregory R. Ganger, Phil B. Gibbons, Garth A. Gibson, Eric P. Xing.
In Proc. of the 5th ACM Symposium on Cloud Computing, SoCC'14, Nov 2014.
[Abstract] [PDF] [BibTeX]

[TOS14] Agility and performance in elastic distributed storage.
Lianghong Xu, James Cipar, Elie Krevat, Alexey Tumanov, Nitin Gupta, Michael A. Kozuch, and Gregory R. Ganger.
Trans. Storage, 10(4):16:1–16:27, October 2014.
[Abstract] [PDF] [BibTeX]
@article{Xu:2014,
 author = {Xu, Lianghong and Cipar, James and Krevat, Elie and Tumanov, Alexey 
           and Gupta, Nitin and Kozuch, Michael A. and Ganger, Gregory R.},
 title = {Agility and Performance in Elastic Distributed Storage},
 journal = {Trans. Storage},
 issue_date = {October 2014},
 volume = {10},
 number = {4},
 month = oct,
 year = {2014},
 issn = {1553-3077},
 pages = {16:1--16:27},
 articleno = {16},
 numpages = {27},
 url = {http://doi.acm.org/10.1145/2668129},
 doi = {10.1145/2668129},
 acmid = {2668129},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Cloud storage, agility, distributed file systems, elastic storage, power, write offloading},
}

[FAST14] SpringFS: Bridging Agility and Performance in Elastic Distributed Storage
Lianghong Xu, James Cipar, Elie Krevat, Alexey Tumanov, Nitin Gupta, Michael A. Kozuch, Gregory R. Ganger.
In Proc. of Usenix FAST'14, Feb 2014.
[Abstract] [PDF] [BibTeX] [acceptance: 18%]
@inproceedings{springfs,
    author = {Lianghong Xu and James Cipar and Elie Krevat and Alexey Tumanov
              and Nitin Gupta and Michael A. Kozuch and Gregory R. Ganger},
    title = {SpringFS: Bridging Agility and Performance in Elastic Distributed Storage},
    booktitle = {Proc. of the 12th USENIX FAST},
    year = {2014},
    isbn = {978-1-931971-08-9},
    location = {Santa Clara, CA},
    pages = {243--255},
    publisher = {USENIX},
    address = {Berkeley, CA}
}

[TR13] TetriSched: Space-Time Scheduling for Heterogeneous Datacenters.
Alexey Tumanov, Timothy Zhu, Michael A. Kozuch, Mor Harchol-Balter, Gregory R. Ganger.
Carnegie Mellon University PDL Technical Report CMU-PDL-13-112, Dec 2013.
[Abstract] [PDF] [BibTeX]
@techreport{tetrischedTR,
    Author = { Alexey Tumanov and Timothy Zhu and Michael A. Kozuch and 
               Mor Harchol-Balter and Gregory R. Ganger },
    Title = {{T}etri{S}ched: Space-Time Scheduling for Heterogeneous Datacenters},
    Institution = {Carnegie Mellon University},
    Year = {2013},
    Month = {Dec},
    URL = {http://www.pdl.cmu.edu/PDL-FTP/CloudComputing/CMU-PDL-13-112_abs.shtml},
    Number = {CMU-PDL-13-112},
}

[SFMA13] Asymmetry-aware execution placement on manycore chips.
Alexey Tumanov, Joshua Wise, Onur Mutlu, Gregory R. Ganger.
In Proc. of the 3rd Workshop on Systems for Future Multicore Architectures (SFMA'13), EuroSys'13, April 2013.
[Abstract] [PDF] [BibTeX]
@inproceedings{atumanov-sfma13,
    author = {Alexey Tumanov and Joshua Wise and Onur Mutlu and Gregory R. Ganger},
    title = {Asymmetry-aware execution placement on manycore chips},
    booktitle = {Proc. of the 3rd Workshop on Systems for 
                 Future Multicore Architectures (SFMA'13)},
    series = {SFMA '13},
    year = {2013},
    location = {Prague, Czech Republic},
}

[SoCC12] alsched: Algebraic Scheduling of Mixed Workloads in Heterogeneous Clouds
Alexey Tumanov, James Cipar, Michael A. Kozuch, Gregory R. Ganger.
In Proc. of the 3rd ACM Symposium on Cloud Computing, SoCC'12, Oct 2012.
[Abstract] [PDF] [BibTeX] [acceptance: 15%]
@inproceedings{alsched-socc12,
    author = {Alexey Tumanov and James Cipar and Michael A. Kozuch and 
              Gregory R. Ganger},
    title = {{a}lsched: algebraic scheduling of mixed workloads in heterogeneous clouds},
    booktitle = {Proc. of the 3rd ACM Symposium on Cloud Computing},
    series = {SOCC '12},
    year = {2012},
    location = {San Jose, CA},
    Publisher = {ACM},
}

[SoCC12] Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis [Test of Time, SoCC'21]
Charles Reiss, Alexey Tumanov, Gregory R. Ganger, Randy H. Katz, Michael A. Kozuch.
In Proc. of the 3rd ACM Symposium on Cloud Computing, SoCC'12, Oct 2012.
[Abstract] [PDF] [BibTeX] [acceptance: 15%]
@inproceedings{gtrace-socc12,
 author     = {Charles Reiss and Alexey Tumanov and Gregory R. Ganger and
            Randy H. Katz and Michael A. Kozuch},
 title      = {Heterogeneity and Dynamicity of Clouds at Scale: {G}oogle Trace
          Analysis},
 booktitle  = {Proc. of the 3rd ACM Symposium on Cloud Computing},
 series     = {SOCC '12},
 year       = {2012},
 location   = {San Jose, CA},
}

[EuroSys11] Kaleidoscope: Cloud Micro-Elasticity via VM State Coloring
Roy Bryant, Alexey Tumanov, Olga Irzak, Adin Scannell, Kaustubh Joshi, Matti Hiltunen, H. Andrés Lagar-Cavilla, Eyal de Lara.
In Proc. of EuroSys'11, Salzburg, Austria, April 2011.
[Abstract] [PDF] [BibTeX] [acceptance: 15%]
@inproceedings{BryantEurosys11,
  author =       "Roy Bryant and Alexey Tumanov and Olga Irzak and 
                  Adin Scannell and Kaustubh Joshi and Matti Hiltunen
                  and H. Andr{\'e}s Lagar-Cavilla and Eyal de Lara",
  title =        "{Kaleidoscope: Cloud Micro-Elasticity via VM State Coloring}",
  booktitle =    "{Proc. of EuroSys 2011}",
  address =      "{Salzburg, Austria}",
  month =        apr,
  year =         2011
}

[IEEEVR07] Variability-Aware Latency Amelioration in Distributed Environments
Alexey Tumanov, Robert Allison, Wolfgang Stuerzlinger.
In Proc. of IEEE Virtual Reality Conference, 2007, pp. 123-130, March 2007.
[Abstract] [PDF] [BibTeX] [acceptance: 20%]
@INPROCEEDINGS{Tumanov-ieeevr07,
  author={Alexey Tumanov and Robert Allison and Wolfgang Stuerzlinger},
  booktitle={Proc. of IEEE Virtual Reality Conference, 2007},
  series = {IEEE VR'2007},
  title={Variability-Aware Latency Amelioration in Distributed Environments},
  year={2007},
  month={March},
  volume={},
  number={},
  pages={123--130},
  location={Charlotte, NC},
  doi={10.1109/VR.2007.352472},
}

Variability-Aware Latency Amelioration in Distributed Interactive Virtual Environments
Alexey Tumanov
M.Sc. Thesis, York University, Toronto, Canada, April 2006.
[PDF] [BibTeX]
@mastersthesis{Tumanov-mscthesis,
  author = {Alexey Tumanov},
  title = {Variability-aware latency amelioration in distributed interactive
           virtual environments},
  school = {York University},
  address = {Toronto, Canada},
  year = {2006},
  month = {April},
}

Preprints

The OoO VLIW JIT Compiler for GPU Inference
Paras Jain, Xiangxi Mo, Ajay Jain, Alexey Tumanov, Joseph E. Gonzalez, Ion Stoica
arXiv CoRR preprint, abs/1901.10008, 2019.
[Abstract] [PDF] [BibTeX]
@article{gpu-vliw-scheduler,
  author    = {Paras Jain and
               Xiangxi Mo and
               Ajay Jain and
               Alexey Tumanov and
               Joseph E. Gonzalez and
               Ion Stoica},
  title     = {The OoO {VLIW} {JIT} Compiler for {GPU} Inference},
  journal   = {CoRR},
  volume    = {abs/1901.10008},
  year      = {2019},
  url       = {http://arxiv.org/abs/1901.10008},
  archivePrefix = {arXiv},
  eprint    = {1901.10008},
  timestamp = {Sat, 02 Feb 2019 16:56:00 +0100},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1901-10008},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

JamaisVu: Robust Scheduling with Auto-Estimated Job Runtimes
Alexey Tumanov, Angela Jiang, Jun Woo Park, Michael A. Kozuch, Gregory R. Ganger.
Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-16-104, September 2016.
[Abstract] [PDF] [BibTeX]
@techreport{jamaisvu-tr201609,
    author = {Alexey Tumanov and Angela Jiang and Jun Woo Park and 
              Michael A. Kozuch and Gregory R. Ganger},
    title = {{JamaisVu: Robust Scheduling with Auto-Estimated Job Runtimes}},
    institution = {{Carnegie Mellon University}},
    year = {2016},
    month = {September},
    number = {CMU-PDL-16-104}
}

Patents

Tagging a copy of memory of a virtual machine with information for fetching of relevant portions of the memory
Horacio Andres Lagar-Cavilla, Roy Bryant, Matti Hiltunen, Olga Irzak, Kaustubh Joshi, Adin Matthew Scannell, Alexey Tumanov, Eyal de Lara.
US Patent 9250969. Granted Feb 2, 2016.

Teaching

AY23-24

AY22-23

AY21-22

AY20-21

AY19-20

Academic Service

Organizing Committee:

Program Committee: SOSP24, OSDI24, MLSys24, ICLR24, NeurIPS23, OSDI21, MLSys21, ACM SoCC 2020, USENIX ATC 2020, ACM SoCC 2017.

Reviewer: ACM SoCC 2013, ACM SIGMETRICS 2014, IEEE/ACM MICRO 2014, SoCC 2017, SOSP 2019.