Rajeev Thakur's Publications
- Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Jinyang Liu,
Zizhe Jian, Xin Liang, Kai Zhao, Xiaoyi Lu, Zizhong Chen, Franck
Cappello, Yanfei Guo, and Rajeev Thakur, "hZCC: Accelerating
Collective Communication with Co-designed Operation-supported
Compression," in Proc. of SC24: Int'l
Conference for High Performance Computing, Networking, Storage and
Analysis, November 2024.
- Hui Zhou, Robert Latham, Ken Raffenetti, Yanfei Guo, and
Rajeev Thakur, "MPI Progress for All," in Proc. of the
Workshop on Extreme Scale MPI (ExaMPI) at SC24, November 2024.
- Logan Ward, Gregory Pauloski, Valerie Hayot-Sasson, Yadu Babuji,
Alexander Brace, Ryan Chard, Kyle Chard, Rajeev Thakur, and Ian
Foster, "Employing Artificial Intelligence to Steer Exascale
Workflows with Colmena," Int'l Journal of High
Performance Computing Applications, November 2024.
- Michael Heroux, Lois Curfman McInnes, James Ahrens, Todd Gamblin, Timothy
Germann, Xiaoye Sherry Li, Kathryn Mohror, Todd Munson, Sameer Shende, Rajeev
Thakur, Jeffrey Vetter, and James Willenbring, "ECP Libraries and
Tools: An Overview," Int'l Journal of High
Performance Computing Applications, Vol. 38, Issue 5, September 2024.
(pdf)
- Hui Zhou, Ken Raffenetti, Yanfei Guo, Thomas Gillis, Robert
Latham, Rajeev Thakur, "Designing and prototyping extensions to the
Message Passing Interface in MPICH," Int'l Journal of
High Performance Computing Applications, Vol. 38, Issue 5, August 2024.
(pdf)
- Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Jinyang Liu, Yafan
Huang, Ken Raffenetti, Hui Zhou, Kai Zhao, Xiaoyi Lu, Zizhong Chen, Franck
Cappello, Yanfei Guo, and Rajeev Thakur, "gZCCL:
Compression-Accelerated Collective Communication Framework for GPU
Clusters," in Proc. of the 38th ACM Int'l
Conference on Supercomputing (ICS '24), June 2024.
(pdf)
- Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Zhaorui Zhang,
Jinyang Liu, Xiaoyi Lu, Ken Raffenetti, Hui Zhou, Kai Zhao, Zizhong
Chen, Franck Cappello, Yanfei Guo, and Rajeev Thakur, "An Optimized
Error-controlled MPI Collective Framework Integrated with Lossy
Compression," in Proc. of 38th IEEE Int'l
Parallel & Distributed Processing Symposium (IPDPS 2024), May
2024.
(pdf)
- Murali Emani, Sam Foreman, Varuni Sastry, Zhen Xie, Siddhisanket
Raskar, William Arnold, Rajeev Thakur, Venkatram Vishwanath, Michael
Papka, Sanjif Shanmugavelu, Darshan Gandhi, Hengyu Zhao, Dun Ma,
Kiran Ranganath, Rick Weisner, Jiunn-yeu Chen, Yuting Yang, Natalia
Vassilieva, Bin Zhang, Sylvia Howland, Alexander Tsyplikhin,
"Toward a Holistic Performance Evaluation of Large Language Models
Across Diverse AI Accelerators," in Proc. of the 33rd
Heterogeneity in Computing Workshop (HCW) at IPDPS 2024, May 2024.
(pdf)
- Michael Wilkins, Hanming Wang, Peizhi Liu, Bangyen Pham, Yanfei
Guo, Rajeev Thakur, Nikos Hardavellas,
Peter Dinda, "Generalized Collective Algorithms for the Exascale Era," in Proc. of the IEEE Int'l
Conference on Cluster Computing (Cluster 2023), October 2023.
(pdf)
- Jiajun Huang, Kaiming Ouyang, Yujia Zhai, Jinyang Liu, Min Si,
Ken Raffenetti, Hui Zhou, Atsushi Hori, Zizhong Chen, Yanfei Guo,
Rajeev Thakur, "PiP-MColl: Process-in-Process-based Multi-object
MPI Collectives," in Proc. of the IEEE Int'l
Conference on Cluster Computing (Cluster 2023), October 2023.
(pdf)
- Hui Zhou, Ken Raffenetti, Junchao Zhang, Yanfei Guo, Rajeev
Thakur, "Frustrated With MPI+Threads? Try MPIxThreads!," in Proc. of EuroMPI 2023, September 2023. (pdf)
- Thomas Gillis, Ken Raffenetti, Hui Zhou, Yanfei Guo, Rajeev
Thakur, "Quantifying the Performance Benefits of Partitioned
Communication in MPI," in Proc. of the 52nd Int'l Conference on Parallel Processing (ICPP), August 2023. (pdf)
- Logan Ward, J. Gregory Pauloski, Valerie Hayot-Sasson, Ryan
Chard, Yadu Babuji, Ganesh Sivaraman, Sutanay Choudhury, Kyle Chard,
Rajeev Thakur, Ian Foster, "Cloud Services Enable Efficient
AI-Guided Simulation Workflows across Heterogeneous Resources," in
Proc. of the Heterogeneity in Computing (HCW) Workshop at IPDPS
2023, May 2023. (pdf)
- M. Emani et al., "A Comprehensive Evaluation of Novel AI Accelerators for Deep Learning Workloads," in Proc. of 13th IEEE Int'l Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS22) (held in
conjunction with SC22), November 2022. (pdf)
- Michael Wilkins, Yanfei Guo, Rajeev Thakur, Nikos Hardavellas, Peter Dinda, "ACCLAiM: Advancing the Practicality of MPI Collective Communication Autotuning Using Machine Learning," in Proc. of the IEEE Int'l Conference on Cluster Computing (Cluster 2022), September 2022. (pdf)
- Hui Zhou, Ken Raffenetti, Yanfei Guo, Rajeev Thakur, "MPIX Stream: An Explicit Solution to Hybrid MPI+X Programming," in Proc. of EuroMPI/USA 2022, September 2022. (pdf)
- Michael Wilkins, Yanfei Guo, Rajeev Thakur, Nikos
Hardavellas, Peter Dinda, Min Si, "A FACT-based Approach: Making
Machine Learning Collective Autotuning Feasible on Exascale Systems,"
in Proc. of ExaMPI21: Workshop on Exascale MPI (held in
conjunction with SC21), November 2021. (pdf)
- Logan Ward, Ganesh Sivaraman, Gregory Pauloski, Yadu Babuji,
Ryan Chard, Naveen Dandu, Paul Redfern, Rajeev Assary, Kyle Chard, Larry Curtiss,
Rajeev Thakur, Ian Foster, "Colmena: Scalable Machine-Learning-Based
Steering of Ensemble Simulations for High Performance Computing," in
Proc. of 7th Workshop on Machine Learning in High Performance
Computing Environments (MLHPC) (held in conjunction with SC21),
November 2021. (pdf)
- Anshu Dubey, Lois Curfman McInnes, Rajeev Thakur, Erik Draeger,
Thomas Evans, Timothy Germann, William Hart, "Performance Portability
in the Exascale Computing Project: Exploration Through a Panel
Series," Computing in Science & Engineering, 23(5):46–54,
September/October 2021. (pdf)
- F. Alexander et al., "Co-design Center for Exascale Machine Learning Technologies
(ExaLearn)," Int'l Journal of High Performance Computing
Applications, 35(6):598-616, November 2021. (pdf)
- William Gropp, Rajeev Thakur, and Pavan Balaji, "Translational research in the MPICH project," Journal of Computational Science, Vol. 52, May 2021.
(pdf)
- Michael Heroux, Lois Curfman McInnes, Rajeev Thakur, Jeffrey
Vetter, Sherry Li, James Ahrens, Todd Munson, Kathryn Mohror, "ECP
Software Technology Capability Assessment Report," Version 2.5,
November 2020. (pdf)
- Anthony Kougkas, Hassan Eslami, Xian-He Sun, Rajeev Thakur, and
William Gropp, "Rethinking Key–Value Store for Parallel I/O
Optimization," Int'l Journal of High Performance Computing
Applications, 31(4):335-356, July 2017.
(pdf)
- James Dinan, Pavan Balaji, Darius Buntinas, David Goodell, William
Gropp, Rajeev Thakur, "An Implementation and Evaluation of the MPI
3.0 One-Sided Communication Interface," Concurrency
and Computation: Practice and Experience, 28(17):4385-4404,
December 2016. (pdf)
- Yong Chen, Chao Chen, Yanlong Yin, Xian-He Sun, Rajeev Thakur,
William Gropp, "Rethinking High Performance Computing System
Architecture for Scientific Big Data Applications," in
Proc. of the 14th IEEE Int'l Symposium on Parallel
and Distributed Processing with Applications (ISPA-16), August 2016. (pdf)
- Ashwin M. Aji, Lokendra S. Panwar, Feng Ji, Karthik Murthy,
Milind Chabbi, Pavan Balaji, Keith R. Bisset, James Dinan, Wu-chun
Feng, John Mellor-Crummey, Xiaosong Ma, and Rajeev Thakur, "MPI-ACC:
Accelerator-Aware MPI for Scientific Applications," IEEE
Transactions on Parallel and Distributed Systems, 27(5):1401-1414,
May 2016. (pdf)
- William Gropp and Rajeev Thakur, "Message Passing Interface,"
Programming Models for Parallel Computing, MIT Press, Ch. 1,
pp. 1-21, 2015. (link)
- Hassan Eslami, Anthony Kougkas, Maria Kotsifakou, Theodoros
Kasampalis, Kun Feng, Yin Lu, William Gropp, Xian-He Sun, Yong Chen,
Rajeev Thakur, "Efficient Disk-to-Disk Sorting: A Case Study in the Decoupled
Execution Paradigm," in Proc. of the 2015 Int'l Workshop on
Data-Intensive Scalable Computing Systems (DISCS) (held in
conjunction with SC15), November 2015. (pdf)
- Swann Perarnau, Rajeev Thakur, Kamil Iskra, Ken Raffenetti,
Franck Cappello, Rinku Gupta, Pete Beckman, Marc Snir, Henry
Hoffmann, Martin Schulz, and Barry Rountree, "Distributed
Monitoring and Management of Exascale Systems in the Argo Project,"
in Proc. of the 15th IFIP Int'l Conference on
Distributed Applications and Interoperable Systems (DAIS), June
2015. (pdf)
- Torsten Hoefler, James Dinan, Rajeev Thakur, Brian Barrett, Pavan
Balaji, and William Gropp, "Remote Memory Access Programming in
MPI-3," ACM Transactions on Parallel Computing, Vol. 2, No. 2,
pp. 9:1-9:26, July 2015. (pdf)
- Yong Chen, Yin Lu, Prathamesh Amritkar, Rajeev Thakur, and Yu
Zhuang, "Performance Model Directed Data Sieving for High
Performance I/O," Journal of Supercomputing, 71(6):2066-2090, June 2015. (pdf)
- Seong Jo Kim, Yuanrui Zhang, Seung Woo Son, Mahmut Kandemir,
Wei-keng Liao, Rajeev Thakur, Alok Choudhary, "IOPro: A Parallel
I/O Profiling and Visualization Framework for High-Performance
Storage Systems," Journal of Supercomputing, 71(3):840-870,
March 2015. (pdf)
- Yin Lu, Yong Chen, Yu Zhuang, Jialin Liu, Rajeev
Thakur, "Collective Input/Output under Memory Constraints,"
Int'l Journal of High Performance Computing
Applications, 29(1):21-36, February 2015.
(pdf)
- James Dinan, Ryan E. Grant, Pavan Balaji, David Goodell, Douglas
Miller, Marc Snir, and Rajeev Thakur, "Enabling Communication
Concurrency Through Flexible MPI Endpoints," Int'l Journal of
High Performance Computing Applications, 28(4):390-405, November 2014.
(pdf)
- Yanlong Yin, Antonios Kougkas, Kun Feng, Hassan Eslami, Yin Lu,
Xian-He Sun, Rajeev Thakur, and William Gropp, "Rethinking Key-Value
Store for Parallel I/O Optimization," in Proc. of the
3rd Int'l Workshop on Data-Intensive Scalable Computing
Systems (DISCS) (held in conjunction with SC14), November 2014. (pdf)
- Wei-keng Liao and Rajeev Thakur, "MPI-IO,"
High Performance Parallel I/O, CRC Press,
Ch. 13, pp. 157-169, 2014.
- John Jenkins, James Dinan, Pavan Balaji, Tom
Peterka, Nagiza F. Samatova, and Rajeev Thakur, "Processing MPI Derived
Datatypes on Noncontiguous GPU-Resident Data," IEEE
Transactions on Parallel and Distributed Systems, 25(10):2627-2637,
October 2014. (pdf)
- Anshu Dubey, Steve R. Brandt, Richard Brower, Merle Giles, Paul
Hovland, Donald Q. Lamb, Frank Loffler, Boyana Norris, Brian
W. O'Shea, Claudio Rebbi, Marc Snir, Rajeev Thakur, and Petros
Tzeferacos, "Software Abstractions and Methodologies for HPC
Simulation Codes on Future Architectures," Journal of Open
Research Software, 2(1):e14, pp. 1-5, 2014. (pdf)
- Chao Chen, Yong Chen, Kun Feng, Yanlong Yin, Hassan Eslami,
Rajeev Thakur, Xian-He Sun, and William D. Gropp, "Decoupled I/O
for Data-Intensive High Performance Computing," in Proceedings
of the 7th Int'l Workshop on Parallel Programming Models
and Systems Software for High-End Computing (P2S2), September
2014. (pdf)
- Torsten Hoefler, James Dinan, Darius Buntinas, Pavan Balaji,
Brian Barrett, Ron Brightwell, William Gropp, Vivek Kale, and
Rajeev Thakur, "MPI+MPI: A New, Hybrid Approach to Parallel
Programming with MPI Plus Shared Memory Computing,"
Computing, 95(12):1121–1136, December 2013. (pdf)
- Xin Zhao, Pavan Balaji, William Gropp, and Rajeev Thakur,
"Optimization Strategies for MPI-Interoperable Active Messages,"
in Proc. of the 13th IEEE Int'l Conference on Scalable Computing and
Communication (ScalCom 2013), December 2013. (pdf) (Best Paper Award)
- Xin Zhao, Pavan Balaji, William Gropp, and Rajeev Thakur,
"MPI-Interoperable Generalized Active Messages," in
Proc. of 19th IEEE Int'l Conference on Parallel and
Distributed Systems (ICPADS'13), December 2013. (pdf)
- James Dinan, Pavan Balaji, David Goodell, Douglas Miller, Marc
Snir, and Rajeev Thakur, "Enabling MPI Interoperability Through
Flexible Communication Endpoints," in Proc. of the 20th
European MPI Users' Group Meeting (EuroMPI 2013), September 2013. (pdf)
- Antonio J. Peña, Ralf G. Correa Carvalho, James Dinan, Pavan
Balaji, Rajeev Thakur, and William Gropp, "Analysis of
Topology-Dependent MPI Performance on Gemini Networks," in
Proc. of the 20th European MPI Users' Group Meeting (EuroMPI
2013), September 2013. (pdf)
- Palden Lama, Yan Li, Ashwin M. Aji, Pavan Balaji, James Dinan,
Shucai Xiao, Yunquan Zhang, Wu-chun Feng, Rajeev Thakur, and Xiaobo
Zhou, "pVOCL: Power-Aware Dynamic Placement and Migration in
Virtualized GPU Environments," in Proc. of the 33rd
Int'l Conference on Distributed Computing Systems
(ICDCS), July 2013. (pdf)
- Ashwin M. Aji, Lokendra S. Panwar, Feng Ji, Milind Chabbi,
Karthik Murthy, Pavan Balaji, Keith R. Bisset, James Dinan, Wu-chun Feng,
John Mellor-Crummey, Xiaosong Ma, and Rajeev Thakur,
"On the Efficacy of GPU-Integrated MPI for Scientific
Applications," in Proc. of the 22nd ACM Symposium on
High-Performance Parallel and Distributed Computing (HPDC'13), June 2013.
(pdf)
- Yin Lu, Yong Chen, Rajeev Thakur, and Yu Zhuang,
"Memory-Conscious Collective I/O for Extreme Scale HPC Systems,"
in Proc. of the Int'l Workshop on Runtime and Operating Systems for
Supercomputers (ROSS 2013) at ICS 2013, June 2013.
(pdf)
- Ashwin Aji, Pavan Balaji, James Dinan, Wu-chun Feng, and Rajeev
Thakur, "Synchronization and Ordering Semantics in Hybrid MPI+GPU
Programming," in Proc. of the 3rd Int'l Workshop on
Accelerators and Hybrid Exascale Systems (AsHES) at IPDPS 2013, May 2013.
(pdf)
- Yanlong Yin, Jibing Li, Jun He, Xian-He Sun, and Rajeev Thakur,
"Pattern-Direct and Layout-Aware Replication Scheme for Parallel
I/O Systems," in Proc. of the 27th IEEE Int'l
Parallel and Distributed Processing Symposium (IPDPS 2013), May 2013.
(pdf)
- Xin Zhao, Darius Buntinas, Judicael Zounmevo, James Dinan, David
Goodell, Pavan Balaji, Rajeev Thakur, Ahmad Afsahi, and William
Gropp, "Towards Asynchronous, MPI-Interoperable Active Messages,"
in Proc. of the 13th IEEE Int'l Symposium on
Cluster Computing and the Grid (CCGrid 2013), May 2013.
(pdf)
- Seong Jo Kim, Seung Woo Son, Wei-keng Liao, Mahmut Kandemir,
Rajeev Thakur, and Alok Choudhary, "IOPin: Runtime Profiling
of Parallel I/O in HPC Systems," in Proc. of the
7th Parallel Data Storage Workshop (PDSW) (held in conjunction with
SC12), November 2012. (pdf)
- Yong Chen, Chao Chen, Xian-He Sun, William D. Gropp, Rajeev
Thakur, "A Decoupled Execution Paradigm for Data-Intensive High-End
Computing," in Proc. of the IEEE Int'l
Conference on Cluster Computing (Cluster 2012), September 2012.
(pdf)
- Jun He, Xian-He Sun, Rajeev Thakur, "KNOWAC: I/O Prefetch via
Accumulated Knowledge," in Proc. of the IEEE
Int'l Conference on Cluster Computing (Cluster 2012),
September 2012. (pdf)
- John Jenkins, James Dinan, Pavan Balaji, Nagiza F. Samatova,
Rajeev Thakur, "Enabling Fast, Noncontiguous GPU Data Movement in
Hybrid MPI+GPU Environments," in Proc. of the IEEE Int'l
Conference on Cluster Computing (Cluster 2012), September 2012.
(pdf)
- James Dinan, David Goodell, William Gropp, Rajeev Thakur and
Pavan Balaji, "Efficient Multithreaded Context ID Allocation in
MPI," in Proc. of the 19th European MPI Users' Group Meeting
(EuroMPI 2012), September 2012. (pdf)
- Torsten Hoefler, James Dinan, Darius Buntinas, Pavan Balaji,
Brian Barrett, Ron Brightwell, William Gropp, Vivek Kale and Rajeev
Thakur, "Leveraging MPI's One-Sided Communication Interface for
Shared-Memory Programming," in Proc. of the 19th
European MPI Users' Group Meeting (EuroMPI 2012), September
2012. (pdf)
- Hui Jin, Jiayu Ji, Xian-He Sun, Yong Chen, Rajeev Thakur, "CHAIO:
Enabling HPC Applications on Data-Intensive File Systems," in
Proc. of the 2012 Int'l Conference on Parallel Processing, September 2012.
(pdf)
- Ashwin Aji, James Dinan, Darius Buntinas, Pavan Balaji, Wu-chun
Feng, Keith Bisset, Rajeev Thakur, "MPI-ACC: An Integrated and
Extensible Approach to Data Movement in Accelerator-Based Systems," in
Proc. of the 14th IEEE Int'l Conference on High Performance
Computing and Communications (HPCC-2012), June 2012.
(pdf)
- Feng Ji, Ashwin M. Aji, James Dinan, Darius Buntinas, Pavan
Balaji, Rajeev Thakur, Wu-Chun Feng, Xiaosong Ma, "DMA-Assisted,
Intranode Communication in GPU Accelerated Systems," in Proc. of
the 14th IEEE Int'l Conference on High Performance Computing and
Communications (HPCC-2012), June 2012. (pdf)
- Yin Lu, Yong Chen, Prathamesh Amritkar, Rajeev Thakur, and Yu
Zhuang, "A New Data Sieving Approach for High Performance I/O," in
Proc. of the 7th Int'l Conference on Future Information
Technology (FutureTech'12), June 2012. (pdf) (Best Paper Award)
- Huaiming Song, Hui Jin, Jun He, Xian-He Sun, and Rajeev Thakur,
"A Server-Level Adaptive Data Layout Strategy for Parallel File
Systems," in Proc. of the 2012 Int'l Workshop on High
Performance Data Intensive Computing (HPDIC 2012) (held in
conjunction with IPDPS 2012), May 2012. (pdf)
- Shucai Xiao, Pavan Balaji, Qian Zhu, Rajeev Thakur, Susan
Coghlan, Heshan Lin, Gaojin Wen, Jue Hong and Wu-chun Feng, "VOCL:
An Optimized Environment for Transparent Virtualization of Graphics
Processing Units," in Proc. of 2012 Innovative Parallel
Computing: Foundations & Applications of GPU, Manycore, and
Heterogeneous Systems (InPar 2012), May 2012. (pdf)
- Shucai Xiao, Pavan Balaji, James Dinan, Qian Zhu, Rajeev Thakur,
Susan Coghlan, Heshan Lin, Gaojin Wen, Jue Hong, and Wu-Chun Feng,
"Transparent Accelerator Migration in a Virtualized GPU Environment,"
in Proc. of the 12th IEEE Int'l Symposium on Cluster Computing and
the Grid (CCGrid 2012), May 2012. (pdf)
- Yanlong Yin, Surendra Byna, Huaiming Song, Xian-He Sun, and
Rajeev Thakur, "Boosting Application-Specific Parallel I/O
Optimization Using IOSIG," in Proc. of the 12th IEEE Int'l
Symposium on Cluster Computing and the Grid (CCGrid 2012), May
2012. (pdf)
- Ganesh Gopalakrishnan, Robert M. Kirby, Stephen Siegel, Rajeev Thakur,
William Gropp, Ewing Lusk, Bronis R. de Supinski, Martin Schulz, and Greg
Bronevetsky, "Formal Analysis of MPI-Based Parallel Programs," Communications of the ACM, 54(12):82-91, December 2011.
- Huaiming Song, Yanlong Yin, Xian-He Sun, Rajeev Thakur, Samuel
Lang, "Server-Side I/O Coordination for Parallel File Systems," in Proc.
of SC11: Int'l Conference on High Performance Computing, Networking,
Storage, and Analysis, November 2011. (pdf)
- William Gropp, Torsten Hoefler, Rajeev Thakur, and Jesper
Larsson Träff, "Performance Expectations and Guidelines for MPI
Derived Datatypes," in Proc. of the 18th European MPI Users' Group Meeting
(EuroMPI 2011), September 2011. (pdf)
- David Goodell, William Gropp, Xin Zhao, and Rajeev Thakur,
"Scalable Memory Use in MPI: A Case Study with MPICH2,"
in Proc. of the 18th European MPI Users' Group Meeting
(EuroMPI 2011), September 2011. (pdf)
- Huaiming Song, Yanlong Yin, Xian-He Sun, Rajeev Thakur and
Samuel Lang, "A Segment-Level Adaptive Data Layout Scheme for
Improved Load Balance in Parallel File Systems," in Proc. of
the 11th IEEE Int'l
Symposium on Cluster Computing and the Grid (CCGrid 2011), May 2011. (pdf)
- Yong Chen, Xian-He Sun, Rajeev Thakur, Philip C. Roth, and William
D. Gropp, "LACIO: A New Collective I/O Strategy for Parallel I/O
Systems," in Proc. of the 25th IEEE Int'l
Parallel and Distributed Processing Symposium (IPDPS 2011), May 2011. (pdf)
- Seung Woo Son, Samuel Lang, Robert Latham, Robert Ross, and
Rajeev Thakur, "Reliable MPI-IO through Layout-Aware Replication,"
in Proc. of the 7th IEEE Int'l Workshop on
Storage Network Architecture and Parallel I/O (SNAPI 2011), May
2011. (pdf)
- Pavan Balaji, Darius Buntinas, David Goodell, William Gropp,
Torsten Hoefler, Sameer Kumar, Ewing Lusk, Rajeev Thakur, and Jesper
Larsson Träff, "MPI on Millions of Cores," Parallel
Processing Letters, 21(1):45-60, March 2011. (pdf)
- Torsten Hoefler, Rolf Rabenseifner, Hubert Ritzdorf, Bronis R. de
Supinski, Rajeev Thakur, and Jesper Larsson Träff, "The Scalable
Process Topology Interface of MPI 2.2," Concurrency and
Computation: Practice and Experience, 23(4):293-310, March 2011. (pdf)
- P. Balaji, W. Feng, H. Lin, J. Archuleta, S. Matsuoka, A. Warren,
J. Setubal, E. Lusk, R. Thakur, I. Foster, K. Shinpaugh, S. Coghlan,
and D. Reed, "Global-scale Distributed I/O with ParaMEDIC,"
Concurrency and Computation: Practice and Experience,
22(16):2266-2281, November 2010. (pdf)
- David Goodell, Pavan Balaji, Darius Buntinas, Gabor Dozsa,
William Gropp, Sameer Kumar, Bronis R. de Supinski, and Rajeev
Thakur, "Minimizing MPI Resource Contention in Multithreaded
Multicore Environments," in Proc. of the IEEE
Int'l Conference on Cluster Computing (Cluster 2010),
September 2010. (pdf)
- Yong Chen, Xian-He Sun, Rajeev Thakur, Huaiming Song, and Hui Jin,
"Improving Parallel I/O Performance with Data Layout Awareness," in
Proc. of the IEEE Int'l Conference on Cluster
Computing (Cluster 2010), September 2010. (pdf)
- Gabor Dozsa, Sameer Kumar, Pavan Balaji, Darius Buntinas, David
Goodell, William Gropp, Joseph Ratterman, and Rajeev Thakur,
"Enabling Concurrent Multithreaded MPI Communication on Multicore
Petascale Systems," in Proc. of the 17th European MPI
Users' Group Meeting (EuroMPI 2010), September 2010. (pdf)
- Torsten Hoefler, William Gropp, Rajeev Thakur, and Jesper Larsson
Träff, "Toward Performance Models of MPI Implementations for
Understanding Application Scaling Issues," in Proc. of
the 17th European MPI Users' Group Meeting (EuroMPI 2010), September 2010. (pdf)
- Pavan Balaji, Darius Buntinas, David Goodell, William Gropp, Jayesh
Krishna, Ewing Lusk, and Rajeev Thakur, "PMI: A Scalable Parallel
Process Management Interface for Extreme-Scale Systems,"
in Proc. of the 17th European MPI Users' Group Meeting
(EuroMPI 2010), September 2010. (pdf)
- Jayesh Krishna, Pavan Balaji, Ewing Lusk, Rajeev Thakur, and
Fabian Tillier, "Implementing MPI on Windows: Comparison with Common
Approaches on Unix," in Proc. of the 17th European MPI
Users' Group Meeting (EuroMPI 2010), September 2010. (pdf)
- Zhiling Lan, Jiexing Gu, Ziming Zheng, Rajeev Thakur, and Susan
Coghlan, "A Study of Dynamic Meta-Learning for Failure Prediction in
Large-Scale Systems," Journal of Parallel and Distributed
Computing, 70(6):630-643, June 2010. (pdf)
- James Dinan, Pavan Balaji, Ewing Lusk, P. Sadayappan, and Rajeev
Thakur, "Hybrid Parallel Programming with MPI and Unified Parallel C,"
in Proc. of the 2010 ACM Int'l Conference on
Computing Frontiers, May 2010. (pdf)
- Seung Woo Son, Samuel Lang, Philip Carns, Robert Ross, Rajeev
Thakur, Berkin Ozisikyilmaz, Prabat Kumar, Wei-Keng Liao, and Alok
Choudhary, "Enabling Active Storage on Parallel I/O Software
Stacks," in Proc. of the 26th IEEE Symposium on Massive
Storage Systems and Technologies, May 2010. (pdf)
- Jesper Larsson Träff, William D. Gropp, and Rajeev Thakur,
"Self-Consistent MPI Performance Guidelines," IEEE
Transactions on Parallel and Distributed Systems, 21(5):698-709,
May 2010. (pdf)
- Pavan Balaji, Anthony Chan, William Gropp, Rajeev Thakur, and Ewing Lusk, "The
Importance of Non-Data-Communication Overheads in MPI,"
Int'l Journal of High Performance Computing Applications, 24(1):5-15, Spring 2010. (pdf)
- Pavan Balaji, Darius Buntinas, David Goodell, William Gropp, and
Rajeev Thakur, "Fine-Grained Multithreading Support for Hybrid Threaded
MPI Programming," Int'l Journal of High Performance
Computing Applications, 24(1):49-57, Spring 2010. (pdf)
- Jesper Larsson Träff, Andreas Ripke, Christian Siebert,
Pavan Balaji, Rajeev Thakur, and William Gropp, "A Pipelined
Algorithm for Large, Irregular All-Gather Problems," Int'l Journal of
High Performance Computing Applications, 24(1):58-68, Spring 2010. (pdf)
- Salman Pervez, Ganesh Gopalakrishnan, Robert M. Kirby, Rajeev Thakur,
and William Gropp, "Formal Methods Applied to High Performance
Computing Software Design: A Case Study of MPI One-Sided Communication
Based Locking," Software: Practice and Experience, 40(1):23-43,
January 2010. (pdf)
- Rajeev Thakur and William Gropp, "Test Suite for Evaluating
Performance of Multithreaded MPI Communication," Parallel Computing,
35(12):608-617, December 2009. (pdf)
- Tom Peterka, David Goodell, Robert Ross, Han-Wei Shen, and Rajeev Thakur,
"A Configurable Algorithm for Parallel Image Compositing
Applications," in Proc.
of SC09: Int'l Conference on High Performance Computing, Networking,
Storage, and Analysis, November 2009. (pdf)
- Pavan Balaji, Darius Buntinas, David Goodell, William Gropp,
Sameer Kumar, Ewing Lusk, Rajeev Thakur, and Jesper Larsson
Träff, "MPI on a Million Processors," in
Proc. of the 16th European PVM/MPI Users' Group Meeting (Euro
PVM/MPI 2009), September 2009. (pdf) (selected as outstanding
paper)
- Robert Ross, Robert Latham, William Gropp, Ewing Lusk, and
Rajeev Thakur, "Processing MPI Datatypes outside MPI," in
Proc. of the 16th European PVM/MPI Users' Group Meeting
(Euro PVM/MPI 2009), September 2009. (pdf) (selected as outstanding paper)
- Anh Vo, Sarvani Vakkalanka, Jason Williams, Ganesh Gopalakrishnan,
Robert Kirby, and Rajeev Thakur, "Sound and Efficient Dynamic
Verification of MPI Programs with Probe Non-Determinism," in
Proc. of the 16th European PVM/MPI Users' Group Meeting
(Euro PVM/MPI 2009), September 2009. (pdf)
- Sriram Aananthakrishnan, Michael DeLisi, Sarvani Vakkalanka,
Anh Vo, Ganesh Gopalakrishnan, Robert Kirby, and Rajeev Thakur,
"How Formal Dynamic Verification Tools Facilitate Novel Concurrency
Visualizations," in
Proc. of the 16th European PVM/MPI Users' Group Meeting
(Euro PVM/MPI 2009), September 2009. (pdf)
- Saba Sehrish, Jun Wang, and Rajeev Thakur, "Conflict Detection
Algorithm to Minimize Locking For MPI-IO Atomicity," in
Proc. of the 16th European PVM/MPI Users' Group Meeting
(Euro PVM/MPI 2009), September 2009. (pdf)
- Vinod Tipparaju, William Gropp, Hubert Ritzdorf, Rajeev Thakur,
and Jesper Larsson Träff, "Investigating High Performance RMA
Interfaces for the MPI-3 Standard," in Proc. of the 2009
Int'l Conference on Parallel Processing, September 2009. (pdf)
- Alok Choudhary, Wei-keng Liao, Kui Gao, Arifa Nisar,
Robert Ross, Rajeev Thakur, and Robert Latham, "Scalable I/O and
Analytics," Journal of Physics: Conference Series (SciDAC
2009), Vol. 180, 2009. (pdf)
- P. Lai, P. Balaji, R. Thakur, and D. K. Panda, "ProOnE: A General
Purpose Protocol Onload Engine for Multi- and Many-Core
Architectures," Computer Science -- Research and
Development, 23(3-4):133-142, June 2009. (pdf)
- Pavan Balaji, Anthony Chan, Rajeev Thakur, William Gropp, and Ewing Lusk, "Toward Message
Passing for a Million Processes: Characterizing MPI on a Massive
Scale Blue Gene/P," Computer Science -- Research
and Development, 24(1-2):11-19, September 2009. (pdf) (Best Paper Award at the Int'l Supercomputing Conference (ISC) 2009)
- G. Santhanaraman, P. Balaji, K. Gopalakrishnan, R. Thakur, W. Gropp
and D. K. Panda, "Natively Supporting True One-sided Communication in
MPI on Multi-core Systems with InfiniBand," in Proc. of
the 9th IEEE Int'l
Symposium on Cluster Computing and the Grid (CCGrid 2009), May 2009.
(pdf)
- Anh Vo, Sarvani Vakkalanka, Michael Delisi, Ganesh Gopalakrishnan,
Robert M. Kirby, Rajeev Thakur, "Formal Verification of Practical MPI
Programs," in Proc. of the 14th ACM SIGPLAN Symposium on
Principles and Practice of Parallel Programming (PPoPP 2009), February 2009. (pdf)
- Pavan Balaji, Sitha Bhagvat, Rajeev Thakur, and Dhabaleswar Panda, "Sockets
Direct Protocol for Hybrid Network Stacks: A Case Study with iWARP
over 10G Ethernet," in Proc. of the 15th Int'l
Conference on High Performance Computing (HiPC 2008), December 2008. (pdf)
- Anthony Chan, Pavan Balaji, William Gropp, and Rajeev Thakur, "Communication
Analysis of Parallel 3D FFT for Flat Cartesian Meshes on Large Blue
Gene Systems," in Proc. of the 15th Int'l
Conference on High Performance Computing (HiPC 2008), December 2008. (pdf)
- Yong Chen, Surendra Byna, Xian-He Sun, Rajeev Thakur, and
William Gropp, "Hiding I/O Latency with Pre-execution Prefetching
for Parallel Applications," in Proc.
of SC08: Int'l Conference on High Performance Computing, Networking,
Storage, and Analysis, November 2008. (pdf)
- Surendra Byna, Yong Chen, Xian-He Sun, Rajeev Thakur, and
William Gropp, "Parallel I/O Prefetching Using MPI File Caching and
I/O Signatures," in Proc.
of SC08: Int'l Conference on High Performance Computing, Networking,
Storage, and Analysis, November 2008. (pdf)
- P. Balaji, A. Chan, W. Gropp, R. Thakur, and
E. Lusk, "Non-Data-Communication Overheads in MPI: Analysis on
Blue Gene/P," in
Proc. of the 15th European PVM/MPI Users' Group Meeting (Euro
PVM/MPI 2008), September 2008. (pdf) (selected as outstanding paper)
- Pavan Balaji, Darius Buntinas, David Goodell, William Gropp, and
Rajeev Thakur, "Toward Efficient Support for Multithreaded MPI
Communication," in
Proc. of the 15th European PVM/MPI Users' Group Meeting (Euro
PVM/MPI 2008), September 2008. (pdf)
- William Gropp, Dries Kimpe, Robert Ross, Rajeev Thakur and
Jesper Larsson Träff, "Self-Consistent MPI-IO Performance
Requirements and Expectations," in
Proc. of the 15th European PVM/MPI Users' Group Meeting (Euro
PVM/MPI 2008), September 2008. (pdf)
- Jesper Larsson Träff, Andreas Ripke, Christian Siebert,
Pavan Balaji, Rajeev Thakur, and William Gropp, "A Simple,
Pipelined Algorithm for Large, Irregular All-Gather Problems," in
Proc. of the 15th European PVM/MPI Users' Group Meeting (Euro
PVM/MPI 2008), September 2008. (pdf)
- Subodh Sharma, Sarvani Vakkalanka, Ganesh Gopalakrishnan, Robert M.
Kirby, Rajeev Thakur, and William Gropp, "A Formal Approach to
Detect Functionally Irrelevant Barriers in MPI Programs," in
Proc. of the 15th European PVM/MPI Users' Group Meeting (Euro
PVM/MPI 2008), September 2008. (pdf)
- Sarvani Vakkalanka, Michael DeLisi, Ganesh Gopalakrishnan, Robert M.
Kirby, Rajeev Thakur, and William Gropp, "Implementing Efficient
Dynamic Formal Verification Methods for MPI Programs," in
Proc. of the 15th European PVM/MPI Users' Group Meeting (Euro
PVM/MPI 2008), September 2008. (pdf)
- P. Balaji, W. Feng, H. Lin, J. Archuleta, S. Matsuoka, A. Warren,
J. Setubal, E. Lusk, R. Thakur, I. Foster, D. S. Katz, S. Jha,
K. Shinpaugh, S. Coghlan, and D. Reed, "Distributed I/O with
ParaMEDIC: Experiences with a Worldwide Supercomputer," in
Proc. of the Int'l Supercomputing Conference (ISC'08),
June 2008. (pdf) (Best Paper Award)
- P. Balaji, W. Feng, J. Archuleta, H. Lin, R. Kettimuthu, R. Thakur, and
X. Ma, "Semantics-based Distributed I/O for mpiBLAST," in
Proc. of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel
Programming (PPoPP 2008), February 2008. (short paper) (pdf)
- P. Balaji, W. Feng, S. Bhagvat, D. K. Panda, R. Thakur,
W. Gropp, "Analyzing the Impact of Supporting Out-of-Order
Communication on In-Order Performance with iWARP," in Proc.
of SC07, November 2007. (pdf)
- Jesper Larsson Träff, William Gropp, and Rajeev Thakur, "Self-Consistent MPI
Performance Requirements," in Proc. of the 14th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2007), September 2007, pp. 36-45.
(pdf) (selected as outstanding paper)
- Rajeev Thakur and William Gropp, "Test Suite for Evaluating
Performance of MPI Implementations That Support
MPI_THREAD_MULTIPLE," in Proc. of the 14th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2007), September 2007, pp. 46-55.
(pdf) (selected as outstanding paper)
- William Gropp and Rajeev Thakur, "Revealing the Performance of
MPI RMA Implementations," in Proc. of the 14th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2007), September 2007, pp. 272-280.
(pdf)
- Salman Pervez, Ganesh Gopalakrishnan, Robert M. Kirby, Robert
Palmer, Rajeev Thakur, and William Gropp, "Practical Model Checking
Method for Verifying Correctness of MPI Programs,"
in Proc. of the 14th European PVM/MPI Users' Group Meeting
(Euro PVM/MPI 2007), September 2007, pp. 344-353.
(pdf)
- Robert Latham, William Gropp, Robert Ross, and Rajeev
Thakur, "Extending the MPI-2 Generalized Request Interface," in
Proc. of the 14th European PVM/MPI Users' Group Meeting
(Euro PVM/MPI 2007), September 2007, pp. 223-232.
(pdf)
- Rajeev Thakur and William Gropp, "Open Issues in MPI
Implementation," in Proc. of the 12th Asia-Pacific
Computer Systems Architecture Conference (ACSAC 2007), August 2007, pp. 327-338.
(pdf)
- William Gropp and Rajeev Thakur, "Thread Safety in an MPI
Implementation: Requirements and Analysis," Parallel
Computing, 33(9):595-604, September 2007. (pdf)
- Prashasta Gujrati, Yawei Li, Zhiling Lan, Rajeev Thakur, and John
White, "A Meta-Learning Failure Predictor for Blue Gene/L Systems,"
in Proc. of the 2007 Int'l Conference on Parallel
Processing, September 2007. (pdf)
- Pavan Balaji, Sitha Bhagvat, Dhabaleswar Panda, Rajeev Thakur, and
William Gropp, "Advanced Flow-control Mechanisms for the Sockets Direct Protocol
over InfiniBand," in Proc. of the 2007 Int'l
Conference on Parallel Processing, September 2007. (pdf)
- Robert Latham, Robert Ross, and Rajeev Thakur, "Implementing
MPI-IO Atomic Mode and Shared File Pointers Using MPI One-Sided
Communication," Int'l Journal of High Performance
Computing Applications, 21(2):132-143, Summer 2007. (pdf)
- Subhash Saini, Dale Talcott, Rajeev Thakur, Panagiotis Adamidis,
Rolf Rabenseifner, and Robert Ciotti, "Parallel I/O Performance
Characterization of Columbia and NEC SX-8 Superclusters," in
Proc. of the 21st IEEE Int'l Parallel and Distributed
Processing Symposium (IPDPS 2007), March 2007. (pdf)
- P. Balaji, D. Buntinas, S. Balay, B. Smith, R. Thakur, and W. Gropp,
"Nonuniformly Communicating Noncontiguous Data: A Case Study with
PETSc and MPI," in Proc. of the 21st IEEE Int'l
Parallel and Distributed Processing Symposium (IPDPS 2007), March 2007. (pdf)
- Kenin Coloma, Avery Ching, Alok Choudhary, Wei-keng Liao, Rob
Ross, Rajeev Thakur, and Lee Ward, "A New Flexible MPI Collective
I/O Implementation," in Proc. of the IEEE Int'l
Conference on Cluster Computing (Cluster 2006), September
2006. (pdf)
- William Gropp and Rajeev Thakur, "Issues in Developing a
Thread-Safe MPI Implementation," in Proc. of the 13th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2006), September
2006, pp. 12-21. (pdf) (selected as outstanding paper)
- Salman Pervez, Ganesh Gopalakrishnan, Robert M. Kirby, Rajeev
Thakur, and William Gropp, "Formal Verification of Programs That Use
MPI One-Sided Communication," in Proc. of the 13th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2006), September 2006, pp. 30-39. (pdf) (selected as outstanding paper)
- Robert Latham, Robert Ross, and Rajeev Thakur, "Can MPI Be Used
for Persistent Parallel Services?," in Proc. of the 13th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2006), September 2006,
pp. 275-284. (pdf)
- Surendra Byna, Xian-He Sun, Rajeev Thakur, and William Gropp,
"Automatic Memory Optimizations for Improving MPI Derived Datatype
Performance," in Proc. of the 13th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2006), September 2006, pp. 238-246. (pdf)
- Ernie Chan, Robert van de Geijn, William Gropp, and Rajeev Thakur,
"Collective Communication on Architectures that Support Simultaneous
Communication over Multiple Links," in Proc. of the ACM SIGPLAN
2006 Symposium on Principles and Practice of Parallel Programming
(PPoPP 2006), March 2006. (pdf)
- Jonghyun Lee, Robert Ross, Scott Atchley, Micah Beck, and Rajeev
Thakur, "MPI-IO/L: Efficient Remote I/O for MPI-IO via Logistical
Networking," in Proc. of the 20th IEEE Int'l
Parallel and Distributed Processing Symposium (IPDPS 2006), April 2006. (pdf)
- H. Yu, R. K. Sahoo, C. Howson, G. Almasi, J. G. Castanos,
M. Gupta, J. E. Moreira, J. J. Parker, T. E. Engelsiepen, R. Ross,
R. Thakur, R. Latham, and W. D. Gropp, "High Performance File I/O for
the BlueGene/L Supercomputer," in Proc. of the 12th
Int'l Symposium on High-Performance Computer Architecture
(HPCA-12), February 2006. (pdf)
- Murali Vilayannur, Anand Sivasubramaniam, Mahmut Kandemir,
Rajeev Thakur, and Robert Ross, "Discretionary Caching for I/O on
Clusters," in Cluster Computing, 9(1):29-44, January 2006. (ps, pdf)
- William Gropp and Rajeev Thakur, "An Evaluation of Implementation Options for MPI One-Sided Communication," in Proc. of the 12th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2005), Recent Advances in
Parallel Virtual Machine and Message Passing Interface, Lecture
Notes in Computer Science, LNCS 3666, Springer, September 2005, pp. 415-424. (ps, pdf)
- Rajeev Thakur, Robert Ross, and Robert Latham, "Implementing
Byte-Range Locks Using MPI One-Sided Communication," in Proc. of
the 12th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2005), Recent Advances in
Parallel Virtual Machine and Message Passing Interface, Lecture
Notes in Computer Science, LNCS 3666, Springer, September 2005, pp. 119-128. (ps, pdf) (Note: We subsequently discovered a bug in this algorithm that can lead to deadlock. See the paper by Pervez et al., "Formal Verification of Programs That Use MPI One-Sided Communication," published in Euro PVM/MPI 2006, for details and proposed fixes.)
- Robert Latham, Robert Ross, Rajeev Thakur, and Brian Toonen, "Implementing MPI-IO Shared File Pointers without File System Support," in Proc. of
the 12th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2005), Recent Advances in
Parallel Virtual Machine and Message Passing Interface, Lecture
Notes in Computer Science, LNCS 3666, Springer, September 2005,
pp. 84-93. (ps, pdf)
- Robert Ross, Rajeev Thakur, and Alok Choudhary, "Achievements
and Challenges for I/O in Computational Science," Journal of
Physics: Conference Series (SciDAC 2005), 16:501-509, 2005. (pdf)
- Rajeev Thakur, William Gropp, and Brian Toonen, "Optimizing the Synchronization Operations in MPI One-Sided Communication," Int'l Journal of High Performance Computing Applications, 19(2):119-128, Summer 2005. (ps, pdf)
- Robert Ross, Robert Latham, William Gropp, Rajeev Thakur, and
Brian Toonen, "Implementing MPI-IO Atomic Mode Without File System
Support," in Proc. of the 5th IEEE/ACM Int'l Symposium on
Cluster Computing and the Grid (CCGrid 2005), May 2005. (pdf)
- Rajeev Thakur, Rolf Rabenseifner, and William Gropp, "Optimization
of Collective Communication Operations in MPICH," Int'l Journal of High Performance Computing Applications, 19(1):49-66, Spring 2005. (ps, pdf)
- Rob Latham, Rob Ross, and Rajeev Thakur, "The Impact of File Systems on MPI-IO Scalability," in Proc. of the 11th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2004), Recent Advances in
Parallel Virtual Machine and Message Passing Interface, Lecture
Notes in Computer Science, LNCS 3241, Springer, September 2004, pp. 87-96. (pdf)
- Weihang Jiang, Jiuxing Liu, Hyun-Wook Jin, Dhabaleswar K. Panda,
Darius Buntinas, Rajeev Thakur, and William Gropp, "Efficient
Implementation of MPI-2 Passive One-Sided Communication on
InfiniBand Clusters," in Proc. of the 11th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2004), Recent Advances in
Parallel Virtual Machine and Message Passing Interface, Lecture
Notes in Computer Science, LNCS 3241, Springer, September
2004, pp. 68-76. (pdf)
- Rajeev Thakur, William Gropp, and Brian Toonen, "Minimizing Synchronization Overhead in the Implementation of MPI One-Sided Communication," in Proc. of the 11th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2004), Recent Advances in
Parallel Virtual Machine and Message Passing Interface, Lecture
Notes in Computer Science, LNCS 3241, Springer, September 2004, pp. 57-67. (ps, pdf)
- Jonghyun Lee, Xiaosong Ma, Robert Ross, Rajeev Thakur, and
Marianne Winslett, "RFS: Efficient and Flexible Remote File Access for
MPI-IO," in Proc. of the IEEE Int'l Conference on Cluster
Computing (Cluster 2004), September 2004. (pdf)
- Surendra Byna, Xian-He Sun, William Gropp, and Rajeev Thakur,
"Predicting Memory-Access Cost Based on Data-Access Patterns,"
in Proc. of the IEEE Int'l Conference on Cluster Computing
(Cluster 2004), September 2004. (pdf)
- Weihang Jiang, Jiuxing Liu, Hyun-Wook Jin, Dhabaleswar K. Panda,
William Gropp, and Rajeev Thakur, "High Performance MPI-2 One-Sided
Communication over InfiniBand," in Proc. of the 4th IEEE/ACM
Int'l Symposium on Cluster Computing and the Grid (CCGrid
2004), April 2004. (pdf)
- Tarek El-Ghazawi, Francois Cantonnet, Proshanta Saha, Rajeev
Thakur, Rob Ross, and Dan Bonachea, "UPC-IO: A Parallel I/O API for UPC,"
Version 1.0, Technical Report, High Performance Computing
Laboratory, George Washington University, July 2004. (pdf)
- Murali Vilayannur, Robert B. Ross, Philip H. Carns, Rajeev Thakur,
Anand Sivasubramaniam, and Mahmut Kandemir, "On the Performance of the
POSIX I/O Interface to PVFS," in Proc. of the 12th Euromicro
Conference on Parallel, Distributed, and Network-based Processing,
February 2004, pp. 332-339. (ps, pdf)
- Surendra Byna, William Gropp, Xian-He Sun, and Rajeev Thakur,
"Improving the Performance of MPI Derived Datatypes by Optimizing
Memory-Access Cost," in Proc. of the IEEE Int'l Conference on Cluster
Computing (Cluster 2003),
December 2003, pp. 412-419. (ps, pdf)
- Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev
Thakur, William Gropp, Rob Latham, Andrew Siegel, Brad Gallagher,
and Michael Zingale, "Parallel netCDF: A
High-Performance Scientific I/O Interface," in Proc. of SC2003,
November 2003. (pdf)
- Rajeev Thakur and William Gropp, "Improving the Performance of
Collective Operations in MPICH," in Proc. of the 10th European
PVM/MPI Users' Group Meeting (Euro PVM/MPI 2003), Recent Advances in
Parallel Virtual Machine and Message Passing Interface, Lecture
Notes in Computer Science, LNCS 2840, Springer, September
2003, pp. 257-267. (ps, pdf)
- Jaechun No, Rajeev Thakur, and Alok Choudhary,
"High-Performance Scientific Data Management System,"
Journal of Parallel and Distributed Computing, 64(4):434-447,
April 2003. (ps, pdf)
- William Gropp, Torsten Hoefler, Rajeev Thakur, and Ewing Lusk, Using Advanced MPI: Modern
Features of the Message-Passing Interface, MIT Press, 2014.
- William Gropp, Ewing Lusk, and Rajeev Thakur, Using
MPI-2: Advanced Features of the Message-Passing Interface, MIT
Press, 1999.
Older Publications
- Rajeev Thakur, William Gropp, and Ewing Lusk, "Optimizing
Noncontiguous Accesses in MPI-IO," Parallel
Computing, 28(1):83-105, January 2002. (ps, pdf)
- Rajeev Thakur, William Gropp, and Ewing Lusk, "On
Implementing MPI-IO Portably and with High Performance," in
Proc. of the Sixth Workshop on I/O in Parallel and Distributed
Systems, May 1999, pp. 23-32. (ps, pdf)
- Rajeev Thakur, William Gropp, and Ewing Lusk, "Data Sieving and Collective I/O in ROMIO," in Proc. of the 7th Symposium on the Frontiers of Massively Parallel Computation, February 1999, pp. 182-189. (ps, pdf)
- Rajeev Thakur, William Gropp, and Ewing Lusk, "A Case for Using MPI's
Derived Datatypes to Improve I/O Performance," in
Proc. of SC98: High Performance Networking and Computing,
November 1998. (html)
- Rajeev Thakur, William Gropp, and Ewing Lusk, "An Abstract-Device
Interface for Implementing Portable Parallel-I/O Interfaces," in
Proc. of the 6th Symposium on the Frontiers of Massively
Parallel Computation, October 1996, pp. 180-187. (ps, pdf)
- Rajeev Thakur, Robert Ross, Ewing Lusk, and William Gropp, "Users
Guide for ROMIO: A High-Performance, Portable MPI-IO
Implementation," Technical Memorandum ANL/MCS-TM-234, Mathematics
and Computer Science Division, Argonne National Laboratory, Revised
May 2004. (ps, pdf)
- Rajeev Thakur and William Gropp, "Parallel I/O,"
Sourcebook of Parallel Computing, Morgan Kaufmann Publishers,
Ch. 11, pp. 331-355, 2002.
- Alok Choudhary, Mahmut Kandemir, Sachin More, Jaechun No, and
Rajeev Thakur, "Collective I/O and Large-Scale Data Management,"
Scalable Input/Output: Achieving System Balance, MIT Press,
Ch. 2, pp. 35-75, 2004.
- Rajeev Thakur, William Gropp, and Ewing Lusk, "ADIO: A
Framework for High-Performance, Portable Parallel I/O,"
Scalable Input/Output: Achieving System Balance, MIT Press,
Ch. 4, pp. 111-134, 2004.
- Murali Vilayannur, Anand Sivasubramaniam, Mahmut Kandemir,
Rajeev Thakur, and Robert Ross, "Discretionary Caching for I/O on
Clusters," in Proc. of the 3rd IEEE/ACM
Int'l Symposium on Cluster Computing and the Grid (CCGrid
2003), May 2003, pp. 96-103. (ps, pdf)
- Phillip Dickens and Rajeev Thakur, "Evaluation of Collective I/O
Implementations on Parallel Architectures," Journal of Parallel and
Distributed Computing, 61(8):1052-1076, August 1, 2001. (ps, pdf)
- Philip H. Carns, Walter B. Ligon III, Robert B. Ross, and Rajeev
Thakur, "PVFS: A
Parallel File System for Linux Clusters," in Proc. of the
4th Annual Linux Showcase and Conference, Atlanta,
October 2000, pp. 317-327. (Best Paper Award) (ps, pdf)
- Phillip Dickens and Rajeev Thakur, "Improving
Collective I/O Performance Using Threads," in Proc. of the
13th Int'l Parallel Processing Symposium and 10th Symposium on
Parallel and Distributed Processing, April 1999, pp. 38-45. (ps, pdf)
- Rajeev Thakur, Ewing Lusk, and William Gropp, "I/O in Parallel
Applications: The Weakest Link," The Int'l Journal
of High Performance Computing Applications, 12(4):389-395, Winter
1998. (ps, pdf)
- Rajeev Thakur, William Gropp, and Ewing Lusk, "An Experimental
Evaluation of the Parallel I/O Systems of the IBM SP and Intel Paragon
Using a Production Application," in Proc. of the 3rd
Int'l Conf. of the Austrian Center for Parallel Computation (ACPC)
with special emphasis on Parallel Databases and Parallel I/O,
September 1996. Lecture Notes in Computer Science 1127,
Springer-Verlag, pp. 24-35. (ps, pdf)
- Rajeev Thakur and Alok Choudhary, "An
Extended Two-Phase Method for Accessing Sections of Out-of-Core
Arrays," Scientific Programming, 5(4):301-317, Winter 1996. (ps, pdf)
- Rajeev Thakur, Ewing Lusk, and William Gropp, "I/O
Characterization of a Portable Astrophysics Application on the IBM SP
and Intel Paragon," Preprint MCS-P534-0895, Mathematics
and Computer Science Division, Argonne National Laboratory, Revised October
1995. (ps, pdf)
- Rajeev Thakur, Alok Choudhary, Rajesh Bordawekar, Sachin More, and
Sivaramakrishna Kuditipudi, "Passion: Optimized I/O for Parallel Applications,"
IEEE Computer, 29(6):70-78, June 1996.
- Rajeev Thakur and Alok Choudhary, "Runtime Support for Out-of-Core
Parallel Programs," chapter in Input/Output in Parallel and
Distributed Computer Systems, Kluwer Academic Publishers, 1996.
- Alok Choudhary, Rajesh Bordawekar, Sachin More, K. Sivaram, and Rajeev Thakur, "PASSION
Runtime Library for the Intel Paragon," in Proc. of
the Intel Supercomputer User's Group Conference, June 1995. (ps, pdf)
- Rajeev Thakur, Rajesh Bordawekar, Alok Choudhary, Ravi Ponnusamy,
and Tarvinder Singh, "PASSION
Runtime Library for Parallel I/O," in Proc. of the Scalable
Parallel Libraries Conference, October 1994, pp. 119-128. (ps, pdf)
- Rajeev Thakur, Rajesh Bordawekar, and Alok Choudhary, "Compiler
and Runtime Support for Out-of-core HPF Programs," in
Proc. of 8th ACM Int. Conf. on Supercomputing, July 1994, pp. 382-391. (ps, pdf)
- Alok Choudhary, Rajesh Bordawekar, Michael Harry, Rakesh Krishnaiyer,
Ravi Ponnusamy, Tarvinder Singh, and Rajeev Thakur, "PASSION:
Parallel and Scalable Software for Input-Output," NPAC Technical
Report SCCS-636, Syracuse University, September 1994. Also
available as CRPC Technical Report CRPC-TR94483-S. (ps, pdf)
- Rajesh Bordawekar, Alok Choudhary, and Rajeev Thakur, "Data Access
Reorganizations in Compiling Out-of-Core Data Parallel Programs on
Distributed Memory Machines," NPAC Technical Report SCCS-622,
Syracuse University, September 1994. (ps, pdf)
- Dan Bonachea, Phillip Dickens, and Rajeev Thakur, "High-Performance File I/O in Java: Existing Approaches and Bulk I/O
Extensions," Concurrency: Practice and Experience, 13(8-9):713-736, 2001. (ps, pdf)
- Phillip Dickens and Rajeev Thakur, "An
Evaluation of Java's I/O Capabilities for High-Performance
Computing," in Proc. of the ACM 2000 Java Grande Conference,
June 2000, pp. 26-35. (ps, pdf)
- Jaechun No, Rajeev Thakur, Dinesh Kaushik, Lori Freitag, and
Alok Choudhary, "A
Scientific Data Management System for Irregular
Applications," in Proc. of the Eighth Int'l
Workshop on Solving Irregular Problems in Parallel (Irregular 2001),
April 2001. (ps, pdf)
- Jaechun No, Rajeev Thakur, and Alok Choudhary, "Integrating
Parallel File I/O and Database Support for High-Performance Scientific
Data Management," in Proc. of SC2000: High Performance
Networking and Computing, November 2000. (ps, pdf)
- A. Choudhary, M. Kandemir, H. Nagesh, J. No,
X. Shen, V. Taylor, S. More, and R. Thakur, "Data
Management for Large-Scale Scientific Computations in High Performance
Distributed Systems," in Proc. of the Eighth IEEE
Int'l Symposium on High Performance Distributed Computing,
August 1999, pp. 263-272.
- A. Choudhary, M. Kandemir, J. No, G. Memik,
X. Shen, W. Liao, H. Nagesh, S. More, V. Taylor, R.
Thakur, and R. Stevens, "Data Management for Large-Scale Scientific
Computations in High Performance Distributed Systems," Cluster
Computing, 3(1):45-60, 2000. (ps, pdf)
- Rajeev Thakur, Alok Choudhary, and J. Ramanujam, "Efficient
Algorithms for Array Redistribution," IEEE Transactions on
Parallel and Distributed Systems, 7(6):587-594, June 1996. (ps, pdf)
- Rajeev Thakur and Alok Choudhary, "All-to-All
Communication on Meshes with Wormhole Routing," in Proc. of 8th
Int. Parallel Processing Symposium, April 1994, pp. 561-565. (ps, pdf)
- Rajeev Thakur, Alok Choudhary, and Geoffrey Fox, "Runtime Array
Redistribution in HPF Programs," in Proc. of Scalable
High Performance Computing Conference, May 1994, pp. 309-316. (ps, pdf)
- Rajeev Thakur, Ravi Ponnusamy, Alok Choudhary, and Geoffrey Fox, "Complete
Exchange on the CM-5 and Touchstone Delta,"
The Journal of Supercomputing, 8(4):305-328, 1995.
- Ravi Ponnusamy, Rajeev Thakur, Alok Choudhary, Kishore Velamakanni, and Geoffrey
Fox, "Experimental Performance Evaluation of the CM-5,"
Journal of Parallel and Distributed Computing, November 1993, pp.
192-202.
- Alok Choudhary and Rajeev Thakur, "Connected Component Labeling on
Coarse Grain Parallel Computers: An Experimental Study," Journal of
Parallel and Distributed Computing, January 1994, pp. 78-83.
- Ravi Ponnusamy, Rajeev Thakur, Alok Choudhary, and Geoffrey Fox, "Scheduling
Regular and Irregular Communication Patterns on the CM-5," in Proc. of
Supercomputing 92, Nov. 1992, pp. 394-402. (Best Student
Paper Award in the category of Performance Measurement)
- Kevin Roe, Rajeev Thakur, Thong Dang, and Edward Bogucz, "Implementation of a
3D Mixing Layer Code on Parallel Computers," in
Proc. of AIAA 6th Int'l Aerospace Plane and Hypersonics Technologies
Conference, April 1995.
- Rajeev Thakur, Rajesh Bordawekar, and Alok Choudhary, "Compilation of
Out-of-Core Data Parallel Programs for Distributed Memory Machines,"
in Proc. of the Workshop on I/O in Parallel
Computer Systems at IPPS '94, April 1994.
- Zeki Bozkus, Alok Choudhary, Geoffrey Fox, Thomas Haupt, Sanjay Ranka, Rajeev Thakur,
and J.C. Wang, "Scalable Libraries for High Performance Fortran," in
Proc. of the Scalable Parallel Libraries Conference, October
1993, pp. 67-75.
- Alok Choudhary and Rajeev Thakur, "Evaluation of Connected Component
Labeling Algorithms on Shared and Distributed Memory
Multiprocessors," in Proc. of 6th Int. Parallel Processing
Symposium, March 1992, pp. 362-365.
- Ishfaq Ahmad, Rajesh Bordawekar, Zeki Bozkus, Alok Choudhary, Geoffrey Fox, Kanchana
Parasuram, Ravi Ponnusamy, Rajeev Thakur, and Sanjay Ranka, "Fortran 90D Intrinsic
Functions on Distributed Memory Machines: Implementation and
Scalability," in Proc. of 26th Hawaii Int. Conf.
on System Sciences, Jan. 1993.
- Rajeev Thakur, Alok Choudhary, and Geoffrey Fox, "Complete Exchange on a Wormhole
Routed Mesh," in Proc. of MASCOTS '94, Jan. 1994.