Interaction with the User in the SAPFOR System

Abstract

Automation of parallel programming is important at every stage of parallel program development. These stages include profiling of the original program, program transformation, which makes it possible to achieve higher performance after parallelization, and, finally, construction and optimization of the parallel program. It is also important to choose a parallel programming model suitable for expressing the parallelism available in a program. On the one hand, the parallel programming model should be capable of mapping the parallel program to a variety of existing hardware resources. On the other hand, it should simplify the development of assistant tools and allow the user to explore, in a semi-automatic way, the parallel programs these tools generate. The SAPFOR (System FOR Automated Parallelization) system combines various approaches to the automation of parallel programming and, moreover, allows the user to guide the parallelization if necessary. SAPFOR produces parallel programs according to the high-level DVMH parallel programming model, which simplifies the development of efficient parallel programs for heterogeneous computing clusters. This paper focuses on the approach to semi-automatic parallel programming which SAPFOR implements. We discuss the architecture of the system and present the interactive subsystem, which is useful for guiding SAPFOR through program parallelization. We used the interactive subsystem to parallelize programs from the NAS Parallel Benchmarks in a semi-automatic way. Finally, we compare the performance of manually written parallel programs with that of the programs the SAPFOR system builds.
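To give a flavor of the target programming model, the fragment below sketches what a DVMH-annotated parallel loop may look like in Fortran. The directive spelling follows published Fortran-DVM examples (a Jacobi-style stencil) and should be treated as an illustrative sketch, not as actual SAPFOR output.

```fortran
! Illustrative Fortran-DVMH sketch (directive spelling based on published
! DVM-system examples; treat as an assumption, not SAPFOR output).
! Directives are comments, so an ordinary Fortran compiler still builds
! a correct serial program from the same source.
      REAL A(N, N), B(N, N)
!DVM$ DISTRIBUTE A(BLOCK, BLOCK)     ! split A block-wise across processors
!DVM$ ALIGN B(I, J) WITH A(I, J)     ! co-locate B with A to avoid communication

!DVM$ PARALLEL (J, I) ON A(I, J)     ! map iterations onto the owners of A(I, J)
      DO J = 2, N - 1
        DO I = 2, N - 1
          A(I, J) = 0.25 * (B(I-1, J) + B(I+1, J) + B(I, J-1) + B(I, J+1))
        END DO
      END DO
```

In this style the programmer (or an assistant tool such as SAPFOR) only annotates data distribution and loop mapping, while the DVMH runtime manages data exchange between nodes and can offload the mapped loops to accelerators, which is what makes the model convenient both for automatic generation and for manual inspection of the result.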

References

1. Grosser T., Groesslinger A., Lengauer C. Polly – performing polyhedral optimizations on a low-level intermediate representation // Parallel Processing Letters. 2012. Vol. 22, No. 04. 1250010. https://doi.org/10.1142/S0129626412500107.
2. Grosser T., Hoefler T. Polly-ACC: Transparent compilation to heterogeneous hardware // Proceedings of the 2016 International Conference on Supercomputing (ICS '16). June 2016. P. 1–13. https://doi.org/10.1145/2925426.2926286.
3. Bondhugula U., Hartono A., Ramanujam J., Sadayappan P. A practical automatic polyhedral parallelizer and locality optimizer // ACM SIGPLAN Notices. 2008. Vol. 43, No. 6. P. 101–113. https://doi.org/10.1145/1379022.1375595.
4. Vandierendonck H., Rul S., De Bosschere K. The Paralax Infrastructure: Automatic Parallelization with a Helping Hand // Proceedings of the 19th international conference on Parallel architectures and compilation techniques (PACT'10). 2010. P. 389–400. https://doi.org/10.1145/1854273.1854322.
5. Baghdadi R., Beaugnon U., Cohen A., Grosser T., Kruse M., Reddy C., Verdoolaege S., Betts A., Donaldson A.F., Ketema J., Absar J., Haastregt S., Kravets A., Lokhmotov A., David R., Hajiyev E. Pencil: A platform-neutral compute intermediate language for accelerator programming // Proceedings of the 2015 International Conference on Parallel Architecture and Compilation (PACT), PACT’15. IEEE Computer Society. Washington, DC, USA, 2015. P. 138–149. https://doi.org/10.1109/PACT.2015.17.
6. Kim M., Kim H., Luk C.-K. Prospector: A Dynamic Data-Dependence Profiler To Help Parallel Programming // 2nd USENIX Workshop on Hot Topics in Parallelism (HotPar'10), 2010. P. 1–6.
7. Intel Parallel Studio. URL: https://software.intel.com/en-us/parallel-studioxe.
8. Klinov M.S., Kryukov V.A. Automatic parallelization of Fortran programs. Mapping onto a cluster // Vestnik of Lobachevsky State University of Nizhni Novgorod. 2009. No. 2. P. 128–134. (In Russian).
9. Bakhtin V.A., Zhukova O.F., Kataev N.A., Kolganov A.S., Kryukov V.A., Kuznetsov M.Yu., Podderyugina N.V., Pritula M.N., Savitskaya O.A., Smirnov A.A. Parallelization of software packages. Problems and prospects // Proceedings of the XX All-Russian Scientific Conference "Scientific Services on the Internet", Novorossiysk, Russia, September 17–22, 2018. Moscow: Keldysh Institute of Applied Mathematics, 2018. P. 63–72. (In Russian). URL: http://keldysh.ru/abrau/2018/theses/33.pdf. https://doi.org/10.20948/abrau-2018-33.
10. Hwu W.-m., Ryoo S., Ueng S.-Z., Kelm J.H., Gelado I., Stone S.S., Kidd R.E., Baghsorkhi S.S., Mahesri A.A., Tsao S.C., Navarro N., Lumetta S.S., Frank M.I., Patel S.J. Implicitly parallel programming models for thousand-core microprocessors // Proceedings of the 44th annual Design Automation Conference (DAC '07), ACM, New York, NY, USA. 2007. P. 754–759. https://doi.org/10.1145/1278480.1278669.
11. Blume W., Eigenmann R. Performance analysis of parallelizing compilers on the Perfect Benchmarks programs // IEEE Transactions on Parallel and Distributed Systems. 1992. Vol. 3, Issue 6. P. 643–656. https://doi.org/10.1109/71.180621.
12. Wolfe M. Scalar vs. parallel optimizations // CSETech. 210. 1990. URL: https://classes.cs.uoregon.edu/16S/cis410parallel/Documents/scalar-paralleloptimizations-wolfe.pdf
13. Konovalov N.A., Krukov V.A., Mikhajlov S.N., Pogrebtsov A.A. Fortran DVM: a Language for Portable Parallel Program Development // Programming and Computer Software. 1995. Vol. 21, No. 1. P. 35–38.
14. Bakhtin V.A., Klinov M.S., Kryukov V.A., Podderyugina N.V., Pritula M.N., Sazanov Yu.L. Extension of the DVM parallel programming model for clusters with heterogeneous nodes // Bulletin of the South Ural State University, series "Mathematical Modeling and Programming". 2012. No. 18 (277), Issue 12. Chelyabinsk: SUSU Publishing Center. P. 82–92. (In Russian).
15. Kulkarni P., Zhao W., Moon H., Cho K., Whalley D., Davidson J., Bailey M., Paek Y., Gallivan K. Finding effective optimization phase sequences // Proceedings of the 2003 ACM SIGPLAN Conference on Languages, Tools, and Compilers for Embedded Systems. 2003. P. 12–23. https://doi.org/10.1145/780731.780735.
16. NAS Parallel Benchmarks. URL: https://www.nas.nasa.gov/publications/npb.html.
17. Lattner C., Adve V. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation // Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO'04), Palo Alto, California. 2004. https://doi.org/10.1109/CGO.2004.1281665.
18. Kataev N.A. Application of the LLVM Compiler Infrastructure to the Program Analysis in SAPFOR // Voevodin V., Sobolev S. (eds). Supercomputing. RuSCDays 2018. Communications in Computer and Information Science. 2018. Vol. 965. Springer, Cham. P. 487–499. https://doi.org/10.1007/978-3-030-05807-4_41.
19. Kataev N.A., Smirnov A.A., Zhukov A.D. Detection of data dependencies using the dynamic analysis tools of the SAPFOR system // Russian Digital Libraries Journal. Special issue "Scientific Services on the Internet". Part 1. 2020. Vol. 23, No. 3. P. 473–493. (In Russian). https://doi.org/10.26907/1562-5419-2020-23-3-473-493.
20. Visual Studio Code. URL: https://code.visualstudio.com/, last accessed 2020/11/25.
21. OpenMP Application Programming Interface. URL: https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5-1.pdf, last accessed 2020/11/25.
22. Kataev N.A., Vasilkin V.N. Recovering the multidimensional form of accesses to linearized arrays in the SAPFOR system // Russian Digital Libraries Journal. Special issue "Scientific Services on the Internet". Part 2. 2020. Vol. 23, No. 4. P. 770–787. (In Russian). https://doi.org/10.26907/1562-5419-2020-23-4-770-787.
23. Seo S., Jo G., Lee J. Performance Characterization of the NAS Parallel Benchmarks in OpenCL // 2011 IEEE International Symposium on Workload Characterization (IISWC), 2011. P. 137–148. https://doi.org/10.1109/IISWC.2011.6114174.
24. SAPFOR. URL: https://github.com/dvm-system.