Iulia ȘTIRB, Public Dissertation of PhD Thesis
Thesis Title: "Reducing the energy consumption and execution time through the optimization of the communication between threads and through balanced data locality at the execution of parallel programs, on NUMA systems"
Author: Iulia ȘTIRB
- Chair: Professor Dr. Eng. Radu-Emil PRECUP (Politehnica University Timisoara)
- PhD Supervisor: Professor Dr. Eng. Horia CIOCÂRLIE (Politehnica University Timisoara)
- Scientific Referees:
The motivation of this thesis was to create two algorithms. The first, NUMA-BTLP, assigns a type to each thread in the input code at compile-time, classifying the threads according to static criteria defined in the thesis. The second, NUMA-BTDM, maps the threads at compile-time according to their type (the mapping establishes the cores on which the threads will run), aiming to improve balanced data locality on NUMA systems. NUMA-BTDM takes the static behavior of the code into account when performing the mapping. It eliminates the disadvantages of dynamic mapping (extra execution time and energy consumption at run-time) and some important disadvantages of static mapping: the unpredictability, at compile-time, of the number of threads and of the latencies of memory operations.
The research aims to optimize parallel C/C++ applications that use the PThreads library for thread management, through two algorithms: one for the static classification of the threads and the other for their static mapping. The algorithms eliminate some of the disadvantages of not knowing the dynamic behavior at compile-time, such as not knowing the number of threads. They optimize the execution time and power consumption of the applications by improving balanced data locality when the applications run on NUMA systems.
Although the NUMA-BTLP algorithm inserts, at compile-time, additional function calls that set the CPU affinity of each thread, it does not degrade either the execution time or the power consumption on NUMA or UMA systems for the tested applications. On the contrary, it improves both the execution time and the power consumption by up to 2% for a small number of autonomous threads, and it improves the power consumption alone by up to 15% for a large number of autonomous threads combined with a small number of side-by-side threads.
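This summary does not list the exact affinity-setting calls that are inserted into the input code. As an illustration only, the following sketch shows one plausible form such a call could take in a PThreads program on Linux, using the glibc extensions pthread_attr_setaffinity_np and pthread_getaffinity_np. The helper names first_allowed_core and run_pinned_worker and the WorkerArgs structure are hypothetical and are not taken from the thesis.

```cpp
// Sketch only: helper names below are hypothetical, not part of NUMA-BTLP.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // exposes the *_np affinity extensions (g++ defines it already)
#endif
#include <pthread.h>
#include <sched.h>

struct WorkerArgs {
    int core_id;    // core the thread is expected to be restricted to
    int pinned_ok;  // set to 1 by the worker if its affinity mask matches
};

// Returns the lowest-numbered core the calling thread may run on, or -1.
int first_allowed_core() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0) return -1;
    for (int c = 0; c < CPU_SETSIZE; ++c) {
        if (CPU_ISSET(c, &set)) return c;
    }
    return -1;
}

// Thread body: checks that its own affinity mask contains exactly core_id.
void *worker(void *arg) {
    WorkerArgs *a = static_cast<WorkerArgs *>(arg);
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) == 0) {
        a->pinned_ok = (CPU_COUNT(&set) == 1 && CPU_ISSET(a->core_id, &set)) ? 1 : 0;
    }
    return nullptr;
}

// Creates one thread restricted to core_id. Returns core_id if the thread
// observed the expected affinity mask, or -1 on any failure.
int run_pinned_worker(int core_id) {
    if (core_id < 0 || core_id >= CPU_SETSIZE) return -1;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);

    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) return -1;

    // The affinity is attached to the attributes, so the thread starts pinned.
    int rc = pthread_attr_setaffinity_np(&attr, sizeof(set), &set);

    WorkerArgs args = {core_id, 0};
    pthread_t thread;
    if (rc == 0) rc = pthread_create(&thread, &attr, worker, &args);
    pthread_attr_destroy(&attr);
    if (rc != 0) return -1;

    pthread_join(thread, nullptr);
    return args.pinned_ok ? core_id : -1;
}
```

Calling run_pinned_worker(first_allowed_core()) starts a single thread that is restricted to one core from its creation. Setting the affinity through the thread attributes, rather than after pthread_create, is a choice made for this sketch so that the thread never runs unpinned; an existing thread could instead be pinned with pthread_setaffinity_np.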