Hybrid implementation in CUDA and OpenMP for the solution of the convection diffusion equation with the local Modified SOR method

Postgraduate Thesis uoadl:1320502

Unit:
Specialization in Electronic Automation (Η/Α, with additional specialization in Informatics and Information Systems)
Library of the School of Science
Deposit date:
2014-11-05
Year:
2014
Author:
Γιαννακόπουλος Πέτρος
Supervisors info:
Ιωάννης Κοτρώνης, Assistant Professor
Original Title:
Υβριδική υλοποίηση σε CUDA και OpenMP για την επίλυση της εξίσωσης διάχυσης θερμότητας με χρήση της τοπικής Τροποποιημένης μεθόδου SOR
Languages:
Greek
Translated title:
Hybrid implementation in CUDA and OpenMP for the solution of the convection diffusion equation with the local Modified SOR method
Summary:
The subject of this thesis is a hybrid parallel implementation of the local
Modified SOR (LMSOR) method for the numerical solution of the convection
diffusion equation. The implementation uses hybrid OpenMP-CUDA code to run on
both the CPU and the GPU and so reduce the time required to reach a solution.
The starting point is two pre-existing, separate implementations of LMSOR, one
in OpenMP and one in CUDA, both employing red-black ordering with two sets of
parameters, ωij and ω2ij, for the 5-point stencil. Grid lines are distributed
statically between the CPU and the GPU. Each device performs the calculations
on its own segment, and on every iteration the border rows are exchanged and
the aggregate convergence is computed. The CPU (OpenMP) implementation also
employs SSE2 extensions, which were likewise incorporated into the CPU part of
the hybrid implementation. For the grid lines computed on the GPU, three
variants of the kernels were implemented, using global, shared, or texture
memory, in line with the initial GPU-only implementation. For large and
medium-sized problems, the performance of the hybrid implementation was
satisfactory and quite close to the aggregate performance of the initial
CPU-only and GPU-only implementations.
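The row-partitioned red-black scheme described in the summary can be
illustrated with a minimal serial sketch in C. It is not the thesis code and
makes several simplifying assumptions: a single global relaxation factor
omega stands in for the local parameters ωij and ω2ij, the model problem is
pure diffusion (the Laplace equation) with no convection terms, and both row
segments are swept on the CPU over one shared array, so the border-row
exchange is only marked by a comment. The names N, half_sweep and solve_split
are hypothetical.

```c
#include <assert.h>

#define N 33  /* grid points per side, boundary included (illustrative) */

/* One colour half-sweep over interior rows [row_begin, row_end).
 * colour 0 = red ((i + j) even), colour 1 = black ((i + j) odd).
 * Returns the largest absolute update, used for the convergence test. */
static double half_sweep(double u[N][N], double omega,
                         int row_begin, int row_end, int colour)
{
    assert(row_begin >= 1 && row_end <= N - 1);
    double max_delta = 0.0;
    for (int i = row_begin; i < row_end; ++i) {
        /* First interior column in row i whose parity matches the colour. */
        for (int j = 1 + ((i + 1 + colour) % 2); j < N - 1; j += 2) {
            /* 5-point stencil average for the Laplace model problem. */
            double gs = 0.25 * (u[i - 1][j] + u[i + 1][j] +
                                u[i][j - 1] + u[i][j + 1]);
            double delta = omega * (gs - u[i][j]);
            double mag = delta < 0.0 ? -delta : delta;
            u[i][j] += delta;
            if (mag > max_delta) max_delta = mag;
        }
    }
    return max_delta;
}

/* Iterates until the aggregate largest update drops below tol.
 * Rows [1, split) stand in for the CPU segment and rows [split, N - 1)
 * for the GPU segment (assumption: both run on the CPU here).
 * Returns the iteration count, or -1 if max_iters was reached. */
static int solve_split(double u[N][N], double omega, int split,
                       double tol, int max_iters)
{
    for (int it = 1; it <= max_iters; ++it) {
        double d = 0.0;
        for (int colour = 0; colour < 2; ++colour) {
            double d_cpu = half_sweep(u, omega, 1, split, colour);
            double d_gpu = half_sweep(u, omega, split, N - 1, colour);
            /* With separate host and device memories, border rows
             * split - 1 and split would be exchanged at this point. */
            double m = d_cpu > d_gpu ? d_cpu : d_gpu;
            if (m > d) d = m;
        }
        if (d < tol) return it;  /* aggregate convergence over both segments */
    }
    return -1;
}
```

Because every red point depends only on black neighbours (and vice versa),
the position of the split does not change the computed values. Calling
solve_split with split = 17 or split = 1 on the same initial grid should
therefore give identical iterates, which is what makes a static distribution
of grid lines between two devices possible.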
Keywords:
Iterative methods, OpenMP-CUDA hybrid, SSE2 extensions, GPU computing, Successive Over-Relaxation method
Index:
Yes
Number of index pages:
7-8
Contains images:
Yes
Number of references:
8
Number of pages:
41
document.pdf (1 MB)