OpenMP Programming Study Notes, Part 3

Latest recommended article published on 2024-09-04 15:16:31


Licensed under CC 4.0 BY-SA

Tags: #programming #thread #parallel #binding #testing #c

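This article introduces OpenMP's sections construct, using C/C++ examples to show how a task can be divided into independent sections that are executed by different threads. An implicit barrier at the end of the sections construct makes every thread wait until all sections have finished before sum is computed, which keeps the result correct. The article also discusses why dividing a parallel program into independent modules matters, and how a sections region binds to threads.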

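Today I studied OpenMP sections. The sections construct lets you split a task into several independent sections, each of which is executed by one thread of the team, so that different sections can run on different threads.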

C/C++ test code:

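#include <stdio.h>
#include <omp.h>

int main(void)
{
    int a = 2;
    int b = 20;
    int c = 200;
    int d = 2000;

    omp_set_num_threads(4);

    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            {
                a = 1;
                printf("execute thread ID is %d\n", omp_get_thread_num());
            }

            #pragma omp section
            {
                b = 10;
                printf("execute thread ID is %d\n", omp_get_thread_num());
            }

            #pragma omp section
            {
                c = 100;
                printf("execute thread ID is %d\n", omp_get_thread_num());
            }

            #pragma omp section
            {
                d = 1000;
                printf("execute thread ID is %d\n", omp_get_thread_num());
            }
        }   /* implicit barrier: every thread waits here until all sections finish */

        /* Still inside the parallel region, so every thread runs this.
           sum is declared here so each thread has its own copy instead of
           all threads writing one shared variable. */
        int sum = a + b + c + d;
        printf("sum = %d\n", sum);
    }

    return 0;
}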

Test output:

execute thread ID is 3
execute thread ID is 1
execute thread ID is 2
execute thread ID is 0
sum = 1111
sum = 1111
sum = 1111
sum = 1111

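As the output shows, the four sections were assigned to four different threads. Because the sections construct ends with an implicit barrier, sum has the expected value. To see this synchronization more directly (every thread waits at the end of the sections construct until all sections have finished, and only then computes sum), modify the code as follows: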

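#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* volatile keeps the compiler from optimizing the store loops away.
       The sections deliberately overwrite each other's variables: this is
       a racy demo of ordering, not a pattern to copy. */
    volatile int a = 2;
    volatile int b = 20;
    volatile int c = 200;
    volatile int d = 2000;

    omp_set_num_threads(4);

    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            {
                a = 1;
                printf("execute thread ID is %d\n", omp_get_thread_num());
                for (int i = 0; i < 0x7ffff; i++)      /* shortest loop */
                {
                    a = 1;
                    b = 2;
                    c = 3;
                    d = 4;
                }
            }

            #pragma omp section
            {
                b = 10;
                printf("execute thread ID is %d\n", omp_get_thread_num());
                for (int i = 0; i < 0x7fffff; i++)
                {
                    a = 10;
                    b = 20;
                    c = 30;
                    d = 40;
                }
            }

            #pragma omp section
            {
                c = 100;
                printf("execute thread ID is %d\n", omp_get_thread_num());
                for (int i = 0; i < 0x7ffffff; i++)
                {
                    a = 100;
                    b = 200;
                    c = 300;
                    d = 400;
                }
            }

            #pragma omp section
            {
                d = 1000;
                printf("execute thread ID is %d\n", omp_get_thread_num());
                for (int i = 0; i < 0x7fffffff; i++)   /* longest loop */
                {
                    a = 1000;
                    b = 2000;
                    c = 3000;
                    d = 4000;
                }
            }
        }   /* implicit barrier: every thread waits here until all sections finish */

        /* Executed by every thread; sum is local to each thread. */
        int sum = a + b + c + d;
        printf("sum = %d\n", sum);
    }

    return 0;
}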

Test output:

execute thread ID is 0
execute thread ID is 2
execute thread ID is 1
execute thread ID is 3
sum = 10000
sum = 10000
sum = 10000
sum = 10000

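The fourth section takes the longest to run, so no thread computes sum until the fourth section has finished. Its stores are therefore the last ones made, which gives sum = 1000 + 2000 + 3000 + 4000 = 10000.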
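The same barrier behaviour can be checked without the sections overwriting each other's variables. The following is only a sketch under stated assumptions, not the author's code: `busy_work`, `done`, and `all_threads_saw_all_done` are hypothetical names, and the iteration counts are arbitrary. Each section does an unequal amount of work and then sets only its own flag; after the sections construct, every thread checks whether all four flags are set.

```c
/* Hypothetical helper: burn CPU time so the sections finish at different times. */
static double busy_work(long iterations)
{
    volatile double x = 0.0;
    for (long i = 0; i < iterations; i++)
        x += 1.0;
    return x;
}

/* Returns 1 if every thread found all four flags set after the sections
   construct, i.e. no thread got past the implicit barrier early. */
int all_threads_saw_all_done(void)
{
    int done[4] = {0, 0, 0, 0};   /* one flag per section, each written once */
    int ok = 1;

    #pragma omp parallel num_threads(4)
    {
        #pragma omp sections
        {
            #pragma omp section
            { busy_work(1000L);     done[0] = 1; }
            #pragma omp section
            { busy_work(100000L);   done[1] = 1; }
            #pragma omp section
            { busy_work(1000000L);  done[2] = 1; }
            #pragma omp section
            { busy_work(10000000L); done[3] = 1; }
        }   /* implicit barrier: all sections have completed here */

        int seen = done[0] + done[1] + done[2] + done[3];
        if (seen != 4) {
            #pragma omp atomic write
            ok = 0;
        }
    }
    return ok;
}
```

If the barrier holds as described, a call such as `all_threads_saw_all_done()` should report success regardless of which thread ran the slowest section.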

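An important problem in parallel programming is dividing the work into separate modules. When these modules are independent of one another, sections can hand them to different processors for execution.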
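As an illustration of such a division, here is a minimal sketch that assumes three independent pieces of work over one read-only array. The `Stats` type and `compute_stats` function are hypothetical names introduced only for this example. Each section reads the shared input but writes only its own result field, so no section depends on another.

```c
#include <stddef.h>

/* Hypothetical result type for three independent computations on one array. */
typedef struct {
    long sum;
    int  min;
    int  max;
} Stats;

/* Each section writes a different field of s, so the sections are independent.
   Assumes n >= 1. */
Stats compute_stats(const int *data, size_t n)
{
    Stats s = {0, data[0], data[0]};

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            long total = 0;
            for (size_t i = 0; i < n; i++)
                total += data[i];
            s.sum = total;
        }
        #pragma omp section
        {
            int lo = data[0];
            for (size_t i = 1; i < n; i++)
                if (data[i] < lo) lo = data[i];
            s.min = lo;
        }
        #pragma omp section
        {
            int hi = data[0];
            for (size_t i = 1; i < n; i++)
                if (data[i] > hi) hi = data[i];
            s.max = hi;
        }
    }   /* implicit barrier: all three fields are filled in here */
    return s;
}
```

The combined `parallel sections` form is used here because the region contains nothing except the sections construct.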

Note that sum = 10000 is printed four times. This is because the statement sum = a + b + c + d; lies inside the parallel region, so every thread in the team executes it once.
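If the sum should be computed and printed only once, one possible approach (not used in the original test) is to wrap that statement in a `single` construct, so that exactly one thread of the team executes it. The sketch below reuses the values from the first test; the function name `sum_once` is an assumption made for illustration.

```c
#include <stdio.h>

/* Runs the four assignments as sections, then lets exactly one thread
   compute and print the sum. Returns the sum for inspection. */
int sum_once(void)
{
    int a = 2, b = 20, c = 200, d = 2000;
    int sum = 0;

    #pragma omp parallel num_threads(4)
    {
        #pragma omp sections
        {
            #pragma omp section
            a = 1;
            #pragma omp section
            b = 10;
            #pragma omp section
            c = 100;
            #pragma omp section
            d = 1000;
        }   /* implicit barrier: a, b, c, d all have their new values */

        /* single: only one thread of the team executes this block */
        #pragma omp single
        {
            sum = a + b + c + d;
            printf("sum = %d\n", sum);
        }
    }
    return sum;
}
```

Moving the statement after the closing brace of the parallel region would have the same effect, since only the initial thread continues past that point.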

Excerpt: The binding thread set for a sections region is the current team. A sections region binds to the innermost enclosing parallel region. Only the threads of the team executing the binding parallel region participate in the execution of the structured blocks and (optional) implicit barrier of the sections region.
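One way to observe this binding is to give the enclosing parallel region fewer threads than there are sections and record which thread runs each section. This is a sketch under assumptions: `record_section_threads` is a hypothetical name, and the fallback stubs exist only so that the code also builds when OpenMP is not enabled. Every recorded ID should belong to the team of the enclosing parallel region.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stubs (assumption: lets the sketch build without OpenMP enabled). */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Records which thread ran each of four sections and returns the size of
   the team that executed the enclosing parallel region. */
int record_section_threads(int requested_threads, int ids[4])
{
    int team_size = 1;

    #pragma omp parallel num_threads(requested_threads)
    {
        #pragma omp single
        team_size = omp_get_num_threads();

        #pragma omp sections
        {
            #pragma omp section
            ids[0] = omp_get_thread_num();
            #pragma omp section
            ids[1] = omp_get_thread_num();
            #pragma omp section
            ids[2] = omp_get_thread_num();
            #pragma omp section
            ids[3] = omp_get_thread_num();
        }
    }
    return team_size;
}
```

With `requested_threads` set to 2, for example, the four sections are shared among at most two threads, so some thread runs more than one section and no ID outside the team appears.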