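You just need to add a "present(grid)" clause to the kernels directive, which tells the compiler that grid is already on the device.
Here's your example program with the fix, plus a few other changes, such as updating the data so it can be printed on the host.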
% cat test.f90
program Test
  implicit none
  type dt
    integer :: n
    real, dimension(:), allocatable :: xm
  end type dt
  type(dt) :: grid
  integer :: i
  grid%n = 10
  allocate(grid%xm(grid%n))
  !$acc enter data copyin(grid)
  !$acc enter data create(grid%xm)
  !$acc kernels present(grid)
  do i = 1, grid%n
    grid%xm(i) = i * i
  enddo
  !$acc end kernels
  !$acc update host(grid%xm)
  print*,grid%xm
  !$acc exit data delete(grid%xm, grid)
  deallocate(grid%xm)
end program Test
% pgf90 -acc test.f90 -Minfo=accel -ta=tesla -V16.10; a.out
test:
16, Generating enter data copyin(grid)
17, Generating enter data create(grid%xm(:))
18, Generating present(grid)
19, Loop is parallelizable
Accelerator kernel generated
Generating Tesla code
19, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
23, Generating update self(grid%xm(:))
1.000000 4.000000 9.000000 16.00000
25.00000 36.00000 49.00000 64.00000
81.00000 100.0000
Note that PGI 17.7 will include beta support for true deep copy in Fortran, as opposed to the manual deep copy shown above. Here's an example using true deep copy:
% cat test_deep.f90
program Test
  implicit none
  type dt
    integer :: n
    real, dimension(:), allocatable :: xm
  end type dt
  type(dt) :: grid
  integer :: i
  grid%n = 10
  allocate(grid%xm(grid%n))
  !$acc enter data copyin(grid)
  !$acc kernels present(grid)
  do i = 1, grid%n
    grid%xm(i) = i * i
  enddo
  !$acc end kernels
  !$acc update host(grid)
  print*,grid%xm
  !$acc exit data delete(grid)
  deallocate(grid%xm)
end program Test
% pgf90 -acc test_deep.f90 -Minfo=accel -ta=tesla:deepcopy -V17.7 ; a.out
test:
16, Generating enter data copyin(grid)
17, Generating present(grid)
18, Loop is parallelizable
Accelerator kernel generated
Generating Tesla code
18, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
22, Generating update self(grid)
1.000000 4.000000 9.000000 16.00000
25.00000 36.00000 49.00000 64.00000
81.00000 100.0000