cudaMallocPitch – Allocating Memory on the GPU

Posted: 2015-03-31 17:16:01

Synopsis: cudaError_t cudaMallocPitch(void** devPtr, size_t* pitch, size_t widthInBytes, size_t height)

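Description: Allocates at least widthInBytes * height bytes of linear memory on the device and returns a pointer to the allocated memory in *devPtr. The function may pad the allocation so that the start of every row still meets the alignment requirements as the address advances from one row to the next. cudaMallocPitch() returns the pitch in *pitch: the width of the allocation, in bytes, including any padding. The pitch is a separate parameter of the allocation and is used to compute addresses within the 2D array. Given the row and column of an array element of type T, the address is computed as: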

T* pElement = (T*)((char*)BaseAddress + Row * pitch) + Column;
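The address formula can be tried out on the host without a GPU. The following is a minimal sketch in plain C, not CUDA API code: `padded_pitch` and `element_at` are hypothetical helper names, and the rounding rule in `padded_pitch` is only an assumption used to imitate padding. In a real program the pitch is whatever cudaMallocPitch() returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the CUDA API): round a row width in
   bytes up to a multiple of `alignment`, imitating the padding that a
   pitched allocation may add. The real pitch is chosen by the runtime. */
static size_t padded_pitch(size_t width_in_bytes, size_t alignment)
{
    assert(alignment > 0);
    return (width_in_bytes + alignment - 1) / alignment * alignment;
}

/* Address of element (row, column) in a pitched array of float:
   step `row * pitch` bytes from the base, then `column` elements. */
static float *element_at(void *base, size_t pitch, size_t row, size_t column)
{
    /* The requested column must lie inside one row. */
    assert(pitch >= (column + 1) * sizeof(float));
    return (float *)((char *)base + row * pitch) + column;
}
```

The cast to `char*` is the important part: the pitch is measured in bytes, so the row offset has to be added before the pointer is converted back to the element type.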

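For allocations of 2D arrays, programmers are advised to consider pitched allocation with cudaMallocPitch(). Because the hardware imposes pitch alignment restrictions, this is especially useful when the application performs 2D memory copies between different regions of device memory (whether linear memory or CUDA arrays).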
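What a 2D copy between regions with different pitches has to do can be sketched on the host in plain C. `copy_2d` below is a hypothetical helper, not a CUDA function; its argument order loosely follows cudaMemcpy2D (dst, dpitch, src, spitch, width, height), which is the runtime call a real CUDA program would use instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side analogue of a pitched 2D copy: copies `height`
   rows of `width_in_bytes` bytes each. Source and destination rows start
   `spitch` and `dpitch` bytes apart, so padding bytes are skipped. */
static void copy_2d(void *dst, size_t dpitch,
                    const void *src, size_t spitch,
                    size_t width_in_bytes, size_t height)
{
    /* A row cannot be wider than the pitch of either buffer. */
    assert(dpitch >= width_in_bytes && spitch >= width_in_bytes);
    for (size_t r = 0; r < height; ++r) {
        memcpy((char *)dst + r * dpitch,
               (const char *)src + r * spitch,
               width_in_bytes);
    }
}
```

Only the payload of each row is copied, so the two buffers may use different pitches as long as each pitch is at least the row width.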

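Example (built for EmuDebug): the CUDA Programming Guide originally declared pitch as int, which does not match the size_t* parameter that cudaMallocPitch() expects when the program is actually built and run.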

/************************************************************************/
/*  This is an example CUDA program.                                    */
/************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

/************************************************************************/
/* myKernel: walk a pitched 2D array row by row and print each element  */
/************************************************************************/
__global__ void myKernel(float* devPtr, int height, int width, size_t pitch)
{
    for (int r = 0; r < height; ++r)
    {
        // Rows are pitch bytes apart, not width * sizeof(float).
        float* row = (float*)((char*)devPtr + r * pitch);
        for (int c = 0; c < width; ++c)
        {
            float element = row[c];
            printf("%f\n", element); // originally run in emulation mode
        }
    }
}

/************************************************************************/
/* Main CUDA                                                            */
/************************************************************************/
int main(int argc, char* argv[])
{
    size_t width = 10;
    size_t height = 10;

    float* devPtr;
    // pitch must be size_t; an int does not match the size_t* parameter
    size_t pitch;
    cudaError_t err = cudaMallocPitch((void**)&devPtr, &pitch, width * sizeof(float), height);
    if (err != cudaSuccess)
    {
        fprintf(stderr, "cudaMallocPitch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Zero the allocation so the kernel prints defined values.
    cudaMemset2D(devPtr, pitch, 0, width * sizeof(float), height);

    myKernel<<<1, 1>>>(devPtr, (int)height, (int)width, pitch);
    cudaDeviceSynchronize(); // wait for the kernel and flush device-side printf
    cudaFree(devPtr);

    printf("%zu\n", pitch);

    return 0;
}

Original article: http://www.cnblogs.com/liangliangdetianxia/p/4381168.html
