Dynamic 3D array argument in CUDA
I am trying to pass a dynamically sized 3D array as an argument to a kernel function, but I can't get it to work.
    __global__ void kernel(3D array pointer)
    {
        // do something
    }

    int main()
    {
        int NUM_OF_ARRAY;
        const int ROW;
        const int COL;

        int arr[NUM_OF_ARRAY][ROW][COL];
        // Maybe I should use cudaMalloc3D or cudaMalloc3DArray

        dim3 grid( , , , );
        dim3 block( , , , );

        kernel<<<grid, block>>>( ? );
    }
I looked at Robert's answer but I think my case is slightly different.
If the array's row and column counts are set at runtime, how can I allocate that memory in CUDA and pass a pointer to it to the kernel function?
I tried using cudaMalloc3D and cudaMalloc3DArray, but I couldn't get them to work because I have never used them before.
Is there a simple example that uses a dynamic 3D array?
It would be helpful for me. Thanks.
For all the reasons suggested in the previously linked answers and elsewhere, this is not necessarily a good way to handle 3D arrays. A better approach is to flatten the array and use pointer arithmetic to simulate 3D access.
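A minimal sketch of the flattened approach, assuming a [z][y][x] layout in which x varies fastest. The names idx3, fillFlat, runFillFlat and checkCuda are hypothetical and chosen only for illustration. The whole array lives in one cudaMalloc allocation, and the kernel computes each offset with integer arithmetic instead of following pointers:

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void checkCuda(cudaError_t err) {
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: offset of element (z, y, x) in a flat buffer
// laid out as [z][y][x], i.e. x is the fastest-varying dimension.
__host__ __device__ inline size_t idx3(int z, int y, int x, int sz_y, int sz_x) {
    return (static_cast<size_t>(z) * sz_y + y) * sz_x + x;
}

// Single-thread loop nest over the flat buffer; stores z - y + x per element.
__global__ void fillFlat(int *a, int sz_x, int sz_y, int sz_z) {
    for (int i = 0; i < sz_z; i++)
        for (int j = 0; j < sz_y; j++)
            for (int k = 0; k < sz_x; k++)
                a[idx3(i, j, k, sz_y, sz_x)] = i - j + k;
}

// Hypothetical host wrapper: the dimensions are ordinary run-time values.
std::vector<int> runFillFlat(int sz_x, int sz_y, int sz_z) {
    size_t count = static_cast<size_t>(sz_x) * sz_y * sz_z;
    std::vector<int> host(count);
    int *dev = nullptr;
    checkCuda(cudaMalloc((void **)&dev, count * sizeof(int)));
    fillFlat<<<1, 1>>>(dev, sz_x, sz_y, sz_z);
    checkCuda(cudaGetLastError());
    // cudaMemcpy waits for the kernel to finish before copying.
    checkCuda(cudaMemcpy(host.data(), dev, count * sizeof(int),
                         cudaMemcpyDeviceToHost));
    checkCuda(cudaFree(dev));
    return host;
}
```

Calling runFillFlat(4, 3, 2) from main returns 24 values, and the element for (z, y, x) sits at host[idx3(z, y, x, 3, 4)]. Only one allocation and one copy are needed, whatever the dimensions are.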
But just to demonstrate that the previous example doesn't really need hard-coded dimensions, here is that example modified to use variable (run-time) dimensions:
    #include <iostream>
    #include <cstdio>
    #include <cstdlib>

    // Print the CUDA error string with file and line, then optionally exit.
    inline void GPUassert(cudaError_t code, const char *file, int line, bool abort = true)
    {
        if (code != 0) {
            fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
            if (abort) exit(code);
        }
    }

    #define GPUerrchk(ans) { GPUassert((ans), __FILE__, __LINE__); }

    // Subscripts are ordered a[z][y][x]; each element receives i - j + k.
    __global__ void doSmth(int ***a, int sz_x, int sz_y, int sz_z)
    {
        for (int i = 0; i < sz_z; i++)
            for (int j = 0; j < sz_y; j++)
                for (int k = 0; k < sz_x; k++)
                    a[i][j][k] = i - j + k;
    }

    int main()
    {
        unsigned sx;
        unsigned sy;
        unsigned sz;
        std::cout << std::endl << "Enter x dimension (3rd subscript): ";
        std::cin >> sx;
        std::cout << std::endl << "Enter y dimension (2nd subscript): ";
        std::cin >> sy;
        std::cout << std::endl << "Enter z dimension (1st subscript): ";
        std::cin >> sz;

        // Host-side table of row pointers; each row itself lives in device memory.
        int ***h_c = (int ***) malloc(sz * sizeof(int **));
        for (int i = 0; i < sz; i++) {
            h_c[i] = (int **) malloc(sy * sizeof(int *));
            for (int j = 0; j < sy; j++)
                GPUerrchk(cudaMalloc((void **)&h_c[i][j], sx * sizeof(int)));
        }
        int ***h_c1 = (int ***) malloc(sz * sizeof(int **));
        for (int i = 0; i ...
I have modified the data stored by the kernel to i-j+k instead of i+j+k. In addition, I have ordered the subscripts as [z][y][x], because this suggests a thread-index arrangement such as [threadIdx.z][threadIdx.y][threadIdx.x], which is the most conducive to coalesced access. However, this type of multiply-subscripted array in the kernel will still tend to be inefficient, because of the pointer chasing needed to resolve the final location of the data.
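To illustrate the index ordering described above without the pointer chasing, here is a hedged sketch that assigns one thread per element of a flat buffer laid out as [z][y][x]. The names fillFlatParallel, runFillFlatParallel and checkCuda are hypothetical, and the 32 x 4 x 2 block shape is only an assumption, not a tuned value. Because threadIdx.x maps to the last subscript, neighbouring threads in a warp write neighbouring addresses:

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void checkCuda(cudaError_t err) {
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// One thread per element. x comes from threadIdx.x, so consecutive
// threads touch consecutive ints in the flat [z][y][x] buffer.
__global__ void fillFlatParallel(int *a, int sz_x, int sz_y, int sz_z) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;
    // Guard threads that fall outside the array when sizes are not
    // multiples of the block shape.
    if (x < sz_x && y < sz_y && z < sz_z)
        a[(static_cast<size_t>(z) * sz_y + y) * sz_x + x] = z - y + x;
}

// Hypothetical host wrapper taking run-time dimensions.
std::vector<int> runFillFlatParallel(int sz_x, int sz_y, int sz_z) {
    size_t count = static_cast<size_t>(sz_x) * sz_y * sz_z;
    std::vector<int> host(count);
    int *dev = nullptr;
    checkCuda(cudaMalloc((void **)&dev, count * sizeof(int)));

    dim3 block(32, 4, 2);  // assumed shape: 256 threads, x is a full warp wide
    dim3 grid((sz_x + block.x - 1) / block.x,   // round up in each dimension
              (sz_y + block.y - 1) / block.y,
              (sz_z + block.z - 1) / block.z);
    fillFlatParallel<<<grid, block>>>(dev, sz_x, sz_y, sz_z);
    checkCuda(cudaGetLastError());

    // cudaMemcpy waits for the kernel to finish before copying.
    checkCuda(cudaMemcpy(host.data(), dev, count * sizeof(int),
                         cudaMemcpyDeviceToHost));
    checkCuda(cudaFree(dev));
    return host;
}
```

With sizes read at run time, runFillFlatParallel(sx, sy, sz) returns the array with element (z, y, x) at offset (z * sy + y) * sx + x.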