C column of Pointer <2> malloc() free()

This article explains in detail how to use the malloc() function in C to allocate a fixed-size block of memory, demonstrates with examples how to initialize and free such a block, and covers the related pointer operations and their pitfalls.

Ok, let us carry on with pointers. In the previous chapters we covered the definition of a pointer and how different variables are laid out in memory; in this chapter I will introduce how to allocate a fixed-size block of memory (on the heap) and then move a pointer around inside that block.

In "C column of Pointer <0>", we have introduced a function called malloc() to allocate fixed block in the heap , if you don't know just refer to that chapter .Here we go, the prototype is

/* size_t is an unsigned integer type;
 * avoid passing 0 or an unreasonably large size.
 * Returns a pointer of type void *.
 */
void *malloc(size_t size);


malloc() is a library function declared in the <stdlib.h> header; if you do not include this header, the compiler may warn about an implicitly declared function. On success, a pointer to the allocated memory block is returned; on failure, a null pointer is returned. So checking the returned pointer is a good strategy, as below.

int *sp = malloc(4 * sizeof(int));

if (sp == NULL)
{
    /* allocation failed */
    printf("Out of memory\n");
    exit(1);
}
/* go on with the rest of the program */

My PC is 32-bit, and its int type is 32 bits (4 bytes), so this piece of code allocates a 4 * 4 = 16-byte block on the heap. The newly allocated block is not initialized, so any data read from it is meaningless; worse, such garbage values may confuse you if you ignore this issue. The safest way is to initialize the block by hand, by adding this line:

memset(sp, 0, 4 * sizeof(int)); /* fill with zero; memset() is declared in <string.h> */
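
As a side note, the standard library also offers calloc(), which allocates a block and zero-fills it in a single call, so the separate memset() is not needed. Below is a minimal sketch of the same four-int allocation done with calloc(); it is only an alternative, not what the rest of this chapter uses.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* calloc(n, size) allocates n * size bytes and sets every byte to zero */
    int *sp = calloc(4, sizeof(int));
    if (sp == NULL)
    {
        printf("Out of memory\n");
        exit(1);
    }

    printf("%d %d %d %d\n", sp[0], sp[1], sp[2], sp[3]); /* prints 0 0 0 0 */
    free(sp);
    return 0;
}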

Here I have drawn a schematic to help you understand the malloc() allocation above: the block is shown as a row of int slots, with the positions marked "Arrow 0", "Arrow 1", "Arrow 2", and so on, and sp initially pointing at "Arrow 0".


When the returned pointer sp is not null, the allocation has succeeded and sp points to the position shown by "Arrow 0". If you want to store an integer at this address, just use pointer assignment:

*sp = 100;

So the first value has been stored in the allocated block. To set the second, third, and so on, just move the pointer with sp++ and then assign the new value through the new address.

sp++;       /* now points to "Arrow 1" */
*sp = 200;
sp++;       /* now points to "Arrow 2" */
*sp = 300;
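
Incidentally, if sp still pointed at the start of the block, the same three values could be stored with array indexing, which does not move sp at all; this is just an equivalent sketch for comparison.

sp[0] = 100;   /* same location as *sp       */
sp[1] = 200;   /* same location as *(sp + 1) */
sp[2] = 300;   /* same location as *(sp + 2) */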

But here we have made a mistake. Suppose we no longer need this block and want to free it; what should we do? The prototype of free() is:

/* free() releases memory allocated by calloc(), malloc() or realloc().
 *
 * @param ptr   pointer to a memory block that was previously allocated
 */
void free(void *ptr);

Can we simply call this function to free the block allocated by the code above? Is free(sp) OK?

The answer is no, because we have moved the pointer to another address, and free() must receive exactly the pointer that malloc() returned. So a good habit when allocating a block is to keep the returned pointer in a separate variable and use that variable to free the block when it is no longer needed, as in the example below.

/* Allocate a requested block, fill it, print it, then free it */

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int *pf, *sp;

  pf = malloc(4 * sizeof(int));
  if (pf == NULL)
  {
     printf("Out of memory!\n");
     exit(1);
  }

  sp = pf;        /* keep pf untouched; walk through the block with sp */
  *sp++ = 100;    /* by operator precedence *sp++ means *(sp++):
                     store through the current address, then advance sp */
  *sp++ = 200;
  *sp++ = 300;

  sp = pf;        /* rewind to the start of the block */
  printf("The content is %d %d %d\n", sp[0], sp[1], sp[2]);

  /* The 4*4-byte block is no longer needed */
  free(pf);
  return 0;
}
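
A related habit worth mentioning: after free(pf), both pf and sp still hold the old address (they are "dangling"). Setting them to NULL right after the free makes an accidental reuse easier to catch, and since free(NULL) is defined to do nothing, a repeated free becomes harmless. A minimal sketch:

free(pf);
pf = NULL;   /* free(NULL) is a no-op, so freeing again by mistake does no harm */
sp = NULL;   /* sp pointed into the same freed block, so clear it as well */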

Ok, up to now we have introduced memory allocation and deallocation, and how to increment a pointer to store new data in the allocated block. Now let us focus on the pointer itself again: when sp is a pointer to int, each increment sp++ moves the address forward by 4 bytes (this really depends on your PC). Let us list the common sizes.
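
Since the exact sizes vary from platform to platform, here is a minimal sketch that simply prints them with sizeof, so you can check your own machine (on a typical 32-bit PC, int and int * are both 4 bytes):

#include <stdio.h>

int main(void)
{
    /* sizeof yields the size in bytes of each type on this machine */
    printf("char   : %zu\n", sizeof(char));
    printf("short  : %zu\n", sizeof(short));
    printf("int    : %zu\n", sizeof(int));
    printf("long   : %zu\n", sizeof(long));
    printf("float  : %zu\n", sizeof(float));
    printf("double : %zu\n", sizeof(double));
    printf("int *  : %zu\n", sizeof(int *));
    return 0;
}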


These sizes really depend on your computer, especially on embedded devices, so we must pay attention to how many bytes each data type occupies in memory. Then we can use the increment or decrement operator to point to different addresses. But note: never point into memory that you have not allocated and do not know about.
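
To make that warning concrete: assuming sp points to the start of a block of four ints as before, sp may legally be moved anywhere from the first element up to one position past the last, but only the four elements themselves may be dereferenced. A small sketch:

int *end = sp + 4;    /* one past the last element: may be computed and compared */
int *p;

for (p = sp; p < end; p++)
{
    *p = 0;           /* valid: p always stays inside the allocated block */
}
/* *end = 0;  would be undefined behaviour: end is outside the block */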


Time is limited. Have a nice day!
#include "cuda_runtime.h" #include "device_launch_parameters.h" #include <iostream> #include <ctime> using namespace std; cudaError_t CudaCheck(float* matrix_1, float* matrix_2, float* matrix_3, int row, int column); void MatrixSumOnCPU(float* matrix_1, float* matrix_2, float* matrix_3, int row, int column); __global__ void MatrixSumOnGPU(float* matrix1, float* matrix2, float* matrix3, int row, int column); int main() { int row = 1 << 8; int column = 1 << 6; int byte_num = row * column * sizeof(float); float* matrix_1 = (float*)malloc(byte_num); float* matrix_2 = (float*)malloc(byte_num); float* matrix_3_GPU = (float*)malloc(byte_num); float* matrix_3_CPU = (float*)malloc(byte_num); float* matrix_1_pointer = matrix_1; float* matrix_2_pointer = matrix_2; srand(time(NULL)); for (int j = 0; j < row; j++) { for (int i = 0; i < column; i++) { matrix_1_pointer[i] = (float)rand(); matrix_2_pointer[i] = (float)rand(); } matrix_1_pointer += column; matrix_2_pointer += column; } cudaError_t cudaStatus = CudaCheck(matrix_1, matrix_2, matrix_3_GPU, row, column); if (cudaStatus != cudaSuccess) { std::cerr << "CudaCheck failed" << std::endl; return 1; } cudaStatus = cudaDeviceReset(); if (cudaStatus != cudaSuccess) { std::cerr << "CudaCheck failed" << std::endl; return 1; } MatrixSumOnCPU(matrix_1, matrix_2, matrix_3_CPU, row, column); free(matrix_1); free(matrix_2); free(matrix_3_GPU); free(matrix_3_CPU); free(matrix_1_pointer); free(matrix_2_pointer); return 0; } void MatrixSumOnCPU(float* matrix_1, float* matrix_2, float* matrix_3, int row, int column) { float* matrix_a = matrix_1; float* matrix_b = matrix_2; float* matrix_c = matrix_3; clock_t cpu_start = clock(); for (int j = 0; j < row; j++) { for (int i = 0; i < column; i++) { matrix_c[i] = matrix_a[i] + matrix_b[i]; } matrix_a += column; matrix_b += column; matrix_c += column; } clock_t cpu_end = clock(); double total_time = double(cpu_end - cpu_start) / CLOCKS_PER_SEC; std::cout << "the execution time of CPU is " << total_time << std::endl; free(matrix_a); free(matrix_b);//在这报错 已在 CudaMatrixSum.exe 中执行断点指令(__debugbreak()语句或类似调用) free(matrix_c); return; } __global__ void MatrixSumOnGPU(float* matrix_1, float* matrix_2, float* matrix_3, int row, int column) { /* int block_id = blockIdx.x + blockIdx.y * gridDim.x + blockIdx.z * gridDim.y * gridDim.x; int thread_id = threadIdx.x + threadIdx.y * blockDim.x + threadIdx.z * blockDim.y * blockDim.x; int thread_num_per_block = blockDim.x * blockDim.y * blockDim.z; int idx = thread_id + thread_num_per_block * block_id; */ int ix = threadIdx.x + blockIdx.x * blockDim.x; int iy = threadIdx.y + blockIdx.y * blockDim.y; int idx = ix + iy * row; if (ix < column && iy < row) { matrix_3[idx] = matrix_1[idx] + matrix_2[idx]; } } cudaError_t CudaCheck(float* matrix_1, float* matrix_2, float* matrix_3, int row, int column) { float* device_matrix_1 = 0; float* device_matrix_2 = 0; float* device_matrix_3 = 0; int byte_num = row * column * sizeof(float); cudaError_t cudaStatus; cudaStatus = cudaSetDevice(0); if (cudaStatus != cudaSuccess) { std::cerr << "cudaSetDevice failed" << std::endl; goto Error; } cudaStatus = cudaMalloc((float**)&device_matrix_1, byte_num); if (cudaStatus != cudaSuccess) { std::cerr << "cudaMalloc failed" << std::endl; goto Error; } cudaStatus = cudaMalloc((float**)&device_matrix_2, byte_num); if (cudaStatus != cudaSuccess) { std::cerr << "cudaMalloc failed" << std::endl; goto Error; } cudaStatus = cudaMalloc((float**)&device_matrix_3, byte_num); if (cudaStatus != cudaSuccess) { std::cerr << 
"cudaMalloc failed" << std::endl; goto Error; } cudaStatus = cudaMemcpy(device_matrix_1, matrix_1, byte_num, cudaMemcpyHostToDevice); if (cudaStatus != cudaSuccess) { std::cerr << "cudaMemcpy failed" << std::endl; goto Error; } cudaStatus = cudaMemcpy(device_matrix_2, matrix_2, byte_num, cudaMemcpyHostToDevice); if (cudaStatus != cudaSuccess) { std::cerr << "cudaMemcpy failed" << std::endl; goto Error; } dim3 block(32, 32); dim3 grid((column - 1) / block.x + 1, (row - 1) / block.y + 1); clock_t gpu_start = clock(); MatrixSumOnGPU <<<grid, block >>> (device_matrix_1, device_matrix_2, device_matrix_3, row, column); clock_t gpu_end = clock(); double total_time = double(gpu_end - gpu_start) / CLOCKS_PER_SEC; std::cout << "the execution time of GPU is " << total_time << std::endl; cudaStatus = cudaGetLastError(); if (cudaStatus != cudaSuccess) { std::cerr << "cudaGetLastError failed" << std::endl; goto Error; } cudaStatus = cudaDeviceSynchronize(); if (cudaStatus != cudaSuccess) { std::cerr << "cudaDeviceSynchronize failed" << std::endl; goto Error; } cudaStatus = cudaMemcpy(matrix_3, device_matrix_3, byte_num, cudaMemcpyDeviceToHost); if (cudaStatus != cudaSuccess) { std::cerr << "cudaMemcpy failed" << std::endl; goto Error; } Error: cudaFree(device_matrix_1); cudaFree(device_matrix_2); cudaFree(device_matrix_3); return cudaStatus; }
10-22