Memory overflow caused by TensorFlow sess.run

License: CC 4.0 BY-SA

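This post walks through diagnosing and fixing a memory overflow that appeared when running a model over a batch of test images. The data-loading method was the first suspect, but after trying several loaders the real cause turned out to be at sess.run: each loop iteration got slower and memory usage climbed. The fix is to move part of the code outside the for loop and to avoid creating TensorFlow ops inside the loop that calls sess.run. Reading data with OpenCV is recommended.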

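Below is the code that runs the model over a batch of test images (the version that overflows). At first I assumed the overflow came from how the image data was read, so I tried reading the images with tf, PIL, and cv, but the loop still kept getting slower and memory usage kept climbing. Debugging showed that the problem was at sess.run: it got slower with every iteration of the for loop.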

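    import os
    import time

    import cv2
    import numpy as np
    import tensorflow as tf

    # create_graph, visual, pb_path, image_dir, save_dir, w, h and IMG_MEAN
    # are defined elsewhere in the project and are not shown here.

    # Creates graph from saved GraphDef
    create_graph(pb_path)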

    # Init tf Session
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    sess = tf.Session(config=config)
    init = tf.global_variables_initializer()
    sess.run(init)


    input_image_tensor = sess.graph.get_tensor_by_name("create_inputs/batch:0") 
    output_tensor_name = sess.graph.get_tensor_by_name("conv6/out_1:0")  


    for filename in os.listdir(image_dir):
        image_path = os.path.join(image_dir, filename)
  
        start = time.time()
        image_data = cv2.imread(image_path)
        image_data = cv2.resize(image_data, (w, h))
        image_data_1 = image_data - IMG_MEAN
        input_image = np.expand_dims(image_data_1, 0)

        # Problem: these two calls create new graph ops on every iteration,
        # so the graph keeps growing for as long as the loop runs.
        raw_output_up = tf.image.resize_bilinear(output_tensor_name, size=[h, w], align_corners=True)
        raw_output_up = tf.argmax(raw_output_up, axis=3)

        predict_img = sess.run(raw_output_up, feed_dict={input_image_tensor: input_image})  # shape: (1, height, width)
        predict_img = np.squeeze(predict_img)  # shape: (height, width)

        voc_palette = visual.make_palette(3)
        masked_im = visual.vis_seg(image_data, predict_img, voc_palette)
        cv2.imwrite("%s_pred.png" % (save_dir + filename.split(".")[0]), masked_im)


        print(time.time() - start)

    print(">>>>>>Done")
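
One way to confirm this diagnosis is to watch the number of operations in the graph while the loop runs. The sketch below is a hypothetical helper (the names `op_count` and `GraphGrowthMonitor` are not from the original code). It relies only on `tf.Graph.get_operations()`, so it can be pointed at `sess.graph` without importing TensorFlow itself.

```python
def op_count(graph):
    """Return how many operations the graph currently holds.

    `graph` is assumed to be a tf.Graph (for example sess.graph), or any
    object that exposes a get_operations() method.
    """
    return len(graph.get_operations())


class GraphGrowthMonitor:
    """Remember the op count at construction and report growth since then."""

    def __init__(self, graph):
        self.graph = graph
        self.baseline = op_count(graph)

    def growth(self):
        # Number of ops added to the graph since the monitor was created.
        return op_count(self.graph) - self.baseline
```

To use it with the loop above, create `monitor = GraphGrowthMonitor(sess.graph)` just before the `for` statement and print `monitor.growth()` once per iteration. While ops are created inside the loop the number should keep rising; once they are built before the loop it should stay at 0.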

Below is the code that fixes the overflow (part of the code is moved outside the for loop):

    # Creates graph from saved GraphDef
    create_graph(pb_path)

    # Init tf Session
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    sess = tf.Session(config=config)
    init = tf.global_variables_initializer()
    sess.run(init)

    input_image_tensor = sess.graph.get_tensor_by_name("create_inputs/batch:0") 
    output_tensor_name = sess.graph.get_tensor_by_name("conv6/out_1:0")  
    
    # Build the post-processing ops once, before the loop starts
    raw_output_up = tf.image.resize_bilinear(output_tensor_name, size=[h, w], align_corners=True)
    raw_output_up = tf.argmax(raw_output_up, axis=3)

    for filename in os.listdir(image_dir):
        image_path = os.path.join(image_dir, filename)
  
        start = time.time()
        image_data = cv2.imread(image_path)
        image_data = cv2.resize(image_data, (w, h))
        image_data_1 = image_data - IMG_MEAN
        input_image = np.expand_dims(image_data_1, 0)
        
        predict_img = sess.run(raw_output_up, feed_dict={input_image_tensor: input_image})  # shape: (1, height, width)
        predict_img = np.squeeze(predict_img)  # shape: (height, width)

        voc_palette = visual.make_palette(3)
        masked_im = visual.vis_seg(image_data, predict_img, voc_palette)
        cv2.imwrite("%s_pred.png" % (save_dir + filename.split(".")[0]), masked_im)
        print(time.time() - start)

    print(">>>>>>Done")

Summary:

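Do not create TensorFlow ops inside the for loop that calls sess.run. Each such call adds nodes to the graph, so the graph grows with every iteration and eventually exhausts memory.
Avoid reading data with tf.gfile.FastGFile(image_path, 'rb').read(), which may also lead to overflow; read images with OpenCV or a similar library instead.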
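
A guard against this mistake is to freeze the graph before the loop starts. `tf.Graph.finalize()` makes the graph read-only, so any op created afterwards raises a `RuntimeError` right away instead of silently growing the graph. The sketch below is an illustration, not the author's code: `run_batches` and its parameter names are hypothetical, and `sess`, `fetch` and `input_tensor` are assumed to correspond to `sess`, `raw_output_up` and `input_image_tensor` above.

```python
def run_batches(sess, fetch, input_tensor, batches):
    """Yield sess.run(fetch) for each batch, with the graph frozen.

    Hypothetical helper: `sess` is assumed to be a tf.Session (TF 1.x API),
    and `fetch` and `input_tensor` are tensors built before this call.
    This is a generator, so nothing runs until iteration starts.
    """
    # Make the graph read-only. From here on, creating any new op
    # raises RuntimeError instead of adding nodes to the graph.
    sess.graph.finalize()
    for batch in batches:
        # Feeding NumPy arrays through feed_dict adds no graph nodes.
        yield sess.run(fetch, feed_dict={input_tensor: batch})
```

With the fixed code, the loop could then be written as `for predict_img in run_batches(sess, raw_output_up, input_image_tensor, inputs):`, where `inputs` is an assumed iterable of preprocessed images. If a `tf.*` call that builds an op slips back into the loop, the error points at the offending line on the first iteration.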