What is batch_size?

In deep learning you frequently come across the terms epoch, iteration, and batch size. Here is how I understand the difference between the three:

(1) batch size: the number of samples in one batch. Deep learning models are usually trained with SGD, where each training step draws batch_size samples from the training set;
(2) iteration: 1 iteration means training once on batch_size samples;
(3) epoch: 1 epoch means training once on every sample in the training set;

For example, if the training set has 1000 samples and batch_size = 10, then training over the whole sample set takes:

100 iterations, i.e. 1 epoch.
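
To make that arithmetic concrete, here is a minimal Python sketch (the variable names are illustrative, not from any framework):

```python
# Iterations per epoch = dataset size / batch size (values from the example above).
num_samples = 1000
batch_size = 10

iterations_per_epoch = num_samples // batch_size
print(iterations_per_epoch)  # 100 iterations make up 1 epoch
```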

Batch size defines the number of samples that are going to be propagated through the network in one pass.

For instance, let's say you have 1050 training samples and you set batch_size to 100. The algorithm takes the first 100 samples (1st to 100th) from the training dataset and trains the network. Next it takes the second 100 samples (101st to 200th) and trains the network again. We keep repeating this procedure until all samples have been propagated through the network. The only wrinkle is the last set of samples: in our example, 1050 is not divisible by 100 without a remainder. The simplest solution is to take the final 50 samples and train the network on them.
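
That procedure can be sketched in a few lines of plain Python (a hypothetical helper, not any framework's API); the smaller final batch simply falls out of the slicing:

```python
def iterate_minibatches(samples, batch_size=100):
    # Yield consecutive slices of the sample list; the last slice may be smaller.
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

samples = list(range(1050))                  # stand-in for 1050 training samples
batches = list(iterate_minibatches(samples))
print(len(batches))                          # 11 batches in total
print(len(batches[-1]))                      # the final batch has only 50 samples
```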

Advantages:

  • It requires less memory. Since you train the network using fewer samples at a time, the overall training procedure requires less memory. This is especially important if you cannot fit the whole dataset in memory.

  • Typically networks train faster with mini-batches, because we update the weights after each propagation. In our example we propagated 11 batches (10 of them had 100 samples and 1 had 50 samples), and after each of them we updated the network's parameters. If we used all samples in a single propagation, we would make only 1 update to the network's parameters.

Disadvantages:

  • The smaller the batch, the less accurate the estimate of the gradient. In the figure below you can see that the mini-batch gradient's direction (green) fluctuates compared with the full-batch gradient (blue); a toy numeric experiment illustrating the same effect follows the figure.

[Figure: gradient descent trajectories for full-batch (blue) vs. mini-batch (green) updates]
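
A toy NumPy experiment (synthetic per-sample gradients, purely illustrative) shows the same effect numerically: the batch-mean estimate of the gradient spreads out more as the batch gets smaller:

```python
import numpy as np

rng = np.random.default_rng(0)
per_sample_grads = rng.normal(loc=1.0, scale=5.0, size=10_000)  # toy per-sample gradients
print(per_sample_grads.mean())  # the "full batch" gradient, close to 1.0

for batch_size in (1, 10, 100, 1000):
    # Standard deviation of the batch-mean gradient over many random batches.
    estimates = [rng.choice(per_sample_grads, size=batch_size).mean() for _ in range(200)]
    print(batch_size, float(np.std(estimates)))  # spread shrinks roughly as 1/sqrt(batch_size)
```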

Stochastic gradient descent is just a mini-batch with batch_size equal to 1; its gradient changes direction even more often than a mini-batch's.
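
So the three regimes differ mainly in how many parameter updates you get per epoch; a small sketch with the sizes from the example above (assumed values, not measurements):

```python
num_samples = 1050

for name, batch_size in [("full batch", 1050), ("mini-batch", 100), ("stochastic", 1)]:
    updates_per_epoch = -(-num_samples // batch_size)  # ceiling division: last batch may be smaller
    print(f"{name:>11}: {updates_per_epoch} updates per epoch")
# full batch: 1, mini-batch: 11, stochastic: 1050
```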
