Original link: http://blog.youkuaiyun.com/u010167269/article/details/52851667
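Preface
In an earlier post, "Paper Reading: SSD: Single Shot MultiBox Detector", I covered this recent object detection algorithm.
Since SSD is designed to detect objects, can it also be used to detect text in natural scene images? The answer is certainly yes.
Inspired by the ssd-plate_detection work of solace_hyh from Zhejiang University, this post records my own process of applying SSD to text detection.
All of the code has been uploaded to GitHub: https://github.com/chenxinpeng/SSD_scene-text-detection. The code quality is not great, so suggestions from experts are welcome. ^_^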
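Preparing and converting the dataset
The ICDAR 2011 training set contains 229 images, which I split into two parts of 159 and 70 images. The former is used for training, and the latter for testing during training.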
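The post does not say how the 229 images were divided, so the following is only a minimal sketch of one way to obtain a reproducible 159/70 split. The helper name `split_dataset`, the fixed seed, and the random shuffle are all assumptions made for illustration, not the author's procedure.

```python
import random


def split_dataset(names, n_train=159, seed=0):
    """Shuffle image names reproducibly and split them into two lists.

    Hypothetical helper: the original post does not describe how its
    159/70 split was made.
    """
    names = sorted(names)      # start from a deterministic order
    rng = random.Random(seed)  # local RNG, leaves the global state untouched
    rng.shuffle(names)
    return names[:n_train], names[n_train:]


if __name__ == "__main__":
    # Placeholder names standing in for the 229 training images
    all_names = [str(i) for i in range(229)]
    train_names, test_names = split_dataset(all_names)
    print(len(train_names), len(test_names))
```

Fixing the seed keeps the two subsets stable across runs, so the test images never leak into training when the script is executed again.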
The next step is to convert these images into lmdb format for Caffe training, and to convert the text-region labels into the Pascal VOC XML format.
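Converting the ground truth to Pascal VOC XML files
First, convert the gt_**.txt label files provided with ICDAR 2011 into the Pascal VOC XML format.
Let us first look at the original gt_**.txt format. The figure below shows one of the original images:
Below is its ground truth file: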
<pre><code class="hljs bash">158,128,412,182,"Footpath"
442,128,501,170,"To"
393,198,488,240,"and"
63,200,363,242,"Colchester"
71,271,383,313,"Greenstead"</code></pre>
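Each line of the ground truth file above holds four comma-separated box coordinates followed by the quoted word. As a minimal sketch, assuming every line follows exactly this layout, one line could be parsed as below; `parse_gt_line` is a hypothetical helper name and is not part of the author's repository.

```python
def parse_gt_line(line):
    """Parse one ground truth line such as: 158,128,412,182,"Footpath".

    Returns (xmin, ymin, xmax, ymax, label). Splitting at most four times
    keeps any comma inside the quoted word as part of the label.
    """
    xmin, ymin, xmax, ymax, label = line.strip().split(",", 4)
    box = tuple(int(v) for v in (xmin, ymin, xmax, ymax))
    return box + (label.strip().strip('"'),)


if __name__ == "__main__":
    sample = '158,128,412,182,"Footpath"'
    print(parse_gt_line(sample))  # (158, 128, 412, 182, 'Footpath')
```

Limiting the split to four commas matters because the transcription itself may contain a comma, which a plain `split(',')` would break into extra fields.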
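The ground truth file format is: xmin, ymin, xmax, ymax, label. Also note that the coordinate system here is laid out as follows:
The code that converts the ground truth txt files into the Pascal VOC XML format is as follows: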
<pre><code class="language-python hljs">#!/usr/bin/python

import glob
import os

from PIL import Image

# Directory where the ICDAR images are stored
src_img_dir = "/media/chenxp/Datadisk/ocr_dataset/ICDAR2011/train-textloc"
# Directory where the ground truth txt files of the ICDAR images are stored
src_txt_dir = "/media/chenxp/Datadisk/ocr_dataset/ICDAR2011/train-textloc"

img_Lists = glob.glob(src_img_dir + '/*.jpg')

img_basenames = []  # e.g. 100.jpg
for item in img_Lists:
    img_basenames.append(os.path.basename(item))

img_names = []  # e.g. 100
for item in img_basenames:
    temp1, temp2 = os.path.splitext(item)
    img_names.append(temp1)

for img in img_names:
    im = Image.open(src_img_dir + '/' + img + '.jpg')
    width, height = im.size

    # open the corresponding txt file
    with open(src_txt_dir + '/gt_' + img + '.txt') as gt_file:
        gt = gt_file.read().splitlines()

    # write the xml file; open(..., 'w') creates it, so os.mknod is not needed
    with open(src_txt_dir + '/' + img + '.xml', 'w') as xml_file:
        xml_file.write('<annotation>\n')
        xml_file.write('    <folder>VOC2007</folder>\n')
        xml_file.write('    <filename>' + str(img) + '.jpg' + '</filename>\n')
        xml_file.write('    <size>\n')
        xml_file.write('        <width>' + str(width) + '</width>\n')
        xml_file.write('        <height>' + str(height) + '</height>\n')
        xml_file.write('        <depth>3</depth>\n')
        xml_file.write('    </size>\n')

        # write each text region to the xml file
        for img_each_label in gt:
            if not img_each_label.strip():
                continue  # skip blank lines
            spt = img_each_label.split(',')
            xml_file.write('    <object>\n')
            xml_file.write('        <name>text</name>\n')
            xml_file.write('        <pose>Unspecified</pose>\n')
            xml_file.write('        <truncated>0</truncated>\n')
            xml_file.write('        <difficult>0</difficult>\n')
            xml_file.write('        <bndbox>\n')
            xml_file.write('            <xmin>' + spt[0].strip() + '</xmin>\n')
            xml_file.write('            <ymin>' + spt[1].strip() + '</ymin>\n')
            xml_file.write('            <xmax>' + spt[2].strip() + '</xmax>\n')
            xml_file.write('            <ymax>' + spt[3].strip() + '</ymax>\n')
            xml_file.write('        </bndbox>\n')
            xml_file.write('    </object>\n')

        xml_file.write('</annotation>')</code></pre>
Running the code above produces XML files like the one below. Using the same 100.jpg image from above as the example, the conversion result is:
<pre><code class="language-xml hljs"><annotation>
    <folder>VOC2007</folder>
    <filename>100.jpg</filename>
    <size>
        <width>640</width>
        <height>480</height>
        <depth>3</depth>
    </size>
    <object>
    ......
</annotation></code></pre>
The XML files generated by the code above are stored in the same place as the image files.
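One way to sanity-check the generated annotations is to read them back. The sketch below uses only Python's standard `xml.etree.ElementTree` module and assumes the files follow the Pascal VOC layout written by the conversion script; `read_voc_boxes` is a hypothetical helper name and is not part of the author's repository.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Return (width, height, boxes) parsed from a Pascal VOC style annotation.

    boxes is a list of (name, xmin, ymin, xmax, ymax) tuples.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    boxes = []
    for obj in root.findall("object"):
        bnd = obj.find("bndbox")
        coords = tuple(
            int(bnd.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"),) + coords)
    return width, height, boxes


if __name__ == "__main__":
    # Hand-written sample in the same layout as the generated files
    sample = """<annotation>
    <size><width>640</width><height>480</height><depth>3</depth></size>
    <object>
        <name>text</name>
        <bndbox><xmin>158</xmin><ymin>128</ymin><xmax>412</xmax><ymax>182</ymax></bndbox>
    </object>
</annotation>"""
    print(read_voc_boxes(sample))
```

Comparing the returned boxes with the lines of the matching gt_**.txt file, and checking that each box lies inside the reported width and height, would catch most conversion mistakes before the lmdb is built.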
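Generating the list files of training images and XML labels
In this step, as SSD training requires, the image paths and the paths of their corresponding XML files are written into txt files that are read during training. One file is named trainval.txt and the other test.txt. The format is as follows: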
<pre><code class="hljs lasso">scenetext/JPEGImages/106.jpg scenetext/Annotations/106.xml
scenetext/JPEGImages/203.jpg scenetext/Annotations/203.xml
scenetext/JPEGImages/258.jpg scenetext/Annotations/258.xml
scenetext/JPEGImages/122.jpg scenetext/Annotations/122.xml
scenetext/JPEGImages/103.jpg scenetext/Annotations/103.xml
scenetext/JPEGImages/213.jpg scenetext/Annotations/213.xml
scenetext/JPEGImages/149.jpg scenetext/Annotations/149.xml
......</code></pre>
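Each line of these list files pairs one image path with one annotation path, separated by a space. As a minimal sketch, assuming the paths themselves contain no spaces, a line can be checked so that a mismatched pair is caught early; `check_list_line` is a hypothetical helper name and is not part of the author's repository.

```python
import os


def check_list_line(line):
    """Split one 'image annotation' line and check that both names match.

    Returns (image_path, annotation_path); raises ValueError otherwise.
    """
    parts = line.split()
    if len(parts) != 2:
        raise ValueError("expected two space-separated paths: %r" % line)
    img_path, xml_path = parts
    img_stem = os.path.splitext(os.path.basename(img_path))[0]
    xml_stem = os.path.splitext(os.path.basename(xml_path))[0]
    if img_stem != xml_stem:
        raise ValueError("image and annotation differ: %r" % line)
    return img_path, xml_path


if __name__ == "__main__":
    line = "scenetext/JPEGImages/106.jpg scenetext/Annotations/106.xml\n"
    print(check_list_line(line))
```

Running such a check over trainval.txt and test.txt is cheap and avoids discovering a wrong pairing only after the lmdb has been created.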
The code that generates these files is as follows:
<pre><code class="language-python hljs">#!/usr/bin/python

import glob
import os

trainval_dir = "/home/chenxp/data/VOCdevkit/scenetext/trainval"
test_dir = "/home/chenxp/data/VOCdevkit/scenetext/test"

# collect image names without the extension, e.g. 100.jpg -> 100
trainval_img_lists = glob.glob(trainval_dir + '/*.jpg')
trainval_img_names = []
for item in trainval_img_lists:
    temp1, temp2 = os.path.splitext(os.path.basename(item))
    trainval_img_names.append(temp1)

test_img_lists = glob.glob(test_dir + '/*.jpg')
test_img_names = []
for item in test_img_lists:
    temp1, temp2 = os.path.splitext(os.path.basename(item))
    test_img_names.append(temp1)

dist_img_dir = "scenetext/JPEGImages"
dist_anno_dir = "scenetext/Annotations"

# the with blocks close both list files once they are written
with open("/home/chenxp/caffe/data/scenetext/trainval.txt", 'w') as trainval_fd:
    for item in trainval_img_names:
        trainval_fd.write(dist_img_dir + '/' + str(item) + '.jpg' + ' ' + dist_anno_dir + '/' + str(item) + '.xml\n')

with open("/home/chenxp/caffe/data/scenetext/test.txt", 'w') as test_fd:
    for item in test_img_names:
        test_fd.write(dist_img_dir + '/' + str(item) + '.jpg' + ' ' + dist_anno_dir + '/' + str(item) + '.xml\n')</code></pre><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li style="box-sizing: 
border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li></ul>
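Before building the lmdb, it can help to confirm that every path written to the list files actually exists. The following is only a minimal sketch: `find_missing_entries` is a hypothetical helper (not part of SSD or of the original code), and it assumes each line holds two space-separated paths relative to the VOCdevkit root, as written by the script above.

```python
import os


def find_missing_entries(list_path, data_root):
    """Return (line_number, relative_path) pairs for paths absent under data_root.

    Assumes each line has the form "<image path> <annotation path>",
    with both paths relative to data_root (hypothetical helper).
    """
    missing = []
    with open(list_path) as fd:
        for line_no, line in enumerate(fd, start=1):
            parts = line.split()
            if not parts:
                continue  # ignore blank lines
            if len(parts) != 2:
                # Malformed line: report it as a whole.
                missing.append((line_no, line.strip()))
                continue
            for rel_path in parts:
                if not os.path.isfile(os.path.join(data_root, rel_path)):
                    missing.append((line_no, rel_path))
    return missing
```

With the paths used in this post, the call would be `find_missing_entries("/home/chenxp/caffe/data/scenetext/trainval.txt", "/home/chenxp/data/VOCdevkit")`; an empty list means every image and XML file was found.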
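Generating the test_name_size text file
At this step, SSD also needs a file named test_name_size.txt, which records the image name, height, and width of the training and test images. Its contents look like this: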
<code class="hljs lasso has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;">106 480 640
203 480 640
258 480 640
318 480 640
122 480 640
103 480 640
320 640 480
......</code>
The code that generates this text file is as follows:
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;">#! /usr/bin/python
# Write one "<name> <height> <width>" line per image.

import os
import glob
from PIL import Image

img_dir = "/home/chenxp/data/VOCdevkit/scenetext/JPEGImages"

img_lists = glob.glob(img_dir + '/*.jpg')

# The with-block closes the output file once all lines are written.
with open('/home/chenxp/caffe/data/scenetext/test_name_size.txt', 'w') as test_name_size:
    for item in img_lists:
        img = Image.open(item)
        # PIL reports the size as (width, height); the file stores height first.
        width, height = img.size
        name = os.path.splitext(os.path.basename(item))[0]
        test_name_size.write(name + ' ' + str(height) + ' ' + str(width) + '\n')</code>
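The column order is easy to get wrong, because PIL reports an image size as (width, height) while each line of the file stores the height first. The sketch below isolates that formatting step so it can be checked without any image files; `format_name_size` and `parse_name_size` are hypothetical helper names introduced here for illustration, not functions from SSD.

```python
import os


def format_name_size(image_path, size):
    """Build one test_name_size.txt line from a path and a (width, height) tuple.

    `size` follows the (width, height) order of PIL's Image.size;
    the returned line stores height before width.
    """
    width, height = size
    name = os.path.splitext(os.path.basename(image_path))[0]
    return "{} {} {}".format(name, height, width)


def parse_name_size(line):
    """Split a "<name> <height> <width>" line back into typed fields."""
    name, height, width = line.split()
    return name, int(height), int(width)
```

For example, a 640x480 landscape image named 106.jpg gives `format_name_size("106.jpg", (640, 480))`, which produces the line `106 480 640`, matching the first entry of the sample above.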
Preparing the label map file (labelmap)
This prototxt file records the correspondence between each label and its name. Its contents are as follows:
<code class="language-prototxt hljs css has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;">item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "object"
  label: 1
  display_name: "text"
}</code>
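I renamed my prototxt file to labelmap_voc.prototxt. Note that the create_data script in the next section reads labelmap_voc_scenetext.prototxt through its mapfile variable, so the file name and that variable have to agree.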
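For a task with more than one foreground class, the label map can also be generated rather than typed by hand. The following is only a sketch under stated assumptions: `render_labelmap` is a hypothetical helper, it always reserves label 0 for the background entry shown above, and the two-space indentation is a cosmetic choice.

```python
def render_labelmap(classes):
    """Render label map text for a list of (name, display_name) pairs.

    Label 0 is reserved for the background item; foreground classes
    are numbered from 1 in the order given (hypothetical helper).
    """
    items = [("none_of_the_above", 0, "background")]
    for label, (name, display_name) in enumerate(classes, start=1):
        items.append((name, label, display_name))

    blocks = []
    for name, label, display_name in items:
        blocks.append(
            'item {{\n  name: "{}"\n  label: {}\n  display_name: "{}"\n}}'.format(
                name, label, display_name
            )
        )
    return "\n".join(blocks) + "\n"
```

Calling `render_labelmap([("object", "text")])` yields the two items shown above. The number of items (two here, counting the background) is presumably also the class count the training script needs.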
Generating the lmdb database
Once the text files above are ready, place them in the following location:
<code class="language-bash hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;">/home/chenxp/caffe/data/scenetext</code>
Next, modify and run the create_data.sh script provided in the SSD source code (I renamed the file to create_data_scenetext.sh):
<code class="language-bash hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;">cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir=$cur_dir/../..

cd $root_dir

redo=1
data_root_dir="$HOME/data/VOCdevkit"
dataset_name="scenetext"
mapfile="$root_dir/data/$dataset_name/labelmap_voc_scenetext.prototxt"
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
width=0
height=0

extra_cmd="--encode-type=jpg --encoded"
# [ $redo ] is true for any non-empty value; leave redo empty to skip --redo.
if [ $redo ]
then
  extra_cmd="$extra_cmd --redo"
fi
# Build one lmdb per subset from the corresponding list file.
for subset in test trainval
do
  python $root_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile \
    --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height \
    --check-label $extra_cmd $data_root_dir $root_dir/data/$dataset_name/$subset.txt \
    $data_root_dir/$dataset_name/$db/$dataset_name"_"$subset"_"$db examples/$dataset_name
done</code>
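The bash script above automatically converts the ICDAR 2011 training images and their corresponding labels into lmdb files. The output location can be read off the script; on my machine it is: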
<code class="language-bash hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;">/home/chenxp/caffe/examples/scenetext_trainval_lmdb
/home/chenxp/caffe/examples/scenetext_test_lmdb</code>
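To see where a given run should write its database, the path arguments that the bash script passes to create_annoset.py can be reproduced in a few lines. This is only a sketch: `lmdb_paths` is a hypothetical helper that mirrors the string concatenation in the script's last two arguments, and it makes no claim about what create_annoset.py does with them afterwards.

```python
import posixpath


def lmdb_paths(data_root_dir, dataset_name, subset, db="lmdb"):
    """Mirror the last two arguments the bash script passes to create_annoset.py.

    Returns (output database directory, example directory).
    Hypothetical helper; it only rebuilds the strings from the script.
    """
    out_dir = posixpath.join(
        data_root_dir, dataset_name, db,
        "{}_{}_{}".format(dataset_name, subset, db),
    )
    example_dir = posixpath.join("examples", dataset_name)
    return out_dir, example_dir
```

With `data_root_dir` set to `/home/chenxp/data/VOCdevkit`, the trainval subset gives `/home/chenxp/data/VOCdevkit/scenetext/lmdb/scenetext_trainval_lmdb` as the database directory and `examples/scenetext` (relative to the caffe root) as the example directory.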
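Training the model
Applying SSD to your own detection task requires fine-tuning a pretrained network.
Specifically, load the VGG_ILSVRC_16_layers_fc_reduced.caffemodel provided by the SSD authors and continue training on our own data, starting from this pretrained model.
After downloading it, place it in the following location: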
<code class="language-bash hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;">/home/chenxp/caffe/models/VGGNet</code>
Next, modify the training script provided by the authors, ssd_pascal.py. This script automatically creates the following files needed for training:
- deploy.prototxt
- solver.prototxt
- trainval.prototxt
- test.prototxt
We need to modify the following settings to match our own setup:
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; background: transparent; color: inherit; box-sizing: border-box; font-family: "Source Code Pro", monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal;"># Modify the job name if you want.
job_name = "SSD_{}".format(resize)
# The name of the model. Modify it if you want.
model_name = "VGG_VOC0712_{}".format(job_name)

# Directory which stores the model .prototxt file.
save_dir = "models/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the snapshot of models.
snapshot_dir = "models/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the job script and log file.
job_dir = "jobs/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the detection results.
output_result_dir = "{}/data/VOCdevkit/results/VOC2007/{}/Main".format(os.environ['HOME'], job_name)

# model definition files.
train_net_file = "{}/train.prototxt".format(save_dir)
test_net_file = "{}/test.prototxt".format(save_dir)
deploy_net_file = "{}/deploy.prototxt".format(save_dir)
solver_file = "{}/solver.prototxt".format(save_dir)
# snapshot prefix.
snapshot_prefix = "{}/{}".format(snapshot_dir, model_name)
# job script path.
job_file = "{}/{}.sh".format(job_dir, model_name)

# Stores the test image names and sizes. Created by data/VOC0712/create_list.sh
name_size_file = "data/VOC0712/test_name_size.txt"
# The pretrained model. We use the Fully convolutional reduced (atrous) VGGNet.
pretrain_model = "models/VGGNet/VGG_ILSVRC_16_layers_fc_reduced.caffemodel"
# Stores LabelMapItem.
label_map_file = "data/VOC0712/labelmap_voc.prototxt"

num_classes = 21
num_test_image = 4952</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; background-color: rgb(238, 238, 238); top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right: 1px solid rgb(221, 221, 221); list-style: none; text-align: right;"><li
style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li></ul>
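My training parameters
A few more things need changing, notably the training parameters. If you start from the default solver.prototxt parameters in the author's ssd_pascal.py, the following happens:
Partway through training the loss becomes nan: training diverges and does not converge.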
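A divergence like this is easy to miss in a long log. Below is a minimal sketch for spotting it, assuming the training log contains lines of the form `Iteration <n> ... loss = <value>`; the helper name `first_bad_iteration` and the regular expression are illustrative, not part of the SSD code, and may need adjusting to your log format.

```python
import math
import re

# Assumed log shape: "... Iteration <n> ... loss = <value>".
# The \b keeps names such as "mbox_loss" from matching.
LOSS_LINE = re.compile(r"Iteration (\d+).*?\bloss = (\S+)")


def first_bad_iteration(log_lines):
    """Return the first iteration whose loss is NaN or infinite, or None."""
    for line in log_lines:
        match = LOSS_LINE.search(line)
        if match is None:
            continue  # not a loss summary line
        try:
            loss = float(match.group(2))
        except ValueError:
            continue  # value was not a number we can parse
        if math.isnan(loss) or math.isinf(loss):
            return int(match.group(1))
    return None
```

Calling it on the lines of a log file (for example `first_bad_iteration(open(path))`, with `path` pointing at your own log) gives the iteration at which the loss first stopped being finite.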
After some debugging, I settled on the following solver.prototxt setting, which converged in my runs:
<code class="language-prototxt">base_lr: 0.0001</code>
Set the remaining parameters as you see fit. The learning rate must be small: with the original 0.001, training diverges.
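If the solver.prototxt has already been generated, the learning rate can also be patched in place. This is a minimal sketch, assuming the file is plain text with exactly one `base_lr: <value>` line; `set_base_lr` is a hypothetical helper written for this post, not something provided by Caffe or SSD.

```python
import re


def set_base_lr(solver_text, new_lr):
    """Return solver_text with the value on its base_lr line replaced by new_lr."""
    pattern = re.compile(r"^(\s*base_lr\s*:\s*)\S+", flags=re.MULTILINE)
    # Keep the "base_lr: " prefix as written and swap only the number.
    updated, count = pattern.subn(lambda m: m.group(1) + repr(new_lr), solver_text)
    if count != 1:
        raise ValueError("expected exactly one base_lr line, found {}".format(count))
    return updated
```

Read the file, pass its contents through `set_base_lr(text, 0.0001)`, and write the result back. Raising on zero or multiple matches is a deliberate choice, so that a solver file with an unexpected layout is not silently left unchanged.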
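Training finished:
As shown, the final test accuracy is 0.776573, so SSD works reasonably well here.
I uploaded my trained model to the cloud: http://share.weiyun.com/1c544de66be06ea04774fd11e820a780 (password: ERid5Y)
It is needed in the testing stage that follows.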
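Running predict with the trained model
The SSD author also provides prediction code, so we only need to change a few parameters.
Open ~/caffe/examples/ssd_detect.ipynb in Jupyter Notebook. This is the notebook the author wrote for running detection with a trained caffemodel.
Point it at your caffemodel and deploy.prototxt; see my uploaded code for the details.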
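After the forward pass, the raw detections still need to be filtered and scaled to pixel coordinates. The sketch below shows that step in plain Python. It assumes each detection row has the layout `[image_id, label, confidence, xmin, ymin, xmax, ymax]` with corners normalized to [0, 1], as I understand the SSD notebook's output; the function name `filter_detections` and the 0.6 threshold are illustrative choices.

```python
def filter_detections(detections, image_width, image_height, conf_thresh=0.6):
    """Keep rows at or above conf_thresh and convert normalized corners to pixels."""
    boxes = []
    for image_id, label, conf, xmin, ymin, xmax, ymax in detections:
        if conf < conf_thresh:
            continue  # drop low-confidence detections
        boxes.append({
            "label": int(label),
            "score": float(conf),
            # Scale normalized corners by the image size.
            "box": (
                int(round(xmin * image_width)),
                int(round(ymin * image_height)),
                int(round(xmax * image_width)),
                int(round(ymax * image_height)),
            ),
        })
    return boxes
```

For text detection there is only one foreground class, so the `label` field mostly serves as a sanity check; the `box` tuples are what you would draw on the image.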
Results on a few test images: