<?xml version="1.0" encoding="UTF-8"?>
<article xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="XThesis.xsd">

  <!-- ===== Preface ===== -->
  <preface>
    <section caption="Abstract">
      <paragraph>
        <span>This paper proposes a deep-learning-based image recognition method. Experimental results show that, on the </span>
        <b>ImageNet</b>
        <span> dataset, the method achieves an accuracy of </span>
        <inlineeq>92.3\%</inlineeq>
        <span>.</span>
      </paragraph>
      <paragraph>
        <b>Keywords: </b>
        <span>deep learning; image recognition; convolutional neural networks</span>
      </paragraph>
    </section>
  </preface>

  <!-- ===== Main Text ===== -->
  <document>
    <section1 caption="Introduction">
      <paragraph>
        <span>With the rapid development of deep learning </span>
        <cite>he2016, krizhevsky2012</cite>
        <span>, image recognition has made breakthrough progress. However, existing methods still face many challenges in computational efficiency </span>
        <cite>tan2019</cite>
        <span>.</span>
      </paragraph>
      <paragraph>
        <span>The main contributions of this paper are as follows: </span>
        <cref>sec:method,fig:arch</cref>
        <span>.</span>
      </paragraph>

      <section2 caption="Background">
        <paragraph>
          <span>Since their introduction in </span>
          <cite>lecun1998</cite>
          <span>, convolutional neural networks (CNNs) have become a foundational architecture in computer vision.</span>
        </paragraph>
      </section2>

      <section2 caption="Paper Structure">
        <paragraph>
          <span>The rest of this paper is organized as follows: Section </span>
          <ref>sec:method</ref>
          <span> introduces the proposed method, Section </span>
          <ref>sec:experiment</ref>
          <span> presents the experimental results, and Section </span>
          <ref>sec:conclusion</ref>
          <span> concludes the paper.</span>
        </paragraph>
      </section2>
    </section1>

    <section1 caption="Related Work">
      <paragraph>
        <span>In recent years, a variety of network architectures have been proposed. Representative works such as </span>
        <cite>simonyan2014, szegedy2015, he2016</cite>
        <span> have greatly advanced the field.</span>
      </paragraph>
    </section1>

    <section1 caption="Method">
      <paragraph>
        <span>This section describes the architecture of the proposed method in detail, as shown in </span>
        <cref>fig:arch</cref>
        <span>. The overall procedure is defined by Equation </span>
        <cref>eq:loss</cref>
        <span>.</span>
      </paragraph>

      <figure label="fig:arch" caption="Schematic of the network architecture">figures/architecture.png</figure>

      <section2 caption="Loss Function">
        <paragraph>
          <span>This paper uses the cross-entropy loss function, defined as follows:</span>
        </paragraph>

        <equation label="eq:loss">\mathcal{L} = -\sum_{i=1}^{N} y_i \log(\hat{y}_i)</equation>

        <paragraph>
          <span>where </span>
          <inlineeq>N</inlineeq>
          <span> denotes the number of classes, </span>
          <inlineeq>y_i</inlineeq>
          <span> is the ground-truth label, and </span>
          <inlineeq>\hat{y}_i</inlineeq>
          <span> is the predicted probability.</span>
        </paragraph>
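        <!-- Illustrative addition: the one-hot assumption and the label eq:loss-onehot are hypothetical and not part of the original text. -->
        <paragraph>
          <span>As a brief worked case, assume (for illustration only; the label encoding is not stated here) that the label is one-hot with true class </span>
          <inlineeq>c</inlineeq>
          <span>, so that </span>
          <inlineeq>y_c = 1</inlineeq>
          <span> and every other </span>
          <inlineeq>y_i = 0</inlineeq>
          <span>. The sum then reduces to a single term:</span>
        </paragraph>
        <equation label="eq:loss-onehot">\mathcal{L} = -\log(\hat{y}_c)</equation>
        <paragraph>
          <span>For example, with the natural logarithm, a predicted probability of </span>
          <inlineeq>\hat{y}_c = 0.9</inlineeq>
          <span> would give a loss of about </span>
          <inlineeq>0.105</inlineeq>
          <span>, whereas </span>
          <inlineeq>\hat{y}_c = 0.5</inlineeq>
          <span> would give about </span>
          <inlineeq>0.693</inlineeq>
          <span>.</span>
        </paragraph>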
      </section2>

      <section2 caption="Optimization Strategy">
        <paragraph>
          <span>The model is trained with the </span>
          <i>Adam</i>
          <span> optimizer, with the initial learning rate set to </span>
          <inlineeq>10^{-3}</inlineeq>
          <span>.</span>
        </paragraph>
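        <!-- Illustrative addition: the symbols and the labels eq:adam-moments and eq:adam-step are hypothetical names introduced for this sketch. -->
        <paragraph>
          <span>For reference, the standard Adam update can be sketched as follows. The notation is an assumption introduced here: </span>
          <inlineeq>g_t</inlineeq>
          <span> is the gradient at step </span>
          <inlineeq>t</inlineeq>
          <span>, </span>
          <inlineeq>\theta_t</inlineeq>
          <span> are the model parameters, </span>
          <inlineeq>\alpha</inlineeq>
          <span> is the learning rate (initially </span>
          <inlineeq>10^{-3}</inlineeq>
          <span> here), and </span>
          <inlineeq>\beta_1, \beta_2, \epsilon</inlineeq>
          <span> are the optimizer's usual hyperparameters, whose values are not stated in this document.</span>
        </paragraph>
        <equation label="eq:adam-moments">m_t = \beta_1 m_{t-1} + (1-\beta_1) g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2) g_t^2</equation>
        <equation label="eq:adam-step">\theta_t = \theta_{t-1} - \alpha \, \frac{m_t / (1-\beta_1^t)}{\sqrt{v_t / (1-\beta_2^t)} + \epsilon}</equation>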
      </section2>
    </section1>

    <section1 caption="Experiments">
      <paragraph>
        <span>This section evaluates the proposed method on three standard datasets. The results on CIFAR-10 are summarized in Table </span>
        <ref>tbl:result</ref>
        <span>.</span>
      </paragraph>

      <section2 caption="Datasets">
        <paragraph>
          <span>The experiments use the following datasets: </span>
          <b>CIFAR-10</b>
          <span>, </span>
          <b>CIFAR-100</b>
          <span>, and </span>
          <b>ImageNet</b>
          <span>.</span>
        </paragraph>
      </section2>

      <section2 caption="Analysis of Results">
        <paragraph>
          <span>The following table compares the different methods:</span>
        </paragraph>

        <table label="tbl:result" caption="Accuracy comparison of methods on CIFAR-10">
          <thead>
            <td>Method</td>
            <td>Accuracy (%)</td>
            <td>Parameters (M)</td>
          </thead>
          <tbody>
            <tr>
              <td>ResNet-18</td>
              <td>93.0</td>
              <td>11.7</td>
            </tr>
            <tr>
              <td>ResNet-50</td>
              <td>93.5</td>
              <td>25.6</td>
            </tr>
            <tr>
              <td>DenseNet-121</td>
              <td>94.2</td>
              <td>8.0</td>
            </tr>
            <tr>
              <td>Proposed method</td>
              <td>95.1</td>
              <td>9.3</td>
            </tr>
          </tbody>
        </table>

        <pagebreaker/>

        <paragraph>
          <span>As the table shows, the proposed method achieves the highest accuracy (95.1%) with 9.3M parameters, fewer than either ResNet variant though slightly more than DenseNet-121 (8.0M).</span>
        </paragraph>
      </section2>
    </section1>

    <section1 caption="Conclusion">
      <paragraph>
        <span>This paper proposed an efficient image recognition method and validated its effectiveness on multiple datasets. Future work will explore its application to video analysis.</span>
      </paragraph>
    </section1>
  </document>

  <!-- ===== Appendix ===== -->
  <appendix>
    <section caption="Acknowledgments">
      <paragraph>
        <span>This research was supported by the National Natural Science Foundation of China (Grant No. 114514), for which the authors are grateful.</span>
      </paragraph>
    </section>

    <section caption="Appendix A: Core Code">
      <paragraph>
        <span>The following is a pseudocode implementation of the model's core module:</span>
      </paragraph>
    </section>
  </appendix>

  <!-- ===== References ===== -->
  <reference>references.bib</reference>

</article>