<?xml version="1.0" encoding="UTF-8"?>
<article xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="XThesis.xsd">
<!-- ===== Preface ===== -->
<preface>
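<section caption="Abstract">
<paragraph>
<span>This paper proposes a deep-learning-based image recognition method. Experimental results show that, on the </span>
<b>ImageNet</b>
<span> dataset, the method achieves an accuracy of </span>
<inlineeq>92.3\%</inlineeq>
<span>.</span>
</paragraph>
<paragraph>
<b>Keywords: </b>
<span>deep learning; image recognition; convolutional neural networks</span>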
</paragraph>
</section>
</preface>
<!-- ===== Main Body ===== -->
<document>
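<section1 caption="Introduction">
<paragraph>
<span>With the rapid development of deep learning </span>
<cite>he2016, krizhevsky2012</cite>
<span>, image recognition has made breakthrough progress. However, existing methods still face many challenges in computational efficiency </span>
<cite>tan2019</cite>
<span>.</span>
</paragraph>
<paragraph>
<span>The main contributions of this paper are as follows: </span>
<cref>sec:method,fig:arch</cref>
<span>.</span>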
</paragraph>
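<section2 caption="Background">
<paragraph>
<span>Since convolutional neural networks (CNNs) were introduced </span>
<cite>lecun1998</cite>
<span>, they have become the foundational architecture of computer vision.</span>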
</paragraph>
</section2>
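<section2 caption="Paper Organization">
<paragraph>
<span>The rest of this paper is organized as follows: Section </span>
<ref>sec:method</ref>
<span> presents the proposed method, Section </span>
<ref>sec:experiment</ref>
<span> reports the experimental results, and Section </span>
<ref>sec:conclusion</ref>
<span> concludes the paper.</span>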
</paragraph>
</section2>
</section1>
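<section1 caption="Related Work">
<paragraph>
<span>In recent years, many network architectures have been proposed. Representative works such as </span>
<cite>simonyan2014, szegedy2015, he2016</cite>
<span> have greatly advanced the field.</span>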
</paragraph>
</section1>
<section1 caption="Method">
<paragraph>
<span>This section describes the architecture of the proposed method in detail, as shown in </span>
<cref>fig:arch</cref>
<span>. The overall pipeline is defined by </span>
<cref>eq:loss</cref>
<span>.</span>
</paragraph>
<ol>
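<li><span>The input image is normalized by the preprocessing module.</span></li>
<li><span>The feature extraction network (shown in </span><cref>fig:arch</cref><span>) extracts multi-scale features.</span></li>
<li><span>The classification head computes the loss according to </span><cref>eq:loss</cref><span> and outputs the prediction.</span></li>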
</ol>
<figure label="fig:arch" caption="Network architecture diagram">figures/architecture.png</figure>
<section2 caption="Loss Function">
<paragraph>
<span>This paper uses the cross-entropy loss function, defined as follows:</span>
</paragraph>
<equation label="eq:loss">\mathcal{L} = -\sum_{i=1}^{N} y_i \log(\hat{y}_i)</equation>
<paragraph>
<span>where </span>
<inlineeq>N</inlineeq>
<span> denotes the number of classes, </span>
<inlineeq>y_i</inlineeq>
<span> is the ground-truth label, and </span>
<inlineeq>\hat{y}_i</inlineeq>
<span> is the predicted probability.</span>
</paragraph>
</section2>
<section2 caption="Optimization Strategy">
<paragraph>
<span>The model is trained with the </span>
<i>Adam</i>
<span> optimizer, with an initial learning rate of </span>
<inlineeq>10^{-3}</inlineeq>
<span>.</span>
</paragraph>
</section2>
</section1>
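<section1 caption="Experiments">
<paragraph>
<span>This section evaluates the proposed method on three standard datasets. The experimental results are summarized in Table </span>
<ref>tbl:result</ref>
<span>.</span>
</paragraph>
<section2 caption="Datasets">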
<ul>
<li><b>CIFAR-10</b><span>: 10 classes, 60,000 color images of 32×32 pixels in total.</span></li>
<li><b>CIFAR-100</b><span>: 100 classes, 60,000 color images of 32×32 pixels in total.</span></li>
<li><b>ImageNet</b><span>: 1,000 classes, about 1.2 million training images in total.</span></li>
</ul>
</section2>
<section2 caption="Results and Analysis">
<paragraph>
<span>The table below compares the different methods:</span>
</paragraph>
<table label="tbl:result" caption="Accuracy comparison of methods on CIFAR-10">
<thead>
<td>Method</td>
<td>Accuracy (%)</td>
<td>Parameters (M)</td>
</thead>
<tbody>
<tr>
<td>ResNet-18</td>
<td>93.0</td>
<td>11.7</td>
</tr>
<tr>
<td>ResNet-50</td>
<td>93.5</td>
<td>25.6</td>
</tr>
<tr>
<td>DenseNet-121</td>
<td>94.2</td>
<td>8.0</td>
</tr>
<tr>
<td>Proposed method</td>
<td>95.1</td>
<td>9.3</td>
</tr>
</tbody>
</table>
<pagebreaker/>
<paragraph>
<span>As the table shows, the proposed method achieves the highest accuracy with a relatively small number of parameters.</span>
</paragraph>
</section2>
</section1>
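<section1 caption="Conclusion">
<paragraph>
<span>This paper proposes an efficient image recognition method and validates its effectiveness on multiple datasets. Future work will explore applying the method to video analysis.</span>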
</paragraph>
</section1>
</document>
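<!-- ===== Appendix ===== -->
<appendix>
<section caption="Acknowledgments">
<paragraph>
<span>This research was supported by the National Natural Science Foundation of China (Grant No. 114514), for which we express our gratitude.</span>
</paragraph>
</section>
<section caption="Appendix A: Core Code">
<paragraph>
<span>The following is a pseudocode implementation of the model's core module:</span>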
</paragraph>
</section>
</appendix>
<!-- ===== References ===== -->
<reference>references.bib</reference>
</article>