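feat: switch backend to PaddleOCR-NCNN, switch project to CMake

1. Migrated the project backend entirely to the PaddleOCR-NCNN algorithm; basic compatibility tests pass
2. The project is now organized with CMake; to better support third-party libraries going forward, the QMake project is no longer provided
3. Reorganized the rights declaration files and the code project to minimize infringement risk

Log: switch backend to PaddleOCR-NCNN, switch project to CMake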
Change-Id: I4d5d2c5d37505a4a24b389b1a4c5d12f17bfa38c
wangzhengyang
2022-05-10 09:54:44 +08:00
parent ecdd171c6f
commit 718c41634f
10018 changed files with 3593797 additions and 186748 deletions

@ -0,0 +1,113 @@
# Graph API {#gapi}
# Introduction {#gapi_root_intro}
OpenCV Graph API (or G-API) is a new OpenCV module targeted to make
regular image processing fast and portable. These two goals are
achieved by introducing a new graph-based model of execution.
G-API is a special module in OpenCV -- in contrast with the majority
of other main modules, this one acts as a framework rather than some
specific CV algorithm. G-API provides means to define CV operations,
construct graphs (in form of expressions) using it, and finally
implement and run the operations for a particular backend.
@note G-API is a new module and is under active development. Its API
is volatile at the moment and there may be minor but
compatibility-breaking changes in the future.
# Contents
G-API documentation is organized into the following chapters:
- @subpage gapi_purposes
The motivation behind G-API and its goals.
- @subpage gapi_hld
General overview of G-API architecture and its major internal
components.
- @subpage gapi_kernel_api
Learn how to introduce new operations in G-API and implement them for
various backends.
- @subpage gapi_impl
Low-level implementation details of G-API, for those who want to
contribute.
- API Reference: functions and classes
- @subpage gapi_core
Core G-API operations - arithmetic, boolean, and other matrix
operations;
- @subpage gapi_imgproc
Image processing functions: color space conversions, various
filters, etc.
# API Example {#gapi_example}
A very basic example of a G-API pipeline is shown below:
@include modules/gapi/samples/api_example.cpp
<!-- TODO align this code with text using marks and itemized list -->
G-API is a separate OpenCV module so its header files have to be
included explicitly. The first four lines of `main()` create and
initialize OpenCV's standard video capture object, which fetches
video frames from either an attached camera or a specified file.
The G-API pipeline is constructed next. In fact, it is a series of G-API
operation calls on cv::GMat data. The important aspect of G-API is
that this code block is just a declaration of actions, not the
actions themselves. No processing happens at this point; G-API only
tracks which operations form the pipeline and how they are connected. G-API
_Data objects_ (here, cv::GMat) are used to connect operations
to each other. `in` is an _empty_ cv::GMat signalling the
beginning of the computation.
Once the G-API code is written, it is captured into a call graph by
instantiating a cv::GComputation object. This object takes
input/output data references (in this example, `in` and `out`
cv::GMat objects, respectively) as parameters and reconstructs the
call graph based on all the data flow between `in` and `out`.
cv::GComputation is a thin object in the sense that it just captures which
operations make up a computation. However, it can be used to execute
computations -- in the following processing loop, every captured frame (a
cv::Mat `input_frame`) is passed to cv::GComputation::apply().
![Example pipeline running on sample video 'vtest.avi'](pics/demo.jpg)
cv::GComputation::apply() is a polymorphic method which accepts a
variadic number of arguments. Since this computation is defined on one
input and one output, a special overload of cv::GComputation::apply() is
used to pass input data and get output data.
Internally, cv::GComputation::apply() compiles the captured graph for
the given input parameters and executes the compiled graph on data
immediately.
A number of important concepts can be outlined with this example:
* Graph declaration and graph execution are distinct steps;
* Graph is built implicitly from a sequence of G-API expressions;
* G-API supports function-like calls -- e.g. cv::gapi::resize(), and
operators, e.g. operator|() which is used to compute bitwise OR;
* G-API syntax aims to look pure: every operation call within a graph
yields a new result, thus forming a directed acyclic graph (DAG);
* Graph declaration is not bound to any data -- real data objects
(cv::Mat) come into the picture after the graph is already declared.
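The concepts above can be illustrated with a toy deferred-evaluation graph in plain C++. This is only a sketch of the idea and not how G-API is implemented: `Expr`, `input()`, `add()`, `scale()`, and `run()` are hypothetical names, and a single `int` stands in for image data.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for a G-API data object: it holds no data,
// only a record of how the value would be produced.
struct Expr {
    std::function<int(const std::vector<int>&)> eval;
};
using ExprPtr = std::shared_ptr<const Expr>;

// Placeholder for graph input number `index`.
ExprPtr input(std::size_t index) {
    return std::make_shared<Expr>(Expr{[index](const std::vector<int>& args) {
        assert(index < args.size());
        return args[index];
    }});
}

// Each operation call yields a new node; nothing is computed here.
ExprPtr add(ExprPtr a, ExprPtr b) {
    return std::make_shared<Expr>(Expr{[a, b](const std::vector<int>& args) {
        return a->eval(args) + b->eval(args);
    }});
}

ExprPtr scale(ExprPtr a, int factor) {
    return std::make_shared<Expr>(Expr{[a, factor](const std::vector<int>& args) {
        return a->eval(args) * factor;
    }});
}

// Execution is a separate step: real data is bound only at this point.
int run(const ExprPtr& out, const std::vector<int>& args) {
    return out->eval(args);
}
```

Declaring `out = scale(add(in, in), 3)` performs no arithmetic; the same declared graph can then be run on different inputs.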
<!-- FIXME: The above operator|() link links to MatExpr not GAPI -->
See [tutorials and porting examples](@ref tutorial_table_of_content_gapi)
to learn more on various G-API features and concepts.
<!-- TODO Add chapter on declaration, compilation, execution -->

@ -0,0 +1,76 @@
# Why Graph API? {#gapi_purposes}
# Motivation behind G-API {#gapi_intro_why}
The G-API module brings a graph-based model of execution to OpenCV. This
chapter briefly describes how this new model can help software
developers in two aspects: optimizing and porting image processing
algorithms.
## Optimizing with Graph API {#gapi_intro_opt}
Traditionally OpenCV provided a lot of stand-alone image processing
functions (see modules `core` and `imgproc`). Many of those functions
are well-optimized (e.g. vectorized for specific CPUs, parallel, etc.),
but the out-of-the-box optimization scope has been limited to a
single function only -- optimizing the whole algorithm built atop those
functions was the responsibility of the programmer.
OpenCV 3.0 introduced the _Transparent API_ (or _T-API_), which made it
possible to offload OpenCV function calls transparently to OpenCL devices
and save on Host/Device data transfers with cv::UMat -- and it was a great
step forward. However, T-API is a dynamic API -- user code still remains
unconstrained and OpenCL kernels are enqueued in arbitrary order, thus
eliminating further pipeline-level optimization potential.
G-API brings an implicit graph model to OpenCV 4.0. The graph model captures
all operations and their data dependencies in a pipeline and so provides
the G-API framework with extra information to do pipeline-level
optimizations.
The cornerstone of graph-based optimizations is _Tiling_. Tiling
breaks the processing into smaller parts and reorganizes
operations to enable data parallelism, improve data locality, and reduce
the memory footprint. Data locality is an especially important aspect of
software optimization due to the different costs of memory access on modern
computer architectures -- the more data is reused in the first-level
cache, the more efficient the pipeline is.
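As a rough illustration of tiling, the sketch below runs the same two-stage pipeline in two ways: stage by stage over the whole image, and row by row with a one-row intermediate buffer. It is a plain C++ toy under stated assumptions, not G-API code: `Image`, `run_untiled()`, and `run_tiled()` are hypothetical names, and the two stages (add 10, then halve) are arbitrary per-pixel operations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<int>>;  // rows of pixels

// Untiled: stage 1 writes a full-size intermediate image, stage 2 reads it back.
Image run_untiled(const Image& in) {
    Image tmp(in.size()), out(in.size());
    for (std::size_t y = 0; y < in.size(); ++y)
        for (int px : in[y]) tmp[y].push_back(px + 10);
    for (std::size_t y = 0; y < in.size(); ++y)
        for (int px : tmp[y]) out[y].push_back(px / 2);
    return out;
}

// Tiled by rows: both stages run on one row before moving to the next, so the
// intermediate buffer is a single row that is more likely to stay in cache.
Image run_tiled(const Image& in) {
    Image out(in.size());
    std::vector<int> row_tmp;  // one-row intermediate, reused for every tile
    for (std::size_t y = 0; y < in.size(); ++y) {
        row_tmp.clear();
        for (int px : in[y]) row_tmp.push_back(px + 10);
        assert(row_tmp.size() == in[y].size());
        for (int px : row_tmp) out[y].push_back(px / 2);
    }
    return out;
}
```

Both functions produce the same output; only the order of work and the size of the intermediate storage differ, which is the kind of reorganization a graph compiler can apply without changes to the algorithm code.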
The aforementioned techniques can certainly be applied manually,
but this requires extra skills and knowledge of the target platform, and
the algorithm implementation changes irrevocably -- becoming more
specific, less flexible, and harder to extend and maintain.
G-API takes this responsibility and complexity from the user and does the
majority of the work by itself, keeping the algorithm code clean from
device or optimization details. This approach has its own limitations,
though, as the graph model is a _constrained_ model and not every
algorithm can be represented as a graph, so the G-API scope is limited
only to regular image processing -- various filters, arithmetic,
binary operations, and well-defined geometrical transformations.
## Porting with Graph API {#gapi_intro_port}
The essence of G-API is declaring a sequence of operations to run, and
then executing that sequence. G-API is a constrained API, so it puts a
number of limitations on which operations can form a pipeline and
which data these operations may exchange with each other.
This formalization in fact helps to make an algorithm portable. G-API
clearly separates operation _interfaces_ from their _implementations_.
One operation (_kernel_) may have multiple implementations even for a
single device (e.g., OpenCV-based "reference" implementation and a
tiled optimized implementation, both running on CPU). Graphs (or
_Computations_ in G-API terms) are built only using operation
interfaces, not implementations -- thus the same graph can be executed
on different devices (and, of course, using different optimization
techniques) with little-to-no changes in the graph itself.
G-API supports plugins (_Backends_) which aggregate logic and
intelligence on what is the best way to execute on a particular
platform. Once a pipeline is built with G-API, it can be parametrized
to use any of the backends (or a combination of them), so a graph
can be ported easily to a new platform.
@sa @ref gapi_hld

@ -0,0 +1,160 @@
# High-level design overview {#gapi_hld}
[TOC]
# G-API High-level design overview
G-API is a heterogeneous framework and provides a unified API to
program image processing pipelines with a number of supported
backends.
The key design idea is to keep pipeline code itself platform-neutral
while specifying which kernels to use and which devices to utilize
using extra parameters at graph compile (configuration) time. This
requirement has led to the following architecture:
<!-- FIXME: Render from dot directly -->
![G-API framework architecture](pics/gapi_scheme.png)
There are three layers in this architecture:
* **API Layer** -- this is the top layer, which implements the G-API
public interface, its building blocks and semantics.
When users construct a pipeline with G-API, they interact with this
layer directly, and the entities the user operates on (like cv::GMat
or cv::GComputation) are provided by this layer.
* **Graph Compiler Layer** -- this is the intermediate layer which
unrolls the user computation into a graph and then applies a number of
transformations to it (e.g. optimizations). This layer is built atop
the [ADE Framework](@ref gapi_detail_ade).
* **Backends Layer** -- this is the lowest-level layer, which lists a
number of _Backends_. In contrast with the above two layers,
backends are highly coupled with low-level platform details, with
each backend targeting a specific platform. A backend operates on a
processed graph (coming from the graph compiler) and executes this
graph optimally for a specific platform or device.
# API layer {#gapi_api_layer}
The API layer is what the user interacts with when defining and using a
pipeline (a Computation in G-API terms). The API layer defines a set of
G-API _dynamic_ objects which can be used as inputs, outputs, and
intermediate data objects within a graph:
* cv::GMat
* cv::GScalar
* cv::GArray (template class)
The API layer specifies a list of Operations which are defined on these
data objects -- so-called kernels. See G-API [core](@ref gapi_core)
and [imgproc](@ref gapi_imgproc) namespaces for details on which
operations G-API provides by default.
G-API is not limited to these operations only -- users can define
their own kernels easily using a special macro G_TYPED_KERNEL().
The API layer is also responsible for marshalling and storing operation
parameters on pipeline creation. In addition to the aforementioned
G-API dynamic objects, operations may also accept arbitrary
parameters (more on this [here](@ref gapi_detail_params)), so the API
layer captures their values and stores them internally until the moment of
execution.
Finally, cv::GComputation and cv::GCompiled are the remaining
important components of API layer. The former wraps a series of G-API
expressions into an object (graph), and the latter is a product of
graph _compilation_ (see [this chapter](@ref gapi_detail_compiler) for
details).
# Graph compiler layer {#gapi_compiler}
Every G-API computation is compiled before it executes. The compilation
process can be triggered in two ways:
* _implicitly_, when cv::GComputation::apply() is used. In this case,
graph compilation is then immediately followed by execution.
* _explicitly_, when cv::GComputation::compile() is used. In this case,
a cv::GCompiled object is returned which then can be invoked as a
C++ functor.
The first way is recommended for cases when input data format is not
known in advance -- e.g. when it comes from an arbitrary input file.
The second way is recommended for deployment (production) scenarios
where input data characteristics are usually predefined.
The graph compilation process is built atop the ADE Framework. Initially, a
bipartite graph is generated from expressions captured by the API layer.
This graph contains nodes of two types: _Data_ and _Operations_. A graph
always starts and ends with Data nodes, with Operation nodes
in between. Every Operation node has inputs and outputs, all of which are Data
nodes.
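The structure described above can be sketched as a small data structure with two validity checks. This is a simplified illustration in plain C++, not the ADE or G-API graph model: `NodeKind`, `Graph`, `is_bipartite()`, and `starts_and_ends_with_data()` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum class NodeKind { Data, Operation };

struct Graph {
    std::vector<NodeKind> nodes;                             // node id -> kind
    std::vector<std::pair<std::size_t, std::size_t>> edges;  // producer -> consumer

    std::size_t add_node(NodeKind kind) {
        nodes.push_back(kind);
        return nodes.size() - 1;
    }
    void add_edge(std::size_t from, std::size_t to) {
        assert(from < nodes.size() && to < nodes.size());
        edges.emplace_back(from, to);
    }
};

// Bipartite property: every edge links a Data node with an Operation node.
bool is_bipartite(const Graph& g) {
    for (const auto& e : g.edges)
        if (g.nodes[e.first] == g.nodes[e.second]) return false;
    return true;
}

// The graph starts and ends with Data nodes, so no Operation node may lack
// an incoming or an outgoing edge.
bool starts_and_ends_with_data(const Graph& g) {
    std::vector<bool> has_in(g.nodes.size(), false), has_out(g.nodes.size(), false);
    for (const auto& e : g.edges) {
        has_out[e.first] = true;
        has_in[e.second] = true;
    }
    for (std::size_t i = 0; i < g.nodes.size(); ++i)
        if (g.nodes[i] == NodeKind::Operation && !(has_in[i] && has_out[i]))
            return false;
    return true;
}
```

A chain Data -> Operation -> Data passes both checks, while a direct Data -> Data edge breaks the bipartite property.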
After the initial graph is generated, it is actually processed by a
number of graph transformations, called _passes_. ADE Framework acts
as a compiler pass management engine, and passes are written
specifically for G-API.
There are different passes which check graph validity, refine details
on operations and data, organize nodes into clusters ("Islands") based
on affinity or user-specified regioning[TBD], and more. Backends are
also able to inject backend-specific passes into the compilation
process, see more on this in the [dedicated chapter](@ref gapi_detail_meta).
The result of graph compilation is a compiled object, represented by class
cv::GCompiled. A new cv::GCompiled object is always created, regardless
of whether compilation was requested explicitly or implicitly (see
above). Actual graph execution happens within cv::GCompiled and is
determined by the backends which participated in the graph compilation.
@sa cv::GComputation::apply(), cv::GComputation::compile(), cv::GCompiled
# Backends layer {#gapi_backends}
The above diagram lists two backends, _OpenCV_ and _Fluid_. _OpenCV_
is the so-called "reference backend", which implements G-API operations
using plain old OpenCV functions. This backend is useful for
prototyping on a familiar development system. _Fluid_ is a plugin for
cache-efficient execution on CPU -- it implements a different
execution policy and operates with its own, special kernels. The Fluid
backend achieves a smaller memory footprint and better memory
locality when running on CPU.
There may be more backends available, e.g. Halide, OpenCL, etc. --
G-API provides a uniform internal API to develop backends, so any
enthusiast or company is free to scale G-API to a new platform or
accelerator. In terms of OpenCV infrastructure, every new backend is a
new distinct OpenCV module, which extends G-API when built as a part
of OpenCV.
# Graph execution {#gapi_compiled}
The way a graph is executed is defined by the backends selected for
compilation. In fact, every backend builds its own execution script as
the final stage of the graph compilation process, when an executable
(compiled) object is being generated. For example, in the OpenCV backend,
this script is just a topologically sorted sequence of OpenCV
functions to call; for the Fluid backend, it is a similar thing -- a
topologically sorted list of _Agents_ processing lines of input on
every iteration.
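A topologically sorted execution script can be derived from the graph edges with a standard algorithm. The sketch below uses Kahn's algorithm in plain C++; it is an assumption about one reasonable way to do this, not the code any G-API backend actually uses, and `topo_sort()` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Returns node ids in an order where every producer precedes its consumers
// (Kahn's algorithm). An empty result for a non-empty graph signals a cycle.
std::vector<std::size_t> topo_sort(
        std::size_t node_count,
        const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    std::vector<std::vector<std::size_t>> consumers(node_count);
    std::vector<std::size_t> pending_inputs(node_count, 0);
    for (const auto& e : edges) {
        assert(e.first < node_count && e.second < node_count);
        consumers[e.first].push_back(e.second);
        ++pending_inputs[e.second];
    }

    std::queue<std::size_t> ready;  // nodes whose inputs are all available
    for (std::size_t n = 0; n < node_count; ++n)
        if (pending_inputs[n] == 0) ready.push(n);

    std::vector<std::size_t> order;
    while (!ready.empty()) {
        const std::size_t n = ready.front();
        ready.pop();
        order.push_back(n);
        for (std::size_t c : consumers[n])
            if (--pending_inputs[c] == 0) ready.push(c);
    }
    if (order.size() != node_count) order.clear();  // cycle: no valid script
    return order;
}
```

Executing operations in the returned order guarantees that each one runs only after everything it depends on has been produced.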
Graph execution is triggered in two ways:
* via cv::GComputation::apply(), with graph compiled in-place exactly
for the given input data;
* via cv::GCompiled::operator()(), when the graph has been precompiled.
Both methods are polymorphic and take a variadic number of arguments,
with validity checks performed at run time. If the number, shapes, or
formats of the passed data objects differ from what is expected, a run-time
exception is thrown. G-API also provides _typed_ wrappers to move
these checks to compile time -- see `cv::GComputationT<>`.
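The run-time checks described above can be pictured with a toy functor that remembers the argument shapes it was prepared for. This is a hedged sketch in plain C++, not the cv::GCompiled implementation: `Shape`, `Compiled`, and `accepts()` are hypothetical names, and the exception type is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

struct Shape { int width; int height; };  // hypothetical per-argument metadata

bool operator==(const Shape& a, const Shape& b) {
    return a.width == b.width && a.height == b.height;
}

// Toy "compiled" functor: remembers the shapes it was compiled for and
// rejects any call whose arguments do not match.
class Compiled {
public:
    explicit Compiled(std::vector<Shape> expected) : expected_(std::move(expected)) {
        assert(!expected_.empty());  // a computation has at least one argument
    }

    void operator()(const std::vector<Shape>& args) const {
        if (args.size() != expected_.size())
            throw std::invalid_argument("wrong number of arguments");
        for (std::size_t i = 0; i < args.size(); ++i)
            if (!(args[i] == expected_[i]))
                throw std::invalid_argument("argument shape mismatch");
        // ... the precompiled graph would run here ...
    }

private:
    std::vector<Shape> expected_;
};

// Helper for callers that prefer a boolean over an exception.
bool accepts(const Compiled& c, const std::vector<Shape>& args) {
    try { c(args); return true; }
    catch (const std::invalid_argument&) { return false; }
}
```

A typed wrapper would instead encode the number and kinds of arguments in the C++ type, so that part of such mistakes fails to compile rather than throwing.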
G-API graph execution is declared stateless -- it means that a
compiled functor (cv::GCompiled) acts like a pure C++ function and
provides the same result for the same set of input arguments.
Both execution methods take \f$N+M\f$ parameters, where \f$N\f$ is the
number of inputs, and \f$M\f$ is the number of outputs on which a
cv::GComputation is defined. Note that while G-API types (cv::GMat,
etc) are used in definition, the execution methods accept OpenCV's
traditional data types (like cv::Mat) which hold actual data -- see
table in [parameter marshalling](@ref gapi_detail_params).
@sa @ref gapi_impl, @ref gapi_kernel_api

@ -0,0 +1,188 @@
# Kernel API {#gapi_kernel_api}
[TOC]
# G-API Kernel API
The core idea behind G-API is portability -- a pipeline built with
G-API must be portable (or at least capable of being ported). It means
that either it works out of the box when compiled for a new platform,
_or_ G-API provides the necessary tools to get it running there, with
little-to-no changes in the algorithm itself.
This is achieved by separating the kernel interface from its
implementation. Once a pipeline is built using kernel interfaces, it
becomes implementation-neutral -- the implementation details
(i.e. which kernels to use) are passed at a separate stage (graph
compilation).
Kernel-implementation hierarchy may look like:
@dot Kernel API/implementation hierarchy example
digraph {
rankdir=BT;
node [shape=record];
ki_a [label="{<f0> interface\nA}"];
ki_b [label="{<f0> interface\nB}"];
{rank=same; ki_a ki_b};
"CPU::A" -> ki_a [dir="forward"];
"OpenCL::A" -> ki_a [dir="forward"];
"Halide::A" -> ki_a [dir="forward"];
"CPU::B" -> ki_b [dir="forward"];
"OpenCL::B" -> ki_b [dir="forward"];
"Halide::B" -> ki_b [dir="forward"];
}
@enddot
A pipeline itself then can be expressed only in terms of `A`, `B`, and
so on, and choosing which implementation to use in execution becomes
an external parameter.
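The separation can be sketched in plain C++ as a pipeline written against an interface, with the implementation looked up from a table passed in from outside. This is only an analogy under stated assumptions, not the G-API kernel machinery: `KernelA`, `Package`, `make_package()`, and `run_pipeline()` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical kernel interface "A": just a call signature.
using KernelA = std::function<int(int)>;

// A toy "kernel package": backend name -> implementation of interface A.
using Package = std::map<std::string, KernelA>;

// The pipeline is written against the interface only; the implementation
// is selected by an external parameter.
int run_pipeline(const Package& pkg, const std::string& backend, int input) {
    const auto it = pkg.find(backend);
    assert(it != pkg.end() && "no implementation for the requested backend");
    return it->second(input) + 1;  // "+ 1" stands in for the rest of the pipeline
}

Package make_package() {
    return {
        {"CPU",    [](int x) { return x * 2; }},  // straightforward version
        {"OpenCL", [](int x) { return x + x; }},  // different code, same contract
    };
}
```

Switching from `"CPU"` to `"OpenCL"` changes which code runs but not the pipeline or its result, which is the property the interface/implementation split is meant to provide.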
# Defining a kernel {#gapi_defining_kernel}
G-API provides a macro to define a new kernel interface --
G_TYPED_KERNEL():
@snippet modules/gapi/samples/kernel_api_snippets.cpp filter2d_api
This macro is a shortcut to a new type definition. It takes three
arguments to register a new type, and requires type body to be present
(see [below](@ref gapi_kernel_supp_info)). The macro arguments are:
1. Kernel interface name -- also serves as a name of new type defined
with this macro;
2. Kernel signature -- an `std::function<>`-like signature which defines
API of the kernel;
3. Kernel's unique name -- used to identify the kernel when its type
information is stripped within the system.
A kernel declaration may be seen as a function declaration -- in both cases,
the new entity must then be used according to the way it was defined.
The kernel signature defines the kernel's usage syntax -- which parameters
it takes during graph construction. Implementations can also use this
signature to derive backend-specific callback signatures from it (see
next chapter).
A kernel may accept values of any type; G-API _dynamic_ types are
handled in a special way. All other types are opaque to G-API and
passed to the kernel in `outMeta()` or in execution callbacks as-is.
Kernel's return value can _only_ be of G-API dynamic type -- cv::GMat,
cv::GScalar, or `cv::GArray<T>`. If an operation has more than one
output, it should be wrapped into an `std::tuple<>` (which can contain
only mentioned G-API types). Arbitrary-output-number operations are
not supported.
Once a kernel is defined, it can be used in pipelines with a special,
G-API-supplied method `::on()`. This method has the same signature as
defined in the kernel, so this code:
@snippet modules/gapi/samples/kernel_api_snippets.cpp filter2d_on
is a perfectly legal construction. This example has some verbosity,
though, so usually a kernel declaration comes with a C++ function
wrapper ("factory method") which enables optional parameters, more
compact syntax, Doxygen comments, etc:
@snippet modules/gapi/samples/kernel_api_snippets.cpp filter2d_wrap
so now it can be used like:
@snippet modules/gapi/samples/kernel_api_snippets.cpp filter2d_wrap_call
# Extra information {#gapi_kernel_supp_info}
In the current version, kernel declaration body (everything within the
curly braces) must contain a static function `outMeta()`. This function
establishes a functional dependency between operation's input and
output metadata.
_Metadata_ is information about the data a kernel operates on. Since
non-G-API types are opaque to G-API, G-API cares only about `G*` data
descriptors (i.e. dimensions and format of cv::GMat, etc).
`outMeta()` is also an example of how a kernel's signature can be
transformed into a derived callback -- note that in this example, the
`outMeta()` signature closely follows the kernel signature (defined
within the macro) but differs in types -- where the kernel expects cv::GMat,
`outMeta()` takes and returns cv::GMatDesc (a G-API metadata structure
for cv::GMat).
The point of `outMeta()` is to propagate metadata information within
computation from inputs to outputs and infer metadata of internal
(intermediate, temporary) data objects. This information is required
for further pipeline optimizations, memory allocation, and other
operations done by G-API framework during graph compilation.
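Metadata propagation of this kind can be sketched without any real image data. The example below is a simplified illustration in plain C++ and does not use the G-API types: `MatDesc`, `resize_out_meta()`, `to_gray_out_meta()`, and `buffer_bytes()` are hypothetical names, and the descriptor fields are an assumption loosely modelled on what an image descriptor needs to carry.

```cpp
#include <cassert>

// Hypothetical, simplified image descriptor.
struct MatDesc {
    int depth_bits;  // bits per channel
    int channels;
    int width;
    int height;
};

// Metadata function of a resize-like operation: only the size changes.
MatDesc resize_out_meta(const MatDesc& in, int out_width, int out_height) {
    assert(out_width > 0 && out_height > 0);
    return MatDesc{in.depth_bits, in.channels, out_width, out_height};
}

// Metadata function of a color-to-gray conversion: only the channel count changes.
MatDesc to_gray_out_meta(const MatDesc& in) {
    assert(in.channels == 3);
    return MatDesc{in.depth_bits, 1, in.width, in.height};
}

// Bytes needed for a buffer described by `d`, e.g. for an intermediate result.
long buffer_bytes(const MatDesc& d) {
    return static_cast<long>(d.width) * d.height * d.channels * (d.depth_bits / 8);
}
```

Chaining the two metadata functions gives the descriptor of every intermediate and output object before any pixel is processed, which is what makes up-front memory allocation possible.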
<!-- TODO add examples -->
# Implementing a kernel {#gapi_kernel_implementing}
Once a kernel is declared, its interface can be used to implement
versions of this kernel in different backends. This concept
follows naturally from the object-oriented programming
"Interface/Implementation" idiom: an interface can be implemented
multiple times, and different implementations of a kernel should be
substitutable with each other without breaking the algorithm
(pipeline) logic (Liskov Substitution Principle).
Every backend defines its own way to implement a kernel interface.
The approach is uniform, though -- whatever the plugin is, its kernel
implementation must be "derived" from a kernel interface type.
Kernel implementations are then organized into _kernel
packages_. Kernel packages are passed to cv::GComputation::compile()
as compile arguments, with some hints to G-API on how to select proper
kernels (see more on this in "Heterogeneity"[TBD]).
For example, the aforementioned `Filter2D` is implemented in the
"reference" CPU (OpenCV) plugin this way (*NOTE* -- this is a
simplified form with improper border handling):
@snippet modules/gapi/samples/kernel_api_snippets.cpp filter2d_ocv
Note how the CPU (OpenCV) plugin has transformed the original kernel
signature:
- Input cv::GMat has been substituted with cv::Mat, holding actual input
data for the underlying OpenCV function call;
- Output cv::GMat has been transformed into an extra output parameter, thus
`GCPUFilter2D::run()` takes one more argument than the original
kernel signature.
The basic intuition for a kernel developer here is _not to care_ where
those cv::Mat objects come from instead of the original cv::GMat -- and
just follow the signature conventions defined by the plugin. G-API
will call this method during execution and supply all the necessary
information (and forward the original opaque data as-is).
# Compound kernels {#gapi_kernel_compound}
Sometimes a kernel is a single thing only at the API level. This is convenient
for users, but on a particular implementation side it would be better to
have multiple kernels (a subgraph) doing the job instead. An example
is goodFeaturesToTrack() -- while in the OpenCV backend it may remain a
single kernel, with Fluid it becomes compound -- Fluid can handle Harris
response calculation but can't do sparse non-maxima suppression and
point extraction to an STL vector:
<!-- PIC -->
A compound kernel _implementation_ can be defined using a generic
macro GAPI_COMPOUND_KERNEL():
@snippet modules/gapi/samples/kernel_api_snippets.cpp compound
<!-- TODO: ADD on how Compound kernels may simplify dispatching -->
<!-- TODO: Add details on when expand() is called! -->
It is important to distinguish a compound kernel from a G-API higher-order
function, i.e. a C++ function which looks like a kernel but in fact
generates a subgraph. The core difference is that a compound kernel is
an _implementation detail_ and a kernel implementation may be either
compound or not (depending on backend capabilities), while a
higher-order function is a "macro" in terms of G-API and so cannot act as
an interface which then needs to be implemented by a backend.

@ -0,0 +1,29 @@
# Implementation details {#gapi_impl}
[TOC]
# G-API Implementation details
Note -- this section is still in progress.
# API layer {#gapi_detail_api}
## Expression unrolling {#gapi_detail_expr}
## Parameter marshalling {#gapi_detail_params}
## Operations representation {#gapi_detail_operations}
# Graph compiler {#gapi_detail_compiler}
## ADE basics {#gapi_detail_ade}
## Graph model representation {#gapi_detail_gmodel}
## G-API metadata and passes {#gapi_detail_meta}
# Backends {#gapi_detail_backends}
## Backend scope of work {#gapi_backend_scope}
## Graph transformation {#gapi_backend_pass}

Binary file not shown (64 KiB).

Binary file not shown (75 KiB).

Binary file not shown (4.3 KiB).

@ -0,0 +1,27 @@
# G-API Overview
This is the latest overview slide deck on G-API.
## Prerequisites
- [Emacs] v24 or higher;
- [Org]-mode 8.2.10;
- `pdflatex`;
- `texlive-latex-recommended` ([Beamer] package);
- `texlive-font-utils` (`epstopdf`);
- `wget` (for `get_sty.sh`).
## Building
1. Download and build the [Metropolis] theme with the script:
```
$ ./get_sty.sh
```
2. Now open `gapi_overview.org` with Emacs and press `C-c C-e l P`.
[Emacs]: https://www.gnu.org/software/emacs/
[Org]: https://orgmode.org/
[Beamer]: https://ctan.org/pkg/beamer
[Metropolis]: https://github.com/matze/mtheme

@ -0,0 +1,961 @@
#+TITLE: OpenCV 4.4 Graph API
#+AUTHOR: Dmitry Matveev\newline Intel Corporation
#+OPTIONS: H:2 toc:t num:t
#+LATEX_CLASS: beamer
#+LATEX_CLASS_OPTIONS: [presentation]
#+LATEX_HEADER: \usepackage{transparent} \usepackage{listings} \usepackage{pgfplots} \usepackage{mtheme.sty/beamerthememetropolis}
#+LATEX_HEADER: \setbeamertemplate{frame footer}{OpenCV 4.4 G-API: Overview and programming by example}
#+BEAMER_HEADER: \subtitle{Overview and programming by example}
#+BEAMER_HEADER: \titlegraphic{ \vspace*{3cm}\hspace*{5cm} {\transparent{0.2}\includegraphics[height=\textheight]{ocv_logo.eps}}}
#+COLUMNS: %45ITEM %10BEAMER_ENV(Env) %10BEAMER_ACT(Act) %4BEAMER_COL(Col) %8BEAMER_OPT(Opt)
* G-API: What is it, why, and what is it for?
** OpenCV evolution in one slide
*** Version 1.x -- Library inception
- Just a set of CV functions + helpers around (visualization, IO);
*** Version 2.x -- Library rewrite
- OpenCV meets C++, ~cv::Mat~ replaces ~IplImage*~;
*** Version 3.0 -- Welcome Transparent API (T-API)
- ~cv::UMat~ is introduced as a /transparent/ addition to
~cv::Mat~;
- With ~cv::UMat~, an OpenCL kernel can be enqueued instead of
immediately running C code;
- ~cv::UMat~ data is kept on a /device/ until explicitly queried.
** OpenCV evolution in one slide (cont'd)
# FIXME: Learn proper page-breaking!
*** Version 4.0 -- Welcome Graph API (G-API)
- A new separate module (not a full library rewrite);
- A framework (or even a /meta/-framework);
- Usage model:
- /Express/ an image/vision processing graph and then /execute/ it;
- Fine-tune execution without changes in the graph;
- Similar to Halide -- separates logic from
platform details.
- More than Halide:
- Kernels can be written in unconstrained platform-native code;
- Halide can serve as a backend (one of many).
** OpenCV evolution in one slide (cont'd)
# FIXME: Learn proper page-breaking!
*** Version 4.2 -- New horizons
- Introduced in-graph inference via OpenVINO™ Toolkit;
- Introduced video-oriented Streaming execution mode;
- Extended focus from individual image processing to the full
application pipeline optimization.
*** Version 4.4 -- More on video
- Introduced a notion of stateful kernels;
- The road to object tracking, background subtraction, etc. in the
graph;
- Added more video-oriented operations (feature detection, Optical
flow).
** Why G-API?
*** Why introduce a new execution model?
- Ultimately it is all about optimizations;
- or at least about a /possibility/ to optimize;
- A CV algorithm is usually not a single function call, but a
composition of functions;
- Different models operate at different levels of knowledge on the
algorithm (problem) we run.
** Why G-API? (cont'd)
# FIXME: Learn proper page-breaking!
*** Why introduce a new execution model?
- *Traditional* -- every function can be optimized (e.g. vectorized)
and parallelized, the rest is up to the programmer;
- *Queue-based* -- kernels are enqueued dynamically with no guarantee
where the end is or what is called next;
- *Graph-based* -- nearly all information is there, some compiler
magic can be done!
** What is G-API for?
*** Bring the value of the graph model to OpenCV where it makes sense:
- *Memory consumption* can be reduced dramatically;
- *Memory access* can be optimized to maximize cache reuse;
- *Parallelism* can be applied automatically where it is hard to do
it manually;
- It also becomes more efficient when working with graphs;
- *Heterogeneity* gets extra benefits like:
- Avoiding unnecessary data transfers;
- Shadowing transfer costs with parallel host co-execution;
- Improving system throughput with frame-level pipelining.
* Programming with G-API
** G-API Basics
*** G-API Concepts
- *Graphs* are built by applying /operations/ to /data objects/;
- API itself has no "graphs", it is expression-based instead;
- *Data objects* do not hold actual data, only capture /dependencies/;
- *Operations* consume and produce data objects.
- A graph is defined by specifying its /boundaries/ with data objects:
- What data objects are /inputs/ to the graph?
- What are its /outputs/?
** The code is worth a thousand words
:PROPERTIES:
:BEAMER_opt: shrink=42
:END:
#+BEGIN_SRC C++
#include <opencv2/gapi.hpp> // G-API framework header
#include <opencv2/gapi/imgproc.hpp> // cv::gapi::blur()
#include <opencv2/highgui.hpp> // cv::imread/imwrite
int main(int argc, char *argv[]) {
    if (argc < 3) return 1;
    cv::GMat in;                                        // Express the graph:
    cv::GMat out = cv::gapi::blur(in, cv::Size(3,3));   // `out` is a result of `blur` of `in`
    cv::Mat in_mat = cv::imread(argv[1]);               // Get the real data
    cv::Mat out_mat;                                    // Output buffer (may be empty)
    cv::GComputation(cv::GIn(in), cv::GOut(out))        // Declare a graph from `in` to `out`
        .apply(cv::gin(in_mat), cv::gout(out_mat));     // ...and run it immediately
    cv::imwrite(argv[2], out_mat);                      // Save the result
    return 0;
}
#+END_SRC
** The code is worth a thousand words
:PROPERTIES:
:BEAMER_opt: shrink=42
:END:
*** Traditional OpenCV :B_block:BMCOL:
:PROPERTIES:
:BEAMER_env: block
:BEAMER_col: 0.45
:END:
#+BEGIN_SRC C++
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
int main(int argc, char *argv[]) {
    using namespace cv;
    if (argc != 3) return 1;
    Mat in_mat = imread(argv[1]);
    Mat gx, gy;
    Sobel(in_mat, gx, CV_32F, 1, 0);
    Sobel(in_mat, gy, CV_32F, 0, 1);
    Mat mag, out_mat;
    sqrt(gx.mul(gx) + gy.mul(gy), mag);
    mag.convertTo(out_mat, CV_8U);
    imwrite(argv[2], out_mat);
    return 0;
}
#+END_SRC
*** OpenCV G-API :B_block:BMCOL:
:PROPERTIES:
:BEAMER_env: block
:BEAMER_col: 0.5
:END:
#+BEGIN_SRC C++
#include <opencv2/gapi.hpp>
#include <opencv2/gapi/core.hpp>
#include <opencv2/gapi/imgproc.hpp>
#include <opencv2/highgui.hpp>
int main(int argc, char *argv[]) {
    using namespace cv;
    if (argc != 3) return 1;
    GMat in;
    GMat gx = gapi::Sobel(in, CV_32F, 1, 0);
    GMat gy = gapi::Sobel(in, CV_32F, 0, 1);
    GMat mag = gapi::sqrt(  gapi::mul(gx, gx)
                          + gapi::mul(gy, gy));
    GMat out = gapi::convertTo(mag, CV_8U);
    GComputation sobel(GIn(in), GOut(out));
    Mat in_mat = imread(argv[1]), out_mat;
    sobel.apply(in_mat, out_mat);
    imwrite(argv[2], out_mat);
    return 0;
}
#+END_SRC
** The code is worth a thousand words (cont'd)
# FIXME: sections!!!
*** What have we just learned?
- G-API functions mimic their traditional OpenCV ancestors;
- No real data is required to construct a graph;
- Graph construction and graph execution are separate steps.
*** What else?
- Graph is first /expressed/ and then /captured/ in an object;
- Graph constructor defines /protocol/; user can pass vectors of
inputs/outputs like
#+BEGIN_SRC C++
cv::GComputation(cv::GIn(...), cv::GOut(...))
#+END_SRC
- Calls to ~.apply()~ must conform to graph's protocol
** On data objects
The graph *protocol* defines which arguments a computation is defined on
(both inputs and outputs) and the *shapes* (or types) of
those arguments:
| *Shape* | *Argument* | Size |
|--------------+------------------+-----------------------------|
| ~GMat~ | ~Mat~ | Static; defined during |
| | | graph compilation |
|--------------+------------------+-----------------------------|
| ~GScalar~ | ~Scalar~ | 4 x ~double~ |
|--------------+------------------+-----------------------------|
| ~GArray<T>~ | ~std::vector<T>~ | Dynamic; defined at runtime |
|--------------+------------------+-----------------------------|
| ~GOpaque<T>~ | ~T~ | Static, ~sizeof(T)~ |
~GScalar~ may be value-initialized at construction time to allow
expressions like ~GMat a = 2*(b + 1)~.
** On operations and kernels
:PROPERTIES:
:BEAMER_opt: shrink=22
:END:
*** :B_block:BMCOL:
:PROPERTIES:
:BEAMER_env: block
:BEAMER_col: 0.45
:END:
- Graphs are built with *Operations* over virtual *Data*;
- *Operations* define interfaces (literally);
- *Kernels* are implementations of *Operations* (like in OOP);
- An *Operation* is platform-agnostic, a *kernel* is not;
- *Kernels* are implemented for *Backends*, the latter provide
APIs to write kernels;
- Users can /add/ their *own* operations and kernels,
and also /redefine/ "standard" kernels their *own* way.
*** :B_block:BMCOL:
:PROPERTIES:
:BEAMER_env: block
:BEAMER_col: 0.45
:END:
#+BEGIN_SRC dot :file "000-ops-kernels.eps" :cmdline "-Kdot -Teps"
digraph G {
node [shape=box];
rankdir=BT;
Gr [label="Graph"];
Op [label="Operation\nA"];
{rank=same
Impl1 [label="Kernel\nA:2"];
Impl2 [label="Kernel\nA:1"];
}
Op -> Gr [dir=back, label="'consists of'"];
Impl1 -> Op [];
Impl2 -> Op [label="'is implemented by'"];
node [shape=note,style=dashed];
{rank=same
Op;
CommentOp [label="Abstract:\ndeclared via\nG_API_OP()"];
}
{rank=same
Comment1 [label="Platform:\ndefined with\nOpenCL backend"];
Comment2 [label="Platform:\ndefined with\nOpenCV backend"];
}
CommentOp -> Op [constraint=false, style=dashed, arrowhead=none];
Comment1 -> Impl1 [style=dashed, arrowhead=none];
Comment2 -> Impl2 [style=dashed, arrowhead=none];
}
#+END_SRC
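The relationship between an operation and its kernels can be sketched with a toy model in plain C++. This is not the G-API kernel API: ~SqrtOp~, ~Kernel~, ~KernelPackage~, ~merge~, and ~run_sqrt~ are hypothetical names, and the rule that the second package wins on a clash is a choice made for this sketch only. The point is that the caller names the operation, while the package decides which implementation actually runs.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <map>
#include <string>

// Toy model only: this is not the G-API kernel API.
// An operation is an interface, identified here by a string.
struct SqrtOp {
    static std::string id() { return "toy.math.sqrt"; }
};

// A kernel is one implementation of that interface for some backend.
struct Kernel {
    std::string backend;
    std::function<double(double)> run;
};

// A package maps operation identifiers to the kernels implementing them.
using KernelPackage = std::map<std::string, Kernel>;

// The "standard" kernel for the operation.
KernelPackage standard_kernels() {
    KernelPackage pkg;
    pkg[SqrtOp::id()] = Kernel{"standard", [](double x) { return std::sqrt(x); }};
    return pkg;
}

// A user-provided kernel for the very same operation.
KernelPackage user_kernels() {
    KernelPackage pkg;
    pkg[SqrtOp::id()] = Kernel{"user", [](double x) { return std::pow(x, 0.5); }};
    return pkg;
}

// Merges two packages; on a clash the second one wins
// (an assumption of this sketch, not a statement about G-API).
KernelPackage merge(KernelPackage base, const KernelPackage& overrides) {
    for (const auto& entry : overrides) {
        base[entry.first] = entry.second;
    }
    return base;
}

// The caller names the operation; the package decides which kernel runs.
double run_sqrt(const KernelPackage& pkg, double x) {
    const auto it = pkg.find(SqrtOp::id());
    assert(it != pkg.end() && "no kernel implements this operation");
    return it->second.run(x);
}
```

With ~merge(standard_kernels(), user_kernels())~ the operation identifier stays the same, but the kernel tagged ~"user"~ is the one that gets executed.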
** On operations and kernels (cont'd)
*** Defining an operation
- A type name (every operation is a C++ type);
- Operation signature (similar to ~std::function<>~);
- Operation identifier (a string);
- Metadata callback -- describes the output value format(s),
given the inputs and arguments.
- Use ~OpType::on(...)~ to apply a new operation ~OpType~ when constructing graphs.
#+LaTeX: {\footnotesize
#+BEGIN_SRC C++
G_API_OP(GSqrt,<GMat(GMat)>,"org.opencv.core.math.sqrt") {
static GMatDesc outMeta(GMatDesc in) { return in; }
};
#+END_SRC
#+LaTeX: }
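The role of the metadata callback can be illustrated with a toy model in plain C++. This is a sketch under stated assumptions, not G-API code: ~MatDesc~ is a simplified stand-in for an image descriptor, and ~ToGrayOp~, ~HalveOp~, and ~pipeline_out_meta~ are hypothetical names. Each operation derives its output format from its input format, so the format of the final result is known before any pixel data exists.

```cpp
#include <cassert>

// Toy stand-in for an image descriptor; not the real cv::GMatDesc.
struct MatDesc {
    int width;
    int height;
    int channels;
};

// Element-wise operation: the output format equals the input format.
struct SqrtOp {
    static MatDesc outMeta(const MatDesc& in) { return in; }
};

// Hypothetical color-to-gray operation: the channel count changes.
struct ToGrayOp {
    static MatDesc outMeta(const MatDesc& in) {
        assert(in.channels == 3 && "expects a 3-channel input");
        return MatDesc{in.width, in.height, 1};
    }
};

// Hypothetical downscale-by-two operation: the size changes.
struct HalveOp {
    static MatDesc outMeta(const MatDesc& in) {
        return MatDesc{in.width / 2, in.height / 2, in.channels};
    }
};

// Chains the callbacks in graph order: gray -> halve -> sqrt.
// Only descriptors flow through this function, never pixels.
MatDesc pipeline_out_meta(const MatDesc& in) {
    return SqrtOp::outMeta(HalveOp::outMeta(ToGrayOp::outMeta(in)));
}
```

For a 1920 x 1080 three-channel input descriptor, the chained callbacks report a 960 x 540 single-channel output.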
** On operations and kernels (cont'd)
*** ~GSqrt~ vs. ~cv::gapi::sqrt()~
- How does a *type* relate to the *functions* from the example?
- These functions are just wrappers over ~::on~:
#+LaTeX: {\scriptsize
#+BEGIN_SRC C++
G_API_OP(GSqrt,<GMat(GMat)>,"org.opencv.core.math.sqrt") {
static GMatDesc outMeta(GMatDesc in) { return in; }
};
GMat gapi::sqrt(const GMat& src) { return GSqrt::on(src); }
#+END_SRC
#+LaTeX: }
- Why -- Doxygen, default parameters, 1:n mapping:
#+LaTeX: {\scriptsize
#+BEGIN_SRC C++
cv::GMat custom::unsharpMask(const cv::GMat &src,
const int sigma,
const float strength) {
cv::GMat blurred = cv::gapi::medianBlur(src, sigma);
cv::GMat laplacian = cv::gapi::Laplacian(blurred, CV_8U);
return (src - (laplacian * strength));
}
#+END_SRC
#+LaTeX: }
** On operations and kernels (cont'd)
*** Implementing an operation
- Depends on the backend and its API;
- Common part for all backends: refer to operation being implemented
using its /type/.
*** OpenCV backend
- OpenCV backend is the default one: OpenCV kernel is a wrapped OpenCV
function:
#+LaTeX: {\footnotesize
#+BEGIN_SRC C++
GAPI_OCV_KERNEL(GCPUSqrt, cv::gapi::core::GSqrt) {
static void run(const cv::Mat& in, cv::Mat &out) {
cv::sqrt(in, out);
}
};
#+END_SRC
#+LaTeX: }
** On operations and kernels (cont'd)
# FIXME!!!
*** Fluid backend
- Fluid backend operates with row-by-row kernels and schedules its
execution to optimize data locality:
#+LaTeX: {\footnotesize
#+BEGIN_SRC C++
GAPI_FLUID_KERNEL(GFluidSqrt, cv::gapi::core::GSqrt, false) {
static const int Window = 1;
static void run(const View &in, Buffer &out) {
hal::sqrt32f(in .InLine <float>(0),
out.OutLine<float>(0),
out.length());
}
};
#+END_SRC
#+LaTeX: }
- Note that ~run~ changes its signature, but it is still derived from the
operation signature.
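Why row-by-row execution helps data locality can be shown with a toy comparison in plain C++. This is only a sketch and not the Fluid backend: ~Row~, ~Image~, ~run_whole_image~, and ~run_row_by_row~ are hypothetical names, both kernels use a window of one row, and the real backend also handles larger windows and scheduling. The two functions compute the same result, but the second one needs a single row of scratch memory instead of a full intermediate image.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Row = std::vector<float>;
using Image = std::vector<Row>;

// Per-row kernels: each call sees one input line and one output line.
void square_row(const Row& in, Row& out) {
    assert(in.size() == out.size());
    for (std::size_t x = 0; x < in.size(); ++x) out[x] = in[x] * in[x];
}

void sqrt_row(const Row& in, Row& out) {
    assert(in.size() == out.size());
    for (std::size_t x = 0; x < in.size(); ++x) out[x] = std::sqrt(in[x]);
}

// Operation-by-operation: the intermediate result is a whole image.
Image run_whole_image(const Image& in, std::size_t& scratch_pixels) {
    Image tmp(in.size());
    Image out(in.size());
    scratch_pixels = 0;
    for (std::size_t y = 0; y < in.size(); ++y) {
        tmp[y].resize(in[y].size());
        square_row(in[y], tmp[y]);
        scratch_pixels += tmp[y].size();
    }
    for (std::size_t y = 0; y < in.size(); ++y) {
        out[y].resize(in[y].size());
        sqrt_row(tmp[y], out[y]);
    }
    return out;
}

// Row-by-row: both kernels run on a line before moving to the next one,
// so the intermediate result is a single reusable line.
Image run_row_by_row(const Image& in, std::size_t& scratch_pixels) {
    Image out(in.size());
    Row line;
    scratch_pixels = 0;
    for (std::size_t y = 0; y < in.size(); ++y) {
        line.resize(in[y].size());
        out[y].resize(in[y].size());
        square_row(in[y], line);
        sqrt_row(line, out[y]);
        scratch_pixels = std::max(scratch_pixels, line.size());
    }
    return out;
}
```

For a 4 x 8 image, the first variant reports 32 scratch pixels and the second reports 8; the gap grows with the image height.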
** On operations and kernels (cont'd)
*** Specifying which kernels to use
- Graph execution model is defined by kernels which are available/used;
- Kernels can be specified via the graph compilation arguments:
#+LaTeX: {\footnotesize
#+BEGIN_SRC C++
#include <opencv2/gapi/fluid/core.hpp>
#include <opencv2/gapi/fluid/imgproc.hpp>
...
auto pkg = cv::gapi::combine(cv::gapi::core::fluid::kernels(),
cv::gapi::imgproc::fluid::kernels());
sobel.apply(in_mat, out_mat, cv::compile_args(pkg));
#+END_SRC
#+LaTeX: }
- Users can combine kernels of different backends and G-API will partition
the execution among those automatically.
** Heterogeneity in G-API
:PROPERTIES:
:BEAMER_opt: shrink=35
:END:
*** Automatic subgraph partitioning in G-API
*** :B_block:BMCOL:
:PROPERTIES:
:BEAMER_env: block
:BEAMER_col: 0.18
:END:
#+BEGIN_SRC dot :file "010-hetero-init.eps" :cmdline "-Kdot -Teps"
digraph G {
rankdir=TB;
ranksep=0.3;
node [shape=box margin=0 height=0.25];
A; B; C;
node [shape=ellipse];
GMat0;
GMat1;
GMat2;
GMat3;
GMat0 -> A -> GMat1 -> B -> GMat2;
GMat2 -> C;
GMat0 -> C -> GMat3
subgraph cluster {style=invis; A; GMat1; B; GMat2; C};
}
#+END_SRC
The initial graph: operations are not resolved yet.
*** :B_block:BMCOL:
:PROPERTIES:
:BEAMER_env: block
:BEAMER_col: 0.18
:END:
#+BEGIN_SRC dot :file "011-hetero-homo.eps" :cmdline "-Kdot -Teps"
digraph G {
rankdir=TB;
ranksep=0.3;
node [shape=box margin=0 height=0.25];
A; B; C;
node [shape=ellipse];
GMat0;
GMat1;
GMat2;
GMat3;
GMat0 -> A -> GMat1 -> B -> GMat2;
GMat2 -> C;
GMat0 -> C -> GMat3
subgraph cluster {style=filled;color=azure2; A; GMat1; B; GMat2; C};
}
#+END_SRC
All operations are handled by the same backend.
*** :B_block:BMCOL:
:PROPERTIES:
:BEAMER_env: block
:BEAMER_col: 0.18
:END:
#+BEGIN_SRC dot :file "012-hetero-a.eps" :cmdline "-Kdot -Teps"
digraph G {
rankdir=TB;
ranksep=0.3;
node [shape=box margin=0 height=0.25];
A; B; C;
node [shape=ellipse];
GMat0;
GMat1;
GMat2;
GMat3;
GMat0 -> A -> GMat1 -> B -> GMat2;
GMat2 -> C;
GMat0 -> C -> GMat3
subgraph cluster_1 {style=filled;color=azure2; A; GMat1; B; }
subgraph cluster_2 {style=filled;color=ivory2; C};
}
#+END_SRC
~A~ & ~B~ are of backend ~1~, ~C~ is of backend ~2~.
*** :B_block:BMCOL:
:PROPERTIES:
:BEAMER_env: block
:BEAMER_col: 0.18
:END:
#+BEGIN_SRC dot :file "013-hetero-b.eps" :cmdline "-Kdot -Teps"
digraph G {
rankdir=TB;
ranksep=0.3;
node [shape=box margin=0 height=0.25];
A; B; C;
node [shape=ellipse];
GMat0;
GMat1;
GMat2;
GMat3;
GMat0 -> A -> GMat1 -> B -> GMat2;
GMat2 -> C;
GMat0 -> C -> GMat3
subgraph cluster_1 {style=filled;color=azure2; A};
subgraph cluster_2 {style=filled;color=ivory2; B};
subgraph cluster_3 {style=filled;color=azure2; C};
}
#+END_SRC
~A~ & ~C~ are of backend ~1~, ~B~ is of backend ~2~.
** Heterogeneity in G-API
*** Heterogeneity summary
- G-API automatically partitions its graph in subgraphs (called "islands")
based on the available kernels;
- Adjacent kernels taken from the same backend are "fused" into the same
"island";
- G-API implements a two-level execution model:
- Islands are executed at the top level by a G-API's *Executor*;
- Island internals are run at the bottom level by its *Backend*;
- G-API fully delegates the low-level execution and memory management to backends.
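The fusion of adjacent same-backend kernels into islands can be sketched in plain C++. This is a simplified illustration, not G-API's partitioning code: ~Op~, ~Island~, and ~partition~ are hypothetical names, and the graph is reduced to a linear chain of operations in execution order, whereas a real graph is a DAG with additional fusion rules.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: an operation with the backend chosen for its kernel.
struct Op {
    std::string name;
    int backend;
};

// A run of adjacent operations that share one backend.
struct Island {
    int backend;
    std::vector<std::string> ops;
};

// Walks the chain once and starts a new island whenever the backend changes.
std::vector<Island> partition(const std::vector<Op>& chain) {
    std::vector<Island> islands;
    for (const Op& op : chain) {
        if (islands.empty() || islands.back().backend != op.backend) {
            islands.push_back(Island{op.backend, {}});
        }
        islands.back().ops.push_back(op.name);
    }
    return islands;
}
```

With ~A~ and ~B~ on backend 1 and ~C~ on backend 2, the sketch yields two islands, ~{A, B}~ and ~{C}~. With ~A~ and ~C~ on backend 1 and ~B~ on backend 2, it yields three islands, because ~A~ and ~C~ are not adjacent.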
* Inference and Streaming
** Inference with G-API
*** In-graph inference example
- Starting with OpenCV 4.2 (2019), G-API allows integrating ~infer~
operations into the graph:
#+LaTeX: {\scriptsize
#+BEGIN_SRC C++
G_API_NET(ObjDetect, <cv::GMat(cv::GMat)>, "pdf.example.od");
cv::GMat in;
cv::GMat blob = cv::gapi::infer<ObjDetect>(in);
cv::GOpaque<cv::Size> size = cv::gapi::streaming::size(in);
cv::GArray<cv::Rect> objs = cv::gapi::streaming::parseSSD(blob, size);
cv::GComputation pipeline(cv::GIn(in), cv::GOut(objs));
#+END_SRC
#+LaTeX: }
- Starting with OpenCV 4.5 (2020), G-API will provide more streaming-
and NN-oriented operations out of the box.
** Inference with G-API
*** What is the difference?
- ~ObjDetect~ is not an operation, ~cv::gapi::infer<T>~ is;
- ~cv::gapi::infer<T>~ is a *generic* operation, where ~T=ObjDetect~ describes
the calling convention:
- How many inputs the network consumes,
- How many outputs the network produces.
- Inference data types are ~GMat~ only:
- Representing an image, then preprocessed automatically;
- Representing a blob (n-dimensional ~Mat~), then passed as-is.
- Inference *backends* only need to implement a single generic operation ~infer~.
** Inference with G-API
*** But how does it run?
- Since ~infer~ is an *Operation*, backends may provide *Kernels* implementing it;
- The only publicly available inference backend now is *OpenVINO™*:
- Brings its ~infer~ kernel on top of the Inference Engine;
- NN model data is passed through G-API compile arguments (like kernels);
- Every NN backend provides its own structure to configure the network (like
a kernel API).
** Inference with G-API
*** Passing OpenVINO™ parameters to G-API
- ~ObjDetect~ example:
#+LaTeX: {\footnotesize
#+BEGIN_SRC C++
auto face_net = cv::gapi::ie::Params<ObjDetect> {
face_xml_path, // path to the topology IR
face_bin_path, // path to the topology weights
face_device_string, // OpenVINO plugin (device) string
};
auto networks = cv::gapi::networks(face_net);
pipeline.compile(.., cv::compile_args(..., networks));
#+END_SRC
#+LaTeX: }
- ~AgeGender~ requires binding Op's outputs to NN layers:
#+LaTeX: {\footnotesize
#+BEGIN_SRC C++
auto age_net = cv::gapi::ie::Params<AgeGender> {
...
}.cfgOutputLayers({"age_conv3", "prob"}); // array<string,2> !
#+END_SRC
#+LaTeX: }
** Streaming with G-API
#+BEGIN_SRC dot :file 020-fd-demo.eps :cmdline "-Kdot -Teps"
digraph {
rankdir=LR;
node [shape=box];
cap [label=Capture];
dec [label=Decode];
res [label=Resize];
cnn [label=Infer];
vis [label=Visualize];
cap -> dec;
dec -> res;
res -> cnn;
cnn -> vis;
}
#+END_SRC
Anatomy of a regular video analytics application
** Streaming with G-API
#+BEGIN_SRC dot :file 021-fd-serial.eps :cmdline "-Kdot -Teps"
digraph {
node [shape=box margin=0 width=0.3 height=0.4]
nodesep=0.2;
rankdir=LR;
subgraph cluster0 {
colorscheme=blues9
pp [label="..." shape=plaintext];
v0 [label=V];
label="Frame N-1";
color=7;
}
subgraph cluster1 {
colorscheme=blues9
c1 [label=C];
d1 [label=D];
r1 [label=R];
i1 [label=I];
v1 [label=V];
label="Frame N";
color=6;
}
subgraph cluster2 {
colorscheme=blues9
c2 [label=C];
nn [label="..." shape=plaintext];
label="Frame N+1";
color=5;
}
c1 -> d1 -> r1 -> i1 -> v1;
pp-> v0;
v0 -> c1 [style=invis];
v1 -> c2 [style=invis];
c2 -> nn;
}
#+END_SRC
Serial execution of the sample video analytics application
** Streaming with G-API
:PROPERTIES:
:BEAMER_opt: shrink
:END:
#+BEGIN_SRC dot :file 022-fd-pipelined.eps :cmdline "-Kdot -Teps"
digraph {
nodesep=0.2;
ranksep=0.2;
node [margin=0 width=0.4 height=0.2];
node [shape=plaintext]
Camera [label="Camera:"];
GPU [label="GPU:"];
FPGA [label="FPGA:"];
CPU [label="CPU:"];
Time [label="Time:"];
t6 [label="T6"];
t7 [label="T7"];
t8 [label="T8"];
t9 [label="T9"];
t10 [label="T10"];
tnn [label="..."];
node [shape=box margin=0 width=0.4 height=0.4 colorscheme=blues9]
node [color=9] V3;
node [color=8] F4; V4;
node [color=7] DR5; F5; V5;
node [color=6] C6; DR6; F6; V6;
node [color=5] C7; DR7; F7; V7;
node [color=4] C8; DR8; F8;
node [color=3] C9; DR9;
node [color=2] C10;
{rank=same; rankdir=LR; Camera C6 C7 C8 C9 C10}
Camera -> C6 -> C7 -> C8 -> C9 -> C10 [style=invis];
{rank=same; rankdir=LR; GPU DR5 DR6 DR7 DR8 DR9}
GPU -> DR5 -> DR6 -> DR7 -> DR8 -> DR9 [style=invis];
C6 -> DR5 [style=invis];
C6 -> DR6 [constraint=false];
C7 -> DR7 [constraint=false];
C8 -> DR8 [constraint=false];
C9 -> DR9 [constraint=false];
{rank=same; rankdir=LR; FPGA F4 F5 F6 F7 F8}
FPGA -> F4 -> F5 -> F6 -> F7 -> F8 [style=invis];
DR5 -> F4 [style=invis];
DR5 -> F5 [constraint=false];
DR6 -> F6 [constraint=false];
DR7 -> F7 [constraint=false];
DR8 -> F8 [constraint=false];
{rank=same; rankdir=LR; CPU V3 V4 V5 V6 V7}
CPU -> V3 -> V4 -> V5 -> V6 -> V7 [style=invis];
F4 -> V3 [style=invis];
F4 -> V4 [constraint=false];
F5 -> V5 [constraint=false];
F6 -> V6 [constraint=false];
F7 -> V7 [constraint=false];
{rank=same; rankdir=LR; Time t6 t7 t8 t9 t10 tnn}
Time -> t6 -> t7 -> t8 -> t9 -> t10 -> tnn [style=invis];
CPU -> Time [style=invis];
V3 -> t6 [style=invis];
V4 -> t7 [style=invis];
V5 -> t8 [style=invis];
V6 -> t9 [style=invis];
V7 -> t10 [style=invis];
}
#+END_SRC
Pipelined execution for the video analytics application
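The benefit of pipelining over serial execution can be estimated with a simple timing model, sketched here in plain C++. The model is an idealization assumed for this example: every stage runs on its own device, hand-over between stages is free, and all stage times are constant. ~serial_time~ and ~pipelined_time~ are hypothetical helper names, and any stage times fed into them are made-up numbers rather than measurements.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Serial mode: a frame passes through every stage before the next one starts.
double serial_time(const std::vector<double>& stage_ms, int frames) {
    assert(frames >= 0);
    const double per_frame =
        std::accumulate(stage_ms.begin(), stage_ms.end(), 0.0);
    return per_frame * frames;
}

// Idealized pipelining: the first frame pays the full latency, after that
// one frame completes per interval of the slowest stage.
double pipelined_time(const std::vector<double>& stage_ms, int frames) {
    assert(frames >= 0);
    if (frames == 0 || stage_ms.empty()) return 0.0;
    const double latency =
        std::accumulate(stage_ms.begin(), stage_ms.end(), 0.0);
    const double bottleneck =
        *std::max_element(stage_ms.begin(), stage_ms.end());
    return latency + bottleneck * (frames - 1);
}
```

For hypothetical stage times of 5, 10, 20, and 5 ms and 100 frames, the model gives 4000 ms in serial mode and 2020 ms when pipelined; the slowest stage bounds the achievable throughput.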
** Streaming with G-API: Example
**** Serial mode (4.0) :B_block:BMCOL:
:PROPERTIES:
:BEAMER_env: block
:BEAMER_col: 0.45
:END:
#+LaTeX: {\tiny
#+BEGIN_SRC C++
pipeline = cv::GComputation(...);
cv::VideoCapture cap(input);
cv::Mat in_frame;
std::vector<cv::Rect> out_faces;
while (cap.read(in_frame)) {
pipeline.apply(cv::gin(in_frame),
cv::gout(out_faces),
cv::compile_args(kernels,
networks));
// Process results
...
}
#+END_SRC
#+LaTeX: }
**** Streaming mode (since 4.2) :B_block:BMCOL:
:PROPERTIES:
:BEAMER_env: block
:BEAMER_col: 0.45
:END:
#+LaTeX: {\tiny
#+BEGIN_SRC C++
pipeline = cv::GComputation(...);
auto in_src = cv::gapi::wip::make_src
<cv::gapi::wip::GCaptureSource>(input);
auto cc = pipeline.compileStreaming
(cv::compile_args(kernels, networks));
cc.setSource(cv::gin(in_src));
cc.start();
std::vector<cv::Rect> out_faces;
while (cc.pull(cv::gout(out_faces))) {
// Process results
...
}
#+END_SRC
#+LaTeX: }
**** More information
#+LaTeX: {\footnotesize
https://opencv.org/hybrid-cv-dl-pipelines-with-opencv-4-4-g-api/
#+LaTeX: }
* Latest features
** Latest features
*** Python API
- Initial Python3 binding is available now in ~master~ (future 4.5);
- Only basic CV functionality is supported (~core~ & ~imgproc~ namespaces,
selecting backends);
- Adding more programmability, inference, and streaming is next.
** Latest features
*** Python API
#+LaTeX: {\footnotesize
#+BEGIN_SRC Python
import numpy as np
import cv2 as cv
sz = (1280, 720)
in1 = np.random.randint(0, 100, sz).astype(np.uint8)
in2 = np.random.randint(0, 100, sz).astype(np.uint8)
g_in1 = cv.GMat()
g_in2 = cv.GMat()
g_out = cv.gapi.add(g_in1, g_in2)
gr = cv.GComputation(g_in1, g_in2, g_out)
pkg = cv.gapi.core.fluid.kernels()
out = gr.apply(in1, in2, args=cv.compile_args(pkg))
#+END_SRC
#+LaTeX: }
* Understanding the "G-Effect"
** Understanding the "G-Effect"
*** What is "G-Effect"?
- G-API is not only an API, but also an /implementation/;
- i.e. it does some work already!
- We call "G-Effect" any measurable improvement which G-API demonstrates
against traditional methods;
- So far the list is:
- Memory consumption;
- Performance;
- Programmer efforts.
Note: in the following slides, all measurements are taken on
Intel\textregistered{} Core\texttrademark-i5 6600 CPU.
** Understanding the "G-Effect"
# FIXME
*** Memory consumption: Sobel Edge Detector
- G-API/Fluid backend is designed to minimize footprint:
#+LaTeX: {\footnotesize
| Input | OpenCV | G-API/Fluid | Factor |
| | MiB | MiB | Times |
|-------------+--------+-------------+--------|
| 512 x 512 | 17.33 | 0.59 | 28.9x |
| 640 x 480 | 20.29 | 0.62 | 32.8x |
| 1280 x 720 | 60.73 | 0.72 | 83.9x |
| 1920 x 1080 | 136.53 | 0.83 | 164.7x |
| 3840 x 2160 | 545.88 | 1.22 | 447.4x |
#+LaTeX: }
- The detector itself can be written manually in two ~for~
loops, but G-API covers cases more complex than that;
- OpenCV code requires changes to shrink footprint.
** Understanding the "G-Effect"
*** Performance: Sobel Edge Detector
- G-API/Fluid backend also optimizes cache reuse:
#+LaTeX: {\footnotesize
| Input | OpenCV | G-API/Fluid | Factor |
| | ms | ms | Times |
|-------------+--------+-------------+--------|
| 320 x 240 | 1.16 | 0.53 | 2.17x |
| 640 x 480 | 5.66 | 1.89 | 2.99x |
| 1280 x 720 | 17.24 | 5.26 | 3.28x |
| 1920 x 1080 | 39.04 | 12.29 | 3.18x |
| 3840 x 2160 | 219.57 | 51.22 | 4.29x |
#+LaTeX: }
- The more data is processed, the bigger the "G-Effect" is.
** Understanding the "G-Effect"
*** Relative speed-up based on cache efficiency
#+BEGIN_LATEX
\begin{figure}
\begin{tikzpicture}
\begin{axis}[
xlabel={Image size},
ylabel={Relative speed-up},
nodes near coords,
width=0.8\textwidth,
xtick=data,
xticklabels={QVGA, VGA, HD, FHD, UHD},
height=4.5cm,
]
\addplot plot coordinates {(1, 1.0) (2, 1.38) (3, 1.51) (4, 1.46) (5, 1.97)};
\end{axis}
\end{tikzpicture}
\end{figure}
#+END_LATEX
The higher the resolution, the higher the relative speed-up (with
the speed-up on QVGA taken as 1.0).
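The plotted values follow from the Factor column of the performance table by dividing each factor by the QVGA factor. A minimal sketch in plain C++, with ~relative_speedup~ as a hypothetical helper name: because the table shows rounded factors, the results agree with the plotted points only to within about 0.01, which suggests the plot was produced from unrounded measurements.

```cpp
#include <cassert>
#include <vector>

// Normalizes every speed-up factor by the first one (the baseline).
std::vector<double> relative_speedup(const std::vector<double>& factors) {
    assert(!factors.empty() && factors.front() > 0.0);
    std::vector<double> relative;
    relative.reserve(factors.size());
    for (double f : factors) {
        relative.push_back(f / factors.front());
    }
    return relative;
}
```

Feeding in the table's factors ~{2.17, 2.99, 3.28, 3.18, 4.29}~ gives approximately 1.00, 1.38, 1.51, 1.47, and 1.98.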
* Resources on G-API
** Resources on G-API
:PROPERTIES:
:BEAMER_opt: shrink
:END:
*** Repository
- https://github.com/opencv/opencv (see ~modules/gapi~)
*** Article
- https://opencv.org/hybrid-cv-dl-pipelines-with-opencv-4-4-g-api/
*** Documentation
- https://docs.opencv.org/4.4.0/d0/d1e/gapi.html
*** Tutorials
- https://docs.opencv.org/4.4.0/df/d7e/tutorial_table_of_content_gapi.html
* Thank you!

@ -0,0 +1,25 @@
#!/usr/bin/env bash
set -e
MTHEME_VER=2fa6084b9d34fec9d2d5470eb9a17d0bf712b6c8
MTHEME_DIR=mtheme.sty
function make_sty {
if [ -d "$MTHEME_DIR" ]; then rm -rf "$MTHEME_DIR"; fi
mkdir "$MTHEME_DIR"
# Download template from Github
tmp_dir=$(mktemp -d)
wget -P "$tmp_dir" -c https://github.com/matze/mtheme/archive/${MTHEME_VER}.tar.gz
pushd "$tmp_dir"
tar -xzvf "$MTHEME_VER.tar.gz"
popd
make -C "$tmp_dir"/mtheme-"$MTHEME_VER"
cp -v "$tmp_dir"/mtheme-"$MTHEME_VER"/*.sty "$MTHEME_DIR"
rm -r "$tmp_dir"
# Put our own .gitignore to ignore this directory completely
echo "*" > "$MTHEME_DIR/.gitignore"
}
make_sty

@ -0,0 +1,181 @@
%!PS-Adobe-3.0 EPSF-3.0
%%Creator: cairo 1.14.6 (http://cairographics.org)
%%CreationDate: Wed Dec 12 17:03:17 2018
%%Pages: 1
%%DocumentData: Clean7Bit
%%LanguageLevel: 2
%%BoundingBox: 0 -1 598 739
%%EndComments
%%BeginProlog
save
50 dict begin
/q { gsave } bind def
/Q { grestore } bind def
/cm { 6 array astore concat } bind def
/w { setlinewidth } bind def
/J { setlinecap } bind def
/j { setlinejoin } bind def
/M { setmiterlimit } bind def
/d { setdash } bind def
/m { moveto } bind def
/l { lineto } bind def
/c { curveto } bind def
/h { closepath } bind def
/re { exch dup neg 3 1 roll 5 3 roll moveto 0 rlineto
0 exch rlineto 0 rlineto closepath } bind def
/S { stroke } bind def
/f { fill } bind def
/f* { eofill } bind def
/n { newpath } bind def
/W { clip } bind def
/W* { eoclip } bind def
/BT { } bind def
/ET { } bind def
/pdfmark where { pop globaldict /?pdfmark /exec load put }
{ globaldict begin /?pdfmark /pop load def /pdfmark
/cleartomark load def end } ifelse
/BDC { mark 3 1 roll /BDC pdfmark } bind def
/EMC { mark /EMC pdfmark } bind def
/cairo_store_point { /cairo_point_y exch def /cairo_point_x exch def } def
/Tj { show currentpoint cairo_store_point } bind def
/TJ {
{
dup
type /stringtype eq
{ show } { -0.001 mul 0 cairo_font_matrix dtransform rmoveto } ifelse
} forall
currentpoint cairo_store_point
} bind def
/cairo_selectfont { cairo_font_matrix aload pop pop pop 0 0 6 array astore
cairo_font exch selectfont cairo_point_x cairo_point_y moveto } bind def
/Tf { pop /cairo_font exch def /cairo_font_matrix where
{ pop cairo_selectfont } if } bind def
/Td { matrix translate cairo_font_matrix matrix concatmatrix dup
/cairo_font_matrix exch def dup 4 get exch 5 get cairo_store_point
/cairo_font where { pop cairo_selectfont } if } bind def
/Tm { 2 copy 8 2 roll 6 array astore /cairo_font_matrix exch def
cairo_store_point /cairo_font where { pop cairo_selectfont } if } bind def
/g { setgray } bind def
/rg { setrgbcolor } bind def
/d1 { setcachedevice } bind def
%%EndProlog
%%BeginSetup
%%EndSetup
%%Page: 1 1
%%BeginPageSetup
%%PageBoundingBox: 0 -1 598 739
%%EndPageSetup
q 0 -1 598 740 rectclip q
1 0.00392157 0.00392157 rg
225.648 478.363 m 171.051 509.887 144.43 574.156 160.746 635.051 c 177.066
695.945 232.254 738.277 295.301 738.277 c 358.348 738.277 413.535 695.945
429.855 635.051 c 446.172 574.156 419.551 509.887 364.949 478.363 c 323.008
551.008 l 344.73 563.547 355.324 589.117 348.832 613.34 c 342.34 637.566
320.383 654.41 295.301 654.41 c 270.219 654.41 248.262 637.566 241.77 613.34
c 235.277 589.117 245.871 563.547 267.59 551.008 c h
225.648 478.363 m f
0.00392157 0.00392157 1 rg
523.949 444.637 m 578.551 413.113 605.172 348.844 588.855 287.949 c 572.535
227.055 517.348 184.723 454.301 184.723 c 391.254 184.723 336.066 227.055
319.746 287.949 c 303.43 348.844 330.051 413.113 384.648 444.637 c 426.59
371.992 l 404.871 359.453 394.277 333.883 400.77 309.66 c 407.262 285.434
429.219 268.59 454.301 268.59 c 479.383 268.59 501.34 285.434 507.832 309.66
c 514.324 333.883 503.73 359.453 482.008 371.992 c h
523.949 444.637 m f
0.00392157 1 0.00392157 rg
278.602 324 m 278.602 260.953 236.254 205.762 175.359 189.449 c 114.461
173.133 50.207 199.762 18.684 254.363 c -12.84 308.961 -3.773 377.922 40.805
422.504 c 85.383 467.082 154.352 476.164 208.949 444.637 c 167.008 371.992
l 145.289 384.535 117.852 380.922 100.117 363.188 c 82.383 345.453 78.773
318.016 91.316 296.297 c 103.855 274.574 129.418 263.98 153.645 270.473
c 177.871 276.961 194.719 298.918 194.719 324 c h
278.602 324 m f
0.0196078 g
39.781 151.301 m 51.57 152.359 63.492 152.352 75.223 150.672 c 82.449 149.391
90.121 147.52 95.551 142.25 c 101.242 135.898 102.641 127.078 103.891 118.949
c 105.941 102.078 105.699 84.969 103.891 68.09 c 102.68 59.852 101.492
50.949 96.09 44.25 c 90.199 38.27 81.5 36.57 73.52 35.309 c 61.742 33.84
49.789 33.5 37.961 34.68 c 29.949 35.5 21.59 36.91 14.77 41.48 c 10.359
44.281 7.992 49.219 6.379 54.012 c 3.152 63.988 2.742 74.59 2.301 84.988
c 2.25 98.73 2.512 112.609 5.191 126.129 c 6.641 132.441 8.402 139.379
13.73 143.59 c 21.242 149.039 30.789 150.359 39.781 151.301 c h
41.73 132.469 m 51.723 133.27 61.922 133.512 71.801 131.57 c 75.629 130.801
80.152 128.941 80.871 124.578 c 83.871 112.309 83.172 99.531 83.289 86.988
c 82.922 78.07 83.129 68.852 80.141 60.309 c 77.531 54.699 70.422 54.238
65.062 53.422 c 54.312 52.809 43.152 52.27 32.723 55.461 c 27.91 56.73
26.391 61.891 25.652 66.219 c 23.652 79.051 24.301 92.102 24.551 105.031
c 25.082 112.281 24.992 119.801 27.602 126.691 c 30.59 131.309 36.77 131.719
41.73 132.469 c h
41.73 132.469 m f*
147.07 112.219 m 154.23 116.77 163.121 117.512 171.379 116.762 c 179.09
116.102 187.652 113.48 191.781 106.379 c 196.711 97.469 196.992 86.941
197.332 77 c 197.109 66.781 196.922 56.109 192.699 46.609 c 190.289 40.84
184.75 37.059 178.82 35.57 c 169.742 33.34 159.762 33.102 151.012 36.719
c 146.281 38.57 143.012 42.59 140.301 46.711 c 140.301 0 l 120.301 0 l
120.312 38.66 120.281 77.328 120.312 115.988 c 126.781 116.02 133.25 116.02
139.711 115.988 c 139.492 112.012 139.27 108.039 139.16 104.051 c 141.562
106.98 143.789 110.199 147.07 112.219 c h
153.582 101.781 m 159.18 102.211 165.102 102.328 170.34 100.02 c 173.66
98.59 175.41 95.078 176 91.68 c 177.742 82.91 177.52 73.852 176.902 64.969
c 176.281 59.609 175.422 52.672 169.52 50.59 c 162.699 48.359 154.922 48.219
148.18 50.828 c 141.91 53.469 141.18 61.059 140.562 66.949 c 140.191 75.988
139.742 85.289 142.289 94.07 c 143.641 99.051 148.82 101.41 153.582 101.781
c h
153.582 101.781 m f*
221.262 112.07 m 231.09 117.121 242.602 117.301 253.391 116.789 c 262.371
116.039 273.27 114.539 278.223 105.949 c 283.801 95.578 282.891 83.379
283.672 72 c 228.961 72 l 229.602 66.129 228.84 59.801 231.801 54.422 c
234.332 50.172 239.699 49.301 244.242 49.051 c 249.852 49.012 255.891 48.551
261.062 51.16 c 264.02 53.48 264.039 57.602 264.422 61 c 270.82 61.012
277.223 61.012 283.621 61 c 283.379 54.32 282.52 46.84 277.16 42.141 c 269.109
34.922 257.59 34.172 247.289 33.969 c 238.199 34.238 228.602 34.699 220.461
39.18 c 213.871 43.07 211.77 51.059 210.609 58.102 c 209.141 68.559 208.77
79.219 210.02 89.719 c 211.039 98.012 213.27 107.762 221.262 112.07 c h
232.949 99.34 m 238.41 102.66 245.172 101.988 251.301 101.898 c 255.102
101.488 259.73 101.27 262.199 97.91 c 264.723 93.762 264.27 88.68 264.289
84.02 c 252.52 84 240.762 83.969 229 84.031 c 229.18 89.211 228.77 95.531
232.949 99.34 c h
232.949 99.34 m f*
326.262 112.121 m 333.18 116.922 342.121 117.59 350.262 116.648 c 357.191
115.922 364.531 113.281 368.621 107.301 c 372.25 102.34 373.262 96.02 373.312
90.012 c 373.281 71.672 373.32 53.34 373.301 35 c 366.961 34.988 360.629
34.988 354.312 35 c 354.281 52.352 354.332 69.691 354.281 87.031 c 354.09
90.82 354.242 95.199 351.391 98.121 c 348.352 101.41 343.582 102.051 339.332
102.02 c 334.191 102.051 328.629 101.172 324.672 97.621 c 320.801 94.32
319.332 89 319.312 84.078 c 319.281 67.719 319.32 51.359 319.289 35.012
c 312.961 34.988 306.629 34.988 300.312 35 c 300.301 62 300.301 89 300.312
116 c 306.531 116.02 312.762 116.012 318.98 116 c 318.949 111.262 318.48
106.551 318.34 101.809 c 320.379 105.641 322.52 109.68 326.262 112.121
c h
326.262 112.121 m f*
407.691 147.602 m 418.172 151.121 429.34 151.621 440.301 152.012 c 450.922
151.961 462.02 151.859 471.941 147.578 c 476.98 145.48 480.473 140.879
482.172 135.801 c 484.941 128.211 485.02 119.988 485.082 112 c 477.77 112
470.461 111.98 463.16 112.012 c 463.039 117.629 463.473 123.93 459.992
128.711 c 456.473 132.309 450.973 132.301 446.301 132.852 c 436.801 133.031
426.91 133.641 417.812 130.359 c 414.531 129.32 412.832 126.039 412.172
122.879 c 410.301 114.398 410.289 105.648 410.301 97 c 410.41 85.441 410.23
73.711 412.699 62.34 c 413.352 58.18 417.18 55.621 421.02 54.699 c 429.902
52.488 439.172 52.809 448.242 53.352 c 452.973 53.969 458.73 54.281 461.699
58.621 c 464.871 63.801 464.34 70.172 464.172 75.988 c 471.551 76.02 478.922
76.012 486.301 75.988 c 486.211 66.801 486.051 57.309 482.711 48.609 c
480.992 44.059 477.441 40.199 472.84 38.461 c 463.812 34.84 453.91 34.609
444.332 34.031 c 433.223 33.84 421.973 34.109 411.109 36.699 c 404.742
38.359 397.781 41.281 394.832 47.609 c 391.062 55.98 390.371 65.289 389.402
74.301 c 388.59 86.199 388.07 98.121 388.359 110.039 c 388.93 119.691 389.812
129.859 395.02 138.27 c 397.789 142.949 402.652 145.879 407.691 147.602
c h
407.691 147.602 m f*
489.902 150.969 m 497.52 150.961 505.141 151.18 512.75 150.859 c 520.16
127.352 528.301 104.078 535.781 80.602 c 538.691 71.578 540.75 62.301 543.762
53.309 c 547.129 63.012 549.289 73.09 552.59 82.809 c 559.902 105.52 567.41
128.16 574.711 150.871 c 582.23 151.191 589.77 150.91 597.301 151.012 c
597.301 148.52 l 584.922 110.789 572.832 72.961 560.699 35.141 c 549.379
34.91 538.039 34.879 526.723 35.16 c 514.66 73.828 502.02 112.32 489.902
150.969 c h
489.902 150.969 m f*
Q Q
showpage
%%Trailer
end restore
%%EOF