Here are important notes about VideoPipe if you want to figure out how it works!

Core parts in video structured application

Video structuring is the process of converting unstructured data (video, in this case) into structured data. Unstructured data includes:

  • video
  • image
  • audio
  • natural language text

Structured data mainly means formats such as JSON, XML, or tables in a database, which can be processed directly by a machine (program).

Specifically for video, the structuring process mainly involves these core parts:

  1. Read stream. Capture the video stream from the network or a local device.
  2. Decode. Decode the byte stream into frames; algorithms can only act on images.
  3. Inference. Run deep learning on images: detection, classification, or feature extraction.
  4. Track. Track objects across frames in the video.
  5. Behaviour analysis (optional). Analyze the objects' tracks.
  6. OSD (on-screen display). Draw results onto images for debugging or for intuitive visualization.
  7. Message broker. Push structured data to external systems.
  8. Encode. Encode frames containing results back into a byte stream for transfer/serialization.
  9. Push stream. Push the byte stream to external systems or save it directly.

Figure 1. Core parts in a video structured application.

Each core part of video structuring corresponds to one type of plugin in VideoPipe, called a Node in code.
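
To make the mapping concrete, here is a minimal wiring sketch in the style of the project's sample code. The node class names and constructor arguments below (file path, model path, and so on) are illustrative assumptions; check the headers in the source tree for the exact signatures.

```cpp
#include <memory>

// src (read + decode) -> inference -> osd -> screen des (display/push)
auto file_src_0 = std::make_shared<vp_nodes::vp_file_src_node>(
    "file_src_0", 0, "./test_video/face.mp4");                // read & decode
auto face_detector_0 = std::make_shared<vp_nodes::vp_yunet_face_detector_node>(
    "face_detector_0", "./models/face_detection_yunet.onnx"); // inference
auto osd_0 = std::make_shared<vp_nodes::vp_face_osd_node>("osd_0"); // osd
auto screen_des_0 = std::make_shared<vp_nodes::vp_screen_des_node>(
    "screen_des_0", 0);                                       // display

// wire the nodes into a pipe, then start the source
face_detector_0->attach_to({file_src_0});
osd_0->attach_to({face_detector_0});
screen_des_0->attach_to({osd_0});
file_src_0->start();
```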

Node in VideoPipe

One Node in VideoPipe is responsible for a single task such as decoding or inference. We put many nodes together to construct a pipe and let video data flow through the whole pipeline. Every Node has 2 queues inside: one caches data arriving from upstream nodes, and the other caches data waiting to be pushed to downstream nodes. We write logic code between the 2 queues; this is a typical producer-consumer pattern.

Figure 2. What a Node looks like.
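
The two-queue structure is a classic producer-consumer arrangement. The self-contained toy below (not VideoPipe's actual code) shows the idea: an input queue fed by upstream, a worker that runs the node's logic between the queues, and an output queue drained toward downstream.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// toy node skeleton: in-queue -> logic -> out-queue (producer-consumer)
struct toy_node {
    std::queue<int> in_q, out_q;        // int stands in for frame meta
    std::mutex in_m, out_m;
    std::condition_variable in_cv;
    std::function<void(int&)> logic;    // node-specific work

    void push_from_upstream(int item) { // producer side, called by upstream
        { std::lock_guard<std::mutex> lg(in_m); in_q.push(item); }
        in_cv.notify_one();
    }
    void run() {                        // consumer side, one thread
        for (;;) {
            std::unique_lock<std::mutex> lk(in_m);
            in_cv.wait(lk, [this] { return !in_q.empty(); });
            int item = in_q.front(); in_q.pop();
            lk.unlock();
            if (logic) logic(item);     // the work happens between the queues
            std::lock_guard<std::mutex> lg(out_m);
            out_q.push(item);           // waits here for the downstream node
        }
    }
};
```

Start the consumer with something like std::thread t(&toy_node::run, &node); the wait on the input queue is the producer-consumer handshake between neighbouring nodes.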

By default, the producer and consumer work on a single thread inside a node. We need to write asynchronous code when dealing with time-consuming tasks (for example, pushing data in vp_message_broker_node) to avoid blocking the pipeline.

Figure 3. Async task inside a Node.
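
One common way to keep the pipeline moving is to hand the slow work (such as publishing a message to a broker) to a dedicated background thread through another queue, so the node's handler returns immediately. A self-contained sketch of that pattern (again, not VideoPipe's actual implementation):

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// background sender: the handler enqueues and returns at once;
// the slow sending happens on a separate thread.
class async_sender {
    std::queue<std::string> pending;
    std::mutex m;
    std::condition_variable cv;
    std::thread t{&async_sender::drain, this};

    void drain() {
        for (;;) {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [this] { return !pending.empty(); });
            auto msg = std::move(pending.front()); pending.pop();
            lk.unlock();
            // slow network I/O would happen here, off the pipeline thread
            std::cout << "sent: " << msg << "\n";
        }
    }
public:
    // cheap; this is what the node's handler calls
    void send_async(std::string msg) {
        { std::lock_guard<std::mutex> lg(m); pending.push(std::move(msg)); }
        cv.notify_one();
    }
    // a real implementation would also handle shutdown/join of the thread
};
```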

There are 3 types of Node in VideoPipe:

  • SRC Node. The source node, where data is created (it has only 1 queue inside, used for caching data to be pushed to downstream nodes).
  • MID Node. A middle node, where data is handled.
  • DES Node. The destination node, where data disappears (it has only 1 queue inside, used for caching data from upstream nodes).

Each Node can merge multiple upstream nodes and split into multiple downstream nodes as well. Note that by default a Node uses shallow copies and sends the same data to every downstream node. If you need deep copies, or want to transfer data by channel index (to keep the streams from getting mixed up), add a vp_split_node at the split point to get that behaviour.

Figure 4. Merge & split in VideoPipe.
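
In code, merging is just attaching one node to several upstream nodes, and splitting is attaching several downstream nodes to one node. The sketch below follows the same sample style as before; in particular, the vp_split_node constructor arguments are assumptions here, so check its header for the real options (deep copy vs. routing by channel index).

```cpp
#include <memory>

// merge: one detector consumes two sources (channel 0 and channel 1)
auto src_0 = std::make_shared<vp_nodes::vp_rtsp_src_node>("src_0", 0, "rtsp://...");
auto src_1 = std::make_shared<vp_nodes::vp_rtsp_src_node>("src_1", 1, "rtsp://...");
auto detector = std::make_shared<vp_nodes::vp_yunet_face_detector_node>(
    "detector_0", "./models/face_detection_yunet.onnx");
detector->attach_to({src_0, src_1});      // merge point

// split: by default downstream nodes get shallow copies of the SAME data.
// to deep-copy or route by channel index, insert a vp_split_node first
// (constructor arguments below are assumed -- check vp_split_node.h):
auto split_0 = std::make_shared<vp_nodes::vp_split_node>("split_0", true);
split_0->attach_to({detector});

auto screen_des_0 = std::make_shared<vp_nodes::vp_screen_des_node>("screen_des_0", 0);
auto screen_des_1 = std::make_shared<vp_nodes::vp_screen_des_node>("screen_des_1", 1);
screen_des_0->attach_to({split_0});       // e.g. receives channel 0 only
screen_des_1->attach_to({split_0});       // e.g. receives channel 1 only
```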

Data flow in VideoPipe

Video frames are heavyweight data, so deep copying them frequently would degrade pipeline performance. Data transferred between 2 nodes in VideoPipe uses smart pointers by default: once data is created by a source node, its content is NOT copied again for most of the pipeline (though we can request a deep copy when needed, using vp_split_node for instance).

Figure 5. How data flows in VideoPipe.
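
A self-contained toy showing why this is cheap: passing a std::shared_ptr between queues shares one frame buffer instead of copying it; only the pointer (and a reference count) moves. The frame_meta struct here is a stand-in, not VideoPipe's real vp_frame_meta.

```cpp
#include <iostream>
#include <memory>
#include <queue>
#include <vector>

struct frame_meta {
    int frame_index;
    std::vector<unsigned char> buffer;   // the heavyweight pixel data
};

int main() {
    // the source node creates the frame once
    auto meta = std::make_shared<frame_meta>();
    meta->frame_index = 1;
    meta->buffer.resize(1920 * 1080 * 3);

    // "transferring" between nodes copies the pointer, not the pixels
    std::queue<std::shared_ptr<frame_meta>> node_a_out, node_b_in;
    node_a_out.push(meta);
    node_b_in.push(node_a_out.front());

    // 3 owners (meta + two queues), but still a single ~6 MB buffer
    std::cout << "use_count = " << meta.use_count() << "\n";  // prints 3
}
```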

A video consists of continuous frames, and VideoPipe handles these frames one by one, so the frame index in the frame meta increases continuously as well.

Hooks in VideoPipe

A hook is a mechanism that lets a host notify listeners when something happens, and VideoPipe supports hooks as well. The pipeline invokes callback functions (via std::function objects) to communicate with external code, for example to export fps, latency, and other status of the pipeline itself. We should NOT block inside these callback functions when writing custom code.

Figure 6. Hooks in VideoPipe.
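
The mechanism is just storing a std::function and invoking it when something happens. A self-contained toy follows; VideoPipe's actual hook names and signatures may differ, so see vp_node and vp_analysis_board for the real API.

```cpp
#include <functional>
#include <iostream>

// toy pipeline object exposing one hook slot
struct toy_pipeline {
    // external code plugs a callback in here
    std::function<void(int /*channel*/, float /*fps*/)> on_fps_hook;

    void tick() {
        float fps = 25.0f;                     // measured internally
        if (on_fps_hook) on_fps_hook(0, fps);  // notify the listener
    }
};

int main() {
    toy_pipeline p;
    // the listener must return quickly -- never block inside a hook
    p.on_fps_hook = [](int channel, float fps) {
        std::cout << "channel " << channel << " fps: " << fps << "\n";
    };
    p.tick();
}
```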

Hooks help us debug our application and quickly find the bottleneck in the whole pipe; the visualization tool vp_analysis_board depends on hooks to work.

Implement new Node type in VideoPipe

vp_node is the base class for all nodes in VideoPipe. We can define a new node class derived from vp_node and override virtual functions such as handle_frame_meta and handle_control_meta.

  • handle_frame_meta. Handles frame data flowing through the current node.
  • handle_control_meta. Handles control data flowing through the current node.

Figure 7. Override virtual functions in a custom Node.

Frame data means vp_frame_meta in VideoPipe, which contains data related to the frame, such as the frame index, data buffer, and original width. Control data means vp_control_meta, which contains data related to commands, such as record video or record image.

Note that NOT all data flowing through the current node needs new handling logic; data simply passes through if no operation works on it. We only need to handle what we are interested in.
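
A minimal sketch of a custom node follows. The method names come from the text above, but the exact signatures and constructor parameters are simplified assumptions here; check vp_node.h for the real declarations.

```cpp
// hypothetical custom node; exact base-class signatures may differ
class my_paint_node : public vp_nodes::vp_node {
public:
    my_paint_node(std::string node_name) : vp_node(node_name) {}

protected:
    // called for every frame flowing through this node
    virtual std::shared_ptr<vp_objects::vp_meta>
    handle_frame_meta(std::shared_ptr<vp_objects::vp_frame_meta> meta) override {
        // custom logic on frame data, e.g. draw the frame index on the image:
        // cv::putText(meta->frame, std::to_string(meta->frame_index), ...);
        return meta;   // pass it downstream
    }

    // called for control commands (record video/image, ...)
    virtual std::shared_ptr<vp_objects::vp_meta>
    handle_control_meta(std::shared_ptr<vp_objects::vp_control_meta> meta) override {
        return meta;   // not interested here: just pass it through
    }
};
```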

Hardware Acceleration in VideoPipe

Some operations in video structured applications can benefit from hardware such as GPUs/NPUs; for example, video encoding on a GPU is much faster than on a CPU. VideoPipe supports hardware acceleration for these parts:

  • Decode/encode. Based on hardware decode/encode GStreamer plugins (see the project docs for more).
  • Inference. No doubt about it.
  • OSD. You need to implement this yourself based on hardware acceleration SDKs.

It is important to note that although VideoPipe supports hardware acceleration for the parts above, they can NOT share memory with each other. This means data is copied from GPU to CPU (or CPU to GPU) over and over again, which is the biggest disadvantage compared to other similar SDKs such as DeepStream.
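
As an example of the decode/encode case, hardware decoding is typically driven through a GStreamer pipeline string. The sketch below uses NVIDIA decoder elements via OpenCV's GStreamer backend; the element names are platform-dependent assumptions (DeepStream/Jetson installs use nvv4l2decoder and nvvideoconvert or nvvidconv), and the final videoconvert/appsink stage is exactly the GPU-to-CPU copy discussed above.

```cpp
#include <opencv2/opencv.hpp>
#include <string>

int main() {
    // hypothetical hardware-decode pipeline; element names vary by platform
    std::string gst =
        "filesrc location=./test_video/face.mp4 ! qtdemux ! h264parse ! "
        "nvv4l2decoder ! "                              // decode on the GPU
        "nvvideoconvert ! video/x-raw, format=BGRx ! "  // copy back to CPU memory
        "videoconvert ! video/x-raw, format=BGR ! appsink";

    cv::VideoCapture cap(gst, cv::CAP_GSTREAMER);
    cv::Mat frame;
    while (cap.read(frame)) {
        // frame is now a CPU-side image; inference/osd would follow here
    }
}
```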