Visiongraph
Visiongraph is a high-level computer vision framework that provides predefined modules to quickly create and run algorithms on images. It is based on OpenCV and integrates other computer vision frameworks such as Intel OpenVINO and Google MediaPipe.
Here is an example of how to start a webcam capture and display the image:
from visiongraph import vg
vg.create_graph(vg.VideoCaptureInput()).then(vg.ImagePreview()).open()
Get started with visiongraph by reading the documentation.
Installation
Visiongraph supports Python 3.10 and 3.11. Other versions may also work but are not officially supported. Usually this is a third-party dependency problem: for example, pyrealsense2 does not provide wheel packages for Python 3.12.
To install visiongraph with all dependencies, call pip like this:
pip install "visiongraph[all]"
It is also possible to install only certain packages, depending on your needs (recommended):
# example on how to install realsense and openvino support only
pip install "visiongraph[realsense, openvino]"
Please read more about the extra packages in the documentation.
Optional Mediapipe Support
Visiongraph can integrate Google’s MediaPipe for advanced hand, face and object tracking pipelines. Unfortunately, the official PyPI MediaPipe wheels declare a strict dependency on numpy<2.0, which prevents installation alongside NumPy 2.x, even though most functionality works fine with NumPy 2.0 and above. To work around this limitation, we maintain a custom mediapipe-numpy2 build that removes the <2.0 pin.
When you install with the mediapipe extra, pip will automatically fetch the matching patched wheel for your OS and Python version.
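For example, installing with the mediapipe extra (listed under Extras below) is a single pip call:
pip install "visiongraph[mediapipe]"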
Alternative: Use the Official MediaPipe Release
If you’re happy to stick with NumPy <2.0, you can skip our custom package entirely and install the upstream MediaPipe wheel from PyPI:
pip install visiongraph mediapipe
This will install Visiongraph plus the official mediapipe package (which requires numpy<2.0). Just make sure your environment’s NumPy version is below 2.0 when using this route.
Examples
To demonstrate the possibilities of visiongraph, several examples are already implemented and ready for you to try out. Here is a list of the current examples:
- SimpleVisionGraph - SSD object detection & tracking of live webcam input with 5 lines of code.
- VisionGraphExample - A face detection and tracking example with custom events.
- InputExample - A basic input example that determines the center if possible.
- RealSenseDepthExample - Display the RealSense or Azure Kinect depth map.
- FaceDetectionExample - A face detection pipeline example.
- FindFaceExample - A face recognition example to find a target face.
- CascadeFaceDetectionExample - A face detection pipeline that also predicts other feature points of the face.
- HandDetectionExample - A hand detection pipeline example.
- PoseEstimationExample - A pose estimation pipeline which annotates the generic pose keypoints.
- ProjectedPoseExample - Project the pose estimation into 3d space with the RealSense camera.
- ObjectDetectionExample - An object detection & tracking example.
- InstanceSegmentationExample - Instance segmentation based on the COCO80 dataset.
- InpaintExample - GAN based inpainting example.
- MidasDepthExample - Real-time depth prediction with the MiDaS-small network.
- RGBDSmoother - Smooth RGB-D depth map videos with a one-euro filter per pixel.
- FaceMeshVVADExample - Detect voice activity by landmark sequence classification.
There are even more examples where visiongraph is currently in use:
- Spout/Syphon RGB-D Example - Share RGB-D images over spout or syphon.
- WebRTC Input - A WebRTC input example for visiongraph.
Development
To develop on visiongraph, it is recommended to clone this repository and install the dependencies as follows. First, install the uv package manager.
# in the visiongraph directory install all dependencies
uv sync --all-extras --dev --group docs
Build
To build a new wheel package of visiongraph, run the following commands in the root directory. The wheel and source distribution can then be found in ./dist.
uv run python setup.py generate_init
uv build
Docs
To generate the documentation, use the following commands.
# create documentation into "./docs"
uv run python setup.py doc
# launch pdoc webserver
uv run python setup.py doc --launch
Dependencies
Parts of these libraries are directly included and adapted to work with visiongraph.
- motpy - simple multi object tracking library (MIT License)
- motrackers - Multi-object trackers in Python (MIT License)
- OneEuroFilter-Numpy - a NumPy implementation of the one-euro filter (MIT License)
Here you can find a list of the dependencies of visiongraph and their licences:
- depthai (MIT License)
- faiss-cpu (MIT License)
- filterpy (MIT License)
- mediapipe (Apache License 2.0)
- moviepy (MIT License)
- numba (BSD License)
- onnxruntime (MIT License)
- onnxruntime-directml (MIT License)
- onnxruntime-gpu (MIT License)
- opencv-python (Apache License 2.0)
- openvino (Apache License 2.0)
- pyk4a-bundle (MIT License)
- pyopengl (BSD License)
- pyrealsense2 (Apache License 2.0)
- pyrealsense2-macosx (Apache License 2.0)
- requests (Apache License 2.0)
- scipy (BSD License)
- SpoutGL (BSD License)
- syphon-python (MIT License)
- tqdm (MIT License)
- vector (BSD License)
- vidgear (Apache License 2.0)
- wheel (MIT License)
For more information about the dependencies, have a look at the requirements.txt.
Please note that some models (such as Ultralytics YOLOv8 and YOLOv11) have specific licences (AGPLv3). Always check the model licence before using the model.
About
Copyright (c) 2025 Florian Bruggisser
Documentation
This documentation is intended to provide an overview of the framework. Full documentation will be available later.
Import Visiongraph
There are two ways to import visiongraph-related objects and classes. The classical way is to use a direct import like this:
from visiongraph.estimator.openvino.OpenVinoEngine import OpenVinoEngine
engine = OpenVinoEngine(...)
However, due to the number of packages and the package depth in visiongraph, it is recommended to use the vg package:
from visiongraph import vg
engine = vg.OpenVinoEngine(...)
Optional Imports
vg allows direct access to all members of visiongraph and even handles optional imports. If an import is not available, a stub object is returned that throws an error when its attributes are accessed. The reason behind this is that it is possible to work with object types that would not be accessible on certain systems (like macOS):
from visiongraph import vg
device = ...
if isinstance(device, vg.AzureKinectInput):
    # would always be "False" on macOS
    print("This is a Kinect")
Graph
The core component of visiongraph is the BaseGraph class. It contains and handles all the nodes of the graph. A BaseGraph can run on the calling thread, on a new thread, or in a separate process. The nodes in the graph are just a list; the graph itself is created by nesting nodes into each other.
Graph Node
A GraphNode is a single step in the graph. It has an input and an output type and processes the data within its process() method.
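As a rough illustration, a custom node could look like the following sketch. The setup() and release() lifecycle hooks and the plain vg.GraphNode subclassing are assumptions about the API, not confirmed details:
from visiongraph import vg
import cv2
import numpy as np

# hypothetical node that converts incoming BGR frames to grayscale
class GrayscaleNode(vg.GraphNode):
    def setup(self):
        pass  # allocate resources before the graph starts (assumed hook)

    def process(self, data: np.ndarray) -> np.ndarray:
        return cv2.cvtColor(data, cv2.COLOR_BGR2GRAY)

    def release(self):
        pass  # clean up when the graph stops (assumed hook)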
Graph Builder
The graph builder helps to create new graphs in a single line of Python. It creates a VisionGraph object, which is a child of BaseGraph. The following code snippet shows the graph builder creating a smooth pose-estimation graph.
from visiongraph import vg
graph = (
vg.create_graph(name="Smooth Pose Estimation",
input_node=vg.VideoCaptureInput(0),
handle_signals=True)
.apply(ssd=vg.sequence(vg.OpenPoseEstimator.create(), vg.MotpyTracker(), vg.LandmarkSmoothFilter()),
image=vg.passthrough())
.then(vg.ResultAnnotator(image="image"), vg.ImagePreview())
)
graph.open()
Input
Supported input types are image, video, webcam, RealSense, and Azure Kinect.
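Because inputs are ordinary graph nodes, swapping the source only changes the first node of the graph. A minimal sketch; the RealSense node name below is an assumption based on the examples list:
from visiongraph import vg

# webcam input (channel 0), as in the examples above
vg.create_graph(vg.VideoCaptureInput(0)).then(vg.ImagePreview()).open()

# hypothetical: the same preview graph fed by a RealSense camera
# vg.create_graph(vg.RealSenseInput()).then(vg.ImagePreview()).open()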
Estimator
Usually an estimator is a graph node that takes an image as input and estimates information about its content, such as a pose estimation or a face detection. It is also possible to transform the image, for example de-blurring it or estimating its depth map.
Object Detection Tracker
Object detection trackers allow a detected object to be assigned an id that remains the same across successive frames.
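In a graph, a tracker is typically chained directly after a detector, as in the graph-builder example above:
from visiongraph import vg

# detector followed by a tracker: detections now carry stable ids
tracked_pose = vg.sequence(vg.OpenPoseEstimator.create(), vg.MotpyTracker())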
DSP (Digital Signal Processing)
To filter noisy estimations or inputs, the DSP package provides different filters that can be applied directly within a graph.
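The pose graph in the Graph Builder section does exactly this by appending a LandmarkSmoothFilter after the tracker:
from visiongraph import vg

# detector -> tracker -> smoothing filter, as in the builder example
smoothed = vg.sequence(vg.OpenPoseEstimator.create(), vg.MotpyTracker(), vg.LandmarkSmoothFilter())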
Recorder
To record incoming frames or annotated results, multiple frame recorders are provided.
Assets
Most estimators use large model and weight files for their neural networks. To keep visiongraph small and easy to install, these assets are hosted externally on GitHub. Visiongraph provides a system to download and cache these files automatically.
Argparse
To support rapid prototyping, many graph and estimator options are already provided and can be added to an argparse parser.
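A rough sketch of the idea follows; the hook names add_params() and configure() are assumptions about the API and may differ:
from argparse import ArgumentParser
from visiongraph import vg

parser = ArgumentParser()
input_node = vg.VideoCaptureInput()
input_node.add_params(parser)  # assumed: the node registers its CLI options
args = parser.parse_args()
input_node.configure(args)     # assumed: the node reads its options back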
Logging
To enable logging for visiongraph imports, set the following environment variable:
# zsh / bash
export VISIONGRAPH_LOGLEVEL=INFO
# cmd
set VISIONGRAPH_LOGLEVEL=INFO
# powershell
$env:VISIONGRAPH_LOGLEVEL="INFO"
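Alternatively, the variable can be set from Python, as long as it happens before the first visiongraph import:
# set the log level before importing visiongraph
import os
os.environ["VISIONGRAPH_LOGLEVEL"] = "INFO"

from visiongraph import vg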
Extras
It is possible to install extra modules for visiongraph by specifying them when installing. Here is a list of the currently supported extras:
- realsense - Support for Intel RealSense cameras
- azure - Support for Microsoft Azure Kinect cameras
- depthai - Support for the Luxonis cameras
- openvino - Support for the Intel OpenVINO machine learning framework
- mediapipe - Support for the Google MediaPipe machine learning framework
- onnxruntime - Support for the ONNX machine learning framework (CPU)
- onnxruntime-gpu - Support for the ONNX machine learning framework (CUDA GPU)
- onnxruntime-directml - Support for the ONNX machine learning framework (DirectML GPU)
- media - Support for VidGear and MoviePy video reading and writing
- numba - Improved performance for smoothing and tracking algorithms
- fbs - Support for framebuffer sharing (SpoutGL or Syphon)
- faiss - Support for fast pose classification
- mot - Support for multi-object tracking using motpy