Visiongraph
Visiongraph is a high-level computer vision framework that provides predefined modules to quickly create and run algorithms on images. It is based on OpenCV and integrates other computer vision frameworks such as Intel OpenVINO and Google MediaPipe.
Here is an example of how to start a webcam capture and display the image:
from visiongraph import vg
vg.create_graph(vg.VideoCaptureInput()).then(vg.ImagePreview()).open()
Get started with visiongraph by reading the documentation.
Installation
Visiongraph supports Python 3.10 and 3.11. Other versions may also work but are not officially supported. Usually this is a third-party dependency problem: for example, pyrealsense2 does not provide wheel packages for Python 3.12.
To install visiongraph with all dependencies, call pip like this:
pip install "visiongraph[all]"
It is also possible to install only certain packages, depending on your needs (recommended):
# example on how to install realsense and openvino support only
pip install "visiongraph[realsense, openvino]"
Please read more about the extra packages in the documentation.
Optional Mediapipe Support
Visiongraph can integrate Google’s MediaPipe for advanced hand, face and object tracking pipelines. Unfortunately, the official PyPI MediaPipe wheels declare a strict dependency on numpy<2.0, which prevents installation alongside NumPy 2.x, even though most functionality works fine with NumPy 2.0 and above. To work around this limitation, we maintain a custom mediapipe-numpy2 build that removes the <2.0 pin.
When you install with the mediapipe extra, pip will automatically fetch the matching patched wheel for your OS and Python version.
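For example, installing with the mediapipe extra (listed under Extras below) is a single pip call:
pip install "visiongraph[mediapipe]"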
Alternative: Use the Official MediaPipe Release
If you’re happy to stick with NumPy <2.0, you can skip our custom package entirely and install the upstream MediaPipe wheel from PyPI:
pip install visiongraph mediapipe
This will install Visiongraph plus the official mediapipe package (which requires numpy<2.0). Just make sure your environment’s NumPy version is below 2.0 when using this route.
Examples
To demonstrate the possibilities of visiongraph, several examples are already implemented and ready for you to try out. Here is a list of the current examples:
- SimpleVisionGraph - SSD object detection & tracking of live webcam input with 5 lines of code.
- VisionGraphExample - A face detection and tracking example with custom events.
- InputExample - A basic input example that determines the center if possible.
- RealSenseDepthExample - Display the RealSense or Azure Kinect depth map.
- FaceDetectionExample - A face detection pipeline example.
- FindFaceExample - A face recognition example to find a target face.
- CascadeFaceDetectionExample - A face detection pipeline that also predicts other feature points of the face.
- HandDetectionExample - A hand detection pipeline example.
- PoseEstimationExample - A pose estimation pipeline which annotates the generic pose keypoints.
- ProjectedPoseExample - Project the pose estimation into 3d space with the RealSense camera.
- ObjectDetectionExample - An object detection & tracking example.
- InstanceSegmentationExample - Instance segmentation based on the COCO80 dataset.
- InpaintExample - GAN based inpainting example.
- MidasDepthExample - Real-time depth prediction with the MiDaS-small network.
- RGBDSmoother - Smooth RGB-D depth map videos with a one-euro filter per pixel.
- FaceMeshVVADExample - Detect voice activity by landmark sequence classification.
There are even more examples where visiongraph is currently in use:
- Spout/Syphon RGB-D Example - Share RGB-D images over spout or syphon.
- WebRTC Input - A WebRTC input example for visiongraph.
Development
To develop on visiongraph, it is recommended to clone this repository and install the dependencies as follows. First, install the uv package manager.
# in the visiongraph directory install all dependencies
uv sync --all-extras --dev --group docs
Build
To build a new wheel package of visiongraph, run the following commands in the root directory. The wheel and source distribution can then be found in ./dist.
uv run python setup.py generate_init
uv build
Docs
To generate the documentation, use the following commands.
# create documentation into "./docs"
uv run python setup.py doc
# launch pdoc webserver
uv run python setup.py doc --launch
Dependencies
Parts of these libraries are directly included and adapted to work with visiongraph.
- motpy - simple multi object tracking library (MIT License)
- motrackers - Multi-object trackers in Python (MIT License)
- OneEuroFilter-Numpy - a NumPy implementation of the one-euro filter (MIT License)
Here you can find a list of the dependencies of visiongraph and their licences:
- depthai (MIT License)
- faiss-cpu (MIT License)
- filterpy (MIT License)
- mediapipe (Apache License 2.0)
- moviepy (MIT License)
- numba (BSD License)
- onnxruntime (MIT License)
- onnxruntime-directml (MIT License)
- onnxruntime-gpu (MIT License)
- opencv-python (Apache License 2.0)
- openvino (Apache License 2.0)
- pyk4a-bundle (MIT License)
- pyopengl (BSD License)
- pyrealsense2 (Apache License 2.0)
- pyrealsense2-macosx (Apache License 2.0)
- requests (Apache License 2.0)
- scipy (BSD License)
- SpoutGL (BSD License)
- syphon-python (MIT License)
- tqdm (MIT License)
- vector (BSD License)
- vidgear (Apache License 2.0)
- wheel (MIT License)
For more information about the dependencies, have a look at the requirements.txt.
Please note that some models (such as Ultralytics YOLOv8 and YOLOv11) have specific licences (AGPLv3). Always check the model licence before using the model.
About
Copyright (c) 2025 Florian Bruggisser
Documentation
This documentation is intended to provide an overview of the framework. Full documentation will be available later.
Import Visiongraph
There are two ways to import visiongraph-related objects and classes. The classical way is to use a direct import like this:
from visiongraph.estimator.openvino.OpenVinoEngine import OpenVinoEngine
engine = OpenVinoEngine(...)
However, due to the number of packages and the package depth in visiongraph, it is recommended to use the vg package:
from visiongraph import vg
engine = vg.OpenVinoEngine(...)
Optional Imports
vg allows direct access to all members of visiongraph and even handles optional imports. If an import is not available, a stub object is returned that throws an error when its attributes are accessed. The reason behind this is that it is possible to work with object types that would not be accessible on certain systems (like macOS):
from visiongraph import vg
device = ...
if isinstance(device, vg.AzureKinectInput):
    # would always be "False" on macOS
    print("This is a Kinect")
Graph
The core component of visiongraph is the BaseGraph class. It contains and handles all the nodes of the graph. A BaseGraph can run on the calling thread, on a new thread, or in a separate process. The nodes in the graph are just a list; the graph itself is created by nesting nodes into each other.
Graph Node
A GraphNode is a single step in the graph. It has an input and an output type and processes the data within its process() method.
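As a rough illustration, a custom node could look like the following sketch. The setup() and release() lifecycle hooks and the plain vg.GraphNode subclassing are assumptions about the API, not confirmed details:
from visiongraph import vg
import cv2
import numpy as np

# hypothetical node that converts incoming BGR frames to grayscale
class GrayscaleNode(vg.GraphNode):
    def setup(self):
        pass  # allocate resources before the graph starts (assumed hook)

    def process(self, data: np.ndarray) -> np.ndarray:
        return cv2.cvtColor(data, cv2.COLOR_BGR2GRAY)

    def release(self):
        pass  # clean up when the graph stops (assumed hook)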
Graph Builder
The graph builder helps to create new graphs in a single line of Python. It creates a VisionGraph object, which is a child of BaseGraph. The following code snippet shows the graph builder creating a smooth pose-estimation graph.
from visiongraph import vg
graph = (
vg.create_graph(name="Smooth Pose Estimation",
input_node=vg.VideoCaptureInput(0),
handle_signals=True)
.apply(ssd=vg.sequence(vg.OpenPoseEstimator.create(), vg.MotpyTracker(), vg.LandmarkSmoothFilter()),
image=vg.passthrough())
.then(vg.ResultAnnotator(image="image"), vg.ImagePreview())
)
graph.open()
Input
Supported input types are image, video, webcam, RealSense, and Azure Kinect.
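Because inputs are ordinary graph nodes, swapping the source only changes the first node of the graph. A minimal sketch; the RealSense node name below is an assumption based on the examples list:
from visiongraph import vg

# webcam input (channel 0), as in the examples above
vg.create_graph(vg.VideoCaptureInput(0)).then(vg.ImagePreview()).open()

# hypothetical: the same preview graph fed by a RealSense camera
# vg.create_graph(vg.RealSenseInput()).then(vg.ImagePreview()).open()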
Estimator
Usually an estimator is a graph node that takes an image as input and estimates information about its content, such as a pose estimation or a face detection. It is also possible to transform the image, for example de-blurring it or estimating its depth map.
Object Detection Tracker
Object detection trackers allow a detected object to be assigned an id that remains the same across successive frames.
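In a graph, a tracker is typically chained directly after a detector, as in the graph-builder example above:
from visiongraph import vg

# detector followed by a tracker: detections now carry stable ids
tracked_pose = vg.sequence(vg.OpenPoseEstimator.create(), vg.MotpyTracker())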
DSP (Digital Signal Processing)
To filter noisy estimations or inputs, the DSP package provides different filters that can be applied directly within a graph.
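The pose graph in the Graph Builder section does exactly this by appending a LandmarkSmoothFilter after the tracker:
from visiongraph import vg

# detector -> tracker -> smoothing filter, as in the builder example
smoothed = vg.sequence(vg.OpenPoseEstimator.create(), vg.MotpyTracker(), vg.LandmarkSmoothFilter())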
Recorder
To record incoming frames or annotated results, multiple frame recorders are provided.
Assets
Most estimators use large model and weight files for their neural networks. To keep visiongraph small and easy to install, these assets are hosted externally on GitHub. Visiongraph provides a system to download and cache these files automatically.
Argparse
To support rapid prototyping, many graph and estimator options are already provided and can be added to an argparse parser.
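A rough sketch of the idea follows; the hook names add_params() and configure() are assumptions about the API and may differ:
from argparse import ArgumentParser
from visiongraph import vg

parser = ArgumentParser()
input_node = vg.VideoCaptureInput()
input_node.add_params(parser)  # assumed: the node registers its CLI options
args = parser.parse_args()
input_node.configure(args)     # assumed: the node reads its options back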
Logging
To enable logging for visiongraph imports, set the following environment variable:
# zsh / bash
export VISIONGRAPH_LOGLEVEL=INFO
# cmd
set VISIONGRAPH_LOGLEVEL=INFO
# powershell
$env:VISIONGRAPH_LOGLEVEL="INFO"
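Alternatively, the variable can be set from Python, as long as it happens before the first visiongraph import:
# set the log level before importing visiongraph
import os
os.environ["VISIONGRAPH_LOGLEVEL"] = "INFO"

from visiongraph import vg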
Extras
It is possible to install extra modules for visiongraph by specifying them when installing. Here is a list of the currently supported extras:
- realsense - Support for Intel RealSense cameras
- azure - Support for Microsoft Azure Kinect cameras
- depthai - Support for the Luxonis cameras
- openvino - Support for the Intel OpenVINO machine learning framework
- mediapipe - Support for the Google MediaPipe machine learning framework
- onnxruntime - Support for the ONNX machine learning framework (CPU)
- onnxruntime-gpu - Support for the ONNX machine learning framework (CUDA GPU)
- onnxruntime-directml - Support for the ONNX machine learning framework (DirectML GPU)
- media - Support for VidGear and MoviePy video reading and writing
- numba - Improved performance for smoothing and tracking algorithms
- fbs - Support for framebuffer sharing (SpoutGL or Syphon)
- faiss - Support for fast pose classification
- mot - Support for multi-object tracking using motpy