Visioner

The Visioner is a library that implements face detection, keypoint localization and pose estimation in still images using boosted classifiers. For the time being, we only provide a limited set of interfaces allowing detection and localization. You can incorporate a call to the Visioner detection system in your script in three ways:

  1. Use simple (single) face detection with bob.visioner.MaxDetector:

    In this mode, the Visioner will only detect the most likely face in a given image. It returns a tuple containing the detection bounding box (top-left x, top-left y, width, height, score). Here is a usage example:

    import bob

    detect_max = bob.visioner.MaxDetector()
    image = bob.io.load(...)   # fill in your image file path
    bbox = detect_max(image)   # (top-left x, top-left y, width, height, score)
    

    With this technique you can control:

    • the number of scanning levels;
    • the scale variation in pixels.

    Consult the built-in documentation via help() for operational details.
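
    For instance, here is a minimal sketch of tuning these parameters at construction time. The keyword names scanning_levels and scale_variation below are assumptions made for illustration, so verify the actual spelling with help(bob.visioner.MaxDetector):

    import bob

    # hypothetical keyword names, shown for illustration only
    detect_max = bob.visioner.MaxDetector(scanning_levels=2, scale_variation=4)
    image = bob.io.load(...)  # fill in your image file path
    bbox = detect_max(image)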

  2. Use simple face detection with bob.visioner.Detector:

    In this mode, the Visioner will return all bounding boxes scoring above a given threshold in the image. It returns a tuple of tuples, ordered by descending score, each containing a detection bounding box (top-left x, top-left y, width, height, score). Here is a usage example:

    import bob

    detect = bob.visioner.Detector()
    image = bob.io.load(...)  # fill in your image file path
    bboxes = detect(image)    # a tuple of (x, y, width, height, score) tuples
    

    With this technique you can control:

    • the minimum detection threshold;
    • the number of scanning levels;
    • the scale variation in pixels;
    • the non-maximum suppression (NMS) clustering overlap threshold.

    Consult the built-in documentation via help() for operational details.
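
    Building on the bboxes tuple returned above, here is a short sketch that keeps only the strongest detections; the 10.0 cut-off is an arbitrary illustration value, not a recommended setting:

    # each entry is (top-left x, top-left y, width, height, score)
    strong = [bbox for bbox in bboxes if bbox[4] >= 10.0]
    for (x, y, width, height, score) in strong:
        print("face at (%d, %d), size %dx%d, score %.2f" % (x, y, width, height, score))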

  3. Use key-point localization with bob.visioner.Localizer:

    In this mode, the Visioner will return a single bounding box and the x and y coordinates of every detected landmark in the image. The number of landmarks following the bounding box is determined by the loaded model. Bob ships with two basic models:

    • bob.visioner.DEFAULT_LMODEL_EC: this is the default model used for keypoint localization if you don't provide anything to the bob.visioner.Localizer constructor. A call to the function operator (__call__()) will return the bounding box followed by the coordinates of the left and right eyes respectively. The format is (top-left b.box x, top-left b.box y, b.box width, b.box height, left-eye x, left-eye y, right-eye x, right-eye y).
    • bob.visioner.DEFAULT_LMODEL_MP: this is an alternative model that can be used for keypoint localization. A call to the function operator with a Localizer equipped with this model will return the bounding box followed by the coordinates of the eye centers, eye corners, nose tip, nostrils and mouth corners (always left and then right coordinates, with the x value coming first followed by the y value of the keypoint).

    Note

    No scores are returned in this mode.

    Example usage:

    import bob

    locate = bob.visioner.Localizer()
    image = bob.io.load(...)  # fill in your image file path
    # (b.box x, b.box y, b.box width, b.box height, x1, y1, x2, y2, ...)
    bbx_points = locate(image)
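
    Given the flat tuple layout described above, here is a small sketch that splits the result into the bounding box and a list of (x, y) keypoint pairs:

    bbox = bbx_points[:4]      # (top-left x, top-left y, width, height)
    coords = bbx_points[4:]    # flat sequence: x1, y1, x2, y2, ...
    keypoints = list(zip(coords[0::2], coords[1::2]))  # [(x1, y1), (x2, y2), ...]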
    

    With this technique you can control:

    • the number of scanning levels;
    • the scale variation in pixels.

    Consult the built-in documentation via help() for operational details.

We provide two applications that are shipped with Bob:

  • visioner_facebox.py: This application takes either a video or an image file as input and outputs bounding boxes for the faces detected in it. It uses bob.visioner.MaxDetector for this purpose. You can configure, via command-line parameters, the number of scanning levels or the use of a user-provided classification model for face detection;
  • visioner_facepoints.py: This application is similar to the facebox script, but detects both the face and keypoints in the given video or image. You can configure the number of scanning levels, or provide external classification and localization models. By default, this program uses the default localization model provided by Bob, which detects eye centers.

The face detection and keypoint localization programs can optionally create an output video or image with the face bounding box and localized keypoints drawn, for debugging purposes. Look at their help messages for more instructions and examples.
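
If you want to inspect a detection interactively instead of through these scripts, a sketch along the following lines overlays the bounding box on the input image; it uses matplotlib, which is not part of Bob:

    import bob
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches

    image = bob.io.load(...)  # fill in your image file path
    x, y, width, height, score = bob.visioner.MaxDetector()(image)

    fig, ax = plt.subplots()
    ax.imshow(image, cmap='gray')
    # draw the detection as an unfilled red rectangle
    ax.add_patch(patches.Rectangle((x, y), width, height, fill=False, edgecolor='red'))
    plt.show()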

Visioner Resources

DEFAULT_DETECTION_MODEL                      The default face detection model (a string)
DEFAULT_LOCALIZATION_MODEL                   The default keypoint localization model (a string)
MaxDetector([model_file, threshold, ...])    A class that bridges the Visioner to bob so as to detect the most likely face in an image
Detector([model_file, threshold, ...])       A class that bridges the Visioner to bob so as to detect faces in an image
Localizer([model_file, method, detector])    A class that bridges the Visioner to bob to localize keypoints in an image
