Motion Tracking Workshop April 2022 Report

A few weeks ago (8-10 April), to spontaneously celebrate the 10th anniversary of the Tears of Steel project, a mini-workshop dedicated to VFX-related topics was organised at the Blender HQ. During this workshop Sebastian König and I went over the accumulated backlog of tasks and ideas, and came up with an updated roadmap for the VFX module. This is a summary of the results of that workshop.

Sergey Sharybin

Mask editor readability patches

We kick-started the workshop by going over open patches which got stalled due to lack of clarity in the design. There is a set of patches done by Simon Lenz:

  • D13317 Mask editor: add mask spline width slider
  • D13314 Mask Editor: Add toggle for mask spline drawing
  • D13284 Add mask blending factor to mask editor
  • D12776 ClipMasks: add color spline outline mode

The main challenge was to define what should be configurable on a per-file basis, and what should be part of a theme. With Sebastian's feedback, and by checking on test footage, it turned out that per-file configuration is the better solution. This is due to the different types of footage and ways of color-coding splines (green screen footage, blue screen footage, regular footage, a single character in the shot, multiple characters in the shot, etc.). So it is not really different from how track colors or viewport colors for objects are used.

The agreed design is that the system will automatically suggest a color that is not very similar to the existing ones, which can then be overridden by the artist.
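As a rough illustration of the idea (not the actual implementation), stepping the hue by the golden ratio conjugate is a common way to generate a sequence of colors where consecutive entries stay clearly distinct; all names below are hypothetical:

    # A minimal sketch of suggesting "not very similar" colors by stepping
    # the hue around the color wheel by the golden ratio conjugate.
    import colorsys

    GOLDEN_RATIO_CONJUGATE = 0.618033988749895

    def suggest_spline_color(spline_index, saturation=0.75, value=0.95):
        """Return an (r, g, b) suggestion for the spline with the given index."""
        hue = (spline_index * GOLDEN_RATIO_CONJUGATE) % 1.0
        return colorsys.hsv_to_rgb(hue, saturation, value)

    # Consecutive splines get clearly distinguishable colors:
    for i in range(4):
        print(suggest_spline_color(i))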

Proportional editing

This is an old open patch (D2771). Proportional editing is designed to smooth out jumps when a track slides off its feature and an artist needs to manually re-position it. Currently, adjusting such discontinuities can cause further jumps in the camera solution.

Unlike other cases where proportional editing happens in 2D or 3D space, in motion tracking we need to do proportional editing in the time domain. This poses the interesting challenge of finding a good way to visualise the area of effect.
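To make the idea concrete, here is a minimal sketch of what a time-domain falloff could look like: a correction applied at one frame is spread over markers on neighboring frames with a smooth falloff. The Track/Marker containers are hypothetical stand-ins, not the Blender API.

    # Hypothetical containers mirroring how a track stores per-frame markers.
    class Marker:
        def __init__(self, frame, co):
            self.frame, self.co = frame, list(co)

    class Track:
        def __init__(self, markers):
            self.markers = markers

    def falloff(t):
        """Smooth falloff: 1 at the center, 0 at the edge of the range."""
        t = max(0.0, min(1.0, t))
        return 1.0 - (3.0 * t * t - 2.0 * t * t * t)

    def proportional_edit(track, frame, delta, radius):
        """Offset the marker at `frame` by the 2D `delta`, spreading the
        correction over markers within `radius` frames."""
        for marker in track.markers:
            distance = abs(marker.frame - frame)
            if distance > radius:
                continue
            weight = falloff(distance / radius)
            marker.co[0] += delta[0] * weight
            marker.co[1] += delta[1] * weight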

The solution we agreed on is to use a color overlay for the track path: a solid color at the center of the path, fading out to the default red/blue path colors towards the edge of the proportional editing range. Using the cache line for visualizing affected frames was also discussed. It does sound like an interesting idea to explore further, although with low priority. The issue with the cache line is that it is almost out of sight when artists work on individual tracks.

Last but not least: there are many ways to transform tracks. This can be done with a preview widget, or with a click-slide operator. Both ways of interaction use code paths separate from the general transform system. On a user level they would behave the same as a regular transform, but under the hood we need to disentangle some code and share as much of it as possible.

Migrate Movie Clip Editor to new tool system

During the 2.8 days, a migration of the Clip Editor to the tool system was started (D10198). The challenge here is that motion tracking tools work differently from generic 3D modeling tools.

VFX tools prototype

I did some improvements and cleanup in the tracking_tools branch before the workshop, so now the existing tools work nicely (unfortunately, operation with the left-mouse-select configuration is temporarily broken!). During the workshop we went over the current state of the project, and realised the following:

  • Pie menus are very powerful in the clip editor!
  • We’re much closer to being able to switch to the tool system than it seemed

We agreed that as a first step only a handful of tools will exist: add/slide marker in tracking mode, and spline operation tools in masking mode. The rest of the tracking tools will be accessible via the header, menu, and a pie menu (depending on how frequently those tools are used). This will de-clutter the interface and make space for further tools.

An example of such a new tool is the Orientation tool. But more about it later, as it might not be part of the initial “toolification” of the Clip Editor.

For the Masking Mode a better add-primitive tool is possible (a minimal sketch of the gesture mapping follows the list):

  • For adding a circle, using click-drag will define the circle center and its radius.
  • Similarly for a rectangle: click-drag defines two opposite corners of the rectangle.
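As a small sketch of the gesture mapping (illustrative only, not the tool code):

    import math

    def circle_from_drag(start, end):
        """Drag start is the circle center, drag length is the radius."""
        radius = math.hypot(end[0] - start[0], end[1] - start[1])
        return start, radius

    def rectangle_from_drag(start, end):
        """Drag start and end are two opposite corners of the rectangle."""
        x0, x1 = sorted((start[0], end[0]))
        y0, y1 = sorted((start[1], end[1]))
        return (x0, y0), (x1, y1)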

General improvements and features

When it comes to improvements, there are a few small things which will improve quality of life: for example, implementing double-click to close masks, or fixing the single-point handle, which is currently 90 degrees out of phase when tweaked with the mouse. But there are also bigger new features to be implemented.

Tracking: guided tracks. The idea is to use the “flow” of already-tracked features to improve the robustness of the motion tracker. In shaky (hand-held) shots Kalman-filter based prediction might fail, and using information from adjacent tracks can give a more reliable estimate of where to look for a feature on the next frame.

This will allow the following workflow: on a shaky shot, creating the first good track will still be challenging for artists, requiring a lot of manual correction. Adding more tracks will become easier, as the tracker algorithm will have more information to work with.
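To illustrate the principle (this is a sketch of the idea, not the actual tracker code), the prediction could be an inverse-distance weighted average of the displacements of nearby tracks:

    # A minimal sketch of "guided" prediction: estimate where a feature moves
    # on the next frame from the motion of nearby, already-tracked points.

    def predict_position(position, neighbors, eps=1e-8):
        """Predict the next-frame position of a feature at `position` from
        `neighbors`: a list of (point, displacement) pairs for tracks that
        already have markers on both frames."""
        weighted, total = [0.0, 0.0], 0.0
        for point, displacement in neighbors:
            # Closer tracks carry more weight (inverse-distance weighting).
            distance_sq = (point[0] - position[0]) ** 2 + (point[1] - position[1]) ** 2
            weight = 1.0 / (distance_sq + eps)
            weighted[0] += weight * displacement[0]
            weighted[1] += weight * displacement[1]
            total += weight
        if total == 0.0:
            return position  # No guidance available; fall back to no motion.
        return (position[0] + weighted[0] / total,
                position[1] + weighted[1] / total)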

Tracking: Scene orientation tool. The idea is to offer a more interactive experience: an interactive visualization of the floor/walls, and easier tweaking of the scale and of a possible offset from a reconstructed track position (i.e. there might not be a good track where the scene origin is desired to be). Scale, orientation, and so on will be tweakable via widgets.

In order to get a good scene orientation experience, extra smartness will be added to the solver: the goal is to preserve the scene orientation as well as possible when the scene is re-solved after manual orientation has been performed.
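One plausible building block for this (an assumption on my side, not a committed design) is to register the newly solved point cloud against the previous one with a best-fit similarity transform, Kabsch/Umeyama style, and apply that transform to the new reconstruction:

    # A minimal sketch: find scale s, rotation R, translation t that best map
    # the new reconstruction onto the old one, given corresponding Nx3 arrays.
    import numpy as np

    def align_reconstruction(new_points, old_points):
        mu_new = new_points.mean(axis=0)
        mu_old = old_points.mean(axis=0)
        X = new_points - mu_new
        Y = old_points - mu_old
        U, S, Vt = np.linalg.svd(X.T @ Y)
        d = np.sign(np.linalg.det(Vt.T @ U.T))
        D = np.array([1.0, 1.0, d])  # Guard against reflections.
        R = Vt.T @ np.diag(D) @ U.T
        s = (S * D).sum() / (X ** 2).sum()
        t = mu_old - s * (R @ mu_new)
        return s, R, t

Applying s * R @ p + t to every newly reconstructed point (and to the camera path) would then map the fresh solve back into the previously oriented space.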

Tracking: dopesheet. Default to sorting by average error, with the highest error on top.

Tracking: “bake” plane track. Allow the creation of an image datablock from what the plane track “sees”. This will enable a much easier workflow for cleaning up billboards (or walls, or screens, or anything that roughly represents a flat rectangular shape in the real world :)
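Since a plane track provides four corner positions per frame, baking essentially amounts to warping the frame through a homography into a rectangular image. A minimal sketch (plain numpy, nearest-neighbor sampling to keep it short; not the actual implementation):

    import numpy as np

    def homography_from_corners(corners):
        """Homography H mapping unit-square corners (0,0),(1,0),(1,1),(0,1)
        to the given four (x, y) corners, via direct linear transform."""
        src = [(0, 0), (1, 0), (1, 1), (0, 1)]
        A = []
        for (u, v), (x, y) in zip(src, corners):
            A.append([u, v, 1, 0, 0, 0, -x * u, -x * v, -x])
            A.append([0, 0, 0, u, v, 1, -y * u, -y * v, -y])
        _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
        return Vt[-1].reshape(3, 3)  # Null-space vector of the DLT system.

    def bake_plane_track(frame_pixels, corners, width, height):
        """Resample what the plane track 'sees' into a (height, width) image."""
        H = homography_from_corners(corners)
        baked = np.zeros((height, width, frame_pixels.shape[2]))
        for j in range(height):
            for i in range(width):
                p = H @ np.array([(i + 0.5) / width, (j + 0.5) / height, 1.0])
                x, y = int(p[0] / p[2]), int(p[1] / p[2])
                if 0 <= x < frame_pixels.shape[1] and 0 <= y < frame_pixels.shape[0]:
                    baked[j, i] = frame_pixels[y, x]
        return baked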

Masking: mask direction should not matter. Not sure there is much more to be said here. It is annoying for artists to worry about the winding of mask points. We need to do something about it :)
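One straightforward fix, sketched below under the assumption that a spline can be treated as a closed 2D polygon, is to normalise the winding before rasterization using the signed area:

    def signed_area(points):
        """Shoelace formula: positive for counter-clockwise point order."""
        area = 0.0
        for i, (x0, y0) in enumerate(points):
            x1, y1 = points[(i + 1) % len(points)]
            area += x0 * y1 - x1 * y0
        return 0.5 * area

    def normalize_winding(points):
        """Return the points in counter-clockwise order regardless of how
        the artist drew them."""
        return points if signed_area(points) >= 0.0 else points[::-1]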

Masking: feather inside. The current masking design only allows the feather to be added to the outside. For some shots this works fine; for others it requires a lot of planning before the artist starts defining the mask. Being able to define the outer spline of an object and feather inwards will bring joy to artists! In technical terms it is something like inverting the meaning of the mask spline and the feather spline when passing them to the rasterizer.

Masking: feather offset. Currently the feather fade-out always begins at the mask spline. But it doesn’t have to! The idea is to allow a “dual-band” feather control: by default the fade-out will be defined by the mask spline and the feather, but it will be possible to “offset” the feather relative to the mask.
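In terms of the rasterized alpha, the idea could look something like this sketch (names and the exact falloff curve are illustrative, not the actual design):

    def feather_alpha(distance, offset, feather):
        """Alpha at `distance` outward from the mask spline, with the
        fade-out starting at `offset` instead of directly at the spline."""
        if distance <= offset:
            return 1.0
        if distance >= feather:
            return 0.0
        t = (distance - offset) / (feather - offset)
        return 1.0 - (3.0 * t * t - 2.0 * t * t * t)  # Smooth fade.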

Masking: transform cage. A tool to transform mask points using a cage, similar to how it is done in edit mode for objects.

Masking: inset tool. Shrink or expand a mask by offsetting its points from their original position along their normals.
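A minimal sketch of the geometry, assuming a closed counter-clockwise 2D point loop (the real tool would operate on spline control points and their handles):

    import math

    def inset_points(points, amount):
        """Offset a closed 2D point loop by `amount` along point normals
        (positive grows the mask, negative shrinks it, for CCW winding)."""
        n, result = len(points), []
        for i, (x, y) in enumerate(points):
            px, py = points[i - 1]
            nx, ny = points[(i + 1) % n]
            # Edge directions into and out of this point.
            d0 = (x - px, y - py)
            d1 = (nx - x, ny - y)
            # Average of the two edge normals (edges rotated by 90 degrees).
            mx, my = d0[1] + d1[1], -(d0[0] + d1[0])
            length = math.hypot(mx, my) or 1.0
            result.append((x + amount * mx / length, y + amount * my / length))
        return result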

Masking: layer sets. Currently the active mask layer only defines the drawing order. Editing masks from non-active layers is still possible by default, and can only be prevented via the per-layer Lock control. This leads to a cumbersome workflow when multiple layers are used for an object. While the final design still needs polishing, the current idea is to allow multiple selection in the mask layer list and disallow editing of anything that does not belong to one of the selected layers.

Masking: an improvement is needed for parenting a mask to a rotating object. Currently a plane track is the only way to do so in an automated manner, but in cases like face masking it is hard to find four points to solve a plane.

Compositor: active scene clip. The idea is to pre-fill all motion-tracking related nodes with the scene’s active movie clip. A small feature, but it saves some clicks.
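Expressed as a short bpy sketch of the behavior (the actual change would live in the node-adding code itself):

    import bpy

    scene = bpy.context.scene
    tree = scene.node_tree  # Compositing node tree; scene.use_nodes must be enabled.

    # When adding a motion-tracking node, pre-fill it with the scene's
    # active clip instead of leaving it empty.
    node = tree.nodes.new(type='CompositorNodeMovieClip')
    if scene.active_clip is not None:
        node.clip = scene.active_clip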

VFX workspace: add Image editor. The workspace does need an Image editor, especially after the addition of plane track baking.

Green button

We’ve discussed how to move forward with the “green button” project: implementing a single button which takes care of tracking and solving the entire shot. This is ideal when dealing with long shots with many features appearing and disappearing. This project is more about math than about UI/UX. The latter is simple: just a single button in the interface, right? :)

Digital sets

This is an exciting topic! It is a very interesting collaboration between various projects which are already happening, or are planned to happen, within Blender.

Depending on who you ask, a digital set might imply different things, but the common ground is that it is a technique to simplify/streamline the VFX process. In practice it could be a special setup which takes care of keying and compositing live action footage with CG content. A more exciting aspect of a digital set is extending the live action set with CG content via a projection of some sort (a projector or a large screen).

In any case, it is essential to acquire or solve the actual camera motion, and to do compositing tricks. We’ve looked into different ways of doing so, and the most promising, in terms of both ease of setup and accuracy of the solve, is to use VR trackers (or even controllers) attached to the real camera. The alternative could be to use lidar data and the video stream from a phone, and solve everything manually, but this has feedback loop issues with the digital set projector, and is also prone to solver stability issues. Leveraging VR technology has the advantage of building on existing rock-solid technology, and, drumroll, allows putting the DP into the VR world!

This is already kind of possible, but the setup is tricky and there is room for streamlining and making it easier to set up. The project consists of the following steps:

  • Streamline controllers and VR trackers setup in Blender XR
  • Implement XR data “capture” so that it is possible to re-create it exactly as a post-process (i.e. improve the preview quality with the most realistic rendering, which cannot happen in real time); a rough sketch of this idea follows the list
  • A similar thing would need to happen for capturing the video stream from the camera
  • Real-time compositing (a keying node) is needed. This seems to be aligned with the realtime compositing project
  • Some sort of viewport split is needed: output the compositing setup to a director’s monitor, but also output a separate viewport which will be used for the digital set projection
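As a rough sketch of the capture idea mentioned above (illustrative only; a real implementation would record the raw XR tracker data at a lower level), a frame-change handler could keyframe the camera pose that the VR tracker drives, so the take can be re-rendered offline at full quality:

    import bpy

    def capture_camera_pose(scene, *args):
        """Keyframe the active camera's pose on every frame change.
        (*args absorbs the depsgraph argument passed by newer Blender.)"""
        camera = scene.camera
        camera.keyframe_insert(data_path="location", frame=scene.frame_current)
        camera.keyframe_insert(data_path="rotation_euler", frame=scene.frame_current)

    bpy.app.handlers.frame_change_post.append(capture_camera_pose)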

This is just a quick overview. A clearer design presentation deserves its own dedicated blog post.

Closing words

It was very exciting to have such a workshop! We will surely be sharing more details as we get into each of the projects.

Here is a preview of the upcoming tracking UI, tested during the workshop. Enjoy!


9 comments
  1. I would like to suggest features that I would love to see.
    A unified system of “weight”. Currently, we have tracker weight to influence the final solution, but I would like to suggest using this same functionality for other aspects of tracking/matchmove as well. For example:
    – a weighting system for constraints.
    – Track1 Track2 —— X[x], Y[ ], Z[ ] — weight [1.0]
    to describe a constraint between two trackers that we want to share the same EXACT X position.
    – Track2 Track3 —— X[ ], Y[x], Z[ ] — weight [0.33]
    to describe a constraint between two trackers that we want to share the same exact Y position, BUT we are only 33% sure that they do. For example, a tracker on the ground. This means we are creating a system where we know what we want, but we allow somewhat different Y values for trackers that should share a “similar” Y position.
    – Track4 Track5 —— X[ ], Y[x], Z[x] — weight [1.0]
    We are SURE these two trackers share the same exact Y and Z values, but not X. So the pair is parallel to the X axis.
    – We can do the same for other parameters: Focal Length [50mm] weight [0.75] means we are NOT sure it is 50mm, so we somewhat allow the system to go to a different value, but not too far. Lens distortion k1, k2, k3 could also get a weight [0.0]-[1.0].
    – Also we could have a constraint where we tell the system NOT to solve a jittery camera path, with a “filtering” or “smoothness” value working with a weight [0.0]-[1.0] to control how much we REALLY want the camera path to be smooth, in exchange for maybe a higher “solve error”.

  2. In regards to more interactive controls to re-orient the scene, it might be worth also thinking about using concepts from fspy/this addon:

    https://twitter.com/markkingsnorth/status/1493592691191566340?s=20&t=8d9h2PqTEoy9Quk9DhvxYg

  3. It would be great to add survey functionality. This is crucial for tracking shots with low/no parallax and aligning multiple cameras to the same set. Basically the ability to tag trackers with known 3D locations, or group trackers from different shots that represent the same 3D location, and have the solver factor those in. I’ve written about this in more detail here:

    https://blender.community/c/rightclickselect/Brx2/

    For 3D tracking, it is very common to take a wide-angle “Set Survey” shot where you walk through the set, getting lots of parallax, then track that scene to build a very accurate 3D representation of the set. Then for each VFX shot, you just track some markers that also exist in the Set Survey you took, and your camera will solve and be aligned to the 3D scene exactly where it would be in real world space. The best part is that the VFX shot can have virtually no parallax, but because the solver knows where those 2D trackers should be in 3D space (thanks to the Set Survey), it takes all the guesswork out of aligning multiple shots to the same scene.

    • Indeed.
      We’ve also discussed a constraint system like “TrackA is in front of TrackB” to help deal with shots for which you don’t have survey data.
      But there are no concrete plans laid down in this field yet.

      • Thanks Sergey,
        Good to know there aren’t any concrete plans yet. I wish I was a developer so I could contribute in a meaningful way.

        You may have seen this add-on that was released recently which uses the opencv Perspective-n-Point methods to solve a stills camera using 2D markers and known 3D positions:
        https://blender.community/c/today/XYog

        This is bypassing Blender’s built-in solver, and is rather limited, but could provide some inspiration.

        • Another vote for survey tracking, and the approach Philip wrote about in his rightclickselect post. The modern scanning abilities in phones make it quite quick to generate an accurate 3D model of the working space.

  4. Blender for VFX is great, but it is missing some advanced 3D tracking features. It could take inspiration from PFTrack’s ability to create textures from 2D images and generate a 3D model, for an easy and smooth process; that would be very nice :). Automatic rotoscoping for detecting characters in video would also be welcome, and, if possible, why not some advanced planar tracking with mesh deformation like Mocha Pro? It would open a lot of doors for VFX artists as well. There is a lot of room to improve for VFX… Thank you for your ideas for VFX, it is important to us. You dev team are awesome!

  5. A very interesting idea that will greatly simplify the VFX process, hooray!

  6. I don’t think traditional editing tools are suitable for existing workflows.
    If you cut a 3D scene in Blender’s sequencer, you will not see the rendered result.
    Generally speaking, the files rendered by Blender will be re-graded in DaVinci Resolve or Premiere.

    Recommendations:
    There are a lot of sequences in the movie editor, and I want them to output to different folders.
    Clicking the render button should render the selected sequence instead of rendering the entire timeline as one.
    If a rendered sequence is moved or cut, the Blender movie editor should play the rendered sequence.
