Cycles X Project Update

The rendering improvements from the Cycles X project will be in the upcoming Blender 3.0 release. Since the announcement, developers have been working to complete and stabilize the code, as well as add new features and improve performance.

We’ll give a quick overview of recent developments.

GPU Performance

GPU rendering performance has been further improved. Here’s where we stand compared to 2.93.

Render time on an NVIDIA Quadro RTX A6000 with OptiX

This is an accumulation of many incremental changes. For details, see the GPU kernel documentation and GPU performance development tasks.

At the time of the initial announcement there was no volume rendering support. Since then we have restored volume rendering, and found that GPU rendering performance improved 3-5x in various volume scenes.


Hair and Shadow Improvements

While most benchmark scenes were rendering faster with Cycles X, a few involving many layers of transparent hair were showing performance regressions compared to 2.93.

One issue we found is that in GPU rendering, if only a small subset of the whole image is slow to render (like a character’s hair), GPU occupancy would be low. We improved this by making the algorithm that estimates the number of samples to render in one batch smarter. Previously we’d end up rendering 1 sample at a time. Now we detect low GPU occupancy and adaptively increase the number of samples to batch together, which then increases occupancy.
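As a rough illustration of the adaptive batching idea, here is a minimal Python sketch. The function names, the doubling strategy, the 50% occupancy threshold and the path capacity are all assumptions made for this example; the actual heuristic used in Cycles is not described in this post and may differ.

```python
def occupancy(active_pixels, samples_per_batch, max_paths):
    """Fraction of the GPU's path capacity that one batch would fill."""
    return min(1.0, active_pixels * samples_per_batch / max_paths)


def adapt_samples_per_batch(active_pixels, max_paths,
                            low_occupancy=0.5, max_samples=1024):
    """Hypothetical heuristic: double the batch size while occupancy is low."""
    if active_pixels == 0:
        return 1  # nothing left to render, no reason to grow the batch
    samples = 1
    while (occupancy(active_pixels, samples, max_paths) < low_occupancy
           and samples < max_samples):
        samples *= 2
    return samples


# Only 1000 pixels (say, a character's hair) are still rendering,
# on a device assumed to keep 16384 paths in flight.
print(adapt_samples_per_batch(active_pixels=1000, max_paths=16384))
```

In this toy model a full image keeps rendering one sample per batch, while a small slow region gets several samples batched together until the device is reasonably busy.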

Another part of the solution was to change the shadow kernel scheduling. Previously, continuing to the next bounce would have to wait for all light and shadows to be resolved at the previous bounce. Now this is decoupled, and shadow tracing work for many bounces can be accumulated in a queue. This then gives a bigger number of shadow rays to trace at once, improving GPU occupancy. This matters especially when only a small number of pixels are going 64 bounces deep into transparent hair, as in the Spring scene.
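The effect of decoupling can be sketched with a toy model in Python. This is not how the Cycles kernels are written; the function names, the one-shadow-ray-per-path assumption and the batch size of 1024 are hypothetical and only serve to show why a queue yields larger batches.

```python
def coupled_batches(paths_per_bounce):
    """Shadow rays are traced right after each bounce: one batch per bounce."""
    return list(paths_per_bounce)


def decoupled_batches(paths_per_bounce, batch_size):
    """Shadow rays from successive bounces pile up in a queue and are
    traced only once a full batch is available."""
    queued = 0
    batches = []
    for num_paths in paths_per_bounce:
        queued += num_paths  # assumption: one shadow ray per active path
        while queued >= batch_size:
            batches.append(batch_size)
            queued -= batch_size
    if queued:
        batches.append(queued)  # flush whatever is left at the end
    return batches


# 100 pixels going 64 bounces deep into transparent hair.
bounces = [100] * 64
print(len(coupled_batches(bounces)), "small batches when coupled")
print(decoupled_batches(bounces, batch_size=1024))
```

The total number of shadow rays is the same in both cases; only the size of each batch handed to the GPU changes.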

Further, we found that transparency in hair is usually quite simple, either a fixed value or a simple gradient to fade out from the root to the tip. Instead of evaluating the shader for every shadow intersection, we now bake transparency at hair curve keys and simply interpolate them. Render results are identical in all scenes we tested. Below are two sample images to compare the results.
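The baking idea can be sketched in a few lines of Python. This is a simplified illustration rather than the Cycles implementation: the shader is a stand-in function of the position along the curve, and the function names and the evenly spaced keys are assumptions.

```python
def bake_transparency(shader, num_keys):
    """Evaluate the (hypothetical) transparency shader once per curve key.
    Keys are assumed evenly spaced from root (0.0) to tip (1.0); num_keys >= 2."""
    return [shader(i / (num_keys - 1)) for i in range(num_keys)]


def lookup_transparency(baked, u):
    """Linearly interpolate the baked values at curve parameter u in [0, 1]."""
    u = min(max(u, 0.0), 1.0)
    x = u * (len(baked) - 1)
    i = min(int(x), len(baked) - 2)  # index of the key to the left of u
    t = x - i                        # position between key i and key i + 1
    return (1.0 - t) * baked[i] + t * baked[i + 1]


def root_to_tip_fade(u):
    """Example shader: opaque at the root, fully transparent at the tip."""
    return u


baked = bake_transparency(root_to_tip_fade, num_keys=5)
# Each shadow intersection now costs one interpolation, not a shader evaluation.
print(lookup_transparency(baked, 0.6))
```

For a fixed value or a linear gradient like this one, interpolating between keys reproduces the shader result, which fits the observation that renders stay the same.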

For the statistics enthusiast, here are some memory and timing results for a few well-known scenes, showing the effect of the transparent hair baking and the shadowing optimizations (“ref” is the reference without any optimizations).

Shadow Optimization Results

Distance Scrambling aka Micro-Jittering

Sobol & Progressive Multi-Jitter (PMJ) can now use distance scrambling (or micro-jittering) to improve GPU rendering performance by increasing the correlation between pixels. There is also an automatic scrambling option that chooses a scrambling distance value for you. These are available in the advanced settings in the render properties.

To render the above images the scrambling distance was set to zero to maximize the correlation between pixels. This should not be used in practice and was only done to make it easier to see the correlation introduced by the micro-jittering (notice the girl’s shoulder in the images above to the right). In a real setting you would generally use a larger distance to hide these artifacts. This technique can result in less noisy images and, in some cases, improved performance in the range of 1% to 5% depending on your rendering setup (it’s only beneficial for GPU rendering). Below are some performance results using the automatic scrambling distance, which currently does not work so well for CUDA due to the tile sizes. Work is underway to choose better tile sizes for CUDA, which should result in better performance.
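To give an intuition for what “correlation between pixels” means, here is a small Python sketch. It assumes that every pixel starts from the same base sample and adds its own random offset, scaled by the scrambling distance and wrapped back into [0, 1). The function names and the seeding scheme are made up for this example, and the real Cycles sampler may work differently in detail.

```python
import random


def pixel_shift(pixel_index, dimension):
    """Hypothetical per-pixel random offset in [0, 1), fixed for a given
    pixel and sample dimension."""
    return random.Random(pixel_index * 1000003 + dimension).random()


def scrambled_sample(base_sample, pixel_index, dimension, scrambling_distance):
    """Offset a shared base sample by the pixel's shift, scaled by the
    scrambling distance, and wrap the result back into [0, 1)."""
    shift = pixel_shift(pixel_index, dimension) * scrambling_distance
    return (base_sample + shift) % 1.0


base = 0.25  # one coordinate of a shared Sobol or PMJ sample (value made up)
for distance in (0.0, 0.1, 1.0):
    values = [scrambled_sample(base, pixel, 0, distance) for pixel in range(4)]
    print(distance, [round(v, 3) for v in values])
```

With a distance of 0 every pixel uses exactly the same sample, so neighboring pixels are fully correlated. A small distance keeps the samples close together, and a distance of 1 lets each pixel’s offset range over the whole interval.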


Ambient Occlusion

Ambient occlusion did not take into account transparency in the initial version of Cycles X. We have now restored this, taking advantage of the shadow kernel improvements that also helped with hair.

Additive ambient occlusion (AO) is now also available through the Fast GI settings: a new “Add” option adds the AO result, alongside the “Replace” operation that was already available. Below are a few images to compare the results.


Denoising Improvements

We improved denoising for volumes. Previously these were mostly excluded from the albedo and normal passes used by denoisers. While there is no exact equivalent to albedo and normals on surfaces, we make an estimate. This can significantly help the denoiser to denoise volume detail.

The denoising depth pass, which was previously removed along with NLM, has also been restored.


AMD HIP

We’ve worked with AMD to bring back AMD GPU rendering support. This is based on the HIP platform. In Blender 3.0, it is planned to be supported on Windows with RDNA and RDNA2 generation discrete graphics cards. This includes Radeon RX 5000 and RX 6000 series GPUs.

We are working with AMD to add support for Linux and to investigate earlier generation graphics cards for the Blender 3.1 release. While we would have liked to support more in 3.0, HIP for GPU production rendering is still very new.

However, we think it is the right choice going forward. It lets us share the same GPU rendering kernels and features with CUDA and OptiX, whereas previously the OpenCL implementation was always lagging behind and had more limitations and bugs.

To test the HIP release you need to get the Blender 3.1 alpha and download the latest AMD drivers. (See this blog post for more information.)

Blender 2.93 vs Blender 3.0 on AMD

Apple Metal

We also recently announced a collaboration with Apple. They are contributing a Metal backend for Cycles, planned for Blender 3.1.


Future Work

Now that the new Cycles X architecture is in place, we expect that adding various new production features will be easier. This will start in 3.1 and continue through the 3.x series.

Download the latest Blender 3.1 Alpha builds to try out the new features.


58 comments
  1. [D13533 – Adding Manifold Next Event Estimation Sampling Technique]
    I wanted to add a link to the following YouTube video, which was a talk presenting “Optimised Path Space Regularisation” is from EGSR 2021 Day 1, with regard to how reflective caustics can be added in addition to MNEE (the time point in the video gets you right to the matter at hand):

    Optimised Path Space Regularisation – Improvements on MNEE – https://www.youtube.com/watch?v=u9HqKGqvJhQ&t=2081s

    Hopefully this comment can make its way to Olivier Maury (omaury) for consideration.

  2. HIP supports 6700XT? I have seen that AMD staff said Navi/Navi 2 is not supported on GitHub.

  3. hi,

    i am running into some weird gpu render performance issues with 4x 1080ti system and blender 3.0 RC…
    when rendering a very simple scene (sphere+cube) with default initial cycles settings, the rendertimes are very high (42sec). when enabling only two of the GPUs in the prefs, i get 7-10sec…
    is cycles x in blender 3.0 hardcoded to max 2 GPUs?
    the same scene renders in 2.93 (same max samples) with 4x gpus in approx. 3sec…
    also the gpu load is only between 16-26% in blender 3.0 comparing to 100% load in 2.93.x
    does anybody also experience this behaviour with multi-gpu and blender 3.0 cycles?

    • There are some known issues with scaling to more than 2 GPUs, it’s being investigated.

  4. Did seriously nobody notice that the before and after images of Koro are identical?

    • Sorry my bad. Should have read first. They are intended to be identical of course. :)

  5. Is there any possibility that blender will utilize slightly older intel dedicated graphic cards in the future?

    • We are working with Intel to add support for their discrete GPUs, though it’s hard to say at this point which range of cards will be supported or when. Discrete GPUs are a pretty recent thing for Intel though, I don’t think there are any old ones.

  6. Hi there, you guys do a great job and I thank you for it, but I was wondering if in the future blender will have a cad section so you can model based on measurements.
    Thanks

  7. OK, now lets wait for 10 years and then see if AMD is still committed to HIP and if it can actually be relied upon for production. Until then outside of testing I’m probably not very likely to use it.

  8. What CPU config is being used in the CPU tests?

    RTX A6000 is a top of the line GPU. Is it going against a 5 year old 4-core CPU or a dual 64-core AMD Epyc system?

    • No, all tests have been run on the same system.

  9. Is there any improvement in and cpu rendering or built in graphics card on laptops…

    • There have been various improvements to the cycles rendering engine such as the PMJ sampler and the hair improvements which should improve rendering on CPU also. As for the laptop GPU it depends on what you have. AMD is working hard to support as many cards as possible or if you have an Nvidia card that supports a recent CUDA version it should work also.

  10. Very cool! Any ideas when we’ll be seeing improvements to the Shadow Catcher in Cycles X?

    • Many improvements to the shadow catcher have been committed and there are more to come.

  11. I wish Blender had more controls for texture or geometry compression. When rendering a complex scene on the GPU, it’s often possible to exceed available VRAM, and there are no tools available to get a quick memory usage breakdown, and how to deal with that. You can spend a lot of time debugging your scene.

    Also, you don’t know that you exceeded available VRAM until the rendering process has already spent a couple of minutes processing the scene.

    This can also be important, if you intend to render your scene on a different GPU with less VRAM than the one you are currently using.

    I know Cycles isn’t entirely at fault here, but there would be *many* productivity speedups from having such tools.

    • Yes I agree, this was something I was looking into earlier using compression or tiling to control the amount of RAM used for textures. Unfortunately, I don’t have a time frame for this, if I remember correctly I think someone started a patch but I can’t seem to find it now :-(

  12. I would love to see tiles brought back, as I am unable to render high resolution images anymore. I’ve tried rendering the same scenes with the old version of blender (tile rendering) and they render just fine. However, when I try the beta (progressive rendering) I cannot set my resolution higher than 6000px before it overloads the VRAM on my rtx3060.

  13. Hello William, thanks for the update, any chance you’ve got a link that explains scrambling distance ? I have no idea what “correlation between pixels” means

  14. Amazing job, it’s so good to have an open source software evolving like this ! Thanks a lot ! 🙌❤️

  15. In an upcoming version, I hope to have the same things as UDM and Nanite. They’re the new standard for efficiently running large and detailed scenes.

  16. Hello William, thanks for all nice improvements

    I have an inquire, i have made in today the download of the RC version and in testing the known BMW27 scene i have this results
    Optix RTX3060+5800H =13.8 seconds
    Optix RTX3600 only = 12.8 sec

    Why the R5800H is detrimental to the rendering speed? I would expect it would help something not the inverse.

    PS: i am willing to help test anything you need.

    • Unfortunately multi-device support is still a work in progress. I believe that in this case the optimal tile sizes for each device differ and this is causing a slow down as it causes the program to wait on the slower device on each iteration.

      Thank you for offering to help, all help is greatly appreciated. If you go to blender.chat you can link up with the developers to offer your help. For more details, check out here.

  17. That performance and development pace!

  18. I wonder how devs take this “i want lumen / nanite ” thing

    • Feedback and requests are welcome. You can make a request on Cycles requests. Here you can detail your request and get feedback from others.

  19. Great work and I’m looking forward for AMD GPU support on Linux.

  20. Put mesh shader on blender like UE5 As nanite :)

  21. Really excited to test out my 5700xt! Thank you devs for showing amd users some love! Fingers crossed for simulated volume rendering on the gpu (smoke sims don’t show up in cycles on my card as of 2.93)

  22. Pretty cool!!! Great rock-solid work, Cycles-X is already fantasic!

  23. will branched path tracing be added back, this was a huge feature for optimising renders

  24. Good work ! Couldn’t get back to 2.93 since i started using 3.0 Alpha, the new reactivity in the viewport is a big deal + the general speed up

  25. Fantastic Work by Brecht and William (sorry if I’m misspelling their name) and community devs.

    Thx so much for thx awesome piece of software !

    I wish I had something like Blender when I was young (when 3D software used to cost in the thousand).

    Cheers !

  26. I really hope that Blender 3.1 will support AMD RX5XX Polaris Series.
    Although not as fast as Nvidia GPUs, at least AMD GPU users don’t need to buy GPUs anymore.

    • I do believe AMD is actively working to support as many cards as possible so hopefully you’ll get your wish.

    • Especially important to have as wide GPU support as possible at the moment, given how absurdly expensive it is to buy a new GPU with the ongoing chip shortages. This would be the worst possible timing to impose a GPU upgrade on a user.

    • In a very small amount of time to directly translate the cuda code to ensure that it does not crash is good, want to improve performance enough also need to remove the cuda code stock, open hardware ray tracing, but hip early many bugs and light errors are not like cuda, but closer to optix, unbelievable ah

  27. Great work as always guys, lovely to finally see AMD supported again, even though I don’t currently own any.

    Hoping to eventually see this pave the way for a better shadow catcher implementation, and many-light-sampling optimizations.

    Blender devs are awesome!

    • The shadow catcher in 3.0 is much better. Supports a full color shadow pass and denoising. Read the release logs!

  28. It says that Apple joined as a patron level supporter. But yet, it does not show up on the dev fund page. Is there any reason for this?

  29. Hello when will we have Path guiding with caustics?

    • Path guiding is due to be worked on but it is hard to give an exact date for when it will be ready. However, very soon there will be a MNEE (see https://www.ics.uci.edu/~yug10/projects/translucent/papers/Hanika_et_al-2015-Computer_Graphics_Forum.pdf for details) patch which offers some similar functionality.

      • Grateful for the information, I’m looking forward to this implementation, caustics for those who design products like me are very important.

        • Hi, you may want to use LuxRender for product rendering

          • Hi, I already use it, but I’ve been having problems with it, and it still doesn’t work in Blender 3. Besides its speed, which is much slower than cycles X.

      • When caustics will use path guiding – does it also support volumetric caustics?

        • Path guiding is generally used to improve rendering of caustics and other hard to capture lighting, so probably, but until something more concrete is planned it’s hard to know. The MNEE (which is due to arrive real soon) should help with caustics.

      • Will Path Guiding support volumetric caustics?

        • Path guiding is a range of different algorithms and it depends on how they are implemented so it’s hard to say exactly.

          • Thank you for your answer.
            At least i hope path guide will extend caustics to have reflective (mirror kind of) caustics and lay caustics out of shadows. LuxCore was wonderful but i noticed that when i have multiple transparent objects (caustics aren’t visible trough transparent objects) i hope this kind of problems will be at least solved.

            Volumetric caustics sure would be lovely either (i wonder could these be done with MNEE technique – maybe not). Other important thing is motion blur support (especially deformation motion blur for openvdb and also support for mantaflow’s openvdb fire and liquids + third part addons openvdb) also cryptomatte is important for vfx and openexr (so it should be keep up to date).

            Too bad that Pablo left Blender dev. I hope he will come back – because of this one z-named sculpting product that was sold to one big Corporation – so it would be great time to make sculpting better and try to get more user base from these users and make blender more industry standard this way but it would need a lot of work for sure (like, “dual contouring”, “enhance surface nets” or “df3d mesh” for dynamic topology meshing and that hybrid sculpting which was planned – at least Joseph will continue this development.) I wish everything good for blender dev. just take your time – I just wonder is development team aware from these questions / “concerns” – like prisma effects (dispersion caustics in volumetric) I know this is too much asked – just dreaming hard.

  30. Cycles X is a game changer. Everything got better in 3.0. Blender devs are incredible and we are so grateful! :)

  31. 🔥🙏🔥 got nothing to say but thank you guys for making ma carrier possible ☺️
