Neural Frames Releases Visual Echoes for Stable AI Animations
Neural frames creates its animations through an image-to-image loop: the previous image is fed into Stable Diffusion, which generates the next one. A parameter called "strength" determines how many inference steps Stable Diffusion applies to the input image, i.e., how much Stable Diffusion will change the previous frame. Another important parameter, "smooth", determines how many interpolation frames are added between two subsequent AI outputs.
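As a rough sketch of that loop, here is a minimal simulation in Python. The `diffuse` function is only a stand-in for a real Stable Diffusion img2img call, and all names and numbers are illustrative, not neural frames' actual pipeline:

```python
import numpy as np

def diffuse(image, prompt, strength, rng):
    """Stand-in for a Stable Diffusion img2img call: the higher the
    strength, the further the output departs from the input image."""
    noise = rng.standard_normal(image.shape)
    return (1.0 - strength) * image + strength * noise

def animate(start_image, prompt, n_keyframes, strength, smooth, seed=0):
    """Image-to-image feedback loop: each AI output is fed back in as the
    next input, with `smooth` linearly interpolated frames in between."""
    rng = np.random.default_rng(seed)
    frames = [start_image]
    current = start_image
    for _ in range(n_keyframes - 1):
        nxt = diffuse(current, prompt, strength, rng)
        # Insert `smooth` interpolation frames between consecutive outputs.
        for k in range(1, smooth + 1):
            t = k / (smooth + 1)
            frames.append((1.0 - t) * current + t * nxt)
        frames.append(nxt)
        current = nxt
    return frames

start = np.zeros((64, 64, 3))
clip = animate(start, "a squirrel", n_keyframes=4, strength=0.55, smooth=2)
# 4 keyframes plus 2 interpolation frames per gap -> 4 + 3*2 = 10 frames
```

The key point the sketch captures is that the output of one step becomes the input of the next, which is why errors (or stability) compound over the course of the animation.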
This algorithm can create stunning animations, but unfortunately the control over them is quite limited. If you have a nice starting image and want to keep its overall shape, you need a low strength. But a low strength can lead to image degradation over time, because, well, Stable Diffusion is doing very little work on each frame, and artifacts accumulate through the feedback loop. No problem, we just pick a high strength, but suddenly the images change a lot from frame to frame, leading to higher-quality images but also, perhaps, to seizure-inducing flicker, because things change so much.
For the longest time, this core tradeoff was something all AI artists struggled with. To circumvent it, we recently introduced visual echoes, based on ControlNet plus an additional feedback algorithm layered onto the video generation pipeline. We boiled the visual echo effect down to two parameters: edge echo and tile echo. Edge echo is based on a Canny filter, which detects the edges in an image and keeps them consistent from frame to frame. Tile echo is another type of filter that averages over larger tiles of the image and tries to keep those averages the same. The two echoes can be combined but do different things. For instance, the above animation can be fully stabilized with edge and tile echo.
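To build intuition for what the two echoes anchor on, here is a minimal NumPy sketch of the underlying signals, under stated assumptions: the edge map is approximated with a simple gradient magnitude rather than a full Canny filter, and the functions are illustrative, not the actual ControlNet conditioning:

```python
import numpy as np

def edge_map(gray):
    """Edge echo's anchor, sketched as gradient magnitude: keeping this
    map similar across frames preserves outlines and shapes."""
    gy, gx = np.gradient(gray)
    return np.hypot(gx, gy)

def tile_means(gray, tile=8):
    """Tile echo's anchor: average intensity over coarse tiles, so the
    large-scale composition stays put while fine details may change."""
    h, w = gray.shape
    return gray[: h - h % tile, : w - w % tile].reshape(
        h // tile, tile, w // tile, tile
    ).mean(axis=(1, 3))

img = np.random.default_rng(0).random((64, 64))
edges = edge_map(img)    # per-pixel edge strength, same resolution as input
tiles = tile_means(img)  # one value per 8x8 tile -> an (8, 8) grid
```

This also shows why the two can be combined: edge echo constrains fine structure (outlines), while tile echo constrains coarse structure (composition and color layout), so they stabilize different aspects of the animation.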
The squirrel is very nicely preserved and the animation becomes very stable. To see the difference between edge and tile echo, let's look at the following two animations. The first one uses edge echo 0.8 and tile echo 0.
With tile echo instead, the exact same prompts look very different. Tile echo is a bit more trippy and creative, and it is really great for prompt changes. Here, the settings are edge echo 0 and tile echo 0.8.
And of course, we can also drive this with neural frames' powerful audio-reactive modulation, which is particularly well suited to creating AI music videos. This is demonstrated in the little demo below.
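The basic idea of audio-reactive modulation can be sketched as mapping the audio's loudness envelope onto a generation parameter per frame. The following is a hypothetical illustration, not neural frames' actual modulation pipeline; the function name and the `base`/`depth` parameters are made up for the example:

```python
import numpy as np

def audio_reactive_strength(samples, n_frames, base=0.35, depth=0.4):
    """Map per-frame audio loudness (RMS) onto the diffusion strength,
    so the visuals change more on loud beats (illustrative only)."""
    chunks = np.array_split(samples, n_frames)
    rms = np.array([np.sqrt(np.mean(c ** 2)) for c in chunks])
    env = rms / (rms.max() + 1e-9)   # normalized loudness envelope in [0, 1)
    return base + depth * env        # one strength value per output frame

# One second of a 220 Hz tone pulsing twice per second, as fake "music".
t = np.linspace(0, 1, 44100)
beat = np.sin(2 * np.pi * 220 * t) * (np.sin(2 * np.pi * 2 * t) ** 2)
strengths = audio_reactive_strength(beat, n_frames=24)
```

Any parameter could be modulated this way; strength is just the most intuitive, since louder passages then produce faster visual change.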
Thanks a lot for reading. If you're interested in creating your own AI animations or AI music videos, click here and you'll be able to create a 30-second video for free.
Info
No VC money, just a small team in love with text-to-video. Contact our team here: help@neuralframes.com. Our AI music video generation is inspired by the open-source Deforum algorithm, but doesn't actually use it. For inspiration on prompts, we recommend Civitai.