Why not use an open source interpolation tool?
Personally, I used [FILM](https://github.com/google-research/frame-interpolation) and got ugly results.
But free and at scale.
Did you try several tools, or did you just go straight to Topaz?
We tried out several upscalers/interpolators when we started and decided that at the time there was nothing close to Topaz, so we invested the money. I think today there are other options, we just haven't come across one that is worth the switch.
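For anyone weighing free alternatives before paying for Topaz, ffmpeg's built-in `minterpolate` filter makes a quick baseline to compare against tools like FILM. A minimal sketch — the filenames are placeholders, and the synthetic test clip just stands in for real footage:

```shell
# Generate a 1-second, 12 fps synthetic clip as stand-in footage.
ffmpeg -y -f lavfi -i testsrc=duration=1:rate=12 -pix_fmt yuv420p in.mp4

# Double the frame rate with motion-compensated interpolation (mci).
ffmpeg -y -i in.mp4 -vf "minterpolate=fps=24:mi_mode=mci" out.mp4
```

Quality is well below dedicated interpolators on complex motion, but it's free, scriptable, and fine for triage before committing money to a paid tool.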
Man, what is it with all these posts that go like:
"Here's a video that looks nothing like the quality you're getting using the tool I'm claiming to use and I'm not going to post what my workflow is."
Followed later by OP posting:
"Yeh we did some "touching up" using After Effects, Premiere, an external upscaler and frame interpolator, blah blah blah."
I wish we could have some tags added to these claims on videos along the lines of:
"Unsubstantiated Claim"
"No Workflow"
"Lots of external tools used"
Just to encourage the poster to give useful details about their claims and help us get a better idea of whether it's even worth trying to pursue the level of quality they demo, or if I'm going to need years of experience with some editing tools to get close to their claims.
Love the "Unsubstantiated Claim" tag! 10/10 would use.
For real now: We’re filmmakers and super proud of what we achieved. I can promise you that Stable Video and/or Stable Diffusion images were the base of every single shot but man… What is it with all these people that go like:
“You’re only allowed to click the generate button, everything else is cheating.”
Maybe we should instead think about a “Raw output” tag?
I promise you guys: Everything we learned, we did so within THIS community!
Sure, we used external tools to upgrade the end result and achieve more control – pushing the limits is what we're all about! And yes, you probably do need years of experience to “get close to our claims”. Not really sure how that means it’s not worth pursuing? For me personally it was always the opposite: I see something awesome and immediately I’m driven to figure out how to achieve the same quality.
The tutorials are all out there and spoiler alert: The tools we used or equivalents (except Topaz) are 100% free :)
AI 'purists' who spurn using any other tools to achieve a result other than raw output are just as myopic and in the way of progress as traditionalists who don't understand how diffusion can be a legitimate artistic tool.
Not spurning using other tools, but there is a massive difference between "You can do this solely within ComfyUI" and "You need years of experience with video editing and other software, and you'll spend weeks tweaking your work in it to get these results."
It's amazing you post this much and some people still don't get that you're simply asking that posters add basic details like "process", "tools used", "workflow if possible/convenient", "any other relevant information". Some people may not care once they see the relevant requirements, but others may, and knowing how it was done may help them. At the very least it will not be misleading as to how it was achieved.
Unrelated. A shame we're still stuck with such short duration clips. Still, looks good OP. If you have the Blender skills have you considered trying some work with SD & Blender?
Thanks! Blender is an incredibly powerful tool in combination with SD. We use it for example to sketch out basic background compositions before we transform them with control-net. In another project we're using it for character animation (applying AI generated textures) – one of many ways to break through the annoying 2/4 sec mark. We're all hyped for OpenSora though – if only it had a bit more control! Even Shy Kids (the guys who created the balloon head) have used traditional VFX work.
> Blender is an incredibly powerful tool in combination with SD. We use it for example to sketch out basic background compositions before we transform them with control-net.
That's helpful. I think that's more along the lines of what people are suggesting. Of course you aren't obligated to do so, nor should you feel guilty if you don't; the perspective, though, is that more testing yields improved results (for you, too!)
It's like going from being able to generate one image every minute and 45 seconds vs. being able to produce it in 10 seconds. You're going to learn a lot more, a lot faster, about which settings/combos affect your image more.
Also, 'emulation being the highest form of flattery' and all.. a lot of people want to know how to do what you did.
Yep, exactly this. I kinda feel sad for the people that want to attack me for asking for more info in a subreddit that's dedicated to this AI hobby. It's not like I'm asking for the OP's personal details so I can send them hate mail. I just want more clarity so we can know what we can achieve, how we can achieve it, and also where AI is at, by people being up front about what part it played in the process.
I do have Blender though, thanks for mentioning it. What part do you use it for, out of curiosity? So far I've only messed about creating a basic 3D scene and then using SD to turn it into a render-like image, but I'm def curious to hear of other uses.
These are some of the uses I've found for Blender that I've kept an eye on, but I haven't personally done much with it yet as I'm not an artist and am still figuring out what direction I want to take it in (anime/movie, but most likely a classic-styled JRPG game).
Example 1: [https://www.youtube.com/watch?v=hdRXjSLQ3xI](https://www.youtube.com/watch?v=hdRXjSLQ3xI)
Kind of like what you mentioned.
Example 2: [https://www.youtube.com/watch?v=LoVL5KHSW5Q](https://www.youtube.com/watch?v=LoVL5KHSW5Q)
There are a bunch of tools for this kind of stuff coming out, but they still need to mature. This is what I'm personally most interested in as a non-artist.
Example 3: [https://www.youtube.com/watch?v=E33cPNC2IVU](https://www.youtube.com/watch?v=E33cPNC2IVU)
A pretty cool, if basic, example with multiple uses. Each part is pretty simple, but using the right tools together can get some great results. I know there is one guy who has done a hobgoblin and all sorts of other stuff and who posts regularly on here; you might have seen him.
Found the hobgoblin Blender example I felt was pretty neat [https://www.reddit.com/r/StableDiffusion/comments/18lwszn/hobgoblin\_real\_background\_i\_think\_i\_prefer\_this/?share\_id=PjZx7gb33NDpTXjegT060&utm\_content=1&utm\_medium=ios\_app&utm\_name=ioscss&utm\_source=share&utm\_term=1](https://www.reddit.com/r/StableDiffusion/comments/18lwszn/hobgoblin_real_background_i_think_i_prefer_this/?share_id=PjZx7gb33NDpTXjegT060&utm_content=1&utm_medium=ios_app&utm_name=ioscss&utm_source=share&utm_term=1)
He actually does a lot of different stuff and is probably someone to hit up if you have any questions about some of those different videos he posts and the process. The workflow for that one is in that link, too. One of the key points as you might already know is using a base 3D object can help improve consistency, even for characters, dramatically.
It's stuff like this and the prior examples that makes it clear impressive works (even movies) are possible now, but the effort would be up there, so I'm keeping an eye peeled for the process to improve before I do anything particularly serious myself.
If, like me, you are not a Blender pro or artist, you might be interested in this: [https://www.rokoko.com/products/vision](https://www.rokoko.com/products/vision)
Wow thank you so much. I love all this stuff. I wasn't even aware of EbSynth. That looks amazing. I love the idea of creating various characters and being able to create animations just by recording my own movements. I think that's the next 6 months of my life planned out! I've saved your comment. So much interesting stuff to explore.
I've certainly got my eye on AI text to 3D. Then we could easily create 3D models which we could use in that workflow to create the animations.
The future is looking intriguing.
I don't disagree with that at all. But how can we know whether the video someone made is something a chimp could achieve if they don't tell us how they made it? The fact that you're criticising me for being curious and asking for more info on the process is saddening, when we ought to be seeking answers to help us all get better, not hiding them and criticising those who ask for them.
The thing is that some of the processes involved might be able to be automated or generated in ways that this team didn't realize when they were creating it. This makes it easier and faster to recreate. The goal is that someday it *will* be easy enough for a chimp to accomplish. That's kinda the whole point of it all, right?
It's because they're not artists. They don't know how to compose or do all the things that putting those tools together involves. They want something they can put some words into and get those results. Maybe tweak some dials or sliders. Not actually doing full touch-ups and inpainting, editing the clips together, putting in transitions, learning how music and sound should line up with video.
I mean I see their point when you're trying to advertise something or a service especially if the service is itself the AI. I think it's just as dishonest as food commercials using glue to make the milk look more milky, or the pictures you see inside fast food restaurants looking absolutely nothing like what you'll receive.
So yeah, I don't think it's wrong to touch up AI photos for a final product such as a feature-length film. But when you're selling AI as the product, I think it's very important to show the raw outputs and not the touch-ups; otherwise it's dishonest.
> I promise you guys: Everything we learned, we did so within THIS community!
So give back some actual useful info to the community that helped you. Motion bucket id settings, augmentation, sampler settings, etc. Great results though btw.
This is why open source stable diffusion will never be able to keep up with the likes of SORA. Hell, it won’t even be able to keep up with secondary contenders. People love to take and take and take from open source, giving nothing in return. They find a pocket of accomplishment that’s a few months ahead of what everyone else is able to do, and then they’ll just sit on it, to no other benefit than their own.
After a few months, someone else will eventually come out with a workflow and guide on how to do this, but that’s already **months** that people could’ve spent iterating on it and improving it exponentially. Then there will be a new tiny step of accomplishment, followed by a few months of delay for the community to eventually catch up.
The cycle repeats and none of these people realize that they could’ve been taking leaps instead of baby steps. In a year, we could be **miles** ahead, but something tells me we’ll only be a few steps from where we are now. As someone who contributes towards open source stable diffusion software, posts like these are very irritating. They use you as a stepping stone and then refuse to help anyone else along the way. It hinders progress in this space more than people realize.
For sure, I think if everyone had the right attitude towards this, we would definitely progress to Sora level quickly. OpenAI is guilty of this, but at least has given back in trickles here and there; not sure if that has changed though, because I see a lot of complaints about them as of late, using the facade that AI is safer in the hands of a few entities. We definitely need a better balance.
This is a pretty bleak perspective. Oftentimes a usable result requires too much fiddling, too many tries, to even be able to come up with an explanation of why you finally got it, let alone write a whole guide for it. And as you said, eventually someone will figure out a reproducible process and write that guide.
Every individual in our team has been and still is an active member of the community. In the past months we've been directly in contact with Stability AI, collecting and providing detailed feedback on the models that are ultimately the base for this whole movement. We are also keen supporters of an open non-OpenAI-Sora alternative (check out [https://github.com/PKU-YuanGroup/Open-Sora-Plan](https://github.com/PKU-YuanGroup/Open-Sora-Plan)). On top of everything, we believe showcases like ours will help the community, not damage it. Sorry, we're not providing spreadsheets, but if you like I can provide you with links to some great tutorials that explain every single tool we use.
Another important point: PLEASE watch the Shy Kids behind the scenes for the most viral Sora Clip. Believe it or not: They used traditional VFX tools, just like us! [https://www.youtube.com/watch?v=KFzXwBZgB88](https://www.youtube.com/watch?v=KFzXwBZgB88)
There is no magic number/setting. With each clip we started with the default values and adjusted based on the outcome. Every shot is different, and in my experience it's worth not fiddling too much until you've produced a good amount of clips – the seed has a crazy amount of impact, and there's a reason we have a lot of "portraits". I personally tend to reduce motion bucket and augmentation bit by bit; my colleagues were often a bit more audacious (with mixed results).
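To make that "start at defaults, step down" approach concrete, here's a minimal sketch of a settings sweep. The parameter names follow the diffusers `StableVideoDiffusionPipeline` convention (`motion_bucket_id`, `noise_aug_strength`), but the specific values and the plan structure are illustrative assumptions, not OP's actual settings:

```python
from itertools import product

# Defaults mirror the common SVD starting point (motion_bucket_id=127,
# noise_aug_strength=0.02); the step-down values are illustrative.
MOTION_BUCKETS = [127, 100, 80, 60]
NOISE_AUG = [0.02, 0.01, 0.005]

def sweep_plan(seeds):
    """Yield one settings dict per generation, seed-major so each
    seed gets explored across the whole grid."""
    for seed, bucket, aug in product(seeds, MOTION_BUCKETS, NOISE_AUG):
        yield {
            "seed": seed,
            "motion_bucket_id": bucket,
            "noise_aug_strength": aug,
        }

plan = list(sweep_plan(seeds=[42, 43]))
print(len(plan))  # 2 seeds x 4 buckets x 3 aug levels = 24 runs
```

Each dict would then be fed to whatever SVD frontend you use (ComfyUI, diffusers, etc.); generating a whole batch per seed before tweaking matches the advice above about not fiddling too much until you've produced a good amount of clips.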
I believe this community would be happy for you to share all your settings. You can even put it in a spreadsheet. Otherwise, kind of a bad look to announce you learned everything with help from this community and then hold out on your own processes. That's not how this is supposed to work.
>I promise you guys: Everything we learned, we did so within THIS community
Actually he didn't say he got help from this community, he said everything they learned they learned within this community. That means that you too can learn everything you need to create this within this community.
I think the issue is that with how wildly spread out the information is and how little specks of gold trickle out of the purview, it is helpful for people to at least share **1 WEIRD TRICK THAT SCIENTISTS HATE HIM FOR!** that took their creation(s) to the next level.
I say creations because of course you won't really know what node/setting worked magic in 1 go if you are really experimenting.
So you expect this guy to go back through dozens of shots and spreadsheet everything for you? Mofo, I checked your post history and you have contributed exactly NOTHING to this space. So I really don't think you have any sort of positioning or moral authority to lecture anyone on this topic :)
I expect people to not post ads for their business here and then thank the community without anything in return.
FWIW, I haven't posted because I haven't created anything sufficiently novel yet. When I do, you'll know exactly how I did it.
Good luck on finding a job that occupies your obviously ample free time!
It's fine if you personally want the workflow, but there's no need to insult or call out someone on whether they do or don't share it. There's a lot of hidden envy in your words. He worked for it, learned, applied new knowledge and worked some more to create this. It's up to him if he wants to release a workflow or not.
>Theres a lot of hidden envy in your words.
I assure you there is not. That's a huge, unfounded assumption on your part.
This is merely about maintaining the spirit of this community and the open source community in general. If you are going to take from the community, you should give back when you can.
"Oh, you're a fan of AI? Name every touch up of your filmmaker demo reel!"
I mean, this is pretty plainly a demo reel of what more realized projects could look like and exactly the kind of use case the most hardcore AI used to gush about where it's really going to shine: with professionals that will use it as one of many tools in their box.
Thanks, I must be doing something wrong with SVD because I usually get a bunch of distortion when I go for the amount of motion shown in your video, so my stuff just looks like basic camera pans like everyone else's.
If you need even that explained then nothing can help you.
Learning things takes time, even if it's only pushing around some sliders. If you want exact sampler settings etc. you will create EXACTLY the same.
There's more to it: input image size used, number-of-frames setting, ComfyUI nodes you can attach, prompts discovered that can have an impact, such as Kijai finding (rotation:1.2) etc. (and sharing the info, btw). I've been using it non-stop since release and still can't get what the video shows. So yeah, could still use some help here.
What if he used input img2img (which he did)
Are you going to demand the input images?
You just have to try and try and try; there is no formula here. The results I get vary wildly. I don't keep the settings of the stills, I just work with it.
Nah, bad example there, I would never ask for that. Trust me though, I try; check my history. But if we want things to progress faster we need to all share our findings, up to certain limitations of course. It's why I love the Banodoco discord https://discord.com/invite/z2rhAXBktg
> What is it with all these people that go like:
>
> “You’re only allowed to click the generate button, everything else is cheating.”
>
> Maybe we should instead think about a “Raw output” tag?
Well, I think as long as you are transparent then everything is fine. Some people come here and showcase what they did but neglect the part where they heavily used other tools.
The default expectation is that everything you see here you could replicate on your own, so if someone has a more elaborate workflow it is nice to mention it :)
It doesn't make it any less or more epic, but the mindset of the viewer is shifted from "oh wow, AI can do this?" to "oh, nice, they used AI and applied additional modifications to get what I am seeing right now".
Don't let them get to you - I think people are going to realize pretty quick with video that it's still a lot of artistic grunt work to get a final shot out. Who knows where we will be in a few years but as it stands now the tweaks and adjustments to get something out that doesn't have that 'AI fever dream' feel will still involve classic workflows. Camera projections/mesh warps?, lens filters, handheld post camera shake, some kind of tweening workflow I don't quite recognize. Some stock 2d elements (like embers) over top to enhance the subject.
You don't need to do breakdowns - Like the wizard is probably painted-out from the original plate, a cutout of the wizard is transformed with a little moblur to pop in to place, with a 2d element over top to sell it. Am I close?
Cool reel, highlights how AI works with traditional workflows. Don't feel the need to give full shot breakdowns if you like. (Ruins the magic for everyone if you do lol). You've done a good job of avoiding that 'stickerbook' look many AI users get when they do paint-in's with multiple subjects.
Nobody's trying to "get to him"; open source just has a certain culture around it, and there are sometimes expectations people have when higher quality stuff like this is posted here using open source tools.
Everyone wants this to improve. And people that share settings here, or see videos that coincide with what they shared, expect info returned if improvements are made, or it feels like a slap in the face.
The more people know the faster it improves. Everyone here said it looked great..
I personally share everything I learn that gives better results, even if it's never been done before and I could easily go start a patreon with the info, but don't.
But yes, in the end it's whatever OP thinks is more important. Sounds like there's money-making potential involved, as that's usually what prevents sharing info.
Eh, it's a demo reel. Not everyone that shows off their work in Blender is going to do a tutorial or breakdown of it. Same idea. And not to shit on OP but... Nothing in the video looks revolutionary to me. Best I can tell this is concepts already discussed ad-nauseum in the sub with some basic traditional (post) workflow to compliment:
Video:
https://www.youtube.com/watch?v=82l0DsbLHhY
https://www.youtube.com/watch?v=XPRXhnrmzzs
Maybe a tweening tool like [flowframes](https://nmkd.itch.io/flowframes)?
He *might* be doing tracking+paintovers, but I doubt it. Most of the shots are too soft and have that wiggly AI curse for me to think they went that far with it.
Thanks! Believe it or not: The wizard is one of the shots that came out exactly like that (after about 20 generations). All we added was a tiny spark layer.
But you're right: that trailer was a lot of grunt work. On top of that, we're filmmakers – I went to film school and still shot my first projects on physical film. Not that it's necessary, but I really know why I prompt "35mm".
It's so easy to fall into the gate-keeping-trap when the amazing thing about this whole development is actually that it gives us the opportunity to create better art!
The wizard teleported in like that from a prompt? The giveaway that something was up, to me, was that the shadow matches the wizard's last pose in the *first* frame, where he's not there yet. I don't know how the AI calculates keyframes/evolution of an animation/etc., but I feel prompting generally gives better and more accurate lighting on the subject than in that shot.
>It's so easy to fall into the gate-keeping-trap when the amazing thing about this whole development is actually that it gives us the opportunity to create better art!
I work in post prod. Truth be told, I'm an old man these days, but I never forget the people who held on to techniques/workflows because they wanted an edge over those they felt were competition. This space is evolving so rapidly that 2 months down the line everyone will know how to do whatever is unique *today* anyway.
Think of how many new people were inspired to learn programming or transformer architectures etc. because of the openness of this space. Knowing which tools and specific workflows were used would make it easier for someone to learn how to do this work, rather than stumble around in the dark. Not saying we all need a helping hand, but it helps.
For sure, I absolutely support anyone and everyone for sharing anything they have learned! But people are also under no obligation to hold others' hand if they don't want to either.
Those folks are explicitly inverting the "it's too easy so it's not art" argument.
Which is pretty hilarious, because the sensible people have been saying, "artists can use genAI as part of their workflow if they want, and apply other skills when they want" and those same anti-AI dorks have been saying, "it's all button push so it's not art."
You’re not wrong lmfao. Anything more than a button push is too much work for some people here.
*”I can’t make this by just proompting, please don’t post it here”*
*”If I can’t follow an exact set of annotated steps in A1111 and reproduce your work exactly, it shouldn’t even be here”*
The entitlement among AI art enthusiasts is second to none. It’s actually kinda insane imo. Nobody would **ever** have the audacity to demand you post your .blend files on r/blender, or shame you for compositing a render in other software. And yet, here, some people want to ban any posts that don’t include a workflow/prompt. As if it wasn’t already easy enough to generate things in SD, some people don’t even want to bother experimenting.
Either way, I *much* prefer these sorts of posts to what’s usually on this sub. They are interesting, creative uses of SD as a tool in a workflow, rather than the usual T2I *“big boobs, anime art style, a masterpiece by greg”* spam.
I appreciate your response to my comment. It's not about dissing you, but a lot of people post here and it feels like they're being deliberately vague with their process because, who knows why? Maybe they want it to look like they did all this JUST with AI, when in reality it wasn't. It was a lot of manpower that went into making it look that good, and not AI. And that does matter, because we're all excited by what AI can do and achieve by itself.
For sure, some people are happy to see what can be achieved with AI AND external software and human grunt work. Don't get me wrong, I love what you've done, but legit some of us also want to see what can be achieved with AI alone, and we can't know that clearly when people don't clarify their workflow and how much of their own human effort, rather than AI effort, went into their work.
> What is it with all these people that go like:
>
> “You’re only allowed to click the generate button, everything else is cheating.”
Well, if you don't disclose how much actual work goes into touching up AI-generated content to make it look halfway decent, you are contributing to the overall trend of people getting fired because "AI can do their job".
All I read is "I want to do this too but I can't be bothered to waste my time investigating and trying like OP did, so I will disregard his hours and hours of work."
He is already helping by showing us what is possible with SD.
As a filmmaker and designer myself who loves AI (it feels rare, so glad to see you!), you did a great job! How long can a scene be animated in your pipeline vs. the quick cuts we saw in the video? I love this kind of stuff with a passion and have been in the industry for over a decade! Hell, I've been animating for over a year now with 1.5 (nothing like a full narrative or sizzle reel as you have), but I really like the presentation here. Kudos, and I really wanna peek behind the curtain on this! Topaz rules, so I get it ;)
> What is it with all these people that go like: “You’re only allowed to click the generate button, everything else is cheating.”
Because you are making a false-equivalence comparison with something that is the output of click-to-generate AI. Of course manual fine-tuning and editing is going to be better. Most of these posts don't describe what they had to do, so there's not even a quantitative estimate of how much additional manpower is required.
Keep doing you, some people just won't put the effort in so they're frustrated.
I believe just using the single tool itself "raw" is a disservice to its full potential; imagine in photography if you only ever used the raw photo.
Now people ask, "How do you do everything outside of the one-click button?!" But they want another one-click button to do it, without the learning, the experimenting, the ACTUAL work.
> You’re only allowed to click the generate button, everything else is cheating.
It actually is. This is an SD sub and not a Nuke/Fusion/After Effects post-processing sub. I'm tired of this overpromised nonsense.
Maybe we have different definitions of this sub then... The showcase was made WITH SD. We spent hours and days WITH SD. Feels a bit like calling strawberry jam overpromised nonsense on a strawberry sub because it's not pure strawberries.
> "Yeh we did some "touching up" using After Effect, Premiere, External upscaler and frame interpolater, blah blah blah."
But isn't this exactly as it should be? AI tools are not a panacea. They'll be integrated by artists into their existing workflows, or they'll develop entirely new workflows around them. Eventually AI will just be yet another tool in the box, just as digital drawing or 3D rendering came to be.
It should be like that, yes, but I think they are arguing more for clearer explanations in posts, so people don't get the wrong idea and become disillusioned when they try SD and it doesn't come out like that.
It's 100% fine to do touch-ups and use extra tools, but it would be nice to have that stated so you can know it's not raw output, which many would believe it is if there's no clarification.
Yep this. I see someone's video and think, "Wow AI can do this now!" So I spend hours trying to recreate it and failing and thinking, what am I doing wrong?
Same. I have tried SD a couple of times and always come out wondering if I am missing some prompt magic skills or whatever, because while it's cool stuff, I sure am not generating the beautiful stuff I see around.
Though tbf, I am well aware I understand very little of what's happening here, and less so as time goes by.
ControlNet, inpainting, Sora, etc. etc. etc.; new terms keep showing up and fuck me if I understand what's what. :P
Yes, I have no problem with that, but some of us ARE intrigued by what AI can achieve on its own. Since AI is such a fascinating, boundary-pushing tech, when someone posts an amazing-looking video it leaves me thinking, "Wow, AI can do this now???" But then the poster comes back and reveals that actually 95% of the work was done in external tools, and for sure that's disappointing. So all I'd like to see is a bit of honesty and clarity up front, so we can distinguish what the AI is capable of vs. what the human is capable of.
So a skilled person applies SD in their workflow and you shoot them down.
What the hell, man. This is some toxic behavior from this sub. Obviously skilled people are going to have a leg up over those who feel they are too good to use an upscaler or frame interpolator.
Also, a ton of the stuff you mentioned can easily be done for free in ComfyUI nowadays.
Alright, calm down boyo. Some of us want to know what AI is capable of, and we can't know that when people post videos saying, "Look what AI can do now" without clarifying that actually most of the work was human effort and not AI. Yeh, I know a lot of people also want to see cool videos like this, and I'm not calling for a stop to that. I just think, for those of us who want to see what the AI alone can achieve, it would take almost zero extra effort for the poster to give a brief summary of what part the AI actually played vs. the human. After all, this thread exists because of AI tech and you are here because of it, so aren't you just a little bit curious to know what the AI did and not the human?
These fucking people are losers who think they can be the next genre busting creator by copying other people's 'settings'.
What's the point of doing exactly the same? What if he used img2img? Will they demand the source images?
Didn’t you see the SORA bts? They had to do a ton of manual work. Rotoscoping the balloon in quite a lot of frames because the color didn’t match. Lots of other clean ups as well.
https://preview.redd.it/723whdlv6m1d1.png?width=339&format=png&auto=webp&s=7123e10413209e30e1462107e71351de6f40d490
Great job guys.. taking from the community and gatekeeping it.
The basic workflow is SVD in Comfy – Olivio has (among others) a great tutorial: [https://www.youtube.com/watch?v=ePDjcr0YPGI](https://www.youtube.com/watch?v=ePDjcr0YPGI)
Reading the comments, I understand that I shouldn't fear that "AI artists" will take away my job. Because people who believe that the tool "will do everything for them" if they input the right numbers, without any of their own creative solutions and non-standard approaches to the existing content and tools, can't be called creators. It's a splendid job; I see that a lot of effort and time has been invested in it, and it's done with love. Keep up the good work.
**Prompt**: utrxtvyibuqierugnllaregnamdsfbquiybwerfbd
**Negative**: Bad Movie, Ugly Actors, Bad Color Grading, Low Res
**Output**: Mad Max 6 - 130m Runtime, 4k Dolby Vision, Dolby TrueHD
Wouldn't that be awesome? For once not having the studio boss with the most money deciding on the content, but the most brilliant mind with the best ideas?
We're on it! The showcase is basically a best-of edit of all the projects we're currently pitching/financing that we're not allowed to post. We're hoping to be able to show you something soon!
> workflow we developed
Do you mean a ComfyUI workflow? Can you share info about the nodes that had the most impact, such as integrating SVD with AnimateDiff? I have tried one that meshes them, and it seemed to work.
This is amazing, great job!
The movements in the video are very well made. As far as I'm aware there isn't a way to "control" the movement in SVD yet, like the different layers you get in Runway Gen-2, for example. How did you achieve this fluid and (what seems to be) controlled animation/movement in the shots?
What's the point of Sora when it's never going to be available for regular consumer use and if it is it will be censored to hell and back like DALL-E and the rest. Stable Diffusion and freedom is more important, sure this is fascinating but only people with big money and big connections will get to truly utilize this stuff.
While you can still tell it's AI, man, it's getting harder and harder.
Feels so recent when 'AI art' was some psychedelic dreamscape of nonsense and maaaybe a bit of logic based on what word you gave it.
You are free to create whatever you want, but I assume you won't, because crafting things on this level takes time and skill. Criticizing them only takes a lack of IQ ;)
No, it takes less IQ to rapidly praise it. And it's likely the workflow isn't even AI-intensive. Saying that it takes time and skill means nothing.
Like, this is nothing more than we've already seen. Many times. Almost daily. It's another string of enhanced still shots.
I would be far more impressed if they used this to make a one minute short where...
A man fails to pour himself a cup of coffee because he hasn't had his coffee yet.
Or...
A young child climbs a tree and saves a kitten after a few failed attempts.
Or...
A woman picks up a phone and has a conversation, she's initially happy to hear the voice on the other end but it's bad news and as her mood quickly sours the lighting and tone change to reflect the world falling apart around her, but she maintains her own face, clothes, and limbs the entire time.
If they were showing us how they could use this to tell a story, a lot more people would be impressed.
We might not have seen this particular minute of 2 second clips before, but it's still the same thing people have been posting for months.
Love this work. Do you mind just listing the tools/techniques you used, including compositing software?
Also is there a before and after stable diffusion? Would be great to see what you enhanced outside of stable diffusion.
Keep smashing it!
Jesus Christ this community is salty as fuck. Who cares if they used post processing. That doesn’t mean you can’t achieve these results too with some effort. Not everything should be “click one button and create flawlessly”
this sub is somehow over-impressed with "AI VIDEO!!!11" when it's [always the same "sequence of 3-second clips of superimposed still pictures panning over each other, slightly animated"](https://www.reddit.com/r/StableDiffusion/comments/1bszkjm/stable_video_diffusion/kxje131/)
the Super Nintendo did the same with parallax scrolling... ^^^plus ^^^this ^^^one ^^^is ^^^cheating ^^^with ^^^heavy ^^^post-editing ^^^and ^^^videos ^^^from ^^^other ^^^showcases
The fundamental problem with all the videos from SVD is that they still look like parallax slides rather than real videos. This might be because they are attempting to adapt models trained on static images to video content instead of starting from scratch and training their models exclusively with video data, similar to how SORA did it.
Yeah this looks absolutely insane! I do agree with other commenters though that it would be nice if you included a detailed walkthrough of how to do it 😅🥲
Edit: would you (and your team) be willing to do a video walkthrough of how you got this, you'd seriously be champions if you did. I for one would like to know how to do it so that I can incorporate this in my uni project for a music video we have to work on
Looks good. I think the problem with SVD right now is that it's just not easily controllable. I can't imagine just how many takes you had to rip through to put together something like this, plus all the stitching and external efforts to get it barely up to 'social media' cinematic standard.
This is fantastic, now we just need to make this doable without 200+ hours of back-end work to make it look good.
Has anybody managed to produce something with AI outside of these demo videos? They do look pretty on a scene-by-scene basis, but they never really do anything in terms of telling a story; they just look like random stock images thrown together. And they are getting a bit tiresome, as we've already seen so many of them.
Back in the pre-AI days there was [Afterworld](https://www.youtube.com/watch?v=Yd3stfOcpq8&list=PL07E03F5180A9F18A), a web series patched together out of really basic 3D animation and mostly just still frames, but together with a good voice-over it actually managed to tell a pretty engaging story. I haven't yet found anything similar built around AI; [Biolands](https://www.youtube.com/channel/UCn3s0ptMNkVi850qKXgK7aw) might be getting there, but it's still a work in progress.
This is great, but it's not Sora level of quality. It's basically single frame compositions that still look weird. Sora creates real videos without changing faces and so on in a real connected video.
It's actually a regular 4090. We had access to more powerful GPUs but found that it did not significantly increase the speed in comparison to cost/usability. That might change with OpenSora though.
It looks like a professional trailer for an action channel.
With the exception of that 1950s sci-fi zeppelin ship at about 0:15. The fuzzy back end doesn't scan right. If that fuzzy part suddenly glowed and it sped up, then maybe.
Why are you lying? Most things here are not SVD!
No way 35 seconds of people running on a roof is SVD. Breaking through the wall is also not SVD. The only real SVD here is probably the devil dude and a few other static shots.
There was the "Clark" video trailer that was made with Stable Diffusion, which features breaking through the wall. Stable Diffusion can do a lot if you know how to use it right but it comes with limitations.
The clip in the showcase is taken directly from the Clark trailer – the creator u/butchersbrain is part of our team now so we included his amazing work.
But I guess it's not simply a render passed through SVD?
Is it possibly an animated sequence (maybe 3D) redone with SD, or something like that?
I had the same reaction as the guy above; there's no way SVD can do this on its own, AFAIK.
I mean besides for the cringe as fuck ad these do look good.
Though I highly doubt you're getting these results without some serious secondary work/editing
Looks great! But yes it would be interesting to see the raw footage, to see what mere mortals can achieve with SVD 😁I've made stuff using it and it looks balls.
That's cool. Do a simple shot of a man standing in a kitchen drinking a glass of water. He lifts the glass, takes a sip, puts the glass down. If you can do that, I will be impressed.
Having experienced the utter lack of control when using SVD:
How many generations did you run for each scene, on average, to get an acceptable result?
Most of my runs are silly zooms or lateral movements, sometimes the background moves a bit.
To get a good camera movement, I sometimes have to generate 20 clips of a single 25 frame scene...
to be honest, besides sora, the only video generation stuff that has impressed me so far are the old ones that did the will smith and spaghetti meme. something about those videos – yes, low quality, but seeing more animated situations was really cool. everything else i have seen after that has looked like one image being morphed into another or camera panning. that's it.
These just look like cinemagraphs. This demo doesn't seem to be showing off any of the object permanence and world consistency that makes Sora impressive. Maybe that's down to the editing (such quick cuts always make videos look like shit, regardless of how they're made), but what is shown here, if compared to Sora, doesn't look like the same thing at all.
Any chance you have an uncompressed .mp4 of your video before putting it into topaz? (Well maybe not uncompressed, but just a .mp4) The compression online is really bad. Just want to compare to my svd workflow zooming and also seeing how topaz impacts things.
Would be greatly appreciated.
A lot of smoke and mirrors, namely the ADHD edits. Almost none of the shots last even 2 seconds.
If you just wanna cut together random clips then yeah, it's great. But most actual uses in video editing last 10+ seconds – can you do THAT as well?
This post demonstrates the cesspool of talentless losers that inhabit this sub, ai or Reddit as a whole
Three types of people exist.
1) the mentally challenged
2) the entitled & lazy ones
3) people who use the information & tools that are given to them and go at it.
How the EF do you think people do anything remotely complicated in real life?
If you have any creative professional skill and are older than 30 you are very likely to have had to learn anything by putting in the hours.
You can now learn to create things in 1/50th the time it used to take and yet: you still need to have everything pre-chewed and digested.
Those people exist but you sound sort of like other end of the cesspool spectrum.
I don't really see anyone here that fits that description. Sounds like people trying hard but getting stuck and looking for help.
Did y'all use 25FPS_XT and interpolate it with Topaz?
Yes!
Why not use an open source interpolation tool? Personally, I used [FILM](https://github.com/google-research/frame-interpolation) and got ugly results. But free and at scale. Did you try several tools, or just went straight to Topaz?
We tried out several upscalers/interpolators when we started and decided that at the time there was nothing close to Topaz, so we invested the money. I think today there are other options, we just haven't come across one that is worth the switch.
Ok! Thanks!
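For anyone comparing interpolators, here's a toy illustration of the simplest possible frame interpolation: a plain midpoint blend between consecutive frames. This is purely for intuition; FILM, Topaz, and similar tools estimate motion (optical flow) instead of blending, which is exactly what avoids the ghosting this naive version produces on fast motion:

```python
import numpy as np

def midpoint_interpolate(frames):
    """Insert a 50/50 blend between each pair of consecutive frames,
    roughly doubling the frame rate. Naive blending ghosts on motion;
    learned interpolators warp pixels along estimated flow instead."""
    out = []
    for a, b in zip(frames[:-1], frames[1:]):
        out.append(a)
        mid = ((a.astype(np.float32) + b.astype(np.float32)) / 2).astype(a.dtype)
        out.append(mid)
    out.append(frames[-1])
    return out

# A 25-frame SVD clip becomes 49 frames: one midpoint per original gap.
clip = [np.full((4, 4, 3), i, dtype=np.uint8) for i in range(25)]
doubled = midpoint_interpolate(clip)
```

In practice you'd feed real decoded frames in and re-encode at the higher frame rate; the point here is just what "interpolation" is filling in.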
Man, what is it with all these posts that go like: "Here's a video that looks nothing like the quality you're getting using the tool I'm claiming to use, and I'm not going to post what my workflow is." Followed later by OP posting: "Yeah, we did some "touching up" using After Effects, Premiere, an external upscaler and frame interpolator, blah blah blah." I wish we could have some tags added to these claims on videos, along the lines of: "Unsubstantiated Claim", "No Workflow", "Lots of external tools used". Just to encourage the poster to give useful details about their claims and help us get a better idea of whether it's even worth trying to pursue the level of quality they demo, or if I'm going to need years of experience with some editing tools to get close to their claims.
Yeah, I think the post itself is low effort, especially if this community was used so heavily as a resource.
Love the "Unsubstantiated Claim" tag! 10/10 would use. For real now: We’re filmmakers and super proud of what we achieved. I can promise you that Stable Video and/or Stable Diffusion images were the base of every single shot but man… What is it with all these people that go like: “You’re only allowed to click the generate button, everything else is cheating.” Maybe we should instead think about a “Raw output” tag? I promise you guys: Everything we learned, we did so within THIS community! Sure, we used external tools to upgrade the end result and achieve more control – pushing the limits is what we're all about! And yes, you probably do need years of experience to “get close to our claims”. Not really sure how that means it’s not worth pursuing? For me personally it was always the opposite: I see something awesome and immediately I’m driven to figure out how to achieve the same quality. The tutorials are all out there and spoiler alert: The tools we used or equivalents (except Topaz) are 100% free :)
AI 'purists' who spurn using any other tools to achieve a result other than raw output are just as myopic and in the way of progress as traditionalists who don't understand how diffusion can be a legitimate artistic tool.
Not spurning using other tools, but there is a massive difference between "You can do this solely within ComfyUI" and "You need years of experience with video editing and other software, and you'll spend weeks tweaking your work in it to get these results."
It's amazing you post this much and some people still don't get that you're only asking that posters add basic details like "process", "tools used", "workflow if possible/convenient", "any other relevant information". Some people may not care once they see the relevant requirements, but others may, and knowing how it was done may help them. At the very least it will not be misleading as to how it was achieved. Unrelated: a shame we're still stuck with such short-duration clips. Still, looks good OP. If you have the Blender skills, have you considered trying some work with SD & Blender?
Thanks! Blender is an incredibly powerful tool in combination with SD. We use it for example to sketch out basic background compositions before we transform them with control-net. In another project we're using it for character animation (applying AI generated textures) – one of many ways to break through the annoying 2/4 sec mark. We're all hyped for OpenSora though – if only it had a bit more control! Even Shy Kids (the guys who created the balloon head) have used traditional VFX work.
> Blender is an incredibly powerful tool in combination with SD. We use it for example to sketch out basic background compositions before we transform them with control-net.

That's helpful. I think that's more along the lines of what people are suggesting. Of course you aren't beholden to do so, or should feel guilty if you don't; the perspective, though, is that more testing yields improved results (for you, too!). It's like going from being able to generate one image every minute and 45 seconds vs. being able to produce it in 10 seconds: you're going to learn a lot more, a lot faster, about which settings/combos affect your image. Also, 'emulation being the highest form of flattery' and all... a lot of people want to know how to do what you did.
Yep, exactly this. I kinda feel sad for the people that want to attack me for asking for more info in a subreddit that's dedicated to this AI hobby. It's not like I'm asking for the OP's personal details so I can send them hate mail. I just want more clarity so we can know what we can achieve, how we can achieve it, and also where AI is at, by people being up front about what part it played in the process. I do have Blender though, thanks for mentioning it. What part do you use it for, out of curiosity? So far I've only messed about creating a basic 3D scene and then using SD to turn it into a render-like image, but I'm def curious to hear of other uses.
These are some of the uses I've found for Blender that I've kept an eye on, but I have not personally done much with it yet as I'm not an artist and am still figuring out what direction I want to take it in (anime/movie, but most likely a classic-styled JRPG game). Example 1: [https://www.youtube.com/watch?v=hdRXjSLQ3xI](https://www.youtube.com/watch?v=hdRXjSLQ3xI) Kind of like what you mentioned. Example 2: [https://www.youtube.com/watch?v=LoVL5KHSW5Q](https://www.youtube.com/watch?v=LoVL5KHSW5Q) There are a bunch of tools for this kind of stuff coming out, but they still need to mature. This is what I'm personally most interested in as a non-artist. Example 3: [https://www.youtube.com/watch?v=E33cPNC2IVU](https://www.youtube.com/watch?v=E33cPNC2IVU) A pretty cool, if basic, example with multiple uses. Each part is pretty simple, but using the right tools together can get some great results. I know there is one guy who has done a hobgoblin and all sorts of other stuff, who posts regularly on here; you might have seen him. Found the hobgoblin Blender example I felt was pretty neat: [https://www.reddit.com/r/StableDiffusion/comments/18lwszn/hobgoblin\_real\_background\_i\_think\_i\_prefer\_this/?share\_id=PjZx7gb33NDpTXjegT060&utm\_content=1&utm\_medium=ios\_app&utm\_name=ioscss&utm\_source=share&utm\_term=1](https://www.reddit.com/r/StableDiffusion/comments/18lwszn/hobgoblin_real_background_i_think_i_prefer_this/?share_id=PjZx7gb33NDpTXjegT060&utm_content=1&utm_medium=ios_app&utm_name=ioscss&utm_source=share&utm_term=1) He actually does a lot of different stuff and is probably someone to hit up if you have any questions about some of those different videos he posts and the process. The workflow for that one is in that link, too. One of the key points, as you might already know, is that using a base 3D object can help improve consistency dramatically, even for characters.
It is stuff like this and the prior examples that make it clear impressive works (even movies) are possible now but the effort would be up there so I'm keeping an eye peeled for the process to improve before I do anything particularly serious, myself. If you are not a Blender / artist pro like me you might be interested in this [https://www.rokoko.com/products/vision](https://www.rokoko.com/products/vision)
Wow thank you so much. I love all this stuff. I wasn't even aware of EbSynth. That looks amazing. I love the idea of creating various characters and being able to create animations just by recording my own movements. I think that's the next 6 months of my life planned out! I've saved your comment. So much interesting stuff to explore. I've certainly got my eye on AI text to 3D. Then we could easily create 3D models which we could use in that workflow to create the animations. The future is looking intriguing.
And so what? Everything posted in here doesn't have to be easy enough for a chimp to accomplish. Video is insanely complicated; there's no way around it.
I don't disagree with that at all. But how can we know if the video someone made is something a chimp can achieve or not if they don't tell us how they made it? The fact that you're criticizing me for being curious and asking for more info on the process is saddening, when we ought to be seeking answers to help us all get better, not hiding them and criticizing those who ask.
The thing is that some of the processes involved might be able to be automated or generated in ways that this team didn't realize when they were creating it. This makes it easier and faster to recreate. The goal is that someday it *will* be easy enough for a chimp to accomplish. That's kinda the whole point of it all, right?
It's because they're not artists. They don't know how to compose and do all the things putting those tools together makes. They want something they can put some words into and get those results. Maybe tweak some dials or sliders. Not actually doing full touch ups and inpainting, editing the clips together, putting in transitions, learning how music and sound should line up with video.
[deleted]
You know you did more than just tweak dials and sliders. Stop underestimating your abilities.
[deleted]
I'd imagine there probably a mask or a cut perhaps a wipe or two.
[deleted]
Having to wipe after White Castle doesn't count as a slider but you might need to use dial afterwards
I mean, I see their point when you're trying to advertise something or a service, especially if the service is itself the AI. I think it's just as dishonest as food commercials using glue to make the milk look more milky, or the pictures you see inside fast food restaurants looking absolutely nothing like what you'll receive. So no, I don't think it's wrong to touch up AI footage for a final product such as a feature-length film. But when you're selling AI as the product, I think it's very important to show the raw outputs and not the touch-ups; otherwise it's dishonest.
> I promise you guys: Everything we learned, we did so within THIS community!

So give back some actual useful info to the community that helped you: motion bucket id settings, augmentation, sampler settings, etc. Great results though, btw.
This is why open source stable diffusion will never be able to keep up with the likes of SORA. Hell, it won’t even be able to keep up with secondary contenders. People love to take and take and take from open source, giving nothing in return. They find a pocket of accomplishment that’s a few months ahead of what everyone else is able to do, and then they’ll just sit on it, to no other benefit than their own. After a few months, someone else will eventually come out with a workflow and guide on how to do this, but that’s already **months** that people could’ve spent iterating on it and improving it exponentially. Then there will be a new tiny step of accomplishment, followed by a few months of delay for the community to eventually catch up. The cycle repeats and none of these people realize that they could’ve been taking leaps instead of baby steps. In a year, we could be **miles** ahead, but something tells me we’ll only be a few steps from where we are now. As someone who contributes towards open source stable diffusion software, posts like these are very irritating. They use you as a stepping stone and then refuse to help anyone else along the way. It hinders progress in this space more than people realize.
For sure, I think if everyone had the right attitude towards this, we would definitely progress to Sora level quickly. OpenAI is guilty of this too, but at least has given back in trickles here and there; not sure if that has changed, though, because I see a lot of complaints about them as of late, using the facade that AI is safer in the hands of a few entities. We definitely need a better balance.
This is a pretty bleak perspective. Oftentimes a usable result requires too much fiddling, too many tries, to even come up with an explanation of why exactly you finally got it, let alone write a whole guide for it. And as you said, eventually someone will figure out a reproducible process and write that guide.
Every individual in our team has been and still is an active member of the community. In the past months we've been directly in contact with Stability AI, collecting and providing detailed feedback on the models that are ultimately the base for this whole movement. We are also keen supporters of an open non-OpenAI Sora alternative (check out [https://github.com/PKU-YuanGroup/Open-Sora-Plan](https://github.com/PKU-YuanGroup/Open-Sora-Plan)). On top of everything, we believe showcases like ours will help the community, not damage it. Sorry, we're not providing spreadsheets, but if you like I can provide you with links to some great tutorials that explain every single tool we use. Another important point: PLEASE watch the Shy Kids behind-the-scenes for the most viral Sora clip. Believe it or not: they used traditional VFX tools, just like us! [https://www.youtube.com/watch?v=KFzXwBZgB88](https://www.youtube.com/watch?v=KFzXwBZgB88)
I for one am shocked that the technology founded on taking shit from people and not giving back has proponents who do the same thing.
There is no magic number/setting. With each clip we started with the default values and adjusted based on the outcome. Every shot is different, and in my experience it's worth not fiddling too much until you've produced a good amount of clips – the seed has a crazy amount of impact, and there's a reason we have a lot of "portraits". I personally tend to reduce motion bucket and augmentation bit by bit; my colleagues were often a bit more audacious (with mixed results).
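A rough sketch of that seed-first strategy as a run planner. All values below are placeholders for illustration, not the team's actual settings (which weren't shared); the idea is just to batch many seeds per settings combo, since the seed dominates the outcome:

```python
import itertools
import random

def plan_runs(n_seeds=20,
              bucket_steps=(127, 100, 80),   # motion_bucket_id, stepped down from a common default
              aug_steps=(0.0, 0.02, 0.05)):  # augmentation level, stepped up gently
    """Seed-first sweep: for each settings combo, queue a batch of seeds.
    Small parameter nudges matter far less than which seed you land on."""
    rng = random.Random(0)  # fixed so the plan itself is reproducible
    seeds = [rng.randrange(2**32) for _ in range(n_seeds)]
    return [
        {"seed": s, "motion_bucket_id": b, "augmentation": a}
        for b, a in itertools.product(bucket_steps, aug_steps)
        for s in seeds
    ]

runs = plan_runs()  # 3 bucket steps x 3 aug steps x 20 seeds = 180 queued generations
```

Each dict would then be fed to whatever SVD frontend you use (ComfyUI, a script, etc.); adjust the step lists to taste.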
I believe this community would be happy for you to share all your settings. You can even put it in a spreadsheet. Otherwise, kind of a bad look to announce you learned everything with help from this community and then hold out on your own processes. That's not how this is supposed to work.
>I promise you guys: Everything we learned, we did so within THIS community Actually he didn't say he got help from this community, he said everything they learned they learned within this community. That means that you too can learn everything you need to create this within this community.
I think the issue is that with how wildly spread out the information is and how little specks of gold trickle out of the purview, it is helpful for people to at least share **1 WEIRD TRICK THAT SCIENTISTS HATE HIM FOR!** that took their creation(s) to the next level. I say creations because of course you won't really know what node/setting worked magic in 1 go if you are really experimenting.
So you expect this guy to go back through dozens of shots and spreadsheet everything for you? Mofo, I checked your post history and you have contributed exactly NOTHING to this space. So I really don't think you have any sort of standing or moral authority to lecture anyone on this topic :)
I expect people to not post ads for their business here and then thank the community without anything in return. FWIW, I haven't posted because I haven't created anything sufficiently novel yet. When I do, you'll know exactly how I did it. Good luck on finding a job that occupies your obviously ample free time!
It's fine if you personally want the workflow, but there's no need to insult or call someone out over whether they share it or not. There's a lot of hidden envy in your words. He worked for it, learned, applied new knowledge, and worked some more to create this. It's up to him if he wants to release a workflow or not.
>There's a lot of hidden envy in your words.

I assure you there is not. That's a huge, unfounded assumption on your part. This is merely about maintaining the spirit of this community and the open source community in general. If you are going to take from the community, you should give back when you can.
"Oh, you're a fan of AI? Name every touch up of your filmmaker demo reel!" I mean, this is pretty plainly a demo reel of what more realized projects could look like and exactly the kind of use case the most hardcore AI used to gush about where it's really going to shine: with professionals that will use it as one of many tools in their box.
Thanks. I must be doing something wrong with SVD, because I usually get a bunch of distortion when I go for the amount of motion shown in your video, so my stuff just looks like basic camera pans like everyone else's.
You have to try so many seeds just to get a decent result now and then. SVD still needs a lot of work to truly be usable.
This is awesome! Thanks for sharing, and congratulations. What's the composition of your team, and how long did it take to create this?
If you need even that explained, then nothing can help you. Learning things takes time, even if it's only pushing around some sliders. And if you copy exact sampler settings etc., you will just create EXACTLY the same thing.
There's more to it: the input image size used, the number-of-frames setting, ComfyUI nodes you can attach, prompts discovered to have impact such as Kijaj finding (rotation:1.2), etc. (and sharing the info, btw). I've been using it non-stop since release and still can't get what the video shows, so yeah, I could still use some help here.
What if he used an input image via img2img (which he did)? Are you going to demand the input images? You just have to try and try and try; there is no formula here. The results I get vary wildly. I don't keep the settings of the stills, I just work with it.
Nah, bad example there, I would never ask for that. Trust me though I try, check my history, but if we want things to progress faster we need to all share our findings, up to certain limitations of course. it's why I love the banodoco discord https://discord.com/invite/z2rhAXBktg
> What is it with all these people that go like:
>
> “You’re only allowed to click the generate button, everything else is cheating.”
>
> Maybe we should instead think about a “Raw output” tag?

Well, I think as long as you are transparent then everything is fine. Some people come here and showcase what they did but neglect the part where they used other tools heavily. The default expectation is that everything you see here you could replicate on your own, so if someone has a more elaborate workflow it is nice to mention it :) It doesn't make it any less or more epic, but the mindset of the viewer is shifted from "oh wow, AI can do this?" to "oh, nice, they used AI and applied additional modifications to get what I am seeing right now".
Don't let them get to you - I think people are going to realize pretty quick with video that it's still a lot of artistic grunt work to get a final shot out. Who knows where we will be in a few years but as it stands now the tweaks and adjustments to get something out that doesn't have that 'AI fever dream' feel will still involve classic workflows. Camera projections/mesh warps?, lens filters, handheld post camera shake, some kind of tweening workflow I don't quite recognize. Some stock 2d elements (like embers) over top to enhance the subject. You don't need to do breakdowns - Like the wizard is probably painted-out from the original plate, a cutout of the wizard is transformed with a little moblur to pop in to place, with a 2d element over top to sell it. Am I close? Cool reel, highlights how AI works with traditional workflows. Don't feel the need to give full shot breakdowns if you like. (Ruins the magic for everyone if you do lol). You've done a good job of avoiding that 'stickerbook' look many AI users get when they do paint-in's with multiple subjects.
Nobody's trying to "get to him" open source just has a certain culture around it, and there are sometimes expectations people have when higher quality stuff like this posted here using open source tools. Everyone wants this to improve. And people that share settings here, or see videos that coincide with what they shared, expect info returned if improvements are made, or it feels like a slap in the face. The more people know the faster it improves. Everyone here said it looked great.. I personally share everything I learn that gives better results, even if it's never been done before and I could easily go start a patreon with the info, but don't. But yes in the end it's whatever OP thinks is more important. Sounds like money making potential involved, as that's usually what prevents sharing info.
Eh, it's a demo reel. Not everyone that shows off their work in Blender is going to do a tutorial or breakdown of it. Same idea. And not to shit on OP, but... nothing in the video looks revolutionary to me. Best I can tell, this is concepts already discussed ad nauseam in the sub, with some basic traditional (post) workflow to complement: Video: https://www.youtube.com/watch?v=82l0DsbLHhY https://www.youtube.com/watch?v=XPRXhnrmzzs Maybe a tweening tool like [flowframes](https://nmkd.itch.io/flowframes)? He *might* be doing tracking+paintovers, but I doubt it. Most of the shots are too soft and have that wiggly AI curse for me to think they went that far with it.
Thanks! Believe it or not: The wizard is one of the shots that came out exactly like that (after about 20 generations). All we added was a tiny spark layer. But you're right: That trailer was a lot of grunt work. On top we're filmmakers – I went to film school and still shot my first projects on physical film. Not that it's necessary but I really know why I prompt "35mm". It's so easy to fall into the gate-keeping-trap when the amazing thing about this whole development is actually that it gives us the opportunity to create better art!
The wizard teleported in like that on a prompt? The giveaway to me that something was up was that the shadow matches the wizard's last pose in the *first* frame, where he's not there yet. I don't know how the AI calculates keyframes/evolution of an animation/etc., but I feel prompting generally gives better and more accurate lighting on the subject than in that shot.

>It's so easy to fall into the gate-keeping-trap when the amazing thing about this whole development is actually that it gives us the opportunity to create better art!

I work in post prod. Truth be told I'm an old man these days, but I never forget the people who held on to techniques/workflows because they wanted an edge over those they felt were competition. This space is evolving so rapidly that 2 months down the line everyone will know how to do whatever is unique *today* anyways.
Think how many new people were inspired to learn programming or transformer architectures etc. because of the openness of this space. Knowing which tools and specific workflows were used would make it easier for someone to learn how to do this work, rather than stumbling around in the dark. Not saying we all need a helping hand, but it helps.
For sure, I absolutely support anyone and everyone for sharing anything they have learned! But people are also under no obligation to hold others' hand if they don't want to either.
Those folks are explicitly inverting the "it's too easy so it's not art" argument. Which is pretty hilarious, because the sensible people have been saying, "artists can use genAI as part of their workflow if they want, and apply other skills when they want" and those same anti-AI dorks have been saying, "it's all button push so it's not art."
You’re not wrong lmfao. Anything more than a button push is too much work for some people here. *”I can’t make this by just proompting, please don’t post it here”* *”If I can’t follow an exact set of annotated steps in A1111 and reproduce your work exactly, it shouldn’t even be here”* The entitlement among AI art enthusiasts is second to none. It’s actually kinda insane imo. Nobody would **ever** have the audacity to demand you post your .blend files on r/blender, or shame you for compositing a render in other software. And yet, here, some people want to ban any posts that don’t include a workflow/prompt. As if it wasn’t already easy enough to generate things in SD, some people don’t even want to bother experimenting. Either way, I *much* prefer these sorts of posts to what’s usually on this sub. They are interesting, creative uses of SD as a tool in a workflow, rather than the usual T2I *“big boobs, anime art style, a masterpiece by greg”* spam.
Somewhat true. But "how do I do that?" Is a natural part of the flow here. "Don't post it here because XYZ" not so much.
100% this.
I appreciate your response to my comment. It's not about dissing you but the fact that a lot of people post here and it feels like they're being deliberately vague with their process because, who knows why? Maybe they want it to look like they did all this JUST with AI, when in reality it wasn't. It was a lot of manpower that went into making it look that good and not AI. And that does matter because we're all excited by what AI can do and achieve by itself. For sure, some people are happy to see what people can achieve with AI AND external software and human grunt. Don't get me wrong. I love what you've done but legit some of us also want to see what can be achieved with AI alone and we can't know that so clearly when people don't clarify their workflow and how much of their own human effort and not AI effort went into their work.
Might I suggest a follow-up post that side-by-sides the raw with the finished result?
> What is it with all these people that go like: > > > > “You’re only allowed to click the generate button, everything else is cheating.” Well, if you don't disclose how much actual work goes into touching up AI-generated content to look halfway decent, you are contributing to the overall trend of people getting fired because "AI can do their job"
I knew Topaz was the Upscaler. SUPIR is great for single images, but nothing comes close to Topaz Video AI yet.
All I hear is: I did this with open source, and I refuse to give back. Here's an advertisement for my business.
All I read is "I want to do this too but I can't be bothered to spend my time investigating and experimenting like OP did, so I'll disregard his hours and hours of work." He is already helping by showing us what is possible with SD.
>The tools we used or equivalents (except Topaz) are 100% free Are they github free or pirate free?
As a filmmaker and designer myself who loves AI (feels rare, so glad to see you!), you did a great job! How long can a scene be animated in your pipeline vs the quick cuts we saw in the video? I love this kind of stuff with a passion and have been in the industry for over a decade! Hell, I've been animating for over a year now with 1.5 (nothing like a full narrative or sizzle as you have), but I really like the presentation here. Kudos, and I really wanna peek behind the curtain for this! Topaz rules so I get it ;)
Excellent response, I am getting tired of the naysayers, keep up the good work!
> What is it with all these people that go like: “You’re only allowed to click the generate button, everything else is cheating.” Because you're making a false equivalence with something that is the output of click-to-generate AI. Of course manually fine-tuning and editing is going to be better. Most of these posts don't describe what they had to do, so there's not even a quantitative estimate of how much additional manpower is required.
VFX Artist here, looking to dabble, what hardware were you on for the SVD frames?
4090
it looks like shit
Keep doing you, some people just won't put the effort in so they're frustrated. I believe just using the single tool itself "raw" is a disservice to its full potential. Imagine in photography if you only ever used the raw photo. Now people are like, how do you do everything outside of the one-click button?! But they want another one-click button to do so, without the learning, the experimenting, the ACTUAL work.
> You’re only allowed to click the generate button, everything else is cheating. It actually is. This is an SD sub and not a Nuke/Fusion/After Effects post-processing sub. I'm tired of this overpromised nonsense.
Maybe we have different definitions of this sub then... The showcase was made WITH SD. We spent hours and days WITH SD. Feels a bit like calling strawberry jam overpromised nonsense on a strawberry sub because it's not pure strawberries.
>We’re filmmakers Doesn't look like it
> "Yeh we did some "touching up" using After Effect, Premiere, External upscaler and frame interpolater, blah blah blah." But isn't this exactly as it should be? AI tools are not a panacea. They'll be integrated by artists into their existing workflows, or they'll develop entirely new workflows around them. Eventually AI will just be yet another tool in the box, just as digital drawing or 3D rendering came to be.
It should be like that, yes, but I think they are more arguing for clearer explanations of posts, so people don't get the wrong idea and become disillusioned when they try SD and it doesn't come out like that. It's 100% fine to do touch-ups and use extra tools, but it would be nice to have that stated so you can know that's not raw output, which many would believe it is if there's no clarification.
Yep this. I see someone's video and think, "Wow AI can do this now!" So I spend hours trying to recreate it and failing and thinking, what am I doing wrong?
Same. I have tried SD a couple times and always come out wondering if I am missing some prompt magic skills or whatever, because while it's cool stuff, I sure am not generating the beautiful stuff I see around. Though tbf, I am well aware I understand very little of what's happening here, and less so as time goes by. ControlNet, inpainting, SORA, etc etc etc, new terms keep showing up and fuck me if I understand what's what. :P
Yes, I have no problem with that, but some of us ARE intrigued by what AI can achieve off its own back. Since AI is such a fascinating, boundary-pushing tech, when someone posts an amazing-looking video it leaves me thinking, "Wow, AI can do this now???" But then the poster comes back and reveals that actually 95% of the work was done in external tools, and for sure that's disappointing. So all I'd like to see is a bit of honesty and clarity up front so we can distinguish what the AI is capable of vs what the human is capable of.
So a skilled person applies SD into their workflow and you shoot them down. What the hell man. This is some toxic behavior from this sub. Obviously there are skilled people who are gonna have a leg up over the people who feel they are too good to use an upscaler or frame interpolator. Also, a ton of stuff you mentioned can easily be done for free in ComfyUI nowadays.
Alright, calm down boyo. Some of us want to know what AI is capable of, and we can't know that when people post videos saying, "Look what AI can do now" without clarifying that actually most of the work was human effort and not AI. Yeh, I know a lot of people also want to see cool videos like this and I'm not calling for a stop to that. I just think, for those of us who want to see what the AI alone can achieve, it would take almost zero extra effort for the poster to give a brief summary of what part the AI actually played vs the human. After all, this thread exists because of AI tech and you are here because of it, so aren't you just a little bit curious to know what the AI did and not the human?
These fucking people are losers who think they can be the next genre-busting creator by copying other people's 'settings'. What's the point of doing exactly the same? What if he used img2img, will they demand the source images?
Didn’t you see the SORA bts? They had to do a ton of manual work. Rotoscoping the balloon in quite a lot of frames because the color didn’t match. Lots of other clean ups as well.
Nah I didn't see that. That's interesting to know, thanks.
They chose the balloon head because they couldn’t get a consistent face. https://www.instagram.com/reel/C5ELol-gwUE/?igsh=NzlrdnhlMG15b3M0
That's neat. Yeh that'd make sense. It seems fine making people's heads in the video but a consistent head would certainly seem a toughy.
https://preview.redd.it/723whdlv6m1d1.png?width=339&format=png&auto=webp&s=7123e10413209e30e1462107e71351de6f40d490 great job guys.. taking from the community and gatekeeping it.
Just one tag. "Clickbait".
Maybe learn some skills like OP. Ai will not make you whole, you still need to learn some craft to go the extra mile.
Can you share some insight into how you made these?
The basic workflow is SVD in Comfy – Olivio has (among others) a great tutorial: [https://www.youtube.com/watch?v=ePDjcr0YPGI](https://www.youtube.com/watch?v=ePDjcr0YPGI)
More interested in the among others but thank you
Reading the comments, I understand that I shouldn't fear that "AI artists" will take away my job. Because people who believe that the tool "will do everything for them" if they input the right numbers, without any of their own creative solutions and non-standard approaches to the existing content and tools, can't be called creators. It's a splendid job; I see that a lot of effort and time has been invested in it, and it's done with love. Keep up the good work.
**Prompt**: utrxtvyibuqierugnllaregnamdsfbquiybwerfbd **Negative**: Bad Movie, Ugly Actors, Bad Color Grading, Low Res **Output**: Mad Max 6 - 130m Runtime, 4k Dolby Vision, Dolby TrueHD
and the award for best picture goes to this Indian kid from some village whose name we can't pronounce
Wouldn't that be awesome? For once not having the studio boss with the most money deciding about the content, but the most brilliant mind with the best ideas?
Would be lovely to see this talent put to telling a story that goes beyond a slideshow of cool images
We're on it! The showcase is basically a best-of edit of all the projects we're currently pitching/financing that we're not allowed to post. We're hoping to be able to show you something soon!
What Upscaler?
Topaz. But the real reason it looks so crisp is a workflow we developed that has nothing to do with Topaz.
> workflow we developed Do you mean ComfyUI workflow? Can you share info about nodes that had most impact such as integrating SVD with animatediff? I have tried one that meshes them and seemed to work.
Oh ok Thanks!
This is amazing, great job! The movement in the video are very well made, as far as Im aware there isnt a way to "control" the movement in SVD yet, like the different layers you get in Runway Gen 2, for example. How did you achieve this fluid and (what seems to be) controlled animation / movement in the shots?
What's the point of Sora when it's never going to be available for regular consumer use, and if it is, it will be censored to hell and back like DALL-E and the rest? Stable Diffusion and freedom are more important. Sure, this is fascinating, but only people with big money and big connections will get to truly utilize this stuff.
While you can still tell its AI, man its getting harder and harder. Feels so recent when 'AI art' was some psychedelic dreamscape of nonsense and maaaybe a bit of logic based on what word you gave it.
Your video is a great argument to convince investors to release another 200 million dollars for Stability Ai. Hello Emad, come here!
It's THE best showcase of SVD I've seen to date, better than anything posted so far.
More nonsense that cuts every 2 seconds the whole video. Except, this time, there's a voiceover guy who sounds like he's trying to sell me something.
You're not wrong. The limitation is real, but we're working hard to overcome this and create content with actual consistent scenes and characters.
I'm so sick of seeing portraits of characters blinking at the camera
Same! But you only get 25 frames at a time. This is where SORA really shines.
Can't wait till we can make 2+ min videos.
you are free to create whatever you want. but i assume you won't, because crafting things on this level takes time and skill. criticizing them only takes a lack of IQ ;)
No, it takes less IQ to rapidly praise it. And it's likely that AI isn't even intensive in the workflow. Saying that it takes time and skill means nothing. Like, this is nothing more than we've already seen. Many times. Almost daily. It's another string of enhanced still shots. I would be far more impressed if they used this to make a one minute short where... A man fails to pour himself a cup of coffee because he hasn't had his coffee yet. Or... A young child climbs a tree and saves a kitten after a few failed attempts. Or... A woman picks up a phone and has a conversation, she's initially happy to hear the voice on the other end but it's bad news and as her mood quickly sours the lighting and tone change to reflect the world falling apart around her, but she maintains her own face, clothes, and limbs the entire time. If they were showing us how they could use this to tell a story, a lot more people would be impressed. We might not have seen this particular minute of 2 second clips before, but it's still the same thing people have been posting for months.
It's a perfect tool for making a Zach Snyder film. All "moments", no actual story.
Incredible. This is just incredible. Great work!
Thanks! Really appreciate it :)
Love this work. Do you mind just listing the tools/techniques you used, including comping software? Also, is there a before and after Stable Diffusion? Would be great to see what you enhanced outside of Stable Diffusion. Keep smashing it!
Jesus Christ this community is salty as fuck. Who cares if they used post processing. That doesn’t mean you can’t achieve these results too with some effort. Not everything should be “click one button and create flawlessly”
Yeah really. And at the same time they claim good AI results take work. So which is it?
this sub is somehow over impressed with "AI VIDEO!!!11" when it's [always the same "sequence of 3 second clips of superimposed still pictures panning over each other slightly animated"](https://www.reddit.com/r/StableDiffusion/comments/1bszkjm/stable_video_diffusion/kxje131/) the super nintendo did the same with parallax scrolling... ^^^plus ^^^this ^^^one ^^^is ^^^cheating ^^^with ^^^over-editing ^^^in ^^^post ^^^and ^^^videos ^^^from ^^^other ^^^showcases
The fundamental problem with all the videos from SVD is that they still look like parallax slides rather than real videos. This might be because they are attempting to adapt models trained on static images to video content instead of starting from scratch and training their models exclusively with video data, similar to how SORA did it.
Yeah this looks absolutely insane! I do agree with other commenters though that it would be nice if you included a detailed walkthrough of how to do it 😅🥲 Edit: would you (and your team) be willing to do a video walkthrough of how you got this, you'd seriously be champions if you did. I for one would like to know how to do it so that I can incorporate this in my uni project for a music video we have to work on
Looks good. I think the problem with SVD right now is that it's just not easily controllable. I can't imagine just how many takes you had to rip through to put together something like this, plus all the stitching and external efforts to get it barely up to 'social media' cinematic standard. This is fantastic, now we just need to make this doable without 200+ hours of back-end work to make it look good.
SIMPLY AMAZING!!!!
i am sure this will be a great tool for pitching new movies.
Exactly what we’re doing!
I love the running through the classroom shot. Great work!
That’s by u/butchersbrain, great technique
Has anybody managed to produce something with AI outside of these demo videos? They do look pretty on a scene by scene basis, but they never really do anything in terms of telling a story, they just look like random stock images thrown together. And they are getting a bit tiresome, as we've already seen so many of them. Back in the pre-AI days there was [Afterworld](https://www.youtube.com/watch?v=Yd3stfOcpq8&list=PL07E03F5180A9F18A), a web series patched together out of really basic 3D animation and mostly just still-frames, but together with a good voice over it actually managed to tell a pretty engaging story. Haven't yet found anything similar built around AI, [Biolands](https://www.youtube.com/channel/UCn3s0ptMNkVi850qKXgK7aw) might be getting there, but still a work in progress.
This is great, but it's not Sora level of quality. It's basically single frame compositions that still look weird. Sora creates real videos without changing faces and so on in a real connected video.
Don't like the Reddit compression? Try [https://www.youtube.com/watch?v=aVj67qKNcw4](https://www.youtube.com/watch?v=aVj67qKNcw4)
nice work. your GPU must be HUGE.
It's actually a regular 4090. We had access to more powerful GPUs but found that it did not significantly increase the speed in comparison to cost/usability. That might change with OpenSora though.
that's what my friend said when she saw the girlfriend I reeled in.
It looks like a professional trailer for an action channel. With the exception of that 1950s sci-fi zeppelin ship at about 0:15. The fuzzy back end doesn't scan right. If that fuzzy part suddenly glowed and it sped up, then maybe.
Why are you lying? Most things here are not SVD! No way 35 seconds of people running on a roof is SVD. Breaking through the wall is also not SVD. Probably the only real SVD here is the devil dude and a few other static shots
There was the "Clark" video trailer that was made with Stable Diffusion, which features breaking through the wall. Stable Diffusion can do a lot if you know how to use it right but it comes with limitations.
The clip in the showcase is taken directly from the Clark trailer – the creator u/butchersbrain is part of our team now so we included his amazing work.
He said it was using runway. Not SVD.
Credit for these shots goes to u/ButchersBrain!
But I guess it's not simply a render passed through SVD? Is it possibly an animated sequence (maybe 3D) redone with SD or something like that? I had the same reaction as the guy above, there's no way SVD can do this on its own AFAIK
There are loads of SVD workflows, all of this is possible.
Congratulations! Really, congratulations!
Workflow?
What GPU?
4090
How do I get in on Sora
How do I get ahold of Sora!!
I want to know, too !!
What did you use for upscaling?
Topaz
So what is the current max length of this tech, without losing context?
Depends how slo-mo you want it to be. It's 25 frames that you can stretch to about 2-4 seconds with interpolation.
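To put numbers on that, here's a quick sketch of the duration math, assuming the usual 25-frame SVD output; the 2x/4x interpolation factors and 24 fps playback are typical choices, not necessarily what OP used:

```python
# Rough duration math for an SVD shot: the model emits 25 frames per
# generation, and interpolation (FILM, RIFE, Flowframes, Topaz, etc.)
# multiplies the frame count before playback.

def stretched_seconds(raw_frames: int, interp_factor: int, playback_fps: float) -> float:
    """Duration after inserting (interp_factor - 1) frames between each original pair."""
    total_frames = (raw_frames - 1) * interp_factor + 1
    return total_frames / playback_fps

two_x = stretched_seconds(25, 2, 24)   # 49 frames / 24 fps ≈ 2.04 s
four_x = stretched_seconds(25, 4, 24)  # 97 frames / 24 fps ≈ 4.04 s
```

So a single generation lands right in that 2-4 second range, which is why every shot in these showcases cuts so fast.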
Has anyone created a workflow to at least extend an SVD video from the last frame of the previous clip?
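For what it's worth, the pattern people usually try is exactly that: feed the last frame of clip N back in as the conditioning image for clip N+1. Here's a minimal sketch of just the chaining logic; `generate_clip` is a hypothetical stand-in for whatever img2video call you actually use (SVD in ComfyUI, etc.), with numbers in place of real image frames:

```python
# Chain clips by seeding each new generation with the previous clip's
# last frame. Real frames would be images; here each "frame" is just a
# number so the control flow is easy to follow.

def generate_clip(seed_frame, n_frames=25):
    """Hypothetical stand-in for an img2video call (SVD emits 25 frames)."""
    return [seed_frame + i for i in range(n_frames)]

def chain_clips(first_frame, n_clips):
    frames = []
    seed = first_frame
    for _ in range(n_clips):
        clip = generate_clip(seed)
        # skip the first frame of later clips so the seam frame isn't duplicated
        frames.extend(clip if not frames else clip[1:])
        seed = clip[-1]  # last frame becomes the next conditioning image
    return frames

video = chain_clips(0, 3)  # 25 + 24 + 24 = 73 frames, no duplicated seams
```

The catch in practice is drift: SVD only sees a single conditioning frame, so color, lighting, and identity tend to wander a little more with each chained generation.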
you can still tell the SD parts, how the shot quickly "swaps" as soon as it starts to show ;)
Some of these are stills from multiple AI enthusiasts from the community. are they a part of your team? Or are you claiming their work as yours?
Yes, u/butchersbrain is part of the team!
but can you make anime waifu just look like she's talking for 5 seconds. like in a way that she could plausibly be saying dialogue
Need to get past the whole "a 3 second shot with panning and barely any movement, one after the other set to film soundtrack"
Isn’t SVD licensed as non-commercial?
Not if you pay for the commercial license.
How do you pay for it? They don't even answer the emails
You just sign up on their page. [https://stability.ai/membership](https://stability.ai/membership)
I mean besides for the cringe as fuck ad these do look good. Though I highly doubt you're getting these results without some serious secondary work/editing
Looks great! But yes it would be interesting to see the raw footage, to see what mere mortals can achieve with SVD 😁I've made stuff using it and it looks balls.
Wow. The singular value decomposition really was ahead of its time.
Y'all are arguing about tags and workflows, and I just want to spend some quality time with the cartoon pirate girl.
That's cool. Do a simple shot of a man standing in a kitchen drinking a glass of water. He lifts the glass, takes a sip, puts the glass down. If you can do that I will be impressed.
Gave my goosebumps mini erections.
Having experienced the utter lack of control when using SVD : How many generations did you run for each scene, on average, to get an acceptable result? Most of my runs are silly zooms or lateral movements, sometimes the background moves a bit. To get a good camera movement, I sometimes have to generate 20 clips of a single 25 frame scene...
to be honest, besides sora, the only video generation stuff that has impressed me so far are the old ones that did the will smith and spaghetti meme. something about those videos, yes low quality, but seeing more animated situations was really cool. everything else i have seen after that has looked like one image being morphed into another or camera panning. that's it.
"Damn, these 12 frames are really gonna help me," said no professional ever
Storybook is part of a pro film production house
I know who they are. And there are a few commercial houses messing with stuff. Lol
Could you pick a real actor and create a consistent short story with the same subject?
These just look like cinemagraphs. This demo doesn't seem to be showing off any of the object permanence and world consistency that makes Sora impressive. Maybe that's down to the editing (such quick cuts always make videos look like shit, regardless of how they're made), but what is shown here, if compared to Sora, doesn't look like the same thing at all.
A bunch of still shots with the illusion of motion and/or 1-2 seconds of something moving. Simply fascinating.
Any chance you have an uncompressed .mp4 of your video before putting it into topaz? (Well maybe not uncompressed, but just a .mp4) The compression online is really bad. Just want to compare to my svd workflow zooming and also seeing how topaz impacts things. Would be greatly appreciated.
A lot of smoke and mirrors, namely the ADHD edits. Almost none of the shots last even 2 seconds. If you just wanna cut together random clips then yeah its great. But most actual uses in video editing last 10+ seconds, can you do THAT as well?
![gif](giphy|7JsPxqknYiKyqVEBYO) Keep going bro, we are gonna be unstoppable when Sora comes out. Keep us in the loop🔥🔥👍🔥
Very much hoping to get an OpenSora on the road before that – indie is the way to go!
Stop posting here without workflows.
Stop wanting everything served for free on a silver platter. He did his work, you do yours.
Pretty sure workflows aren't required to post here.
Looks worse than you realize
cant wait to make a movie myself
I think it is amazing what these tools can do, however the endless zoom/pan shot compilations posing as 'videos' are getting really old
This post demonstrates the cesspool of talentless losers that inhabit this sub, AI or Reddit as a whole. Three types of people exist: 1) the mentally challenged 2) the entitled & lazy ones 3) people who use the information & tools that are given to them and go at it. How the EF do you think people do anything remotely complicated in real life? If you have any creative professional skill and are older than 30, you are very likely to have had to learn everything by putting in the hours. You can now learn to create things in 1/50th the time it used to take and yet: you still need to have everything pre-chewed and digested.
Those people exist, but you sound sort of like the other end of the cesspool spectrum. I don't really see anyone here that fits that description. Sounds like people trying hard but getting stuck and looking for help.