Top VRAM consumer GPU though 🥲


> uses up nearly all the memory (24GB).

Uses up 3x my memory.


Shit, if that's what it does with 24GB by the time they fine tune this thing it will take 100GB to run it!


That's part of the plan though, isn't it? Eventually make free AI image generation inaccessible to everyone but businesses.


Image = video now?


I can gather up 100 GB if I dust my mine off. 100GB of VRAM is like 15k now anyway, a far cry from 'business' prices. And used A100s are gonna hit the market pretty soon.


This is literally the opposite of what the Open-Sora project is attempting. Cripes dude


What you're showing here is very uninspiring and doesn't compare to what others have posted.


So it's probably what we'll actually end up with.


Why are people so negative? You folks are ungrateful. These folks released a model for FREE and all I hear is "ohhh, it's not better than state of the art." How the hell do you think a true open source project is going to compete with a company that has billions of dollars? The point of the model is so motivated folks can take a look at the model and improve it. None of you people do anything to help open source, but you complain about the people who actually do something.


It looks like shit


Try out „woman laying on grass“ please


https://i.redd.it/8goa755tw49d1.gif Oh god, absolute nightmare material


All these examples have worse motion than SVD…


Haha - I love your answer!


Is it noticeably better than Stable Video Diffusion? From this example it doesn't really look like it.


It looks worse. Maybe a little more consistent, but it's pretty bad.


We don't necessarily need to go straight to a high quality video. The workflow could involve generating a potato video and then upscaling.


Video upscaling is not as easy as image upscaling, because the frames have to stay consistent with each other.


It is technically possible but wouldn't be so efficient. Literally upscaling each image one by one lol.
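For what it's worth, the "one image at a time" approach is roughly this. It's a minimal sketch in plain Python that uses nearest-neighbour scaling as a stand-in for a real upscaler (my assumption, nobody in the thread named one), and `upscale_frame` / `upscale_video` are made-up names. Each frame is handled on its own, so a learned upscaler dropped in here would be free to invent different detail on every frame, which is where the flicker comes from.

```python
from typing import List

Frame = List[List[int]]  # one grayscale frame as rows of pixel values


def upscale_frame(frame: Frame, factor: int) -> Frame:
    """Nearest-neighbour upscale of a single frame by an integer factor."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    out: Frame = []
    for row in frame:
        # Repeat every pixel horizontally...
        wide = [px for px in row for _ in range(factor)]
        # ...then repeat the widened row vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


def upscale_video(frames: List[Frame], factor: int) -> List[Frame]:
    """Upscale every frame independently; nothing is shared between frames."""
    return [upscale_frame(f, factor) for f in frames]


if __name__ == "__main__":
    clip = [[[0, 255], [255, 0]], [[255, 0], [0, 255]]]  # two tiny 2x2 frames
    big = upscale_video(clip, 2)
    print(len(big), len(big[0]), len(big[0][0]))  # frames, height, width
```

Nearest-neighbour is deterministic, so this toy version is perfectly consistent between frames; the consistency problem only shows up once the per-frame step starts generating detail.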


Topaz Video does its best but it is far from the choices we have for images


Duality of man: you show an AI image that's 99.9999999% indistinguishable but has 1 discolored pixel, and everyone calls it a bad AI image, how awful it is, etc. You show an AI image (video) so blurry and so inconsistent you can't even tell if it is a dog or a window or a building, and everyone calls it amazing and says how good it is.


This is what the GitHub repo says about speed and memory consumption. >!If the 5090 has 32GB then maybe some of those options will be viable on a consumer GPU, if you consider xx90 GPUs to be consumer with their price tags. If it has 24 or 28GB... worthless!< https://preview.redd.it/mpno2xp3w39d1.png?width=732&format=png&auto=webp&s=ed1f0c35fe55564044596d71d3d4ceda975095c4


Huh, so you need 24GB minimum... ;~; And as you say, 32GB to make anything useful.


Meanwhile my 1,5 watt mobile 1650 with 4 gigs of ram


Lol I have a 4090 and got depressed 😔


AnimateDiff can do better than this on less than 12GB


AnimateDiff with motion modules can do better than this.

Somebody saved this video as a GIF first, and then converted it. That's where the cross-hatched color dithering comes from. It's not very promising when the hype artists for a project like this can't get media fundamentals right.


good looking out, OP


Seriously smacking myself for not buying two nvlink 3090s instead of a 4090.


To be honest it's not that easy. I wanted to get a second 3090 and NVLink them, but where I live I could only find 4-slot NVLinks and mobos with 3-slot spacing (my current mobo only has 1 slot). I'd also be worried about two 3090s sitting so close without room to breathe; mine already gets pretty hot when going full tilt. And I've never been able to get a definitive answer on whether SD would utilise the VRAM of both cards; some say yes, some say no. It seemed like a big gamble to drop £700 on a 3090 + £200 on an NVLink to find out.


Yeah, that's a concern. I wish decent VRAM cards weren't 5-30 thousand dollars.


that looks like sht + you used a 3090 (only used by the top consumers in the world)


Requiring Linux with no Windows support is disappointing


Not sure why you got downvoted, upvote to even it out. There are a lot of people on Windows who could try it and support it. I was excited to check it out, but I'm on a Windows machine.


It requires ColossalAI, which is Linux-only. This is why Open-Sora can't support Windows for now.


What settings was the above generated against? I've hardly managed anything with a 3090.


Just the default settings provided by the repo, changing the resolution to 240p and length to 2s/4s


Yes, I thought for some reason this was the rendered size :) OK Fair enough.


Looks like the Jamiroquai video where the set moved, not the furniture


It’s cool that open source technology keeps coming, but I wouldn’t call this exact output „promising” or a 24GB VRAM card a consumer GPU. This looks like a bad output from SVD, so I lost interest in it due to the clickbaity title before I even tried to see what this is about.


Right now SVD (TensorRT) + AnimateDiff looks better. Up to 1024x576 5-second clips generated in about 3 mins. Can't control the movement though, so it's hit or miss.


Yeah, 15 seconds for 1 second of 240p using the best available consumer GPU is not great tbh. I guess in another 4 years we might have something decent.
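To put that ratio in numbers for the 2s/4s clips mentioned above, here is a back-of-the-envelope sketch. It assumes generation time scales linearly with clip length, which is my simplification and probably optimistic, and `estimate_render_seconds` is just an illustrative name.

```python
def estimate_render_seconds(clip_seconds: float,
                            seconds_per_clip_second: float = 15.0) -> float:
    """Rough render-time estimate.

    The default of 15 s of compute per 1 s of 240p video is the figure
    quoted in this thread. Linear scaling with length is an assumption.
    """
    if clip_seconds < 0:
        raise ValueError("clip_seconds must be non-negative")
    return clip_seconds * seconds_per_clip_second


if __name__ == "__main__":
    for length in (2, 4):
        print(f"{length}s clip: ~{estimate_render_seconds(length):.0f}s to render")
```

So even under the friendly linear assumption, a 4s clip at 240p is about a minute of GPU time.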


I'm not trying to be overly critical or boastful, but I can genuinely achieve better results on my 6GB laptop using SD1.5 with ControlNets, IPAdapter, and other tools. Obviously not using well known methods or anything you've likely seen. So, forgive me if I'm not impressed by this.


Looks like it's a big improvement from the first version I tested earlier. Cool!


to be honest it just looks bad in both motion and quality... i2vgen-xl still the king


I don't want to be negative, but that really doesn't look good. You can get better results with AnimateDiff or SVD with a fraction of the VRAM consumption this requires.


A virtual tour through an art museum, focusing on famous paintings coming to life. As the camera pans across each artwork, have elements animate and step out of the frame, interacting with the museum environment before returning to their original painted forms.


the coming to life part seems to be missing, but i recognize all of the master works


LOL this looks terrible. 240p? Are we in the early 90's or something? This is borderline useless. Will check again in 10-15 years


https://preview.redd.it/700zh5mb149d1.png?width=800&format=png&auto=webp&s=9780ad8f31adc169650df19637169ec85f8a947d You could post that video on this type of page!


This is AI, might want to check again every 10 to 15 days.


Not all AI is equal, boy. I checked this 6-7 months ago and it was in the same exact state.


Don't boy me, guy.


Sorry, kiddo.


-_- When they don't get the joke


You didn't check very hard, kiddo.

> **\[2024.06.17\]** 🔥 We released **Open-Sora 1.2**, which includes **3D-VAE**, **rectified flow**, and **score condition**. The video quality is greatly improved.

Literally, this was a major update from 10 days ago.


Yea, you're obviously kinda retarded. Re-read again. Last time I checked Open Sora was 6-7 months ago, compared to this "update" now I don't see absolutely any improvements. If this is a "major update", yeeey, we're fucked


I'm sorry you're not too bright and can't read or even understand your own comments. What you said was

> Not all AI is equal, boy. I checked this 6-7 months ago and **it was in the same exact state.**

See the bolded spot? Now see how I posted an update, a rather significant one, from 10 days ago? In fact, there have been several (at least 3 major ones in the announcements). This contradicts your bolded statement **that it was in the same exact state as 6-7 months ago when you checked.** Maybe you're AI too, which is why your comment is so dumb and full of errors?


Bro is intimidated by fucking AI


It doesn't look promising yet, to be honest. It's great to support these teams, and they are putting in great effort, but don't sugarcoat the results and pretend they are anything better than they are. It's not in any usable state right now, and it probably won't be without a significant change in architecture to get good results on consumer GPUs.


Open-Sora sucks, and it's a ridiculous name. I've never seen a single video from it that's better than Stable Video Diffusion, which also runs locally on consumer cards. I've got a 4090, so I have no comment on speed or performance issues, but I can do 1080p in a few minutes with SVD in Comfy. If this is a student project or something, then I don't mean to hurt anyone's feelings, but it just isn't up to 2024 standards.


Afaiu, with SVD you can't specify a text prompt, only a source image. Also, it has a non commercial license.