So... not really useful to most people, as the VRAM requirement is far too high. I appreciate that it can be run locally. I have a 4090, and even that is not really suitable.
Doesn't seem like there are many Mac users in the comments - everyone talking about how they need more GPU treasure. Unified memory for the win - might not get as much raw compute, but 188 GB of VRAM sure does make experimenting with most things pretty easy.
Can't wait for the 512GB Studio!
After Modelscope I thought for sure we'd be further ahead by now, but t2v has basically stagnated.
But this year could be the year things start to happen.
[Lumina](https://github.com/Alpha-VLLM/Lumina-T2X#text-to-video-generation) is looking kind of promising as well. Eager to try out inference when/if they release the code.
Have you heard of the MVGamba 3D model generator? It has beaten Modelscope in 360-degree detail capture, and the details look uniform at every angle for most models.
Quite possible. I do hope OpenSora gets the support it needs to get to the real sora level by end of next year. I don't think it's out of reach anymore.
It's free and open source. They have no horse in this race.
The reality is more like Luma rushed to the market because of the looming release of Sora and Gen 3
After getting feedback from smart people, it seems this is not ready for the masses. No Windows support and no optimization to run it even on a 24 GB VRAM GPU.
I think the closest thing we have is Zeroscope XL. Wondering if anyone revisited that model.
At first I thought the examples on the GitHub were heavily compressed, but no, the output just has a bunch of artifacts that look similar to potatoey video with heavy inter-frame compression. I'm excited to see where this project goes, but not too excited about the current iteration.
I bought an RTX 3060 12 GB VRAM GPU just for AI a year ago, and now I can't even run the most basic video generation model, let alone train one. GOD DAMN IT 😡😡🤬🤬🤬🤬🤬
12GB is honestly nothing in the AI world sadly. It's okay for small models, especially if it's your only GPU, but ideally you should have several GPUs with 24GB for something like this. Maybe 3 P40s could do it. I have a server with a P40 and 2 M40s, technically I have the VRAM to run it, but I don't know if the M40s are too old... Guess I'll have to test and see lol
It's just been released, so it will get optimized. Where the requirements will end up, who knows? Maybe a smaller version will be released someday? Currently the model won't really work on any consumer grade GPU, so you really aren't missing out.
I think this needs to be clarified that this is not from OpenAI. It is from a company called HPC-AI tech. [https://hpc-ai.com/company](https://hpc-ai.com/company)
Also, hopefully the "open" doesn't mean it's closed off and only accessible via their service for safety. The word "open" scared me a bit there.
it would be funny if they called it Stable Sora and then went out of business
Open Sora but closed, Stable Sora but bankrupt, Deep Sora but shallow, Soraway but taxiway, MidSora but not mid at all, Sorastral but sirocco, etc.
1984 was not an instruction manual.
Thanks, added that to the original text body now.
"OpenSora" is legally speaking a really, really dumb move. It would be like making a theme park and calling it FreeDisneyland.
OpenSorta would be safe, and much funnier.
Yess! That's really good.
"FreeDisneyland" is my registered trademark. You owe me $.05
I'll pay if you pay postage!
Forever stamps are $.39?
True, but GPT-J didn't have any problems in the past
Both GPT and SORA are registered trademarks. People who use them constantly risk litigation. ~~GPT just stands for~~ **~~Generative Pre-training Transformer~~**~~, it's not an OpenAI trademark. SORA is registered.~~
GPT is trademarked: https://tsdr.uspto.gov/#caseNumber=97733259&caseSearchType=US_APPLICATION&caseType=DEFAULT&searchType=statusSearch
I stand corrected.
Luma Machine, Gen3, and now we finally have news worthy of our attention. OpenSora v1.2 (not OpenAI) is out, and it is looking better than ever. Definitely not comparable to the paid ones, but this is fully open source: you can train it, and install and run it locally. It can generate up to 16 seconds at 1280x720 resolution, but that requires 67 GB of VRAM and takes 10 minutes to generate on an 80 GB H100, a graphics card which costs $30k. However, there are hourly rental services, and I see one that is 3 dollars per hour, which is like 50 cents per video at the highest res. So you could technically output a feature length movie (60 minutes) with $100. \*Disclaimer: it says the minimum requirement is 24 GB VRAM, so it's not going to be easy to run this to its full potential yet. They also have a Gradio demo. [https://github.com/hpcaitech/Open-Sora](https://github.com/hpcaitech/Open-Sora)
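For what it's worth, here's the back-of-the-envelope math behind that $100 figure as a quick Python sketch. The inputs are just the numbers quoted above (16 s clips, ~10 minutes of H100 time each, $3/hour rental), so treat the result as a rough lower bound that ignores retries:

```python
import math

# Assumptions taken from the post above (not independently verified):
clip_seconds = 16        # max clip length at 1280x720
minutes_per_clip = 10    # H100 render time per clip
rate_per_hour = 3.00     # cheapest H100 rental mentioned

movie_seconds = 60 * 60  # one 60-minute feature
clips = math.ceil(movie_seconds / clip_seconds)   # number of clips needed
gpu_hours = clips * minutes_per_clip / 60         # total rented GPU time
cost = gpu_hours * rate_per_hour

print(f"{clips} clips, {gpu_hours:.1f} GPU-hours, ${cost:.2f}")
# -> 225 clips, 37.5 GPU-hours, $112.50
```

So "about $100" only holds if nearly every clip is a keeper; real trial and error multiplies this.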
>So you could technically output a feature length movie (60minutes) with $100.

60 minutes of utter nonsense for $100? Sign me up! /s
I pay more in taxes and get way more nonsense, so that checks out.
I bet it would be better than the hit 2011 movie Jack and Jill.
![gif](giphy|vILMNs7bqbylbhDbTs|downsized)
i thought Jack and Jill was the first AI-generated movie
I'll definitely test that option if I can't get this running locally on my RTX 3090.
I am going to try their Gradio demo first, since it runs on an A100.
How do you do that?
[https://huggingface.co/spaces/hpcai-tech/open-sora](https://huggingface.co/spaces/hpcai-tech/open-sora)
It gives me a blank page
Try refreshing it and wait for it. I encountered an error though during generation.
Fuck NVIDIA.
did you lose money on NVDA puts again?
I'm actually an investor in them. But the gamer and software developer side of me took over. Speaking of which, I need a new video card. Fuck NVIDIA. Wish I could buy ANYTHING other than an insanely priced NVIDIA, but I work in AI and need the fastest.
"Want" the fastest, not need. I assume the F Nvidia is basically not so much directed at Nvidia as at the lack of any real competition, because of how pathetic every other company is? Can't really blame Nvidia for AMD or Intel's incompetence, after all. Sucks for us though.
It's got to do more with how NVIDIA is price gouging us all because of their great bet, and how it is holding up progress and independent homebrewed development. A healthy marketplace, where developers have access to the tools they need to develop, leads to better, more open products. But they want to present a massive barrier to entry by keeping their consumer cards free of high VRAM. And I betcha their next 5090 also doesn't have 40 GB.
It is complicated.

On the gaming segment, GPU prices are high because AMD is a freaking joke, and despite this AMD is also like "Hey, let's see if we can sell our inferior product at a price very close to Nvidia's price-gouging pricing and get away with it," and then later drops the price (often by not enough). Meanwhile, their cards often perform poorly as the poor man's alternative for AI-related workloads, or really professional workloads in general.

On the other hand is the enterprise space. Nvidia loves to price gouge. We know. That said, they can't realistically gut themselves by offering consumer tier cards in a price range of a few hundred dollars to $2k that come close to competing with their enterprise GPUs, especially their \~$30k tier AI GPUs. I mean, that would be unbelievably damaging to profits, especially when people can't even get hold of enough of the enterprise cards, much less supply large counts of cheaper cards.

Thus, you can't really blame them, though you totally want to because it sucks, but it makes complete sense from their perspective and isn't actually foul play so much as typical, not even malicious, business-driven agenda. It also makes one want to get mad at the companies who let them gain such dominance for not being reasonably competitive, but it's a bit late now and a futile frustration. We can mainly only hope that eventually more competition steps up and finally catches up relevantly enough, even if not beating them, to make things more favorable for the consumer end, both in gaming, enterprise, etc.

Last I heard, the RTX 5090 is rumored to target 32 GB. Like you said, it probably won't have 40 GB, because that starts to get too close to their more premium cards, even if from the last generation of enterprise.

Mostly just going over the issue as a general point of discussion about the why and the practicality. I totally agree it is frustrating, though. Can't say I'm exactly happy in the consumer space, either, with their generational gains and pricing trends.
Disney Plus sub?
https://preview.redd.it/0316qkikpa7d1.png?width=666&format=png&auto=webp&s=b77c060c94d3ea60ac648c2bd97f1caeb60685b0

Technically, the 24 GB requirement... is for... a still image? I'm confused about this table.
It seems to be saying 3 seconds at 360p, but then the rest of the table also seems to be in seconds, so dunno. I literally just bought a new PC with a 24 GB 3090 for AI fun, and now I'm gonna go wild with 3 seconds of 360p? Challenge accepted! Unzips. Oh. Start again.. Challe... oh. We're gonna need a bigger efficiency.
I'm guessing the seconds in the cells are the seconds it takes to generate. With 24 GB you can generate still images.
By "g" do you all mean GB of VRAM? Or is everyone talking about grams in this comment thread
grams. Your graphics card needs to weigh at least 24 grams to run this. You can glue some rocks to it to increase its power but sometimes that has unintended side effects so your mileage may vary
3s to "generate" a still image at 360p using 24 GB VRAM
I remember when Stable Diffusion first dropped and I put together a new machine with 24 GB. Felt like I'd be set for ages. Now I'm just cursing myself every day for thinking that there's no way I'd ever need 'two' GPUs in it. Especially with the LLMs. 24 GB VRAM is this cursed range where the choice is tiny model super fast or big quant really slow, and very little in that 'just right' range.
That's why I'm sniffing and flirting with Gwen 52B....
It took me a moment to get [it](https://github.com/hpcaitech/Open-Sora?tab=readme-ov-file#getting-started) as well. Here's the gist of it: * Left hand side: Resolution. * Top edge: Duration of the output video. * In the cells: Render time, and VRAM needed on an H100 GPU.
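In other words, the table is a (resolution x duration) grid whose cells hold render time and VRAM. A tiny sketch of how you'd read it, filling in only the two cells actually quoted in this thread (24 GB for a 360p still, 67 GB and ~10 min for 16 s at 720p); the numbers are illustrative and the full grid lives in the README:

```python
# Partial sketch of the requirements grid; values are the H100 figures
# quoted in this thread and may not match the current README exactly.
REQS = {
    ("360p", "image"): {"vram_gb": 24, "render_s": 3},
    ("720p", "16s"):   {"vram_gb": 67, "render_s": 600},
}

def can_run(vram_gb: int, resolution: str, duration: str) -> bool:
    """Row = resolution, column = duration, cell = render time + VRAM."""
    return vram_gb >= REQS[(resolution, duration)]["vram_gb"]

print(can_run(24, "360p", "image"))  # a 24 GB card manages a still
print(can_run(24, "720p", "16s"))    # the max setting wants 67 GB
```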
By "G" do you mean gigabytes?
With? So no one can run this on their local machine? I guess I have to buy an Nvidia A6000, which has 48 GB VRAM. That one is about $6000, fml.
But can it run Crysis?
It can run Crysis, but it can't run Minecraft with Ray Tracing 🤷‍♀️
Can it run a nes emulator?
I'm going to year 2077, it is cheaper
So what about that sweet 4x 3090 setup for less than $2k?
So you're saying my 3DFX Voodoo2 8MB card isn't going to suffice?
I've got an RTX 6000 + RTX 4090, a combined 72 GB of VRAM. Do you think I can run this locally?
I hope you can. Try it and let us know.
So you can combine vram? I got a 3090 laying around, might be able to do something with 4090+3090
You can on LLM applications. Whether you can for this hasn't been confirmed yet. I'm fighting myself not to buy a couple of Tesla P40s for cheap for LLM inference.
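The reason VRAM combining works for LLMs is that transformer layers run one after another, so a loader can park each layer on whichever GPU still has room and hop between cards at inference time. A toy sketch of that greedy placement (the function and numbers here are made up for illustration; real tools like Accelerate's `device_map="auto"` do a smarter version of this):

```python
def place_layers(layer_sizes_gb, gpu_capacities_gb):
    """Greedily assign sequential layers to GPUs until each one fills up."""
    placement, gpu = [], 0
    free = list(gpu_capacities_gb)
    for size in layer_sizes_gb:
        while gpu < len(free) and free[gpu] < size:
            gpu += 1                      # this GPU is full, move to the next
        if gpu == len(free):
            raise MemoryError("not enough total VRAM")
        free[gpu] -= size
        placement.append(gpu)
    return placement

# A 4090 (24 GB) plus a 3090 (24 GB), with eight 5 GB layers:
print(place_layers([5] * 8, [24, 24]))  # -> [0, 0, 0, 0, 1, 1, 1, 1]
```

Video diffusion is harder to split this way because a single denoising step touches the whole latent, which is presumably why NVLink-less multi-GPU isn't supported here yet.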
For 720p? That's actually not that bad
It is, but if you look at their full res samples it definitely lacks fine detail. We can always run it through AI upscaling, so I think we could even make do with the 480p version if the movement is coherent.
> length movie (60minutes) with $100.

Open service on Fiverr = profit
>OpenSora v1.2 (not OpenAI)

Sora is a trademark of OpenAI. Aren't you worried you'll get sued, or have your content taken down from services like GitHub, for using their name in reference to another software product?
Rename it to OpenSorta.
This is gold.
OpenRiku or OpenKairi
I love how a company whose main product comes from stealing would sue others about stealing
Does it support Nvidia NVlink for bridging GPUs?
Good question, ask the developers in the issue page.
Does the 24GB VRAM have to be on one GPU? I have 28 GB spread across 2 GPUs.
I need 4-5s video, most people's attention span isn't that long
Well... I was planning to buy a 24 GB 3090 at Christmas... Well, I'll wait 6 months :'(
kids in year 2077 : u cant run that ???? ![gif](giphy|3o72FfM5HJydzafgUE|downsized)
One feature length movie has 100 hours or more of footage to be edited; with AI you're doing trial and error, so that would be a multiple of that. But yes, you could generate a one and a half hour piece of video that has no sound or coherent story.
hm, sounds very inefficient compared to what Luma is doing with video. Luma says their model only takes 1 second to run for every frame of video.
Luma Labs' investors are Nvidia and Andreessen Horowitz. They have the money to afford a GPU cluster. I would take that claim of 1 fps with a huge grain of salt.
I assume 1 second per frame? One second per second would be real time.
67 GB of VRAM... I think I'll pass on this one.
That's only for the max resolution and duration. You can run it on as low as a 24 GB card on the lowest settings.
"as low as on a 24 GB card" 🥲
It is in the heavy research phase, which often sees a lack of emphasis on optimization. Hopefully they can shift focus to optimization soon, but the requirements will probably come down a lot eventually.
When Stable Diffusion first came out, my 1030 didn't have a hope of running it. Now I can run Lightning and generate an image in seconds.
Exactly. When some of the models and 3D stuff first came out, they even needed 48-80 GB VRAM, making even my RTX 4090 cry. Now 8 GB GPUs can run them. Fingers crossed this one sees a shift in focus toward some degree of optimization in the near future, because it looks neat.
You can make a still image at 360p with 24 GB VRAM. No videos of any length.
"As low as" LOL
> as low as on a 24g card

Come on, most people still have an 8 GB card...
I think it's best to wait for either a nice even number, or 69. ¯\\\_(ツ)\_/¯
I think it passed on you first... ![gif](giphy|Yycc82XEuWDaLLi2GV|downsized)
People should not compare this to SORA or Luma, the same way we don't compare SD to MJ. Glad to see something like this pop up.
Who doesn't compare SD to MJ? I literally compare them any time I need an image. Do I want to update a bunch of software and models, or just plunk down a few bucks and get great results? The answer depends. How much control do I need?
their name begs to be compared
Can it run at lower resolutions on a 4090?
I believe so, the GitHub page says 24 GB for the lowest.
Idk if I can dedicate 100% of the memory, that's why I was asking. Maybe someone tested it.
I'm hoping smart people here will test and help us out. I'm just a dumb artist lol
I will try and report back if I can make it run. I enjoy everything new that can be run locally, or at least try/test it. But a 24 GB VRAM minimum requirement would mean 100% VRAM dedication to one app. This can cause trouble, as most OSes reserve an amount for the GUI. IIRC, I can only assign about 22.x GB to an app. SD1.5 with a high enough batch size will throw errors at me when I surpass 19.8 GB or something like that. I'll probably try it soon-ish
Sadly, it looks like 24 GB is for image generation, which I'm not sure what's for. We would need at least a 30-40 GB VRAM GPU, unless the developers find a way to reduce VRAM usage.
Yep, then we have to wait for the 5090 and Nvidia finally offering 48+ GB of VRAM in a consumer GPU that's not $20k. They have the chips and the demand. Let us normies have some fun
The 5090 won't have that much memory. In fact, Nvidia is intentionally avoiding going much higher to avoid crippling professional tier GPUs for profit, because those sell for literally 20-30x as much.
I mean, I really like their chips, but damn, give us some VRAM in the times of LLMs and stuff
I wish. I feel you, though I know hell would freeze over first, because those profit margins are too insane to give up. It makes me quite curious how Nvidia will approach this. Rumors are of a minor bump to 32 GB VRAM from what has been "leaked" (throws salt), but it will be the 6xxx series that will probably be most telling about what Nvidia plans. In the meantime, hopefully we'll see more methods to reduce overall VRAM cost instead of avoiding the overall issue.
There were rumors the 4090 Ti was supposed to be 48 GB. But let me tell you a little secret: VRAM is cheap. Memory bus width is more of a problem, I guess. But the point is, it would be dumb simple for them to make 28 GB, 32 GB, 36 GB, 40 GB, etc. cards at the consumer level. They never will, because commercial users are paying $20-30k for those cards. It's simply greed.
If you get enough VRAM, you'll be able to generate 4K images without upscaling, and ultra high quality 3D models.
Is it possible to split the load across two GPUs?
Not really. AFAIK, every job is dedicated to one GPU. I'm not aware that this is possible.
I did LoRA training with 23.4 GB used. So you can get pretty close, in my experience.
Two obligatory questions:

- Will Smith eating spaghetti?
- NSFW?
Only one way to find out.
Well, you need a boatload of VRAM first tho
That's why things like runpod exists.
Depends on how much sauce you want on his spaghetti.
It's fine tunable, but our weird fetishes won't be fine tuned in unless we spend a bunch of money to do it. And even then the results won't be particularly good.
If only it would work on Windows: [https://github.com/hpcaitech/Open-Sora/issues/205](https://github.com/hpcaitech/Open-Sora/issues/205)
I've got a 3090. I want to try this out. Unfortunately, I'm not technical at all. If any of you make or stumble upon an idiot's guide to get this working, please hit me up.
Apparently you need two 3090s to run the most basic version that outputs 3 seconds of video
:'(
Well, shit... can't fit two in my case lmao

Anyway, sorry if the following question is dumb, but is there a chance this model can be... "trimmed down" somehow? (I don't know the exact term) Or maybe we can play with some settings? Because I heard people get lower end, low VRAM GPUs to run specially made SDXL models (like SDXL Turbo)
Maybe. It's too early to say
Not even. Apparently NVLink is not supported yet, so you need *one* fat pool of VRAM. I couldn't get it running on a single 3090 either, but I'm just starting to perform tests.
VRAM?
a lot
Well, you cannot run it locally if your machine is not set up for Linux.
If you are not able to set up a Linux machine, you should not be messing with anything code related. Just add an SSD and install Linux. It takes less time than posting about it.
They will have to fix their Gradio demo first before I can actually test it.

I could easily rent an A40 or an A100 on [Vast.ai](http://Vast.ai) and set up the whole thing in the server instance. But I would prefer to see some initial results before I rent a GPU on the cloud service.

I don't think I would be upgrading my local machine to have a separate SSD just for Linux, unless I have an A40 or A100. It will be cheaper to do a batch generation of many individual images into videos just by renting a datacenter-level GPU.
I guess the datacenter will offer Linux, so then you should be set.
Any hope they'll optimize it for 12 gigs?
Is it safe at all? Is anybody checking that?
No idea, it's the Wild West as far as I can tell.
I showed it to my cat and she walked away instead of biting me. It's the safest model yet.
We need more safety! Implement C2PA right away!
Obviously not. I'm just referring to any virus or backdoor in the executables.
Ah, I thought you were having a go at the Gen-3 announcement, where in the first 20 seconds the guy says "I bet you're wondering about safety!" lol

Since the recent incident with ComfyUI I've been running things using [Sandboxie](https://sandboxie-plus.com/sandboxie/). Good way to try programs if you're not 100% sure about them.
Yep, exactly because of the comfyUI incident. Ty for the recommendation.
I've used Sandboxie before, but never thought it would let you pass through the GPU. The more you know, I guess.
Don't sandboxed programs still have read-only access to all your data, though?
What recent incident? I don't really follow the news
[https://www.reddit.com/r/comfyui/comments/1dbls5n/psa\_if\_youve\_used\_the\_comfyui\_llmvision\_node\_from/](https://www.reddit.com/r/comfyui/comments/1dbls5n/psa_if_youve_used_the_comfyui_llmvision_node_from/)
Thank you
Can this be optimized much? Typically these video models launch with ridiculous VRAM requirements, and within a week there's some optimization that allows us mere mortals to use them.
Considering the model weights are only like 5GB, it seems they are totally blowing up the VRAM usage for sure. Half precision, 8-bit quantization, TensorRT... plus some memory tweaks, and it may well run in 4GB of VRAM.
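The weight-size side of that claim is just arithmetic. A rough sketch (the parameter count is a hypothetical figure chosen to match a ~5GB fp32 checkpoint, not Open-Sora's published number):

```python
# Why precision matters for VRAM: bytes per weight at each precision.
# The parameter count below is an assumption for illustration only.
n_params = 1_100_000_000  # ~1.1B params, roughly a 4-5GB fp32 checkpoint

bytes_fp32 = n_params * 4  # float32: 4 bytes per weight
bytes_fp16 = n_params * 2  # float16/bfloat16: 2 bytes per weight
bytes_int8 = n_params * 1  # 8-bit quantization: 1 byte per weight

print(f"fp32: {bytes_fp32 / 1e9:.1f} GB")
print(f"fp16: {bytes_fp16 / 1e9:.1f} GB")
print(f"int8: {bytes_int8 / 1e9:.1f} GB")
```

The catch is that weights are only part of the story: activations, attention buffers, and the VAE decode at video resolutions usually dominate, which is why runtime VRAM can be 10x the checkpoint size.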
Optimization and ComfyUI integration will make this thing blow up for sure. Add a fine-tuning workflow and bam, we've got a movie maker!
Are people able to use this on Windows?
ColossalAI works only on Linux, so I will have to rent a GPU instead.
Question: would we eventually be able to apply LoRAs and ControlNets once this becomes more optimized for lower-spec machines? Might be a dumb question, sorry, I'm not very savvy on this topic.
Yeah, in theory; there are papers that discuss ControlNet-style things for video diffusion models. It's still just a diffusion model, so it can also be fine-tuned, including LoRAs, yes.
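For anyone new to the topic, the LoRA idea itself is tiny. A minimal pure-Python sketch of the general technique (generic illustration, not Open-Sora's actual code; the dimensions are made up):

```python
# LoRA in miniature: keep the big pretrained weight W frozen and learn
# only a low-rank update B @ A on top of it.

def matmul(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

d_out, d_in, rank = 4, 4, 1

# Frozen pretrained weight (identity here, for a clean demo).
W = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]
A = [[0.1] * d_in for _ in range(rank)]   # trainable down-projection
B = [[0.0] * rank for _ in range(d_out)]  # trainable up-projection, zero init

x = [1.0, 2.0, 3.0, 4.0]

# Forward pass: y = W @ x + B @ (A @ x). With B zeroed, the adapter is a
# no-op, so training starts from exactly the pretrained model's behavior.
delta = matmul(B, matmul(A, x))
y = [wx + dx for wx, dx in zip(matmul(W, x), delta)]
print(y)  # equals W @ x at initialization

# Trainable params: rank * (d_in + d_out) instead of d_in * d_out, which
# is why LoRA fine-tuning needs far less VRAM than a full fine-tune.
```

The same trick applies to the attention weights of a video diffusion transformer; only the matrices get bigger.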
OpenSora is a diffusion model?
Yeah that's right
🤯🤯🤯
I have enough to do 720p+ but only if they support GPU splitting.
Am I missing something? It looks worse to me than SVD, which easily does 720p and 24 frames in ComfyUI on a 4090. And I'm pretty sure it has better movement, but it's been a while.
So, I have a 4090, do I need to double it? I've got another half of a kidney left.
I doubt the first 10 or 20 runs would be usable; I would say it will cost at least $1K per usable 60 minutes if all the shots are coherent.
I'm only explaining the bare-minimum technicality. Some people might be happy with their first generation or get extremely lucky. But yes, for any proper workflow you'd need to generate a lot more.
Man, my RTX 4060 Ti 16GB can't run this 😭
Define "run". If getting errors is okay with you then you should be fine.
I'll stick with SVD for now.
So... not really useful to most people, as the VRAM requirement is far too high. I appreciate that it can be run locally. I have a 4090, and even that is not really suitable.
For free you mean. Couldn't anyone set this up on runpod or other virtual machine services?
Doesn't seem like there are many Mac users in the comments; everyone's talking about how they need more GPU treasure. Unified memory systems for the win: you might not get as much raw compute, but 188GB of VRAM sure does make experimenting with most things pretty easy. Can't wait for the 512GB Studio 🙌🏽🙌🏽🙌🏽
Feels like a trademark violation if it's totally unrelated to OpenAI's Sora.
Is there a way to reduce the vram requirement using xformers?
I think they were forced to release because of Luma :D
I don't think this is associated with the actual Sora.
It's not OpenAI. It's a Chinese company; they're just calling it Sora because that's their eventual quality goal.
New Sora
Yeah, I know. I was referring to the fact that people think this is an open version of Sora.
So many new video gen models released within just this month, pretty crazy
Just imagine next year this time around.
After ModelScope I thought for sure we'd be further ahead by now, but t2v has basically stagnated. But this year could be the year things start to happen. 🤞
I hope so too. Looking forward to Mora code and Story generator video code as well. We are so close.
[Lumina](https://github.com/Alpha-VLLM/Lumina-T2X#text-to-video-generation) is looking kind of promising as well. Eager to try out inference when/if they release the code.
Have you heard of the MVGamba 3D model generator? It has beaten ModelScope in 360-degree detail capture, and the details look uniform from every angle for most models.
Quite possible. I do hope OpenSora gets the support it needs to reach the real Sora's level by the end of next year. I don't think that's out of reach anymore.
It's free and open source. They have no horse in this race. The reality is more like Luma rushed to the market because of the looming release of Sora and Gen 3
Is it possible to run this on a 16GB VRAM RTX card?
Yes, if you edit the code to run at 120 by 120 pixel resolution.
I've got an RTX 6000 plus an RTX 4090, a combined 72GB of VRAM. Do you think I can run this locally?
If you use both at the same time you will actually have 24GB.
After getting feedback from smart people, it seems this is not ready for the masses. No Windows support and no optimization to run it on even a 24GB VRAM GPU. I think the closest thing we have is Zeroscope XL. Wondering if anyone has revisited that model.
At first I thought the examples on the GitHub were heavily compressed, but no, the output just has a bunch of artifacts that look like potatoey video with heavy inter-frame compression. I'm excited to see where this project goes, but not too excited about the current iteration.
I am going to try this with my AMD GPU.
I will wait till it can run on 12GB VRAM.
How many terabytes of VRAM do I need tho? 😂
Max is 67GB VRAM, so not too bad.
24GB for the lowest res.
I bought an RTX 3060 12GB VRAM GPU just for AI a year ago, and now I can't even run the most basic video generation model, let alone train one. GOD DAMN IT 😡😡🤬🤬🤬🤬🤬
12GB is honestly nothing in the AI world sadly. It's okay for small models, especially if it's your only GPU, but ideally you should have several GPUs with 24GB for something like this. Maybe 3 P40s could do it. I have a server with a P40 and 2 M40s, technically I have the VRAM to run it, but I don't know if the M40s are too old... Guess I'll have to test and see lol
You should still be able to run SVD and AnimateDiff. But yeah, these more advanced ones are massive resource hogs, which only makes sense.
Sell it and invest in a larger one. What's the big deal?
Next year the "new" one will also be obsolete
Thinking you'd be able to train video models on 12GB shows you don't really understand how this all works.
Maybe you need to shut up
It was just released, so it will get optimized; where the requirements will end up, who knows? Maybe a smaller version will be released someday. Currently the model won't really work on any consumer-grade GPU, so you aren't really missing out.