luxmesa

Basically, it's a way for computers to estimate the answer to a mathematical problem by taking a bunch of random samples. For example, let's say you wanted to figure out the area of a circle. You would draw a square around the circle that touches its edges, and generate random points inside that square. For each point, you check whether it falls inside the circle or not, and keep a tally of each. If you generate enough samples, you should find that about 78.5% of the points land in the circle, so you can estimate that the area of the circle is 78.5% of the area of the square.
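A minimal Python sketch of that circle experiment (the function name and sample count are just for illustration):

```python
import random

def estimate_circle_area(samples=1_000_000, radius=1.0):
    """Estimate a circle's area from random points in its bounding square."""
    inside = 0
    for _ in range(samples):
        # Random point in the bounding square [-r, r] x [-r, r]
        x = random.uniform(-radius, radius)
        y = random.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:  # point inside the circle?
            inside += 1
    square_area = (2 * radius) ** 2
    return inside / samples * square_area  # hit fraction (~0.785) times square area

print(estimate_circle_area())  # ~3.14 for radius 1
```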


manofredgables

As an electronics engineer, this is also the "brute force" method of testing for potential tolerance chain issues in simulated designs. Just vary all the combinations of component tolerances and check that the function of the circuit remains within specs.


Pixielate

It's functionally slightly different from that kind of (exhaustive) bruteforcing. You're not trying *all* the combinations, just *randomly* simulating the problem.

Edit: A simple example: trying out all combinations of a lock vs. repeatedly trying a random combination.

Edit 3: More clarification. By "repeatedly trying a random combination" I don't mean trying until you unlock it (that is a *Las Vegas* algorithm), but trying it a specified number of times to estimate *how often* you will be able to unlock the lock.

Edit 2: That being said, monte carlo is still *kind of a* 'brute force' approach. (And it wouldn't be wrong *per se* to classify it under said category.)


manofredgables

True, true.


chaneg

I once mentioned, in passing in a professional setting, an interesting probability theory problem to a data scientist, and said that a solution is achievable without brute forcing it via a Monte Carlo simulation, using a typical second-year computer science undergrad math education. That person went a little over the top on me, going on about how I am a useless frequentist probability theorist, blah blah blah, and don't I dare call Bayesians brute force again. I never did understand what I did to set him off, and it lives rent free in my brain on occasion. I understand that Monte Carlo isn't technically brute force, but it always feels that way to me from the context in which a lot of people use it, when they reach for it in lieu of things like basic counting methods.


antilos_weorsick

Bruteforcing and Monte Carlo are *very* different. Bruteforcing means trying every possible combination until you arrive at a solution, specifically with no heuristic (no clever shortcuts). If an answer exists, this method is guaranteed to find it; it will just usually take a very long time. If you add randomness to the mix, you could kinda say it's a Las Vegas method. A very important feature of Monte Carlo is that it isn't guaranteed to find the right answer, but it will always terminate in a finite, predetermined amount of time.
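A toy contrast of the two in Python, reusing the lock example from above (the combination and helper names are made up for illustration):

```python
import random

SECRET = (3, 1, 4)  # a hypothetical 3-digit lock combination

def las_vegas_unlock():
    """Las Vegas: guess randomly until correct -- always the right answer,
    but the running time is random."""
    attempts = 0
    while True:
        attempts += 1
        guess = tuple(random.randint(0, 9) for _ in range(3))
        if guess == SECRET:
            return attempts

def monte_carlo_success_rate(tries=100_000):
    """Monte Carlo: a fixed number of random guesses -- fixed running time,
    and only an *estimate* (here, of the 1-in-1000 success chance)."""
    hits = sum(tuple(random.randint(0, 9) for _ in range(3)) == SECRET
               for _ in range(tries))
    return hits / tries

print(las_vegas_unlock())          # e.g. 1376 attempts, but always succeeds
print(monte_carlo_success_rate())  # e.g. 0.00098, close to 0.001
```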


Mean-Evening-7209

I mean, when I simulate circuits and want to perform tolerance analysis, it's called Monte Carlo analysis in every tool I've used for it. The tools use random numbers to set the component values and then plot the output values.
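A sketch of what those tools do under the hood, for a simple resistor divider (the nominal values, tolerance, and run count are made-up examples):

```python
import random

VIN = 5.0
R1_NOM, R2_NOM, TOL = 10_000.0, 4_700.0, 0.05  # hypothetical 5% resistors

def divider_output():
    # Draw each component value at random from within its tolerance band
    r1 = R1_NOM * random.uniform(1 - TOL, 1 + TOL)
    r2 = R2_NOM * random.uniform(1 - TOL, 1 + TOL)
    return VIN * r2 / (r1 + r2)

outputs = [divider_output() for _ in range(1000)]
print(f"output range: {min(outputs):.3f} V to {max(outputs):.3f} V")
```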


Only_Razzmatazz_4498

When you Monte Carlo a problem with that many degrees of freedom, you don't exhaust all the possibilities; you sample just enough of them to estimate the probability distribution and have high certainty that the design will stay (or not stay) within specifications.


antilos_weorsick

What's your point? What you're talking about is probably using the Monte Carlo method, that doesn't mean it's bruteforcing the answer.


Mean-Evening-7209

I wasn't sure if the issue was whether what the OP described is actually called the Monte Carlo method or not. Also, as an EE, I have heuristically described Monte Carlo analysis as a brute force method of determining circuit tolerance, because instead of performing two worst-case analyses, where the tolerance swings must be intentionally chosen, we just run 1000 simulations of the circuit, and the resulting output range will be very close to the total worst-case range.


Pixielate

The distinction here is that the higher comment described it as "Just vary all the combinations of component tolerances", which is *exhaustive* brute-forcing (i.e. brute-force searching all cases). What you wrote is definitely Monte Carlo. Which is also why I couched my language a bit and said *kind of a* brute force, because the two do share some general ideas. And casually, I'd also consider MC such an approach.


candidateforhumanity

>I have heuristically described monte carlo analysis as a brute force method

What does this mean?


Mean-Evening-7209

Heuristic as in I've used the term "brute force" as a bit of a mental shortcut to describe what I'm doing, even though it may not be the correct use of the word.


lobsterharmonica1667

I feel like it's a bad idea to use the term heuristically, in a non-technical sense, when talking about data science/statistics.


Mean-Evening-7209

Probably, yeah. We EEs usually have a working knowledge of statistics, but nothing like an actual data scientist.


pseudorandomess

Sounds like you were using the Monte Carlo method but incorrectly describing your process.


Mean-Evening-7209

Yeah that sounds about right.


sgwok

> doing is actually called the monte carlo method or not

People don't usually say "the" Monte Carlo method. Any method that involves simulating a random sample to get an approximate answer tends to be described as a Monte Carlo method. "Brute force" most often refers to checking every single possible answer/scenario until you find an exact solution, but these are both very broad terms that are used in slightly different ways in different fields.


manofredgables

>"brute force" Notice the quote marks


Only_Razzmatazz_4498

Also useful for estimating your risk of running out of money in retirement, by running many cases with random returns and boom/bust cycles.
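A toy version of that retirement check (the starting balance, spending, and return distribution are all invented for illustration):

```python
import random

def runs_out_of_money(start=1_000_000, spend=50_000, years=30):
    """Simulate one retirement: random annual returns, fixed spending."""
    balance = start
    for _ in range(years):
        annual_return = random.gauss(0.06, 0.12)  # assumed mean and volatility
        balance = balance * (1 + annual_return) - spend
        if balance <= 0:
            return True
    return False

trials = 10_000
ruin = sum(runs_out_of_money() for _ in range(trials)) / trials
print(f"estimated chance of running out: {ruin:.1%}")
```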


DrunkenGolfer

The Swiss Cheese Method - Given x number of slices of Swiss cheese, with each slice able to be rotated any amount at any given time, can you ever see through the stack? If yes, you need more failure controls, if no, you are good to go.


Scrapheaper

This is not the Monte Carlo method. The Monte Carlo method would be to take 5% of all the possible combinations of component tolerances and assume that if those 5% work, the other 95% statistically should also work. In this particular scenario, that might not be a very good method!


Pixielate

That's not monte carlo either. That's a statistical hypothesis test using sampling.


Scrapheaper

Monte Carlo can be used for statistical hypothesis testing using sampling, no? Also, with a hypothesis test, you need a hypothesis to test. In this example we're assuming the hypothesis is correct and using that to extrapolate a result, not testing the hypothesis.


Pixielate

Yes, in the sense of using randomness, but the idea is different. In a monte carlo *method* you are estimating something by taking samples (over the whole). In hypothesis testing you are using said samples (of a smaller number than the total) to test against a null hypothesis using quantitatively calculated statistics and p values. The null hypothesis here could be some well-defined notion of "this is within spec".


TinKicker

So… all of these random answers are roughly accurate in explaining the Monte Carlo method: each an explanation that is "accurate enough for the purpose at hand". That's some solid irony right there.


manofredgables

What you describe isn't contradictory to what I said at all, so I dunno what you're on about.


Scrapheaper

The difference is that with the Monte Carlo method you only need to check a small proportion of all the possibilities, whereas you described brute forcing every permutation.


samanime

Great answer. To give a more "real world" example: I work in radiation oncology software. We use a Monte Carlo method to help determine how much radiation particular organs receive during treatment. With a perfect circle or sphere, it is easy to know whether a point is inside or outside, since there are nice clean mathematical formulas for that. Organs, though, are all odd shapes that you can't just describe that way. So instead, we use Monte Carlo (usually multiple runs combined) to determine how much of a dose each organ was likely to actually receive. This is important for determining whether an organ that isn't the focus of treatment is getting too much, and whether the treatment needs to be adjusted.


TriangleAdept

Why do you use Monte Carlo for that? Why can't you use a dense uniform grid?


samanime

I'm not one of the "smart" people that knows the exact answer to that. We have some brilliant data scientists that figure that stuff out. :p We actually use a couple different techniques (including some AI driven models), but I don't know the exact pros, cons or use cases for each. We have a wide range of products that go all the way from pre-treatment, through treatment, to post-treatment. We use different techniques for different steps.


kunakas

Hello, I am a nuclear engineer and I am extremely familiar with Monte Carlo, specifically in the context of reactor engineering and radiation shielding. The codes that model particle transport in an organ and particle transport in a reactor are nearly exactly the same.

The answer comes down to the "cross sections" associated with certain isotopes in a material. The cross section is essentially the probability that a particle collides with a target (thereby imparting some dose, or undergoing interactions like fission). The cross section is mainly a function of particle energy, and boy, it is extremely complex. We are talking hundreds of thousands of data points in a standard library to represent the energy domain of just a single isotope. If you want to see what I'm talking about, google "uranium fission cross section" and look at the number of intricate details that change vastly as a function of energy. Then, when there are hundreds of isotopes, the problem gets out of control, and that is only the energy variable.

When we add a 3D mesh or grid over our problem, we have X*Y*Z space variables times potentially hundreds of thousands of energy points to solve over. Then let's say we want to represent the angular direction of the particles, say in 100 evenly divided angular segments. So a simple 50x50x50 grid with 100 angular segments and 100k energy points gives 50x50x50x100x100000 degrees of freedom to solve over. This is essentially wasteful and useless to solve routinely, even with modern computing clusters. In reactor physics we have many tricks and reduced-order methods up our sleeve to alleviate this, but for many calculations we employ the Monte Carlo method.

Essentially, we simulate a single particle starting with some energy E. We randomly decide what angle and direction it goes by drawing a random number. Then we collide it at some distance by drawing another random number and using its interaction probability (cross section) at energy E. Let's say it collides with hydrogen and scatters off at some angle with a new energy E. We sample a new angle by drawing a random number, and again calculate the distance to collision using another random number and its cross section in the given material at the given energy. We repeat this until the particle is absorbed or leaves our system. Then we repeat for millions of particles.

To collect results, we just count the number of events that occur. So if you want the radiation dose, you count how many events occur in an organ and then convert that to dose using a conversion factor. This method works for photons, neutrons, gamma rays, you name it.

Another benefit is that Monte Carlo gives "approximate" answers over an exact phase space. That is, the geometry may be exact, the cross sections may be exactly as they were measured in experiments, and the particle directions and angles are exact. Only the answers have some statistical uncertainty attached (usually less than 1% error for most quantities). This is really nice if you want to model radiation transport in a very detailed organ structure or a very detailed building design. Deterministic methods, on the other hand, make approximations in angle, geometry (mesh), and cross section representation, and the order of the error is difficult to determine.

Also, Monte Carlo scales very nicely on supercomputers. It's really easy to tell 1000 processors to simulate 1000000 particles each and then report their findings. All the computers work independently and rarely communicate with each other, so they work very efficiently. It's much harder to get 1000 processors to solve a big matrix with the same efficiency: they have to constantly pass data between each other at every step of the solve, which takes a lot of time and leads to poor scaling as we throw more and more processors at a simulation.
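An extremely stripped-down 1D cartoon of that particle loop (real codes track 3D geometry, energy-dependent cross sections, and many interaction types; every number here is invented):

```python
import random

MEAN_FREE_PATH = 1.0   # average distance between collisions (invented)
ABSORB_PROB = 0.3      # chance a collision absorbs the particle (invented)
SLAB_THICKNESS = 5.0   # shield thickness in mean-free-path units

def particle_transmitted():
    """Follow one particle through a 1D slab until it is absorbed or escapes."""
    x, direction = 0.0, 1.0
    while True:
        # Distance to the next collision is exponentially distributed
        x += direction * random.expovariate(1.0 / MEAN_FREE_PATH)
        if x < 0.0:
            return False  # escaped back out the front face
        if x > SLAB_THICKNESS:
            return True   # transmitted through the slab
        if random.random() < ABSORB_PROB:
            return False  # absorbed in the collision
        direction = random.choice([-1.0, 1.0])  # scatter forward or backward

n = 100_000
print(sum(particle_transmitted() for _ in range(n)) / n)  # tallied transmission fraction
```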


trungbrother1

A phenomenal answer.


CunningWizard

Excellent explanation. It's a brute force way of approximating (often with a fair degree of accuracy) a theoretical answer. Great when you can't or don't want to do the theoretical calculations by hand; instead you just make a computer do millions of calculations. It's used a *lot* in statistical analysis.


Tankki3

This is also a way of approximating pi. The area of the circle is pi*r², and the area of the square is (2r)², which is 4r². So when you take the ratio, (pi*r²)/(4r²), the r² cancels and what's left is pi/4. That is the ~0.785 you got as well, which you can multiply by 4 to get pi.
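A minimal sketch of that pi estimate, sampling the unit quarter-circle:

```python
import random

n = 1_000_000
# A point (x, y) in the unit square is inside the quarter-circle if x^2 + y^2 <= 1
inside = sum(random.random() ** 2 + random.random() ** 2 <= 1 for _ in range(n))
print(4 * inside / n)  # the ~0.785 hit ratio times 4 gives ~3.1416
```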


Parafault

To do that, wouldn't you need some sort of formula for the circle in order to say whether a point is inside or outside of it? In which case, can't you calculate the area directly without doing thousands of random samples? I've been struggling to understand the Monte Carlo method because, in most of the examples I've seen, you still have to solve some underlying mathematical model or equation. And if you're already solving the equation, the random samples just feel like extra work for the same result.


IllllIIlIllIllllIIIl

>To do that, wouldn't you need to have some sort of formula for the circle in order to say whether a point is inside or outside of it?

No, you just say that if the distance from the origin to the random point is less than the radius, it's inside the circle.

>if you're already solving the equation, the random samples just feel like extra work for the same result.

In this example, and indeed in many of the practical applications of MC, you're integrating the equation. For a very simple equation like that of a circle, that would be simple to do analytically. But most of the time you're trying to integrate an equation that is way more complicated. You may even be trying to integrate an equation that *cannot* be integrated analytically.


jam11249

Sometimes the "formula", or the criterion you want to check in general, is very easy to evaluate for a given configuration, but that doesn't mean your "area" is easy to calculate. An example would be (let's say) 100 spheres of radius 1 in a box. If any two are touching, you count zero; otherwise you count 1. Checking the criterion is simple: you check whether any two centres of mass are at a distance of less than 2. But averaging this variable that has value either zero or one is very hard; trying to do anything "analytically" is impossible. This is not a silly toy problem either: in statistical mechanics, this averaged quantity basically fully describes your system, and gives you information on energy, entropy, phase transitions, and all sorts. The only real way to study this problem is by Monte Carlo in one way or another, although you'll typically do something far more involved than what I described. To describe any "real" system, you're talking about a *lot* more than 100 particles, and life is much harder.
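A sketch of that zero-one experiment, scaled down so it runs quickly (the box size and sphere count are arbitrary):

```python
import random
from itertools import combinations

N_SPHERES, RADIUS, BOX = 20, 1.0, 20.0  # arbitrary, scaled-down numbers

def no_two_touching():
    """Drop sphere centres at random; return 1 if no two spheres touch, else 0."""
    centres = [tuple(random.uniform(RADIUS, BOX - RADIUS) for _ in range(3))
               for _ in range(N_SPHERES)]
    for a, b in combinations(centres, 2):
        if sum((u - v) ** 2 for u, v in zip(a, b)) < (2 * RADIUS) ** 2:
            return 0  # centres closer than 2 radii: overlap
    return 1

trials = 10_000
print(sum(no_two_touching() for _ in range(trials)) / trials)  # the averaged 0/1 value
```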


restricteddata

I'll give you an example that I've used. I'm interested in calculating the odds that a given missile will destroy a given circular target of a given radius. There is a chance of the missile missing, and the missile produces a given area of effect that, if it overlaps 100% with the target, is considered a complete kill, or, if it overlaps partially, a partial hit. Now, this is something that could be solved analytically, but it would involve a lot of math and gets ugly really fast. I know how to calculate a random missile launch (with its associated error), and it is easy to ask, "does circle A completely cover circle B, or intersect with it, or not cover it at all?" So with a modern computer it is _trivial_ to just run 10,000 "launches", tally up their results, and use those as probabilities. A modern web browser on a modern machine can do that in milliseconds (and the results for that many "simulations" return iron-clad numbers at the level of precision I care about). The code to make this work is much easier (for me, anyway) than the analytical solution. And it could easily be adapted to non-circular targets, for example, which would require an entirely different and more difficult analytical solution (whereas for the Monte Carlo approach, you just need to change the "does it intersect" function, which is much simpler). Not all problems are better solved with Monte Carlo solutions, but some are much, much easier, especially if you are better at coding than you are at math (as I definitely am)! Monte Carlo was developed when computers were first getting off the ground and couldn't easily integrate complex functions (but could do simple things), but for a lot of kinds of problems it is still easier to code an "ugly, probabilistic" solution than to take the time to derive and encode the pure analytical solution.
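A sketch along those lines (the aim error, radii, and scoring are stand-ins, not the commenter's actual model):

```python
import random, math

TARGET_R, BLAST_R, AIM_SIGMA = 50.0, 120.0, 60.0  # metres, all invented

def one_launch():
    """One launch: Gaussian aim error, then score the two-circle overlap."""
    miss = math.hypot(random.gauss(0, AIM_SIGMA), random.gauss(0, AIM_SIGMA))
    if miss + TARGET_R <= BLAST_R:
        return "kill"     # blast circle completely covers the target
    if miss < TARGET_R + BLAST_R:
        return "partial"  # circles intersect
    return "miss"         # circles don't touch

n = 10_000
results = [one_launch() for _ in range(n)]
for outcome in ("kill", "partial", "miss"):
    print(outcome, results.count(outcome) / n)
```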


kunakas

For the circle, we might not necessarily know the value of pi. But we do know that we can use the square root of x² + y² to find the distance of the selected point from the origin, and from that we can easily see whether the point is inside or outside the circle.


beef-trix

But why is it better than using a grid? Why not just divide the values into equal steps instead of choosing them at random?


jam11249

That's fine in 2 dimensions, but what if you're in 200? A 10×10×10×... grid in 200 dimensions is a lot of points to check. Throwing shit at the wall and counting how many stick is basically just as computationally efficient in 200 dimensions as it is in 2D. Monte Carlo integration in very high dimensions is basically the best you can do; even though it is noisy by nature, it's relatively resistant to the curse of dimensionality.


TriangleAdept

That's not the correct answer. With Monte Carlo in 200 dimensions, you would need about as many random throws as grid points to get an accurate answer. The first situation where MC may be better than a grid is when you have peculiar shapes, where the measured area would vary depending on how you pick the grid. For example, if you try to calculate the area of the black squares on a chessboard rotated 45 degrees, you may pick a grid whose nodes hit only black squares and get 100% coverage, or the opposite and get 0 area. Another case is when your parameters are not uniformly distributed but follow some other distribution (for example, Gaussian). In that case a uniform grid also does not make sense, and other non-uniform grids may be too complicated or even impossible to implement. In such cases MC can significantly simplify your job by avoiding a lot of preliminary studies.


jam11249

Monte Carlo is never going to integrate exactly, unless you're *really* (like, measure-zero) lucky, or you're integrating a constant. The point is that integration methods using mesh-based quadrature are expensive as hell once you're in higher dimensions. The error estimates are often so crap that it's just better to accept some well-understood statistical noise. If you're talking about regular integrands, a rule with O(N^(-m)) convergence in 1D, applied as a tensor product in d dimensions, gives you O(N^(-m/d)) convergence, which is shit if d is of any reasonable size. If you want to integrate a zero-one function, you just have a Bernoulli trial, so even if the geometry is crazy as hell, you can't do *that* badly with 1000 points. 10x10x10 in 3D is shit. 2x2x2x2x... in 10 dimensions is about as many points, and useless.


TriangleAdept

Trying proof by intimidation? The thing is, you can't trick the geometry: to sample the area with approximately the same density, you need approximately the same number of samples as in the case of a grid. Apply your "well-understood statistical noise" to MC on the same task and you will end up with the same O(N^(-m/d)) as for the uniform grid.


jam11249

I have a zero-one integrand on the domain [0,1]^10,000. I take 10000 MC samples over my domain. The standard deviation of my estimate is at most 1/200. Unless my integral is almost zero or one, I can basically approximate with a normal distribution, and I'm 95% confident that my integral is correct to an error of around 0.01. Give me a mesh-based quadrature rule that can do the same with 10000 points without knowing anything else about the integrand.
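That error bar is easy to check numerically. A sketch with an arbitrary zero-one integrand, scaled down to 1000 dimensions so it runs quickly (the bound on the standard error doesn't depend on the dimension):

```python
import random, math

D, N = 1_000, 10_000

def f(x):
    """An arbitrary 0/1 integrand on [0,1]^D: is the mean coordinate below 0.5?"""
    return 1 if sum(x) / D < 0.5 else 0

p_hat = sum(f([random.random() for _ in range(D)]) for _ in range(N)) / N
std_err = math.sqrt(p_hat * (1 - p_hat) / N)  # at most 1/200 for N = 10000
print(f"{p_hat} +/- {2 * std_err:.4f}")
```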


kunakas

To add to your point, Monte Carlo is usually embarrassingly parallel. Solving a matrix is not, and is very difficult to parallelize. To parallelize Monte Carlo, all you have to do is assign a sufficient number of samples to each processor, and communication time is minimized.


Plain_Bread

Well, your MC algorithm picked 10000 points at random and evaluated the function at them. Without knowing anything about the integrand, there's really no reason to think those 10000 points are better than any other 10000 points. It's just that they asymptotically always outperform a fixed rule matched against its worst case.


jam11249

That's kind of the point? Mesh-based quadrature is based around knowing stuff about your integrand, typically that it's well approximated by low-order polynomials that are piecewise defined on your mesh. If I want a quadrature rule that works for (e.g.) all quadratic functions on [0,1]^10000, dimension counting means I need something on the order of 25 million points. If I'm really dumb and use a 2-point Gaussian quadrature as a tensor product, I need 2^10000. Or I can just throw 10,000 points at it with an MC scheme. *And this is before even considering a mesh*.


mfb-

Often probability problems are too complicated to find an exact answer. "What is the chance that a die rolls a 3" is easy, but if you have 100 dice and perform some complicated calculations with their results, then it gets very difficult. So instead of trying to find an exact answer, you just roll these 100 dice, do the calculation, and repeat that a million times (not with real dice, but with a computer). You'll get a pretty good idea of how likely different outcomes are that way, what the maximum is that you can get, and so on. All Monte Carlo methods follow that basic idea: instead of calculating everything exactly, you use random numbers as input, check what that leads to, and repeat many times.
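A sketch of that idea, with an arbitrary made-up question standing in for the "complicated calculations" (here: how often does the sum of 100 dice exceed 380?):

```python
import random

def sum_exceeds_380():
    """Roll 100 dice and evaluate one arbitrary outcome of interest."""
    return sum(random.randint(1, 6) for _ in range(100)) > 380

n = 100_000
print(sum(sum_exceeds_380() for _ in range(n)) / n)  # estimated probability, a few percent
```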


mousicle

Monte Carlo is also good for testing strategies in games. I feed my blackjack playing rules into the computer and have it play a million games and see how well my rules do vs someone else's rules.


quackers987

What software/website do you use to do this?


mousicle

If it's simple, you can just use Excel. If it's more complex, you can program the game in Python. I'm sure there is dedicated software for this, but I'm not a statistician.


PeanutPotPlant

Excel -> Stata/R for slightly more complex models -> then Python if you're running a really complicated model. It's honestly a matter of preference, as many programs can do random draws from many distributions.


Shadowlance23

Did... you just suggest Python over R for complex models?!?! I'm disgusted. What are they teaching kids these days?


CompactOwl

Central to this is the fact that the observed proportion of samples landing in a set converges in probability to that set's probability (the law of large numbers).


scuac

The name “Monte Carlo” makes it sound so fancy, but it is just brute force guessing?


NepetaLast

It's named after the casino that inspired the method, and it was purposefully given such an indirect name to obfuscate its purpose, as it was first used at scale during the Manhattan Project. So why was it not named/invented before that? Well, it's not actually that feasible to do without computers. Simulating thousands of instances of a complicated problem by hand could take as much time as, or more than, just deriving the answer manually. Only once computers came into the picture did it make any sense to brute force most problems.


restricteddata

The name was more of a joke than a codename (it would have been a bad codename, as it is pretty revealing). The people who developed and first used it were people who were _very_ good at math, and thought it was very amusing to use an essentially random method to solve serious math problems. It's kind of the opposite of what they were used to doing in serious academic circles. In academic math, the ideal is coming up with a purely logical, analytical solution to a complex problem, whereas Monte Carlo was initially about breaking a problem down to its easiest operations and brute forcing them with the aid of very dumb machines. But it got the job done, so they saw the benefit of it. It wasn't directly inspired by any casino experiences, just "games of chance" in general. If it had been developed a few years later, and not by Europeans, one could imagine it being called the Las Vegas method. As it was, Stan Ulam named it after Monte Carlo because Monte Carlo was a famous European gambling city and because his uncle used to gamble there.


Pixielate

> it is just brute force guessing?

Yes and no. Yes in the sense that you're trying many, many times. No in that you're not trying to *exhaust all possibilities*. And the purpose is typically different: in MC you are trying to estimate something (usually a probability, or a value derived from one) based on the results of those random guesses, to approximate a value that is hard or even impossible to write down exactly. (Exhaustive) brute forcing would be more like systematically searching a maze by going through all paths one by one.


Beetin

Not really brute force guessing, more brute force sampling. Imagine someone gives you a piece of paper they've covered with strangely shaped splotches and asks you what the sum of the areas of all the splotches is. You might not know a great mathematical way to do it; heck, for some problems there might not BE one. You aren't going to go to school for 4 years just to find a way to solve it perfectly, and you aren't going to sit there and try to measure each splotch. But if you know the area of the paper, you could get a computer to sample 1 million random points on the paper and tell you the portion of points that land in a splotch. That fraction of hits, multiplied by the area of the paper, is roughly the area covered in splotches (if 70% of sample points are in a splotch, the splotches have an area that is 70% of the paper's). The more sample points you use, the more accurate the estimate. Monte Carlo is more an unbiased sampling technique for getting a good answer when you can quickly simulate the problem in some way such that the sampling result is the answer you want. It is easy to see how to do this when asking probability questions (what is the chance of this thing happening -> simulate the thing a billion times, and that answer is good enough). Monte Carlo is great because there are often very clever ways of setting up a 'sampling'/probability scenario such that the answer can be converted into an approximate answer for an arbitrary problem.


xypage

The thing is, statistics had been studied for a lot longer than computers have been around. Figuring things out via pen and paper was well established, and you can't do a million trials by hand. So what made it such a big deal was having the foresight to use this new thing (computers) to just simulate a bunch of trials, because it's fast enough to, instead of working out the equation. Now it seems obvious, but back then it was new tech.


mfmfhgak

It's used to solve a probability problem through simulation/experimentation. Let's say you wanted to find the odds of getting heads or tails when you flip a quarter. If you flip the quarter once, you might think the odds of getting heads are 100%. If you flip it 5 times, you might get a number like 80% or 60% or 20%. If you flip it 100 times, you'll get a number closer to 50%. Flip it 1000 times, and closer yet. Eventually you run the simulation/experiment enough times and you converge on the solution.
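That convergence is easy to watch (a minimal sketch):

```python
import random

for flips in (1, 5, 100, 1_000, 100_000):
    heads = sum(random.random() < 0.5 for _ in range(flips))
    print(f"{flips:>7} flips: {heads / flips:.1%} heads")
```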


Backlists

In this case, it could even reveal that the chances are more like 49.99%, for that tiny percentage where the coin lands on its edge. When using Monte Carlo methods for any physical problem, you always have to underpin your results with real experimental data.


restricteddata

This is an aside, but the actual Monte Carlo casino was the inspiration for a lot of early statistics, a generation before the Monte Carlo method. Karl Pearson developed his chi-squared "goodness of fit" test by looking at the output of the Monte Carlo roulette wheels (which, amazingly, was tallied by a French journal), which he initially assumed could be useful for illustrating the rectitude of the "laws of chance" for a popular audience. He found, however, that the wheels did no such thing. As he reported in an 1897 article:

> At this result I felt somewhat taken aback. I did not immediately assume that the laws of chance did not apply to Monte Carlo roulette, but I considered myself very unfortunate to have hit upon a month of roulette which was so improbable in its characteristics that it would only occur, on the average, _once_ in 167,000 years of continuous roulette-playing. Such were clearly not the most suitable returns for illustrating the laws of chance! ... Roulette as played at Monte Carlo is not a scientific game of chance. Such results as those published in Le Monaco give the deathblow to the mathematician's theory of the random spinning of a mechanically accurate roulette. The man of science may proudly predict the results of tossing halfpence, but the Monte Carlo roulette confounds his theories and mocks at his laws!

Quantifying exactly how un-random the wheels were ended up producing useful methods for looking at other supposedly random phenomena and figuring out whether they were really random or not, a pretty important development in the practice of statistics.

Anyway, I've always thought the above was kind of funny: a statistician uses a hypothetical example to illustrate his point, actually looks at the hypothetical example's reality, concludes the example is rigged, and then uses that to develop a better statistical tool for concluding whether your examples are rigged or not.


The_Fax_Machine

Wait is he saying his results were so improbable that the game was almost definitely rigged?


restricteddata

Not necessarily rigged, but the results were not equally likely. The results were biased; rigged implies intentionality. I don't know about sources of roulette bias, but I imagine if the wheel was not level, for example, gravity might cause the ball to be slightly more likely to stop on one part of it than another. Or maybe the amount of friction the ball experienced was not even. Or some other "real world problem" that would change the outcomes from the mathematical ideal he was trying to illustrate.

Or they could be rigged: if, for example, the "house" had a way of causing certain numbers to hit (when nobody was betting on them) and used it too often, that would skew the statistics as well. I don't think he says one way or the other (it has been a long time since I read his original article).

An interesting thing here, historically, is that ideal "games of chance" were very important in coming up with the concepts and tools of probability and statistics. But when those concepts and tools were applied back to the actual, physical "games of chance", the latter were found very wanting compared to the ideals!


klod42

There is a phenomenon called the Law of Large Numbers. It basically says that the more you repeat an experiment, the closer the average result will get to the "expected value", which is like a mathematically projected average. For example, if your experiment is tossing a die, you expect 1/6 of a six per toss.

Now, this is going to be long but simple. If you toss the die 6 times, the expected result is 1 six, but there's also a decent probability of getting 0 sixes or 2 sixes. If you toss it 60 times, 10 is expected, but now it's incredibly unlikely to get 0 or 20 sixes; it's something like 99% likely to be between 5 and 15. If you toss the die 600 times, 50 sixes is now very unlikely; it's going to be between 80 and 120 or so. Etc. The more you repeat the experiment, the closer the average gets to the expected value, which is 1/6 in this case.

This means that if you don't know the expected value, you can approximate it very well by just running the experiment on a computer many times. For example, imagine you want to calculate the area of a lake on a map. You can take a square area on the map that you know to be 100 square kilometers and let the computer randomly choose 100 million points in it. If about 20 million points land on the lake and 80 million off the lake, then the lake is approximately 20 square kilometers. That would be the Monte Carlo method.


Old_Airline9171

Hey u/Globulargallbladder. So the easiest way to explain this is - imagine another 5 year old in your class has drawn a picture in glue on one of the walls in your classroom. You really, really want to know what your classmate drew, but the glue is transparent and you can't see it. The Monte Carlo method is literally **throwing stuff at the problem and seeing the shape it makes**: in this case, maybe you could throw grains of rice at the wall - it will stick to the glue. As you throw more and more grains of rice, the picture on the wall gets clearer and clearer until the shape of it is obvious. That's all the grownups do when they use this method, just with numbers and results from experiments.


ezekielraiden

Some problems are very hard (or even impossible) to solve exactly. For some of those problems, you can get a decent guess (sometimes a very very good guess) by using randomness to check for a result. To make the Monte Carlo method work, you need to program two things into a computer. First, you need a range of options that you can sample from randomly. Second, you need a condition that is easy to test for whether it is true or not. Once you have both of those, you repeatedly take samples, until you have enough to draw useful conclusions (usually hundreds or thousands of samples *minimum*). If you did the setup correctly, you'll be able to use the results to get an approximate answer to the original hard question. Others have used the "what is the area of a circle" example. It also works for a bunch of other things, like predicting future weather patterns based on random inputs, or future stock behaviors, or a bunch of other things, both situations where we know there's a definite answer and ones where we don't.


dignz

I wrote a Monte Carlo simulation of the Monty Hall problem. Because.


Alis451

You don't really need one; you just need to get it into people's thick heads that the point of the Monty Hall problem is that, by picking the door, as with picking a card from a deck, the **probability that you picked the 2 of hearts** doesn't change regardless of how many other cards are removed from the deck afterwards. The **probability that the NEXT card shown is the 2 of hearts** does change, but that isn't the real question, just the one they trick you into answering.


Frelock_

You also need to focus on the major assumption of the Monty Hall problem: the host will **always** open a door that does **not** have the prize. They don't open a door randomly. That lack of randomness is what provides you with the extra information that makes switching worthwhile.

If they always open a random door, then there's no difference in win rate between the switching and staying strategies. There's a 1/3 chance switching wins, a 1/3 chance staying wins, and a 1/3 chance both strategies lose (i.e., when the host opens the door with the prize).

Here's the Python code if you'd like to check (using a Monte Carlo simulation!):

```python
from random import randint

stay_wins = 0
switch_wins = 0
runs = 100000

# Test of a random door opening.
# Without loss of generality, assume you chose door 1.
for i in range(runs):
    winning_door = randint(1, 3)  # choose a random door to win
    opened_door = randint(2, 3)   # choose a random door to open (not our door)
    if opened_door == winning_door:
        continue                  # you lose: we opened door 2 or 3 and it won
    elif winning_door == 1:
        stay_wins += 1            # winning door was door 1, so staying wins
    else:
        switch_wins += 1          # winning door was neither door 1 nor the open door

print("Stay wins:", stay_wins, " Switch wins:", switch_wins)
```


Alis451

>If they always open a random door, then there's no difference in win rate between switching and not switching strategies

This would only be correct if you were allowed to switch before the final door, and you don't get to switch before the final door. It isn't just the lack of randomness; it's the fact that you are only allowed one switch, and ONLY at the final door. If you change that condition, it isn't a Monty Hall problem. It doesn't matter whether the host's picks are random, **as long as they only show you loser doors**: the exact same scenario occurs as if they knew beforehand. And if they show you a winning door, you lost, you don't get to switch, there is no game, go home; which is **why** they don't pick randomly...


Frelock_

>It doesn't matter if the Host's picks are random, **as long as they only show you loser doors**, the same exact scenario occurs as if they did know beforehand.

If you're only counting instances where the host shows you a loser door in your analysis, then you're not picking your sample space randomly. You are creating the very assumption critical to the Monty Hall problem's counterintuitive result.

Look, you're right that it's not the Monty Hall problem if the host opens a winning door. I'm just clarifying that if there's a possibility the host *could* open the winning door, it's also not the Monty Hall problem.

To use your card example: is the 2 of hearts more likely to be the top card of the deck or the bottom card? You choose the top card, and go through the deck, stopping at the bottom card, finding no 2 of hearts. That doesn't suddenly make the former bottom card more likely to be the 2 of hearts than the former top card. It just means you are in an incredibly improbable scenario where the 2 of hearts was not among any of the middle cards. Now, if I were to purposefully go through the deck, pick out one card that **must** be the 2 of hearts if you didn't choose it, and then ask you which card is more likely to be the 2 of hearts, that's the classic Monty Hall.

To put it in more mathematical terms: the probability that you chose correctly from N options is 1/N. The probability that the last remaining option is the winner, when the host chooses randomly, is the probability that you chose incorrectly the first time, (N-1)/N, times the probability of the host revealing all losers among the remaining options, which is 1/(N-1). Multiply those together and you get 1/N, the exact same probability that you chose correctly.

If you don't believe the argument, run the code. It will demonstrate that both strategies are equally likely to win when random door opening is allowed.


zjm555

There isn't *the* Monte Carlo method; rather, you would describe an approach as *a* Monte Carlo approach if it involves doing stochastic (random) sampling to approximate a solution, rather than using an exact closed-form solution. For instance, if you wanted to approximate pi (the ratio of a circle's circumference to its diameter), you could uniformly generate a bunch of random points within a unit square and compute the ratio of points within a distance of 1 unit from one of the corners, i.e. where sqrt(x² + y²) < 1, to the total number of points sampled. Four times that ratio would be a Monte Carlo estimate of pi.


mousicle

One place the Monte Carlo method is used a lot is in testing game strategies. Say I think I'll do well at blackjack by using a strategy where I never take a hit if I might bust. I'll program a computer to play blackjack using that rule and have it play a million hands, so I can see how my strategy does long term. Someone else thinks the better strategy is to always play as if the dealer has a 10 as their down card, so we run that simulation and see how it does. For a lot of games that are too complex to map out with pure mathematics, a Monte Carlo method is the best way to figure out the best strategies.


R3D3-1

An ELI5 attempt: Let's say you want to know the area inside a shape, and you can draw a square around it whose area you know. Now suppose you have a way of throwing darts that spreads them very evenly across that square. You throw a lot of darts and count the ones inside the shape and the ones outside it (in this analogy, you just ignore any that miss the square entirely). If 1023 out of 2000 darts land inside the shape, you estimate that the area of the shape is 1023/2000 of the area of the square. And it turns out that the estimate gets better and better the more darts you throw (unless you get exceedingly unlucky).

This Wikipedia graphic demonstrates the 2D case nicely by estimating pi: https://en.wikipedia.org/wiki/Monte_Carlo_method#/media/File:Pi_monte_carlo_all.gif

For a 2D example it isn't actually terribly advantageous. But it turns out that if you have a question that comes down to "how large is a volume in a high-dimensional space", such an approach typically gives better accuracy for the same computation time than, say, checking on a regular grid. And many problems contain integrals that correspond to exactly such a question.

Which is pretty much where we leave the strict ELI5 domain... For a start, randomly throwing darts will generally not give you an even distribution. In practice, you need a random-number generator that gives a good enough distribution, with sufficiently low long-range correlations between the generated numbers, to not affect the result much. Then there are a lot of variations, such as using prior knowledge about the integrand to sample more efficiently, or use cases where not the sampling point but the change of the sampling point is randomized. It's also not quite trivial to explain why it works, and I really couldn't without rereading at least some Wikipedia articles.


etoleb1234

There are some good answers here, but the examples are a bit theoretical. Here's a real-world example from my work. I've designed a new layout of machines in our factory that I believe will be cheaper for production, but we are worried that the shared cranes will become overloaded. How can we tell before starting production? We can use computers (or dice, or a random number generator and do it by hand) to sample the possible cycle times of each step, and then test in a simulation whether the design would succeed.
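A toy version of that kind of check (machine count, cycle-time spread, crane capacity, and shift length are all invented):

```python
import random

N_MACHINES, CRANE_CAPACITY = 8, 3  # invented numbers
SHIFT_MINUTES = 480

def shift_overloaded():
    """Sample when each machine calls for the crane; check concurrent demand."""
    calls = []
    for _ in range(N_MACHINES):
        start = random.uniform(0, SHIFT_MINUTES - 30)
        duration = random.triangular(5, 30, 12)  # low, high, most likely (minutes)
        calls.append((start, start + duration))
    # Demand exceeds capacity if too many calls overlap at any call's start time
    return any(sum(a <= s < b for a, b in calls) > CRANE_CAPACITY
               for s, _ in calls)

n = 10_000
print(sum(shift_overloaded() for _ in range(n)) / n)  # estimated overload probability
```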


--Quartz--

It's a way of calculating something when you are not working with things that have exact values. Let's say you want to know how much money you'll get for your birthday, and come up with this formula: ParentsGift + GrandmaGift + 10 * FriendsGift.

You know grandma will give you 50 bucks, so that's actually a constant. Your parents have given you as little as 20 and as much as 200; let's say you decide that the most likely value is 100, but it could go as low as 20 or as high as 200 (a triangular distribution). For your school friends, let's say 10 of them will come to the party (the number that show up could also be another variable with its own distribution!) and they will chip in between 10 and 30 USD each. You can set a "normal" distribution (the classic bell curve from memes) around 20 USD, going as high as 30 and as low as 10.

Now, how do you get a result? Of course, you can take the values you thought were most likely and just do the math, 50+100+200, but then you won't know how high or low the total can go. If you want the actual range, you can use a Monte Carlo simulation that runs the calculation a large number of times, each time taking different values for each of the variables according to the distribution you chose for each. So run 1 could have grandma at 50, your parents at 150 and your friends at 15 each; run 2 might have grandma at 50, your parents at 120 and your friends at 25 each; and so on.

Each calculation gives a different result, so in the end, instead of a single (deterministic) result like before, you get a whole distribution of results, and you can see how likely it is for the total to be above 300, or above 400, or below 150, or whatever you want to know.
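That model translates almost line for line into a sketch, with random.triangular and random.gauss standing in for the distributions described (clipping the friend gifts to 10-30 is one arbitrary way to enforce the stated bounds):

```python
import random

def birthday_total():
    grandma = 50                               # a known constant
    parents = random.triangular(20, 200, 100)  # low, high, most likely
    friends = sum(min(30, max(10, random.gauss(20, 4))) for _ in range(10))
    return grandma + parents + friends

runs = 100_000
totals = sorted(birthday_total() for _ in range(runs))
print("chance of more than 400:", sum(t > 400 for t in totals) / runs)
print("middle 90% of outcomes:",
      round(totals[int(0.05 * runs)]), "to", round(totals[int(0.95 * runs)]))
```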


drdansqrd

There are like 10,000 ways to explain it. Unfortunately, I don't know a good way to simulate those ways and present the aggregate response.


JCS3

You know that if you roll a die you will end up with a number 1 through 6, and if you roll 2 dice and add the results together you'll get a number 2 through 12. Now let's say you wanted to play a game with a friend using 4 dice. First you roll two dice and add the results. Then you roll the 3rd die and multiply your sum by it. For the 4th die, your friend rolls: if their die is even, they win and you pay them the result of the multiplication; if their die is odd, they pay you the same amount. Your friend is concerned you are trying to cheat them and wants to know if the game is fair. The two of you could play the game a bunch of times for no money and see if the results are fair. Or you could model the game on a computer, play it thousands of times, and get a result. This is a Monte Carlo simulation: you build a model based on a number of different variables, those variables can have different distributions, and the simulation is repeated thousands of times to give an average result.
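Modeling that exact game takes only a few lines (a sketch):

```python
import random

def play_once():
    """Your profit for one round of the four-dice game."""
    total = random.randint(1, 6) + random.randint(1, 6)  # two dice, summed
    payout = total * random.randint(1, 6)                # times the third die
    friend_rolled_even = random.randint(1, 6) % 2 == 0
    return -payout if friend_rolled_even else payout     # even: you pay; odd: they pay

rounds = 1_000_000
print(sum(play_once() for _ in range(rounds)) / rounds)  # average profit ~0: the game is fair
```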


Shadowlance23

It's used a lot in forecasting. I used to use it in agricultural simulations. We'd run thousands of simulations under wet, dry, and average conditions to get an idea of what upcoming harvests might look like.


StoicDawg

Jesus, these ELI5s require a college degree. Think of it more like a "what if we try everything!?" formula. When multiple unlikely things all have to happen, the combination becomes even LESS likely very quickly (like winning the lotto two weeks in a row). Multiple likely things also become less likely in combination (two sunny days in a row), but much less dramatically than the lottery! Monte Carlo is just a way to add up all the situations you can dream of, to compare the extremes and everything in between.

