Binomial Simulation

DESCRIPTION

This is a simple simulation that helps you visualize the statistical significance of a single binomial variable.

Let's say you have n trials and you observed k 'successes'. You often want to know the success rate and the bounds around that success rate. As an example, let's say we toss a coin 444 times and see 386 heads. You may want to know whether the estimated bias of 386 / 444 ~ 0.87 is a good estimate or whether you need more tosses. Relatedly, if the coin's bias were indeed 0.87, you may want to know the band of likely outcomes around 0.87: could it run from 0.6 to 0.95, or only from 0.84 to 0.9?
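For the coin example, the analytical bounds can be sketched with the usual normal approximation to the binomial. This is only an illustration in Python, not part of the simulation tool; the names n, k, p_hat, and std_error are chosen here for the example, and 1.96 is the standard multiplier for a 95% interval.

```python
import math

n = 444   # number of tosses
k = 386   # number of heads observed

p_hat = k / n                                   # estimated success rate, about 0.87
std_error = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the estimate
margin = 1.96 * std_error                       # half-width of a 95% interval

lower = p_hat - margin
upper = p_hat + margin
print(f"estimate {p_hat:.3f}, 95% band {lower:.3f} to {upper:.3f}")
```

Under this approximation the band comes out close to the narrower "0.84 to 0.9" range mentioned above, rather than the wider "0.6 to 0.95" range.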

This occurs a lot in the real world. For example, you see 444 patients and 386 of them turn out to be a certain case. You have 444 users and 386 of them are retained after a certain period of time. You have n machines and some number of them fail. And so on.

You can always calculate the standard deviation and standard error, and they give you analytically sound estimates. Still, there is nothing quite like visualizing the outcomes for yourself and seeing whether they sit tightly around the estimate or whether the bounds are too wide for the result to be statistically significant.

In the default setup, n=444, p=0.87, and the simulation is repeated 200 times (each repeat being n trials); these values come from the settings tab on the left. See the band of outcomes for that run. Now change n=444 to n=44 and observe how the band of outcomes changes.
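The same experiment can be approximated outside the tool. The sketch below is plain Python and does not use the tool's own functions; simulate_rates is a hypothetical helper written for this example, and the fixed seed is only there to make the run reproducible.

```python
import random
import statistics

def simulate_rates(n, p, repeats, seed=0):
    """Return the observed success rate from each of `repeats` runs of n trials."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rates = []
    for _ in range(repeats):
        # Count how many of the n trials succeed with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        rates.append(successes / n)
    return rates

rates_444 = simulate_rates(n=444, p=0.87, repeats=200)
rates_44 = simulate_rates(n=44, p=0.87, repeats=200)

for label, rates in (("n=444", rates_444), ("n=44", rates_44)):
    print(label,
          "min", round(min(rates), 3),
          "max", round(max(rates), 3),
          "stdev", round(statistics.stdev(rates), 3))
```

With only 44 trials per repeat, the printed minimum and maximum should sit noticeably further apart than with 444 trials, which is the widening band the simulation is meant to show.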

HELP

sample(outcomes, probabilities)

Samples an outcome from the outcomes array according to the corresponding probabilities in the probabilities array.
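As a rough illustration of the behavior described, here is a Python sketch of a function with the same shape. It is an assumption about how such sampling could work, built on the standard library's random.choices, and is not the tool's actual implementation.

```python
import random

def sample(outcomes, probabilities):
    """Pick one outcome, weighting each by its corresponding probability."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have the same length")
    # random.choices returns a list of k picks; take the single pick.
    return random.choices(outcomes, weights=probabilities, k=1)[0]

# One toss of a coin that lands heads 87% of the time.
toss = sample(["heads", "tails"], [0.87, 0.13])
print(toss)
```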

radiobutton(name, options)

Creates a radio button parameter with the given name, with each option in the options array selectable. Returns the selected option.

textbox(name, default_value)

Creates a textbox parameter with the given name and a required default value. Returns the value of the textbox.

scatter_graph(data, xlabel, ylabel)

Creates a graph from a data array, plotting (data.xlabel, data.ylabel) on it. You can also pass xmin=null, xmax=null, ymin=null, ymax=null as additional arguments.

stop_simulation()

Calling this function stops the simulation.

average(values)

Returns the average of the values in the array. It can be handy to store the values of a metric from each simulation run and average them across all runs to obtain an estimate of the metric.

error_average(values)

Returns the estimated error (1.96 x standard error) of the average of the values at 95% confidence. In simple terms, you can assume that the estimate average(values) has error bounds +/- error_average(values). It can be handy to store the values of metrics from each simulation run and estimate both the average and the error of the average across all runs.
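The calculation described here can be sketched in Python as follows. This is an illustration, not the tool's code: it assumes the standard error is the sample standard deviation divided by the square root of the number of values, and the source does not say whether the tool uses the sample or the population standard deviation.

```python
import math
import statistics

def average(values):
    """Arithmetic mean of the values."""
    return sum(values) / len(values)

def error_average(values):
    """1.96 x standard error of the mean (95% confidence, normal approximation)."""
    # Assumption: sample standard deviation (n - 1 denominator).
    std_error = statistics.stdev(values) / math.sqrt(len(values))
    return 1.96 * std_error

# Hypothetical success rates collected from five simulation runs.
run_rates = [0.85, 0.88, 0.87, 0.86, 0.89]
print(f"{average(run_rates):.3f} +/- {error_average(run_rates):.3f}")
```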

stdev(values)

Returns the standard deviation of the values in the given array.
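The description does not say whether this is the population or the sample standard deviation. The Python sketch below, using the standard library's statistics module rather than the tool itself, shows how the two differ on a small array so that results can be compared against either convention.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation: divides by n.
population_sd = statistics.pstdev(values)

# Sample standard deviation: divides by n - 1, so it is slightly larger.
sample_sd = statistics.stdev(values)

print(round(population_sd, 3), round(sample_sd, 3))
```

The gap between the two shrinks as the array grows, so for a few hundred simulation runs the choice matters little.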