Probability and Statistics

Probability

Each distribution is represented as an entity. For every distribution known to the system, the consistency of its parameters is checked. If the parameters of a distribution are invalid, the functions return Undefined. For example, NormalDistribution(a,-1) evaluates to Undefined because the variance is negative.

BernoulliDistribution(p)

Bernoulli distribution

Parameters:	p – number, probability of an event in a single trial

A random variable has a Bernoulli distribution with probability p if it can be interpreted as the indicator of an event, where p is the probability of observing the event in a single trial. The numerical value of p must satisfy 0 < p < 1.
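The definition above can be illustrated with a minimal Python sketch. It is not the system's own implementation: the function name `bernoulli_pmf` is hypothetical, and `None` is only an assumed stand-in for Undefined when the parameter check fails.

```python
def bernoulli_pmf(p, x):
    """Probability that an indicator variable with success probability p equals x."""
    # Mirror the documented constraint 0 < p < 1; None stands in for Undefined
    # (an assumption made for this sketch only).
    if not 0 < p < 1:
        return None
    if x == 1:
        return p        # the event was observed
    if x == 0:
        return 1 - p    # the event was not observed
    return 0.0          # an indicator takes no other values


print(bernoulli_pmf(0.3, 1))
print(bernoulli_pmf(1.5, 1))
```

The two calls show a valid parameter and an invalid one; the second returns the stand-in for Undefined.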

See also

BinomialDistribution()

BinomialDistribution(p, n)

Binomial distribution

Parameters:	p – number, probability of observing an event in a single trial
	n – number of trials

Suppose a trial is repeated n times, the probability of observing an event in a single trial is p, and the outcomes of all trials are mutually independent. Then the number of trials in which the event occurred follows the binomial distribution, which is represented by BinomialDistribution(p,n). The numerical value of p must satisfy 0 < p < 1, and the numerical value of n must be a positive integer.
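For reference, the probability that the event occurs in exactly \(k\) of the \(n\) trials is \(\binom{n}{k} p^k (1-p)^{n-k}\). The following Python sketch evaluates this formula under the parameter constraints stated above. The name `binomial_pmf` is hypothetical, and `None` again stands in for Undefined; none of this is taken from the system's source.

```python
from math import comb


def binomial_pmf(p, n, k):
    """Probability of exactly k events in n independent trials with event probability p."""
    # Documented constraints: 0 < p < 1 and n a positive integer.
    # None stands in for Undefined (an assumption made for this sketch only).
    if not 0 < p < 1:
        return None
    if not isinstance(n, int) or n <= 0:
        return None
    if not 0 <= k <= n:
        return 0.0  # outside the support of the distribution
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The probabilities over k = 0..n should sum to 1 (up to rounding).
print(sum(binomial_pmf(0.3, 10, k) for k in range(11)))
```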

Statistics

ChiSquareTest(observed, expected)

Pearson’s chi-square test

Parameters:	observed – list of observed frequencies
	expected – list of expected frequencies
	params – number of estimated parameters

ChiSquareTest tests whether a sample was drawn from a given distribution. To apply it, partition the range of the data into intervals and determine the observed and the expected frequency for each interval. The expected frequency of the \(i\)-th interval is \(n_i=n p_i\), where \(p_i\) is the probability measure of the \(i\)-th interval and \(n\) is the total number of observations. If any parameters of the distribution were estimated, their number is given as params. The function returns a list of three local substitution rules: the first contains the test statistic, the second contains the value of the parameters, and the last contains the degrees of freedom. The test statistic is distributed as ChiSquareDistribution().
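The computation behind the test can be sketched in Python as follows. This is an illustration of the standard Pearson statistic \(\sum_i (o_i - n_i)^2 / n_i\), where \(o_i\) are the observed frequencies, and of the usual degrees of freedom (number of intervals minus one, minus the number of estimated parameters). It is not the system's implementation, and the function names are hypothetical.

```python
def expected_frequencies(n, probabilities):
    """Expected frequency n_i = n * p_i for each interval."""
    return [n * p for p in probabilities]


def chi_square_statistic(observed, expected, params=0):
    """Return (test statistic, degrees of freedom) for Pearson's chi-square test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Sum of squared deviations, each scaled by the expected frequency.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # One degree of freedom is lost because the frequencies sum to n,
    # and one more for every parameter estimated from the sample.
    degrees_of_freedom = len(observed) - 1 - params
    return statistic, degrees_of_freedom


# 60 observations sorted into three intervals assumed to be equally likely.
observed = [18, 22, 20]
expected = expected_frequencies(60, [1 / 3, 1 / 3, 1 / 3])
print(chi_square_statistic(observed, expected))
```

A large statistic relative to the chi-square distribution with the computed degrees of freedom is evidence against the hypothesis that the sample follows the given distribution.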