Chapter 16
Probability, Statistics, and Stochastic Processes
This chapter briefly introduces the main content of probability theory: random events and their probabilities, random variables and distribution functions, numerical characteristics of random variables, probability generating functions, moment generating functions and characteristic functions, the law of large numbers and the central limit theorem, and so on. In addition to the basic concepts, the normal distribution table and the use of probability paper are also introduced. The chapter then focuses on the commonly used methods of mathematical statistics, including samples and their frequency distributions, interval estimation of population parameters, statistical testing, analysis of variance, regression analysis, orthogonal experimental design, sampling inspection, and quality assessment (process control). Finally, the basic content of the theory of stochastic processes is briefly described, with emphasis on the more commonly used Markov processes and stationary stochastic processes.
§ 1 Probability Theory
1. Events and Probability
1. Random events and their operational relationships
[ Random event · inevitable event · impossible event ] Under certain conditions, an experimental outcome that may or may not occur is called a random event, or event for short, denoted by A, B, C, ···. There are two special cases of random events: the inevitable event (an event that must occur in every trial under the given conditions) and the impossible event (an event that cannot occur in any trial under the given conditions), denoted respectively by Ω and Φ.
[ Operational relationship of events ]
1° Inclusion  If whenever event B occurs event A must also occur, then A is said to contain B (or B is contained in A), denoted A ⊃ B or B ⊂ A.
2° Equivalence  If A ⊃ B and A ⊂ B, that is, events A and B always occur or fail to occur together, then A and B are said to be equivalent, denoted A = B.
3° Product  The event that A and B occur simultaneously is called the product of A and B, denoted A ∩ B (or AB).
4° Sum  The event that at least one of A and B occurs is called the sum of A and B, denoted A ∪ B (or A + B).
5° Difference  The event that A occurs and B does not is called the difference of A and B, denoted A \ B (or A − B).
6° Mutual exclusion  If events A and B cannot occur at the same time, that is, AB = Φ, then A and B are said to be mutually exclusive (or incompatible).
7° Opposition  If events A and B are mutually exclusive, and one of them must occur in every trial, that is, A ∩ B = Φ and A ∪ B = Ω, then B is called the opposite (complementary) event of A, denoted B = Ā.
8° Completeness  If at least one of the events A1, A2, ···, An occurs in every trial, that is, A1 ∪ A2 ∪ ··· ∪ An = Ω, then {A1, A2, ···, An} is said to constitute a complete set of events. In particular, when A1, A2, ···, An are pairwise mutually exclusive, that is, Ai ∩ Aj = Φ (i ≠ j; i, j = 1, 2, ···, n), {A1, A2, ···, An} is called a complete set of mutually exclusive events.
2. Several definitions of probability
[ Frequency and Probability ] Whether or not a random event occurs in a single trial is an accidental phenomenon that cannot be determined in advance, but when the trial is repeated many times a statistical regularity of its occurrence appears. Specifically, if n repeated trials are carried out under the same conditions and event A occurs ν times, then the frequency ν/n of event A in the n trials becomes stable as n increases indefinitely. This statistical regularity shows that the possibility of event A occurring is an objective attribute inherent in the event itself, not something changed by people's subjective will. The measure of this possibility is called the probability of event A, denoted P(A). When the number of trials n is large enough, the frequency of the event can be used as an approximation to its probability, that is,
    P(A) ≈ ν/n
[ Classical definition of probability ] Suppose a random experiment (an experiment whose outcome cannot be predicted exactly in advance and which can be repeated under the same conditions) has only a finite number of different basic events ω1, ω2, ···, ωn (a basic event is itself an event; a general event is always composed of several basic events), each basic event being equally possible*. The totality of the basic events is denoted Ω and is called the basic event space. If an event A consists of k (k ≤ n) different basic events, then the probability P(A) of A is defined as
    P(A) = k/n
The probability of the impossible event is defined as P(Φ) = 0.
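As a small illustration of the classical definition, consider two dice thrown together: the 36 ordered pairs of points form the basic event space, and the probability of "the sum of the points is 7" is the number of favourable basic events divided by 36. A minimal Python sketch (the dice example is an illustrative addition):

```python
from itertools import product

# Basic event space: all ordered pairs (i, j) of points on two dice -- 36 equally likely outcomes.
omega = list(product(range(1, 7), repeat=2))

# Event A: the sum of the points equals 7.
favourable = [w for w in omega if sum(w) == 7]

# Classical definition: P(A) = k / n.
print(len(favourable), len(omega), len(favourable) / len(omega))   # 6 36 0.1666...
```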
[ Axiomatic Definition of Probability ]
Definition 1  Let Ω be a set of points ω and let F be a collection of subsets of Ω. If F satisfies the following conditions:
 (i) Ω ∈ F;
 (ii) if A ∈ F, then Ā = Ω \ A ∈ F;
 (iii) for any An ∈ F (n = 1, 2, ···), we have
    ∪_{n=1}^{∞} An ∈ F
then F is said to be a σ-algebra (event field) on Ω.
Definition 2  Let P(A) be a real-valued set function defined on a σ-algebra F. If it satisfies the conditions:
 (i) for any A ∈ F, 0 ≤ P(A) ≤ 1;
 (ii) P(Ω) = 1;
 (iii) for any An ∈ F (n = 1, 2, ···) with Ai ∩ Aj = Φ (i ≠ j),
    P(∪_{n=1}^{∞} An) = Σ_{n=1}^{∞} P(An)
then P(A) is called a probability measure on F, or probability for short. In this case ω is called a basic event, any A (∈ F) is called an event, F is the totality of events, P(A) is called the probability of the event A, and <Ω, F, P> is called a probability space.
3. Basic Properties of Probability
1°  0 ≤ P(A) ≤ 1
2°  P(inevitable event) = P(Ω) = 1
3°  P(impossible event) = P(Φ) = 0
4°  P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
 If A, B are mutually exclusive, then P(A ∪ B) = P(A) + P(B)
 If A1, A2, ···, An are pairwise mutually exclusive, then
    P(A1 ∪ A2 ∪ ··· ∪ An) = P(A1) + P(A2) + ··· + P(An)
5°  If A ⊃ B, then P(A) ≥ P(B)
6°  If A ⊃ B, then P(A) − P(B) = P(A \ B)
7°  For any event A, P(Ā) = 1 − P(A)
8°  If A1, A2, ···, An is a complete set of pairwise mutually exclusive events, then
    P(A1 ∪ A2 ∪ ··· ∪ An) = P(A1) + P(A2) + ··· + P(An) = 1
9°  Let An ∈ F, An ⊂ An+1, n = 1, 2, ···, and let A = ∪_{n=1}^{∞} An; then
    P(A) = lim_{n→∞} P(An)   (continuity theorem)
4. Formulas for Calculating Probabilities
[ Conditional probability and multiplication formula ] The probability that event A occurs given that event B has occurred is called the conditional probability of A under the condition that B has occurred, denoted P(A | B). When P(B) > 0, it is defined by
    P(A | B) = P(AB) / P(B)
When P(B) = 0, one sets P(A | B) = 0. This leads to the multiplication formula:
    P(AB) = P(B) P(A | B) = P(A) P(B | A)
    P(A1 A2 ··· An) = P(A1) P(A2 | A1) P(A3 | A1 A2) ··· P(An | A1 A2 ··· An−1)   (P(A1 A2 ··· An−1) > 0)
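For instance, the multiplication formula gives the probability that the first two cards drawn without replacement from a 52-card deck are both aces; the sketch below (the deck example is an illustrative addition) also checks the result by direct enumeration.

```python
from itertools import permutations

# Multiplication formula P(A1 A2) = P(A1) P(A2 | A1):
# probability that the first two cards drawn without replacement are both aces.
p_chain = (4 / 52) * (3 / 51)

# Check by enumerating all ordered pairs of distinct cards.
deck = list(range(52))                  # cards 0..3 are taken to be the four aces
pairs = list(permutations(deck, 2))
p_enum = sum(1 for a, b in pairs if a < 4 and b < 4) / len(pairs)

print(p_chain, p_enum)                  # both equal 1/221 = 0.004524...
```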
[ Independence formula ] If events A and B satisfy P(A | B) = P(A), then event A is said to be independent of event B. Independence is mutual: if A is independent of B, then B is independent of A, and A and B are said to be independent of each other.
A necessary and sufficient condition for A and B to be independent of each other is
    P(AB) = P(A) P(B)
If any m (2 ≤ m ≤ n) of the events A1, A2, ···, An satisfy the relation
    P(Ai1 Ai2 ··· Aim) = P(Ai1) P(Ai2) ··· P(Aim)   (1 ≤ i1 < i2 < ··· < im ≤ n)
then A1, A2, ···, An are said to be independent as a whole, or, briefly, mutually independent.
[ Total probability formula ] If the events B1, B2, ··· satisfy
    Bi Bj = Φ (i ≠ j),   P(∪_i Bi) = 1,   P(Bi) > 0   (i = 1, 2, ···)
then for any event A we have
    P(A) = Σ_i P(Bi) P(A | Bi)
If there are only n events Bi, the formula still holds, with n terms on the right-hand side.
[ Bayes formula ] If the events B1, B2, ··· satisfy
    Bi Bj = Φ   (i ≠ j)
    P(∪_i Bi) = 1,   P(Bi) > 0   (i = 1, 2, ···)
then for any event A with P(A) > 0 we have
    P(Bi | A) = P(Bi) P(A | Bi) / Σ_k P(Bk) P(A | Bk)
If there are only n events Bi, the formula still holds, with n terms in the denominator on the right-hand side.
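A short numerical illustration of the total probability and Bayes formulas (the three-supplier figures below are hypothetical, chosen only for the example):

```python
# Hypothetical example: an item comes from one of three suppliers B1, B2, B3
# forming a complete set of mutually exclusive events; A = "the item is defective".
P_B = [0.5, 0.3, 0.2]                  # P(B1), P(B2), P(B3)
P_A_given_B = [0.01, 0.02, 0.05]       # P(A | Bi)

# Total probability formula: P(A) = sum_i P(Bi) P(A | Bi)
P_A = sum(pb * pa for pb, pa in zip(P_B, P_A_given_B))

# Bayes formula: P(Bi | A) = P(Bi) P(A | Bi) / P(A)
posterior = [pb * pa / P_A for pb, pa in zip(P_B, P_A_given_B)]

print(P_A)          # 0.021
print(posterior)    # [0.238..., 0.285..., 0.476...]; the three values sum to 1
```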
[ Bernoulli formula ] If the probability of an event A occurring in a single trial is p, then the probability p_{n,k} that A occurs exactly k times in n repeated independent trials is
    p_{n,k} = C(n,k) p^k (1 − p)^{n−k}   (k = 0, 1, ···, n)
where C(n,k) = n! / [k!(n − k)!] is the binomial coefficient.
When both n and k are large, there is the approximate formula
    p_{n,k} ≈ (1 / √(2πnpq)) e^{−x_k²/2}
where q = 1 − p and x_k = (k − np) / √(npq).
[ Poisson formula ] When n is sufficiently large and p is small, there is the approximate formula
    p_{n,k} ≈ (λ^k / k!) e^{−λ}
where λ = np.
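The quality of the two approximations is easy to check numerically; a minimal sketch (the values of n, p and k are chosen only for illustration):

```python
from math import comb, exp, sqrt, pi, factorial

n, p, k = 100, 0.05, 7
q = 1 - p

# Exact Bernoulli (binomial) probability p_{n,k}
exact = comb(n, k) * p**k * q**(n - k)

# Local normal approximation with x_k = (k - np)/sqrt(npq)
x_k = (k - n * p) / sqrt(n * p * q)
normal_approx = exp(-x_k**2 / 2) / sqrt(2 * pi * n * p * q)

# Poisson approximation with lambda = n p
lam = n * p
poisson_approx = lam**k * exp(-lam) / factorial(k)

print(exact, normal_approx, poisson_approx)
```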
2. Random Variables and Distribution Functions
[ Random variable and its probability distribution function ] The result of each trial can be represented by the value of a variable. The value of this variable varies with random factors but follows a definite probability distribution law; such a variable is called a random variable, denoted by ξ, η, ···. It is a numerical (quantitative) description of random phenomena.
Given a random variable ξ, the probability P(ξ ≤ x) of the event that its value does not exceed a real number x is a function of x, called the probability distribution function of ξ, or the distribution function for short, denoted F(x), that is,
    F(x) = P(ξ ≤ x)   (−∞ < x < +∞)
[ Basic properties of distribution functions ]
1°  F(−∞) = lim_{x→−∞} F(x) = 0,  F(+∞) = lim_{x→+∞} F(x) = 1
2°  If x1 < x2, then F(x1) ≤ F(x2)   (monotonicity)
3°  F(x + 0) = F(x)   (right continuity)
4°  P(a < ξ ≤ b) = F(b) − F(a)   (a < b)
5°  P(ξ = a) = F(a) − F(a − 0)
[ Discrete distribution and probability distribution column ] If a random variable ξ can take only a finite or countable number of values x1, x2, ···, xn, ···, it is called a discrete random variable. If P(ξ = xk) = pk (k = 1, 2, ···), the probability distribution of ξ is completely determined by {pk}; {pk} is called the probability distribution column of ξ. {pk} has the following properties:
1°  pk ≥ 0
2°  Σ_k pk = 1
3°  If D is any measurable set on the real axis, then P(ξ ∈ D) = Σ_{xk ∈ D} pk
4°  The distribution function
    F(x) = Σ_{xk ≤ x} pk
is a step function with jumps at x1, x2, ···.
[ Continuous distribution and distribution density function ] If the distribution function F(x) of the random variable ξ can be expressed as
    F(x) = ∫_{−∞}^{x} p(t) dt   (p(x) ≥ 0)
then ξ is called a continuous random variable, and p(x) is called its distribution density function (or distribution density). The distribution density function has the following properties:
1°  p(x) = F′(x) at every point where p(x) is continuous
2°  ∫_{−∞}^{+∞} p(x) dx = 1
3°  If p(x) is the distribution density of the continuous random variable ξ, then for any measurable set D on the real axis,
    P(ξ ∈ D) = ∫_D p(x) dx
[ Distribution of a function of a random variable ] Suppose the random variable η is a function of the random variable ξ:
    η = f(ξ)
Let the distribution function of ξ be F(x); then the distribution function G(y) of η is
    G(y) = P(η ≤ y) = P(f(ξ) ≤ y) = ∫_{f(x) ≤ y} dF(x)
In particular, when ξ is a discrete random variable with possible values x1, x2, ··· and P(ξ = xk) = pk, then
    G(y) = Σ_{f(xk) ≤ y} pk
When ξ is a continuous random variable with distribution density p(x), then
    G(y) = ∫_{f(x) ≤ y} p(x) dx
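For a discrete ξ the formula G(y) = Σ_{f(xk) ≤ y} pk can be applied directly; the small sketch below (the distribution column of ξ is a made-up example) computes the distribution function of η = ξ².

```python
# Made-up distribution column of xi: values and probabilities
xs = [-2, -1, 0, 1, 2]
ps = [0.1, 0.2, 0.4, 0.2, 0.1]

def G(y, f=lambda x: x * x):
    """Distribution function of eta = f(xi): G(y) = sum of p_k over f(x_k) <= y."""
    return sum(p for x, p in zip(xs, ps) if f(x) <= y)

for y in (0, 1, 4):
    print(y, G(y))          # G(0) = 0.4, G(1) = 0.8, G(4) = 1.0
```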
[ Joint distribution function and marginal distribution functions of a random vector ] If ξ1, ξ2, ···, ξn are n random variables associated with the same set of conditions (defined on the same probability space), then (ξ1, ξ2, ···, ξn) is called an n-dimensional random variable, or random vector.
If (x1, x2, ···, xn) is a point of the n-dimensional real space R^n, then the probability of the event "ξ1 ≤ x1, ξ2 ≤ x2, ···, ξn ≤ xn",
    F(x1, x2, ···, xn) = P(ξ1 ≤ x1, ξ2 ≤ x2, ···, ξn ≤ xn)
regarded as a function of x1, x2, ···, xn, is called the joint distribution function of the random vector (ξ1, ξ2, ···, ξn).
Suppose (ξ_{i1}, ···, ξ_{im}) is an m-dimensional random variable composed of m (m < n) components taken arbitrarily from (ξ1, ···, ξn); then the joint distribution function of (ξ_{i1}, ···, ξ_{im}) is called an m-dimensional marginal distribution function of (ξ1, ···, ξn).
If the distribution functions of (ξ1, ···, ξn) and (ξ_{i1}, ···, ξ_{im}) are denoted by F(x1, x2, ···, xn) and F_{i1···im}(x_{i1}, ···, x_{im}) respectively, then
    F_{i1···im}(x_{i1}, ···, x_{im}) = F(∞, ···, x_{i1}, ···, ∞, ···, x_{im}, ···, ∞)
that is, the marginal distribution function is obtained from F by letting all arguments other than x_{i1}, ···, x_{im} tend to +∞.
[ Conditional distribution function and independence ] Suppose ξ is a random variable and the event B satisfies P(B) > 0; then
    F(x | B) = P(ξ ≤ x | B)
is called the conditional distribution function of ξ under the condition that the event B has occurred.
1°  Let (ξ, η) be a two-dimensional discrete random variable, the possible values of ξ and η being xi (i = 1, 2, ···) and yk (k = 1, 2, ···). The joint distribution of (ξ, η) is
    P(ξ = xi, η = yk) = p_{ik}
The two one-dimensional marginal distributions are
    P(ξ = xi) = p_{i·} = Σ_k p_{ik}   (i = 1, 2, ···)
    P(η = yk) = p_{·k} = Σ_i p_{ik}   (k = 1, 2, ···)
We call
    P(ξ = xi | η = yk) = p_{ik} / p_{·k}   (p_{·k} > 0, i = 1, 2, ···)
the conditional distribution of the discrete random variable ξ under the condition η = yk. Similarly,
    P(η = yk | ξ = xi) = p_{ik} / p_{i·}   (p_{i·} > 0,
    k = 1, 2, ···)
is the conditional distribution of the discrete random variable η under the condition ξ = xi.
2°  Let (ξ, η) be a two-dimensional continuous random variable with joint distribution density f(x, y). If the marginal density p_η(y) = ∫_{−∞}^{+∞} f(x, y) dx is positive at the point y, then
    F(x | y) = ∫_{−∞}^{x} f(u, y) du / p_η(y)
is called the conditional distribution function of ξ under the condition η = y. Similarly, if p_ξ(x) = ∫_{−∞}^{+∞} f(x, y) dy is positive at the point x, then
    F(y | x) = ∫_{−∞}^{y} f(x, v) dv / p_ξ(x)
is the conditional distribution function of η under the condition ξ = x.
3°  If the joint distribution function of (ξ1, ···, ξn) is equal to the product of all its one-dimensional marginal distribution functions, i.e.
    F(x1, x2, ···, xn) = F1(x1) F2(x2) ··· Fn(xn)
(equivalently, P(ξ1 ≤ x1, ···, ξn ≤ xn) = P(ξ1 ≤ x1) ··· P(ξn ≤ xn) for all x1, ···, xn), then ξ1, ···, ξn are said to be mutually independent.
3. Numerical Characteristics of Random Variables
[ Mathematical expectation (mean) and variance ] The mathematical expectation (or mean) of a random variable ξ is denoted Eξ (or Mξ); it describes the centre about which the values of the random variable lie. The mathematical expectation of the random variable (ξ − Eξ)² is called the variance, denoted Dξ (or Var ξ), and the square root of Dξ is called the standard deviation (mean square deviation), denoted σ = √(Dξ). These quantities describe how densely the possible values of the random variable cluster about the mean.
1°  If ξ is a continuous random variable with distribution density p(x) and distribution function F(x), then (when the integrals converge absolutely)
    Eξ = ∫_{−∞}^{+∞} x p(x) dx = ∫_{−∞}^{+∞} x dF(x)
    Dξ = ∫_{−∞}^{+∞} (x − Eξ)² p(x) dx
2°  If ξ is a discrete random variable with possible values xk (k = 1, 2, ···) and P(ξ = xk) = pk, then (when the series converge absolutely)
    Eξ = Σ_k xk pk
    Dξ = Σ_k (xk − Eξ)² pk
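Both sets of formulas are easy to evaluate numerically; the sketch below does so for a made-up discrete distribution and, by a crude Riemann sum, for the exponential density p(x) = λe^{−λx} (both examples are illustrative additions).

```python
from math import exp

# Discrete case: E(xi) = sum x_k p_k,  D(xi) = sum (x_k - E xi)^2 p_k
xs = [0, 1, 2, 3]
ps = [0.1, 0.4, 0.3, 0.2]
mean_d = sum(x * p for x, p in zip(xs, ps))
var_d = sum((x - mean_d) ** 2 * p for x, p in zip(xs, ps))
print(mean_d, var_d)                      # 1.6 0.84

# Continuous case (exponential density, lambda = 2), by a simple Riemann sum
lam, dx = 2.0, 1e-4
grid = [i * dx for i in range(int(20 / dx))]          # integrate over [0, 20)
mean_c = sum(x * lam * exp(-lam * x) * dx for x in grid)
var_c = sum((x - mean_c) ** 2 * lam * exp(-lam * x) * dx for x in grid)
print(mean_c, var_c)                      # close to 1/lambda = 0.5 and 1/lambda^2 = 0.25
```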
[ Several formulas for the mean and variance ]
1°  Dξ = Eξ² − (Eξ)²
2°  Ea = a,  Da = 0   (a a constant)
3°  E(cξ) = c Eξ,  D(cξ) = c² Dξ   (c a constant)
4°  E(ξ + η) = Eξ + Eη
5°  For any n random variables ξ1, ξ2, ···, ξn,
    E(ξ1 + ξ2 + ··· + ξn) = Eξ1 + Eξ2 + ··· + Eξn
    D(ξ1 + ξ2 + ··· + ξn) = Σ_{k=1}^{n} Dξk + 2 Σ_{i<j} E[(ξi − Eξi)(ξj − Eξj)]
6°  If ξ1, ξ2, ···, ξn are n mutually independent random variables, then
    E(ξ1 ξ2 ··· ξn) = (Eξ1)(Eξ2) ··· (Eξn)
    D(ξ1 + ξ2 + ··· + ξn) = Dξ1 + Dξ2 + ··· + Dξn
7°  If ξ1, ξ2, ···, ξn are mutually independent random variables with Eξk = a and Dξk = σ² (k = 1, 2, ···, n), then the mean and variance of the random variable ξ̄ = (1/n)(ξ1 + ξ2 + ··· + ξn) are
    Eξ̄ = a,   Dξ̄ = σ²/n
[ Chebyshev inequality ] For any given positive number ε, we have
    P(|ξ − Eξ| ≥ ε) ≤ Dξ / ε²
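A quick Monte Carlo check of the Chebyshev inequality (uniform samples on [0, 1] are used only as an arbitrary test case):

```python
import random

# xi uniform on [0, 1]: E xi = 0.5, D xi = 1/12
random.seed(0)
n, eps = 100_000, 0.3
samples = [random.random() for _ in range(n)]
freq = sum(1 for x in samples if abs(x - 0.5) >= eps) / n

bound = (1 / 12) / eps**2      # Chebyshev bound D(xi) / eps^2
print(freq, bound)             # observed frequency (about 0.4) lies below the bound (0.926)
```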
[ Conditional mathematical expectation and the total expectation formula ] Let F(x | B) be the conditional distribution function of the random variable ξ given the event B; then
    E(ξ | B) = ∫_{−∞}^{+∞} x dF(x | B)
is called (when the integral converges absolutely) the conditional mathematical expectation of ξ given the event B. If ξ is a continuous random variable with conditional distribution density p(x | B), then
    E(ξ | B) = ∫_{−∞}^{+∞} x p(x | B) dx
If ξ is a discrete random variable with possible values x1, x2, ···, then
    E(ξ | B) = Σ_k xk P(ξ = xk | B)
If B1, B2, ···, Bn is a complete set of mutually exclusive events, then there is the total mathematical expectation formula
    Eξ = Σ_{k=1}^{n} P(Bk) E(ξ | Bk)
[ Relationship between the median, the mode and the mean ] A number m satisfying
    P(ξ ≥ m) ≥ 1/2,   P(ξ ≤ m) ≥ 1/2
is called a median of the random variable ξ. In other words, m satisfies the two conditions
    P(ξ ≥ m) ≥ 1/2
    P(ξ ≤ m) ≥ 1/2
(for a continuous random variable these reduce to F(m) = 1/2). A value x̂ at which the distribution density function attains its maximum, that is,
    p(x̂) = maximum value
is called a mode of the random variable ξ.
For a unimodal symmetric distribution, m = x̂ = Eξ (the mean).
For an asymmetric unimodal distribution, m lies between x̂ and Eξ.
[ Higher-order origin moments and central moments ] For r > 0, the mathematical expectations (assumed to exist) of the random variables ξ^r and (ξ − Eξ)^r are called, respectively, the r-th order origin moment and the r-th order central moment of the random variable ξ, denoted ν_r = Eξ^r and μ_r = E(ξ − Eξ)^r. In particular, ν_1 is the mean and μ_2 is the variance.
1°  If ξ is a continuous random variable with distribution density p(x), then
    ν_r = ∫_{−∞}^{+∞} x^r p(x) dx,   μ_r = ∫_{−∞}^{+∞} (x − Eξ)^r p(x) dx
2°  If ξ is a discrete random variable with possible values xk (k = 1, 2, ···) and P(ξ = xk) = pk, then
    ν_r = Σ_k xk^r pk,   μ_r = Σ_k (xk − Eξ)^r pk
3°  For r > 0, the mathematical expectations of |ξ|^r and |ξ − Eξ|^r (assumed to exist) are called, respectively, the r-th order absolute origin moment and the r-th order absolute central moment of ξ; formulas analogous to those in 1° and 2° hold.
4°  The origin moments and the central moments satisfy the following relations (r a positive integer):
    μ_r = Σ_{k=0}^{r} C(r,k) (−ν_1)^k ν_{r−k}
    ν_r = Σ_{k=0}^{r} C(r,k) ν_1^k μ_{r−k}
where C(r,k) = r! / [k!(r − k)!] is the binomial coefficient.
[ Covariance and correlation coefficient ] Assuming that the means and variances of the random variables ξ and η both exist, the covariance of ξ and η, denoted Cov(ξ, η), is
    Cov(ξ, η) = E[(ξ − Eξ)(η − Eη)]
and the correlation coefficient is
    ρ_{ξη} = Cov(ξ, η) / (√(Dξ) √(Dη))
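The covariance and correlation coefficient of a two-dimensional discrete random variable can be computed directly from its joint distribution column, using the equivalent identity Cov(ξ, η) = E(ξη) − Eξ·Eη; a small sketch with a made-up joint table:

```python
# Made-up joint distribution P(xi = x_i, eta = y_k) = p[i][k]
xs, ys = [0, 1], [0, 1, 2]
p = [[0.10, 0.20, 0.10],
     [0.05, 0.15, 0.40]]

E_x = sum(x * sum(row) for x, row in zip(xs, p))
E_y = sum(y * sum(p[i][k] for i in range(len(xs))) for k, y in enumerate(ys))
E_xy = sum(x * y * p[i][k] for i, x in enumerate(xs) for k, y in enumerate(ys))

cov = E_xy - E_x * E_y                             # Cov = E(xi*eta) - E(xi) E(eta)
D_x = sum(x**2 * sum(row) for x, row in zip(xs, p)) - E_x**2
D_y = sum(y**2 * sum(p[i][k] for i in range(len(xs))) for k, y in enumerate(ys)) - E_y**2
rho = cov / (D_x**0.5 * D_y**0.5)                  # correlation coefficient

print(cov, rho)
```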
4. Probability Generating Functions, Moment Generating Functions and Characteristic Functions
[ Probability generating function of an integer-valued random variable ] If the random variable ξ takes only non-negative integer values, the mean of the random variable function s^ξ is called the probability generating function of ξ. Writing P(ξ = k) = pk (k = 0, 1, 2, ···), the probability generating function is
    P(s) = E s^ξ = Σ_{k=0}^{∞} pk s^k   (|s| ≤ 1)
If the derivatives below exist at s = 1, then
    P′(1) = Eξ
    P″(1) = E[ξ(ξ − 1)]
    ······
    P^{(r)}(1) = E[ξ(ξ − 1) ··· (ξ − r + 1)]
and in turn
    Eξ = P′(1),   Dξ = P″(1) + P′(1) − [P′(1)]²
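These relations can be verified symbolically; a minimal sketch using sympy (assumed available), with the probability generating function P(s) = e^{λ(s−1)} of the Poisson distribution:

```python
import sympy as sp

s, lam = sp.symbols('s lambda', positive=True)
P = sp.exp(lam * (s - 1))              # generating function of the Poisson distribution

P1 = sp.diff(P, s).subs(s, 1)          # P'(1)  = E xi
P2 = sp.diff(P, s, 2).subs(s, 1)       # P''(1) = E[xi(xi - 1)]

mean = sp.simplify(P1)                 # lambda
var = sp.simplify(P2 + P1 - P1**2)     # D xi = P''(1) + P'(1) - [P'(1)]^2 = lambda
print(mean, var)
```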
[ Moment generating function ] If ξ is a random variable, the mean of the random variable function e^{tξ},
    M(t) = E e^{tξ}
is called the moment generating function of ξ. If ξ has origin moments ν_k = Eξ^k of every order (k = 1, 2, ···), then
    M(t) = Σ_{k=0}^{∞} ν_k t^k / k!   (ν_0 = 1)
1°  If ξ is a discrete random variable with possible values x1, x2, ··· and P(ξ = xk) = pk, then
    M(t) = Σ_k e^{t xk} pk
2°  If ξ is a continuous random variable with distribution density p(x), then
    M(t) = ∫_{−∞}^{+∞} e^{tx} p(x) dx
[ Characteristic function ] If ξ is a random variable, the mean of the complex-valued random variable e^{itξ},
    φ(t) = E e^{itξ}   (i = √(−1))
is called the characteristic function of ξ. If ξ has origin moments ν_k = Eξ^k of every order (k = 1, 2, ···), then
    φ(t) = Σ_{k=0}^{∞} ν_k (it)^k / k!,   ν_k = φ^{(k)}(0) / i^k
1°  If ξ is a discrete random variable with possible values x1, x2, ··· and P(ξ = xk) = pk, then
    φ(t) = Σ_k e^{it xk} pk
2°  If ξ is a continuous random variable with distribution density p(x), then
    φ(t) = ∫_{−∞}^{+∞} e^{itx} p(x) dx
[ Relationship between the probability generating function, the moment generating function and the characteristic function ] For a non-negative integer-valued random variable with probability generating function P(s),
    P(e^t) = M(t)
    P(e^{it}) = φ(t)
and in general φ(t) = M(it).
5. Commonly Used Distribution Functions
1. Commonly used discrete distributions
For each of the commonly used discrete distributions listed below, the probability distribution and its domain, the parameter conditions, the mean, the variance, the probability generating function P(s), the moment generating function M(t) and the characteristic function φ(t) are given (throughout, q = 1 − p and C(n,k) is the binomial coefficient).

Binomial distribution B(n, p)
 Probability distribution: P(ξ = k) = C(n,k) p^k q^{n−k}   (k = 0, 1, ···, n)
 Parameter conditions: n a positive integer, 0 < p < 1
 Mean: np   Variance: npq
 P(s) = (q + ps)^n,  M(t) = (q + pe^t)^n,  φ(t) = (q + pe^{it})^n

Poisson distribution P(λ)
 Probability distribution: P(ξ = k) = λ^k e^{−λ} / k!   (k = 0, 1, 2, ···)
 Parameter conditions: λ > 0
 Mean: λ   Variance: λ
 P(s) = e^{λ(s−1)},  M(t) = e^{λ(e^t − 1)},  φ(t) = e^{λ(e^{it} − 1)}

Geometric distribution
 Probability distribution: P(ξ = k) = p q^{k−1}   (k = 1, 2, ···)
 Parameter conditions: 0 < p < 1
 Mean: 1/p   Variance: q/p²
 P(s) = ps / (1 − qs),  M(t) = pe^t / (1 − qe^t),  φ(t) = pe^{it} / (1 − qe^{it})

Negative binomial distribution
 Probability distribution: P(ξ = k) = C(r + k − 1, k) p^r q^k   (k = 0, 1, 2, ···)
 Parameter conditions: r a positive real number, 0 < p < 1
 Mean: rq/p   Variance: rq/p²
 P(s) = [p / (1 − qs)]^r,  M(t) = [p / (1 − qe^t)]^r,  φ(t) = [p / (1 − qe^{it})]^r

Single-point (degenerate) distribution
 Probability distribution: P(ξ = c) = 1
 Parameter conditions: c a non-negative integer
 Mean: c   Variance: 0
 P(s) = s^c,  M(t) = e^{ct},  φ(t) = e^{ict}

Logarithmic distribution
 Probability distribution: P(ξ = k) = −p^k / [k ln(1 − p)]   (k = 1, 2, ···)
 Parameter conditions: 0 < p < 1
 Mean: −p / [(1 − p) ln(1 − p)]   Variance: −p [p + ln(1 − p)] / [(1 − p)² ln²(1 − p)]
 P(s) = ln(1 − ps) / ln(1 − p),  M(t) = ln(1 − pe^t) / ln(1 − p),  φ(t) = ln(1 − pe^{it}) / ln(1 − p)

Hypergeometric distribution
 Probability distribution: P(ξ = k) = C(M,k) C(N − M, n − k) / C(N,n)   (max(0, n − N + M) ≤ k ≤ min(n, M))
 Parameter conditions: N, M, n positive integers, M ≤ N, n ≤ N
 Mean: nM/N   Variance: nM(N − M)(N − n) / [N²(N − 1)]
 The generating functions are expressed by means of the hypergeometric function.
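The means and variances listed above can be cross-checked numerically, for example with scipy (assuming scipy is available; the parameter values below are arbitrary):

```python
from scipy import stats

n, p, lam = 10, 0.3, 4.0

print(stats.binom(n, p).stats())       # binomial: mean np = 3, variance npq = 2.1
print(stats.poisson(lam).stats())      # Poisson: mean = variance = lambda = 4
print(stats.geom(p).stats())           # geometric on {1,2,...}: mean 1/p, variance q/p^2
print(stats.nbinom(5, p).stats())      # negative binomial (r = 5): mean rq/p, variance rq/p^2
```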
2. Commonly used continuous distributions
For each of the commonly used continuous distributions listed below, the distribution density and its domain, the parameter conditions, the mean, the variance, the moment generating function M(t) and the characteristic function φ(t) are given.

Uniform distribution
 Density: p(x) = 1/(b − a)   (a ≤ x ≤ b); p(x) = 0 otherwise
 Parameter conditions: a < b
 Mean: (a + b)/2   Variance: (b − a)²/12
 M(t) = (e^{bt} − e^{at}) / [(b − a)t],  φ(t) = (e^{ibt} − e^{iat}) / [i(b − a)t]

Standard normal distribution N(0, 1)
 Density: p(x) = (1/√(2π)) e^{−x²/2}   (−∞ < x < +∞)
 Mean: 0   Variance: 1
 M(t) = e^{t²/2},  φ(t) = e^{−t²/2}

Normal distribution N(a, σ²)
 Density: p(x) = (1/(√(2π) σ)) e^{−(x−a)²/(2σ²)}   (−∞ < x < +∞)
 Parameter conditions: σ > 0
 Mean: a   Variance: σ²
 M(t) = e^{at + σ²t²/2},  φ(t) = e^{iat − σ²t²/2}

Rayleigh distribution
 Density: p(x) = (x/σ²) e^{−x²/(2σ²)}   (x ≥ 0)
 Parameter conditions: σ > 0
 Mean: σ√(π/2)   Variance: (2 − π/2)σ²

Exponential distribution
 Density: p(x) = λ e^{−λx}   (x ≥ 0)
 Parameter conditions: λ > 0
 Mean: 1/λ   Variance: 1/λ²
 M(t) = λ/(λ − t)   (t < λ),  φ(t) = λ/(λ − it)

Beta distribution B(p, q)
 Density: p(x) = x^{p−1}(1 − x)^{q−1} / B(p, q)   (0 < x < 1), where B(p, q) = Γ(p)Γ(q)/Γ(p + q)
 Parameter conditions: p > 0, q > 0
 Mean: p/(p + q)   Variance: pq / [(p + q)²(p + q + 1)]
 M(t) and φ(t) are expressed by the confluent hypergeometric (Kummer) function: M(t) = ₁F₁(p; p + q; t), φ(t) = ₁F₁(p; p + q; it)

Gamma distribution Γ(α, β)
 Density: p(x) = β^α x^{α−1} e^{−βx} / Γ(α)   (x > 0)
 Parameter conditions: α > 0, β > 0
 Mean: α/β   Variance: α/β²
 M(t) = (1 − t/β)^{−α}   (t < β),  φ(t) = (1 − it/β)^{−α}

Lognormal distribution
 Density: p(x) = (1/(√(2π) σ x)) e^{−(ln x − a)²/(2σ²)}   (x > 0)
 Parameter conditions: σ > 0
 Mean: e^{a + σ²/2}   Variance: e^{2a + σ²}(e^{σ²} − 1)

χ² distribution (degrees of freedom n) χ²(n)
 Density: p(x) = x^{n/2 − 1} e^{−x/2} / [2^{n/2} Γ(n/2)]   (x > 0)
 Parameter conditions: n a positive integer
 Mean: n   Variance: 2n
 M(t) = (1 − 2t)^{−n/2}   (t < 1/2),  φ(t) = (1 − 2it)^{−n/2}

t distribution (degrees of freedom n) t(n)
 Density: p(x) = Γ((n+1)/2) / [√(nπ) Γ(n/2)] · (1 + x²/n)^{−(n+1)/2}   (−∞ < x < +∞)
 Parameter conditions: n a positive integer
 Mean: 0 (n > 1)   Variance: n/(n − 2)   (n > 2)
 The moment generating function does not exist; the characteristic function is expressed by means of a Bessel function.

F distribution (degrees of freedom (m, n)) F(m, n)
 Density: p(x) = [Γ((m+n)/2) / (Γ(m/2) Γ(n/2))] (m/n)^{m/2} x^{m/2 − 1} (1 + mx/n)^{−(m+n)/2}   (x > 0)
 Parameter conditions: m, n positive integers
 Mean: n/(n − 2)   (n > 2)   Variance: 2n²(m + n − 2) / [m(n − 2)²(n − 4)]   (n > 4)
 The characteristic function is expressed by the confluent hypergeometric (Kummer) function.

Weibull distribution
 Density: p(x) = (k/λ) ((x − γ)/λ)^{k−1} e^{−((x − γ)/λ)^k}   (x ≥ γ)
 Parameter conditions: shape parameter k > 0, scale parameter λ > 0, location parameter γ
 Mean: γ + λ Γ(1 + 1/k)   Variance: λ² [Γ(1 + 2/k) − Γ²(1 + 1/k)]

Cauchy distribution
 Density: p(x) = (1/π) · λ / [λ² + (x − μ)²]   (−∞ < x < +∞)
 Parameter conditions: λ > 0
 Mean: does not exist   Variance: does not exist
 The moment generating function does not exist;  φ(t) = e^{iμt − λ|t|}
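A similar numerical cross-check for a few rows of the continuous table (again assuming scipy is available; the parameters are arbitrary):

```python
from scipy import stats

print(stats.norm(loc=2.0, scale=3.0).stats())     # normal N(2, 9): mean 2, variance 9
print(stats.expon(scale=1 / 0.5).stats())         # exponential, lambda = 0.5: mean 2, variance 4
print(stats.gamma(a=3.0, scale=1 / 2.0).stats())  # gamma(alpha=3, beta=2): mean 1.5, variance 0.75
print(stats.chi2(df=6).stats())                   # chi-square, n = 6: mean 6, variance 12
```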
6. The Law of Large Numbers and the Central Limit Theorem
[ Law of Large Numbers ]
1°  Bernoulli's theorem  The frequency ν/n of a random event A in n independent trials converges in probability to the probability p of the event A; that is, for any ε > 0,
    lim_{n→∞} P(|ν/n − p| < ε) = 1
2°  Let ξ1, ξ2, ··· be independent random variables. If (i) they have means and variances, Eξk = ak, Dξk ≤ C (k = 1, 2, ···; C a constant), or (ii) they have the same distribution with finite mean Eξk = a, then the arithmetic mean (1/n) Σ_{k=1}^{n} ξk converges in probability to the mean of the random variables; that is, for any ε > 0,
    lim_{n→∞} P( |(1/n) Σ_{k=1}^{n} ξk − (1/n) Σ_{k=1}^{n} Eξk| < ε ) = 1
(in case (ii) the limit is simply a).
3°  If ξ1, ξ2, ··· are mutually independent random variables with the same distribution, whose mean a and variance σ² exist, write
    ξ̄ = (1/n) Σ_{k=1}^{n} ξk,   S² = (1/n) Σ_{k=1}^{n} (ξk − ξ̄)²
then S² converges in probability to the variance σ² of the random variables; that is, for any ε > 0,
    lim_{n→∞} P(|S² − σ²| < ε) = 1
[ Central Limit Theorem ]
1°  If ξ1, ξ2, ··· are mutually independent random variables with the same distribution, whose mean a and variance σ² (0 < σ² < ∞) exist, write
    ζn = (Σ_{k=1}^{n} ξk − na) / (σ√n)
then the random variable ζn asymptotically follows the standard normal distribution N(0, 1); that is,
    lim_{n→∞} P(ζn ≤ x) = (1/√(2π)) ∫_{−∞}^{x} e^{−t²/2} dt = Φ(x)
2°  Under the conditions of 1°, for large n we have approximately
    P(α < ξ1 + ξ2 + ··· + ξn ≤ β) ≈ Φ((β − na)/(σ√n)) − Φ((α − na)/(σ√n))
or
    P(|ξ̄ − a| < ε) ≈ 2Φ(ε√n/σ) − 1,   where ξ̄ = (1/n) Σ_{k=1}^{n} ξk
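The central limit theorem can be observed in a simulation: the normalized sums ζn of uniform random variables (mean a = 1/2, variance σ² = 1/12) are compared with Φ. A minimal sketch (the uniform example and the sample sizes are illustrative choices):

```python
import random
from math import sqrt, erf

random.seed(1)
a, sigma = 0.5, sqrt(1 / 12)       # mean and standard deviation of a uniform [0,1] variable
n, trials = 30, 20_000

def zeta():
    s = sum(random.random() for _ in range(n))
    return (s - n * a) / (sigma * sqrt(n))

zs = [zeta() for _ in range(trials)]
Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))      # standard normal distribution function

for x in (-1.0, 0.0, 1.0):
    empirical = sum(1 for z in zs if z <= x) / trials
    print(x, empirical, Phi(x))                   # empirical values close to Phi(x)
```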
7. The Use of the Normal Distribution Table
In practice many random phenomena follow a normal distribution, or asymptotically follow a normal distribution after an appropriate transformation. This handbook is accompanied by a table of values of the normal probability integral
    Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^{−t²/2} dt
and by a table of the correspondence between K and the value α of the integral
    α = (2/√(2π)) ∫_{0}^{K} e^{−t²/2} dt = 2Φ(K) − 1
Using these tables, the following problems can be solved:
1°  The probability that a random variable ξ following the standard normal distribution N(0, 1) falls within the interval (a, b) is
    P(a < ξ < b) = Φ(b) − Φ(a)
The one-sided probability is
    P(ξ < b) = Φ(b)
(or the probability P(|ξ| < K) = α can be read directly from the K–α correspondence table).
2°  Given α, determine K such that
    (2/√(2π)) ∫_{0}^{K} e^{−t²/2} dt = α
By symmetry Φ(K) = (1 + α)/2; find K from the K–α correspondence table (or from the table of Φ), so that P(|ξ| < K) = α.
3°  The probability that a random variable ξ following the normal distribution N(a, σ²) falls within the interval (c, d) is
    P(c < ξ < d) = Φ((d − a)/σ) − Φ((c − a)/σ)
The one-sided probability is
    P(ξ < d) = Φ((d − a)/σ)
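When no printed table is at hand, Φ can also be evaluated with the error function; a minimal sketch covering the three problems above (the interval bounds and parameters are arbitrary examples):

```python
from math import erf, sqrt

def Phi(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1 + erf(x / sqrt(2)))

# 1. standard normal on an interval (a, b)
a, b = -1.0, 2.0
print(Phi(b) - Phi(a))                 # P(a < xi < b) for xi ~ N(0, 1)

# 2. find K with P(|xi| < K) = alpha, using Phi(K) = (1 + alpha)/2 (bisection)
alpha, lo, hi = 0.95, 0.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if Phi(mid) < (1 + alpha) / 2 else (lo, mid)
print(lo)                              # K close to 1.96

# 3. general normal N(m, s^2) on an interval (c, d)
m, s, c, d = 5.0, 2.0, 4.0, 9.0
print(Phi((d - m) / s) - Phi((c - m) / s))
```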
* (In applications, when there is no reason to consider one event more likely to occur than another, the two events are regarded as equally likely.)