ABSTRACT
The Central Point Theory And Its Effect In Predictive Theory
William Ayine Adongo
University for Development Studies
Faculty of Mathematical Sciences
Department of Statistics (Option: Actuarial Science)
ID:FAS/0912/06
Email: Ayinewilliam@yahoo.com.
This work investigates the central point theory, which I am developing, and its effects in predictive theory. Using the central point theory, a new predictive theory is devised as a substitute for the existing general regression theory, since experts in all practical fields are interested in finding averages in order to make their work accurate, concise and predictable. This predictive theory is named the Central Prediction Theory.
The work focuses on using the central prediction theory to predict several mean response variables from only one mean explanatory variable, leaving no room for random error (except in complicated cases). The model includes the mean component only and can be used to hypothesize an exact relationship between phenomena that cannot be modeled or explained using the existing "general regression model".
Linear equations, the central point theory and regression theory in two or more variables are reviewed. The work explains the validity of the central prediction theory and its importance compared to the existing general regression theory. The conclusion is that no test statistic is needed to evaluate the validity of the central prediction model (except in complicated cases), since it does not account for random error.
The study found that the central prediction theory can be used to construct models in econometrics, logistic prediction, survival prediction and time series prediction that have one or more response variables and no variation due strictly to random phenomena, and that cannot be modeled or explained using the existing general regression model.
Recommendations are made to further the development of the central prediction theory for academic research, learning reference and teaching.
CHAPTER ONE
1.0 Introduction
This project investigates the effect of the central point theory of linear equations in two or more variables in predictive modeling. Linear equations in two or more variables and the Central Point Theory are both reviewed.
The work focuses on using the basic concept of the Central Point Theory of linear equations in two or more variables to create a new concept in predictive theory. The study asks whether the Central Point Theory has a positive effect in predictive modeling and how useful it is compared to the existing methods of regression analysis.
1.1 Brief History Of Me
The attainment of modern science would be impossible without algebraic equations, theories and rules. Two statements, and only two, can be made about an equation: either it can be solved, or we can prove it unsolvable.
In my early manhood I overturned the theories and methods of the linear algebra of two or more variables from the 1840s to 2006 and, following my own revolutionary theories of linear algebra, opened the way to a mid-21st-century mathematical analysis. Although I contributed significantly to pure mathematics, I also made practical applications of importance for 21st-century physics, economics, statistics and actuarial science, mathematical biology, etc.
I am the original developer of Multi-combinational Mathematics, the Central Point Theory and Least Whole Normal Mathematics. My achievement in pure mathematics, physics and statistics in 2007-2008 was to discover a unique point of a line (or linear equation) in a plane, the central point (or point of gravity). This unique point made it possible for me to discover the central (or gravitational) point values interval theorem. This theorem helps us to understand the coordinate system as far as analytical geometry and astronomy are concerned. The central (gravitational) point theory also has a great effect in predictive modeling, and this is what we consider in this project.
1.2 The Fundamentals of the Central (Gravitational) Point Theory (Reference: adongoayinewilliam13.blogspot.com)
1.2.1 At the central (gravitational) point, the axial term of x is equal to the axial term of y in the line equation Ax + By = C, that is, Ax = By.
1.2.2 In a plane, the values of the axial coordinates x and y at the central point are calculated as:
xg = C/(2A)
yg = C/(2B)
1.2.3 At the central (gravitational) point of a line equation Ax + By = C in a plane, the slope of the line is given as:
Sg = -(yg/xg)
1.2.4 At the central (gravitational) point of a line equation Ax + By = C in a plane, the intercepts of the line are given as:
x-intercept: (2xg, 0)
y-intercept: (0, 2yg)
1.2.5 Adongo's (My) Central (Gravitational) Point Values Interval Theorem
The theorem states that at the central point (xg, yg) in the ordered pairs (x, y), xg is adding itself to infinity when yg is subtracting itself to infinity, and vice versa.
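The formulas in 1.2.1 to 1.2.4 can be checked numerically. The following is a minimal sketch, not part of the original project: the function name central_point and the example line 2x + 4y = 8 are chosen here purely for illustration, and exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction


def central_point(A, B, C):
    """Return the central point (xg, yg) of the line Ax + By = C.

    Uses the formulas from section 1.2: xg = C/(2A), yg = C/(2B).
    Both A and B must be non-zero.
    """
    if A == 0 or B == 0:
        raise ValueError("A and B must both be non-zero")
    A, B, C = Fraction(A), Fraction(B), Fraction(C)
    return C / (2 * A), C / (2 * B)


# Illustrative line (not from the text): 2x + 4y = 8
xg, yg = central_point(2, 4, 8)

# At the central point the two axial terms are equal: A*xg == B*yg == C/2
assert 2 * xg == 4 * yg == 4

# Slope from 1.2.3, compared with the usual slope -A/B of the line
slope = -(yg / xg)
assert slope == Fraction(-2, 4)

# Intercepts from 1.2.4: (2xg, 0) and (0, 2yg) both satisfy Ax + By = C
assert 2 * (2 * xg) == 8 and 4 * (2 * yg) == 8
```

For this example line the central point is (2, 1), which is the midpoint of the segment between the two intercepts (4, 0) and (0, 2).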
1.3 Research Objective:
In line with the selected topic of this project, the research examines the following questions about the central point theory:
(1) What is the meaning of the central point theory?
(2) What is the effect of the central point theory in predictive models?
(3) Can a new form of predictive model exist with the help of the central point theory?
(4) Does the availability of the central point theory affect predictive models positively?
(5) Can we name this form of predictive model the Central Prediction Model?
1.4 Problem Statement
Actuaries, statisticians, economists, engineers, sociologists and scientists traditionally gather data to construct predictive models for practical use. They find it difficult to obtain a concise predictive model that uses the mean component of the dependent and independent variables to hypothesize an exact relationship between random phenomena that cannot be modeled or explained using the existing predictive models known as "regression models".
They also find it difficult to relate several dependent variables to only one independent variable for predictive purposes.
The availability of the central prediction theory will help actuaries, statisticians, economists, engineers, sociologists and scientists to solve useful practical problems in real-life situations.
1.5 Research Methodology
(1) Application of the central point theory in predictive modeling.
(2) Using the central prediction model to analyze data.
(3) Significance of the central prediction model.
(4) Comparing the central prediction model to the general regression model.
(5) Recommendation and conclusion of the project.
CHAPTER TWO
2.0 Literature Review
In 1637 the French mathematician and philosopher René Descartes published the work La Géométrie, in which he laid the foundation for one of the most important inventions of mathematics, namely the Cartesian coordinate system of analytical geometry. Descartes based his system on a relationship between points in a plane and ordered pairs of real numbers.
In 1843, William Rowan Hamilton introduced quaternions, which describe mechanics in three-dimensional space and mark one of the most important steps in the history of linear equations in two or more variables.
The subject of systems of linear equations received a great deal of attention in nineteenth-century mathematics. However, problems of this type are very old in the history of mathematics. The solution of simultaneous systems of equations was well known in China and Japan.
The great Japanese mathematician Seki Shinsuke Kowa (1642-1708) wrote a book on the subject in 1683 that was well in advance of European work on the subject.
Gauss, one of the greatest mathematicians, developed Gaussian elimination around 1800 and used it to solve least squares problems in celestial computation and later in computations to measure the earth and its surface. Even though Gauss's name is associated with this method of successively eliminating variables from systems of linear equations, Chinese manuscripts from several centuries earlier have been found that explain how to solve a system of three equations in three unknowns.
In the 1940s, the solution of large systems of equations became a part of the new branch of mathematics called operational research. Operational research is concerned with deciding how best to design and operate man-machine systems, usually under conditions requiring the allocation of scarce resources.
The availability of linear equations in two or more variables opened the way for linear regression and correlation analysis in applied statistics. The earliest form of regression was the method of least squares, which was published by Legendre in 1805 and by Gauss in 1809. Legendre and Gauss both applied the method to the problem of determining, from astronomical observations, the orbits of bodies about the sun. Gauss published a further development of the theory of least squares in 1821, including a version of the Gauss-Markov theorem.
The term 'regression' was coined by Francis Galton in the nineteenth century to describe a biological phenomenon: the heights of descendants of tall ancestors tend to regress down towards a normal average. The phenomenon is also known as regression towards the mean. For Galton, regression had only this biological meaning, but his work was later extended by Udny Yule and Karl Pearson to a more general statistical context. In the work of Yule and Pearson, the joint distribution of the response and explanatory variables is assumed to be Gaussian. This assumption was weakened by R.A. Fisher in his works of 1922 and 1925. Fisher assumed that the conditional distribution of the response variable is Gaussian, but the joint distribution need not be. In this respect, Fisher's assumption is closer to Gauss's formulation of 1821.
Regression methods continue to be an area of active research. In recent decades, new methods have been developed for robust regression; regression involving correlated responses such as time series and growth curves; regression in which the predictor or response variables are curves, images, graphs or other complex data objects; Bayesian methods for regression; regression in which the predictor variables are measured with error; regression with more predictor variables than observations; and causal inference with regression.
Even though these great mathematicians and statisticians have contributed a lot to the development of linear equations in modern times, they failed to arrive at the true concept of the linear equation in two or more variables.
From the time of René Descartes to the present, all mathematicians have held the idea that we cannot locate a unique point on a line Ax + By = C as its centre, since the equation Ax + By = C has infinitely many solutions or points and no end, unless a range of values is given from which we can determine the ends of the line. They also hold the idea that we cannot find the possible values of the y-variable unless assumed values of the x-variable are given. It was not until 2007 that I discovered that we can determine the centre of a line, even though a line has infinitely many points. I also discovered the Central Point Values Interval Theorem, which disproves the idea that we can only find the possible values of x if the values of y are given.
Through the knowledge of the central point theorem, I was able to invent a method for modeling the average relationship between the response variables x2, x3, ..., xj, ..., xr and the single predictor x1. I have named this invention the Central Prediction Theory.
(Reference: adongoayinewilliam13.blogspot.com)
2.1 Fundamental Proof of My Central Point Theory (CPT)
Below are the presentations of the proofs of my Central Point Theory (CPT)
2.1.1 My First Proof:
At the central point (xg, yg) in the ordered pairs (x, y) of the straight line equation Ax + By = C, the formulas for xg and yg are given as:
xg = C/(2A)
yg = C/(2B)
The central point (xg, yg) can be found by equally pairing x to y, i.e.
Ax = C - By --- (1)
By = C - Ax --- (2)
Equating Ax in equation (1) to C - Ax in equation (2), we have Ax = C - Ax, so 2Ax = C and
xg = C/(2A)
Also, equating By in equation (2) to C - By in equation (1), we have By = C - By, so 2By = C and
yg = C/(2B)
2.1.2 My Second Proof:
At the central point, the slope of the straight line equation Ax + By = C is given as m = -(yg/xg), where (xg, yg) denotes the central point pair and m denotes the slope.
The slope of the straight line can be determined by moving the term Ax to the right-hand side and dividing both sides by the constant B:
y = -(A/B)x + C/B
Let m = -A/B (slope) and b = C/B (intercept); hence we have
y = mx + b --- (1)
Equally pairing mx to y, we have
y = mx + b --- (2)
-mx = b - y --- (3)
Equating y in equation (2) to b - y in equation (3), we have y = b - y, so
y = b/2 --- (4)
Also, equating -mx in equation (3) to mx + b in equation (2), we have -mx = mx + b, so
m = -b/(2x) --- (5)
Putting equation (4) into equation (5), that is b = 2y, we have
m = -y/x, or m = -yg/xg
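As a quick consistency check of this result, worked out here rather than taken from the original text, substituting the central point formulas xg = C/(2A) and yg = C/(2B) from the first proof recovers the usual slope of the line Ax + By = C:

```latex
m = -\frac{y_g}{x_g}
  = -\frac{C/(2B)}{C/(2A)}
  = -\frac{A}{B},
\qquad A \neq 0,\; B \neq 0,\; C \neq 0 .
```

The condition C ≠ 0 is needed because otherwise the central point is the origin and the ratio yg/xg is undefined.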
2.1.3 My Third Proof:
The third proof states that, at the central point (xg, yg) in the ordered pairs (x, y) of the straight line equation Ax + By = C, xg is adding itself to infinity when yg is subtracting itself to infinity, and vice versa.
(xg, yg) = [(xg, xg+xg, xg+xg+xg, ...); (yg, yg-yg, yg-yg-yg, ...)]
Or [(xg, xg+xg, xg+xg+xg, ...); (yg, yg-yg, yg-yg-yg, ...)]
(Reference: adongoayinewilliam13.blogspot.com)
3.0 CHAPTER THREE
This chapter carries out the methodology set out in chapter one of this project. Below, the central point theory is applied in predictive modeling to hypothesize an exact relationship between variables, whether or not they have unexplained variation. This predictive model is called the Central Prediction Model.
3.1 Application of Central Point Theory
Based on the central point theory, I have developed a theory called the Central Prediction Theory, whose purpose is to construct models that hypothesize an exact relationship between variables.
If we believe there will be unexplained variation in the response variable, perhaps caused by important but unincluded variables or by random phenomena, we still use the central prediction model, since this model does not account for random error. The model includes the mean component only and can be used to hypothesize an exact relationship between random phenomena that cannot be modeled or explained using the existing "General Regression Model".
3.1.1 Central Point-Line (Two Variable) Probability Model
Given x2m = β0 + β1x1m
where x2m = mean dependent or mean response variable
x1m = mean independent or mean predictor variable
β0 (beta zero) = constant = ∑x2/(2n)
β1 (beta one) = predictive coefficient = ∑x2/(2∑x1)
Here the sums run over the n observed pairs (x1, x2). The mean component point (x1m, x2m), or the central point (-∑x1/(2n), ∑x2/(4n)) of the fitted line, determines the line.
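One consequence of these two formulas, worked out here as a check rather than taken from the original text, is that the fitted line always passes through the point of sample means. The notation \bar{x}_1 = \sum x_1 / n and \bar{x}_2 = \sum x_2 / n is introduced only for this sketch, and \sum x_1 is assumed to be non-zero:

```latex
\beta_0 + \beta_1 \bar{x}_1
  = \frac{\sum x_2}{2n} + \frac{\sum x_2}{2\sum x_1}\cdot\frac{\sum x_1}{n}
  = \frac{\sum x_2}{2n} + \frac{\sum x_2}{2n}
  = \bar{x}_2 .
```

In other words, when x1m equals the sample mean of x1, half of the predicted mean response comes from the constant β0 and half from the term β1x1m.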
3.1.2 Data Analysis One (Two variables)
In a real-life situation, suppose a fire insurance company wants to relate the amount of fire damage in major residential fires to the distance between the fire and the nearest fire station. The study is conducted in a large suburb of a major city; a sample of 14 recent fires in this suburb is selected. The amount of damage x2 and the distance between the fire and the nearest fire station x1 are recorded for each fire.
Table 1.0: Fire Damage Data
Distance from fire station x1 (miles) | Fire damage x2 (thousands of cedis)
3.4 | 26.2
1.8 | 17.8
4.6 | 31.3
2.3 | 23.1
3.1 | 27.5
5.5 | 36.0
0.7 | 14.1
3.0 | 22.3
2.6 | 19.6
4.3 | 31.3
2.1 | 24.0
1.1 | 17.3
6.1 | 43.2
4.8 | 36.4
Step 1
We first hypothesize a central prediction model to relate mean fire damage, x2m, to mean distance from the nearest fire station, x1m:
x2m = β0 + β1x1m
Step 2
We estimate the predictive coefficient β1 and the constant β0. Hence, we have:
β1 = ∑x2/(2∑x1) = 370.1/(2 × 45.4) = 4.08
β0 = ∑x2/(2n) = 370.1/(2 × 14) = 13.22
and the central prediction equation is
x2m = 13.22 + 4.08x1m
Step 3
We use the fitted central prediction equation to predict the exact average amount of fire damage if the exact average distance from the fire station is 3.2 miles:
x2m = 13.22 + 4.08(3.2) = 26.28
Hence, the exact average amount of fire damage is 26.28 thousand cedis if the exact average distance from the fire station is 3.2 miles.
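The three steps above can be reproduced with a few lines of Python. This is a minimal sketch under the formulas of section 3.1.1; the function names fit_central_line and predict are hypothetical names introduced here, not part of the original project.

```python
# Table 1.0: distance from fire station (miles) and fire damage (thousand cedis)
distance = [3.4, 1.8, 4.6, 2.3, 3.1, 5.5, 0.7, 3.0, 2.6, 4.3, 2.1, 1.1, 6.1, 4.8]
damage = [26.2, 17.8, 31.3, 23.1, 27.5, 36.0, 14.1, 22.3, 19.6, 31.3, 24.0, 17.3, 43.2, 36.4]


def fit_central_line(x1, x2):
    """Return (beta0, beta1) with beta0 = sum(x2)/(2n), beta1 = sum(x2)/(2*sum(x1))."""
    if len(x1) != len(x2) or not x1:
        raise ValueError("x1 and x2 must be non-empty and of equal length")
    n = len(x1)
    total_x1, total_x2 = sum(x1), sum(x2)
    if total_x1 == 0:
        raise ValueError("sum of x1 must be non-zero")
    return total_x2 / (2 * n), total_x2 / (2 * total_x1)


def predict(beta0, beta1, x1m):
    """Evaluate the central prediction equation x2m = beta0 + beta1 * x1m."""
    return beta0 + beta1 * x1m


beta0, beta1 = fit_central_line(distance, damage)
print(f"beta0 = {beta0:.2f}, beta1 = {beta1:.2f}")
print(f"predicted x2m at x1m = 3.2: {predict(beta0, beta1, 3.2):.2f}")
```

Because the code keeps the unrounded coefficients, its prediction comes out near 26.26 rather than the 26.28 obtained by hand from the two-decimal coefficients.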
3.1.3 Data Analysis Two (Two Variables)
The fertility rate of a country is defined as the number of children a woman citizen bears, on average, in her lifetime. Scientific American (December 1993) reported that researchers found family planning can have a great effect on fertility rate. The table below gives the fertility rate x2 and the contraceptive prevalence x1 (measured as the percentage of married women who use contraceptives) for each of 27 developing countries.
Table 1.1
Country | Contraceptive Prevalence x1 (%) | Fertility Rate x2
Mauritius | 76 | 2.2
Thailand | 69 | 2.3
Colombia | 66 | 2.9
Costa Rica | 71 | 3.5
Sri Lanka | 63 | 2.7
Turkey | 62 | 3.4
Peru | 60 | 3.5
Mexico | 55 | 4.0
Jamaica | 55 | 2.9
Indonesia | 50 | 3.1
Tunisia | 51 | 4.3
El Salvador | 48 | 4.5
Morocco | 42 | 4.0
Zimbabwe | 46 | 5.4
Egypt | 40 | 4.5
Bangladesh | 40 | 5.5
Botswana | 35 | 4.8
Jordan | 35 | 5.5
Kenya | 28 | 6.5
Guatemala | 24 | 5.5
Cameroon | 16 | 5.8
Ghana | 14 | 6.0
Pakistan | 13 | 5.0
Senegal | 13 | 6.5
Sudan | 10 | 4.8
Yemen | 9 | 7.0
Nigeria | 7 | 5.7
Total | 1098 | 121.8
Step 1
We first hypothesize a central prediction model relating average fertility rate, x2m, to average contraceptive prevalence, x1m:
x2m = β0 + β1x1m
Step 2
We estimate the predictive coefficient β1 and the constant β0 from the column totals of Table 1.1 (∑x1 = 1098, ∑x2 = 121.8, n = 27). Hence, we have
β1 = ∑x2/(2∑x1) = 121.8/(2 × 1098) = 0.0555
β0 = ∑x2/(2n) = 121.8/(2 × 27) = 2.256
and the central prediction equation is
x2m = 2.256 + 0.0555x1m
Step 3
We use the fitted central prediction equation to estimate the exact average fertility rate of the developing countries in December 1994 if the exact average contraceptive prevalence for the developing countries in December 1994 is 41%:
x2m = 2.256 + 0.0555(41) = 4.53
Hence, the exact average fertility rate for the developing countries in December 1994 is 4.53 children per woman if the exact average contraceptive prevalence for the developing countries in December 1994 is 41%.
3.2 The General Central Prediction Model
x2m = β0 + β1x1m
x3m = β0 + β1x1m + β2x2m
x4m = β0 + β1x1m + β2x2m + β3x3m
...
xjm = β0 + β1x1m + β2x2m + β3x3m + ... + β(j-1)x(j-1)m
...
xrm = β0 + β1x1m + β2x2m + β3x3m + ... + β(j-1)x(j-1)m + ... + β(r-1)x(r-1)m
where x2m, x3m, x4m, ..., xjm, ..., xrm are dependent variables and x1m is an independent variable.
In the mean portion of the model, the coefficient β(j-1) determines the contribution of the variable x(j-1)m.
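The chain of equations above can be evaluated in sequence once the coefficients are available, because each mean response reuses the mean responses computed before it. The following is a minimal sketch: the function name predict_chain and the coefficient values are hypothetical, and since the text gives estimation formulas only for β0 and β1, the remaining coefficients are treated here as given inputs.

```python
def predict_chain(x1m, betas):
    """Evaluate x2m, x3m, ..., xrm in sequence from a single mean predictor x1m.

    betas = [beta0, beta1, ..., beta(r-1)], assumed to be already estimated.
    Each xjm = beta0 + beta1*x1m + ... + beta(j-1)*x(j-1)m, so every new mean
    response reuses the mean responses computed before it.
    """
    if len(betas) < 2:
        raise ValueError("need at least beta0 and beta1")
    means = [x1m]  # means[k - 1] holds xkm
    for j in range(2, len(betas) + 1):
        xjm = betas[0] + sum(betas[k] * means[k - 1] for k in range(1, j))
        means.append(xjm)
    return means[1:]  # [x2m, x3m, ..., xrm]


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow:
# x2m = 1.0 + 0.5 * 2.0, then x3m = 1.0 + 0.5 * 2.0 + 2.0 * x2m
x2m, x3m = predict_chain(2.0, [1.0, 0.5, 2.0])
```

Passing only [β0, β1] reduces the function to the two-variable model of section 3.1.1.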
3.2.1 Data Analysis Three (Five Variables)
The central prediction analysis can be employed to investigate the determinants of the survival size of nonprofit hospitals. For a given sample of hospitals, survival size, x5, is defined as the largest size of hospital (in terms of number of beds) exhibiting growth in market share over a specific time interval. Suppose 10 states are randomly selected and the survival size for all nonprofit hospitals in each state is determined for two time periods five years apart, yielding two observations per state. The 20 survival sizes are listed in the table below, along with the following data for each state, for the second year in each time interval:
x1 = percentage of beds that are in for-profit hospitals
x2 = ratio of the number of persons enrolled in health maintenance organizations (HMOs) to the number of persons covered by hospital insurance
x3 = state population (in thousands)
x4 = percentage of the state that is urban
x5 = survival size
Step 1
We first hypothesize a central prediction model relating the averages x2m, x3m, x4m and x5m to the average x1m. Hence, the model is
x2m = β0 + β1x1m
x3m = β0 + β1x1m + β2x2m
x4m = β0 + β1x1m + β2x2m + β3x3m
x5m = β0 + β1x1m + β2x2m + β3x3m + β4x4m
Step 2
We estimate the predictive coefficients β1, β2, β3, β4 and the constant β0.
Step 3
We use the fitted central prediction equations to estimate the exact averages of x2m, x3m, x4m and x5m if the exact average of x1m is 0.12.
3.3 Application of the Central Prediction Theory
Introduction
The study found that the central prediction theory can be applied in logistic prediction, survival prediction, time series prediction, etc. The topics mentioned above are explained briefly below.
3.3.1 Central Binary Prediction
Imagine that Standard Chartered Bank wants to determine which customers are most likely to repay their loans. They record a number of independent variables that describe a customer's reliability and then determine whether these variables are related to the binary variable x2m, where x2m = 1 if the customer repays the loan and x2m = 0 if the customer fails to repay it. When the mean response variable x2m is binary, the distribution of x2m reduces to a single value, the probability p = Pr(x2m = 1).
In the central logistic model, the natural logarithm of the odds is related to the mean explanatory variable. Here we consider the situation where we have a single mean independent variable and a single binary response, but the model can be generalized by relating the binary mean response variables x2m, x3m, ..., xjm, ..., xrm to the single mean predictor variable x1m.
Let P(x1m) be the probability that x2m equals 1 when the mean independent variable equals x1m. By modeling the log-odds as β0 + β1x1m, a simple central logistic model is given as:
P(x1m) = e^(β0 + β1x1m) / [1 + e^(β0 + β1x1m)]
Suppose a doctor recorded the level of an enzyme, creatine kinase (CK), for patients whom he suspected of having had a heart attack. The objective of the study was to assess whether measuring the amount of CK on admission to the hospital was a useful diagnostic indicator of whether patients admitted with a diagnosis of a heart attack had really had one. The enzyme CK was measured in 360 patients on admission to the hospital. After a period of time, a doctor reviewed the records of these patients to decide which of the 360 patients had actually had a heart attack. The data are given in the table below, with CK values given as the midpoint of the range of values in each of 13 classes.
CK Value | Number of Patients With Heart Attack | Number of Patients Without Heart Attack
20 | 2 | 88
60 | 13 | 26
100 | 30 | 8
140 | 30 | 5
180 | 21 | 0
220 | 19 | 1
260 | 18 | 1
300 | 13 | 1
340 | 19 | 0
380 | 15 | 0
420 | 7 | 0
460 | 8 | 0
500 | 35 | 0
We use the formula to calculate the exact probability that a patient had a heart attack when the CK level in the patient was 160. With 230 patients with a heart attack, 13 classes and CK values summing to 3380, we have:
β0 = 230/(2 × 13) = 8.85
β1 = 230/(2 × 3380) = 0.034
P(160) = e^(8.85 + 0.034 × 160) / [1 + e^(8.85 + 0.034 × 160)] ≈ 1
NOTE: This central logistic model does not account for random error, since it includes the mean component only.
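The calculation for the CK data can be sketched in Python as follows. This is an illustrative sketch under the formulas stated in the text; the helper name logistic is introduced here and evaluates the logistic function in a numerically stable way.

```python
import math

# CK class midpoints and heart-attack counts from the table above
ck_values = [20, 60, 100, 140, 180, 220, 260, 300, 340, 380, 420, 460, 500]
with_attack = [2, 13, 30, 30, 21, 19, 18, 13, 19, 15, 7, 8, 35]


def logistic(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


n = len(ck_values)
beta0 = sum(with_attack) / (2 * n)               # 230 / (2 * 13)
beta1 = sum(with_attack) / (2 * sum(ck_values))  # 230 / (2 * 3380)

p_160 = logistic(beta0 + beta1 * 160)
print(f"beta0 = {beta0:.3f}, beta1 = {beta1:.4f}, P(160) = {p_160:.6f}")
```

Calling logistic(beta0 + beta1 * ck) for other CK values gives the corresponding model probabilities in the same way.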
Also, an actuary concerned with the contingencies of death, retirement, sickness, withdrawal, marriage, etc. may want to know the mean (or exact) probabilities or rates representing the occurrence of such events among individuals, in order to predict their exact future occurrence and so calculate exact premiums and exact annuities for insurance and other financial operations without accounting for random error.
For instance, the Central Binary Logistic Prediction Model can be fitted to mortality data over a certain range of ages to determine future exact estimates of the actual deaths d*xm and future exact crude rates q*xm, provided the exact exposed to risk Exm for each year of age is known.
Algebraically, the central binary logistic prediction model is given as:
q*xm = 1/[1 + e^-(α + βxm)]
α = ∑d/(2n)
β = ∑d/(2∑x)
d*xm = Exm q*xm
where d and x represent deaths and age respectively.
For example, mortality rates over ages 30-34 were estimated by fitting the central binary logistic prediction model to the data below.
Age (x) | Deaths (d)
30 | 335
31 | 391
32 | 428
33 | 436
34 | 458
If the mean exposed to risk is 140000, we estimate:
1) The parameters α and β.
2) The exact crude rate of an insured of exact age 42.
3) The exact estimate of actual deaths of an insured of exact age 42.
as follows:
1) α = 2048/(2 × 5) = 204.8
β = 2048/(2 × 160) = 6.4
2) q*42 = 1/[1 + e^-(204.8 + 6.4 × 42)] ≈ 1
3) d*42 = 140000 × q*42 ≈ 140000
Even though the binary logistic prediction model is applied here, the Central Poisson Prediction Model is appropriate.
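The mortality example can be reproduced with the following sketch, which applies the formulas for α, β, q*xm and d*xm exactly as stated above. The function name crude_rate and the variable names are introduced here for illustration only.

```python
import math

ages = [30, 31, 32, 33, 34]
deaths = [335, 391, 428, 436, 458]
exposed_to_risk = 140000  # mean exposed to risk given in the text

n = len(ages)
alpha = sum(deaths) / (2 * n)         # 2048 / (2 * 5)
beta = sum(deaths) / (2 * sum(ages))  # 2048 / (2 * 160)


def crude_rate(age):
    """q*_x = 1 / (1 + exp(-(alpha + beta * age)))."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * age)))


q_42 = crude_rate(42)
d_42 = exposed_to_risk * q_42  # d*_x = E_x * q*_x
print(alpha, beta, q_42, d_42)
```

With these parameters the quantity α + βx is so large for every age in the table that the logistic term saturates, and the fitted rate is numerically indistinguishable from 1.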
4.0 CHAPTER FOUR
4.1 Significance of the Central Prediction Model
Introduction
This chapter explains the significance of the central prediction model and its validity compared to the general regression model.
4.1.1 Central Prediction Model Assumptions
(1) For any given set of values x1, x2, x3, ..., xj, ..., xr, the mean error (ME) equals zero (except in complicated cases).
(2) The variance or variability of the mean error (ME) equals zero (except in complicated cases).
(3) It is a nonrandom model (except in complicated cases).
We already know that if there is unexplained variation in the response variable, perhaps caused by important but unincluded variables or by random phenomena, we ought to use a model that accounts for this random error. One such model is the "general regression model". But is it possible to use the central prediction model instead, given that it contains only the mean component and does not account for random error (except in complicated cases)? Yes. The central prediction model is a substitute for the regression model, with a special validity of its own. Under the central prediction model, the mean responses must fall exactly on the line, because the model leaves no room for random phenomena that cannot be modeled or explained using the regression model (except in complicated cases). In this case, we need no test statistic to evaluate the validity of the central prediction model, since it does not account for random error (except in complicated cases).
5.0 CHAPTER FIVE
5.1 Conclusion
In this project, the central point theory, which I am developing, is applied in prediction models. Through the knowledge of the central point theory, I devised a predictive model called the Central Prediction Model, which can be used in logistic models, survival models, econometrics, time series models, etc.
This work also explains the validity of the central prediction model and how important it is compared to the general regression model.
5.2 Recommendation
This work explained the importance of the Central Point Theory and the Central Prediction Theory, which I am developing. This work revolutionizes the existing theory of linear equations in two or more variables and the general regression theory.
Recommendations are made to further the development of the central prediction theory for academic research, learning and teaching.
REFERENCES
*Descartes, René. (1637). "La Géométrie" (The Geometry).
*Galton, Francis. (1886). "Regression Towards Mediocrity in Hereditary Stature". Journal of the Anthropological Institute, Volume 15.