Polynomial regression is a method of least-squares curve fitting. It takes a set of data points and produces the coefficients of a polynomial that approximates the underlying curve. The number of coefficients determines the degree of the polynomial and how closely the curve can be fit. A degree of zero (1 coefficient) is the arithmetic mean. First degree (2 coefficients) is also known as linear regression. Second and higher degrees produce non-linear polynomial regression.
Not all data sets can be accurately modeled with polynomial regression, regardless of the degree selected. It helps to understand the origin of the data and the function being modeled. Polynomials are one method of curve fitting. Some other methods are implemented with the Gauss-Newton class.
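Stated in symbols, the fit looks roughly as follows. The notation is introduced here for illustration only and is not taken from the library: n data points ( x_i, y_i ), a polynomial of degree m, and coefficients c_0 through c_m.

```latex
% Model and least-squares objective
f(x) = \sum_{k=0}^{m} c_k x^k, \qquad
\min_{c_0, \dots, c_m} \; \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2

% Setting the derivative with respect to each c_j to zero gives the normal equations
\sum_{k=0}^{m} c_k \sum_{i=1}^{n} x_i^{\,j+k} = \sum_{i=1}^{n} x_i^{\,j} \, y_i,
\qquad j = 0, \dots, m

% Degree 0 (m = 0) reduces to the mean
c_0 = \frac{1}{n} \sum_{i=1}^{n} y_i
```

The sums of high powers of x in the normal equations are what grow very large or very small as the degree or the number of points increases.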
There are some non-linear functions that can be linearized and then solved with polynomial regression. There has been a fair amount of interest in having these functions, so they have been added to a second package called Linearized Regression. The non-linear functions implemented are:
These classes are in beta. More testing is needed to determine if this is the best way to add this functionality. In addition, the beta versions do not use BC math. There is little risk of overflow in most applications since these equations are solved as 1st degree polynomials.
For more about how these functions can be linearized, consult this article.
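As a brief illustration of the idea, here are the exponential and power forms that appear in the examples further down (ExpRegression and PowRegression). The symbols a and b are chosen here for illustration. Taking the logarithm of both sides turns each function into a straight line in transformed variables, which assumes every y (and, for the power form, every x) is positive.

```latex
% Exponential: a straight line in ( x, ln y )
y = a e^{b x} \;\Longrightarrow\; \ln y = \ln a + b x

% Power: a straight line in ( ln x, ln y )
y = a x^{b} \;\Longrightarrow\; \ln y = \ln a + b \ln x
```

In both cases the intercept of the fitted line is ln a and the slope is b.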
Weighted regression is a method by which some input terms are considered more strongly than others. It can be applied to any regression method. Each input term is assigned a weight. If the weights are all equal then the results are the same as unweighted regression. For more information on the mathematics of weighted regression refer to this write-up.
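The following is a minimal stand-alone sketch of weighted regression for the straight-line case. It does not use this library. The function name fitWeightedLine and the convention that each weight multiplies the squared error of its point are assumptions made for illustration.

```php
<?php
// Weighted least-squares line fit. Each point is array( x, y, weight ).
// Returns array( intercept, slope ). Assumes the x values are not all equal.
function fitWeightedLine( array $points )
{
  $sumW = $sumWX = $sumWY = $sumWXY = $sumWXX = 0.0;
  foreach ( $points as $point )
  {
    list( $x, $y, $w ) = $point;
    $sumW   += $w;
    $sumWX  += $w * $x;
    $sumWY  += $w * $y;
    $sumWXY += $w * $x * $y;
    $sumWXX += $w * $x * $x;
  }

  $slope = ( $sumW * $sumWXY - $sumWX * $sumWY )
         / ( $sumW * $sumWXX - $sumWX * $sumWX );
  $intercept = ( $sumWY - $slope * $sumWX ) / $sumW;

  return array( $intercept, $slope );
}

// The last point is an outlier. First all weights are equal, then the
// outlier is given a much smaller weight than the rest.
$equal = fitWeightedLine( array(
  array( 0, 0, 1 ), array( 1, 1, 1 ), array( 2, 2, 1 ), array( 3, 9, 1 ) ) );
$weighted = fitWeightedLine( array(
  array( 0, 0, 1 ), array( 1, 1, 1 ), array( 2, 2, 1 ), array( 3, 9, 0.01 ) ) );

echo "Equal weights slope   : " . round( $equal[ 1 ], 3 ) . "\n";
echo "Outlier down-weighted : " . round( $weighted[ 1 ], 3 ) . "\n";
```

With equal weights the result is the same as an unweighted fit, whatever the common weight value is. Lowering the weight of the outlier pulls the slope back toward that of the remaining points.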
The key to weighting is to register a weighting technique. These are classes that implement the interface Weighting. Three such classes are provided:
In addition the weighting interface is very simple and new custom methods of weighting can easily be added.
Polynomial regression requires PHP 5.0 or above, since the code is entirely object-oriented. PHP must be compiled with the BC math library, which is standard with most builds of PHP. The use of BC arbitrary precision arithmetic is almost always necessary for regression of degrees higher than 4, or for data sets with thousands of points. The numbers simply get very large or very small and exceed what can be represented by conventional floating-point values.
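A small sketch of the underlying problem, using only the standard bcadd function. The values are chosen for illustration. A floating-point double carries roughly 15 to 16 significant decimal digits, so a small term added to a very large sum can vanish entirely, while BC math operates on decimal strings and keeps every digit.

```php
<?php
// Adding 1 to 1e16 is lost in floating point because doubles near 1e16
// are spaced 2 apart.
$big = 1.0e16;
$floatUnchanged = ( ( $big + 1.0 ) == $big );

// BC math takes and returns decimal strings; the third argument is the
// number of digits after the decimal point.
$exact = bcadd( '10000000000000000', '1', 0 );

echo "Float sum unchanged : " . ( $floatUnchanged ? "yes" : "no" ) . "\n";
echo "BC math sum         : " . $exact . "\n";
```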
To calculate polynomial regression online from your web browser, use the Online Polynomial Regression calculator page based on this library.
Documentation is available online, generated from the source code using phpDocumentor. The documentation can be recreated from the source if needed.
Version 1.2.1, Released February 17, 2015.
MD5: d3bc84c6445a8d77e78ed1a1c64392a4
SHA1: 4229b43d7f0d87f6d5e608570373e38cabd82d52
MD5: db5ecefe1e98d93f92515e409c53402f
SHA1: f8cdde1ec875dcad3880e0ba96c7c4321bafe57b
Version 1.2, Released February 13, 2015.
MD5: 99339673b4f7e65a3996afe6604d237e
SHA1: f0eec0f17a357595df67f7ba49c4f37bde9420d2
MD5: c8b491d23d37da3f864b823add3a0860
SHA1: a8c73c1e33245562cfbcc47a3ac6d21b81cd6543
Version 1.1, Released May 5, 2014.
MD5: 799a8c57e9730bbb9618999e2d6d0287
SHA1: 6e1defdc5754670156578578dea477198f7f735d
MD5: 9357794584c282279e9d6738b4a08831
SHA1: 4735f430731eeb94c0441cbfdc59f6e93e452c59
Version 1.0, Released December 29, 2013.
Added support for forced coefficient.
MD5: b9b9d9083d6cb0d1e9a2c6b8a3066ac8
SHA1: 3c7153945b5d90c6e176eebb5f820f7e31f292dd
MD5: fb5e505d7fbd0915239e54a84bac005a
SHA1: 4feef4f46a36bcf3a27ca254afd1eed4857098ac
Version 0.91, Released May 18, 2013.
Library renamed to better clarify its function.
MD5: cc8284107dae45b60cd5e07fe6152991
SHA1: a3027e7bb9783d06e6f506712aaf4158db8baf52
MD5: be854d254ffd844d20df4c6676cad80b
SHA1: 5b8e59d218968cdf8f593ef1dad4b3ca5541ebbf
Version 0.9, Released June 16, 2012.
Improved performance using Gaussian elimination rather than Cramer's rule to solve the system of equations.
MD5: ea3aacc18c1b3086df502823704639d2
SHA1: 69038bd1414777e0e936ea8701a85cecb363ebc1
MD5: 4193ddd4323277b67acdca3e59dda1f8
SHA1: 3491860264c8caa077fd4ea9517c8ebc139d8b60
Version 0.8, Released June 1, 2009.
MD5: 602a40cfba4de4751edf409a7c3e0854
SHA1: c5495d644e32e7a20f0af812675da6eb96619485
MD5: 17ceb039ff4474dcbf86adac8f103753
SHA1: e8e887a8cd504c32cc8c37c4c2b02364b6b44ee3
The download package contains three directories:
These are designed to be unpacked into a directory called Includes. If you are only using the base class PolynomialRegression, you can place PolynomialRegression.php wherever you like as it does not include any other files.
The file RootDirectory.inc.php defines a variable called $RootDirectory that contains either the absolute or relative path. The paths for the rest of the files are all based on this. This file must exist in any path that includes these classes (with the exception of PolynomialRegression). The file should look something like this:
<?php
$RootDirectory = "./";
?>
This include system allows PHP code to exist in any number of subdirectories and always correctly resolve the path to include files. If you do not like this style of relative paths, or have another convention you use, you can modify the first couple of lines of each class and manually place files where you desire them.
Several of these examples include some functions from plot.php, which uses XY Plot to draw charts.
Linear regression is one of the simplest forms of polynomial regression. It produces a 1st degree polynomial with the two coefficients usually called slope and intercept.
<?php
// Load the polynomial regression class.
require_once( 'RootDirectory.inc.php' );
require_once( $RootDirectory . 'includes/PolynomialRegression/PolynomialRegression.php' );
$data =
array
(
array( 0.00, 27.3834562958158 ), array( 0.02, 38.2347360741764 ),
array( 0.04, 42.5632501679666 ), array( 0.06, 19.4638760104114 ),
array( 0.08, 42.690858098909 ), array( 0.10, 25.330634164557 ),
array( 0.12, 49.6507591632989 ), array( 0.14, 34.3502467856792 ),
array( 0.16, 52.5267153107089 ), array( 0.18, 34.5528919545231 ),
array( 0.20, 44.3220950255077 ), array( 0.22, 44.7805694031715 ),
array( 0.24, 32.9090525820585 ), array( 0.26, 56.7941323051778 ),
array( 0.28, 48.7192221569495 ), array( 0.30, 48.7964850888813 ),
array( 0.32, 56.8905173101315 ), array( 0.34, 66.0107252116092 ),
array( 0.36, 74.3149331561425 ), array( 0.38, 52.9076168019644 ),
array( 0.40, 64.3463647026162 ), array( 0.42, 50.0776706625628 ),
array( 0.44, 62.3527806092493 ), array( 0.46, 75.9589658430523 ),
array( 0.48, 69.280743962744 ), array( 0.50, 74.4868159870338 ),
array( 0.52, 76.4548504742096 ), array( 0.54, 82.9347555390181 ),
array( 0.56, 83.9546576353049 ), array( 0.58, 83.6379624022705 ),
array( 0.60, 92.6278811310654 ), array( 0.62, 84.3395153143048 ),
array( 0.64, 86.832363003336 ), array( 0.66, 105.66563124607 ),
array( 0.68, 100.175129109663 ), array( 0.70, 82.0781941886623 ),
array( 0.72, 95.9916212989616 ), array( 0.74, 87.5853932119967 ),
array( 0.76, 93.5435091554247 ), array( 0.78, 98.0622114645327 ),
array( 0.80, 118.067000253198 ), array( 0.82, 98.2918886287489 ),
array( 0.84, 111.027863906934 ), array( 0.86, 113.1135947538 ),
array( 0.88, 117.777915259186 ), array( 0.90, 108.621331147219 ),
array( 0.92, 112.979639159754 ), array( 0.94, 122.065499190418 ),
array( 0.96, 116.136221596622 ), array( 0.98, 111.215762010712 ),
array( 1.00, 122.743302375187 )
);
// Precision digits in BC math.
bcscale( 10 );
// Start a regression class of order 2--linear regression.
$PolynomialRegression = new PolynomialRegression( 2 );
// Add all the data to the regression analysis.
foreach ( $data as $dataPoint )
$PolynomialRegression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
// Get coefficients for the polynomial.
$coefficients = $PolynomialRegression->getCoefficients();
// Print slope and intercept of linear regression.
echo "Slope : " . round( $coefficients[ 1 ], 2 ) . "<br />";
echo "Y-intercept : " . round( $coefficients[ 0 ], 2 ) . "<br />";
?>
In this example, 51 data points are used to compute a linear regression. The slope and y-intercept of the trend are then displayed.
The image above was created in a spreadsheet with the data points from the example. The linear regression trend line is displayed, along with the trend line's function.
This is the output from the example. Note how the slope and intercept values match those of the function in the spreadsheet created chart.
There is not much reason to use this library to compute linear regression, as there are far faster implementations. However, for data sets that have very large numbers, or when high accuracy is needed, this library may be useful.
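For comparison, a faster approach for the linear case is the closed-form solution in ordinary floating point. The sketch below is independent of this library. The function name fitLine is hypothetical, and the result is returned lowest power first to mirror the coefficient order used in the examples.

```php
<?php
// Closed-form simple linear regression using ordinary floats.
// Each point is array( x, y ). Returns array( intercept, slope ).
// Assumes at least two distinct x values.
function fitLine( array $points )
{
  $n = count( $points );
  $sumX = $sumY = $sumXY = $sumXX = 0.0;
  foreach ( $points as $point )
  {
    list( $x, $y ) = $point;
    $sumX  += $x;
    $sumY  += $y;
    $sumXY += $x * $y;
    $sumXX += $x * $x;
  }

  $slope = ( $n * $sumXY - $sumX * $sumY ) / ( $n * $sumXX - $sumX * $sumX );
  $intercept = ( $sumY - $slope * $sumX ) / $n;

  return array( $intercept, $slope );
}

// Points that lie exactly on y = 2 x + 1.
$coefficients = fitLine(
  array( array( 0, 1 ), array( 1, 3 ), array( 2, 5 ), array( 3, 7 ) ) );

echo "Slope : " . round( $coefficients[ 1 ], 2 ) . "\n";
echo "Y-intercept : " . round( $coefficients[ 0 ], 2 ) . "\n";
```

This makes a single pass over the data, but it is limited to the precision of floating-point values.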
<?php
// Load the polynomial regression class.
require_once( 'RootDirectory.inc.php' );
require_once( $RootDirectory . 'includes/PolynomialRegression/PolynomialRegression.php' );
require_once( $RootDirectory . 'includes/plot.php' );
// Data created in a spreadsheet with some random scatter. True function should be:
// f( x ) = 0.65 + 0.6 x - 6.25 x^2 + 6 x^3
$data =
array
(
array( 0.00, 0.65646507 ), array( 0.05, 0.61435503 ),
array( 0.10, 0.63151965 ), array( 0.15, 0.57711365 ),
array( 0.20, 0.58534249 ), array( 0.25, 0.54148715 ),
array( 0.30, 0.43877649 ), array( 0.35, 0.39516968 ),
array( 0.40, 0.24977940 ), array( 0.45, 0.24246690 ),
array( 0.50, 0.07730788 ), array( 0.55, 0.03633931 ),
array( 0.60, 0.08980716 ), array( 0.65, 0.07562991 ),
array( 0.70, 0.11196788 ), array( 0.75, 0.15086596 ),
array( 0.80, 0.19979455 ), array( 0.85, 0.34683801 ),
array( 0.90, 0.48338650 ), array( 0.95, 0.59196113 ),
array( 1.00, 0.99233320 )
);
// Precision digits in BC math.
bcscale( 10 );
// Start a regression class with 4 coefficients (a 3rd degree polynomial).
$polynomialRegression = new PolynomialRegression( 4 );
// Add all the data to the regression analysis.
foreach ( $data as $dataPoint )
$polynomialRegression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( 0, 1 );
$plot->setY_Span( 0, 1 );
$Y_MajorScale = 0.1;
$Y_MinorScale = $Y_MajorScale / 5;
$X_MajorScale = 0.1;
$X_MinorScale = $X_MajorScale / 5;
plotAddScale();
// Render the points to image
$plot->setCircleSize( 8 );
$plot->renderPoints();
// Get coefficients for the polynomial.
$coefficients = $polynomialRegression->getCoefficients();
$functionText = "f( x ) = ";
foreach ( $coefficients as $power => $coefficient )
{
if ( $power > 0 )
$functionText .= " + ";
$functionText .= round( $coefficient, 2 );
if ( $power > 0 )
{
$functionText .= "x";
if ( $power > 1 )
$functionText .= "^" . $power;
}
}
// Place text
imageString
(
$image,
2,
$leftMargin + 2,
$topMargin + 2,
$functionText,
$colorMap[ "Black" ]
);
plotRenderRegression( $polynomialRegression, $coefficients, 0, 1, $colorMap[ "LightRed" ] );
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
?>
This example starts with the knowledge that the data was generated by some function that is a 3rd degree polynomial. The data is formed by 21 samples close to the function f( x ) = 6 x^3 - 6.25 x^2 + 0.6 x + 0.65 with some random noise added. The regression analysis attempts to reconstruct the coefficients of the original function.
The graph shows the input data as red circles, and the regression plot as the red line. The function with the interpolated coefficients is printed at the top. The coefficients of this function are fairly close to the original.
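To make the coefficient order concrete: in the examples, index 0 holds the constant term and index k multiplies x^k. The stand-alone sketch below evaluates such an array using Horner's method. The function name evaluatePolynomial is hypothetical and is not part of the library.

```php
<?php
// Evaluate a polynomial whose coefficients are ordered lowest power first,
// so $coefficients[ $power ] multiplies x^$power. Uses Horner's method.
function evaluatePolynomial( array $coefficients, $x )
{
  $result = 0.0;
  for ( $power = count( $coefficients ) - 1; $power >= 0; --$power )
    $result = $result * $x + $coefficients[ $power ];

  return $result;
}

// True function from the example: f( x ) = 0.65 + 0.6 x - 6.25 x^2 + 6 x^3
$trueCoefficients = array( 0.65, 0.6, -6.25, 6.0 );

$atZero = evaluatePolynomial( $trueCoefficients, 0.0 );
$atHalf = evaluatePolynomial( $trueCoefficients, 0.5 );

echo "f( 0.0 ) = " . round( $atZero, 4 ) . "\n";
echo "f( 0.5 ) = " . round( $atHalf, 4 ) . "\n";
```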
<?php
// Load the polynomial regression class.
require_once( 'RootDirectory.inc.php' );
require_once( $RootDirectory . 'includes/PolynomialRegression/PolynomialRegression.php' );
$data =
array
(
array( 0.00, 27.3834562958158 ), array( 0.02, 38.2347360741764 ),
array( 0.04, 42.5632501679666 ), array( 0.06, 19.4638760104114 ),
array( 0.08, 42.690858098909 ), array( 0.10, 25.330634164557 ),
array( 0.12, 49.6507591632989 ), array( 0.14, 34.3502467856792 ),
array( 0.16, 52.5267153107089 ), array( 0.18, 34.5528919545231 ),
array( 0.20, 44.3220950255077 ), array( 0.22, 44.7805694031715 ),
array( 0.24, 32.9090525820585 ), array( 0.26, 56.7941323051778 ),
array( 0.28, 48.7192221569495 ), array( 0.30, 48.7964850888813 ),
array( 0.32, 56.8905173101315 ), array( 0.34, 66.0107252116092 ),
array( 0.36, 74.3149331561425 ), array( 0.38, 52.9076168019644 ),
array( 0.40, 64.3463647026162 ), array( 0.42, 50.0776706625628 ),
array( 0.44, 62.3527806092493 ), array( 0.46, 75.9589658430523 ),
array( 0.48, 69.280743962744 ), array( 0.50, 74.4868159870338 ),
array( 0.52, 76.4548504742096 ), array( 0.54, 82.9347555390181 ),
array( 0.56, 83.9546576353049 ), array( 0.58, 83.6379624022705 ),
array( 0.60, 92.6278811310654 ), array( 0.62, 84.3395153143048 ),
array( 0.64, 86.832363003336 ), array( 0.66, 105.66563124607 ),
array( 0.68, 100.175129109663 ), array( 0.70, 82.0781941886623 ),
array( 0.72, 95.9916212989616 ), array( 0.74, 87.5853932119967 ),
array( 0.76, 93.5435091554247 ), array( 0.78, 98.0622114645327 ),
array( 0.80, 118.067000253198 ), array( 0.82, 98.2918886287489 ),
array( 0.84, 111.027863906934 ), array( 0.86, 113.1135947538 ),
array( 0.88, 117.777915259186 ), array( 0.90, 108.621331147219 ),
array( 0.92, 112.979639159754 ), array( 0.94, 122.065499190418 ),
array( 0.96, 116.136221596622 ), array( 0.98, 111.215762010712 ),
array( 1.00, 122.743302375187 )
);
// Precision digits in BC math.
bcscale( 10 );
// Start a regression class of order 2--linear regression.
$leastSquareRegression = new PolynomialRegression( 2 );
// Add all the data to the regression analysis.
foreach ( $data as $dataPoint )
$leastSquareRegression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
// Get coefficients for the polynomial.
$coefficients = $leastSquareRegression->getCoefficients();
// Print slope and intercept of linear regression.
echo "Slope : " . round( $coefficients[ 1 ], 2 ) . "<br />\n";
echo "Y-intercept : " . round( $coefficients[ 0 ], 2 ) . "<br />\n";
//
// Get average of Y-data.
//
$Y_Average = 0.0;
foreach ( $data as $dataPoint )
$Y_Average += $dataPoint[ 1 ];
$Y_Average /= count( $data );
//
// Calculate R Squared.
//
$Y_MeanSum = 0.0;
$Y_ErrorSum = 0.0;
foreach ( $data as $dataPoint )
{
$x = $dataPoint[ 0 ];
$y = $dataPoint[ 1 ];
$error = $y;
$error -= $leastSquareRegression->interpolate( $coefficients, $x );
$Y_ErrorSum += $error * $error;
$error = $y;
$error -= $Y_Average;
$Y_MeanSum += $error * $error;
}
$R_Squared = 1.0 - ( $Y_ErrorSum / $Y_MeanSum );
echo "R Squared : $R_Squared<br />\n";
?>
This example shows how to compute the coefficient of determination (generally called R-Squared) after the coefficients have been calculated. This value is one representation of the goodness of fit. The closer this value is to 1.0, the better the fit.
There are times when it is known that the intercept of the function is zero, but the calculated coefficient for the offset is not. For this case, use the function setForcedCoefficient( 0, 0 ). The following is a typical example involving linear regression of a noisy set of data points.
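Written as a formula, the calculation in the example corresponds to the following, where f is the fitted polynomial and the bar denotes the mean of the y values. The symbols are chosen here for illustration.

```latex
% Numerator: sum of squared errors of the fit   ($Y_ErrorSum in the example)
% Denominator: sum of squared deviations from the mean   ($Y_MeanSum in the example)
R^2 = 1 - \frac{\sum_{i} \bigl( y_i - f(x_i) \bigr)^2}{\sum_{i} \bigl( y_i - \bar{y} \bigr)^2}
```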
<?php
// Load the polynomial regression class.
require_once( 'RootDirectory.inc.php' );
require_once( $RootDirectory . 'includes/PolynomialRegression/PolynomialRegression.php' );
require_once( $RootDirectory . 'includes/plot.php' );
$data =
array
(
array( 0.05, 0.1924787314 ), array( 0.10, 0.4586186921 ),
array( 0.15, 0.1318838557 ), array( 0.20, 0.1865927433 ),
array( 0.25, 0.4667421897 ), array( 0.30, 0.1027880072 ),
array( 0.35, 0.5599968985 ), array( 0.40, 0.6605423892 ),
array( 0.45, 0.620103306 ), array( 0.50, 0.4445367125 ),
array( 0.55, 0.5912679423 ), array( 0.60, 0.7942020837 ),
array( 0.65, 0.8694575373 ), array( 0.70, 0.4146043937 ),
array( 0.75, 0.6604661468 ), array( 0.80, 0.9138025779 ),
array( 0.85, 0.8124334151 ), array( 0.90, 0.7998087715 ),
array( 0.95, 0.7391285236 ), array( 1.00, 0.9012208138 ),
);
// Precision digits in BC math.
bcscale( 10 );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( 0, 1 );
$plot->setY_Span( 0, 1 );
plotAddScale();
// Render the points to image
$plot->setCircleSize( 8 );
$plot->renderPoints();
// Start two regression classes with 2 coefficients (linear regression): one with
// no forced coefficients, one with the intercept forced to zero.
$regression1 = new PolynomialRegression( 2 );
$regression2 = new PolynomialRegression( 2 );
$regression2->setForcedCoefficient( 0, 0 );
// Add all the data to both regression analyses.
foreach ( $data as $dataPoint )
{
$regression1->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
$regression2->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
}
// Get coefficients for the polynomial.
$coefficients1 = $regression1->getCoefficients();
$coefficients2 = $regression2->getCoefficients();
// Plot each of the curves.
plotRenderRegression( $regression1, $coefficients1, 0, 1, $colorMap[ "Green" ] );
plotRenderRegression( $regression2, $coefficients2, 0, 1, $colorMap[ "LightBlue" ] );
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
?>
In the graph, the green line shows ordinary linear regression, while the blue line shows linear regression with the intercept forced to zero. The actual slope is 1, but the data has a very low signal-to-noise ratio.
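For the straight-line case, forcing the intercept to zero has a simple closed form. The sketch below is independent of this library and is not necessarily how setForcedCoefficient is implemented. The function name fitLineThroughOrigin is hypothetical.

```php
<?php
// Least-squares line constrained through the origin: y = slope * x.
// Minimizing sum( ( y - slope * x )^2 ) gives slope = sum( x y ) / sum( x^2 ).
// Each point is array( x, y ). Assumes at least one non-zero x.
function fitLineThroughOrigin( array $points )
{
  $sumXY = $sumXX = 0.0;
  foreach ( $points as $point )
  {
    list( $x, $y ) = $point;
    $sumXY += $x * $y;
    $sumXX += $x * $x;
  }

  return $sumXY / $sumXX;
}

// Slightly noisy samples of y = x.
$slope = fitLineThroughOrigin(
  array( array( 1, 1.1 ), array( 2, 1.9 ), array( 3, 3.0 ) ) );

echo "Slope : " . round( $slope, 4 ) . "\n";
```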
In addition to being able to force a zero offset, it is possible to set any coefficient to a known value. This will allow the other coefficients to be determined by the regression analysis. Thus if it is known that one of the coefficients must be a specific value, the remaining coefficients will take this into account.
As with forcing an intercept, the function setForcedCoefficient is used. The first parameter selects which coefficient is to be forced to the known value, and the second parameter is the value. More than one coefficient may be forced if desired.
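One way to view forced coefficients mathematically is sketched below. The notation is an assumption made for illustration (F is the set of forced powers and each forced c_k is a known value), and the source does not state whether the library solves the problem this way. The known terms are moved to the data side, and the remaining coefficients are fit to what is left.

```latex
% Split the polynomial into forced and free terms
f(x) = \sum_{k \in F} c_k x^k + \sum_{k \notin F} c_k x^k

% Fit only the free coefficients to the adjusted data
\min_{c_k,\; k \notin F} \; \sum_{i=1}^{n}
  \Bigl( \bigl( y_i - \sum_{k \in F} c_k x_i^k \bigr) - \sum_{k \notin F} c_k x_i^k \Bigr)^2
```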
<?php
// Load the polynomial regression class.
require_once( 'RootDirectory.inc.php' );
require_once( $RootDirectory . 'includes/PolynomialRegression/PolynomialRegression.php' );
require_once( $RootDirectory . 'includes/plot.php' );
$data =
array
(
array( 0.00, 0.65379741 ), array( 0.05, 0.64074062 ),
array( 0.10, 0.72833783 ), array( 0.15, 0.44629689 ),
array( 0.20, 0.45174500 ), array( 0.25, 0.34161602 ),
array( 0.30, 0.78621158 ), array( 0.35, 0.38960121 ),
array( 0.40, 0.14126441 ), array( 0.45, 0.38123106 ),
array( 0.50, 0.20605429 ), array( 0.55, 0.02456525 ),
array( 0.60, 0.48434811 ), array( 0.65, 0.21453304 ),
array( 0.70, 0.54765807 ), array( 0.75, 0.41625294 ),
array( 0.80, 0.78163483 ), array( 0.85, 0.71306009 ),
array( 0.90, 0.53515664 ), array( 0.95, 0.98918384 ),
array( 1.00, 0.93061202 )
);
// The actual coefficients for the above data (without noise).
$trueCoefficients = array( 0.9, -2, 0.6, 1.5 );
// Precision digits in BC math.
bcscale( 10 );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( 0, 1 );
$plot->setY_Span( 0, 1 );
plotAddScale();
// Render the points to image
$plot->setCircleSize( 4 );
$plot->renderPoints();
// Start two regression classes with 4 coefficients each: one with no forced
// coefficients, one with two forced coefficients.
$regression1 = new PolynomialRegression( 4 );
$regression2 = new PolynomialRegression( 4 );
$regression2->setForcedCoefficient( 1, -2 );
$regression2->setForcedCoefficient( 3, 1.5 );
// Add all the data to both regression analyses.
foreach ( $data as $dataPoint )
{
$regression1->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
$regression2->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
}
// Get coefficients for the polynomial.
$coefficients1 = $regression1->getCoefficients();
$coefficients2 = $regression2->getCoefficients();
// Plot each of the curves.
plotRenderRegression( $regression1, $trueCoefficients, 0, 1, $colorMap[ "LightRed" ] );
plotRenderRegression( $regression1, $coefficients1, 0, 1, $colorMap[ "Green" ] );
plotRenderRegression( $regression2, $coefficients2, 0, 1, $colorMap[ "LightBlue" ] );
$y = $imageHeight - $bottomMargin;
printFunction( $y, 3, $colorMap[ "LightRed" ], $trueCoefficients );
printFunction( $y, 2, $colorMap[ "Green" ], $coefficients1 );
printFunction( $y, 1, $colorMap[ "LightBlue" ], $coefficients2 );
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
function printFunction( $y, $line, $color, $coefficients )
{
global $image;
global $leftMargin;
$functionText = "f( x ) = ";
foreach ( $coefficients as $power => $coefficient )
{
if ( $power > 0 )
$functionText .= " + ";
$functionText .= number_format( $coefficient, 2 );
if ( $power > 0 )
{
$functionText .= "x";
if ( $power > 1 )
$functionText .= "^" . $power;
}
}
// Place text
imageString
(
$image,
2,
$leftMargin + 2,
$y - $line * imagefontheight( 2 ),
$functionText,
$color
);
}
?>
In this example, regression is performed on a noisy data set generated from a 3rd degree polynomial. The true coefficients are (0.9, -2, 0.6, 1.5). The coefficients are first determined without any forced terms, and then by forcing two of the terms to known values.
The graph displays the input data as red circles. The red line is the true curve. The green line is the regression with no known coefficients, and the blue line is the regression with two forced coefficients. As expected, the blue line conforms more closely to the true curve represented by the red line.
The linearized regression classes are children of the polynomial regression class and override some of its functions in order to perform linearization. Their operation is therefore almost identical to that of the polynomial regression class.
<?php
// Load the polynomial regression class.
require_once( 'RootDirectory.inc.php' );
require_once( $RootDirectory . 'includes/PolynomialRegression/PolynomialRegression.php' );
require_once( $RootDirectory . 'includes/LinearizedRegression/ExpRegression.php' );
require_once( $RootDirectory . 'includes/plot.php' );
$data =
array
(
array( 0.00, 0.024094775 ), array( 0.05, 0.0390894172 ),
array( 0.10, 0.0524281705 ), array( 0.15, 0.0094749558 ),
array( 0.20, 0.1342814605 ), array( 0.25, 0.0181198568 ),
array( 0.30, 0.032552131 ), array( 0.35, 0.0227223143 ),
array( 0.40, 0.1169744975 ), array( 0.45, 0.1226243145 ),
array( 0.50, 0.1427587983 ), array( 0.55, 0.1497210208 ),
array( 0.60, 0.1727192031 ), array( 0.65, 0.3031739468 ),
array( 0.70, 0.2400640511 ), array( 0.75, 0.3650339253 ),
array( 0.80, 0.4659496711 ), array( 0.85, 0.5082614871 ),
array( 0.90, 0.6841058006 ), array( 0.95, 0.7940730517 ),
);
// Precision digits in BC math.
bcscale( 10 );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( 0, 1 );
$plot->setY_Span( 0, 1 );
plotAddScale();
// Render the points to image
$plot->setCircleSize( 8 );
$plot->renderPoints();
// Create instance of linearized regression.
$regression = new ExpRegression();
// Add all the data to the regression analysis.
foreach ( $data as $dataPoint )
$regression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
// Get the resulting coefficients.
$coefficients = $regression->getCoefficients();
// Plot the fitted curve.
plotRenderRegression
(
$regression,
$coefficients,
0,
1,
$colorMap[ "Green" ],
"ExpRegression"
);
$string =
"f( x ) = "
. number_format( $coefficients[ 0 ], 4 )
. " exp( "
. number_format( $coefficients[ 1 ], 4 )
. " x )";
// Place text
imageString
(
$image,
2,
$leftMargin + 2,
10,
$string,
$colorMap[ "Green" ]
);
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
?>
The basic mechanics of the polynomial regression class are used because in the linearized form, this function turns into a 1st degree polynomial. For this reason the number of coefficients does not need to be specified.
The graph shows a noisy signal and the exponential curve calculated from y = a e^( b x ). The actual noisy data used y = 0.18 e^( 4 x ).
One useful application of weighting is to assist the linearized version of the power function. This function often responds poorly to noise and does not resolve to a good fit because the linearized version is being minimized, not the actual function. A workaround is to weight the terms unequally before solving.
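The linearization step itself can be sketched without the library: take the natural logarithm of each y value, fit a straight line, and convert the intercept back with exp. The function name fitExponential is hypothetical, and the sketch assumes every y is positive.

```php
<?php
// Fit y = a * exp( b * x ) by fitting a straight line to ( x, ln y ).
// Each point is array( x, y ) with y > 0. Returns array( a, b ).
function fitExponential( array $points )
{
  $n = count( $points );
  $sumX = $sumL = $sumXL = $sumXX = 0.0;
  foreach ( $points as $point )
  {
    list( $x, $y ) = $point;
    $logY   = log( $y );
    $sumX  += $x;
    $sumL  += $logY;
    $sumXL += $x * $logY;
    $sumXX += $x * $x;
  }

  // Slope of the line is b; intercept of the line is ln( a ).
  $b = ( $n * $sumXL - $sumX * $sumL ) / ( $n * $sumXX - $sumX * $sumX );
  $logA = ( $sumL - $b * $sumX ) / $n;

  return array( exp( $logA ), $b );
}

// Noise-free samples of y = 2 exp( 3 x ).
$points = array();
for ( $i = 0; $i <= 4; ++$i )
{
  $x = $i * 0.25;
  $points[] = array( $x, 2.0 * exp( 3.0 * $x ) );
}

list( $a, $b ) = fitExponential( $points );
echo "f( x ) = " . number_format( $a, 4 ) . " exp( " . number_format( $b, 4 ) . " x )\n";
```

Because the fit minimizes the error in ln y rather than in y, noisy data with small y values can pull the result away from the best fit of the original function.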
<?php
// Load the polynomial regression class.
require_once( 'RootDirectory.inc.php' );
require_once( $RootDirectory . 'includes/PolynomialRegression/PolynomialRegression.php' );
require_once( $RootDirectory . 'includes/LinearizedRegression/PowRegression.php' );
require_once( $RootDirectory . 'includes/Weighting/ExponentiationWeighting.php' );
require_once( $RootDirectory . 'includes/plot.php' );
$data =
array
(
array( 0.05,0.00604730001 ),
array( 0.10,0.00368496403 ),
array( 0.15,0.00149732550 ),
array( 0.20,0.00750937272 ),
array( 0.25,0.01402765100 ),
array( 0.30,0.00460214218 ),
array( 0.35,0.01895682587 ),
array( 0.40,0.04611466211 ),
array( 0.45,0.06140241681 ),
array( 0.50,0.05753703495 ),
array( 0.55,0.10084107155 ),
array( 0.60,0.14016251588 ),
array( 0.65,0.18072751735 ),
array( 0.70,0.23557998528 ),
array( 0.75,0.30045147211 ),
array( 0.80,0.40979875947 ),
array( 0.85,0.51324006361 ),
array( 0.90,0.65069131055 ),
array( 0.95,0.81135826051 ),
array( 1.00,1.00234398314 ),
);
// Precision digits in BC math.
bcscale( 10 );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( 0, 1 );
$plot->setY_Span( 0, 1 );
plotAddScale();
// Render the points to image
$plot->setCircleSize( 8 );
$plot->renderPoints();
// Create two power-of regression classes. One will be weighted, the other
// will not.
$regression1 = new PowRegression();
$regression2 = new PowRegression();
// Use exponentiation weighting on first regression class.
$powerWeightedRegression = new ExponentiationWeighting( 4 );
$regression1->setWeighting( $powerWeightedRegression );
// Add all the data to both regression analyses.
foreach ( $data as $dataPoint )
{
$regression1->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
$regression2->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
}
// Get coefficients for the polynomial.
$coefficients1 = $regression1->getCoefficients();
$coefficients2 = $regression2->getCoefficients();
printFunction( 0, $colorMap[ "Red" ], array( 1, 4 ), " true" );
printFunction( 1, $colorMap[ "Green" ], $coefficients1, " weighted" );
printFunction( 2, $colorMap[ "Blue" ], $coefficients2, " unweighted" );
// Plot each of the curves.
plotRenderRegression
(
$regression1,
$coefficients1,
0,
1,
$colorMap[ "Green" ],
"PowRegression"
);
// Plot each of the curves.
plotRenderRegression
(
$regression2,
$coefficients2,
0,
1,
$colorMap[ "Blue" ],
"PowRegression"
);
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
function printFunction( $line, $color, $coefficients, $text )
{
global $image;
global $leftMargin;
global $topMargin;
$functionText = "f( x ) = "
. number_format( $coefficients[ 0 ], 6 )
. " x^"
. number_format( $coefficients[ 1 ], 6 )
. $text;
// Place text
imageString
(
$image,
2,
$leftMargin + 2,
$topMargin + $line * imagefontheight( 2 ),
$functionText,
$color
);
}
?>
Here exponentiation weighting is used to weight the terms on the right side much more than those on the left.
After declaring an instance of the weighting class it can be assigned to the regression class using the setWeighting function.
In this example the red dots represent noisy data. The blue plot is the standard linearized regression, which produces a poor fit. The green trace shows a weighted regression which fits much better.
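The effect can be reproduced in a stand-alone sketch that does not use the library. The function name fitPower is hypothetical, and the choice of weighting each point by y^2 is an assumption made for illustration; it is not necessarily what ExponentiationWeighting does. The data is y = x^4 with a single small value disturbed.

```php
<?php
// Fit y = a * x^b by a weighted straight-line fit to ( ln x, ln y ).
// Each point is array( x, y ) with x > 0 and y > 0, weighted by y^$weightPower.
// $weightPower = 0 gives the plain linearized fit. Returns array( a, b ).
function fitPower( array $points, $weightPower )
{
  $sumW = $sumWU = $sumWV = $sumWUV = $sumWUU = 0.0;
  foreach ( $points as $point )
  {
    list( $x, $y ) = $point;
    $u = log( $x );
    $v = log( $y );
    $w = pow( $y, $weightPower );
    $sumW   += $w;
    $sumWU  += $w * $u;
    $sumWV  += $w * $v;
    $sumWUV += $w * $u * $v;
    $sumWUU += $w * $u * $u;
  }

  $b = ( $sumW * $sumWUV - $sumWU * $sumWV )
     / ( $sumW * $sumWUU - $sumWU * $sumWU );
  $logA = ( $sumWV - $b * $sumWU ) / $sumW;

  return array( exp( $logA ), $b );
}

// Samples of y = x^4 at x = 0.1 ... 1.0, with the smallest value disturbed
// (0.0001 replaced by 0.001).
$points = array();
for ( $i = 1; $i <= 10; ++$i )
{
  $x = $i / 10;
  $points[] = array( $x, pow( $x, 4 ) );
}
$points[ 0 ][ 1 ] = 0.001;

list( $plainA, $plainB )       = fitPower( $points, 0 );
list( $weightedA, $weightedB ) = fitPower( $points, 2 );

echo "Unweighted exponent : " . number_format( $plainB, 4 ) . "\n";
echo "Weighted exponent   : " . number_format( $weightedB, 4 ) . "\n";
```

A tiny absolute error in a small value becomes a large error after taking the logarithm, so the unweighted exponent drifts well away from 4. Weighting by a power of y reduces the influence of the small values, and the weighted exponent stays close to 4.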
In this example the unique weighting class is applied to a data set.
<?php
// Load the polynomial regression class.
require_once( 'RootDirectory.inc.php' );
require_once( $RootDirectory . 'includes/PolynomialRegression/PolynomialRegression.php' );
require_once( $RootDirectory . 'includes/LinearizedRegression/PowRegression.php' );
require_once( $RootDirectory . 'includes/Weighting/UniqueWeighting.php' );
require_once( $RootDirectory . 'includes/plot.php' );
// The data consists of three columns: x, y, and weight
// y is a noisy version of 0.5 x^2 - 0.2 x - 0.1.
$data =
array
(
array( -1.00, 0.429082372117, 0.569892634364 ), array( -0.98, 0.436345120319, 0.636378047300 ),
array( -0.96, 0.946300417341, 0.223095864079 ), array( -0.94, 0.488585724422, 0.881383015552 ),
array( -0.92, 0.507196009419, 0.999988028306 ), array( -0.90, 0.521656533309, 0.894012248939 ),
array( -0.88, -0.432993635624, 0.001118592602 ), array( -0.86, 0.620720965283, 0.553547494305 ),
array( -0.84, 0.307855315214, 0.697994672029 ), array( -0.82, -0.093627344638, 0.129686879222 ),
array( -0.80, 0.359517424353, 0.939802287599 ), array( -0.78, 0.097336113317, 0.400537391569 ),
array( -0.76, 0.358160609220, 0.948817112269 ), array( -0.74, 0.285387206942, 0.894691015908 ),
array( -0.72, -0.274064421286, 0.075545117800 ), array( -0.70, 0.277578108855, 0.977899171137 ),
array( -0.68, 0.233848248484, 0.903244664976 ), array( -0.66, 0.658390358233, 0.206854608847 ),
array( -0.64, 0.221625014019, 0.966848287456 ), array( -0.62, 0.389360858446, 0.565279299925 ),
array( -0.60, 0.189432485206, 0.968631292625 ), array( -0.58, 0.180331548781, 0.988439483196 ),
array( -0.56, 0.164248498341, 0.986407549234 ), array( -0.54, 0.097501356331, 0.840434240181 ),
array( -0.52, 0.119582406844, 0.942294220586 ), array( -0.50, 0.076778447408, 0.862199166178 ),
array( -0.48, 0.111391748860, 0.999424863716 ), array( -0.46, 0.218096278888, 0.680783916604 ),
array( -0.44, 0.220927612807, 0.644686798859 ), array( -0.42, 0.050800637625, 0.937165911538 ),
array( -0.40, 0.133490346077, 0.795334545733 ), array( -0.38, 0.050325561687, 0.993636859372 ),
array( -0.36, 0.125172166553, 0.757622265013 ), array( -0.34, 0.028406160314, 0.992201877572 ),
array( -0.32, -0.632114422481, 0.043869542050 ), array( -0.30, -0.377195572311, 0.235805021628 ),
array( -0.28, -0.838761757082, 0.004577458194 ), array( -0.26, 0.025756790679, 0.884855470507 ),
array( -0.24, -0.129885768151, 0.712873973182 ), array( -0.22, -0.091615936230, 0.831072011789 ),
array( -0.20, -0.042166426936, 0.993514789242 ), array( -0.18, -0.731100282906, 0.031764573338 ),
array( -0.16, -0.074678621279, 0.942694995709 ), array( -0.14, -0.010666450671, 0.853229614140 ),
array( -0.12, -0.423784984117, 0.268354866384 ), array( -0.10, -0.014367098462, 0.828907433941 ),
array( -0.08, -0.079386336384, 0.995765001662 ), array( -0.06, 0.366272140354, 0.164141601459 ),
array( -0.04, -0.035810132564, 0.842864571750 ), array( -0.02, 0.173016713505, 0.390911788143 ),
array( 0.00, -0.190364896737, 0.752664850487 ), array( 0.02, -0.507882832432, 0.211620478052 ),
array( 0.04, -0.028640420194, 0.782351244421 ), array( 0.06, -0.107881081963, 0.993059365560 ),
array( 0.08, -0.024250171192, 0.757179404521 ), array( 0.10, -0.117606369137, 0.992201254363 ),
array( 0.12, -0.102269730291, 0.957039509328 ), array( 0.14, -1.106658103724, 0.000001537558 ),
array( 0.16, -0.120761711644, 0.995322178087 ), array( 0.18, 0.759083522349, 0.001776681974 ),
array( 0.20, -0.094338197702, 0.924973278381 ), array( 0.22, -0.486209665402, 0.254346420737 ),
array( 0.24, -0.343395568066, 0.466935365843 ), array( 0.26, -0.624836207159, 0.120088611245 ),
array( 0.28, 0.661480114573, 0.010899684735 ), array( 0.30, -0.504487870789, 0.227553169938 ),
array( 0.32, -0.112587885487, 0.999363791430 ), array( 0.34, -0.042354488563, 0.809960212155 ),
array( 0.36, 0.858765224438, 0.000039424725 ), array( 0.38, 0.710171402360, 0.006437824548 ),
array( 0.40, 0.196668403984, 0.347920792140 ), array( 0.42, 0.195016059482, 0.356678291182 ),
array( 0.44, 0.139607332753, 0.455098502332 ), array( 0.46, 0.732007467101, 0.006007975072 ),
array( 0.48, -0.062308039377, 0.945543652579 ), array( 0.50, -0.064679333277, 0.969356448998 ),
array( 0.52, -0.235834408073, 0.577937913809 ), array( 0.54, -0.062405883212, 0.999382477518 ),
array( 0.56, 0.295241751701, 0.274065460164 ), array( 0.58, -0.048056778183, 0.999229863239 ),
array( 0.60, -0.044261192643, 0.987270817986 ), array( 0.62, 0.183358354779, 0.483443937529 ),
array( 0.64, -0.338719007719, 0.320689084223 ), array( 0.66, 0.823250995966, 0.004294898834 ),
array( 0.68, -0.005131977430, 0.999004398299 ), array( 0.70, 0.538155869311, 0.101745616887 ),
array( 0.72, -0.062350563324, 0.784924183587 ), array( 0.74, 0.473045854508, 0.168886923699 ),
array( 0.76, 0.216363321550, 0.552249336945 ), array( 0.78, -0.559543696016, 0.060354519356 ),
array( 0.80, 0.060103424898, 0.999689757395 ), array( 0.82, 0.679733968306, 0.060451380394 ),
array( 0.84, 0.458152379065, 0.246076525234 ), array( 0.86, 0.968899880624, 0.002141706519 ),
array( 0.88, 0.262240173749, 0.611873181275 ), array( 0.90, 1.034976322902, 0.000729575505 ),
array( 0.92, 0.188591612117, 0.859023265306 ), array( 0.94, 1.037115142351, 0.001588705877 ),
array( 0.96, 1.006945736352, 0.004240064205 ), array( 0.98, 0.236838302140, 0.850251616410 ),
array( 1.00, 0.190190185081, 0.970858308627 ),
);
// Precision digits in BC math.
bcscale( 10 );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( -1, 1 );
$plot->setY_Span( -1, 1 );
plotAddScale();
// Render the points to image.
$plot->setCircleSize( 3 );
$plot->renderPoints();
// Create two polynomial regression classes with 3 coefficients each
// (2nd degree). One will be weighted, the other will not.
$regression1 = new PolynomialRegression( 3 );
$regression2 = new PolynomialRegression( 3 );
// Set up unique weighting on the first regression.
$weighting = new UniqueWeighting();
$regression1->setWeighting( $weighting );
// Add all the data to both regressions.
foreach ( $data as $dataPoint )
{
// Set weighting term before adding data.
$weighting->setWeight( $dataPoint[ 2 ] );
$regression1->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
$regression2->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
}
// Get coefficients for the polynomial.
$coefficients1 = $regression1->getCoefficients();
$coefficients2 = $regression2->getCoefficients();
$coefficients3 = array( -0.1, -0.2, 0.5 ); // <- Actual coefficients.
// Display the functions for each plot.
$y = $imageHeight - $bottomMargin;
printFunction( $y, 1, $colorMap[ "Red" ], $coefficients3 );
printFunction( $y, 2, $colorMap[ "Green" ], $coefficients1 );
printFunction( $y, 3, $colorMap[ "Blue" ], $coefficients2 );
// Plot each of the curves.
plotRenderRegression( $regression2, $coefficients3, -1, 1, $colorMap[ "LightRed" ] );
plotRenderRegression( $regression1, $coefficients1, -1, 1, $colorMap[ "Green" ] );
plotRenderRegression( $regression2, $coefficients2, -1, 1, $colorMap[ "Blue" ] );
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
function printFunction( $y, $line, $color, $coefficients )
{
global $image;
global $leftMargin;
$functionText = "y = ";
foreach ( $coefficients as $power => $coefficient )
{
if ( $power > 0 )
{
if ( $coefficient >= 0 ) // Treat zero as positive so it is not printed as " - 0.00".
$functionText .= " + ";
else
{
$functionText .= " - ";
$coefficient = -$coefficient;
}
}
$functionText .= number_format( $coefficient, 2 );
if ( $power > 0 )
{
$functionText .= "x";
if ( $power > 1 )
$functionText .= "^" . $power;
}
}
// Place text
imageString
(
$image,
2,
$leftMargin + 2,
$y - $line * imagefontheight( 2 ),
$functionText,
$color
);
}
?>
In this example the data comes from a 2nd degree polynomial function (three coefficients), and a pre-calculated weight has been placed in the third column of each data point. The UniqueWeighting class is used to assign this custom weight to each value. Here the weight is an estimate of how likely the data point is to be accurate.
After an instance of the weighting class is created, it is assigned to the regression class once using the setWeighting function. The weight for each point is then set by calling setWeight on the weighting instance. Notice how setWeight is called before each data point is added to the regression.
In the resulting plot the red dots represent the noisy data and the light red line is a plot of the actual function. The blue line is the unweighted regression, and the green line is the weighted regression.
The weighting allows the more accurate data points to be more strongly considered, and thus the green line more accurately fits the original curve.
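To make the effect of the weights concrete, here is a minimal, self-contained sketch of a weighted least-squares polynomial fit. It is an illustration only and not the library's implementation: the function name weightedPolyFit is hypothetical, it uses ordinary floating point rather than BC math, and it assumes the data is not degenerate (at least as many distinct x values with non-zero weight as coefficients). Each point uses the same ( x, y, weight ) layout as the data in the example above.

```php
<?php
// Hypothetical helper for illustration; not part of the library.
// Fits $count coefficients (degree $count - 1) to points of the
// form array( x, y, weight ) by weighted least squares.
function weightedPolyFit( array $points, int $count ): array
{
  // Build the augmented weighted normal equations:
  //   sum( w * x^(row+column) ) * c[ column ] = sum( w * y * x^row )
  $matrix = array_fill( 0, $count, array_fill( 0, $count + 1, 0.0 ) );
  foreach ( $points as [ $x, $y, $w ] )
  {
    for ( $row = 0; $row < $count; ++$row )
    {
      for ( $column = 0; $column < $count; ++$column )
        $matrix[ $row ][ $column ] += $w * pow( $x, $row + $column );

      $matrix[ $row ][ $count ] += $w * $y * pow( $x, $row );
    }
  }

  // Solve with Gauss-Jordan elimination and partial pivoting.
  for ( $pivot = 0; $pivot < $count; ++$pivot )
  {
    // Move the row with the largest entry in this column to the pivot row.
    $best = $pivot;
    for ( $row = $pivot + 1; $row < $count; ++$row )
      if ( abs( $matrix[ $row ][ $pivot ] ) > abs( $matrix[ $best ][ $pivot ] ) )
        $best = $row;

    [ $matrix[ $pivot ], $matrix[ $best ] ] = [ $matrix[ $best ], $matrix[ $pivot ] ];

    // Normalize the pivot row (assumes a non-singular system).
    $scale = $matrix[ $pivot ][ $pivot ];
    for ( $column = $pivot; $column <= $count; ++$column )
      $matrix[ $pivot ][ $column ] /= $scale;

    // Eliminate this column from every other row.
    for ( $row = 0; $row < $count; ++$row )
    {
      if ( $row == $pivot )
        continue;

      $factor = $matrix[ $row ][ $pivot ];
      for ( $column = $pivot; $column <= $count; ++$column )
        $matrix[ $row ][ $column ] -= $factor * $matrix[ $pivot ][ $column ];
    }
  }

  // The last column now holds the coefficients, lowest power first.
  return array_column( $matrix, $count );
}

// Five exact points on y = -0.1 - 0.2x + 0.5x^2 plus one outlier that
// has been given a very small weight.
$points = array(
  array( -1.0,  0.600, 1.0 ), array( -0.5,  0.125, 1.0 ),
  array(  0.0, -0.100, 1.0 ), array(  0.5, -0.075, 1.0 ),
  array(  1.0,  0.200, 1.0 ), array( 0.25,  1.000, 0.000001 ),
);

$weighted = weightedPolyFit( $points, 3 );

// The same points with every weight forced to 1 (unweighted fit).
$equalWeights = array_map( fn( $p ) => array( $p[ 0 ], $p[ 1 ], 1.0 ), $points );
$unweighted = weightedPolyFit( $equalWeights, 3 );

printf( "weighted:   %.4f, %.4f, %.4f\n", ...$weighted );
printf( "unweighted: %.4f, %.4f, %.4f\n", ...$unweighted );
```

In this sketch the low-weight outlier barely moves the weighted fit, while the unweighted fit is pulled toward it. That is the same behaviour the green and blue lines show in the plot.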
The mathematics behind polynomial regression is broken into several sections, some of which are more detailed than others.
This library grew out of the author's quest to understand the mathematics of various curve fitting techniques, of which polynomial regression is one. There are several blog postings about the math behind this library:
This software is free, open-source software released under the GNU license.
The polynomial regression class is written and maintained by Andrew Que. To get in touch with Andrew Que, visit his contact page.
(C) Copyright 2009, 2012-2015, 2021 by Andrew Que