Introduction
Polynomial regression is a method of least-squares curve fitting. It takes a set of data points and produces the coefficients of a polynomial that approximates the curve the data follows. The number of coefficients determines the degree of the polynomial and limits how closely the curve can be fit. A degree of zero (1 coefficient) is the arithmetic mean. First degree (2 coefficients) is also known as linear regression. Second and higher degrees produce non-linear polynomial regression.
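For reference, the least-squares criterion can be written out as below. This is the standard textbook formulation rather than something taken from the library's source; the symbols n (number of coefficients), c_j (coefficients), and N (number of data points) are notation introduced here.

```latex
\[
  p(x) = \sum_{j=0}^{n-1} c_j x^j ,
  \qquad
  E(c_0, \dots, c_{n-1}) = \sum_{i=1}^{N} \bigl( y_i - p(x_i) \bigr)^2
\]
Setting $\partial E / \partial c_k = 0$ for each $k$ gives the normal equations
\[
  \sum_{j=0}^{n-1} c_j \sum_{i=1}^{N} x_i^{\,j+k}
    = \sum_{i=1}^{N} y_i \, x_i^{\,k} ,
  \qquad k = 0, \dots, n-1 .
\]
```

With n = 1 the system collapses to c_0 = (1/N) Σ y_i, which is the mean. The sums run up to the power 2n - 2 of x, which is one way to see why the intermediate numbers grow or shrink so quickly as the degree rises.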
Not all data sets can be accurately modeled with polynomial regression, regardless of the degree of polynomial selected. It helps to understand the origin of the data and the function being modeled. Polynomials are one method of curve fitting. Some other methods are implemented with the Gauss-Newton class.
Linearized Regression
There are some non-linear functions that can be linearized and then solved with polynomial regression. There has been a fair amount of interest in having these functions, so they have been added to a second package called Linearized Regression. The non-linear functions implemented are:
For more about how these functions can be linearized, consult this article.
Weighted Regression
Weighted regression is a method by which some input terms are considered more strongly than others. It can be applied to any regression method. Each input term is assigned a weight. If the weights are all equal then the results are the same as unweighted regression. For more information on the mathematics of weighted regression refer to this write-up.
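As a rough illustration, weighting is commonly expressed by attaching a weight w_i to each squared error. This is the usual textbook form and is only assumed to correspond to what the library does; the write-up referenced above remains the authority. The symbols follow the same conventions as ordinary least squares, with w_i introduced here.

```latex
\[
  E_w(c_0, \dots, c_{n-1})
    = \sum_{i=1}^{N} w_i \bigl( y_i - p(x_i) \bigr)^2 ,
  \qquad
  p(x) = \sum_{j=0}^{n-1} c_j x^j
\]
\[
  \sum_{j=0}^{n-1} c_j \sum_{i=1}^{N} w_i \, x_i^{\,j+k}
    = \sum_{i=1}^{N} w_i \, y_i \, x_i^{\,k} ,
  \qquad k = 0, \dots, n-1 .
\]
```

If every w_i equals the same constant, that constant appears on both sides and cancels, which is why equal weights reproduce the unweighted result.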
The key to weighting is to register a weighting technique. These are classes that implement the interface Weighting. Three such classes are provided:
In addition, the weighting interface is very simple, and new custom methods of weighting can easily be added.
Requirements
Polynomial regression requires PHP 7.3 or above. PHP must have the BC math library. The use of BC arbitrary-precision arithmetic is almost always necessary for regression of degrees higher than 4, or for data sets with thousands of points. The numbers simply get very large or very small and exceed what can be represented by conventional floating-point values.
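A small sketch of the precision problem, assuming the bcmath extension is loaded. It is only an illustration of float versus BCMath arithmetic and does not use the library itself; the variable names are made up for this example.

```php
<?php
// Sketch: where native floats run out of precision and BCMath does not.

// 10^20 + 1 cannot be represented exactly in a 64-bit float,
// so the added 1 is lost before the subtraction happens.
$floatBig    = pow( 10.0, 20 );
$floatResult = ( $floatBig + 1.0 ) - $floatBig;

// The same arithmetic on BCMath decimal strings is exact.
// A scale of 0 is passed explicitly so the result has no fractional digits.
$bcBig    = bcpow( '10', '20', 0 );
$bcResult = bcsub( bcadd( $bcBig, '1', 0 ), $bcBig, 0 );

echo "float  : $floatResult\n";
echo "BCMath : $bcResult\n";
```

The float version loses the added 1 entirely, while the BCMath version keeps it. The sums of high powers used in regression hit this kind of limit quickly.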
Versions prior to 1.3.0 require only PHP 5. However, using the bcscale function to get the current scale was not possible until PHP 7.3, hence the newer requirement.
BCMath is a PHP extension. If manually building PHP, it requires --enable-bcmath in the configuration. For Debian-based systems (including Ubuntu and Mint), install the package php-bcmath (e.g. sudo apt install php-bcmath). A similar package exists for Arch-based distributions, although you might need to specify the version of PHP (e.g. pacman -S php81-bcmath). Windows-based systems have BCMath enabled by default.
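One possible way to confirm the requirements before loading the library is sketched below. It uses only standard PHP functions; the variable names and messages are made up for this example.

```php
<?php
// Sketch: check for the BCMath extension and a new enough PHP version.
$hasBcMath   = extension_loaded( 'bcmath' );
$phpIsRecent = version_compare( PHP_VERSION, '7.3.0', '>=' );

if ( ! $hasBcMath )
{
  echo "The BCMath extension is not loaded.\n";
}
elseif ( ! $phpIsRecent )
{
  echo "PHP 7.3 or above is needed to read the scale back with bcscale().\n";
}
else
{
  // Set the scale, then read it back using bcscale() with no argument.
  bcscale( 10 );
  echo "Requirements met; BCMath scale is " . bcscale() . ".\n";
}
```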
Development
This project is over a decade old, and does not receive many updates. It is not dead, but does its intended task and hasn't required much maintenance.
The polynomial regression class does not have a public code repository on sites such as GitLab or GitHub. This is an open-source project, and as such others are allowed to publish copies to those sites (and have). However, those copies are not maintained by the author. There are no intentions of moving this project to a commercially run code repository. Please contact the maintainers of the external repositories for questions relating to them, as I am not involved and unable to help.
Online
To calculate polynomial regression online from your web browser, use the Online Polynomial Regression calculator page based on this library. It allows calculating and graphing of user-supplied data.
Manual
Documentation is available online, generated from the source code using phpDocumentor. The documentation can be recreated from the source if needed.
Download
Releases are distributed in two formats: XZ for Linux users and Zip for all others. Both archives hold identical information. Older archives use Bzip2 rather than LZMA.
Releases are signed with OpenPGP. The public key for DrQue.net is available here. The SHA-256 hash and CRC-32 are also provided to check archive integrity. Older versions use MD5 and SHA-1. The PGP signatures are made with Andrew Que's key for the year of release, which can be found here.
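As an illustration, the published SHA-256 and CRC-32 values could be checked from PHP along these lines. verifyArchive() is a hypothetical helper written for this example, and the use of the 'crc32b' variant is an assumption about how the listed CRC-32 values were produced. This checks integrity only; it does not replace verifying the PGP signature.

```php
<?php
// Sketch: compare an archive's digests against the published values.
// verifyArchive() is a hypothetical helper, not part of the library.
function verifyArchive( string $path, string $expectedSha256, string $expectedCrc32 ) : bool
{
  if ( ! is_readable( $path ) )
    return false;

  // hash_file() returns lowercase hexadecimal digests.
  $sha256 = hash_file( 'sha256', $path );

  // 'crc32b' is the CRC-32 variant used by zip and gzip tools
  // (assumed here to match the values in the listing).
  $crc32 = hash_file( 'crc32b', $path );

  if ( $sha256 === false || $crc32 === false )
    return false;

  return hash_equals( strtolower( $expectedSha256 ), $sha256 )
      && hash_equals( strtolower( $expectedCrc32 ), $crc32 );
}
```

A call would pass the path of the downloaded archive together with the two values from the release listing and proceed only if the function returns true.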
Current version
Version 1.3.0
Released December 25, 2023. Primarily a refresh update for PHP 8.
- Fixed conversions of E-notation floats to BCMath.
- Files renamed to fit PHP naming convention.
- Spelling corrections.
- Download: XZ release 1.3.0
- File size: 17,928 bytes
- File date: 2023-12-25 04:18:20.351
- CRC-32: 7da901b7
- Source SHA256: 2e95313c678c5405573514ae42a32d8ea05fb24826caccb11846c7eb5003deba
- PGP signature: Source PGP signature

- Download: Zip release 1.3.0
- File size: 27,623 bytes
- File date: 2023-12-25 04:18:20.355
- CRC-32: 7b6a8e25
- Source SHA256: 3d550cd01b7864af635e4c661787b5214b696084966b71e3652b35f3a7379012
- PGP signature: Source PGP signature
Archived versions
Version 1.2.1, Released February 17, 2015.
- Added support for weighted regression.
- Added 3 linearizable regression functions.
- Bug fix to LinearWeighting added in 1.2.1.
BZip 2 (Linux)
Download release 1.2.1
MD5: d3bc84c6445a8d77e78ed1a1c64392a4
SHA1: 4229b43d7f0d87f6d5e608570373e38cabd82d52
Zip (Others)
Download release 1.2.1
MD5: db5ecefe1e98d93f92515e409c53402f
SHA1: f8cdde1ec875dcad3880e0ba96c7c4321bafe57b
Version 1.2, Released February 13, 2015.
- Added support for weighted regression.
- Added 3 linearizable regression functions.
BZip 2 (Linux)
Download release 1.2
MD5: 99339673b4f7e65a3996afe6604d237e
SHA1: f0eec0f17a357595df67f7ba49c4f37bde9420d2
Zip (Others)
Download release 1.2
MD5: c8b491d23d37da3f864b823add3a0860
SHA1: a8c73c1e33245562cfbcc47a3ac6d21b81cd6543
Version 1.1, Released May 5, 2014.
- The method interpolate is now static as it does not need an instance to operate. Useful if coefficients have been calculated elsewhere.
- Deprecated setDegree function. This is the wrong terminology for what the function does. It actually sets the number of coefficients for the polynomial. The degree of the polynomial is the number of coefficients less one. Made the identical function setNumberOfCoefficient to replace it.
- Added getter functions for anything that has a set function.
BZip 2 (Linux)
Download release 1.1
MD5: 799a8c57e9730bbb9618999e2d6d0287
SHA1: 6e1defdc5754670156578578dea477198f7f735d
Zip (Others)
Download release 1.1
MD5: 9357794584c282279e9d6738b4a08831
SHA1: 4735f430731eeb94c0441cbfdc59f6e93e452c59
Version 1.0, Released December 29, 2013.
Added support for forced coefficients.
BZip 2 (Linux)
Download release 1.0
MD5: b9b9d9083d6cb0d1e9a2c6b8a3066ac8
SHA1: 3c7153945b5d90c6e176eebb5f820f7e31f292dd
Zip (Others)
Download release 1.0
MD5: fb5e505d7fbd0915239e54a84bac005a
SHA1: 4feef4f46a36bcf3a27ca254afd1eed4857098ac
Version 0.91, Released May 18, 2013.
Library renamed to better clarify its function.
BZip 2 (Linux)
Download release 0.91
MD5: cc8284107dae45b60cd5e07fe6152991
SHA1: a3027e7bb9783d06e6f506712aaf4158db8baf52
Zip (Others)
Download release 0.91
MD5: be854d254ffd844d20df4c6676cad80b
SHA1: 5b8e59d218968cdf8f593ef1dad4b3ca5541ebbf
Version 0.9, Released June 16, 2012.
Improved performance using Gaussian elimination rather than Cramer's rule to solve the system of equations.
BZip 2 (Linux)
Download release 0.9
MD5: ea3aacc18c1b3086df502823704639d2
SHA1: 69038bd1414777e0e936ea8701a85cecb363ebc1
Zip (Others)
Download release 0.9
MD5: 4193ddd4323277b67acdca3e59dda1f8
SHA1: 3491860264c8caa077fd4ea9517c8ebc139d8b60
Version 0.8, Released June 1, 2009.
BZip 2 (Linux)
Download release 0.8
MD5: 602a40cfba4de4751edf409a7c3e0854
SHA1: c5495d644e32e7a20f0af812675da6eb96619485
Zip (Others)
Download release 0.8
MD5: 17ceb039ff4474dcbf86adac8f103753
SHA1: e8e887a8cd504c32cc8c37c4c2b02364b6b44ee3
Examples
Several of these examples include some functions from plot.php, which uses XY Plot to draw charts.
Linear Regression.
Linear regression is one of the simplest forms of polynomial regression. It produces a 1st degree polynomial with the two coefficients usually called slope and intercept.
<?php
// Load the polynomial regression class.
require_once( __DIR__ . '/../vendor/polynomialRegression/polynomial-regression.php' );
$data =
array
(
array( 0.00, 27.3834562958158 ), array( 0.02, 38.2347360741764 ),
array( 0.04, 42.5632501679666 ), array( 0.06, 19.4638760104114 ),
array( 0.08, 42.690858098909 ), array( 0.10, 25.330634164557 ),
array( 0.12, 49.6507591632989 ), array( 0.14, 34.3502467856792 ),
array( 0.16, 52.5267153107089 ), array( 0.18, 34.5528919545231 ),
array( 0.20, 44.3220950255077 ), array( 0.22, 44.7805694031715 ),
array( 0.24, 32.9090525820585 ), array( 0.26, 56.7941323051778 ),
array( 0.28, 48.7192221569495 ), array( 0.30, 48.7964850888813 ),
array( 0.32, 56.8905173101315 ), array( 0.34, 66.0107252116092 ),
array( 0.36, 74.3149331561425 ), array( 0.38, 52.9076168019644 ),
array( 0.40, 64.3463647026162 ), array( 0.42, 50.0776706625628 ),
array( 0.44, 62.3527806092493 ), array( 0.46, 75.9589658430523 ),
array( 0.48, 69.280743962744 ), array( 0.50, 74.4868159870338 ),
array( 0.52, 76.4548504742096 ), array( 0.54, 82.9347555390181 ),
array( 0.56, 83.9546576353049 ), array( 0.58, 83.6379624022705 ),
array( 0.60, 92.6278811310654 ), array( 0.62, 84.3395153143048 ),
array( 0.64, 86.832363003336 ), array( 0.66, 105.66563124607 ),
array( 0.68, 100.175129109663 ), array( 0.70, 82.0781941886623 ),
array( 0.72, 95.9916212989616 ), array( 0.74, 87.5853932119967 ),
array( 0.76, 93.5435091554247 ), array( 0.78, 98.0622114645327 ),
array( 0.80, 118.067000253198 ), array( 0.82, 98.2918886287489 ),
array( 0.84, 111.027863906934 ), array( 0.86, 113.1135947538 ),
array( 0.88, 117.777915259186 ), array( 0.90, 108.621331147219 ),
array( 0.92, 112.979639159754 ), array( 0.94, 122.065499190418 ),
array( 0.96, 116.136221596622 ), array( 0.98, 111.215762010712 ),
array( 1.00, 122.743302375187 )
);
// Precision digits in BC math.
bcscale( 10 );
// Start a regression class with 2 coefficients (1st degree): linear regression.
$PolynomialRegression = new PolynomialRegression( 2 );
// Add all the data to the regression analysis.
foreach ( $data as $dataPoint )
$PolynomialRegression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
// Get coefficients for the polynomial.
$coefficients = $PolynomialRegression->getCoefficients();
// Print slope and intercept of linear regression.
echo "Slope : " . round( $coefficients[ 1 ], 2 ) . "<br />";
echo "Y-intercept : " . round( $coefficients[ 0 ], 2 ) . "<br />";
In this example, 51 data points are used to construct a linear regression. The slope and y-intercept of the trend are then displayed.
The image above was created in a spreadsheet with the data points from the example. The linear regression trend line is displayed, along with the trend line's function.
Y-intercept : 26.55
This is the output from the example. Note how the slope and intercept values match those of the function in the spreadsheet created chart.
There is not much reason to use this library to compute linear regression, as there are far faster implementations. However, for data sets that have very large numbers, or when high accuracy is needed, this library may be useful.
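For comparison, a faster float-only approach might look like the following closed-form sketch. linearFit() is a hypothetical helper written for this example and is not part of the library; it returns the coefficients in the same order the example above reads them (index 0 is the intercept, index 1 is the slope).

```php
<?php
// Sketch: closed-form simple linear regression using native floats.
// linearFit() is a hypothetical helper, not part of the library.
function linearFit( array $data ) : array
{
  $count = count( $data );
  $sumX = $sumY = $sumXX = $sumXY = 0.0;

  // Accumulate the four sums the closed-form solution needs.
  foreach ( $data as $point )
  {
    $sumX  += $point[ 0 ];
    $sumY  += $point[ 1 ];
    $sumXX += $point[ 0 ] * $point[ 0 ];
    $sumXY += $point[ 0 ] * $point[ 1 ];
  }

  $denominator = $count * $sumXX - $sumX * $sumX;
  if ( $count < 2 || $denominator == 0.0 )
    throw new InvalidArgumentException( "Need at least two distinct x values." );

  $slope     = ( $count * $sumXY - $sumX * $sumY ) / $denominator;
  $intercept = ( $sumY - $slope * $sumX ) / $count;

  return array( $intercept, $slope );
}
```

This runs in a single pass with no arbitrary-precision arithmetic, which is where the speed comes from, and also why it can lose accuracy on data with very large or very small values.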
Third Degree Polynomial.
<?php
// Load the polynomial regression class.
require_once( __DIR__ . '/../vendor/polynomialRegression/polynomial-regression.php' );
require_once( 'plot.php' );
// Data created in a spreadsheet with some random scatter. True function should be:
// f( x ) = 0.65 + 0.6 x - 6.25 x^2 + 6 x^3
$data =
array
(
array( 0.00, 0.65646507 ), array( 0.05, 0.61435503 ),
array( 0.10, 0.63151965 ), array( 0.15, 0.57711365 ),
array( 0.20, 0.58534249 ), array( 0.25, 0.54148715 ),
array( 0.30, 0.43877649 ), array( 0.35, 0.39516968 ),
array( 0.40, 0.24977940 ), array( 0.45, 0.24246690 ),
array( 0.50, 0.07730788 ), array( 0.55, 0.03633931 ),
array( 0.60, 0.08980716 ), array( 0.65, 0.07562991 ),
array( 0.70, 0.11196788 ), array( 0.75, 0.15086596 ),
array( 0.80, 0.19979455 ), array( 0.85, 0.34683801 ),
array( 0.90, 0.48338650 ), array( 0.95, 0.59196113 ),
array( 1.00, 0.99233320 )
);
// Precision digits in BC math.
bcscale( 10 );
// Start a regression class with 4 coefficients: a 3rd degree polynomial.
$polynomialRegression = new PolynomialRegression( 4 );
// Add all the data to the regression analysis.
foreach ( $data as $dataPoint )
$polynomialRegression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( 0, 1 );
$plot->setY_Span( 0, 1 );
$Y_MajorScale = 0.1;
$Y_MinorScale = $Y_MajorScale / 5;
$X_MajorScale = 0.1;
$X_MinorScale = $X_MajorScale / 5;
plotAddScale();
// Render the points to image
$plot->setCircleSize( 8 );
$plot->renderPoints();
// Get coefficients for the polynomial.
$coefficients = $polynomialRegression->getCoefficients();
$functionText = "f( x ) = ";
foreach ( $coefficients as $power => $coefficient )
{
if ( $power > 0 )
$functionText .= " + ";
$functionText .= round( $coefficient, 2 );
if ( $power > 0 )
{
$functionText .= "x";
if ( $power > 1 )
$functionText .= "^" . $power;
}
}
// Place text
imageString
(
$image,
2,
$leftMargin + 2,
$topMargin + 2,
$functionText,
$colorMap[ "Black" ]
);
plotRenderRegression( $polynomialRegression, $coefficients, 0, 1, $colorMap[ "LightRed" ] );
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
This example starts with the knowledge that the data was generated by a 3rd degree polynomial. The data consists of 21 samples close to the function f( x ) = 6 x^3 - 6.25 x^2 + 0.6 x + 0.65 with some random noise added. The regression analysis attempts to reconstruct the coefficients of the original function.
The graph shows the input data as red circles, and the regression plot as the red line. The function with the interpolated coefficients is printed at the top. The coefficients of this function are fairly close to the original.
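To show what a regression like this involves, here is a minimal float-only sketch that builds the least-squares normal equations and solves them with Gaussian elimination. polynomialFit() is a hypothetical helper written for this example; it is not the library's implementation, which uses BCMath, and it is only suitable for low degrees and well-scaled data.

```php
<?php
// Sketch: polynomial least squares with native floats.
// polynomialFit() is a hypothetical helper, not the library's code.
function polynomialFit( array $data, int $numberOfCoefficients ) : array
{
  $n = $numberOfCoefficients;

  // Augmented normal-equation matrix. Row k holds
  //   sum( x^(j+k) ) for j = 0..n-1, followed by sum( y * x^k ).
  $matrix = array_fill( 0, $n, array_fill( 0, $n + 1, 0.0 ) );
  foreach ( $data as $point )
  {
    list( $x, $y ) = $point;
    for ( $row = 0; $row < $n; ++$row )
    {
      for ( $column = 0; $column < $n; ++$column )
        $matrix[ $row ][ $column ] += pow( $x, $row + $column );

      $matrix[ $row ][ $n ] += $y * pow( $x, $row );
    }
  }

  // Forward elimination with partial pivoting.
  for ( $pivot = 0; $pivot < $n; ++$pivot )
  {
    // Find the row with the largest magnitude in this column.
    $best = $pivot;
    for ( $row = $pivot + 1; $row < $n; ++$row )
      if ( abs( $matrix[ $row ][ $pivot ] ) > abs( $matrix[ $best ][ $pivot ] ) )
        $best = $row;

    $swap             = $matrix[ $pivot ];
    $matrix[ $pivot ] = $matrix[ $best ];
    $matrix[ $best ]  = $swap;

    if ( $matrix[ $pivot ][ $pivot ] == 0.0 )
      throw new RuntimeException( "Singular system; not enough distinct data." );

    // Remove this column from every row below the pivot.
    for ( $row = $pivot + 1; $row < $n; ++$row )
    {
      $factor = $matrix[ $row ][ $pivot ] / $matrix[ $pivot ][ $pivot ];
      for ( $column = $pivot; $column <= $n; ++$column )
        $matrix[ $row ][ $column ] -= $factor * $matrix[ $pivot ][ $column ];
    }
  }

  // Back substitution, from the last coefficient to the first.
  $coefficients = array_fill( 0, $n, 0.0 );
  for ( $row = $n - 1; $row >= 0; --$row )
  {
    $sum = $matrix[ $row ][ $n ];
    for ( $column = $row + 1; $column < $n; ++$column )
      $sum -= $matrix[ $row ][ $column ] * $coefficients[ $column ];

    $coefficients[ $row ] = $sum / $matrix[ $row ][ $row ];
  }

  return $coefficients;
}
```

Calling polynomialFit( $data, 4 ) returns the coefficients in ascending power order, so index 0 is the constant term. Noise-free samples of a cubic come back essentially unchanged; with noisy data the result is only an estimate.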
Calculating R-Squared.
<?php
// Load the polynomial regression class.
require_once( __DIR__ . '/../vendor/polynomialRegression/polynomial-regression.php' );
$data =
array
(
array( 0.00, 27.3834562958158 ), array( 0.02, 38.2347360741764 ),
array( 0.04, 42.5632501679666 ), array( 0.06, 19.4638760104114 ),
array( 0.08, 42.690858098909 ), array( 0.10, 25.330634164557 ),
array( 0.12, 49.6507591632989 ), array( 0.14, 34.3502467856792 ),
array( 0.16, 52.5267153107089 ), array( 0.18, 34.5528919545231 ),
array( 0.20, 44.3220950255077 ), array( 0.22, 44.7805694031715 ),
array( 0.24, 32.9090525820585 ), array( 0.26, 56.7941323051778 ),
array( 0.28, 48.7192221569495 ), array( 0.30, 48.7964850888813 ),
array( 0.32, 56.8905173101315 ), array( 0.34, 66.0107252116092 ),
array( 0.36, 74.3149331561425 ), array( 0.38, 52.9076168019644 ),
array( 0.40, 64.3463647026162 ), array( 0.42, 50.0776706625628 ),
array( 0.44, 62.3527806092493 ), array( 0.46, 75.9589658430523 ),
array( 0.48, 69.280743962744 ), array( 0.50, 74.4868159870338 ),
array( 0.52, 76.4548504742096 ), array( 0.54, 82.9347555390181 ),
array( 0.56, 83.9546576353049 ), array( 0.58, 83.6379624022705 ),
array( 0.60, 92.6278811310654 ), array( 0.62, 84.3395153143048 ),
array( 0.64, 86.832363003336 ), array( 0.66, 105.66563124607 ),
array( 0.68, 100.175129109663 ), array( 0.70, 82.0781941886623 ),
array( 0.72, 95.9916212989616 ), array( 0.74, 87.5853932119967 ),
array( 0.76, 93.5435091554247 ), array( 0.78, 98.0622114645327 ),
array( 0.80, 118.067000253198 ), array( 0.82, 98.2918886287489 ),
array( 0.84, 111.027863906934 ), array( 0.86, 113.1135947538 ),
array( 0.88, 117.777915259186 ), array( 0.90, 108.621331147219 ),
array( 0.92, 112.979639159754 ), array( 0.94, 122.065499190418 ),
array( 0.96, 116.136221596622 ), array( 0.98, 111.215762010712 ),
array( 1.00, 122.743302375187 )
);
// Precision digits in BC math.
bcscale( 10 );
// Start a regression class with 2 coefficients (1st degree): linear regression.
$leastSquareRegression = new PolynomialRegression( 2 );
// Add all the data to the regression analysis.
foreach ( $data as $dataPoint )
$leastSquareRegression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
// Get coefficients for the polynomial.
$coefficients = $leastSquareRegression->getCoefficients();
// Print slope and intercept of linear regression.
echo "Slope : " . round( $coefficients[ 1 ], 2 ) . "<br />\n";
echo "Y-intercept : " . round( $coefficients[ 0 ], 2 ) . "<br />\n";
//
// Get average of Y-data.
//
$Y_Average = 0.0;
foreach ( $data as $dataPoint )
$Y_Average += $dataPoint[ 1 ];
$Y_Average /= count( $data );
//
// Calculate R Squared.
//
$Y_MeanSum = 0.0;
$Y_ErrorSum = 0.0;
foreach ( $data as $dataPoint )
{
$x = $dataPoint[ 0 ];
$y = $dataPoint[ 1 ];
$error = $y;
$error -= $leastSquareRegression->interpolate( $coefficients, $x );
$Y_ErrorSum += $error * $error;
$error = $y;
$error -= $Y_Average;
$Y_MeanSum += $error * $error;
}
$R_Squared = 1.0 - ( $Y_ErrorSum / $Y_MeanSum );
echo "R Squared : $R_Squared<br />\n";
This example shows how to compute the Coefficient of determination (generally called R-Squared) after the coefficients have been calculated. This value is one representation of the goodness of fit. The closer this value is to 1.0, the better the fit.
Y-intercept : 26.55
R Squared : 0.92618245728437
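The same calculation can be packaged as a reusable function, roughly as sketched here. rSquared() is a hypothetical helper written for this example and is not part of the library; $predict is any callable that maps an x value to the fitted y value.

```php
<?php
// Sketch: coefficient of determination for any fitted function.
// rSquared() is a hypothetical helper, not part of the library.
function rSquared( array $data, callable $predict ) : float
{
  if ( count( $data ) === 0 )
    throw new InvalidArgumentException( "No data." );

  // Mean of the y column.
  $mean = array_sum( array_column( $data, 1 ) ) / count( $data );

  $residualSum = 0.0;  // Squared error against the fitted function.
  $totalSum    = 0.0;  // Squared error against the mean.
  foreach ( $data as $point )
  {
    $residual  = $point[ 1 ] - $predict( $point[ 0 ] );
    $deviation = $point[ 1 ] - $mean;

    $residualSum += $residual * $residual;
    $totalSum    += $deviation * $deviation;
  }

  if ( $totalSum == 0.0 )
    throw new InvalidArgumentException( "All y values are identical." );

  return 1.0 - $residualSum / $totalSum;
}
```

With the library, $predict could be a closure that wraps the interpolate call used in the example above. A perfect fit gives 1.0, and a fit no better than the mean gives 0.0.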
Linear Regression with Forced Intercept.
There are times when it is known that the intercept of the function is zero, but the calculated coefficient for the offset is not. For this one can use the function setForcedCoefficient( 0, 0 ). This is a typical example involving linear regression of a noisy set of data points.
<?php
// Load the polynomial regression class.
require_once( __DIR__ . '/../vendor/polynomialRegression/polynomial-regression.php' );
require_once( 'plot.php' );
$data =
array
(
array( 0.05, 0.1924787314 ), array( 0.10, 0.4586186921 ),
array( 0.15, 0.1318838557 ), array( 0.20, 0.1865927433 ),
array( 0.25, 0.4667421897 ), array( 0.30, 0.1027880072 ),
array( 0.35, 0.5599968985 ), array( 0.40, 0.6605423892 ),
array( 0.45, 0.620103306 ), array( 0.50, 0.4445367125 ),
array( 0.55, 0.5912679423 ), array( 0.60, 0.7942020837 ),
array( 0.65, 0.8694575373 ), array( 0.70, 0.4146043937 ),
array( 0.75, 0.6604661468 ), array( 0.80, 0.9138025779 ),
array( 0.85, 0.8124334151 ), array( 0.90, 0.7998087715 ),
array( 0.95, 0.7391285236 ), array( 1.00, 0.9012208138 ),
);
// Precision digits in BC math.
bcscale( 10 );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( 0, 1 );
$plot->setY_Span( 0, 1 );
plotAddScale();
// Render the points to image
$plot->setCircleSize( 8 );
$plot->renderPoints();
// Start two regression classes with 2 coefficients each (linear): one with no
// forced coefficients, one with the intercept forced to zero.
$regression1 = new PolynomialRegression( 2 );
$regression2 = new PolynomialRegression( 2 );
$regression2->setForcedCoefficient( 0, 0 );
// Add all the data to both regression analyses.
foreach ( $data as $dataPoint )
{
$regression1->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
$regression2->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
}
// Get coefficients for the polynomial.
$coefficients1 = $regression1->getCoefficients();
$coefficients2 = $regression2->getCoefficients();
// Plot each of the curves.
plotRenderRegression( $regression1, $coefficients1, 0, 1, $colorMap[ "Green" ] );
plotRenderRegression( $regression2, $coefficients2, 0, 1, $colorMap[ "LightBlue" ] );
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
In the graph, the green line shows linear regression, while the blue line shows linear regression with the intercept forced to zero. The actual slope is 1, but the signal-to-noise ratio is very low.
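For a line forced through the origin, the least-squares slope has a simple closed form, sketched below with native floats. slopeThroughOrigin() is a hypothetical helper written for this example; it illustrates the idea and is not how the library computes forced coefficients.

```php
<?php
// Sketch: least-squares slope for y = m x (intercept fixed at zero).
// slopeThroughOrigin() is a hypothetical helper, not part of the library.
function slopeThroughOrigin( array $data ) : float
{
  $sumXX = 0.0;
  $sumXY = 0.0;
  foreach ( $data as $point )
  {
    $sumXX += $point[ 0 ] * $point[ 0 ];
    $sumXY += $point[ 0 ] * $point[ 1 ];
  }

  if ( $sumXX == 0.0 )
    throw new InvalidArgumentException( "At least one x value must be non-zero." );

  // Minimizing sum( ( y - m x )^2 ) over m gives m = sum( x y ) / sum( x^2 ).
  return $sumXY / $sumXX;
}
```

Because there is only one free coefficient, points far from the origin dominate the result, which is part of why the forced line can differ visibly from the unforced one on noisy data.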
Forced coefficients.
In addition to being able to force a zero offset, it is possible to set any coefficient to a known value. This will allow the other coefficients to be determined by the regression analysis. Thus if it is known that one of the coefficients must be a specific value, the remaining coefficients will take this into account.
As with forcing an intercept, the function setForcedCoefficient is used. The first parameter is which coefficient is to be forced to the known value, and the second parameter is the value. More than one coefficient may be forced if desired.
<?php
// Load the polynomial regression class.
require_once( __DIR__ . '/../vendor/polynomialRegression/polynomial-regression.php' );
require_once( 'plot.php' );
$data =
array
(
array( 0.00, 0.65379741 ), array( 0.05, 0.64074062 ),
array( 0.10, 0.72833783 ), array( 0.15, 0.44629689 ),
array( 0.20, 0.45174500 ), array( 0.25, 0.34161602 ),
array( 0.30, 0.78621158 ), array( 0.35, 0.38960121 ),
array( 0.40, 0.14126441 ), array( 0.45, 0.38123106 ),
array( 0.50, 0.20605429 ), array( 0.55, 0.02456525 ),
array( 0.60, 0.48434811 ), array( 0.65, 0.21453304 ),
array( 0.70, 0.54765807 ), array( 0.75, 0.41625294 ),
array( 0.80, 0.78163483 ), array( 0.85, 0.71306009 ),
array( 0.90, 0.53515664 ), array( 0.95, 0.98918384 ),
array( 1.00, 0.93061202 )
);
// The actual coefficients for the above data (without noise).
$trueCoefficients = array( 0.9, -2, 0.6, 1.5 );
// Precision digits in BC math.
bcscale( 10 );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( 0, 1 );
$plot->setY_Span( 0, 1 );
plotAddScale();
// Render the points to image
$plot->setCircleSize( 4 );
$plot->renderPoints();
// Start two regression classes with 4 coefficients each: one with no forced
// coefficients, one with two forced coefficients.
$regression1 = new PolynomialRegression( 4 );
$regression2 = new PolynomialRegression( 4 );
$regression2->setForcedCoefficient( 1, -2 );
$regression2->setForcedCoefficient( 3, 1.5 );
// Add all the data to both regression analyses.
foreach ( $data as $dataPoint )
{
$regression1->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
$regression2->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
}
// Get coefficients for the polynomial.
$coefficients1 = $regression1->getCoefficients();
$coefficients2 = $regression2->getCoefficients();
// Plot each of the curves.
plotRenderRegression( $regression1, $trueCoefficients, 0, 1, $colorMap[ "LightRed" ] );
plotRenderRegression( $regression1, $coefficients1, 0, 1, $colorMap[ "Green" ] );
plotRenderRegression( $regression2, $coefficients2, 0, 1, $colorMap[ "LightBlue" ] );
$y = $imageHeight - $bottomMargin;
printFunction( $y, 3, $colorMap[ "LightRed" ], $trueCoefficients );
printFunction( $y, 2, $colorMap[ "Green" ], $coefficients1 );
printFunction( $y, 1, $colorMap[ "LightBlue" ], $coefficients2 );
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
function printFunction( $y, $line, $color, $coefficients )
{
global $image;
global $leftMargin;
$functionText = "f( x ) = ";
foreach ( $coefficients as $power => $coefficient )
{
if ( $power > 0 )
$functionText .= " + ";
$functionText .= number_format( $coefficient, 2 );
if ( $power > 0 )
{
$functionText .= "x";
if ( $power > 1 )
$functionText .= "^" . $power;
}
}
// Place text
imageString
(
$image,
2,
$leftMargin + 2,
$y - $line * imagefontheight( 2 ),
$functionText,
$color
);
}
In this example, regression is performed on a noisy data set generated from a 3rd degree polynomial. The true coefficients are (0.9, -2, 0.6, 1.5). The coefficients are first determined without any forced terms, and then by forcing two of the terms to known values.
The graph displays the input data as red circles. The red line is the true curve. The green line is the regression with no known coefficients, and the blue line is the regression with two forced coefficients. As expected, the blue line conforms more closely to the true curve represented by the red line.
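One common way to express forced coefficients mathematically is to move the known terms to the data side and fit only what remains. This is a sketch of that idea and is only assumed to be equivalent to what the library does; the set F of forced indices and the adjusted value y'_i are notation introduced here.

```latex
Let $F$ be the set of forced indices, with known values $c_j$ for $j \in F$.
\[
  y'_i = y_i - \sum_{j \in F} c_j x_i^{\,j} ,
  \qquad
  E = \sum_{i=1}^{N} \Bigl( y'_i - \sum_{j \notin F} c_j x_i^{\,j} \Bigr)^2
\]
Minimizing $E$ over the free coefficients gives, for each $k \notin F$,
\[
  \sum_{j \notin F} c_j \sum_{i=1}^{N} x_i^{\,j+k}
    = \sum_{i=1}^{N} y'_i \, x_i^{\,k} .
\]
For the example above, $F = \{1, 3\}$ with $c_1 = -2$ and $c_3 = 1.5$, so
\[
  y'_i = y_i + 2 x_i - 1.5 x_i^{3} ,
\]
and only $c_0$ and $c_2$ remain to be fitted.
```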
Linearized Regression.
The linearized regression classes are children of the polynomial regression class and override some of its functions in order to perform linearization, so their operation is almost identical to that of the polynomial regression class.
<?php
// Load the exponential regression class.
require_once( __DIR__ . '/../vendor/polynomialRegression/exponential-regression.php' );
require_once( 'plot.php' );
$data =
array
(
array( 0.00, 0.024094775 ), array( 0.05, 0.0390894172 ),
array( 0.10, 0.0524281705 ), array( 0.15, 0.0094749558 ),
array( 0.20, 0.1342814605 ), array( 0.25, 0.0181198568 ),
array( 0.30, 0.032552131 ), array( 0.35, 0.0227223143 ),
array( 0.40, 0.1169744975 ), array( 0.45, 0.1226243145 ),
array( 0.50, 0.1427587983 ), array( 0.55, 0.1497210208 ),
array( 0.60, 0.1727192031 ), array( 0.65, 0.3031739468 ),
array( 0.70, 0.2400640511 ), array( 0.75, 0.3650339253 ),
array( 0.80, 0.4659496711 ), array( 0.85, 0.5082614871 ),
array( 0.90, 0.6841058006 ), array( 0.95, 0.7940730517 ),
);
// Precision digits in BC math.
bcscale( 10 );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( 0, 1 );
$plot->setY_Span( 0, 1 );
plotAddScale();
// Render the points to image
$plot->setCircleSize( 8 );
$plot->renderPoints();
// Create instance of linearized regression.
$regression = new ExpRegression();
// Add all the data to the regression analysis.
foreach ( $data as $dataPoint )
$regression->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
// Get the resulting coefficients.
$coefficients = $regression->getCoefficients();
// Plot each of the curves.
plotRenderRegression
(
$regression,
$coefficients,
0,
1,
$colorMap[ "Green" ],
"ExpRegression"
);
$string =
"f( x ) = "
. number_format( $coefficients[ 0 ], 4 )
. " exp( "
. number_format( $coefficients[ 1 ], 4 )
. " x )";
// Place text
imageString
(
$image,
2,
$leftMargin + 2,
10,
$string,
$colorMap[ "Green" ]
);
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
The basic mechanics of the polynomial regression class are used because in the linearized form, this function turns into a 1st degree polynomial. For this reason the number of coefficients does not need to be specified.
The graph shows a noisy signal and the exponential curve calculated from y = a e^( b x ). The actual noisy data used y = 0.18 e^( 4 x ).
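The linearization itself can be sketched in a few lines: taking the logarithm of y = a e^( b x ) gives ln( y ) = ln( a ) + b x, which is a straight line in x. exponentialFit() below is a hypothetical float-only helper written for this example, not the library's ExpRegression class; it returns the result in the order ( a, b ), mirroring how the example above prints its two coefficients.

```php
<?php
// Sketch: fit y = a * exp( b * x ) by fitting a line to ( x, ln y ).
// exponentialFit() is a hypothetical helper, not part of the library.
function exponentialFit( array $data ) : array
{
  $count = count( $data );
  $sumX = $sumY = $sumXX = $sumXY = 0.0;

  foreach ( $data as $point )
  {
    // The logarithm only exists for positive y values.
    if ( $point[ 1 ] <= 0 )
      throw new InvalidArgumentException( "All y values must be positive." );

    $x = $point[ 0 ];
    $y = log( $point[ 1 ] );

    $sumX  += $x;
    $sumY  += $y;
    $sumXX += $x * $x;
    $sumXY += $x * $y;
  }

  $denominator = $count * $sumXX - $sumX * $sumX;
  if ( $count < 2 || $denominator == 0.0 )
    throw new InvalidArgumentException( "Need at least two distinct x values." );

  // Ordinary linear regression on the transformed data.
  $slope     = ( $count * $sumXY - $sumX * $sumY ) / $denominator;
  $intercept = ( $sumY - $slope * $sumX ) / $count;

  // Undo the transform: the intercept is ln( a ), the slope is b.
  return array( exp( $intercept ), $slope );
}
```

Note that this minimizes the error in ln( y ) rather than in y, so small y values carry more influence than they would in a direct fit.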
Weighted Regression.
One application of weighting is to assist the linearized version of the power function. This function often responds poorly to noise and does not resolve to a good fit because the error of the linearized version is being minimized, not that of the actual function. A workaround is to weight the terms unequally before solving.
<?php
// Load the polynomial regression class.
require_once( __DIR__ . '/../vendor/polynomialRegression/polynomial-regression.php' );
require_once( __DIR__ . '/../vendor/polynomialRegression/pow-regression.php' );
require_once( __DIR__ . '/../vendor/polynomialRegression/exponentiation-weighting.php' );
require_once( 'plot.php' );
$data =
array
(
array( 0.05,0.00604730001 ),
array( 0.10,0.00368496403 ),
array( 0.15,0.00149732550 ),
array( 0.20,0.00750937272 ),
array( 0.25,0.01402765100 ),
array( 0.30,0.00460214218 ),
array( 0.35,0.01895682587 ),
array( 0.40,0.04611466211 ),
array( 0.45,0.06140241681 ),
array( 0.50,0.05753703495 ),
array( 0.55,0.10084107155 ),
array( 0.60,0.14016251588 ),
array( 0.65,0.18072751735 ),
array( 0.70,0.23557998528 ),
array( 0.75,0.30045147211 ),
array( 0.80,0.40979875947 ),
array( 0.85,0.51324006361 ),
array( 0.90,0.65069131055 ),
array( 0.95,0.81135826051 ),
array( 1.00,1.00234398314 ),
);
// Precision digits in BC math.
bcscale( 10 );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( 0, 1 );
$plot->setY_Span( 0, 1 );
plotAddScale();
// Render the points to image
$plot->setCircleSize( 8 );
$plot->renderPoints();
// Create two power-of regression classes. One will be weighted, the other
// will not.
$regression1 = new PowRegression();
$regression2 = new PowRegression();
// Use exponentiation weighting on first regression class.
$powerWeightedRegression = new ExponentiationWeighting( 4 );
$regression1->setWeighting( $powerWeightedRegression );
// Add all the data to both regression analyses.
foreach ( $data as $dataPoint )
{
$regression1->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
$regression2->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
}
// Get coefficients for the polynomial.
$coefficients1 = $regression1->getCoefficients();
$coefficients2 = $regression2->getCoefficients();
printFunction( 0, $colorMap[ "Red" ], array( 1, 4 ), " true" );
printFunction( 1, $colorMap[ "Green" ], $coefficients1, " weighted" );
printFunction( 2, $colorMap[ "Blue" ], $coefficients2, " unweighted" );
// Plot each of the curves.
plotRenderRegression
(
$regression1,
$coefficients1,
0,
1,
$colorMap[ "Green" ],
"PowRegression"
);
// Plot each of the curves.
plotRenderRegression
(
$regression2,
$coefficients2,
0,
1,
$colorMap[ "Blue" ],
"PowRegression"
);
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
function printFunction( $line, $color, $coefficients, $text )
{
global $image;
global $leftMargin;
global $topMargin;
$functionText = "f( x ) = "
. number_format( $coefficients[ 0 ], 6 )
. " x^"
. number_format( $coefficients[ 1 ], 6 )
. $text;
// Place text
imageString
(
$image,
2,
$leftMargin + 2,
$topMargin + $line * imagefontheight( 2 ),
$functionText,
$color
);
}
Here exponentiation weighting is used to weight the terms on the right side much more than those on the left.
After declaring an instance of the weighting class it can be assigned to the regression class using the setWeighting function.
In this example the red dots represent noisy data. The blue plot is the standard linearized regression, which produces a poor fit. The green trace shows a weighted regression which fits much better.
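The effect of weighting on a linearized power fit can be sketched independently of the library. weightedPowerFit() below is a hypothetical float-only helper written for this example; the weights are supplied directly by the caller, and no assumption is made about how ExponentiationWeighting derives its own weights.

```php
<?php
// Sketch: weighted least-squares fit of y = a * x^b through its linear form
//   ln( y ) = ln( a ) + b ln( x ).
// weightedPowerFit() is a hypothetical helper, not part of the library.
// $weights[ i ] is the caller-supplied weight for $data[ i ].
function weightedPowerFit( array $data, array $weights ) : array
{
  $sumW = $sumWX = $sumWY = $sumWXX = $sumWXY = 0.0;

  foreach ( $data as $index => $point )
  {
    // Both logarithms must exist.
    if ( $point[ 0 ] <= 0 || $point[ 1 ] <= 0 )
      throw new InvalidArgumentException( "x and y must be positive." );

    $weight = $weights[ $index ];
    $x      = log( $point[ 0 ] );
    $y      = log( $point[ 1 ] );

    $sumW   += $weight;
    $sumWX  += $weight * $x;
    $sumWY  += $weight * $y;
    $sumWXX += $weight * $x * $x;
    $sumWXY += $weight * $x * $y;
  }

  $denominator = $sumW * $sumWXX - $sumWX * $sumWX;
  if ( $denominator == 0.0 )
    throw new InvalidArgumentException( "Need at least two distinct x values with non-zero weight." );

  // Weighted linear regression on the transformed data.
  $slope     = ( $sumW * $sumWXY - $sumWX * $sumWY ) / $denominator;
  $intercept = ( $sumWY - $slope * $sumWX ) / $sumW;

  // Undo the transform: the intercept is ln( a ), the slope is b.
  return array( exp( $intercept ), $slope );
}
```

Giving larger weights to points with larger y values counteracts the way the logarithm exaggerates noise near zero, which is the kind of correction the weighted trace in the graph illustrates.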
Uniquely weighted Regression.
In this example the unique weighting class is applied to a data set.
<?php
// Load the polynomial regression class.
require_once( __DIR__ . '/../vendor/polynomialRegression/polynomial-regression.php' );
require_once( __DIR__ . '/../vendor/polynomialRegression/pow-regression.php' );
require_once( __DIR__ . '/../vendor/polynomialRegression/unique-weighting.php' );
require_once( 'plot.php' );
// The data consists of three columns: x, y, and weight
// y is a noisy version of 0.5 x^2 - 0.2 x - 0.1.
$data =
array
(
array( -1.00, 0.429082372117, 0.569892634364 ), array( -0.98, 0.436345120319, 0.636378047300 ),
array( -0.96, 0.946300417341, 0.223095864079 ), array( -0.94, 0.488585724422, 0.881383015552 ),
array( -0.92, 0.507196009419, 0.999988028306 ), array( -0.90, 0.521656533309, 0.894012248939 ),
array( -0.88, -0.432993635624, 0.001118592602 ), array( -0.86, 0.620720965283, 0.553547494305 ),
array( -0.84, 0.307855315214, 0.697994672029 ), array( -0.82, -0.093627344638, 0.129686879222 ),
array( -0.80, 0.359517424353, 0.939802287599 ), array( -0.78, 0.097336113317, 0.400537391569 ),
array( -0.76, 0.358160609220, 0.948817112269 ), array( -0.74, 0.285387206942, 0.894691015908 ),
array( -0.72, -0.274064421286, 0.075545117800 ), array( -0.70, 0.277578108855, 0.977899171137 ),
array( -0.68, 0.233848248484, 0.903244664976 ), array( -0.66, 0.658390358233, 0.206854608847 ),
array( -0.64, 0.221625014019, 0.966848287456 ), array( -0.62, 0.389360858446, 0.565279299925 ),
array( -0.60, 0.189432485206, 0.968631292625 ), array( -0.58, 0.180331548781, 0.988439483196 ),
array( -0.56, 0.164248498341, 0.986407549234 ), array( -0.54, 0.097501356331, 0.840434240181 ),
array( -0.52, 0.119582406844, 0.942294220586 ), array( -0.50, 0.076778447408, 0.862199166178 ),
array( -0.48, 0.111391748860, 0.999424863716 ), array( -0.46, 0.218096278888, 0.680783916604 ),
array( -0.44, 0.220927612807, 0.644686798859 ), array( -0.42, 0.050800637625, 0.937165911538 ),
array( -0.40, 0.133490346077, 0.795334545733 ), array( -0.38, 0.050325561687, 0.993636859372 ),
array( -0.36, 0.125172166553, 0.757622265013 ), array( -0.34, 0.028406160314, 0.992201877572 ),
array( -0.32, -0.632114422481, 0.043869542050 ), array( -0.30, -0.377195572311, 0.235805021628 ),
array( -0.28, -0.838761757082, 0.004577458194 ), array( -0.26, 0.025756790679, 0.884855470507 ),
array( -0.24, -0.129885768151, 0.712873973182 ), array( -0.22, -0.091615936230, 0.831072011789 ),
array( -0.20, -0.042166426936, 0.993514789242 ), array( -0.18, -0.731100282906, 0.031764573338 ),
array( -0.16, -0.074678621279, 0.942694995709 ), array( -0.14, -0.010666450671, 0.853229614140 ),
array( -0.12, -0.423784984117, 0.268354866384 ), array( -0.10, -0.014367098462, 0.828907433941 ),
array( -0.08, -0.079386336384, 0.995765001662 ), array( -0.06, 0.366272140354, 0.164141601459 ),
array( -0.04, -0.035810132564, 0.842864571750 ), array( -0.02, 0.173016713505, 0.390911788143 ),
array( 0.00, -0.190364896737, 0.752664850487 ), array( 0.02, -0.507882832432, 0.211620478052 ),
array( 0.04, -0.028640420194, 0.782351244421 ), array( 0.06, -0.107881081963, 0.993059365560 ),
array( 0.08, -0.024250171192, 0.757179404521 ), array( 0.10, -0.117606369137, 0.992201254363 ),
array( 0.12, -0.102269730291, 0.957039509328 ), array( 0.14, -1.106658103724, 0.000001537558 ),
array( 0.16, -0.120761711644, 0.995322178087 ), array( 0.18, 0.759083522349, 0.001776681974 ),
array( 0.20, -0.094338197702, 0.924973278381 ), array( 0.22, -0.486209665402, 0.254346420737 ),
array( 0.24, -0.343395568066, 0.466935365843 ), array( 0.26, -0.624836207159, 0.120088611245 ),
array( 0.28, 0.661480114573, 0.010899684735 ), array( 0.30, -0.504487870789, 0.227553169938 ),
array( 0.32, -0.112587885487, 0.999363791430 ), array( 0.34, -0.042354488563, 0.809960212155 ),
array( 0.36, 0.858765224438, 0.000039424725 ), array( 0.38, 0.710171402360, 0.006437824548 ),
array( 0.40, 0.196668403984, 0.347920792140 ), array( 0.42, 0.195016059482, 0.356678291182 ),
array( 0.44, 0.139607332753, 0.455098502332 ), array( 0.46, 0.732007467101, 0.006007975072 ),
array( 0.48, -0.062308039377, 0.945543652579 ), array( 0.50, -0.064679333277, 0.969356448998 ),
array( 0.52, -0.235834408073, 0.577937913809 ), array( 0.54, -0.062405883212, 0.999382477518 ),
array( 0.56, 0.295241751701, 0.274065460164 ), array( 0.58, -0.048056778183, 0.999229863239 ),
array( 0.60, -0.044261192643, 0.987270817986 ), array( 0.62, 0.183358354779, 0.483443937529 ),
array( 0.64, -0.338719007719, 0.320689084223 ), array( 0.66, 0.823250995966, 0.004294898834 ),
array( 0.68, -0.005131977430, 0.999004398299 ), array( 0.70, 0.538155869311, 0.101745616887 ),
array( 0.72, -0.062350563324, 0.784924183587 ), array( 0.74, 0.473045854508, 0.168886923699 ),
array( 0.76, 0.216363321550, 0.552249336945 ), array( 0.78, -0.559543696016, 0.060354519356 ),
array( 0.80, 0.060103424898, 0.999689757395 ), array( 0.82, 0.679733968306, 0.060451380394 ),
array( 0.84, 0.458152379065, 0.246076525234 ), array( 0.86, 0.968899880624, 0.002141706519 ),
array( 0.88, 0.262240173749, 0.611873181275 ), array( 0.90, 1.034976322902, 0.000729575505 ),
array( 0.92, 0.188591612117, 0.859023265306 ), array( 0.94, 1.037115142351, 0.001588705877 ),
array( 0.96, 1.006945736352, 0.004240064205 ), array( 0.98, 0.236838302140, 0.850251616410 ),
array( 1.00, 0.190190185081, 0.970858308627 ),
);
// Precision digits in BC math.
bcscale( 10 );
// Add raw data to plot.
plotRenderData( $data, $colorMap[ "Red" ] );
$plot->setX_Span( -1, 1 );
$plot->setY_Span( -1, 1 );
plotAddScale();
// Render the points to image.
$plot->setCircleSize( 3 );
$plot->renderPoints();
// Create two polynomial regression instances, each with 3 coefficients
// (2nd degree). One will be weighted, the other will not.
$regression1 = new PolynomialRegression( 3 );
$regression2 = new PolynomialRegression( 3 );
// Set up unique weighting on the first regression.
$weighting = new UniqueWeighting();
$regression1->setWeighting( $weighting );
// Add all the data to both regression analyses.
foreach ( $data as $dataPoint )
{
  // Set the weighting term before adding the data point.
  $weighting->setWeight( number_format( $dataPoint[ 2 ], 10 ) );
  $regression1->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
  $regression2->addData( $dataPoint[ 0 ], $dataPoint[ 1 ] );
}
// Get coefficients for the polynomial.
$coefficients1 = $regression1->getCoefficients();
$coefficients2 = $regression2->getCoefficients();
$coefficients3 = array( -0.1, -0.2, 0.5 ); // <- Actual coefficients.
// Display the functions for each plot.
$y = $imageHeight - $bottomMargin;
printFunction( $y, 1, $colorMap[ "Red" ], $coefficients3 );
printFunction( $y, 2, $colorMap[ "Green" ], $coefficients1 );
printFunction( $y, 3, $colorMap[ "Blue" ], $coefficients2 );
// Plot each of the curves.
plotRenderRegression( $regression2, $coefficients3, -1, 1, $colorMap[ "LightRed" ] );
plotRenderRegression( $regression1, $coefficients1, -1, 1, $colorMap[ "Green" ] );
plotRenderRegression( $regression2, $coefficients2, -1, 1, $colorMap[ "Blue" ] );
// Output image
header( "Content-Type: image/png" );
imagePNG( $image );
// Draw the polynomial for a set of coefficients as text on the image.
function printFunction( $y, $line, $color, $coefficients )
{
  global $image;
  global $leftMargin;
  $functionText = "y = ";
  foreach ( $coefficients as $power => $coefficient )
  {
    // After the first term, emit the sign as an operator and print the
    // magnitude. A zero coefficient is shown as "+ 0.00".
    if ( $power > 0 )
    {
      if ( $coefficient >= 0 )
        $functionText .= " + ";
      else
      {
        $functionText .= " - ";
        $coefficient = -$coefficient;
      }
    }
    $functionText .= number_format( $coefficient, 2 );
    if ( $power > 0 )
    {
      $functionText .= "x";
      if ( $power > 1 )
        $functionText .= "^" . $power;
    }
  }
  // Place text
  imageString
  (
    $image,
    2,
    $leftMargin + 2,
    $y - $line * imagefontheight( 2 ),
    $functionText,
    $color
  );
}
Here we have data from a 2nd degree polynomial function, and weighting data has been pre-calculated and placed in a third column. The UniqueWeighting class is used to add this custom weighting term to each value. For this example, the weighting term is an estimate of how likely the data point is to be accurate.
After declaring an instance of the weighting class, it is assigned to the regression class once using the setWeighting function. Notice how the weighting object's setWeight function is then called before each data point is added to the regression.
In this example the red dots represent the noisy data and the light red line is a plot of the actual function. The blue line is the unweighted regression, and the green line is the weighted regression.
The weighting allows the more accurate data points to be more strongly considered, and thus the green line more accurately fits the original curve.
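To see why low-weight points matter less, here is a minimal, self-contained sketch of a weighted least-squares polynomial fit. It is not the library's implementation: weightedPolyFit() is a hypothetical helper written for illustration, it uses ordinary floating-point values rather than BC math, and it assumes the common formulation in which each point's squared error is multiplied by its weight. It also assumes there are enough distinct points with non-zero weight for the system to be solvable.

```php
<?php
// Hypothetical helper for illustration only; not part of the library.
// $points is a list of [ x, y, weight ] rows. Returns the coefficients,
// lowest power first.
function weightedPolyFit( array $points, int $numCoefficients ): array
{
    $n = $numCoefficients;

    // Augmented normal-equation matrix [ A | b ] where
    // A[r][c] = sum( w * x^(r+c) ) and b[r] = sum( w * x^r * y ).
    $m = array_fill( 0, $n, array_fill( 0, $n + 1, 0.0 ) );
    foreach ( $points as [ $x, $y, $w ] )
    {
        for ( $r = 0; $r < $n; ++$r )
        {
            for ( $c = 0; $c < $n; ++$c )
                $m[ $r ][ $c ] += $w * pow( $x, $r + $c );
            $m[ $r ][ $n ] += $w * pow( $x, $r ) * $y;
        }
    }

    // Gauss-Jordan elimination with partial pivoting.
    for ( $col = 0; $col < $n; ++$col )
    {
        $pivot = $col;
        for ( $r = $col + 1; $r < $n; ++$r )
            if ( abs( $m[ $r ][ $col ] ) > abs( $m[ $pivot ][ $col ] ) )
                $pivot = $r;
        [ $m[ $col ], $m[ $pivot ] ] = [ $m[ $pivot ], $m[ $col ] ];

        $divisor = $m[ $col ][ $col ];
        for ( $c = $col; $c <= $n; ++$c )
            $m[ $col ][ $c ] /= $divisor;

        for ( $r = 0; $r < $n; ++$r )
        {
            if ( $r === $col )
                continue;
            $factor = $m[ $r ][ $col ];
            for ( $c = $col; $c <= $n; ++$c )
                $m[ $r ][ $c ] -= $factor * $m[ $col ][ $c ];
        }
    }

    // The last column now holds the solution.
    return array_column( $m, $n );
}

// Five samples of y = 0.5 x^2 - 0.2 x - 0.1. The point at x = 0.5 is a
// deliberate outlier that carries a very small weight.
$points = [
    [ -1.0,  0.600, 1.0 ],
    [ -0.5,  0.125, 1.0 ],
    [  0.0, -0.100, 1.0 ],
    [  0.5,  0.900, 0.001 ],
    [  1.0,  0.200, 1.0 ],
];

// Same data with every weight forced to 1, which is an unweighted fit.
$unweightedPoints = array_map(
    function ( $p ) { return [ $p[ 0 ], $p[ 1 ], 1.0 ]; },
    $points
);

$weighted   = weightedPolyFit( $points, 3 );
$unweighted = weightedPolyFit( $unweightedPoints, 3 );

$format = function ( $c ) { return number_format( $c, 4 ); };
echo "weighted:   ", implode( ", ", array_map( $format, $weighted ) ), "\n";
echo "unweighted: ", implode( ", ", array_map( $format, $unweighted ) ), "\n";
```

In this sketch the outlier barely moves the weighted coefficients away from -0.1, -0.2, 0.5, while the unweighted fit is pulled toward it. For real data sets, the library's BC math implementation is the better choice because plain floats lose precision as the degree or the number of points grows.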
Mathematics
The mathematics behind polynomial regression is broken into several sections, some of which are more detailed.
Detailed
Articles
This library grew out of the author's quest to understand the mathematics of various curve fitting techniques of which polynomial regression is one. There are several blog postings about the math behind this library:
- Using Polynomial Regression for Interpolation
- Weighted Least-square Regression
- Some Non-Linear Functions that can be Linearized for Least-Square Regression
- Forced Coefficient Polynomial Regression
- Least-Square Regression Demo
- A better n-Point Curve
- Least-square Polynomial Regression
- Least Squares Quadratic Curve Fitting
License
This software is free, open-source software released under the GNU license.
Author
The polynomial regression class is written and maintained by Andrew Que. To get in touch with Andrew Que, visit his contact page. Please do not contact the author with questions or comments relating to external code repositories (like GitHub), as the author did not create them and does not maintain them.