Handy Polynomial Fitting with Bernstein Polynomials
11-10-2018, 10:22 PM
Post: #1
Handy Polynomial Fitting with Bernstein Polynomials
Hi All,

I stumbled on an article discussing the advantages of using Bernstein polynomials for curve fitting. Unlike classical polynomials, Bernstein polynomials give smooth fits without the wild oscillations that appear when the order of the classical fitting polynomial is high.

In summary, Bernstein polynomial fitting of order n works as follows (a short code sketch follows the steps):

1) Map the x observations into [0, 1] using the minimum and maximum of the original values:

x = (x_orig - min(x_orig))/(max(x_orig) - min(x_orig))

2) Build the X matrix where the matrix elements are:

element(i, j) = combination(n, j-1) * x(i)^(j-1) * (1-x(i))^(n-j+1), for j = 1 to n+1 and i = 1 to the number of observations

3) Append an additional column of ones to X; it generates the constant term.
4) Solve the (generally overdetermined) system X*c = y in the least-squares sense to obtain the regression coefficients c; here c and y are vectors.
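
A minimal NumPy sketch of these four steps (assuming x_orig and y are NumPy arrays, and using np.linalg.lstsq as the least-squares solver for step 4):
Code:
import numpy as np
from math import comb

def bernstein_fit(x_orig, y, n):
    # 1) map x to [0, 1]
    x = (x_orig - x_orig.min()) / (x_orig.max() - x_orig.min())
    # 2) Bernstein basis columns: comb(n, j) * x^j * (1-x)^(n-j), j = 0..n
    X = np.column_stack([comb(n, j) * x**j * (1 - x)**(n - j)
                         for j in range(n + 1)])
    # 3) append a column of ones for the constant term
    X = np.column_stack([X, np.ones_like(x)])
    # 4) least-squares solution of X*c = y
    # note: with the ones column the design is rank-deficient (the Bernstein
    # columns sum to 1); lstsq returns a minimum-norm least-squares solution
    c, *_ = np.linalg.lstsq(X, y, rcond=None)
    return c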

I have tested the above concept using Excel and obtained satisfactory Bernstein polynomial fits.
11-11-2018, 03:10 AM
Post: #2
RE: Handy Polynomial Fitting with Bernstein Polynomials
.
Hi, Namir:

(11-10-2018 10:22 PM)Namir Wrote:  I have tested the above concept using Excel and obtained satisfactory Bernstein polynomial fits.

Please use your handy Bernstein Polynomial Excel implementation to fit the following data in [0,1]:

(0, 0), (0.1, 0.01), (0.2, 0.04), (0.3, 0.09), (0.4, 0.16), (0.5, 0.25),
(0.6, 0.36), (0.7, 0.49), (0.8, 0.64), (0.9, 0.81), (1, 1)

What values does your Bernstein fit produce for, say, order 2?
What are the coefficients of the Bernstein polynomial fit?

Have a nice weekend.
V.
.

  
All My Articles & other Materials here:  Valentin Albillo's HP Collection
 
11-11-2018, 05:12 AM
Post: #3
RE: Handy Polynomial Fitting with Bernstein Polynomials
(11-11-2018 03:10 AM)Valentin Albillo Wrote:  What values does your Bernstein fit produce for, say, order 2?
What are the coefficients of the Bernstein polynomial fit?

The Bernstein Polynomial reproduces the function for \(f(x)=1\) and \(f(x)=x\):

\(
\begin{matrix}
B_n(1;x)=1 \\
B_n(x;x)=x
\end{matrix}
\)

However for \(f(x)=x^2\) we get:

\(
B_n(x^2;x) = x^2 + \frac{1}{n}x(1-x)
\)
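
This identity follows from the probabilistic view of the Bernstein operator: with \(K\sim\operatorname{Bin}(n,x)\) we have \(B_n(f;x)=E\left[f(K/n)\right]\), so

\(
B_n(x^2;x)=E\left[\left(\tfrac{K}{n}\right)^2\right]=\frac{\operatorname{Var}(K)+E[K]^2}{n^2}=\frac{nx(1-x)+n^2x^2}{n^2}=x^2+\frac{1}{n}x(1-x)
\)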

Thus for \(n=10\) it's off by \(\frac{1}{40}\) at \(x=\frac{1}{2}\).

While it can be shown that \(\lim _{n\to \infty }{B_{n}(f)}=f\) uniformly on the interval [0, 1] the convergence is rather slow.

Cheers
Thomas
Find all posts by this user
Quote this message in a reply
11-11-2018, 06:00 AM
Post: #4
RE: Handy Polynomial Fitting with Bernstein Polynomials
(11-11-2018 03:10 AM)Valentin Albillo Wrote:  Please use your handy Bernstein Polynomial Excel implementation to fit the following data in [0,1]:

(0, 0), (0.1, 0.01), (0.2, 0.04), (0.3, 0.09), (0.4, 0.16), (0.5, 0.25),
(0.6, 0.36), (0.7, 0.49), (0.8, 0.64), (0.9, 0.81), (1, 1)

What values does your Bernstein fit produce for, say, order 2?
What are the coefficients of the Bernstein polynomial fit?

Constant = 1
b0 = 0
b1 = -0.5
b2 = -1

And with no constant term
b0 = 1
b1 = 0.5
b2 = 0
11-11-2018, 01:39 PM (This post was last modified: 11-12-2018 05:46 AM by Thomas Klemm.)
Post: #5
RE: Handy Polynomial Fitting with Bernstein Polynomials
I assumed you used the Bernstein Expansion to approximate a continuous function.

Here's a Python program that does this expansion:
Code:
from math import factorial

def binomial(n, k):
    # binomial coefficient C(n, k)
    return factorial(n) // factorial(k) // factorial(n - k)

def bernstein(n, k, x):
    # Bernstein basis polynomial B_{k,n}(x)
    return binomial(n, k) * x**k * (1 - x)**(n - k)

def expand(f, x, n):
    # Bernstein expansion B_n(f; x) = sum of f(j/n) * B_{j,n}(x)
    return sum(f(j / n) * bernstein(n, j, x) for j in range(n + 1))

We can now compare the different results:
Code:
n = 10
f = lambda x: x**2
approx = lambda x: x**2 + x*(1 - x)/n
for j in range(n + 1):
    x = j / n
    print("%.1f: %.4f %.4f %.4f" % (x, f(x), expand(f, x, n), approx(x)))

x:   f(x)   expand approx
0.0: 0.0000 0.0000 0.0000
0.1: 0.0100 0.0190 0.0190
0.2: 0.0400 0.0560 0.0560
0.3: 0.0900 0.1110 0.1110
0.4: 0.1600 0.1840 0.1840
0.5: 0.2500 0.2750 0.2750
0.6: 0.3600 0.3840 0.3840
0.7: 0.4900 0.5110 0.5110
0.8: 0.6400 0.6560 0.6560
0.9: 0.8100 0.8190 0.8190
1.0: 1.0000 1.0000 1.0000


We can see that the expansion and the approximation agree.
That may look astonishing, since the expansion is based on Bernstein polynomials of degree 10 whereas the approximation is only of degree 2; but it is exactly the identity \(B_n(x^2;x)=x^2+\frac{1}{n}x(1-x)\) stated above.

We can also see that the approximation isn't really good except at \(x=0\) and \(x=1\).

However, you appear to calculate a best fit of a function using Bernstein polynomials of a given degree.

Here's how we can do this with Python:
Code:
from numpy import array
from numpy.linalg import solve

m, n = 10, 2  # m+1 sample points, Bernstein polynomials of order n
# uses bernstein() as defined above
b = array([x**2 for j in range(m+1) for x in [j/m]])
A = array([[bernstein(n, k, x) for k in range(n+1)] for j in range(m+1) for x in [j/m]])
These are the intermediate results:

>>> b
array([0. , 0.01, 0.04, 0.09, 0.16, 0.25, 0.36, 0.49, 0.64, 0.81, 1. ])

>>> A
array([[1.  , 0.  , 0.  ],
       [0.81, 0.18, 0.01],
       [0.64, 0.32, 0.04],
       [0.49, 0.42, 0.09],
       [0.36, 0.48, 0.16],
       [0.25, 0.5 , 0.25],
       [0.16, 0.48, 0.36],
       [0.09, 0.42, 0.49],
       [0.04, 0.32, 0.64],
       [0.01, 0.18, 0.81],
       [0.  , 0.  , 1.  ]])


For a least-squares fit we solve the normal equations, obtained by multiplying both sides of \(A\ x = b\) by A.T, the transpose of A:
Code:
solve(A.T @ A, A.T @ b)

The result is:

array([0., 0., 1.])

That's not a surprise as \(B_{2,2}(x) = x^2\).
However, we would get the same result with fewer or more points, as long as they lie exactly on the curve, as in Valentin's example.

What's the benefit of using Bernstein polynomials of order 2 compared to the ordinary basis \(1, x, x^2\)?

And why would you need an additional constant?
Since \(\sum_{i=0}^n B_{i,n}=1\) the constant 1 is already present as a linear combination.
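
A quick check of this, using the bernstein() function defined earlier, at \(n=2\) and \(x=0.5\):

>>> sum(bernstein(2, k, 0.5) for k in range(2 + 1))
1.0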

Kind regards
Thomas
11-11-2018, 01:51 PM (This post was last modified: 11-11-2018 02:56 PM by Namir.)
Post: #6
RE: Handy Polynomial Fitting with Bernstein Polynomials
Thomas,

I was doing a regression on Bernstein basis polynomial terms; I do not use the term f(j/n). With orders of 4 or 5 I get good results.

In my Excel sheet column A has the original values of x, column B the normalized values of x, and column C the y data. To calculate the first row of Bernstein basis polynomial terms, based on the x(1) value in cell B2, I use the following formulas:

=COMBIN(4,0)*(1-B2)^4 in cell E2
=COMBIN(4,1)*B2*(1-B2)^3 in cell F2
=COMBIN(4,2)*B2^2*(1-B2)^2 in cell G2
=COMBIN(4,3)*B2^3*(1-B2) in cell H2
=COMBIN(4,4)*B2^4 in cell I2

I then replicate these formulas down the rows of data and use Excel's Analysis ToolPak to perform the multiple regression. I specified that the model include a constant term; this choice is optional.
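
For reference, a rough Python equivalent of that spreadsheet (my assumptions: Valentin's eleven data points with y = x², order 4, and normalized x equal to the original x since the data already lie in [0, 1]):
Code:
import numpy as np
from math import comb

x = np.linspace(0, 1, 11)   # column B: normalized x
y = x**2                    # column C: y data
n = 4
# columns E..I: COMBIN(n, k) * x^k * (1-x)^(n-k) for k = 0..n
X = np.column_stack([comb(n, k) * x**k * (1 - x)**(n - k)
                     for k in range(n + 1)])
# regression with a constant term, as in the Analysis ToolPak
# note: this design is rank-deficient (the Bernstein columns sum to 1),
# so lstsq returns the minimum-norm least-squares solution
X1 = np.column_stack([np.ones_like(x), X])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)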

Namir
11-12-2018, 05:39 AM
Post: #7
RE: Handy Polynomial Fitting with Bernstein Polynomials
(11-11-2018 01:51 PM)Namir Wrote:  I was doing a regression with Bernstein basis polynomials terms.

There's nothing wrong with using them as a basis for regression.
However, there's no benefit compared to using the ordinary polynomial basis \(1, x, x^2, \cdots\).

To see why, let's assume \(M\) is the matrix that describes the change of basis.
As before we try to solve the overdetermined equation: \(A\ x = b\)
Instead we solve: \(A^\top \ A\ x = A^\top \ b\)

Given that \(A = C\ M\) we conclude that \(A^\top = M^\top \ C^\top\).
Therefore:

\(A^\top \ A = M^\top \ C^\top \ C\ M\)

and

\( A^\top \ b = M^\top \ C^\top\ b\)

And so the equation becomes:

\(M^\top \ C^\top \ C\ M\ x = M^\top \ C^\top\ b\)

Since \(M\) describes a change of basis it is invertible, and so is \(M^\top\); we can therefore drop \(M^\top\) on both sides:

\(C^\top \ C\ M\ x = C^\top\ b\)

And with the definition \(y = M\ x\) this becomes:

\(C^\top \ C\ y = C^\top\ b\)

Thus it doesn't matter which basis is used to calculate the regression.
You could also use Chebyshev polynomials if you prefer them.

Example:

To make it a bit more interesting we will try to find a best fit for the sine function.
Thus we use:
Code:
from math import sin, pi
b = array([sin(x*pi/2) for j in range(m+1) for x in [j/m]])

>>> b
array([0.        , 0.15643447, 0.30901699, 0.4539905 , 0.58778525,
       0.70710678, 0.80901699, 0.89100652, 0.95105652, 0.98768834,
       1.        ])

The transformation matrix \(M\) is defined as:
Code:
M = array([[ 1,  0,  0],
           [-2,  2,  0],
           [ 1, -2,  1]])
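
Each column of \(M\) holds the power-basis coefficients of one Bernstein basis polynomial, so that the columns of \(C\,M\) are \(B_{0,2}, B_{1,2}, B_{2,2}\):

\(
\begin{matrix}
(1-x)^2 = 1 - 2x + x^2 \\
2x(1-x) = 2x - 2x^2 \\
x^2 = x^2
\end{matrix}
\)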

And then we use the ordinary polynomials instead of the Bernstein ones:
Code:
C = array([[x**k for k in range(n+1)] for j in range(m+1) for x in [j/m]])

>>> C
array([[1.  , 0.  , 0.  ],
       [1.  , 0.1 , 0.01],
       [1.  , 0.2 , 0.04],
       [1.  , 0.3 , 0.09],
       [1.  , 0.4 , 0.16],
       [1.  , 0.5 , 0.25],
       [1.  , 0.6 , 0.36],
       [1.  , 0.7 , 0.49],
       [1.  , 0.8 , 0.64],
       [1.  , 0.9 , 0.81],
       [1.  , 1.  , 1.  ]])


We can then solve the system with:
Code:
y = solve(C.T @ C, C.T @ b)

The result is:

>>> y
array([-0.01699491, 1.85988312, -0.82839241])


But we can do the same using the Bernstein polynomials (using the definition of A from my previous post):

Code:
x = solve(A.T @ A, A.T @ b)

Here the result is:

>>> x
array([-0.01699491, 0.91294665, 1.0144958 ])


Now we can check that \(y = M\ x\) holds:

>>> M @ x
array([-0.01699491, 1.85988312, -0.82839241])
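
As one more sanity check of the basis-independence argument, we could redo the fit in a third basis; here's a sketch using NumPy's Chebyshev module (chebvander builds the design matrix with columns \(T_0, T_1, T_2\)):
Code:
from numpy.polynomial.chebyshev import chebvander

T = chebvander(array([j/m for j in range(m+1)]), n)
z = solve(T.T @ T, T.T @ b)
# T @ z reproduces the same fitted values as C @ y and A @ x (up to
# rounding), since all three bases span the polynomials of degree <= 2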



We can use WolframAlpha to plot this function and get:

[Attached image: WolframAlpha plot of the fitted quadratic]


(11-10-2018 10:22 PM)Namir Wrote:  I stumbled on an article discussing the advantages of using Bernstein polynomials for curve fitting. Unlike classical polynomials, Bernstein polynomials give smooth fits without the wild oscillations that appear when the order of the classical fitting polynomial is high.

This happens e.g. with Lagrange interpolation and is called Runge's phenomenon.
Both least-squares fitting and approximation by Bernstein polynomials are mentioned as ways to mitigate the problem.
However, these methods are not the same.

Cheers
Thomas


11-12-2018, 01:21 PM
Post: #8
RE: Handy Polynomial Fitting with Bernstein Polynomials
(11-12-2018 05:39 AM)Thomas Klemm Wrote:  This happens e.g. with Lagrange interpolation and is called Runge's phenomenon. Both least-squares fitting and approximation by Bernstein polynomials are mentioned as ways to mitigate the problem. However, these methods are not the same.

I vaguely remember learning about Chebyshev polynomials for this purpose. As I recall, Chebyshev fits have the nice property of having a hard upper bound on the error, which is within a constant factor (about 3, IIRC) of the worst-case error of the optimal fit. I'd have to dig around to find that textbook, though; it may have been lost in the mists of time...
11-12-2018, 02:09 PM
Post: #9
RE: Handy Polynomial Fitting with Bernstein Polynomials
(11-12-2018 01:21 PM)Thomas Okken Wrote:  I vaguely remember learning about Chebyshev polynomials for this purpose.

They are mentioned in the section: Change of interpolation points.

Quote:As I recall, Chebyshev fits have the nice property of having a hard upper bound on the error, which is within a constant factor (about 3, IIRC) of the worst-case error of the optimal fit. I'd have to dig around to find that textbook, though; it may have been lost in the mists of time...

It appears to be even better:
Quote:Therefore, when the interpolation nodes \(x_i\) are the roots of \(T_n\), the error satisfies:
\(\left|f(x)-P_{n-1}(x)\right|\leq \frac{1}{2^{n-1}\,n!}\max_{\xi \in [-1,1]}\left|f^{(n)}(\xi)\right|\)
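
For instance, interpolating \(f=\sin\) on \([-1,1]\): every derivative of \(\sin\) is bounded by 1, so with \(n=10\) Chebyshev nodes the error is at most \(\frac{1}{2^{9}\,10!}\approx 5.4\times 10^{-10}\).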

Cheers
Thomas