Estimating Multivariate Functions and Derivatives using Local Polynomial Regression in Matlab

1. Theoretical Background

I want to start with some theory that will eventually lead us to the Matlab code. If you are not interested in how the estimator is derived in detail, feel free to skip this section. For a detailed discussion of local polynomial regression I recommend having a look at Fan and Gijbels (1996).

We assume that a smooth function X defined on [0,1]^g is observed at independent, randomly distributed points t_{k} \in [0,1]^g, \; k=1,\dots, T, contaminated with additive noise. Our model is then given by

(1)   \begin{equation*}  Y(t_{k})=X(t_{k}) + \epsilon_{k} \end{equation*}

where \epsilon_{k} are i.i.d. random variables with \mathbf{E}\left[\epsilon_{k}\right]=0, \operatorname{Var}\left(\epsilon_{k}\right)= \sigma^2, and \epsilon_{k} is independent of X=\left( X(t_1),\dots, X(t_T) \right).
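To make the model concrete, here is a small Matlab sketch that simulates data from (1); the dimension g=2, the sample size and the particular smooth function are illustrative choices only:

g     = 2;                                       % dimension of the input space
T     = 500;                                     % number of observations
t     = rand(T, g);                              % random design points t_k on [0,1]^g
Xtrue = sin(2*pi*t(:,1)).*cos(2*pi*t(:,2));      % a smooth toy function X(t_k)
Y     = Xtrue + 0.1*randn(T, 1);                 % noisy observations Y(t_k) = X(t_k) + eps_k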

Our goal is then to estimate partial derivatives of X, which are denoted using a vector d=(d_1,\dots,d_g) of non-negative integers:

(2)   \begin{equation*} X^{(d)}(t) \stackrel{\operatorname{def}}{=} \frac{\partial^{d_1} }{\partial t_1^{d_1} } \cdots \frac{\partial^{d_g} }{\partial t_g^{d_g} } X(t_1,\dots,t_g). \end{equation*}
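For example, with g=2 and d=(1,1) we get X^{(d)}(t) = \frac{\partial^2}{\partial t_1 \partial t_2} X(t_1,t_2), while d=(0,\dots,0) simply gives back the function X itself.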

Before we continue, a little notation is needed. For any vectors a,b \in \mathbb{R}^g we define |a|\stackrel{\operatorname{def}}{=}\sum_{j=1}^g |a_j| and a^b\stackrel{\operatorname{def}}{=}a_1^{b_1} \times \dots \times a_g^{b_g}. Let k=(k_1,\dots, k_g)^{\top} with k_l \in \mathbb{N}_0 be a multi-index and consider a multivariate local polynomial estimator of order \rho at a point t \in \mathbb{R}^g, given by the vector \hat{\beta}(t) of coefficients \hat{\beta}_k(t), 0 \leq |k| \leq \rho, that solves

(3)   \begin{equation*} \underset{\beta(t)}{\operatorname{min}} \sum_{l=1}^{T} \left[ Y(t_{l}) - \sum_{0 \leq |k| \leq \rho} \beta_{k}(t) (t_{l}-t)^k \right]^2 K_{B} (t_{l}-t). \end{equation*}
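As a quick check of the multi-index notation: for g=2, a=(1,2) and b=(2,1) we have |a| = 1+2 = 3 and a^b = 1^2 \times 2^1 = 2, so a single polynomial term in (3) with k=(1,2) reads (t_{l}-t)^k = (t_{l1}-t_1)^{1} (t_{l2}-t_2)^{2}.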

K_B is any non-negative, symmetric and bounded multivariate kernel function and B is a g \times g bandwidth matrix. For simplicity, we assume that B is diagonal with main diagonal entries b=(b_1,\dots,b_g)^{\top} and zeros elsewhere. Note that \hat{\beta}(t) is a vector with one entry \hat{\beta}_k(t) for every multi-index k with 0 \leq |k| \leq \rho. The desired estimates are then given by \hat{X}^{(d)}(t) = d!\, \hat{\beta}_c(t), where d! \stackrel{\operatorname{def}}{=} d_1! \cdots d_g! and c is the position of the multi-index d in the vector, which depends on the ordering of the polynomials in (3).
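To make the position c concrete, take g=2 and \rho=2 and order the multi-indices as k=(0,0), (1,0), (0,1), (2,0), (1,1), (0,2). Then \hat{\beta}(t) has six entries, the function estimate is \hat{X}(t)=\hat{\beta}_1(t), and the mixed derivative d=(1,1) sits at position c=5, so \hat{X}^{(1,1)}(t)= 1! \cdot 1! \cdot \hat{\beta}_5(t). This particular ordering is only an example; the Matlab implementation below generates its own ordering of the multi-indices, and c has to be read off from that.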

To code the estimator in Matlab we write it down in matrix form. Let \tilde{t}_l=(t_{l}-t) with components \tilde{t}_{lj}, and define

    \begin{equation*} \textbf{Z}(t)= \begin{bmatrix} 1 & \tilde{t}_{11} & \tilde{t}_{12} & \dots & \tilde{t}_{11}  \tilde{t}_{12} & \dots & \tilde{t}_{1g}^\rho\\ 1 & \tilde{t}_{21} & \tilde{t}_{22} & \dots & \tilde{t}_{21}  \tilde{t}_{22} & \dots & \tilde{t}_{2g}^\rho\\ \vdots & \vdots & \vdots & \ddots & \vdots & \ddots & \vdots\\ 1 & \tilde{t}_{T1} & \tilde{t}_{T2} & \dots & \tilde{t}_{T1}  \tilde{t}_{T2} &\dots & \tilde{t}_{Tg}^\rho \end{bmatrix} , \quad \textbf{A}(t) = \begin{bmatrix} K_{B} (\tilde{t}_{1}) & 0 & \dots & 0 \\ 0& K_{B} (\tilde{t}_{2}) & \dots & 0 \\ \vdots &  \vdots & \ddots & \vdots \\ 0 & 0 & \dots & K_{B} (\tilde{t}_{T}) \end{bmatrix} \end{equation*}

Using this notation, together with \textbf{Y}=\left( Y(t_1),\dots,Y(t_T) \right)^{\top}, (3) can be rewritten as

(4)   \begin{equation*} \underset{\beta(t)}{\operatorname{min}} \; \left( \textbf{Y} - \textbf{Z}(t) \beta(t) \right)^{\top} \textbf{A}(t) \left( \textbf{Y} - \textbf{Z}(t) \beta(t) \right). \end{equation*}

(4) is a weighted least squares problem; differentiating with respect to \beta(t) and setting the result to zero yields the normal equations \textbf{Z}(t)^{\top} \textbf{A}(t) \textbf{Z}(t) \beta(t) = \textbf{Z}(t)^{\top} \textbf{A}(t) \textbf{Y}, so under some regularity assumptions (in particular, invertibility of \textbf{Z}(t)^{\top} \textbf{A}(t) \textbf{Z}(t)) the solution is given by

(5)   \begin{equation*} \hat{\beta}(t)= \left( \textbf{Z}(t)^{\top} \textbf{A}(t) \textbf{Z}(t) \right)^{-1} \textbf{Z}(t)^{\top} \textbf{A}(t) \textbf{Y}. \end{equation*}

This expression is then indeed easy to code.
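In Matlab, (5) for a single evaluation point is essentially a one-liner. A minimal sketch, assuming the matrices \textbf{Z}(t), \textbf{A}(t) and the vector \textbf{Y} from above are stored in the variables Z, A and Y:

% Weighted least squares solution (5) for one evaluation point t
beta_hat = (Z' * A * Z) \ (Z' * A * Y);   % backslash avoids explicitly forming the inverse

Using the backslash operator instead of inv() is the numerically preferable way to solve this kind of system.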

2. Implementation using Matlab

The implementation is based on product kernels built from either a Gaussian or an Epanechnikov kernel. For running a kernel regression on very large data sets, compactly supported kernels like the Epanechnikov kernel are of particular interest because many of the weights in \textbf{A}(t) are exactly zero when t_l is far from t. A conscientious programmer would therefore select only a window of observations around t, based on the bandwidth b, to construct \textbf{Z}(t) and compute (5). Since I am not a conscientious programmer and chose to keep the implementation simple this time, this speed-up opportunity is omitted 😉
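For the curious, the omitted windowing step could look roughly like the following sketch. It would sit at the top of the getpoint helper defined below and uses its variable names X, Y, x_j and b; this is only an illustration, not part of the posted function:

% Sketch only: pre-select observations that can receive a non-zero
% Epanechnikov weight, i.e. points within one bandwidth of x_j in every coordinate.
idx = all( abs( X - repmat(x_j, size(X,1), 1) ) <= repmat(b, size(X,1), 1), 2 );
Xw  = X(idx,:);    % windowed design points
Yw  = Y(idx);      % corresponding observations
% Z, A and the solve in (5) would then be built from Xw and Yw only.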

% ------------------------------------------------------------------------------ 
% Description: g-Dimensional local polynomial estimator 
%
% ------------------------------------------------------------------------------ 
% Usage:       fit = multilocpoly(X,Y,x,b,p,kernel)
% ------------------------------------------------------------------------------ 
% Inputs:      
% X      - g-dimensional input coordinates (Dimension: Txg)
% Y      - function values (Dimension: Tx1) 
% x      - g-dimensional output coordinates (Dimension: T_outxg)
% b      - g-dimensional vector of bandwidths (Dimension: 1xg) 
% p      - degree of the polynomial (rho)
% kernel - 'Gauss' for the Gaussian kernel, anything else for the Epanechnikov kernel
  
% ------------------------------------------------------------------------------ 
% Output:      
% fit    - Smoothed function values and derivative estimates (Dimension: T_out x number of multi-indices)
 
% ------------------------------------------------------------------------------ 
% Keywords:    local polynomial surface estimator, derivatives
% ------------------------------------------------------------------------------ 
% See also:    -  
% ------------------------------------------------------------------------------ 
% Author:      Heiko Wagner, 2017/01/18
% ------------------------------------------------------------------------------ 



 
function [fit] = multilocpoly(X,Y,x,b,p,kernel)

%% Epanechnikov kernel function
function [k]=epan(t)
    k = 3/4*(1-t.^2);       % Epanechnikov kernel 3/4*(1-t^2) on [-1,1]
    k( abs(t)>1 ) = 0;      % zero outside the support
end


g     = size(X,2);              % dimension of the input space
T     = size(X,1);              % number of observations
T_out = size(x,1);              % number of evaluation points
dummy = permn(0:p,g);           %%%Get all possible exponent combinations (permn is available on the MATLAB File Exchange)
k     = dummy( sum(dummy,2)<p+1, : );   %%%Keep only multi-indices with |k| <= p
pp    = size(k,1);              % number of polynomial terms
Z     = zeros( T, pp );

function [beta_j]=getpoint(x_j)
    %%%Center the observations around the point x_j; at this point you could speed up the code for the Epanechnikov kernel by selecting only nearby observations (see the sketch above)
    Xn = X -  repmat( x_j,T,1 );  
     
    %%%Construct polynomials
    for m=1:pp
        Z(:,m)= prod( Xn.^repmat(k(m,:),T,1) ,2);
    end
     
    %%%Construct g-dimensional product kernel
    A=1;
    if(strcmp(kernel,'Gauss')==1)   %Use Gaussian Kernel
        for (m=1:g)
           A= A.*normpdf(Xn(:,m)/b(m))/b(m)  ;
        end
    else                            %Use Epanechnikov Kernel
        for (m=1:g)
           A= A.*epan(Xn(:,m)/b(m))/b(m)  ;
        end
    end
    A=diag(A);                  % T x T diagonal matrix of kernel weights
    
    %%%Solve for beta_j
    beta_j = (Z' * A * Z) \ (Z' * A * Y);   %solve the weighted least squares problem (5); backslash is numerically preferable to inv()
end
%%%Instead of a single point, we evaluate the estimator on the entire set of g-dimensional output coordinates (x)
[out]=arrayfun(@(s) getpoint( x(s,:) ), 1:T_out , 'UniformOutput', false);
fit=[out{:}]'*diag( prod(factorial(k),2) );   %%%scale column c by d! = d_1!*...*d_g! to turn the coefficients into derivative estimates
end
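Finally, a small usage sketch on simulated data (the grid, the bandwidths and the polynomial degree are illustrative choices; permn has to be on the path, e.g. from the MATLAB File Exchange):

g = 2;  T = 500;
X = rand(T, g);                                           % design points t_k
Y = sin(2*pi*X(:,1)).*cos(2*pi*X(:,2)) + 0.1*randn(T,1);  % noisy observations
[g1, g2] = meshgrid(0:0.05:1, 0:0.05:1);
x = [g1(:), g2(:)];                                       % output coordinates
b = [0.1, 0.1];                                           % bandwidths
p = 2;                                                    % polynomial degree rho
fit = multilocpoly(X, Y, x, b, p, 'Gauss');
% Each column of fit corresponds to one multi-index in k (in the ordering produced by permn);
% the column for k = (0,...,0) contains the smoothed surface, the others the derivative estimates.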
