Research Article
A Note on the Adaptive Estimation of a Multiplicative Separable Regression Function
Christophe Chesneau
Laboratoire de Mathématiques Nicolas Oresme, Université de Caen Basse-Normandie, Campus II, Science 3, 14032 Caen, France
Correspondence should be addressed to Christophe Chesneau; christophe.chesneau@gmail.com
Received 18 January 2014; Accepted 25 February 2014; Published 20 March 2014
Academic Editors: F. Ding, E. Skubalska-Rafajlowicz, and H. C. So
Copyright © 2014 Christophe Chesneau. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
We investigate the estimation of a multiplicative separable regression function from a bidimensional nonparametric regression model with random design. We present a general estimator for this problem and study its mean integrated squared error (MISE) properties. A wavelet version of this estimator is developed. In some situations, we prove that it attains the standard unidimensional rate of convergence under the MISE over Besov balls.
1. Motivations
We consider the bidimensional nonparametric regression model with random design described as follows. Let $(Y_i, U_i, V_i)_{i \in \mathbb{Z}}$ be a stochastic process defined on a probability space $(\Omega, \mathcal{A}, \mathbb{P})$, where
$$Y_i = h(U_i, V_i) + \xi_i, \quad i \in \mathbb{Z}, \qquad (1)$$
$(\xi_i)_{i \in \mathbb{Z}}$ is a strictly stationary stochastic process, $(U_i, V_i)_{i \in \mathbb{Z}}$ is a strictly stationary stochastic process with support in $[0, 1]^2$, and $h : [0, 1]^2 \to \mathbb{R}$ is an unknown bivariate regression function. It is assumed that $\mathbb{E}(\xi_1) = 0$, $\mathbb{E}(\xi_1^2)$ exists, $(U_i, V_i)_{i \in \mathbb{Z}}$ are independent, $(\xi_i)_{i \in \mathbb{Z}}$ are independent, and, for any $i \in \mathbb{Z}$, $(U_i, V_i)$ and $\xi_i$ are independent. In this study, we focus our attention on the case where $h$ is a multiplicative separable regression function: there exist two functions $f : [0, 1] \to \mathbb{R}$ and $g : [0, 1] \to \mathbb{R}$ such that
$$h(x, y) = f(x) g(y). \qquad (2)$$
We aim to estimate $h$ from the $n$ random variables $(Y_1, U_1, V_1), \ldots, (Y_n, U_n, V_n)$. This problem arises in many practical situations, as in utility, production, and cost function applications (see, e.g., Linton and Nielsen [1], Yatchew and Bos [2], Pinske [3], Lewbel and Linton [4], and Jacho-Chávez [5]).
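For concreteness, the following Python sketch simulates data from the model (1) with the separable structure (2); the particular choices of $f$, $g$, the uniform design, and the Gaussian noise are assumptions made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Assumed separable regression function h(x, y) = f(x) g(y), as in (2).
f = lambda x: 1.0 + 0.5 * np.cos(2 * np.pi * x)
g = lambda y: 1.0 + y ** 2

# Design (U_i, V_i): uniform on [0, 1]^2, the simplest case allowed by (H3) below (q = 1).
U = rng.uniform(0.0, 1.0, n)
V = rng.uniform(0.0, 1.0, n)

# Centered noise with finite variance, independent of the design.
xi = rng.normal(0.0, 0.1, n)

# Observations from model (1).
Y = f(U) * g(V) + xi
```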
In this note, we provide a theoretical contribution to the subject by introducing a new general estimation method for $h$. A sharp upper bound for its mean integrated squared error (MISE) is proved. Then we adapt our methodology to propose an efficient and adaptive procedure. It is based on two wavelet thresholding estimators following the construction studied in Chaubey et al. [6]. It has the feature of being adaptive for a wide class of unknown functions and enjoys nice MISE properties. Further details on wavelet estimators can be found in, for example, Antoniadis [7], Vidakovic [8], and Härdle et al. [9]. Despite the so-called "curse of dimensionality" coming from the bidimensionality of (1), we prove that our wavelet estimator attains the standard unidimensional rate of convergence under the MISE over Besov balls (for both the homogeneous and inhomogeneous zones). It completes asymptotic results proved by Linton and Nielsen [1] via nonadaptive kernel methods for the structured nonparametric regression model.
The paper is organized as follows. Assumptions on (1) and some notations are introduced in Section 2. Section 3 presents our general MISE result. Section 4 is devoted to our wavelet estimator and its performances in terms of rate of convergence under the MISE over Besov balls. Technical proofs are collected in Section 5.
2. Assumptions and Notations
For any $p \geq 1$, we set
$$\mathbb{L}^p([0, 1]) = \left\{ v : [0, 1] \to \mathbb{R}; \ \|v\|_p = \left( \int_0^1 |v(x)|^p \, dx \right)^{1/p} < \infty \right\}. \qquad (3)$$
We set
$$e_o = \int_0^1 f(x) \, dx, \qquad e_* = \int_0^1 g(x) \, dx, \qquad (4)$$
provided that they exist.
We formulate the following assumptions.

(H1) There exists a known constant $C_1 > 0$ such that
$$\sup_{x \in [0, 1]} |f(x)| \leq C_1. \qquad (5)$$

(H2) There exists a known constant $C_2 > 0$ such that
$$\sup_{x \in [0, 1]} |g(x)| \leq C_2. \qquad (6)$$

(H3) The density of $(U_1, V_1)$, denoted by $q$, is known and there exists a constant $c_3 > 0$ such that
$$c_3 \leq \inf_{(x, y) \in [0, 1]^2} q(x, y). \qquad (7)$$

(H4) There exists a known constant $\omega > 0$ such that
$$|e_o e_*| \geq \omega. \qquad (8)$$

The assumptions (H1) and (H2), involving the boundedness of $h$, are standard in nonparametric regression models. The knowledge of $q$ discussed in (H3) is restrictive but plausible in some situations, the most common case being $(U_1, V_1) \sim \mathcal{U}([0, 1]^2)$ (the uniform distribution on $[0, 1]^2$). Finally, let us mention that (H4) is just a technical assumption, more realistic than requiring the knowledge of $e_o$ and $e_*$ (which depend on $f$ and $g$, resp.).
3. MISE Result
Theorem 1 presents an estimator for $h$ and shows an upper bound for its MISE.

Theorem 1. One considers (1) under (H1)–(H4). One introduces the following estimator for $h$ (2):
$$\hat{h}(x, y) = \frac{\tilde{f}(x) \tilde{g}(y)}{\tilde{e}} 1_{\{|\tilde{e}| \geq \omega/2\}}, \qquad (9)$$
where $\tilde{f}$ denotes an arbitrary estimator for $f e_*$ in $\mathbb{L}^2([0, 1])$, $\tilde{g}$ denotes an arbitrary estimator for $g e_o$ in $\mathbb{L}^2([0, 1])$, $1$ denotes the indicator function,
$$\tilde{e} = \frac{1}{n} \sum_{i=1}^{n} \frac{Y_i}{q(U_i, V_i)}, \qquad (10)$$
and $\omega$ refers to (H4).

Then there exists a constant $C > 0$ such that
$$\mathbb{E}\left( \int_0^1 \!\! \int_0^1 \left( \hat{h}(x, y) - h(x, y) \right)^2 dx \, dy \right) \leq C \left( \mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \right) + \mathbb{E}\left( \| \tilde{f} - f e_* \|_2^2 \right) + \mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \, \| \tilde{f} - f e_* \|_2^2 \right) + \frac{1}{n} \right). \qquad (11)$$
The form of $\hat{h}$ (9) is derived from the multiplicative separable structure of $h$ (2) and a ratio-type normalization. Other results about such ratio-type estimators in a general statistical context can be found in Vasiliev [10].

Based on Theorem 1, $\hat{h}$ is efficient for $h$ if and only if $\tilde{f}$ is efficient for $f e_*$ and $\tilde{g}$ is efficient for $g e_o$ in terms of MISE. Even if several methods are possible, we focus our attention on wavelet methods, which enjoy adaptivity for a wide class of unknown functions and have optimal properties under the MISE. For details on the interests of wavelet methods in nonparametric statistics, we refer to Antoniadis [7], Vidakovic [8], and Härdle et al. [9].
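As an illustration of the ratio-type construction (9)-(10), here is a minimal sketch, assuming the design density $q$ is known and that estimators $\tilde{f}$ and $\tilde{g}$ of $f e_*$ and $g e_o$ are already available as callables; all names are illustrative and do not come from the paper.

```python
import numpy as np

def e_tilde(Y, U, V, q):
    """Empirical normalizer (10): the mean of Y_i / q(U_i, V_i)."""
    return np.mean(Y / q(U, V))

def h_hat(x, y, f_tilde, g_tilde, e_t, omega):
    """Plug-in estimator (9): f_tilde(x) g_tilde(y) / e_tilde, set to 0
    when |e_tilde| < omega / 2 (the indicator truncation in (9))."""
    if abs(e_t) < omega / 2:
        return np.zeros(np.broadcast(x, y).shape)
    return f_tilde(x) * g_tilde(y) / e_t
```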
4. Adaptive Wavelet Estimation

Before introducing our wavelet estimators, let us present some basics on wavelets.
4.1. Wavelet Basis on [0, 1]. Let us briefly recall the construction of wavelet bases on the interval $[0, 1]$ introduced by Cohen et al. [11]. Let $N$ be a positive integer, and let $\phi$ and $\psi$ be the initial wavelets of the Daubechies orthogonal wavelets db2N. We set
$$\phi_{j,k}(x) = 2^{j/2} \phi(2^j x - k), \qquad \psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k). \qquad (12)$$
With appropriate treatments at the boundaries, there exists an integer $\tau$ satisfying $2^\tau \geq 2N$ such that the collection
$$\mathcal{S} = \left\{ \phi_{\tau,k}(\cdot), \ k \in \{0, \ldots, 2^\tau - 1\}; \ \psi_{j,k}(\cdot), \ j \in \mathbb{N} - \{0, \ldots, \tau - 1\}, \ k \in \{0, \ldots, 2^j - 1\} \right\}$$
is an orthonormal basis of $\mathbb{L}^2([0, 1])$. Any $v \in \mathbb{L}^2([0, 1])$ can be expanded on $\mathcal{S}$ as
$$v(x) = \sum_{k=0}^{2^\tau - 1} \alpha_{\tau,k} \phi_{\tau,k}(x) + \sum_{j=\tau}^{\infty} \sum_{k=0}^{2^j - 1} \beta_{j,k} \psi_{j,k}(x), \quad x \in [0, 1], \qquad (13)$$
where $\alpha_{j,k}$ and $\beta_{j,k}$ are the wavelet coefficients of $v$ defined by
$$\alpha_{j,k} = \int_0^1 v(x) \phi_{j,k}(x) \, dx, \qquad \beta_{j,k} = \int_0^1 v(x) \psi_{j,k}(x) \, dx. \qquad (14)$$
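As a simplified illustration only (the theory above requires the Daubechies db2N family with the boundary treatment of Cohen et al. [11]), the following sketch uses the Haar wavelet, which needs no boundary correction, and approximates the coefficients (14) and the expansion (13) by Riemann sums; the test function $v$ is an arbitrary choice.

```python
import numpy as np

def phi(x):   # Haar scaling function on [0, 1)
    return np.where((x >= 0) & (x < 1), 1.0, 0.0)

def psi(x):   # Haar mother wavelet
    return np.where((x >= 0) & (x < 0.5), 1.0, 0.0) - np.where((x >= 0.5) & (x < 1), 1.0, 0.0)

def phi_jk(x, j, k):  # dilated/translated scaling functions, as in (12)
    return 2 ** (j / 2) * phi(2 ** j * x - k)

def psi_jk(x, j, k):  # dilated/translated wavelets, as in (12)
    return 2 ** (j / 2) * psi(2 ** j * x - k)

# Expand v(x) = sin(2 pi x) as in (13)-(14), truncating the detail sum at level J.
x = np.linspace(0.0, 1.0, 4096, endpoint=False)
dx = x[1] - x[0]
v = np.sin(2 * np.pi * x)

tau, J = 0, 6
alpha = [np.sum(v * phi_jk(x, tau, k)) * dx for k in range(2 ** tau)]       # Riemann sums for (14)
beta = {(j, k): np.sum(v * psi_jk(x, j, k)) * dx
        for j in range(tau, J) for k in range(2 ** j)}

# Truncated reconstruction of (13); it approaches v as J grows.
v_rec = sum(a * phi_jk(x, tau, k) for k, a in enumerate(alpha)) \
      + sum(b * psi_jk(x, j, k) for (j, k), b in beta.items())
```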
4.2. Besov Balls. For the sake of simplicity, we consider the sequential version of Besov balls defined as follows. Let $M > 0$, $s \in (0, N)$, $p \geq 1$, and $r \geq 1$. A function $v$ belongs to $B^s_{p,r}(M)$ if and only if there exists a constant $M^* > 0$ (depending on $M$) such that the associated wavelet coefficients (14) satisfy
$$2^{\tau(1/2 - 1/p)} \left( \sum_{k=0}^{2^\tau - 1} |\alpha_{\tau,k}|^p \right)^{1/p} + \left( \sum_{j=\tau}^{\infty} \left( 2^{j(s + 1/2 - 1/p)} \left( \sum_{k=0}^{2^j - 1} |\beta_{j,k}|^p \right)^{1/p} \right)^r \right)^{1/r} \leq M^*. \qquad (15)$$
In this expression, $s$ is a smoothness parameter and $p$ and $r$ are norm parameters. For a particular choice of $s$, $p$, and $r$, $B^s_{p,r}(M)$ contains the Hölder and Sobolev balls (see, e.g., DeVore and Popov [12], Meyer [13], and Härdle et al. [9]).
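For illustration, the left-hand side of (15), truncated at the finest level available, can be evaluated from wavelet coefficients such as those of the previous sketch; the routine below and the chosen values of $s$, $p$, and $r$ are illustrative assumptions, and the truncation only approximates the infinite sum.

```python
import numpy as np

def besov_seq_norm(alpha, beta, tau, J, s, p, r):
    """Truncated left-hand side of (15), from coefficients alpha[k] (level tau)
    and beta[(j, k)] for j = tau, ..., J - 1."""
    first = 2 ** (tau * (0.5 - 1.0 / p)) * np.sum(np.abs(alpha) ** p) ** (1.0 / p)
    levels = []
    for j in range(tau, J):
        bj = np.array([beta[(j, k)] for k in range(2 ** j)])
        levels.append((2 ** (j * (s + 0.5 - 1.0 / p)) * np.sum(np.abs(bj) ** p) ** (1.0 / p)) ** r)
    return first + np.sum(levels) ** (1.0 / r)

# e.g., with the Haar coefficients alpha, beta of the previous sketch:
# besov_seq_norm(np.array(alpha), beta, tau=0, J=6, s=1.0, p=2.0, r=2.0)
```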
4.3. Hard Thresholding Estimators. In the sequel, we consider (1) under (H1)–(H4).

We consider hard thresholding wavelet estimators for $\tilde{f}$ and $\tilde{g}$ in (9). They are based on a term-by-term selection of estimators of the wavelet coefficients of the unknown function. Those which are greater than a threshold are kept; the others are removed. This selection is the key to the adaptivity and the good performances of hard thresholding wavelet estimators (see, e.g., Donoho et al. [14], Delyon and Juditsky [15], and Härdle et al. [9]).

To be more specific, we use the "double thresholding" wavelet technique introduced by Delyon and Juditsky [15] and recently improved by Chaubey et al. [6]. The role of the second thresholding (appearing in the definition of the wavelet estimator of $\beta_{j,k}$) is to relax the assumptions on the model (see Remark 6).
Estimator $\tilde{f}$ for $f e_*$. We define the hard thresholding wavelet estimator $\tilde{f}$ by
$$\tilde{f}(x) = \sum_{k=0}^{2^\tau - 1} \hat{\alpha}_{\tau,k} \phi_{\tau,k}(x) + \sum_{j=\tau}^{j_1} \sum_{k=0}^{2^j - 1} \hat{\beta}_{j,k} 1_{\{|\hat{\beta}_{j,k}| \geq \kappa C_* \lambda_n\}} \psi_{j,k}(x), \qquad (16)$$
where
$$\hat{\alpha}_{\tau,k} = \frac{1}{a_n} \sum_{i=1}^{a_n} \frac{Y_i}{q(U_i, V_i)} \phi_{\tau,k}(U_i), \qquad (17)$$
where $a_n$ is the integer part of $n/2$,
$$\hat{\beta}_{j,k} = \frac{1}{a_n} \sum_{i=1}^{a_n} W_{i,j,k} 1_{\{|W_{i,j,k}| \leq C_*/\lambda_n\}}, \qquad W_{i,j,k} = \frac{Y_i}{q(U_i, V_i)} \psi_{j,k}(U_i), \qquad (18)$$
where $j_1$ is the integer satisfying $(1/2) a_n < 2^{j_1} \leq a_n$, $\kappa = 2 + 8/3 + 2\sqrt{4 + 16/9}$, $C_* = \sqrt{(2/c_3)\left( C_1^2 C_2^2 + \mathbb{E}(\xi_1^2) \right)}$, and
$$\lambda_n = \sqrt{\frac{\ln a_n}{a_n}}. \qquad (19)$$
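A hedged sketch of this construction is given below, reusing the illustrative Haar functions phi_jk and psi_jk defined earlier (the theory requires db2N wavelets with boundary corrections, so this is a simplification), and treating $\mathbb{E}(\xi_1^2)$ as known purely for the example. The same routine also covers the estimator of $g e_o$ defined next, by evaluating the wavelets at $V$ instead of $U$.

```python
import numpy as np

def hard_threshold_wavelet(Y, X, q_vals, C1, C2, var_xi, c3, phi_jk, psi_jk, tau=0):
    """Hard thresholding wavelet estimator of the form (16)-(19) (or (20)-(23)),
    built from the m observations passed in: weights Y_i / q(U_i, V_i) and
    wavelets evaluated at the covariate X_i (X = U for f_tilde, X = V for g_tilde)."""
    m = len(Y)
    w = Y / q_vals                                   # Y_i / q(U_i, V_i)
    j1 = int(np.floor(np.log2(m)))                   # (1/2) m < 2^{j1} <= m
    kappa = 2 + 8 / 3 + 2 * np.sqrt(4 + 16 / 9)
    C_star = np.sqrt((2 / c3) * (C1 ** 2 * C2 ** 2 + var_xi))
    lam = np.sqrt(np.log(m) / m)                     # lambda_n in (19) (eta_n in (23))

    # Approximation part: empirical coefficients (17).
    alpha_hat = [np.mean(w * phi_jk(X, tau, k)) for k in range(2 ** tau)]

    # Detail part: truncated empirical coefficients (18), kept only if they
    # pass the hard threshold appearing in (16).
    beta_hat = {}
    for j in range(tau, j1 + 1):
        for k in range(2 ** j):
            W = w * psi_jk(X, j, k)
            b = np.mean(W * (np.abs(W) <= C_star / lam))
            if np.abs(b) >= kappa * C_star * lam:
                beta_hat[(j, k)] = b

    def estimate(x):
        out = sum(a * phi_jk(x, tau, k) for k, a in enumerate(alpha_hat))
        for (j, k), b in beta_hat.items():
            out = out + b * psi_jk(x, j, k)
        return out
    return estimate
```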
Estimator $\tilde{g}$ for $g e_o$. We define the hard thresholding wavelet estimator $\tilde{g}$ by
$$\tilde{g}(x) = \sum_{k=0}^{2^\tau - 1} \hat{\upsilon}_{\tau,k} \phi_{\tau,k}(x) + \sum_{j=\tau}^{j_2} \sum_{k=0}^{2^j - 1} \hat{\theta}_{j,k} 1_{\{|\hat{\theta}_{j,k}| \geq \kappa_* C_* \eta_n\}} \psi_{j,k}(x), \qquad (20)$$
where
$$\hat{\upsilon}_{\tau,k} = \frac{1}{b_n} \sum_{i=1}^{b_n} \frac{Y_{a_n+i}}{q(U_{a_n+i}, V_{a_n+i})} \phi_{\tau,k}(V_{a_n+i}), \qquad (21)$$
where $a_n$ is the integer part of $n/2$ and $b_n = n - a_n$,
$$\hat{\theta}_{j,k} = \frac{1}{b_n} \sum_{i=1}^{b_n} Z_{a_n+i,j,k} 1_{\{|Z_{a_n+i,j,k}| \leq C_*/\eta_n\}}, \qquad Z_{a_n+i,j,k} = \frac{Y_{a_n+i}}{q(U_{a_n+i}, V_{a_n+i})} \psi_{j,k}(V_{a_n+i}), \qquad (22)$$
where $j_2$ is the integer satisfying $(1/2) b_n < 2^{j_2} \leq b_n$, $\kappa_* = 2 + 8/3 + 2\sqrt{4 + 16/9}$, $C_* = \sqrt{(2/c_3)\left( C_1^2 C_2^2 + \mathbb{E}(\xi_1^2) \right)}$, and
$$\eta_n = \sqrt{\frac{\ln b_n}{b_n}}. \qquad (23)$$
Estimator for $h$. From $\tilde{f}$ (16) and $\tilde{g}$ (20), we consider the following estimator for $h$ (2):
$$\hat{h}(x, y) = \frac{\tilde{f}(x) \tilde{g}(y)}{\tilde{e}} 1_{\{|\tilde{e}| \geq \omega/2\}}, \qquad (24)$$
where
$$\tilde{e} = \frac{1}{n} \sum_{i=1}^{n} \frac{Y_i}{q(U_i, V_i)} \qquad (25)$$
and $\omega$ refers to (H4).

Let us mention that $\hat{h}$ is adaptive in the sense that it does not depend on $f$ or $g$ in its construction.

Remark 2. Since $\tilde{f}$ is defined with $(Y_1, U_1, V_1), \ldots, (Y_{a_n}, U_{a_n}, V_{a_n})$ and $\tilde{g}$ is defined with $(Y_{a_n+1}, U_{a_n+1}, V_{a_n+1}), \ldots, (Y_n, U_n, V_n)$, thanks to the independence of $(Y_1, U_1, V_1), \ldots, (Y_n, U_n, V_n)$, $\tilde{f}$ and $\tilde{g}$ are independent.

Remark 3. The calibration of the parameters in $\tilde{f}$ and $\tilde{g}$ is based on theoretical considerations; thus defined, $\tilde{f}$ and $\tilde{g}$ can attain a fast rate of convergence under the MISE over Besov balls (see [6, Theorem 6.1]). Further details are given in the proof of Theorem 4.
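Putting the pieces together, here is a hedged end-to-end sketch of (24)-(25) that reuses the objects introduced in the earlier sketches (the simulated Y, U, V, the Haar phi_jk and psi_jk, the routine hard_threshold_wavelet, and e_tilde, h_hat); the constants C1, C2, var_xi, c3, and omega are assumed known here only so that the example runs, in the spirit of (H1)–(H4).

```python
import numpy as np

q = lambda u, v: np.ones_like(u)   # uniform design on [0, 1]^2, so q = 1 under (H3)

# Assumed known constants for the example: sup|f| <= 1.5, sup|g| <= 2,
# Var(xi_1) = 0.01, inf q = 1, and |e_o e_*| >= 0.5.
C1, C2, var_xi, c3, omega = 1.5, 2.0, 0.01, 1.0, 0.5

a_n = len(Y) // 2                  # integer part of n/2; b_n = n - a_n

# f_tilde from the first half (wavelets in U), as in (16)-(19);
# g_tilde from the second half (wavelets in V), as in (20)-(23).
f_tilde = hard_threshold_wavelet(Y[:a_n], U[:a_n], q(U[:a_n], V[:a_n]),
                                 C1, C2, var_xi, c3, phi_jk, psi_jk)
g_tilde = hard_threshold_wavelet(Y[a_n:], V[a_n:], q(U[a_n:], V[a_n:]),
                                 C1, C2, var_xi, c3, phi_jk, psi_jk)

# Normalizer (25) over the whole sample, then the final estimator (24) on a grid.
e_t = e_tilde(Y, U, V, q)
xg, yg = np.meshgrid(np.linspace(0.0, 1.0, 64), np.linspace(0.0, 1.0, 64), indexing="ij")
H_hat = h_hat(xg, yg, f_tilde, g_tilde, e_t, omega)
```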
4.4. Rate of Convergence. Theorem 4 investigates the rate of convergence attained by $\hat{h}$ under the MISE over Besov balls.

Theorem 4. We consider (1) under (H1)–(H4). Let $\hat{h}$ be (24) and let $h$ be (2). Suppose that

(i) $f \in B^{s_1}_{p_1, r_1}(M_1)$ with $M_1 > 0$, $r_1 \geq 1$, and either {$p_1 \geq 2$ and $s_1 \in (0, N)$} or {$p_1 \in [1, 2)$ and $s_1 \in (1/p_1, N)$},

(ii) $g \in B^{s_2}_{p_2, r_2}(M_2)$ with $M_2 > 0$, $r_2 \geq 1$, and either {$p_2 \geq 2$ and $s_2 \in (0, N)$} or {$p_2 \in [1, 2)$ and $s_2 \in (1/p_2, N)$}.

Then there exists a constant $C > 0$ such that
$$\mathbb{E}\left( \int_0^1 \!\! \int_0^1 \left( \hat{h}(x, y) - h(x, y) \right)^2 dx \, dy \right) \leq C \left( \frac{\ln n}{n} \right)^{2 s_* / (2 s_* + 1)}, \qquad (26)$$
where $s_* = \min(s_1, s_2)$.
The rate of convergence $(\ln n / n)^{2 s_* / (2 s_* + 1)}$ is the near optimal one in the minimax sense for the unidimensional regression model with random design under the MISE over Besov balls $B^{s_*}_{p,r}(M)$ (see, e.g., Tsybakov [16] and Härdle et al. [9]). Thus Theorem 4 proves that our estimator escapes the so-called "curse of dimensionality." Such a result is not possible with the standard bidimensional hard thresholding wavelet estimator, which attains the rate of convergence $(\ln n / n)^{2s/(2s + d)}$ with $d = 2$ under the MISE over bidimensional Besov balls defined with $s$ as smoothness parameter (see Delyon and Juditsky [15]).

Theorem 4 completes asymptotic results proved by Linton and Nielsen [1], who investigated this problem for the structured nonparametric regression model via another estimation method based on nonadaptive kernels.
Remark 5. In Theorem 4, we take into account both the homogeneous zone of Besov balls, that is, {$p_1 \geq 2$ and $s_1 \in (0, N)$}, and the inhomogeneous zone, that is, {$p_1 \in [1, 2)$ and $s_1 \in (1/p_1, N)$}, for the case $f \in B^{s_1}_{p_1, r_1}(M_1)$, and the same for $g \in B^{s_2}_{p_2, r_2}(M_2)$. This has the advantage of covering a very rich class of unknown regression functions $h$.

Remark 6. Note that Theorem 4 does not require the knowledge of the distribution of $\xi_1$; {$\mathbb{E}(\xi_1) = 0$ and the existence of $\mathbb{E}(\xi_1^2)$} is enough.
Remark 7. Let us mention that the phenomenon of the curse of dimensionality has also been studied via wavelet methods by Neumann [17], but for the multidimensional Gaussian white noise model and with different approaches based on anisotropic frameworks.

Remark 8. Our study can be extended to the multidimensional case considered by Yatchew and Bos [2], that is, $f : [0, 1]^{q_1} \to \mathbb{R}$ and $g : [0, 1]^{q_2} \to \mathbb{R}$, with $q_1$ and $q_2$ denoting two positive integers. In this case, adapting our framework to the multidimensional setting ($q_1$-dimensional Besov balls, $q_1$-dimensional (tensorial) wavelet bases, $q_1$-dimensional hard thresholding wavelet estimators; see, e.g., Delyon and Juditsky [15]), one can prove that (9) attains the rate of convergence $(\ln n / n)^{2 s_* / (2 s_* + q_*)}$, where $s_* = \min(s_1, s_2)$ and $q_* = \max(q_1, q_2)$.
5. Proofs

In this section, for the sake of simplicity, $C$ denotes a generic constant; its value may change from one term to another.
Proof of Theorem 1. Observe that
$$\hat{h}(x, y) - h(x, y) = \frac{\tilde{f}(x) \tilde{g}(y)}{\tilde{e}} 1_{\{|\tilde{e}| \geq \omega/2\}} - f(x) g(y) = \frac{1}{\tilde{e}} \left( \tilde{f}(x) \tilde{g}(y) - f(x) g(y) \tilde{e} \right) 1_{\{|\tilde{e}| \geq \omega/2\}} - f(x) g(y) 1_{\{|\tilde{e}| < \omega/2\}}. \qquad (27)$$
Therefore, using the triangular inequality, the Markov inequality, (H1), (H2), (H4), the inclusion $\{|\tilde{e}| < \omega/2\} \cap \{|e_* e_o| \geq \omega\} \subseteq \{|\tilde{e} - e_* e_o| \geq \omega/2\}$, and again the Markov inequality, we get
$$\left| \hat{h}(x, y) - h(x, y) \right| \leq \frac{2}{\omega} \left| \tilde{f}(x) \tilde{g}(y) - f(x) g(y) \tilde{e} \right| + |f(x) g(y)| 1_{\{|\tilde{e}| < \omega/2\}} \leq C \left( \left| \tilde{f}(x) \tilde{g}(y) - f(x) g(y) \tilde{e} \right| + 1_{\{|\tilde{e} - e_* e_o| \geq \omega/2\}} \right) \leq C \left( \left| \tilde{f}(x) \tilde{g}(y) - f(x) g(y) \tilde{e} \right| + |\tilde{e} - e_* e_o| \right). \qquad (28)$$
On the other hand, we have the decomposition
$$\tilde{f}(x) \tilde{g}(y) - f(x) g(y) \tilde{e} = f(x) e_* \left( \tilde{g}(y) - g(y) e_o \right) + g(y) e_o \left( \tilde{f}(x) - f(x) e_* \right) + \left( \tilde{g}(y) - g(y) e_o \right) \left( \tilde{f}(x) - f(x) e_* \right) + f(x) g(y) \left( e_* e_o - \tilde{e} \right). \qquad (29)$$
Owing to the triangular inequality, (H1), and (H2), we have
$$\left| \tilde{f}(x) \tilde{g}(y) - f(x) g(y) \tilde{e} \right| \leq C \left( \left| \tilde{g}(y) - g(y) e_o \right| + \left| \tilde{f}(x) - f(x) e_* \right| + \left| \tilde{g}(y) - g(y) e_o \right| \left| \tilde{f}(x) - f(x) e_* \right| + |\tilde{e} - e_* e_o| \right). \qquad (30)$$
Putting (28) and (30) together, we obtain
$$\left| \hat{h}(x, y) - h(x, y) \right| \leq C \left( \left| \tilde{g}(y) - g(y) e_o \right| + \left| \tilde{f}(x) - f(x) e_* \right| + \left| \tilde{g}(y) - g(y) e_o \right| \left| \tilde{f}(x) - f(x) e_* \right| + |\tilde{e} - e_* e_o| \right). \qquad (31)$$
Therefore, by the elementary inequality $(a + b + c + d)^2 \leq 8 (a^2 + b^2 + c^2 + d^2)$, $(a, b, c, d) \in \mathbb{R}^4$, an integration over $[0, 1]^2$, and taking the expectation, it comes
$$\mathbb{E}\left( \int_0^1 \!\! \int_0^1 \left( \hat{h}(x, y) - h(x, y) \right)^2 dx \, dy \right) \leq C \left( \mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \right) + \mathbb{E}\left( \| \tilde{f} - f e_* \|_2^2 \right) + \mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \, \| \tilde{f} - f e_* \|_2^2 \right) + \mathbb{E}\left( (\tilde{e} - e_* e_o)^2 \right) \right). \qquad (32)$$
Now observe that, owing to the independence of $(U_i, V_i)_{i \in \mathbb{Z}}$, the independence between $(U_1, V_1)$ and $\xi_1$, and $\mathbb{E}(\xi_1) = 0$, we obtain
$$\mathbb{E}(\tilde{e}) = \mathbb{E}\left( \frac{Y_1}{q(U_1, V_1)} \right) = \mathbb{E}\left( \frac{h(U_1, V_1)}{q(U_1, V_1)} \right) + \mathbb{E}(\xi_1) \mathbb{E}\left( \frac{1}{q(U_1, V_1)} \right) = \int_0^1 \!\! \int_0^1 \frac{f(x) g(y)}{q(x, y)} q(x, y) \, dx \, dy = \left( \int_0^1 f(x) \, dx \right) \left( \int_0^1 g(y) \, dy \right) = e_* e_o. \qquad (33)$$
Then, using arguments similar to (33), $(a + b)^2 \leq 2 (a^2 + b^2)$, $(a, b) \in \mathbb{R}^2$, (H1), (H2), (H3), and $\mathbb{E}(\xi_1^2) < \infty$, we have
$$\mathbb{E}\left( (\tilde{e} - e_* e_o)^2 \right) = \mathbb{V}(\tilde{e}) = \frac{1}{n} \mathbb{V}\left( \frac{Y_1}{q(U_1, V_1)} \right) \leq \frac{1}{n} \mathbb{E}\left( \left( \frac{Y_1}{q(U_1, V_1)} \right)^2 \right) \leq \frac{2}{n} \mathbb{E}\left( \frac{(h(U_1, V_1))^2 + \xi_1^2}{(q(U_1, V_1))^2} \right) \leq \frac{2}{c_3} \left( C_1^2 C_2^2 + \mathbb{E}(\xi_1^2) \right) \frac{1}{n} = C \frac{1}{n}. \qquad (34)$$
Equations (32) and (34) yield the desired inequality:
$$\mathbb{E}\left( \int_0^1 \!\! \int_0^1 \left( \hat{h}(x, y) - h(x, y) \right)^2 dx \, dy \right) \leq C \left( \mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \right) + \mathbb{E}\left( \| \tilde{f} - f e_* \|_2^2 \right) + \mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \, \| \tilde{f} - f e_* \|_2^2 \right) + \frac{1}{n} \right). \qquad (35)$$
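As a side illustration (not part of the proof), a quick Monte Carlo check of the identity (33) and of the variance bound (34) can be run under the assumed uniform design (so $q \equiv 1$) with the illustrative $f$ and $g$ used earlier, for which $e_o = 1$ and $e_* = 4/3$.

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: 1.0 + 0.5 * np.cos(2 * np.pi * x)   # e_o = int_0^1 f = 1
g = lambda y: 1.0 + y ** 2                        # e_* = int_0^1 g = 4/3

n, reps = 1000, 500
vals = []
for _ in range(reps):
    U, V = rng.uniform(size=n), rng.uniform(size=n)
    Y = f(U) * g(V) + rng.normal(0.0, 0.1, n)
    vals.append(np.mean(Y))          # e_tilde of (10)/(25) with q = 1 on [0, 1]^2

print(np.mean(vals), 4 / 3)          # close to e_* e_o, illustrating (33)
print(np.var(vals) * n)              # stays bounded, in line with the C / n bound of (34)
```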
Proof of Theorem 4. We aim to apply Theorem 1 by investigating the rates of convergence attained by $\tilde{f}$ and $\tilde{g}$ under the MISE over Besov balls.

First of all, remark that, for $\gamma \in \{\phi, \psi\}$, any integer $j \geq \tau$, and any $k \in \{0, \ldots, 2^j - 1\}$:

(i) Using arguments similar to (33), we obtain
$$\mathbb{E}\left( \frac{1}{a_n} \sum_{i=1}^{a_n} \frac{Y_i}{q(U_i, V_i)} \gamma_{j,k}(U_i) \right) = \mathbb{E}\left( \frac{Y_1}{q(U_1, V_1)} \gamma_{j,k}(U_1) \right) = \mathbb{E}\left( \frac{h(U_1, V_1)}{q(U_1, V_1)} \gamma_{j,k}(U_1) \right) + \mathbb{E}(\xi_1) \mathbb{E}\left( \frac{\gamma_{j,k}(U_1)}{q(U_1, V_1)} \right) = \int_0^1 \!\! \int_0^1 \frac{f(x) g(y)}{q(x, y)} \gamma_{j,k}(x) q(x, y) \, dx \, dy = \left( \int_0^1 f(x) \gamma_{j,k}(x) \, dx \right) \left( \int_0^1 g(y) \, dy \right) = \int_0^1 \left( f(x) e_* \right) \gamma_{j,k}(x) \, dx. \qquad (36)$$
(ii) Using arguments similar to (34) and $\| \gamma_{j,k} \|_2^2 = 1$, we have
$$\sum_{i=1}^{a_n} \mathbb{E}\left( \left( \frac{Y_i}{q(U_i, V_i)} \gamma_{j,k}(U_i) \right)^2 \right) = \mathbb{E}\left( \left( \frac{Y_1}{q(U_1, V_1)} \gamma_{j,k}(U_1) \right)^2 \right) a_n \leq 2 \mathbb{E}\left( \frac{(h(U_1, V_1))^2 + \xi_1^2}{(q(U_1, V_1))^2} (\gamma_{j,k}(U_1))^2 \right) a_n \leq \frac{2}{c_3} \left( C_1^2 C_2^2 + \mathbb{E}(\xi_1^2) \right) \mathbb{E}\left( \frac{(\gamma_{j,k}(U_1))^2}{q(U_1, V_1)} \right) a_n = \frac{2}{c_3} \left( C_1^2 C_2^2 + \mathbb{E}(\xi_1^2) \right) \int_0^1 \!\! \int_0^1 \frac{(\gamma_{j,k}(x))^2}{q(x, y)} q(x, y) \, dx \, dy \, a_n = \frac{2}{c_3} \left( C_1^2 C_2^2 + \mathbb{E}(\xi_1^2) \right) \| \gamma_{j,k} \|_2^2 \, a_n = C_*^2 a_n, \qquad (37)$$
with $C_*^2 = (2/c_3)\left( C_1^2 C_2^2 + \mathbb{E}(\xi_1^2) \right)$.
Applying [6, Theorem 6.1] (see the Appendix) with $n = \mu_n = \upsilon_n = a_n$, $\delta = 0$, $\theta_\gamma = C_*$, $W_i = (Y_i, U_i, V_i)$,
$$q_i(\gamma, (y, x, w)) = \frac{y}{q(x, w)} \gamma(x), \qquad (38)$$
and $f \in B^{s_1}_{p_1, r_1}(M_1)$ (so $f e_* \in B^{s_1}_{p_1, r_1}(M_1 e_*)$) with $M_1 > 0$, $r_1 \geq 1$, and either {$p_1 \geq 2$ and $s_1 \in (0, N)$} or {$p_1 \in [1, 2)$ and $s_1 \in (1/p_1, N)$}, we prove the existence of a constant $C > 0$ such that
$$\mathbb{E}\left( \| \tilde{f} - f e_* \|_2^2 \right) \leq C \left( \frac{\ln a_n}{a_n} \right)^{2 s_1 / (2 s_1 + 1)} \leq C \left( \frac{\ln n}{n} \right)^{2 s_1 / (2 s_1 + 1)}, \qquad (39)$$
when $n$ is large enough.
The MISE of $\tilde{g}$ can be investigated in a similar way: for $\gamma \in \{\phi, \psi\}$, any integer $j \geq \tau$, and any $k \in \{0, \ldots, 2^j - 1\}$:

(i) We show that
$$\mathbb{E}\left( \frac{1}{b_n} \sum_{i=1}^{b_n} \frac{Y_{a_n+i}}{q(U_{a_n+i}, V_{a_n+i})} \gamma_{j,k}(V_{a_n+i}) \right) = \int_0^1 \left( g(x) e_o \right) \gamma_{j,k}(x) \, dx. \qquad (40)$$

(ii) We show that
$$\sum_{i=1}^{b_n} \mathbb{E}\left( \left( \frac{Y_{a_n+i}}{q(U_{a_n+i}, V_{a_n+i})} \gamma_{j,k}(V_{a_n+i}) \right)^2 \right) \leq C_*^2 b_n, \qquad (41)$$
with again $C_*^2 = (2/c_3)\left( C_1^2 C_2^2 + \mathbb{E}(\xi_1^2) \right)$.

Applying again [6, Theorem 6.1] (see the Appendix) with $n = \mu_n = \upsilon_n = b_n$, $\delta = 0$, $\theta_\gamma = C_*$, $W_i = (Y_i, U_i, V_i)$,
$$q_i(\gamma, (y, x, w)) = \frac{y}{q(x, w)} \gamma(w), \qquad (42)$$
and $g \in B^{s_2}_{p_2, r_2}(M_2)$ with $M_2 > 0$, $r_2 \geq 1$, and either {$p_2 \geq 2$ and $s_2 \in (0, N)$} or {$p_2 \in [1, 2)$ and $s_2 \in (1/p_2, N)$}, we prove the existence of a constant $C > 0$ such that
$$\mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \right) \leq C \left( \frac{\ln b_n}{b_n} \right)^{2 s_2 / (2 s_2 + 1)} \leq C \left( \frac{\ln n}{n} \right)^{2 s_2 / (2 s_2 + 1)}, \qquad (43)$$
when $n$ is large enough.
Using the independence between $\tilde{f}$ and $\tilde{g}$ (see Remark 2), it follows from (39) and (43) that
$$\mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \, \| \tilde{f} - f e_* \|_2^2 \right) = \mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \right) \mathbb{E}\left( \| \tilde{f} - f e_* \|_2^2 \right) \leq C \left( \frac{\ln n}{n} \right)^{4 s_1 s_2 / ((2 s_1 + 1)(2 s_2 + 1))}. \qquad (44)$$
Owing to Theorem 1, (39), (43), and (44), we get
$$\mathbb{E}\left( \int_0^1 \!\! \int_0^1 \left( \hat{h}(x, y) - h(x, y) \right)^2 dx \, dy \right) \leq C \left( \mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \right) + \mathbb{E}\left( \| \tilde{f} - f e_* \|_2^2 \right) + \mathbb{E}\left( \| \tilde{g} - g e_o \|_2^2 \, \| \tilde{f} - f e_* \|_2^2 \right) + \frac{1}{n} \right) \leq C \left( \left( \frac{\ln n}{n} \right)^{2 s_2 / (2 s_2 + 1)} + \left( \frac{\ln n}{n} \right)^{2 s_1 / (2 s_1 + 1)} + \left( \frac{\ln n}{n} \right)^{4 s_1 s_2 / ((2 s_1 + 1)(2 s_2 + 1))} + \frac{1}{n} \right) \leq C \left( \frac{\ln n}{n} \right)^{2 s_* / (2 s_* + 1)}, \qquad (45)$$
with $s_* = \min(s_1, s_2)$.

Theorem 4 is proved.
Appendix

Let us now present in detail [6, Theorem 6.1], which is used two times in the proof of Theorem 4.

We consider a general form of the hard thresholding wavelet estimator, denoted by $\hat{f}_H$, for estimating an unknown function $f \in \mathbb{L}^2([0, 1])$ from $n$ independent random variables $W_1, \ldots, W_n$:
$$\hat{f}_H(x) = \sum_{k=0}^{2^\tau - 1} \hat{\alpha}_{\tau,k} \phi_{\tau,k}(x) + \sum_{j=\tau}^{j_1} \sum_{k=0}^{2^j - 1} \hat{\beta}_{j,k} 1_{\{|\hat{\beta}_{j,k}| \geq \kappa \vartheta_j\}} \psi_{j,k}(x), \qquad (A.1)$$
where
$$\hat{\alpha}_{j,k} = \frac{1}{\upsilon_n} \sum_{i=1}^{n} q_i(\phi_{j,k}, W_i), \qquad \hat{\beta}_{j,k} = \frac{1}{\upsilon_n} \sum_{i=1}^{n} q_i(\psi_{j,k}, W_i) 1_{\{|q_i(\psi_{j,k}, W_i)| \leq \varsigma_j\}},$$
$$\varsigma_j = \theta_\psi 2^{\delta j} \frac{\upsilon_n}{\sqrt{\mu_n \ln \mu_n}}, \qquad \vartheta_j = \theta_\psi 2^{\delta j} \sqrt{\frac{\ln \mu_n}{\mu_n}}, \qquad (A.2)$$
$\kappa \geq 2 + 8/3 + 2\sqrt{4 + 16/9}$, and $j_1$ is the integer satisfying
$$\frac{1}{2} \mu_n^{1/(2\delta + 1)} < 2^{j_1} \leq \mu_n^{1/(2\delta + 1)}. \qquad (A.3)$$
Here, we suppose that there exist

(i) $n$ functions $q_1, \ldots, q_n$ with $q_i : \mathbb{L}^2([0, 1]) \times W_i(\Omega) \to \mathbb{C}$ for any $i \in \{1, \ldots, n\}$,

(ii) two sequences of real numbers $(\upsilon_n)_{n \in \mathbb{N}}$ and $(\mu_n)_{n \in \mathbb{N}}$ satisfying $\lim_{n \to \infty} \upsilon_n = \infty$ and $\lim_{n \to \infty} \mu_n = \infty$,

such that, for $\gamma \in \{\phi, \psi\}$:

(A1) for any integer $j \geq \tau$ and any $k \in \{0, \ldots, 2^j - 1\}$,
$$\mathbb{E}\left( \frac{1}{\upsilon_n} \sum_{i=1}^{n} q_i(\gamma_{j,k}, W_i) \right) = \int_0^1 f(x) \gamma_{j,k}(x) \, dx; \qquad (A.4)$$

(A2) there exist two constants, $\theta_\gamma > 0$ and $\delta \geq 0$, such that, for any integer $j \geq \tau$ and any $k \in \{0, \ldots, 2^j - 1\}$,
$$\sum_{i=1}^{n} \mathbb{E}\left( \left| q_i(\gamma_{j,k}, W_i) \right|^2 \right) \leq \theta_\gamma^2 2^{2 \delta j} \frac{\upsilon_n^2}{\mu_n}. \qquad (A.5)$$

Let $\hat{f}_H$ be (A.1) under (A1) and (A2). Suppose that $f \in B^s_{p,r}(M)$ with $r \geq 1$, and either {$p \geq 2$ and $s \in (0, N)$} or {$p \in [1, 2)$ and $s \in ((2\delta + 1)/p, N)$}. Then there exists a constant $C > 0$ such that
$$\mathbb{E}\left( \| \hat{f}_H - f \|_2^2 \right) \leq C \left( \frac{\ln \mu_n}{\mu_n} \right)^{2s/(2s + 2\delta + 1)}. \qquad (A.6)$$
Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.
References

[1] O. B. Linton and J. P. Nielsen, "A kernel method of estimating structured nonparametric regression based on marginal integration," Biometrika, vol. 82, no. 1, pp. 93–100, 1995.
[2] A. Yatchew and L. Bos, "Nonparametric least squares estimation and testing of economic models," Journal of Quantitative Economics, vol. 13, pp. 81–131, 1997.
[3] J. Pinske, Feasible Multivariate Nonparametric Regression Estimation Using Weak Separability, University of British Columbia, Vancouver, Canada, 2000.
[4] A. Lewbel and O. Linton, "Nonparametric matching and efficient estimators of homothetically separable functions," Econometrica, vol. 75, no. 4, pp. 1209–1227, 2007.
[5] D. Jacho-Chávez, A. Lewbel, and O. Linton, "Identification and nonparametric estimation of a transformed additively separable model," Journal of Econometrics, vol. 156, no. 2, pp. 392–407, 2010.
[6] Y. P. Chaubey, C. Chesneau, and H. Doosti, "Adaptive wavelet estimation of a density from mixtures under multiplicative censoring," 2014, http://hal.archives-ouvertes.fr/hal-00918069.
[7] A. Antoniadis, "Wavelets in statistics: a review (with discussion)," Journal of the Italian Statistical Society B, vol. 6, no. 2, pp. 97–144, 1997.
[8] B. Vidakovic, Statistical Modeling by Wavelets, John Wiley & Sons, New York, NY, USA, 1999.
[9] W. Härdle, G. Kerkyacharian, D. Picard, and A. Tsybakov, Wavelets, Approximation and Statistical Applications, vol. 129 of Lecture Notes in Statistics, Springer, New York, NY, USA, 1998.
[10] V. A. Vasiliev, "One investigation method of a ratios type estimators," in Proceedings of the 16th IFAC Symposium on System Identification, pp. 1–6, Brussels, Belgium, July 2012.
[11] A. Cohen, I. Daubechies, and P. Vial, "Wavelets on the interval and fast wavelet transforms," Applied and Computational Harmonic Analysis, vol. 1, no. 1, pp. 54–81, 1993.
[12] R. DeVore and V. Popov, "Interpolation of Besov spaces," Transactions of the American Mathematical Society, vol. 305, pp. 397–414, 1988.
[13] Y. Meyer, Wavelets and Operators, Cambridge University Press, Cambridge, UK, 1992.
[14] D. L. Donoho, I. M. Johnstone, G. Kerkyacharian, and D. Picard, "Density estimation by wavelet thresholding," Annals of Statistics, vol. 24, no. 2, pp. 508–539, 1996.
[15] B. Delyon and A. Juditsky, "On minimax wavelet estimators," Applied and Computational Harmonic Analysis, vol. 3, no. 3, pp. 215–228, 1996.
[16] A. B. Tsybakov, Introduction à l'Estimation Non-Paramétrique, Springer, New York, NY, USA, 2004.
[17] M. H. Neumann, "Multivariate wavelet thresholding in anisotropic function spaces," Statistica Sinica, vol. 10, no. 2, pp. 399–431, 2000.