Reducing costs in linear regression experiments
David Causeur & Thierry Dhorne
Laboratory of Applied Statistics (SABRES)
University of South Brittany
Rue Yves Mainguy, Tohannic
F56000 Vannes, France
email : causeur@iuvannes.fr
thierry.dhorne@univubs.fr

Experimental costs :

Multiplicity of grading systems :
Spatial trend
Slaughterhouse effect
Chronological trend
Genetic selection effect
Conniffe, D. and Moran, M.A. (1972). Double sampling with regression in comparative studies of carcass composition. Biometrics 28, 1011-1023.
Cook, G.L., Jones, D.W. and Kempster, A.J. (1983). A note on a simple criterion for choosing among sample joints for use in double sampling. Animal Production 36, 493-495.
Engel, B. and Walstra, P. (1991). Increasing precision or reducing expense in regression by using information from a concomitant variable. Biometrics 47, 13-20.
Causeur, D. and Dhorne, T. (1998). Finite-sample properties of a multivariate extension of double-regression. Biometrics 54 (4), 299-309.
Causeur, D. (1998). Plan d'échantillonnage à plusieurs phases pour la réduction des coûts expérimentaux en régression linéaire. Revue de Statistique Appliquée XLVI (4), 59-73.
Causeur, D. and Dhorne, T. (2000). Using surrogate predictors in linear regression models. Submitted to Biometrika.
Main parts of the talk
Double regression
Double sampling
Use of an auxiliary covariate
Estimation procedure
Optimization of the sampling design
Multivariate double-regression
Two-phase sampling
Multiple-phase sampling
Calibration methods
Experimental constraints
Sampling design


Double sampling design

Practical properties of Z
Z better predictor of Y than X
Z cheaper than Y
Random subsampling or selection
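
The double sampling design above can be illustrated with a minimal sketch of the double-regression idea: Y is regressed on (X, Z) in the small sample where Y is observed, Z is regressed on X in the larger sample, and the two fits are combined into a prediction formula for Y from X alone. The function name, the simulated data, the coefficients and the sample sizes (76 and 245) are assumptions chosen for illustration; this is not the authors' procedure or data.

```python
import numpy as np

def double_regression(x_small, z_small, y_small, x_large, z_large):
    """Hypothetical helper: combine Y ~ (X, Z) on the small sample
    with Z ~ X on the large sample into a formula Y ~ X."""
    # Stage 1 (small sample, Y observed): Y = a + b*X + c*Z + error
    design_small = np.column_stack([np.ones_like(x_small), x_small, z_small])
    a, b, c = np.linalg.lstsq(design_small, y_small, rcond=None)[0]
    # Stage 2 (large sample, only X and Z observed): Z = d + e*X + error
    design_large = np.column_stack([np.ones_like(x_large), x_large])
    d, e = np.linalg.lstsq(design_large, z_large, rcond=None)[0]
    # Substitute the Z-equation into the Y-equation
    return a + c * d, b + c * e

# Simulated data (assumed model, for illustration only)
rng = np.random.default_rng(0)
n_large, n_small = 245, 76
x = rng.normal(size=n_large)
z = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=n_large)            # cheap covariate
y = 0.5 + 1.0 * x + 1.5 * z + rng.normal(scale=0.3, size=n_large)  # costly response

# Y is "measured" only on a random subsample of size n_small
idx = rng.choice(n_large, size=n_small, replace=False)
intercept, slope = double_regression(x[idx], z[idx], y[idx], x, z)

# Ordinary least squares on the small sample alone, for comparison
ols_slope, ols_intercept = np.polyfit(x[idx], y[idx], deg=1)

print(f"double regression: intercept={intercept:.3f}, slope={slope:.3f}")
print(f"OLS (small only):  intercept={ols_intercept:.3f}, slope={ols_slope:.3f}")
```

Under the simulated model the implied regression of Y on X has intercept 0.5 + 1.5 × 1.0 = 2.0 and slope 1.0 + 1.5 × 2.0 = 4.0, so both estimators can be compared against those values.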



Unbiased procedure
Efficiency of the procedure
Comparison with OLS efficiency

Therefore
E.C. protocol: "120 carcasses" constraint

Objective function to be optimized





Objective
To take into account the different costs of the auxiliary covariates
Monotone sampling design

Number of auxiliary covariates : 7
| Number of covariates | Optimal plan | Cost | Reduction (%) |
| --- | --- | --- | --- |
| 0 | 120 | 45600 | 0 |
| 1 | (76,245) | 34795 | 23.70 |
| 2 | (69,216,398) | 32275 | 29.22 |
| 3 | (64,195,195,377) | 31125 | 31.74 |
| 4 | (53,134,209,209,209) | 29155 | 36.06 |
| 5 | (48,153,178,178,178,241) | 27575 | 39.53 |
| 6 | (47,141,157,157,157,217,337) | 26570 | 41.73 |
| 7 | (46,61,140,155,155,155,215,333) | 26430 | 42.04 |
Optimal subsets :
J
(J,F)
(J,Ld,F)
(E,J,Ld,Lc)
(E,J,Ld,Lc,D)
(E,J,Ld,Lc,D,F)
(P,E,J,Ld,Lc,D,F)
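
The figures in the table can be checked with a short calculation. The sketch below assumes that the reduction is measured relative to the cost of the 120-carcass reference plan, and that a monotone design means non-decreasing sample sizes along each plan. Both readings are interpretations of the slide, and the variable names are illustrative.

```python
# Plans and costs copied from the table above
plans = {
    0: (120,),
    1: (76, 245),
    2: (69, 216, 398),
    3: (64, 195, 195, 377),
    4: (53, 134, 209, 209, 209),
    5: (48, 153, 178, 178, 178, 241),
    6: (47, 141, 157, 157, 157, 217, 337),
    7: (46, 61, 140, 155, 155, 155, 215, 333),
}
costs = {0: 45600, 1: 34795, 2: 32275, 3: 31125,
         4: 29155, 5: 27575, 6: 26570, 7: 26430}

base_cost = costs[0]
# Implied cost per carcass in the reference plan (120 carcasses)
unit_cost = base_cost / plans[0][0]

# Reduction (%) relative to the reference plan (assumed definition)
reductions = {k: round(100 * (1 - c / base_cost), 2) for k, c in costs.items()}

# Monotone check: sample sizes never decrease along a plan
is_monotone = {k: all(a <= b for a, b in zip(p, p[1:])) for k, p in plans.items()}

for k in plans:
    print(k, plans[k], costs[k], reductions[k], is_monotone[k])
```

With this definition the computed percentages agree with the Reduction (%) column to two decimals, and every plan passes the monotone check.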
Multiplicity of new grading systems
Multiplicity of prediction formulae
Multiplicity of linear regression experiments with measurements of Y
Main practical problems
Global experimental costs
Possible trends
Chronological trends
Spatial trends
...

Methodological proposals
How do these methods interact with PLS, PCR, ... ?
Practical proposals
Statistical handbook
Statistical software including detection of outliers, optimization of the sampling design, statistical analysis, ...