Objective of the cleaning procedure using smoothing spline ANOVA

A smoothing spline analysis of variance is fitted to each genotype-scenario of an experiment. An outlier repetition is detected when the TT*Rep (thermal time by repetition) interaction is significant according to a Kullback-Leibler (KL) projection. I consider a genotype-scenario an outlier in the following cases:

  • biovolume: if KL > 0.05
  • plantHeight: if KL > 0.05
  • leafArea: if KL > 0.05

The input dataset must contain the following columns:

  • experimentAlias
  • genotypeAlias
  • scenario
  • repetition
  • thermalTime (for thermal time)
  • the parameter of interest (biovolume, plantHeight, etc.)

The first five column names are the standard names returned by the web service.
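The column requirements listed above can be checked before running the analysis. The following is a minimal sketch in base R; the helper name `check_columns` and the toy data frame are hypothetical and are not part of phisStatR.

```r
# Standard column names listed above
required_cols <- c("experimentAlias", "genotypeAlias", "scenario",
                   "repetition", "thermalTime")

# Hypothetical helper: stops with a message if a required column is missing
check_columns <- function(data, parameter) {
  needed <- c(required_cols, parameter)
  missing_cols <- setdiff(needed, names(data))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy data frame with made-up values, for illustration only
toy <- data.frame(
  experimentAlias = "manip3",
  genotypeAlias   = "A3_H",
  scenario        = "WW",
  repetition      = 1L,
  thermalTime     = 10.5,
  biovolume       = 2.3,
  stringsAsFactors = FALSE
)

ok <- check_columns(toy, parameter = "biovolume")
```

Calling `check_columns(toy, parameter = "leafArea")` on this toy data frame stops with an error, because the `leafArea` column is absent.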

Import of data


  cat("-------------- plant3 dataset ---------------\n")
## -------------- plant3 dataset ---------------
## Experiment: manip3 
## Genotypes: 10 
##  [1] "A3_H"     "A310_H"   "11430_H"  "A554_H"   "A374_H"   "A347_H"  
##  [7] "B100_H"   "A375_H"   "AS5707_H" "A347"    
## Scenario: 2 
## [1] "WW" "WD"
## Repetition-scenario: 6 
## [1] "1-WW" "2-WW" "3-WW" "1-WD" "2-WD" "3-WD"
## Pots (number of plants): 60 
## Line: 25 
## Position: 42
  # Import the data. This example uses a dataset shipped with the phisStatR package;
  # import your own dataset with read.table() or a request to the web service.
  # You can add data management statements here.
  # Add the 'Ref' and 'Genosce' columns if they do not exist.
  # 'Ref' is the concatenation of experimentAlias-Line-Position-scenario
  # 'Genosce' is the concatenation of experimentAlias-genotypeAlias-scenario

  # For one parameter, for example biovolume
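The 'Ref' and 'Genosce' columns described in the comments above can be built with paste(). This is a minimal sketch, assuming the data frame has `Line` and `Position` columns and that "-" is the separator for both identifiers (the separator is visible in the 'Genosce' values printed later, but is an assumption for 'Ref'). The rows are toy values, not the real dataset.

```r
# Toy rows for illustration; Line and Position values are made up
mydata <- data.frame(
  experimentAlias = c("manip3", "manip3"),
  genotypeAlias   = c("11430_H", "AS5707_H"),
  scenario        = c("WW", "WD"),
  Line            = c(1L, 2L),
  Position        = c(10L, 11L),
  stringsAsFactors = FALSE
)

# 'Ref': experimentAlias-Line-Position-scenario, added only if absent
if (!"Ref" %in% names(mydata)) {
  mydata$Ref <- paste(mydata$experimentAlias, mydata$Line,
                      mydata$Position, mydata$scenario, sep = "-")
}

# 'Genosce': experimentAlias-genotypeAlias-scenario, added only if absent
if (!"Genosce" %in% names(mydata)) {
  mydata$Genosce <- paste(mydata$experimentAlias, mydata$genotypeAlias,
                          mydata$scenario, sep = "-")
}
```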

Curves by genotype-scenario


  outlierbio<-printGSS(object=resbio,threshold = 0.05)
  klbio<-printGSS(object=resbio,threshold = NULL)

  cat("Detection of outlier curve with KL projection:\n")
## Detection of outlier curve with KL projection:
##              Genosce      ratio        kl     check
## 1  manip3-11430_H-WW 0.15015910 1175.7782 0.9999887
## 2 manip3-AS5707_H-WD 0.07205993  633.0729 0.9999874
  # You can export these two datasets:
  # remove the comment markers from the lines below
  #   row.names = FALSE,sep="\t")
  #   row.names = FALSE,sep="\t")

I use a threshold of 0.05 for this example. A lower threshold, such as 0.01 or 0.02, can be used to detect more outlier curves.
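The effect of the threshold can be illustrated on the two rows printed above. This is only a sketch: it assumes the threshold is compared with the `ratio` column, which is consistent with the output shown (both flagged rows have a ratio above 0.05) but is not confirmed here. The helper `flag_outliers` is hypothetical and is not the printGSS() implementation.

```r
# The two rows printed above, retyped by hand
kl_table <- data.frame(
  Genosce = c("manip3-11430_H-WW", "manip3-AS5707_H-WD"),
  ratio   = c(0.15015910, 0.07205993),
  kl      = c(1175.7782, 633.0729),
  check   = c(0.9999887, 0.9999874),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose ratio exceeds the threshold
flag_outliers <- function(tab, threshold) {
  tab[tab$ratio > threshold, , drop = FALSE]
}

flagged_005 <- flag_outliers(kl_table, 0.05)  # keeps both rows
flagged_010 <- flag_outliers(kl_table, 0.10)  # keeps only manip3-11430_H-WW
```

Raising the threshold flags fewer curves; by the same logic, lowering it flags more, provided the table contains every genotype-scenario and not only the two shown here.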

  # plot of the smoothing splines by genotype-scenario  
  for(i in seq(1,length(unique(mydata[,"Genosce"])),by=12)){

