Multiple Factor Analysis on Wines

R code
# You will need the following libraries
# for this example, if these are not installed
# install them with `remotes`, `pak`

# de-comment the following lines to
# install missing packages
# install.packages("pak")
# pak::pak('dplyr')
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(knitr)
library(kableExtra)
#> 
#> Attaching package: 'kableExtra'
#> The following object is masked from 'package:dplyr':
#> 
#>     group_rows
library(ggplot2)
library(TExPosition)
#> Loading required package: prettyGraphs
#> Loading required package: ExPosition
#pak::pak('HerveAbdi/data4PCCAR')
library(data4PCCAR)
#pak::pak('HerveAbdi/PTCA4CATA')
library(FactoMineR)
suppressMessages(library(factoextra))
#pak::pak('gastonstat/colortools')
library(colortools)
library(coin)
#> Loading required package: survival
# suppressMessages(library(XLConnect))
library(dplyr)
library(ggplot2)
library(tidyr)
library(FactoMineR)
library(factoextra)

We illustrate MFA with the 6 wines and 3 assessors data-set. In this example, three different assessors (named e1, e2, and e3) evaluated 6 wines using their own descriptors.

R code
data('wines2007', package = "ExPosition")
# your data (a concatenated data tables)
# Have a look
wines2007$data
#>    e1.fruity e1.woody e1.coffee e2.red.fruit e2.roasted e2.vanillin e2.woody
#> W1         1        6         7            2          5           7        6
#> W2         5        3         2            4          4           4        2
#> W3         6        1         1            5          2           1        1
#> W4         7        1         2            7          2           1        2
#> W5         2        5         4            3          5           6        5
#> W6         3        4         4            3          5           4        5
#>    e3.fruity e3.butter e3.woody
#> W1         3         6        7
#> W2         4         4        3
#> W3         7         1        1
#> W4         2         2        2
#> W5         2         6        6
#> W6         1         7        5
# a data frame with one row that specifies the table membership of each column of `wines2007$data`
# run mfa
group4MFA <- summary(as.factor(t(wines2007$table)))
name.group4MFA <- names(summary(as.factor(t(wines2007$table))))
resMFA <- FactoMineR::MFA(wines2007$data, # data table
                          group = group4MFA,
                          name.group = name.group4MFA,
                          graph = FALSE)
# The list `resMFA` contains all the information 
# needed to interpret the results of the analysis.

The \(\mathbf{R_v}\) coefficient matrix

A first step is to look at the \(R_V\) matrix—A matrix that stores the values of the \(R_v\) coefficient between pairs of the original data tables (and also with the whole MFA).

For MFA, these coefficients are only descriptive and can be interpreted like a squared coefficient of correlation for matrices: a value close to 1 for a pair of data tables indicates that these two tables are storing similar information, a value close to zero indicates that these tables store independent information.

R code
# a nice heatmap
resMFA$group$RV %>%
  as_tibble(rownames = "rows") %>%
  pivot_longer(-1, names_to = "cols") %>%
    mutate(
      cols = factor(cols, levels = rev(unique(cols)))) %>%
  ggplot(aes(x = rows, y = cols)) +
  geom_tile(aes(fill = value)) +  
  geom_text(aes(label = round(value, 2)), color = "white", fontface = "bold") +  
  scale_fill_gradient2(low = "#BB4444", mid =  "#FFFFFF", high = "#4477AA", midpoint = 0, limits = c(-1, 1 + 1e-10)) +
  coord_equal() + 
  scale_x_discrete(position = "top") +
  labs(x = "", y = "", fill) +
  theme_minimal() + 
  # theme(text = element_text())
  geom_hline(yintercept = 1.5, color = "white") + 
  geom_vline(xintercept = 3.5, color = "white")

R code


# col <- colorRampPalette(
#   c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA")
#   )
# plot.RV <- corrplot::corrplot(resMFA$group$RV, 
#                            method = "color",
#                            col = col(201), 
#                            addCoef.col = "black", 
#                            number.cex = 0.8, 
#                            tl.col = "black", 
#                            mar = c(0,0,0,0),
#                            addgrid.col = "grey", 
#                            tl.srt = 90)
# print(plot.RV)

Weights (alpha in the paper) applied to each data table

You can show the alphas (\(\boldsymbol{\alpha}\)) with a barplot.

R code
#Eig.tab <- demo.mfa.2007$mexPosition.Data$Compromise$compromise.eigs
Eig.tab <- c(resMFA$separate.analyses$E1$eig[1],
             resMFA$separate.analyses$E1$eig[2],
             resMFA$separate.analyses$E1$eig[3])
Alpha <- 1/sqrt(Eig.tab)

tibble(Assessor = paste0("E", 1:3),
       Alpha = Alpha) %>%
  ggplot(aes(Assessor, Alpha, fill = Assessor)) + 
  geom_col(show.legend = FALSE) + 
  theme_bw() + 
  labs(y = "", title = "Weight applied to each table")

Scree-plot

R code
#Eig4scree <- demo.mfa.2007$mexPosition.Data$Table$eigs
Eig4scree <- resMFA$global.pca$eig[,1]
Tau4scree <- resMFA$global.pca$eig[,2]
# Use factorextra
# Eigenvalues/variances of dimensions
fviz_screeplot(resMFA)

Global factor scores of the rows:

This shows how the rows are projected onto the space from the point of all tables

(You want to plot them like how you plot them in PCA: Component 1 vs. Component 2)

R code
fi <- resMFA$global.pca$ind$coord
fviz_mfa_ind(resMFA, col.ind = "cos2",
  gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
  repel = TRUE)

Partial factor scores of the rows: how the rows are viewed from the persepctive of each table

(You want to plot them with the global factor scores. Please check out the example of DiSTATIS to see how you plot them.)

R code
# Partial individuals
fviz_mfa_ind(resMFA, partial = "all")

Loadings of variables

(You interpret them as in PCA.)

R code
# Quantitative variables
fviz_mfa_var(resMFA, "quanti.var", palette = "jco", 
             repel = TRUE)

Contributions

Left as an exercise.

Bonus for R users: How to run an MFA

FactoMineR is the standard R-package for running MFA. As the name indicates, the function FactoMineR::MFA will run the MFA.