Visualizing Class Specific Heterogeneous Tendencies in Categorical Data

Mariko Takagishi, Michel van de Velden

Research output: Contribution to journal, Article, peer-review

Abstract

In multiple correspondence analysis, both individuals (observations) and categories can be represented in a biplot that jointly depicts the relationships across categories and individuals, as well as the associations between them. Additional information about the individuals can aid interpretation, for example class information whose interdependencies are not of immediate concern but that facilitates interpretation of the plot with respect to relationships between individuals and categories. This article proposes a new method, which we call multiple-class cluster correspondence analysis, that identifies clusters specific to classes. The proposed method can construct a biplot that depicts heterogeneous tendencies of individual members, as well as their relationships with the original categorical variables. A simulation study investigating the performance of the proposed method and an application to data on road accidents in the United Kingdom confirm the viability of this approach. Supplementary materials for this article are available online.

Original language: English
Journal: Journal of Computational and Graphical Statistics
Publication status: Accepted/In press - 2022
Externally published: Yes

Keywords

  • Clustering
  • Contingency table
  • External information
  • Multiple correspondence analysis
  • Visualization

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Discrete Mathematics and Combinatorics
