Oversmoothing Degradation Analysis on Graph Neural Networks

Co-Supervised by: Francesco Leonardi

If you are interested in this topic or have further questions, do not hesitate to contact francesco.leonardi@unibe.ch.

Context

Graph Neural Networks (GNNs) are a powerful tool for processing graph-structured data, but they suffer from certain limitations, including oversmoothing. This phenomenon occurs as the number of stacked graph convolution (message-passing) layers increases: node representations tend to become too similar to each other, losing vital local information. This can significantly impair the predictive capabilities of the model, particularly in node and graph classification.
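The effect can be illustrated with a minimal sketch, which is not part of the project material. It assumes a toy graph (two triangles joined by one edge), random input features, and a purely linear GCN-style propagation step without learned weights or nonlinearities. The helper names normalized_adjacency and mean_pairwise_cosine_distance are hypothetical. The sketch tracks how the average cosine distance between node representations changes as the propagation step is repeated.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetrically normalised adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def mean_pairwise_cosine_distance(h):
    """Average of (1 - cosine similarity) over all pairs of distinct nodes."""
    norms = np.linalg.norm(h, axis=1, keepdims=True)
    unit = h / np.clip(norms, 1e-12, None)
    cos = unit @ unit.T
    off_diagonal = ~np.eye(h.shape[0], dtype=bool)
    return float((1.0 - cos[off_diagonal]).mean())


# Toy graph (an assumption for illustration): two triangles joined by edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))  # random node features, 4 dimensions
propagate = normalized_adjacency(adj)

# Record the distance at selected depths while repeatedly propagating.
checkpoints = (0, 1, 2, 4, 8, 16, 32, 64)
distance_by_depth = {}
h = features.copy()
for depth in range(max(checkpoints) + 1):
    if depth in checkpoints:
        distance_by_depth[depth] = mean_pairwise_cosine_distance(h)
    h = propagate @ h  # one linear "layer": neighbourhood averaging only

for depth, distance in distance_by_depth.items():
    print(f"depth {depth:2d}: mean cosine distance = {distance:.6f}")
```

On a connected graph such as this one, the distance at depth 64 should be close to zero, since repeated neighbourhood averaging drives all node representations towards the same direction. Trained GNN layers with weights and nonlinearities may behave differently, which is part of what the project would investigate.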

In recent years, several approaches have been proposed to address oversmoothing, including various types of layers and architecture modifications. These attempts aim to preserve local information in the graphs, but it is still unclear how each approach affects the degradation of the model’s predictive capabilities.

Goal(s)

The aim of this project is to analyse the impact of different GNN layer types on the predictive capabilities of the model, for both node and graph classification. In particular, the study will seek to identify recurring patterns in the performance degradation caused by oversmoothing, in relation to the layer type and the hyperparameters of the GNN.

Approach

  • How does network depth (the number of layers) affect predictive performance in GNNs?
  • Does oversmoothing have the same impact on node classification as on graph classification?
  • Are different layer types (e.g. GCN, GraphSAGE, GAT) equally vulnerable to oversmoothing?
  • Is it possible to identify specific patterns that correlate model performance with the degree of oversmoothing of node representations?

Methodology

To investigate these questions, various graph datasets will be used. The experimental approach involves:

  • The training of GNN models with different types of layers and architectures.
  • The evaluation of performance on node-level and graph-level tasks, using metrics such as accuracy and F1 score.
  • The analysis of oversmoothing by observing how node representations change as network depth (the number of layers) increases.
  • Experimentation with techniques to mitigate oversmoothing, such as residual connections (skip connections).
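The last two steps could be prototyped along the following lines. This is only a sketch under stated assumptions, not the project's prescribed method. It uses a toy graph, random features and linear propagation without trained weights. It measures oversmoothing with a normalised Dirichlet energy, which is one possible metric among several. It also uses an initial-residual update (each layer mixes in a fraction alpha of the input features, in the style of APPNP) as one specific variant of a skip connection. The names depth_sweep, dirichlet_energy and alpha are hypothetical.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetrically normalised adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def dirichlet_energy(p, h):
    """Normalised Dirichlet energy tr(H^T (I - P) H) / ||H||_F^2.

    Values near zero mean the representations are almost fully smoothed.
    """
    laplacian = np.eye(p.shape[0]) - p
    return float(np.trace(h.T @ laplacian @ h) / np.sum(h * h))


def depth_sweep(p, x, depths, alpha=0.0):
    """Return {depth: energy} for repeated propagation.

    alpha = 0 gives plain propagation; alpha > 0 adds an initial-residual
    connection: H <- (1 - alpha) * P @ H + alpha * X.
    """
    energies = {}
    h = x.copy()
    for depth in range(max(depths) + 1):
        if depth in depths:
            energies[depth] = dirichlet_energy(p, h)
        h = (1.0 - alpha) * (p @ h) + alpha * x
    return energies


# Toy graph (an assumption for illustration): two triangles joined by edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))
p = normalized_adjacency(adj)
depths = (0, 2, 8, 32, 64)

plain = depth_sweep(p, x, depths, alpha=0.0)
residual = depth_sweep(p, x, depths, alpha=0.2)

for depth in depths:
    print(f"depth {depth:2d}: plain = {plain[depth]:.2e}, residual = {residual[depth]:.2e}")
```

In this simplified setting the plain variant loses almost all of its energy at large depth, while the residual variant settles at a clearly larger value. In the actual experiments the same kind of sweep would be run on trained models (for example with PyTorch), with the energy or a comparable metric recorded alongside accuracy and F1 score at each depth.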

Required Skills

  • Good programming skills, with a preference for Python and PyTorch.
  • Mathematical Knowledge: Strong background in linear algebra (matrix operations, eigenvalues, SVD), differential calculus and statistics.
  • Basic knowledge of graph algorithms; prior knowledge of graph neural networks is appreciated.
  • Languages: Ability to work effectively in at least one of the following languages: English, French, Italian.