Graph-Enhanced BERT for Automatic Error Detection
Co-Supervised by: Corina Masanti
If you are interested in this topic or have further questions, do not hesitate to contact corina.masanti@unibe.ch.
Background / Context
Transformer models perform well on sentence classification tasks but struggle to detect subtle grammatical errors, such as real-word errors (a correctly spelled word used in the wrong context, e.g., “their” for “there”). Prior work typically classifies each sample independently. In contrast, graph neural networks can exploit sample-to-sample relationships.
Research Question(s) / Goals
Does adding a graph-based neighborhood propagation layer on top of BERT embeddings improve sentence-level grammatical error detection?
Approach / Methods
- Use mBERT (multilingual BERT) to extract an embedding for each input sentence
- Build a graph with one node per sample (e.g., add edges to the k closest samples in terms of embedding and/or corrections; tune k)
- Apply a shallow GNN (e.g., GCN or GIN) for node classification with a limited number of message-passing rounds
- End-to-end training: update BERT and GNN weights jointly
- Evaluation: binary classification (error vs. clean sentence)
- Metrics: F1, precision, recall
- Comparison: BERT-only pipeline vs. BERT+GNN pipeline
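To make the graph-related steps above more concrete, here is a minimal NumPy sketch of how a k-nearest-neighbor graph, one GCN-style propagation step, and the evaluation metrics could look. It is an illustration under assumptions, not the project's implementation: the function names are hypothetical, cosine similarity is only one possible notion of "closest", the error class is assumed to be labeled 1, and only a forward pass is shown. End-to-end training of BERT and the GNN would need an autodiff framework such as PyTorch.

```python
import numpy as np


def knn_graph(embeddings: np.ndarray, k: int) -> np.ndarray:
    """Symmetric 0/1 adjacency linking each row to its k most
    cosine-similar rows (requires k < number of rows)."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a node is never its own neighbor
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    n = embeddings.shape[0]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), neighbors.ravel()] = 1.0
    # Keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)


def normalize_adjacency(adj: np.ndarray) -> np.ndarray:
    """Add self-loops and apply the symmetric normalization
    D^{-1/2} (A + I) D^{-1/2} used by GCN layers."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm: np.ndarray, features: np.ndarray,
              weight: np.ndarray) -> np.ndarray:
    """One round of message passing: ReLU(A_norm @ X @ W)."""
    return np.maximum(a_norm @ features @ weight, 0.0)


def precision_recall_f1(y_true, y_pred):
    """Binary metrics, assuming label 1 marks a sentence with an error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Random vectors stand in for mBERT sentence embeddings (assumption).
rng = np.random.default_rng(0)
sentence_embeddings = rng.normal(size=(6, 8))
adjacency = knn_graph(sentence_embeddings, k=2)
hidden = gcn_layer(normalize_adjacency(adjacency),
                   sentence_embeddings,
                   rng.normal(size=(8, 4)))
```

Here `hidden` holds one propagated feature vector per sentence. A classification head on top of it, together with gradient updates to both the encoder and the GNN weights, is what the end-to-end step would add.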
Expected Contributions / Outcomes
- Method to construct sentence-embedding graphs for grammatical error detection
- Hybrid BERT-GNN model for classification tasks
- Comparison between the mBERT baseline and the hybrid BERT-GNN model
- Reproducible pipeline and code
Required Skills / Prerequisites
- Strong programming skills
- Solid understanding of transformer-based language models or willingness to study them in depth
- Interest in graph neural networks and willingness to learn graph-based methods
Possible Extensions
- Different neighborhood methods (e.g., prototype-based)
- Adaptive edge weighting
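As one possible starting point for edge weighting, the sketch below replaces 0/1 edges with weights derived from a softmax over each node's cosine similarities to its k nearest neighbors. This is a fixed, similarity-based weighting chosen purely for illustration. A truly adaptive scheme would more likely learn the weights during training (for example with an attention mechanism). The function name and the `temperature` parameter are assumptions.

```python
import numpy as np


def softmax_knn_weights(embeddings: np.ndarray, k: int,
                        temperature: float = 1.0) -> np.ndarray:
    """Row-stochastic weight matrix: each node spreads a total weight of 1
    over its k most cosine-similar neighbors (requires k < number of rows)."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-edges
    n = embeddings.shape[0]
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    logits = sim[rows, neighbors] / temperature  # shape (n, k)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    weights = np.zeros((n, n))
    weights[rows, neighbors] = exp / exp.sum(axis=1, keepdims=True)
    return weights
```

A lower `temperature` concentrates the weight on the most similar neighbor, while a higher one approaches uniform weighting over the k neighbors.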
Further Reading / Starting Literature
- Devlin et al., “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”
- Hamilton, “Graph Representation Learning” (pre-publication version freely available) https://www.cs.mcgill.ca/~wlh/grl_book/
- Stanford CS224W: Machine Learning with Graphs (lectures + notes available) https://web.stanford.edu/class/cs224w/index.html
