
Sparsifying Transformer Models with Trainable Representation Pooling

This repository contains a demonstrative implementation of pooling-based models (e.g., DeepPyramidion) accompanying our paper Sparsifying Transformer Models with Trainable Representation Pooling.

A detailed README describing how to use it is provided with the examples; see fairseq/examples/deep_pyramidion for more.

[arXiv, BibTeX]

General information

The method we propose is inspired by the search for relevant fragments, an important aspect of human cognition when engaged in reading-to-do tasks. We intend to mimic such relevance judgments and hypothesize that it is possible to solve problems involving natural language using only selected passages of the input text.

These passages may be substantially shorter than the original text. One may compare this to a person reading a paper and highlighting it in such a way that a summary can be written using only the highlighted parts.

The end-to-end mechanism we introduce performs such highlighting by scoring the token representations and passing only the selected ones to the next layer of the neural network.
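
To make the idea concrete, below is a minimal PyTorch sketch of such a scoring-and-selection layer. The class name `TopKPooler`, the linear scorer, and the sigmoid gating are illustrative assumptions, not the repository's exact implementation; the paper's pooling is trainable end-to-end, and the real code lives under fairseq/examples/deep_pyramidion.

```python
import torch
import torch.nn as nn


class TopKPooler(nn.Module):
    """Scores token representations and keeps only the top-k of them.

    Illustrative sketch only: a single linear scorer plus hard top-k
    selection, with sigmoid gating so gradients reach the scorer.
    """

    def __init__(self, hidden_dim: int, k: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)  # one relevance score per token
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_dim)
        scores = self.scorer(x).squeeze(-1)            # (batch, seq_len)
        topk = scores.topk(self.k, dim=-1).indices     # indices of kept tokens
        topk, _ = topk.sort(dim=-1)                    # keep original token order
        idx = topk.unsqueeze(-1).expand(-1, -1, x.size(-1))
        # Gating by the scores makes the selection trainable end-to-end.
        gated = x * scores.sigmoid().unsqueeze(-1)
        return gated.gather(1, idx)                    # (batch, k, hidden_dim)


# Example: pool 512 token representations down to 128 after an encoder layer.
pooler = TopKPooler(hidden_dim=256, k=128)
hidden = torch.randn(2, 512, 256)
print(pooler(hidden).shape)  # torch.Size([2, 128, 256])
```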

Figure. An illustration of sparse attention matrices assuming a three-layer encoder and decoder (separated by the dashed line). The blue color reflects the memory consumption of self-attention (encoder) and cross-attention (decoder). (A) The complete input consumed at once. (B) Memory reduced with blockwise attention and (C) pooling applied after the encoder. (D) Gradual reduction of memory by pooling after every layer.
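
As a back-of-the-envelope illustration of scenario (D), the snippet below compares self-attention memory for a full-length input against gradual pooling. The 4096-token input, three encoder layers, and halving after each layer are assumptions chosen for the example, not figures from the paper.

```python
# Self-attention memory per layer grows with seq_len ** 2.
def attention_cells(seq_lens):
    return sum(n * n for n in seq_lens)

full = attention_cells([4096, 4096, 4096])     # (A) complete input at every layer
gradual = attention_cells([4096, 2048, 1024])  # (D) pooling after every layer

print(f"(A) full input:      {full:,} cells")     # 50,331,648
print(f"(D) gradual pooling: {gradual:,} cells")  # 22,020,096
print(f"reduction: {full / gradual:.1f}x")        # ~2.3x for three layers
```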
