Does KotlinDL have any plans to add GLU and GLU variants? Since I have used these activations quite a few times, I recently ended up creating a library for myself (https://github.com/Rishit-dagli/GLU) and would love to contribute it to KotlinDL.
This was meant to be an issue, but I first wanted to know whether this is already in the pipeline.
zaleslaw
05/13/2022, 1:08 PM
Could you please share more information about this unit, along with the paper and the architectures where it could be used? If it can be implemented with pure TF ops, you could add it as a PR, and I will assist you.
Rishit Dagli
05/14/2022, 3:49 PM
GLU is a simple activation function. The original GLU was first introduced in the paper "Language Modeling with Gated Convolutional Networks"; it has since been used in multiple architectures and seems to work quite well for NLP tasks. The other GLU variants were introduced in "GLU Variants Improve Transformer" and have been shown to work well for language tasks.
You can find other papers that use GLU here (a list of 220 papers): https://paperswithcode.com/method/glu
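To make the idea concrete, here is a minimal sketch of the gating in plain Python. It is not KotlinDL code and not the API of the library linked above. It assumes the input has already been projected and is simply split into two halves, with the second half used as the gate; in the papers the two halves come from separate learned linear projections, and which half is gated is only a convention. The names `gated_linear_unit` and `GATES` are made up for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gelu(v):
    # Exact GELU using the Gaussian error function.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def swish(v):
    # Swish with beta = 1 (also known as SiLU).
    return v * sigmoid(v)


# Gate function used by each variant (hypothetical lookup table).
GATES = {
    "glu": sigmoid,
    "bilinear": lambda v: v,
    "reglu": lambda v: max(0.0, v),
    "geglu": gelu,
    "swiglu": swish,
}


def gated_linear_unit(x, variant="glu"):
    """Split x into halves (a, b) and return a * gate(b) elementwise.

    Assumption: x is a flat list of floats with even length that has
    already been projected; no learned weights are applied here.
    """
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    half = len(x) // 2
    a, b = x[:half], x[half:]
    gate = GATES[variant]
    return [ai * gate(bi) for ai, bi in zip(a, b)]


print(gated_linear_unit([1.0, 2.0, 0.0, 0.0]))            # sigmoid(0) = 0.5 gate
print(gated_linear_unit([1.0, 2.0, -1.0, 3.0], "reglu"))  # ReLU gate
```

Since every variant is just an elementwise product with a different gate, each one should be expressible with ordinary tensor ops (split, multiply, and the corresponding activation).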