Reading Notes: A Stable Variational Autoencoder for Text Modelling

Introduction

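This INLG 2019 short paper introduces a regularisation method for VAE-based language modelling that reduces latent variable collapse in VAE-RNN architectures.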

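VAEs, especially those that use RNNs as the encoder and decoder, are prone to collapse, where the KL term of the loss vanishes. The authors attribute this to regularising only the final encoder output, so they apply the regularisation at every RNN time step instead.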

Loss function of the original VAE-RNN: \( L(\theta, \phi; X) = E_{Q_\phi (Z|X)}[\log P_\theta(X|Z)] - KL(Q_\phi (Z|X) || P(Z)) \)

With holistic regularisation: \( L(\theta, \phi; X) = E_{Q_\phi (Z_N|X)}[\log P_\theta(X|Z_N)] - \frac{1}{N} \sum_{t=0}^N KL(Q_{\phi_t} (Z_t|X) || P(Z_t)) \)

The resulting model is called HR-VAE (holistic regularisation VAE).
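The difference between the two KL terms can be illustrated with a small sketch. It is not the authors' code: it assumes a diagonal Gaussian posterior at each time step and a standard normal prior (the usual VAE setup, not stated in these notes), and the function names `gaussian_kl`, `last_step_kl` and `holistic_kl` are hypothetical. For simplicity it divides by the number of KL terms supplied, whereas the formula above writes the factor as 1/N.

```python
import math

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a single time step.

    Assumption: diagonal Gaussian posterior and standard normal prior.
    """
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, logvar))

def last_step_kl(mus, logvars):
    """Baseline VAE-RNN: only the final step's posterior is regularised."""
    return gaussian_kl(mus[-1], logvars[-1])

def holistic_kl(mus, logvars):
    """Holistic variant: average the KL terms of all time steps.

    Assumption: normalises by the number of terms supplied.
    """
    kls = [gaussian_kl(mu, lv) for mu, lv in zip(mus, logvars)]
    return sum(kls) / len(kls)

# Toy posterior parameters: 3 time steps, 2-dimensional latent variable.
mus = [[0.0, 0.0], [1.0, -1.0], [0.5, 0.5]]
logvars = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

print("last-step KL:", last_step_kl(mus, logvars))
print("holistic KL: ", holistic_kl(mus, logvars))
```

In this toy input the middle step is far from the prior, which the last-step term ignores but the holistic term penalises.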

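Concretely, both the encoder and the decoder are two-layer LSTMs; the authors note that other recurrent units such as GRU would also work.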

Name	Description	Link
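PTB	Penn Treebank, a classic part-of-speech tagging corpus	https://catalog.ldc.upenn.edu/LDC99T42
E2E	Generating restaurant reviews from data tables; also the challenge task at this INLG	http://www.macs.hw.ac.uk/InteractionLab/E2E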

The paper compares HR-VAE with several recent open-source models; the results are shown in the figure above.

Model	Paper	Code
VAE-LSTM-base	CoNLL2016 Generating sentences from a continuous space	https://github.com/timbmg/Sentence-VAE
VAE-CNN	ICML2017 Improved variational autoencoders for text modeling using dilated convolutions	https://github.com/kefirski/contiguous-succotash
vMF-VAE	EMNLP2018 Spherical latent spaces for stable variational autoencoders	https://github.com/jiacheng-xu/vmf_vae_nlp

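The authors also visualise the training process and the results, showing that the regularisation has a smoothing effect and yields a visible improvement.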

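The paper gives a fairly comprehensive overview and comparative analysis of related work on VAEs for text generation. The model is simple and effective.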