BlonD: An Automatic Evaluation Metric for Document-level Machine Translation
Yuchen Jiang, Shuming Ma, Dongdong Zhang, Ming Zhou
Standard automatic metrics such as BLEU are problematic for document-level MT evaluation: they can neither distinguish document-level improvements from sentence-level ones nor identify which specific phenomena lead to translation errors.
To address these problems, we propose BlonD, an automatic metric for document-level machine translation evaluation. BlonD takes discourse coherence into consideration by calculating the recall and distance of check-pointing phrases and tags, and further provides a comprehensive evaluation score by combining them with n-gram scores.
We conduct an extensive comparison between BlonD and existing evaluation metrics to illustrate their critical distinctions. Experimental results show that BlonD is substantially more sensitive to document-level improvements than those metrics. Human evaluation also reveals high Pearson correlation between BlonD scores and human judgments of translation quality.
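To make the checkpoint-based component concrete, here is a minimal sketch of recall over annotated checkpoint phrases and tags. The function name, the substring-matching heuristic, and the toy pronoun example are illustrative assumptions, not the released BlonD implementation, which also weights phenomenon categories and combines this recall with n-gram scores.

```python
from typing import List

def checkpoint_recall(hypothesis: str, checkpoints: List[str]) -> float:
    """Fraction of annotated checkpoint phrases/tags (e.g. pronouns,
    tense markers, named entities) recovered in the hypothesis.
    Illustrative sketch only, not the actual BlonD metric."""
    if not checkpoints:
        return 1.0
    hyp = hypothesis.lower()
    hits = sum(1 for phrase in checkpoints if phrase.lower() in hyp)
    return hits / len(checkpoints)

# Toy example: a system that keeps the pronoun "she" consistent across
# the document recovers more checkpoints than one that drops it.
good = checkpoint_recall("She said she would arrive on Tuesday.", ["she", "Tuesday"])
bad = checkpoint_recall("He said he would arrive on Tuesday.", ["she", "Tuesday"])
```

Sentence-level n-gram metrics would score both hypotheses nearly identically; the checkpoint recall separates them, which is the sensitivity gap the abstract describes.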
Directional Sentence-Pair Embedding for Commonsense Causal Reasoning
Yuchen Jiang, Zhenxin Xiao, Kai-Wei Chang
Enabling machines to reason and draw inferences over text is one of the core missions of natural language understanding. Although deep learning models have shown strong performance on various cross-sentence inference benchmarks, recent work has shown that they tend to exploit spurious statistical cues rather than capture deeper relations between pairs of sentences.
We show that state-of-the-art language encoding models are especially bad at modeling directional relations between sentences.
To remedy this issue, we incorporate a mutual attention mechanism into a transformer-based model to better capture directional relations between sentences. We further curate CER, a Cause-and-Effect Relation corpus, to help the model embed commonsense causal relations in sentence representations.
Experimental results show that the proposed approach improves performance on downstream applications such as abductive reasoning.
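The directionality argument above can be sketched numerically: if one sentence attends over the other before pooling, swapping the pair changes the representation, whereas symmetric pooling cannot tell cause from effect. The function below is an illustrative assumption, not the paper's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def directional_pair_vector(a, b):
    """a, b: (len, dim) token embeddings for a candidate sentence pair.
    Tokens of `a` attend over `b` before pooling, so swapping the
    arguments generally changes the output -- the directional signal
    that symmetric encoders miss. (Illustrative sketch only.)"""
    attn = softmax(a @ b.T, axis=1)   # (len_a, len_b) attention of a over b
    b_for_a = attn @ b                # (len_a, dim) context gathered from b
    return np.concatenate([a.mean(0), b_for_a.mean(0)])

rng = np.random.default_rng(0)
s1, s2 = rng.normal(size=(5, 8)), rng.normal(size=(7, 8))
v_fwd = directional_pair_vector(s1, s2)  # (cause, effect) order
v_bwd = directional_pair_vector(s2, s1)  # reversed order
```

A downstream classifier reading `v_fwd` versus `v_bwd` receives different inputs for the two orderings, which is the property needed for commonsense causal reasoning.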
BOSH: An Efficient Meta Algorithm for Decision-based Attacks
Zhenxin Xiao, Puyudi Yang, Yuchen Jiang, Kai-Wei Chang, Cho-Jui Hsieh
Adversarial example generation has become a viable method for evaluating the robustness of a machine learning model. In this work, we consider hard-label black-box attacks (a.k.a. decision-based attacks), a challenging setting in which adversarial examples are generated from only a series of black-box hard-label queries. This type of attack can be used against discrete and complex models, such as Gradient Boosting Decision Trees (GBDT) and detection-based defense models. Existing decision-based attacks based on iterative local updates often get stuck in a local minimum and fail to generate the optimal adversarial example with the smallest distortion.
To remedy this issue, we propose an efficient meta-algorithm called BOSH-attack, which substantially improves existing algorithms through Bayesian Optimization (BO) and Successive Halving (SH). In particular, instead of traversing a single solution path when searching for an adversarial example, we maintain a pool of solution paths to explore important regions. We show empirically that the proposed algorithm converges to a better solution than existing approaches, while requiring 10 times fewer queries than applying multiple random initializations.
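The successive-halving half of the meta-algorithm can be sketched as follows: refine every path in the pool for a few iterations, then discard the worse half, so the query budget concentrates on promising regions. This is an illustrative sketch under stated assumptions; the Bayesian Optimization proposal step and the actual hard-label query oracle are omitted.

```python
def successive_halving(candidates, refine, distortion, steps_per_round=1):
    """Keep a pool of attack solution paths; repeatedly refine each
    survivor, then keep the better half by distortion, until a single
    path remains. (Sketch of the SH component of BOSH; `refine` and
    `distortion` stand in for the attack's local update and objective.)"""
    pool = list(candidates)
    while len(pool) > 1:
        pool = [refine(c, steps_per_round) for c in pool]
        pool.sort(key=distortion)
        pool = pool[: max(1, len(pool) // 2)]
    return pool[0]

# Toy usage: "distortion" is |x| and refinement halves x each step,
# standing in for a local attack update.
best = successive_halving(
    candidates=[9.0, -4.0, 2.5, 7.0],
    refine=lambda x, n: x * 0.5 ** n,
    distortion=abs,
)
```

Compared with running the full budget on independent random restarts, halving prunes unpromising paths early, which is where the claimed query savings come from.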
Learning Language-agnostic Entity Prototype for Zero-shot Cross-lingual Entity Linking
Haihong Yang, Zhongkai Hu, Yuchen Jiang, Boxing Chen, Huajun Chen
Following the recent trend of language model pre-training, we propose to learn entity prototypes for building a general-purpose entity linking system. Our model, the Entity Prototype Network (EPN), produces language-agnostic entity prototypes simply by reading entity descriptions, and maps dissimilar surface forms in different languages to the neighborhood of the corresponding entity prototype in vector space.
We define a new Zero-shot Cross-lingual Entity Linking (ZXEL) task as a testbed to validate the core assumption that language-agnostic entity prototypes are learnable. We also propose a simple yet effective auxiliary task, termed Entity Identification, which further improves our model's ability to capture entity similarity, including intra-entity similarity and inter-entity dissimilarity. We evaluate our model on the new task in both classic and generalized settings.
Experimental results highlight the consistent improvement (>30% on average) of this approach over competitive baselines. An ablation study justifies the necessity of our model design and reveals the effect of the proposed auxiliary task. Data resources and out-of-the-box code will be made publicly available after the anonymity period.
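The linking step implied by the prototype assumption reduces to a nearest-prototype lookup: a mention encoding in any language is matched to its entity by cosine similarity. The function and toy vectors below are illustrative assumptions; the description encoder that produces the prototypes is not shown.

```python
import numpy as np

def link_mention(mention_vec, prototypes):
    """Zero-shot linking as nearest-prototype lookup by cosine
    similarity. `prototypes` maps entity name -> prototype vector.
    (Illustrative of the stated core assumption, not the EPN model.)"""
    names = list(prototypes)
    mat = np.stack([prototypes[n] for n in names])
    sims = mat @ mention_vec / (
        np.linalg.norm(mat, axis=1) * np.linalg.norm(mention_vec) + 1e-9
    )
    return names[int(np.argmax(sims))]

# Toy usage: two 2-D prototypes and a noisy mention vector. With a
# language-agnostic encoder, a mention in any language would land near
# the same prototype.
protos = {"Paris": np.array([1.0, 0.0]), "Tokyo": np.array([0.0, 1.0])}
linked = link_mention(np.array([0.9, 0.2]), protos)
```

Because the prototype space is shared across languages, no per-language candidate lists or retraining are needed at test time, which is what makes the zero-shot cross-lingual setting feasible.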