2024-03-31 Web Development

Semantic word spaces have been very useful but cannot express the meaning of longer phrases in a principled way.

Further progress towards understanding compositionality in tasks such as sentiment detection requires richer supervised training and evaluation resources and more powerful models of composition.

To remedy this, we introduce a Sentiment Treebank.

It includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality.

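Each labeled subtree corresponds to one of those 215,154 phrases. A minimal sketch of reading such annotations, assuming the PTB-style bracketed format in which the treebank is commonly distributed (e.g. `(1 (2 not) (4 (4 very) (3 good)))`, with node labels 0–4 from very negative to very positive); the helper names here are illustrative, not from the paper:

```python
def parse_tree(s):
    """Parse a bracketed sentiment tree such as '(3 (2 very) (3 good))'
    into nested (label, children) tuples, where each child is either a
    sub-tree tuple or a leaf word string."""
    tokens = s.replace('(', ' ( ').replace(')', ' ) ').split()

    def helper(i):
        # tokens[i] is '(' and tokens[i + 1] is the node's sentiment label
        label = int(tokens[i + 1])
        i += 2
        children = []
        while tokens[i] != ')':
            if tokens[i] == '(':
                child, i = helper(i)
                children.append(child)
            else:
                children.append(tokens[i])  # leaf word
                i += 1
        return (label, children), i + 1

    tree, _ = helper(0)
    return tree


def labeled_phrases(tree):
    """Return (all (label, phrase) pairs in the tree, the tree's words)."""
    label, children = tree
    phrases, words = [], []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            sub_phrases, sub_words = labeled_phrases(child)
            phrases.extend(sub_phrases)
            words.extend(sub_words)
    phrases.append((label, ' '.join(words)))
    return phrases, words


phrases, words = labeled_phrases(parse_tree("(1 (2 not) (4 (4 very) (3 good)))"))
# every node of the parse tree carries its own sentiment label,
# e.g. (3, 'good'), (4, 'very good'), (1, 'not very good')
```

This per-node labeling is what distinguishes the treebank from sentence-level sentiment corpora: supervision is available for every phrase, not only for full sentences.
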
To address them, we introduce the Recursive Neural Tensor Network.

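The defining step of this model is its tensor-based composition: a parent vector p is computed from the concatenation h = [b; c] of its two child vectors as p = tanh(hᵀV[1:d]h + Wh), where each slice of the tensor V contributes one output dimension. A minimal numpy sketch of that composition rule (the dimension d and the random parameters are purely illustrative, not the paper's trained values):

```python
import numpy as np

def rntn_compose(b, c, V, W):
    """RNTN composition: p[k] = tanh(h^T V[k] h + (W h)[k]) for each
    tensor slice k, where h is the concatenation of the child vectors."""
    h = np.concatenate([b, c])                      # (2d,)
    tensor_term = np.einsum('i,kij,j->k', h, V, h)  # bilinear form per slice -> (d,)
    linear_term = W @ h                             # (d,)
    return np.tanh(tensor_term + linear_term)

d = 4
rng = np.random.default_rng(0)
b, c = rng.normal(size=d), rng.normal(size=d)      # child node vectors
V = rng.normal(scale=0.1, size=(d, 2 * d, 2 * d))  # d tensor slices of shape 2d x 2d
W = rng.normal(scale=0.1, size=(d, 2 * d))         # standard layer
p = rntn_compose(b, c, V, W)                       # parent vector, shape (d,)
```

Applied bottom-up over a parse tree, the same V and W yield a vector for every phrase node, which a per-node softmax classifier can then map to the five sentiment classes.
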
When trained on the new treebank, this model outperforms all previous methods on several metrics.

It pushes the state of the art in single sentence positive/negative classification from 80% up to 85.4%.

The accuracy of predicting fine-grained sentiment labels for all phrases reaches 80.7%, an improvement of 9.7% over bag-of-features baselines.

Lastly, it is the only model that can accurately capture the effects of negation and its scope at various tree levels for both positive and negative phrases.

1 Introduction

Semantic vector spaces for single words have been widely used as features (Turney and Pantel, 2010).

Because they cannot capture the meaning of longer phrases properly, compositionality in semantic vector spaces has recently received a lot of attention (Mitchell and Lapata, 2010; Socher et al., 2010; Zanzotto et al., 2010; Yessenalina and Cardie, 2011; Socher et al., 2012; Grefenstette et al., 2013).

However, progress is held back by the current lack of large and labeled compositionality resources and models to accurately capture the underlying phenomena presented in such data.

To address this need, we introduce the Stanford Sentiment Treebank and a powerful Recursive Neural Tensor Network that can accurately predict the compositional semantic effects present in this new corpus.

Figure 1: Example of the Recursive Neural Tensor Network accurately predicting 5 sentiment classes, very negative to very positive (– –, –, 0, +, + +), at every node of a parse tree and capturing the negation and its scope in this sentence.

The Stanford Sentiment Treebank is the first corpus with fully labeled parse trees that allows for a complete analysis of the compositional effects of sentiment in language.

The corpus is based on the dataset introduced by Pang and Lee (2005) and consists of 11,855 single sentences extracted from movie reviews.

It was parsed with the Stanford parser (Klein and Manning, 2003) and includes a total of 215,154 unique phrases from those parse trees, each annotated by 3 human judges.

This new dataset allows us to analyze the intricacies of sentiment and to capture complex linguistic phenomena.

Fig. 1 shows one of the many examples with clear compositional structure.

The granularity and size of this dataset will enable the community to train compositional models that are based on supervised and structured machine learning techniques.
