Penn Discourse Treebank Version 3.0

Resource
URL
https://dss2.princeton.edu/data/231/
Blurb

Penn Discourse Treebank (PDTB) Version 3.0 is the third release in the Penn Discourse Treebank project, the goal of which is to annotate the Wall Street Journal (WSJ) section of Treebank-2 with discourse relations. Penn Discourse Treebank Version 2 contains over 40,600 tokens of annotated relations. In Version 3, an additional 13,000 tokens were annotated, certain pairwise annotations were standardized, new senses were included, and the corpus was subjected to a series of consistency checks.

Largely because the PDTB project was based on the idea that discourse relations are grounded in an identifiable set of explicit words or phrases (discourse connectives) or simply in the adjacency of two sentences, the PDTB has been used by many researchers in the natural language processing community and, more recently, by researchers in psycholinguistics. It has also stimulated the development of similar resources in other languages and domains.

Link time
2021-03-17 14:08:00 UTC
Sample
Principal investigator
Producer
Distributor
Version
More detail URL
Resource type
Single study
Subjects
  • Art & Culture
  • Qualitative Data
Regions
    Countries