CGN core corpus

| Treebank | Contents | NL # Sentences | NL # Words | VL # Sentences | VL # Words | Total # Sentences | Total # Words |
|---|---|---|---|---|---|---|---|
| NA / VA | Spontaneous conversations ('face-to-face') | 50,239 | 302,828 | 22,881 | 147,418 | 73,120 | 450,246 |
| NB / VB | Interviews with teachers of Dutch | 2,484 | 25,724 | 4,289 | 34,158 | 6,773 | 59,882 |
| NC / VC | Telephone conversations (recorded via a switchboard) | 11,649 | 70,084 | 3,142 | 19,984 | 14,791 | 90,068 |
| ND / VD | Telephone conversations (recorded on MD) | 0 | 0 | 929 | 6,309 | 929 | 6,309 |
| NE / VE | Simulated business negotiations | 3,123 | 25,524 | 0 | 0 | 3,123 | 25,524 |
| NF / VF | Interviews/discussions/debates (broadcast) | 6,290 | 75,167 | 2,617 | 25,122 | 8,907 | 100,289 |
| NG / VG | (Political) discussions/debates/meetings (non-broadcast) | 1,166 | 25,125 | 543 | 9,009 | 1,709 | 34,134 |
| NH / VH | Lessons recorded in the classroom | 3,064 | 26,004 | 1,395 | 10,116 | 4,459 | 36,120 |
| NI / VI | Live (sports) commentaries (broadcast) | 2,251 | 25,002 | 1,026 | 10,147 | 3,277 | 35,149 |
| NJ / VJ | News reports (broadcast) | 2,259 | 25,084 | 536 | 7,686 | 2,795 | 32,770 |
| NK / VK | News (broadcast) | 1,923 | 25,353 | 558 | 7,306 | 2,481 | 32,659 |
| NL / VL | Commentaries/columns/reviews (broadcast) | 1,857 | 25,082 | 601 | 7,431 | 2,458 | 32,513 |
| NM / VM | Ceremonious speeches/sermons | 444 | 5,190 | 107 | 1,894 | 551 | 7,084 |
| NN / VN | Lectures/seminars | 593 | 14,921 | 701 | 8,159 | 1,294 | 23,080 |
| NO / VO | Read speech | 0 | 0 | 3,256 | 44,144 | 3,256 | 44,144 |
| CGN core | Complete treebank | 87,342 | 671,088 | 42,581 | 338,883 | 129,923 | 1,009,971 |

#words: //node[@postag and not(@postag="LET()")]
#sentences: //sentence
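
The two XPath queries above can be approximated with Python's standard library. The sketch below is illustrative only: the sample document is a hypothetical miniature loosely modelled on Alpino-style XML, and the function name `count_words_and_sentences` is an assumption, not part of any CGN tooling. Because `xml.etree.ElementTree` supports only a subset of XPath (no `not()`), the punctuation filter is applied in Python rather than inside the query.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document (not taken from the CGN treebank).
SAMPLE = """<alpino_ds>
  <node cat="top">
    <node postag="VNW(pers,pron,nomin,vol,1,ev)" word="ik"/>
    <node postag="WW(pv,tgw,ev)" word="kom"/>
    <node postag="LET()" word="."/>
  </node>
  <sentence>ik kom .</sentence>
</alpino_ds>"""


def count_words_and_sentences(xml_text):
    """Return (words, sentences) for one XML document.

    words     ~ //node[@postag and not(@postag="LET()")]
    sentences ~ //sentence
    """
    root = ET.fromstring(xml_text)
    # A word is any <node> that carries a postag other than the
    # punctuation tag "LET()".
    words = sum(
        1
        for node in root.iter("node")
        if node.get("postag") is not None and node.get("postag") != "LET()"
    )
    sentences = sum(1 for _ in root.iter("sentence"))
    return words, sentences


if __name__ == "__main__":
    print(count_words_and_sentences(SAMPLE))
```

Summing the per-file results over all files of a component would give figures comparable to one row of the table, assuming each file follows the same layout as the sample.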