Dataset | OpenDataMonitor

CKAN

Odm ID	50b2e0a5-7413-4128-82b6-f429bb14e44d
Title	english-gigaword
Notes	A recipe for training word n-gram language models on the newswire text in the English Gigaword corpus (1200M words from NYT, APW, AFE, and XIE). It also prepares the dictionaries needed to use the language models with the HTK and Sphinx speech recognizers.
Author	Keith Vertanen
Author Email
Catalogue Url	http://datahub.io/
Dataset Url	http://thedatahub.org/dataset/english-gigaword-language-model-training-recipe
Metadata Updated	2015-09-14 22:59:26
Tags
Date Released
Date Updated
Update Frequency
Organisation	Global
Country
State
Platform	ckan
Language	en
Version	(not set)