english-gigaword
Dataset Profile
Odm ID | 50b2e0a5-7413-4128-82b6-f429bb14e44d |
---|---|
Title | english-gigaword |
Notes | This is a recipe to train word n-gram language models using the newswire text provided in the English Gigaword corpus (1200M words of NYT, APW, AFE, XIE). It also prepares dictionaries needed to use the LMs with the HTK and Sphinx speech recognizers. |
Author | Keith Vertanen |
Author Email | |
Catalogue Url | |
Dataset Url | |
Metadata Updated | 2015-09-14 22:59:26 |
Tags | |
Date Released | |
Date Updated | |
Update Frequency | |
Organisation | Global |
Country | |
State | |
Platform | ckan |
Language | en |
Version | (not set) |
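
The Notes field describes training word n-gram language models, but the recipe itself is not part of this record. As a rough illustration of the general idea only, the sketch below counts word n-grams and computes unsmoothed maximum-likelihood probabilities. The function names `train_ngram_model` and `probability`, the `<s>` and `</s>` boundary markers, and the two toy sentences are assumptions made for this example; a recipe for a corpus of this size would presumably rely on a dedicated toolkit with smoothing, which is not shown here.

```python
from collections import Counter


def train_ngram_model(sentences, n=2):
    """Count n-grams and their (n-1)-word contexts (hypothetical helper)."""
    ngrams = Counter()
    contexts = Counter()
    for sentence in sentences:
        # Pad with assumed boundary markers so the first word has a history.
        tokens = ["<s>"] * (n - 1) + sentence.lower().split() + ["</s>"]
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            ngrams[gram] += 1
            contexts[gram[:-1]] += 1
    return ngrams, contexts


def probability(word, history, ngrams, contexts):
    """Unsmoothed maximum-likelihood estimate of P(word | history)."""
    history = tuple(history)
    if contexts[history] == 0:
        return 0.0  # unseen history; a real model would back off or smooth
    return ngrams[history + (word,)] / contexts[history]


# Toy data standing in for newswire text (not taken from the corpus).
toy_sentences = ["the cat sat", "the dog sat"]
bigrams, bigram_contexts = train_ngram_model(toy_sentences, n=2)
p_cat_given_the = probability("cat", ["the"], bigrams, bigram_contexts)
```

Keeping the context counts next to the n-gram counts makes each probability a single division. Because the estimate is unsmoothed, any n-gram absent from the training text receives probability zero.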