

About Linguistic Data Consortium (LDC)

The University of Alberta Library subscribes to the Linguistic Data Consortium (LDC), a key source of textual and speech data. The LDC is an open consortium of universities, companies and government research laboratories, hosted by the University of Pennsylvania. It creates, collects and distributes speech and text databases, lexicons, and other resources for research and development purposes.
The LDC Catalog includes:
  • Hundreds of corpora of language data
  • Indexed collections of Arabic, Chinese and English newswire text
  • Millions of words of English telephone speech from the Switchboard and Fisher collections
The University of Alberta is an institutional member of the Linguistic Data Consortium (LDC). All language corpora listed in the LDC Catalog are available on request.

To access corpora in the LDC:

  1. Look up the individual language corpora by typing ‘LDC’ and your language of interest in the Search the Library box on the UofA Libraries homepage.  
  2. When you find the result you want, click on the Request Form and fill it out with as many details as possible.
  3. Once your request has been processed, you will be provided with a link to download the data if it is available online. If it is available only in a physical format (CD, hard drive, etc.), you will be notified when it is ready for pickup.
  4. If you cannot find what you are looking for in the Library Catalogue, the LDC Catalog is still available and provides advanced search options.
  5. The LDC Request Form is monitored Monday through Friday during regular hours.
  6. If you require additional assistance, please email: