What is a corpus?
A language corpus is an electronic collection of authentic texts, written or spoken, in which language phenomena can easily be searched for and displayed in their natural context.
The CNC corpora include contemporary written (printed) Czech (more than 5 billion tokens), internet Czech (more than 6 billion tokens), spontaneous spoken Czech, and historical Czech, as well as the InterCorp parallel corpus, which contains translations from and into more than 60 languages.
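
To illustrate what searching for a phenomenon and displaying it in its natural context can look like, here is a minimal keyword-in-context (KWIC) sketch. It is not the CNC's search interface: the function name `kwic`, the `width` parameter, and the English sample text are all hypothetical, and tokenization is reduced to a simple regular expression.

```python
import re

def kwic(text, keyword, width=3):
    """Return one line per hit: up to `width` tokens on each side of the keyword."""
    # Naive tokenization: runs of word characters (an assumption for this sketch).
    tokens = re.findall(r"\w+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Mark the hit with brackets so it stands out from its context.
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Hypothetical two-sentence "corpus" used only for demonstration.
sample = "The cat sat on the mat. A dog chased the cat across the yard."
for line in kwic(sample, "cat"):
    print(line)
```

Real corpus tools work on pre-tokenized, annotated text and support far richer queries; the sketch only shows the basic idea of listing each hit together with its neighbouring words.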

