Hermetic Word Frequency Counter
Input File Size & Output to a File

In theory there is no limit on the size of an input file or on the number of words it contains. The program has been tested successfully with a 2 MB text file containing nearly 100,000 different words. Processing such a file may take considerable time, so a progress bar is provided.

Very large files (e.g., larger than 10 MB) containing very many words (e.g., more than 200,000) may take an inconveniently long time to process, so processing time sets a practical limit on file size and number of words. This program definitely cannot be used to count words in a 60 MB file. In practice the limit is somewhere between 10 MB and 20 MB.
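To see why processing time grows with file size and word count, here is a minimal Python sketch of a word count that reads a file line by line and reports progress. It is an illustration only, not the program's own code: the function name count_words, the word pattern, and the report callback are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters and apostrophes.
WORD_RE = re.compile(r"[A-Za-z']+")

def count_words(stream, total_bytes, report=None, every=1000):
    """Count words one line at a time so the whole file is never held in memory.

    stream      -- any iterable of text lines (e.g. an open file)
    total_bytes -- size of the input, used only to compute progress
    report      -- optional callback receiving a fraction between 0.0 and 1.0
    every       -- call report after this many lines
    """
    counts = Counter()
    bytes_done = 0
    for line_no, line in enumerate(stream, 1):
        bytes_done += len(line.encode("utf-8"))
        # Case is folded so that "The" and "the" are counted together.
        counts.update(word.lower() for word in WORD_RE.findall(line))
        if report and line_no % every == 0:
            report(min(bytes_done / max(total_bytes, 1), 1.0))
    if report:
        report(1.0)  # always finish at 100%
    return counts
```

A caller could pass an open file together with its size from os.path.getsize, and use the callback to update a progress bar. Every word in the file is visited once, so the running time rises with the size of the file.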

There is a limit on the amount of text that the output textbox can hold, whether the text is pasted from the clipboard or produced by listing the words found. This does not prevent Hermetic Word Frequency Counter from handling large files. For example, there may be a file on your PC named Win32api.txt. This is about 652 KB in size and has over 80,000 instances of about 11,000 different words. When the program is run on this file with the "Don't display words as found" option unchecked, words are displayed as the program goes through the file until 2000 words have been displayed; after that no further words are displayed, so as to avoid a buffer overflow. After the entire file has been processed, the words found are listed until the capacity of the output textbox buffer is reached. If the words are listed in alphabetical order then (in the case of Win32api.txt) only words beginning with a, b, c or d are listed.
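The effect of a limited textbox on an alphabetical listing can be sketched as follows. This is a hypothetical Python illustration, not the program's implementation: the function name listing_for_display, the tab-separated line format, and the capacity measured in characters are assumptions.

```python
def listing_for_display(counts, capacity_chars):
    """Return (text, shown): the alphabetical listing cut off at capacity_chars.

    counts         -- mapping of word -> number of occurrences
    capacity_chars -- assumed capacity of the display, in characters
    Only whole lines are kept, so the listing stops at the last word that fits.
    """
    lines = []
    used = 0
    for word in sorted(counts):
        line = f"{word}\t{counts[word]}\n"
        if used + len(line) > capacity_chars:
            break  # the display is full; later words are not shown
        lines.append(line)
        used += len(line)
    return "".join(lines), len(lines)
```

Because the words are sorted before the cut is made, the words that go missing are always those at the end of the alphabet, which matches the behaviour described for Win32api.txt.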

To obtain a complete listing of the words in this file, you have to specify an output file before starting the word count. The complete listing is then written to the output file before the listing is given in the output textbox. The displayed listing will still stop at words beginning with d, but the entire listing can be viewed by opening the output file in a text editor such as WordPad.
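The order of operations described here (complete listing to the file first, limited listing to the display second) might be sketched in Python as below. The function name write_then_display, the line format, and the character-based capacity are assumptions for illustration, not details of the program itself.

```python
def write_then_display(counts, output_path, capacity_chars):
    """Write the complete listing to output_path, then return the display text.

    counts         -- mapping of word -> number of occurrences
    output_path    -- file that receives the complete alphabetical listing
    capacity_chars -- assumed capacity of the display, in characters
    """
    full = "".join(f"{word}\t{counts[word]}\n" for word in sorted(counts))

    # Step 1: the file gets everything, with no size limit applied.
    with open(output_path, "w", encoding="utf-8") as out:
        out.write(full)

    # Step 2: the display gets only what fits, cut back to the last whole line.
    shown = full[:capacity_chars]
    return shown[: shown.rfind("\n") + 1]
```

Writing the file before filling the display means the complete listing is saved regardless of how much of it the display can hold.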

Hermetic Word Frequency Counter has been used successfully with large files containing many different words: in one case a 4.12 MB file with 46,398 different words, and in another a 12.1 MB file with 61,979 different words (and a total of 1,847,893 instances of these words). The program has also been applied successfully to a file containing nearly 100,000 words.
