Phrase Frequency Counter
Overview
- Takes the output of the autocomplete tools as input
- Produces a list of phrases sorted by frequency, with the number of times each phrase occurs in the input file
- Also produces a list of the top N=20 phrases
- Both written in Python
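
A minimal sketch of the counting step described above, not the actual script. It assumes one phrase per input line and that phrases are compared case-insensitively; both are assumptions, and the names `count_phrases` and `top_phrases` are hypothetical.

```python
from collections import Counter


def count_phrases(lines):
    """Count how often each non-empty phrase occurs.

    Assumes one phrase per line and case-insensitive matching (both assumptions).
    """
    counts = Counter()
    for line in lines:
        phrase = line.strip().lower()
        if phrase:
            counts[phrase] += 1
    return counts


def top_phrases(counts, n=20):
    """Return the n most frequent (phrase, count) pairs, ties broken alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


# Made-up sample data standing in for an input file.
sample = [
    "how to cook rice",
    "how to tie a tie",
    "How to cook rice",
    "",
    "how to cook rice",
]
counts = count_phrases(sample)
for phrase, count in top_phrases(counts, n=2):
    print(f"{phrase}\t{count}")
```

Sorting on `(-count, phrase)` keeps the order stable when several phrases share a count, which makes the top N list reproducible between runs.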
PHP & Python
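- Two versions of the counter: one for the output of the PHP autocomplete tool, one for the output of the Python autocomplete tool
- They differ because the two tools delimit their data differently
- The versions are separate and can be used independently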
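
Since the two versions differ only in how their input is delimited, the splitting step can be sketched as a single function that takes the delimiter as a parameter. The delimiters below (comma and newline) are hypothetical stand-ins; the real formats of the two tools are not shown in these slides.

```python
def split_phrases(text, delimiter):
    """Split raw tool output into trimmed, non-empty phrases on the given delimiter."""
    return [part.strip() for part in text.split(delimiter) if part.strip()]


# Hypothetical delimiters and made-up data, for illustration only.
comma_output = "new york weather, new york times, new york weather"
line_output = "new york weather\nnew york times\nnew york weather\n"

print(split_phrases(comma_output, ","))
print(split_phrases(line_output, "\n"))
```

Once the phrases are split, the counting logic can be identical for both versions.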
PHP
frequencycount.py
*toptermfinder.py
- Takes the resulting top N terms from all input files and consolidates them into one file, finding the phrases with the most occurrences overall
Python
frequencycount_forpython.py
PHP ver.
Input:
Output:
$ python frequencycounter.py <input> <output>


PHP cont.
Another test file (in the same directory):
Output:


PHP
Convert output (N = 20)

toptermfinder.py takes all of these lists in one directory and finds the most common phrases
directory: USArest
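
The consolidation step might look roughly like the sketch below. It is not the code of toptermfinder.py: it assumes each top N list stores one `phrase<TAB>count` pair per line in a `.txt` file and that "most common overall" means the highest summed count. The file format, the `.txt` extension, and the function names are all assumptions.

```python
from collections import Counter
from pathlib import Path


def parse_top_list(text):
    """Parse 'phrase<TAB>count' lines (a hypothetical format) into a Counter."""
    counts = Counter()
    for line in text.splitlines():
        phrase, sep, count = line.rpartition("\t")
        # Skip lines that have no tab or no numeric count.
        if sep and count.strip().isdigit():
            counts[phrase.strip()] += int(count)
    return counts


def consolidate(texts, n=20):
    """Sum the counts from several top N lists and return the overall top n."""
    total = Counter()
    for text in texts:
        total.update(parse_top_list(text))
    return sorted(total.items(), key=lambda item: (-item[1], item[0]))[:n]


def consolidate_directory(directory, n=20):
    """Read every .txt file in a directory (extension is an assumption) and consolidate."""
    paths = sorted(Path(directory).glob("*.txt"))
    return consolidate([p.read_text(encoding="utf-8") for p in paths], n)


# Made-up lists standing in for two files in the directory.
list_a = "pizza near me\t12\nweather today\t7\n"
list_b = "weather today\t9\nnews\t3\n"
print(consolidate([list_a, list_b], n=2))
```

Under these assumptions, a call such as `consolidate_directory("USArest")` would merge every list in that directory.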

Python
$ python frequencycounter_forpython.py <input1> <input2> <output>

input1
input2

Python cont.
Output is similar to that of the PHP version.


6.23
By katiec089