Project: Word Counter


Note: This first project must be done individually. Starting with the next project, you will work on all remaining projects with a teammate.

Objectives

  1. Familiarity with designing and coding a realistic component-based application program without being provided a skeleton solution.

Note that your solution may use only components from the components package and components from the standard Java libraries that have been used in CSE 2221 lectures, labs, or projects (e.g., String). Do not use components from any other library.

The Problem

Write a Java program that counts word occurrences in a given input file and outputs an HTML document with a table of the words and counts listed in alphabetical order. Here are some initial requirements:

These are the stated requirements for your program. If you have questions of clarification or need additional details, ask in class.
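To make the counting step in the problem statement concrete, here is a minimal sketch of one way to tally words in a piece of text. Everything in it is an assumption made for illustration: the class name `WordCountSketch`, the method `countWords`, and the particular separator characters are hypothetical and not part of the stated requirements. It also uses `java.util` classes only to stay self-contained; check the restriction on permitted components above before using anything like it in a submission.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: names and separator set are assumptions. */
final class WordCountSketch {

    /** Characters treated as word separators (an illustrative choice). */
    static final String SEPARATORS = " \t\n\r,.;:!?-()[]\"'";

    /**
     * Counts words in text, where a word is a maximal run of characters
     * that are not in SEPARATORS. The TreeMap keeps keys in String's
     * natural order, which is case-sensitive (uppercase sorts first).
     */
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        int start = 0;
        while (start < text.length()) {
            // Skip any run of separator characters.
            while (start < text.length()
                    && SEPARATORS.indexOf(text.charAt(start)) >= 0) {
                start++;
            }
            // Scan to the end of the current word.
            int end = start;
            while (end < text.length()
                    && SEPARATORS.indexOf(text.charAt(end)) < 0) {
                end++;
            }
            if (end > start) {
                // Add 1 to the existing count, or start the count at 1.
                counts.merge(text.substring(start, end), 1, Integer::sum);
            }
            start = end;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {cat=1, hat=1, the=2}
        System.out.println(countWords("the cat, the hat"));
    }
}
```

Which characters count as separators is a design decision the sketch makes arbitrarily; a different choice changes what counts as a word.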

Setup

You're on your own! There are no other setup instructions. As a reminder, instructions on how to set up Eclipse and to create a new project from the ProjectTemplate are available here, and instructions on how to submit the project are available here.

Method

When you are satisfied that your program works, select your Eclipse project (not just some of the files, but the whole project), create a zip archive of it, and submit the zip archive to the Carmen dropbox for this project, as described in Submitting a Project.

Your grade will depend not only on whether the final program meets the initial requirements, but also on the general software quality factors you've learned in CSE 2221: understandability, precision, appropriate use of existing software components, maintainability, adherence to coding standards, and so forth.

A sample input file is available at:

     gettysburg.txt

A sample of the corresponding program output is available at:

     gettysburg.html

Your output does not have to look exactly like this sample, which merely illustrates what the output might be. Feel free to improve on it.
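As a rough illustration of producing an HTML table from word counts, the sketch below builds the document as a string. It does not reproduce the layout of the sample output: the class `HtmlTableSketch`, the methods `toHtml` and `escape`, the column headings, and the use of `StringBuilder` and `java.util.Map` are all assumptions made to keep the example self-contained.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: layout and names are assumptions. */
final class HtmlTableSketch {

    /** Replaces characters that have special meaning in HTML text. */
    static String escape(String s) {
        // '&' must be replaced first so later entities are not re-escaped.
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds an HTML page with one table row per map entry, in map order. */
    static String toHtml(String title, Map<String, Integer> counts) {
        StringBuilder html = new StringBuilder();
        html.append("<html>\n<head><title>").append(escape(title))
                .append("</title></head>\n<body>\n");
        html.append("<h2>").append(escape(title)).append("</h2>\n");
        html.append("<table border=\"1\">\n");
        html.append("<tr><th>Word</th><th>Count</th></tr>\n");
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            html.append("<tr><td>").append(escape(e.getKey()))
                    .append("</td><td>").append(e.getValue())
                    .append("</td></tr>\n");
        }
        html.append("</table>\n</body>\n</html>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("the", 2);
        counts.put("cat", 1);
        System.out.print(toHtml("Words Counted in example.txt", counts));
    }
}
```

Because the rows follow the iteration order of the map passed in, the caller decides the ordering; a sorted map gives sorted rows.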

Helpful References

Here are links to some CSE 2221 assignments where you practiced skills directly related to this project. You may find it useful to review the work you did on these assignments in CSE 2221.

Additional Activities

Here are some possible additional activities related to this project. Any extra work is strictly optional, for your own benefit, and will not directly affect your grade.

  1. Modify the program so that it outputs the words in decreasing order of counts and in alphabetical order among words with the same count.
  2. Modify the program so that it is case-insensitive, i.e., the words "hello" and "HeLLo" would be counted as the same word.
  3. Modify the case-insensitive program so that the capitalization displayed in the output is the one that occurs most often among the different capitalizations of the same word.
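
The first two activities above can be sketched as follows, under stated assumptions. The class `CountOrderSketch`, the comparator `CountThenWord`, and the methods `foldCase` and `sortedByCount` are hypothetical names introduced for illustration, and the `java.util` classes are used only to keep the example self-contained; they may not be permitted in a submission. The third activity (choosing the most frequent capitalization) is not covered here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: names are assumptions, not part of the assignment. */
final class CountOrderSketch {

    /** Orders entries by decreasing count, then alphabetically by word. */
    static final class CountThenWord
            implements Comparator<Map.Entry<String, Integer>> {
        @Override
        public int compare(Map.Entry<String, Integer> a,
                Map.Entry<String, Integer> b) {
            // Arguments are swapped so that larger counts come first.
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        }
    }

    /** Merges counts of words that differ only in capitalization. */
    static Map<String, Integer> foldCase(Map<String, Integer> counts) {
        Map<String, Integer> folded = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            String key = e.getKey().toLowerCase(Locale.ROOT);
            folded.merge(key, e.getValue(), Integer::sum);
        }
        return folded;
    }

    /** Returns the entries of counts as a list sorted with CountThenWord. */
    static List<Map.Entry<String, Integer>> sortedByCount(
            Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(counts.entrySet());
        entries.sort(new CountThenWord());
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> raw = new HashMap<>();
        raw.put("hello", 1);
        raw.put("HeLLo", 1);
        raw.put("world", 1);
        for (Map.Entry<String, Integer> e : sortedByCount(foldCase(raw))) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

In this sketch the folded key is always the lowercase form, so the displayed capitalization is lowercase regardless of how the word appeared in the input.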