
Part II: Word cloud (30 pts) I want to write a poem. One that I know I'd like,


Question

Part II: Word cloud (30 pts)

I want to write a poem. One that I know I'd like, so I've decided that the poem should contain words from other poems that I like. I've collected a sample of poems and put them into the file /home/linux/ieng6/cs11wb/public/HW6/poems.txt

I now want you to create a program, WordCloud.java, that can read in a text file and print out the most commonly used words in the file. Once done, my job of writing a new poem will be easier because I'll know which words I should definitely use. (Based on Rick Ord's problem set 2.)

Implementation details

- Word occurrence counts are case insensitive (River, river, rivER would all count as the same word).
- I'm not interested in how many times common words like "the, a, an, ..." (also case insensitive) show up, so your program will also need to read in the text file /home/linux/ieng6/cs11wb/public/HW6/common.txt and make sure these common words are not counted and don't appear in the resulting output.
- The program will take two arguments: the text file to read in and the number of words (N) to print out.
- The output of the program will be the top N words along with the number of times each word occurred in the text.
- If the user requests the top 100 words, but there are only 50 unique words in the input file, then the top 50 words should be printed.
- You'll need to remove any ", .!?x" from the end of strings that you read, otherwise "time" and "time." will be counted separately.
- If multiple words have the same frequency, you can print them in any order.
- If the user does not enter the file name to read, or the number of words to report, the program should state how to use the program (example below).

Explanation / Answer


import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.StringTokenizer;
import java.util.Map.Entry;

public class MaxDuplicateWordCount {
    
    // Reads the file and returns a map from each lower-cased token to its count.
    public Map<String, Integer> getWordCount(String fileName){

        Map<String, Integer> wordMap = new HashMap<String, Integer>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName)))) {
            String line = null;
            while((line = br.readLine()) != null){
                // no delimiter argument: split on any whitespace, not only spaces
                StringTokenizer st = new StringTokenizer(line);
                while(st.hasMoreTokens()){
                    // lower-case so that River, river and rivER share one entry
                    String tmp = st.nextToken().toLowerCase();
                    if(wordMap.containsKey(tmp)){
                        wordMap.put(tmp, wordMap.get(tmp)+1);
                    } else {
                        wordMap.put(tmp, 1);
                    }
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return wordMap;
    }
    
    // Returns the map entries as a list sorted by count, highest first.
    public List<Entry<String, Integer>> sortByValue(Map<String, Integer> wordMap){
        
        Set<Entry<String, Integer>> set = wordMap.entrySet();
        List<Entry<String, Integer>> list = new ArrayList<Entry<String, Integer>>(set);
        Collections.sort( list, new Comparator<Map.Entry<String, Integer>>()
        {
            public int compare( Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2 )
            {
                return (o2.getValue()).compareTo( o1.getValue() );
            }
        } );
        return list;
    }
    
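    public static void main(String[] args){
        // Both arguments are required; this usage wording is a placeholder,
        // since the assignment's example text is not shown here.
        if(args.length < 2){
            System.out.println("Usage: java MaxDuplicateWordCount <file> <number of words>");
            return;
        }
        int n = Integer.parseInt(args[1]);
        MaxDuplicateWordCount mdc = new MaxDuplicateWordCount();
        Map<String, Integer> wordMap = mdc.getWordCount(args[0]);
        List<Entry<String, Integer>> list = mdc.sortByValue(wordMap);
        // Print at most n entries; fewer if the file has fewer unique words.
        int limit = Math.max(0, Math.min(n, list.size()));
        for(Map.Entry<String, Integer> entry:list.subList(0, limit)){
            System.out.println(entry.getKey()+" ==== "+entry.getValue());
        }
    }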
}
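
The listing above counts raw tokens, so common words are still counted and "time" and "time." end up as separate entries. The following is a minimal sketch of how those two requirements might be handled, together with a capped top-N selection. It is not the assignment's reference solution. The class name WordCloudSketch and its method names are hypothetical. It assumes that only the characters , . ! ? are stripped from the end of a token, and that the common-words file holds whitespace-separated words. For illustration the common-words file is passed as an extra command-line argument, whereas the assignment reads it from a fixed path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; all names here are illustrative only.
final class WordCloudSketch {

    // Lower-cases a token and drops any run of , . ! ? at its end.
    // Assumption: only these four characters are stripped.
    static String normalize(String token) {
        return token.toLowerCase().replaceAll("[,.!?]+$", "");
    }

    // Builds the set of common words from the lines of the common-words file.
    // Assumption: words in that file are separated by whitespace.
    static Set<String> loadCommonWords(List<String> lines) {
        Set<String> common = new HashSet<>();
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                String word = normalize(token);
                if (!word.isEmpty()) {
                    common.add(word);
                }
            }
        }
        return common;
    }

    // Counts normalized words across all lines, skipping common words.
    static Map<String, Integer> countWords(List<String> lines, Set<String> commonWords) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                String word = normalize(token);
                if (word.isEmpty() || commonWords.contains(word)) {
                    continue;
                }
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns at most n entries, highest count first; ties come out in any order.
    static List<Map.Entry<String, Integer>> topN(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> b.getValue().compareTo(a.getValue()));
        return entries.subList(0, Math.max(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: three arguments, with the common-words file given explicitly.
        if (args.length < 3) {
            System.out.println("Usage: java WordCloudSketch <text file> <common words file> <N>");
            return;
        }
        Set<String> common = loadCommonWords(Files.readAllLines(Paths.get(args[1])));
        Map<String, Integer> counts = countWords(Files.readAllLines(Paths.get(args[0])), common);
        for (Map.Entry<String, Integer> e : topN(counts, Integer.parseInt(args[2]))) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Run as, for example, java WordCloudSketch poems.txt common.txt 10, this would print up to ten words with their counts. Keeping the counting and selection steps in methods that take in-memory lines makes them easy to check without touching the file system.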