I need to write a Java program that does the following: When you are given a tex
Question
I need to write a Java program that does the following:
Given a text file, parse it, tokenize it, and then split the tokens into letter groups of a specified size.
Example:
Input: “Second Programming Assignment”
Tokens: “second” “programming” “assignment”
Letter Groups (2) : "se" "ec" "co" "on" "nd" "pr" "ro" "og" "gr" "ra" "am" "mm" "mi" "in" "ng" "as" "ss" "si" "ig" "gn" "nm" "me" "en" "nt"
Letter Groups (3): "sec" "eco" "con" "ond" "pro" "rog" "ogr" "gra" "ram" "amm" "mmi" "min" "ing" "ass" "ssi" "sig" "ign" "gnm" "nme" "men" "ent"
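The example above amounts to sliding a window of the given size across each token, one character at a time. A minimal sketch of that step, assuming a hypothetical helper named letterGroups (not part of the assignment) and assuming that a token shorter than the group size simply yields no groups:

```java
import java.util.ArrayList;
import java.util.List;

public class LetterGroupSketch {
    // Hypothetical helper: returns every run of n consecutive characters
    // in token, advancing one character at a time. Assumes n >= 1.
    static List<String> letterGroups(String token, int n) {
        List<String> groups = new ArrayList<>();
        for (int i = 0; i + n <= token.length(); i++) {
            groups.add(token.substring(i, i + n));
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(letterGroups("second", 2)); // [se, ec, co, on, nd]
        System.out.println(letterGroups("second", 3)); // [sec, eco, con, ond]
    }
}
```

Because the window never crosses a token boundary, "nd" + "pr" does not produce a group such as "dp".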
After generating the letter groups, generate the histogram (frequency of occurrence) of the letter groups.
The name of the text file will be the first argument of your main function and letterGroupLen will be the second argument of your main function.
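One possible way to read the two arguments is sketched below. The class name ArgsSketch, the helper parseLetterGroupLen, and the usage message are illustrative assumptions; the assignment only fixes the order of the arguments.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class ArgsSketch {
    // Hypothetical helper: parses the second argument as a positive group length.
    static int parseLetterGroupLen(String arg) {
        int len = Integer.parseInt(arg); // throws NumberFormatException for non-numeric input
        if (len < 1) {
            throw new IllegalArgumentException("letterGroupLen must be at least 1");
        }
        return len;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("Usage: java ArgsSketch <textFile> <letterGroupLen>");
            return;
        }
        // args[0] is the text file name, args[1] is the letter group length.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        int letterGroupLen = parseLetterGroupLen(args[1]);
        System.out.println(lines.size() + " line(s), letterGroupLen = " + letterGroupLen);
    }
}
```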
Parse the input text file
You will have 2 classes in your design. SentenceUtils and Histogram.
SentenceUtils class will tokenize and partition the tokens into letter groups
The Histogram class will use a HashMap to count the number of occurrences of each letter group, and print the results when requested.
The solution is composed of 2 classes: a SentenceUtils class to convert a string into letter groups and a Histogram class to do the histogram processing. These 2 functionalities are independent of each other.
You need to implement the bodies of the following 4 functions and also the main function:
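The counting step can be sketched with HashMap.merge, which inserts 1 for a new key and adds 1 to an existing count. The class name HistogramSketch and the method count are illustrative assumptions, and the sample input is the size-2 letter groups of the word "banana":

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HistogramSketch {
    // Hypothetical helper: maps each letter group to the number of times it occurs.
    static Map<String, Integer> count(List<String> letterGroups) {
        Map<String, Integer> counts = new HashMap<>();
        for (String group : letterGroups) {
            counts.merge(group, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Size-2 letter groups of "banana".
        Map<String, Integer> counts = count(Arrays.asList("ba", "an", "na", "an", "na"));
        System.out.println(counts.get("ba")); // 1
        System.out.println(counts.get("an")); // 2
        System.out.println(counts.get("na")); // 2
    }
}
```

A HashMap does not keep its keys in any particular order, so printing the whole map may list the groups in a different order from the one in which they were added.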
private String[] getTokens(String line)
private void splitTokenstoLetterGroups(String[] tokens)
public void generateHistogram(ArrayList<String> letterGroups)
public void printHistogram()
Explanation / Answer
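// SentenceUtils.java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

public class SentenceUtils {
    private final int letterGroupLen;
    private final ArrayList<String> letterGroups = new ArrayList<>();

    public SentenceUtils(int letterGroupLen) {
        if (letterGroupLen < 1) {
            throw new IllegalArgumentException("letterGroupLen must be at least 1");
        }
        this.letterGroupLen = letterGroupLen;
    }

    public static void main(String[] args) throws FileNotFoundException {
        if (args.length < 2) {
            System.err.println("Usage: java SentenceUtils <textFile> <letterGroupLen>");
            return;
        }
        // args[0] is the text file name, args[1] is the letter group length.
        SentenceUtils utils = new SentenceUtils(Integer.parseInt(args[1]));
        try (Scanner scanInput = new Scanner(new File(args[0]))) {
            while (scanInput.hasNextLine()) {
                // Tokenize line by line so words on adjacent lines are never merged.
                utils.splitTokenstoLetterGroups(utils.getTokens(scanInput.nextLine()));
            }
        }
        Histogram histogram = new Histogram();
        histogram.generateHistogram(utils.letterGroups);
        histogram.printHistogram();
    }

    // Lowercases the line and splits it into words. Anything other than a-z is
    // treated as a separator (an assumption; the example only shows spaces).
    private String[] getTokens(String line) {
        String cleaned = line.toLowerCase().replaceAll("[^a-z]+", " ").trim();
        return cleaned.isEmpty() ? new String[0] : cleaned.split(" ");
    }

    // Slides a window of letterGroupLen characters over each token, one
    // character at a time. Tokens shorter than the window produce no groups.
    private void splitTokenstoLetterGroups(String[] tokens) {
        for (String token : tokens) {
            for (int i = 0; i + letterGroupLen <= token.length(); i++) {
                letterGroups.add(token.substring(i, i + letterGroupLen));
            }
        }
    }
}

// Histogram.java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.TreeMap;

public class Histogram {
    private final HashMap<String, Integer> counts = new HashMap<>();

    // Adds one to the count of every letter group in the list.
    public void generateHistogram(ArrayList<String> letterGroups) {
        for (String group : letterGroups) {
            counts.merge(group, 1, Integer::sum);
        }
    }

    public void printHistogram() {
        System.out.println("Histogram:");
        // HashMap order is unspecified, so copy into a TreeMap to print alphabetically.
        System.out.println(new TreeMap<>(counts));
    }
}
text.txt contains
Second Programming Assignment
OUTPUT (java SentenceUtils text.txt 2):-
Histogram:
{am=1, as=1, co=1, ec=1, en=1, gn=1, gr=1, ig=1, in=1, me=1, mi=1, mm=1, nd=1, ng=1, nm=1, nt=1, og=1, on=1, pr=1, ra=1, ro=1, se=1, si=1, ss=1}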