Question
Regular Expressions and Grep (part I)
For much of this and the next two labs, you will use files that are already in your VM. However, for some parts, you will need the following files. Use the wget statements below to download the files. Put them into ~Student/FILES for convenience (cd to ~/FILES before doing the wget commands below).
• wget www.nku.edu/~foxr/equals.txt
• wget www.nku.edu/~foxr/names.txt
• wget www.nku.edu/~foxr/sentences.txt
1. You will start by experimenting with grep (egrep). [egrep means grep -E; you can write either grep -E or just egrep.] To learn more about grep, egrep and their options, run the man command. I encourage you to do so. Type cd /etc for the next set of commands. Enter each of the following commands. Look at the output and see if you can figure out what each regular expression represents. Your instructor may discuss these in your class. There are no questions for this step.
a. egrep '[0-9]+' *
b. egrep '[A-Z]+{12,}' *
c. egrep '[A-Z]+=[0-9]+' *
d. egrep '[A-Z]+=' bashrc
e. egrep 'if [' bashrc
f. egrep '() {' bashrc
g. egrep '[0-255].[0-255].[0-255].[0-255]' *
h. In the previous example, we wanted to list any lines from the files that contain IP addresses. You might notice that much of the output consists of non-IP addresses. Can you figure out what is wrong with the above regex? Use this regex instead:
[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}
Can you figure out why this regex is more appropriate?
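As a rough illustration of the two IP patterns in step 1, the sketch below runs them against a throwaway file instead of /etc. The three sample lines are made up for this example, and the last command escapes the dots, which the lab's regex does not do.

```shell
# Hypothetical sample data, not taken from the lab files.
tmp=$(mktemp)
printf '%s\n' 'server 192.168.10.25' 'abc2x1y5z0' 'version 12' > "$tmp"

# [0-255] is a bracket expression for ONE character: 0, 1, 2, or 5.
# The unescaped . matches any character, so non-addresses can match.
loose=$(grep -E '[0-255].[0-255].[0-255].[0-255]' "$tmp")

# One to three digits per field; the dots still match any character.
better=$(grep -E '[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}' "$tmp")

# Escaped dots require literal periods between the fields.
strict=$(grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' "$tmp")

printf 'loose:\n%s\nbetter:\n%s\nstrict:\n%s\n' "$loose" "$better" "$strict"
rm -f "$tmp"
```

On this sample, the first pattern matches only the junk line and misses the real address, the second matches both, and the escaped version matches only the address line.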
2. You are responsible for answering the questions from this point of the lab forward. Change directory to ~Student/FILES. Look at the content of the three files sales.txt, computers.txt and addresses.txt to familiarize yourself with them. We will use egrep to search for patterns in the sales.txt file.
a. In sales.txt, we will search for all lines that contain the month Feb. The command is
egrep 'Feb' sales.txt
What was the response? Now how would you search for all lines that contain an entry for Smith?
b. Remember that the -c option for egrep counts the number of matching lines. Write an egrep command to count the number of entries in the file that contain Cameron. What command did you enter? How many were found?
c. Repeat b for entries that include KY.
d. Let's find all lines that contain a commission rate of .15. Enter the command
egrep '.15' sales.txt
Look at the response. What entries appeared that shouldn't? Why did they appear? How will you fix this regex? Do so and try again to make sure you have the correct answer. [Hint: look into the man page for egrep to find exact match.]
e. Let's assume we want to find all records whose Sales value is over 9999. Type the command
egrep '[1-9][0-9]{4}' sales.txt
Why did we use [1-9] instead of just having [0-9]{5}?
f. Following up on part e, assume we instead want to find everyone who had less than 10000 for Sales. The command you might think of would be
egrep '[1-9][0-9]{3}' sales.txt
Enter this command. You will see that it responded with all entries. Why didn't it work? See if you can figure it out. One way to solve this problem is to repeat the command from part e but add the -v option to your egrep command. Try it out to see if it worked.
g. To find all entries of sales in either OH or PA, you can use egrep 'OH|PA' sales.txt. Try it. Write an egrep command to find all entries that contain either Barber or Cameron. What command did you enter?
h. Following up on part g, how could you search for entries that contain both OH and PA? Hint: you cannot do this with a single egrep statement; instead use a pipe between two egrep statements. Include your command in your answers.
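The options and metacharacters used in step 2 can be tried out on a small made-up file first. This is only a sketch: the four records and their column layout (month, name, state, sales, commission) are assumptions, and the real sales.txt may look different.

```shell
# Hypothetical records; the real sales.txt may be laid out differently.
tmp=$(mktemp)
printf '%s\n' \
  'Feb Smith OH 12500 .15' \
  'Mar Cameron KY 8150 .10' \
  'Apr Barber PA 9999 .15' \
  'May Cameron OH 31500 .20' > "$tmp"

# Unescaped dot: "any character, then 15", so 8150 and 31500 match too.
any_dot=$(grep -E -c '.15' "$tmp")

# Escaped dot: only a literal ".15" matches.
literal_dot=$(grep -E -c '\.15' "$tmp")

# -c counts matching lines; | inside the regex means "either".
cameron=$(grep -E -c 'Cameron' "$tmp")
oh_or_pa=$(grep -E -c 'OH|PA' "$tmp")

# -v inverts the match: keep lines WITHOUT a five-digit number.
under_10000=$(grep -E -v '[1-9][0-9]{4}' "$tmp")

echo "any_dot=$any_dot literal_dot=$literal_dot cameron=$cameron oh_or_pa=$oh_or_pa"
echo "$under_10000"
rm -f "$tmp"
```

With this sample data the unescaped pattern matches all four lines while the escaped one matches two, which is the same kind of over-matching that part d asks about.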
3. For this step, write egrep commands using the computers.txt file. Each step requires that you come up with the egrep command. The answer to place in your answer file is the egrep command you come up with which successfully accomplishes the step.
a. Find all entries of faculty whose name starts with the letter F, G, H, I, J or K.
b. Find all entries of faculty whose name ends with an n.
c. Find all entries that are on the 4th floor of their building. This means that the room number is 4xx (so for instance, it should not include 314 just because there is a 4).
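The building blocks that step 3 calls for (anchors, letter ranges, and whole-word matching) might look like the sketch below. The layout of computers.txt is not shown in the lab, so the sample file here assumes a hypothetical "name building room" format with the name first on each line; the patterns would need adjusting for the real file.

```shell
# Hypothetical layout: name, building, room. The real computers.txt may differ.
tmp=$(mktemp)
printf '%s\n' 'Fox GH 405' 'Newman ST 314' 'Kirby GH 240' 'Allen ST 414' > "$tmp"

# ^ anchors at the start of the line; [F-K] is one letter in that range.
starts_f_to_k=$(grep -E '^[F-K]' "$tmp")

# First field (no spaces) ending in n, followed by the field separator.
ends_in_n=$(grep -E '^[^ ]*n ' "$tmp")

# -w requires the match to be a whole word, so 314 and 240 are rejected.
fourth_floor=$(grep -E -w '4[0-9]{2}' "$tmp")

printf '%s\n---\n%s\n---\n%s\n' "$starts_f_to_k" "$ends_in_n" "$fourth_floor"
rm -f "$tmp"
```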
Explanation / Answer
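a. egrep '[0-9]+' * prints every line, from any file in the current directory, that contains at least one digit (0 to 9).
b. egrep '[A-Z]+{12,}' * prints lines that contain a run of at least twelve consecutive uppercase letters (A to Z). Stacking {12,} on top of + is unusual; GNU grep accepts it, but [A-Z]{12,} says the same thing more clearly.
c. egrep '[A-Z]+=[0-9]+' * prints lines that contain one or more uppercase letters followed immediately by = and then one or more digits, such as an assignment like MAX=100.
d. egrep '[A-Z]+=' bashrc prints the lines of bashrc in which one or more uppercase letters are followed immediately by =, which are mostly assignments to variables with uppercase names.
e. egrep 'if [' bashrc is meant to find the lines that contain "if [", but an unescaped [ opens a bracket expression that is never closed, so grep reports an error instead of matching. Escaping the bracket, as in 'if \[', finds the if tests.
f. egrep '() {' bashrc is meant to find function definitions. Parentheses are grouping operators in an extended regex, so () is an empty group rather than literal text; with GNU grep the pattern ends up matching any line that contains a space followed by {. Escaping the characters, as in '\(\) \{', matches the literal text.
g. egrep '[0-255].[0-255].[0-255].[0-255]' * is meant to display IP addresses, but [0-255] matches a single character (0, 1, 2 or 5) and the unescaped . matches any character, so it matches many lines that contain no IP address and can miss real ones.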
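The commands that search bashrc for "if [" and "() {" hinge on how grep treats [, ( and { as metacharacters. The sketch below shows one way to match them literally; the four sample lines are a made-up stand-in for a bashrc file, and the exit status noted in the comment is what GNU grep reports for a malformed pattern.

```shell
# Hypothetical stand-in for a few lines of a bashrc file.
tmp=$(mktemp)
printf '%s\n' 'if [ -f /etc/profile ]; then' 'pathmunge () {' 'for i in 1 2; do' '}' > "$tmp"

# Unescaped [ starts a bracket expression that never closes, so grep
# fails with an error status (2 with GNU grep) instead of searching.
status=0
grep -E 'if [' "$tmp" >/dev/null 2>&1 || status=$?

# Escaped metacharacters are matched literally.
if_lines=$(grep -E 'if \[' "$tmp")
func_lines=$(grep -E '\(\) \{' "$tmp")

# -F treats the whole pattern as a fixed string, so nothing needs escaping.
fixed=$(grep -F '() {' "$tmp")

printf 'status=%s\n%s\n%s\n%s\n' "$status" "$if_lines" "$func_lines" "$fixed"
rm -f "$tmp"
```

Using grep -F is often the simpler choice when the search text contains no pattern at all, since it avoids escaping entirely.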