Tuesday, October 18, 2011

UNIX "grep" command Examples

What is the grep command used for?

Ans: grep is the "search" command of UNIX/Linux systems. It prints every line of its input that matches a given text string or regular expression.

Examples:

1) Search for a text string in a file

grep 'shiyas' /etc/passwd

The above command searches for every occurrence of the text 'shiyas' in the file /etc/passwd and prints the matching lines on the screen.

2) Search for the string in multiple files

grep 'shiyas' *

The above command searches for every occurrence of the text 'shiyas' in all the files in the current directory and prints the matching lines on the screen along with the filenames. Here the shell expands the wildcard character * to the names of all (non-hidden) files in the current directory before grep runs.

grep 'shiyas' *.txt
This command searches for the pattern in all .txt files in the current directory.

3) Search without regard to case

grep 'SHIYAS' /etc/passwd
This won't display anything, because the file doesn't contain the string SHIYAS in upper case and grep is case sensitive by default.

To avoid this case-sensitivity issue, use the command below, which matches the string irrespective of case.

grep -i 'SHIYAS' /etc/passwd
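
Here is a small sketch of the difference. It uses a throwaway sample file instead of the real /etc/passwd (the file content and the temporary directory are my assumptions, made up for illustration), and -c to count matching lines instead of printing them.

```shell
# Hypothetical sample data standing in for /etc/passwd.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'root:x:0:0:Super-User:/:/sbin/sh' \
  'shiyas:x:1001:100:Shiyas:/home/shiyas:/bin/bash' > "$tmpdir/passwd"

# Case-sensitive search: the upper-case pattern matches no line.
# (grep exits with status 1 when nothing matches, hence "|| true".)
sensitive_count=$(grep -c 'SHIYAS' "$tmpdir/passwd" || true)

# -i ignores case, so the same pattern now matches the shiyas line.
insensitive_count=$(grep -ic 'SHIYAS' "$tmpdir/passwd" || true)

echo "case-sensitive matches:   $sensitive_count"
echo "case-insensitive matches: $insensitive_count"
rm -rf "$tmpdir"
```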

4) Display all the lines that don't contain the string 'shiyas'
grep -v 'shiyas' /etc/passwd

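Note: It is not necessary to place the string in single quotes, but it is mandatory when the pattern includes a space, and advisable whenever it contains characters the shell treats specially (such as *, |, $ or [).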
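
A quick sketch of both points, the inverted match and the quoting, on made-up sample lines (the file and its content are assumptions for illustration only):

```shell
# Hypothetical sample data.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'lp:x:71:8:Line Printer Admin:/usr/spool/lp:' \
  'adm:x:4:4:Admin:/var/adm:' \
  'shiyas:x:1001:100:Shiyas:/home/shiyas:/bin/bash' > "$tmpdir/passwd"

# -v inverts the match: print the lines that do NOT contain 'shiyas'.
without=$(grep -v 'shiyas' "$tmpdir/passwd")

# The quotes keep "Line Printer" together as one pattern. Unquoted, grep
# would search for "Line" in a file named "Printer".
with_space=$(grep 'Line Printer' "$tmpdir/passwd")

printf '%s\n' "$without"
printf '%s\n' "$with_space"
rm -rf "$tmpdir"
```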

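5) Find all the sub-directories in the current directory
ls -al | grep '^d'
In the long listing, directory entries start with the letter d. Because of -a, the output also includes the . and .. entries.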
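
As a sketch, the same idea can be tried in a scratch directory (the directory names below are made up). I use ls -lA here, which leaves out . and .., and show find as an alternative; note that -mindepth and -maxdepth are extensions found in GNU and BSD find, so treat that line as an assumption about your system.

```shell
# Hypothetical scratch directory with two sub-directories and one file.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/docs" "$tmpdir/src"
touch "$tmpdir/notes.txt"

# Directory entries start with "d" in the long listing; -A skips . and ..
subdir_count=$(ls -lA "$tmpdir" | grep -c '^d')

# find can list the sub-directories without parsing ls output.
subdirs=$(find "$tmpdir" -mindepth 1 -maxdepth 1 -type d | sort)

echo "number of sub-directories: $subdir_count"
printf '%s\n' "$subdirs"
rm -rf "$tmpdir"
```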

6) Search for multiple patterns at one time (egrep)
egrep 'and|loop|cursor' example.txt

Note: egrep stands for "extended grep"; grep -E is the equivalent, standard spelling.

7) Suppose you want to search for the strings "Foo" or "Goo" in all files in the current directory. That grep command would be:

grep '[FG]oo' *
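
A small sketch comparing the two styles on a made-up example.txt (the content is my assumption). I write grep -E instead of egrep because newer GNU grep versions print a warning for egrep.

```shell
# Hypothetical sample file.
tmpdir=$(mktemp -d)
printf '%s\n' 'Foo bar' 'Goo gone' 'Zoo trip' 'for loop' 'open cursor' \
  > "$tmpdir/example.txt"

# Alternation: a line matches if it contains any one of the alternatives.
alt=$(grep -E 'loop|cursor' "$tmpdir/example.txt")

# Bracket expression: exactly one character out of the set, then "oo".
bracket=$(grep '[FG]oo' "$tmpdir/example.txt")

printf '%s\n' "$alt"
printf '%s\n' "$bracket"
rm -rf "$tmpdir"
```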

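8) Search for the string 'bond' but not 'jamesbond'

grep '^bond' /etc/passwd

The ^ anchors the pattern to the beginning of the line, so this matches user names that start with 'bond' and skips 'jamesbond'. It also skips 'bond' anywhere else in a line; to match 'bond' as a whole word at any position, most grep implementations offer grep -w 'bond' /etc/passwd.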
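
To see how the anchored search differs from a whole-word search, here is a sketch on three invented passwd-style lines (all user names are hypothetical). -w is not in every old UNIX grep, so take it as an assumption that yours has it.

```shell
# Hypothetical sample data.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'bond:x:2001:100:Agent:/home/bond:/bin/sh' \
  'jamesbond:x:2002:100:Agent:/home/jamesbond:/bin/sh' \
  'guest:x:2003:100:Friend of bond:/home/guest:/bin/sh' > "$tmpdir/passwd"

plain=$(grep -c 'bond' "$tmpdir/passwd")          # every line contains "bond"
anchored=$(grep -c '^bond' "$tmpdir/passwd")      # only lines starting with it
whole_word=$(grep -wc 'bond' "$tmpdir/passwd")    # "bond" as a separate word

echo "plain=$plain anchored=$anchored whole_word=$whole_word"
rm -rf "$tmpdir"
```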

9) Display the names of the files that contain the search string
grep -l 'bond' /etc/passwd

10) Display the line number along with the lines that contain the search string
grep -n 'bond' /etc/passwd
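
A sketch of what -l and -n print, using two made-up files (names and content are assumptions):

```shell
# Hypothetical sample files.
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha' 'bond here' 'gamma' > "$tmpdir/a.txt"
printf '%s\n' 'nothing to see' > "$tmpdir/b.txt"

# -l prints only the names of the files with at least one match.
files=$(cd "$tmpdir" && grep -l 'bond' a.txt b.txt)

# -n prefixes each matching line with its line number.
numbered=$(grep -n 'bond' "$tmpdir/a.txt")

echo "$files"
echo "$numbered"
rm -rf "$tmpdir"
```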

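11) Display the lines before/after your search pattern

grep -B 4 'bond' /etc/passwd
This command displays each matching line together with the 4 lines before it.

grep -A 4 'bond' /etc/passwd
This command displays each matching line together with the 4 lines after it.

Note: -A and -B are not part of the POSIX standard; GNU and BSD grep support them, but some older UNIX greps do not.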
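
A short sketch of the context options on an invented five-line file, assuming a grep that supports -A, -B and -C (GNU or BSD grep):

```shell
# Hypothetical sample file.
tmpdir=$(mktemp -d)
printf '%s\n' 'one' 'two' 'bond' 'four' 'five' > "$tmpdir/list.txt"

before=$(grep -B 1 'bond' "$tmpdir/list.txt")   # the match and 1 line before
after=$(grep -A 1 'bond' "$tmpdir/list.txt")    # the match and 1 line after
around=$(grep -C 1 'bond' "$tmpdir/list.txt")   # 1 line on both sides

printf '%s\n' "$around"
rm -rf "$tmpdir"
```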

12) List all files that contain the pattern 'bond' anywhere in the current directory tree

find . -type f -exec grep -il 'bond' {} \;
This command lists all the files that contain the given search pattern (ignoring case, because of -i), no matter how deep in the directory tree they are.
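
Two variations that may be worth trying, sketched on a made-up directory tree: ending -exec with + instead of \; hands many files to one grep process, and grep -r does the recursion itself. -r is an extension (GNU and BSD grep have it), so that line assumes such a grep.

```shell
# Hypothetical directory tree.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/a/b"
echo 'James Bond' > "$tmpdir/a/b/deep.txt"
echo 'nothing'    > "$tmpdir/top.txt"

# "+" batches the file names, so grep is started far less often than with "\;".
with_find=$(cd "$tmpdir" && find . -type f -exec grep -il 'bond' {} +)

# -r walks the directory tree without the help of find.
with_r=$(cd "$tmpdir" && grep -ril 'bond' .)

echo "$with_find"
echo "$with_r"
rm -rf "$tmpdir"
```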

13) Which grep command returns only a specific number of rows for the given search pattern?
For example, if your file has 10 rows containing the given string and you want to display only 5 of them, use the command below (file.txt stands for your file):

grep 'string' file.txt | head -5
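
A sketch with a generated ten-row file (the file name rows.txt is made up). It also shows -m, which stops grep after the given number of matching lines; -m is a GNU/BSD extension, so I am assuming your grep has it.

```shell
# Hypothetical file with 10 rows that all contain "string".
tmpdir=$(mktemp -d)
i=1
while [ "$i" -le 10 ]; do
  echo "row $i string"
  i=$((i + 1))
done > "$tmpdir/rows.txt"

# Let head cut the output after 5 lines.
first_five=$(grep 'string' "$tmpdir/rows.txt" | head -n 5)

# Or let grep itself stop after 5 matching lines.
first_five_m=$(grep -m 5 'string' "$tmpdir/rows.txt")

printf '%s\n' "$first_five"
rm -rf "$tmpdir"
```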

What if I want to search for multiple patterns?
Use egrep. egrep stands for "extended grep".

How does it work?
If you have to search for a pattern that is either "this" or "that", you have to rely on the egrep command, which has a more powerful notational scheme than the grep command.

Notation   Meaning
c          Matches the character c
\c         Forces c to be read as the letter c, not as another meaning the character might have
^          Beginning of the line
$          End of the line
.          Any single character
[xy]       Any single character in the set specified
[^xy]      Any single character not in the set specified
c*         Zero or more occurrences of character c
c+         One or more occurrences of character c
c?         Zero or one occurrences of character c
a|b        Either a or b
(a)        Regular expression a (grouping)
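
The notation above can be tried out on a tiny word list. This is only a sketch: the word list and the count helper are made up for illustration, and grep -E is used as the standard spelling of egrep.

```shell
# Hypothetical word list, one word per line.
tmpdir=$(mktemp -d)
printf '%s\n' 'ct' 'cat' 'caat' 'cut' 'a.b' 'axb' > "$tmpdir/words.txt"

# Helper (hypothetical): number of lines matching an extended regex.
count() { grep -E -c "$1" "$tmpdir/words.txt" || true; }

star=$(count 'ca*t')        # ct, cat, caat
plus=$(count 'ca+t')        # cat, caat
optional=$(count 'ca?t')    # ct, cat
set_in=$(count 'c[au]t')    # cat, cut
set_out=$(count 'c[^a]t')   # cut
any_char=$(count 'a.b')     # a.b, axb
literal=$(count 'a\.b')     # a.b only
anchored=$(count '^c.t$')   # cat, cut
either=$(count 'c(a|u)t')   # cat, cut

echo "$star $plus $optional $set_in $set_out $any_char $literal $anchored $either"
rm -rf "$tmpdir"
```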


The application of egrep is explained below, using the first 10 lines of the passwd file.

cat passwd | head -10

root:x:0:0:Super-User:/:/sbin/sh
daemon:x:1:1::/:
bin:x:2:2::/usr/bin:
sys:x:3:3::/:
adm:x:4:4:Admin:/var/adm:
lp:x:71:8:Line Printer Admin:/usr/spool/lp:
uucp:x:5:5:uucp Admin:/usr/lib/uucp:
nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
smmsp:x:25:25:SendMail Message Submission Program:/:
listen:x:37:4:Network Admin:/usr/net/nls:

When I am looking for one or more occurrences of the character 'c' in the /etc/passwd file, the command using grep is as below
grep 'cc*' /etc/passwd
uucp:x:5:5:uucp Admin:/usr/lib/uucp:
nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico

What if I enter the command as
grep 'c*' /etc/passwd
This lists all the lines that have zero or more occurrences of the character 'c'. Every line satisfies that, so it is equivalent to listing the entire content of the file.

root:x:0:0:Super-User:/:/sbin/sh
daemon:x:1:1::/:
bin:x:2:2::/usr/bin:
sys:x:3:3::/:
adm:x:4:4:Admin:/var/adm:
lp:x:71:8:Line Printer Admin:/usr/spool/lp:
uucp:x:5:5:uucp Admin:/usr/lib/uucp:
nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
smmsp:x:25:25:SendMail Message Submission Program:/:
listen:x:37:4:Network Admin:/usr/net/nls:

whereas with egrep, searching for one or more occurrences (here of the character 'u') is as simple as
egrep 'u+' /etc/passwd
root:x:0:0:Super-User:/:/sbin/sh
bin:x:2:2::/usr/bin:
lp:x:71:8:Line Printer Admin:/usr/spool/lp:
uucp:x:5:5:uucp Admin:/usr/lib/uucp:
nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
smmsp:x:25:25:SendMail Message Submission Program:/:
listen:x:37:4:Network Admin:/usr/net/nls:


Then what if I give the command as
egrep 'uu+' /etc/passwd
This lists all the lines that have two or more consecutive occurrences of the character 'u'.
uucp:x:5:5:uucp Admin:/usr/lib/uucp:
nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
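
The behaviour of * and + can be checked by counting matching lines with -c on the same ten sample lines shown earlier. This is a sketch: the lines are copied into a temporary file (my assumption, so the real /etc/passwd is not needed), and grep -E stands in for egrep.

```shell
# The ten sample passwd lines from above, written to a scratch file.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'root:x:0:0:Super-User:/:/sbin/sh' \
  'daemon:x:1:1::/:' \
  'bin:x:2:2::/usr/bin:' \
  'sys:x:3:3::/:' \
  'adm:x:4:4:Admin:/var/adm:' \
  'lp:x:71:8:Line Printer Admin:/usr/spool/lp:' \
  'uucp:x:5:5:uucp Admin:/usr/lib/uucp:' \
  'nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico' \
  'smmsp:x:25:25:SendMail Message Submission Program:/:' \
  'listen:x:37:4:Network Admin:/usr/net/nls:' > "$tmpdir/passwd"

one_or_more_c=$(grep -c 'cc*' "$tmpdir/passwd" || true)    # at least one c
zero_or_more_c=$(grep -c 'c*' "$tmpdir/passwd" || true)    # matches every line
one_or_more_u=$(grep -E -c 'u+' "$tmpdir/passwd" || true)  # at least one u
double_u=$(grep -E -c 'uu+' "$tmpdir/passwd" || true)      # "uu" or longer
# None of the ten lines contains two consecutive c characters.
double_c=$(grep -E -c 'cc+' "$tmpdir/passwd" || true)

echo "$one_or_more_c $zero_or_more_c $one_or_more_u $double_u $double_c"
rm -rf "$tmpdir"
```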


Now let us look into the question asked above: how to search for multiple patterns.
It is as simple as this

bash-3.00$ cat passwd | head -10 | egrep 'uu|mm'
uucp:x:5:5:uucp Admin:/usr/lib/uucp:
nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
smmsp:x:25:25:SendMail Message Submission Program:/:

I want to list the lines that start with a
bash-3.00$ egrep '^a' passwd
adm:x:4:4:Admin:/var/adm:
apache:x:104:103:Apache User:/export/apache:/bin/bash


What if I want to list the lines that start with a, b, c or d?
bash-3.00$ egrep '^[a-d]' passwd
daemon:x:1:1::/:
bin:x:2:2::/usr/bin:
adm:x:4:4:Admin:/var/adm:
build:x:102:103:Build User:/export/build:/bin/bash
apache:x:104:103:Apache User:/export/apache:/bin/bash
csvn:x:204:204:CollabNet Subversion User:/opt/CollabNet_Subversion:/bin/sh
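
One thing worth knowing when combining head with grep: if grep is given a file name, it reads that file and ignores whatever is piped into it. A sketch on four invented lines (the file content is an assumption):

```shell
# Hypothetical sample data.
tmpdir=$(mktemp -d)
printf '%s\n' 'root:x:0:0' 'daemon:x:1:1' 'adm:x:4:4' 'apache:x:104:103' \
  > "$tmpdir/passwd"

# Pipeline only: grep sees just the first three lines.
piped=$(head -n 3 "$tmpdir/passwd" | grep -E '^a')

# File argument given: grep ignores standard input and reads the whole file.
whole=$(head -n 3 "$tmpdir/passwd" | grep -E '^a' "$tmpdir/passwd")

printf 'piped:\n%s\n' "$piped"
printf 'whole file:\n%s\n' "$whole"
rm -rf "$tmpdir"
```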


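What is fgrep and how is it used?
fgrep stands for 'fixed-string grep' (it is also read as 'fast grep'): it treats each pattern as a literal string rather than a regular expression. Its -f option reads the search strings from a file, one per line. For example, I have a file of search strings, say MyWords.txt, and what I need is to list the lines in the 'appreport.txt' file that contain any of these search strings.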

For this let me go to the directory
bash-3.00$ cd /tmp/shiyas/skills/unix/
bash-3.00$ ls
what is unix.txt
bash-3.00$ cat what\ is\ unix.txt | head -5
UNIX is a computer operating system, a control program that works with users to run
programs, manage resources, and communicate with other computer systems. Several people
can use a UNIX computer at the same time; hence UNIX is called a multiuser system. Any
of these users can also run multiple programs at the same time; hence UNIX is called
multitasking. Because UNIX is such a pastiche.a patchwork of development.it.s a lot


Let me create a file MyWords.txt containing the words manage and such

bash-3.00$ vi MyWords.txt
manage
such
~
~
~
:wq!

"MyWords.txt" [New file] 2 lines, 12 characters
bash-3.00$ cat MyWords.txt
manage
such


bash-3.00$ fgrep -f MyWords.txt what\ is\ unix.txt
programs, manage resources, and communicate with other computer systems. Several people
multitasking. Because UNIX is such a pastiche.a patchwork of development.it.s a lot
used in high-speed networking, file revision management, and software development.
Why is having all this choice such a big deal? Think about why Microsoft MS-DOS and the
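
Here is a sketch of the two things going on in that command: -f reads the patterns from a file, and the patterns are matched as literal strings. The report file and its content are made up; grep -F is the standard spelling of fgrep.

```shell
# Hypothetical word list and report file.
tmpdir=$(mktemp -d)
printf '%s\n' 'manage' 'such' > "$tmpdir/MyWords.txt"
printf '%s\n' \
  'programs, manage resources' \
  'nothing relevant' \
  'such a big deal' \
  'version 1.5' \
  'version 125' > "$tmpdir/report.txt"

# -f FILE: one pattern per line; a line matches if it contains any of them.
word_hits=$(grep -F -f "$tmpdir/MyWords.txt" "$tmpdir/report.txt")

# With -F the dot is a literal dot; as a regular expression it matches
# any character, so "1.5" also matches "125".
fixed_count=$(grep -F -c '1.5' "$tmpdir/report.txt")
regex_count=$(grep -c '1.5' "$tmpdir/report.txt")

printf '%s\n' "$word_hits"
echo "fixed=$fixed_count regex=$regex_count"
rm -rf "$tmpdir"
```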


Is there any alternative to typing such a long command?
Of course there is: use 'alias'.
The command to create the alias is
bash-3.00$ alias search='fgrep -i -f MyWords.txt'
You have to be very careful with the syntax, otherwise it will throw an error.
For example, a space after '=' gives the error below
bash-3.00$ alias search= 'fgrep -i -f MyWords.txt'
bash: alias: fgrep -i -f MyWords.txt: not found


Now let's see how we can use this alias
bash-3.00$ search what\ is\ unix.txt
programs, manage resources, and communicate with other computer systems. Several people
multitasking. Because UNIX is such a pastiche.a patchwork of development.it.s a lot
used in high-speed networking, file revision management, and software development.
Why is having all this choice such a big deal? Think about why Microsoft MS-DOS and the
bash-3.00$


Now I am removing the alias
bash-3.00$ unalias search
bash-3.00$ search what\ is\ unix.txt
bash: search: command not found
bash-3.00$
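
If the shortcut is also needed inside scripts, a shell function is a possible alternative to the alias, because bash does not expand aliases in non-interactive shells by default. A minimal sketch, with a made-up word list and sample file:

```shell
# Hypothetical word list and sample file.
tmpdir=$(mktemp -d)
printf '%s\n' 'manage' 'such' > "$tmpdir/MyWords.txt"
printf '%s\n' 'Manage resources' 'plain line' > "$tmpdir/sample.txt"

# Same job as the alias: "$@" passes on whatever file names are given.
search() {
  grep -F -i -f "$tmpdir/MyWords.txt" "$@"
}

found=$(search "$tmpdir/sample.txt")
echo "$found"

# unset -f removes the function again, like unalias does for an alias.
unset -f search
rm -rf "$tmpdir"
```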

I need to display only the words that match instead of the entire line. What should I do?
One way to achieve this is the 'awk' command, which can split each line into words. Below is a sample
bash-3.00$ echo 'My name is shiyas' | awk '{for (i=1;i<=NF;i++) print $i}'
My
name
is
shiyas

bash-3.00$
NF stands for number of fields (here it is 4)

Now let's work on displaying only the matching words, not the entire line.
The logic we are going to implement is:
make the content of the file a list with one word per line, as above, and then search that list for the patterns.
step1: awk '{for (i=1;i<=NF;i++) print $i}'
step2: fgrep -i -f MyWords.txt (reading the word list produced by step 1)

For this, let's write one shell script that incorporates both of the above commands.

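bash-3.00$ cat search
#!/bin/sh
# Wrongwords - show a list of commonly misused words in the file
# Split every input line into one word per line, then keep only the
# words that contain a string listed in MyWords.txt (ignoring case).
cat "$@" | \
awk '{for (i=1;i<=NF;i++) print $i}' | \
fgrep -i -f MyWords.txt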


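Make sure the alias named search is removed (as done above), otherwise the alias would be used instead of the script
bash-3.00$ unalias search

Give execute permission to the above shell script
bash-3.00$ chmod +x search

The script can then be run as ./search what\ is\ unix.txt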
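
Here is a sketch of the same pipeline on made-up sample text, next to grep -o, which prints only the matched part of each line. -o is an extension available in GNU and BSD grep, so it is an assumption that your grep has it. Note the difference for "management,": the awk route prints the whole word, while -o prints just the matching piece.

```shell
# Hypothetical word list and sample text.
tmpdir=$(mktemp -d)
printf '%s\n' 'manage' 'such' > "$tmpdir/MyWords.txt"
printf '%s\n' \
  'programs, manage resources' \
  'UNIX is such a pastiche' \
  'file revision management, and software' > "$tmpdir/sample.txt"

# One word per line with awk, then keep the words containing a listed string.
via_awk=$(awk '{ for (i = 1; i <= NF; i++) print $i }' "$tmpdir/sample.txt" |
          grep -F -i -f "$tmpdir/MyWords.txt")

# -o: print only the part of the line that matched.
via_o=$(grep -o -i -F -f "$tmpdir/MyWords.txt" "$tmpdir/sample.txt")

printf 'awk + grep:\n%s\n' "$via_awk"
printf 'grep -o:\n%s\n' "$via_o"
rm -rf "$tmpdir"
```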

****
Hope this is helpful. Thanks Phoenix
