About uniq
uniq reports or filters out repeated lines in a file.
uniq command Syntax
uniq [OPTION]... [INPUT [OUTPUT]]
Description
uniq filters out adjacent, matching lines from input file INPUT, writing the filtered data to output file OUTPUT.
If INPUT is not specified, uniq reads from the standard input.
If OUTPUT is not specified, uniq writes to the standard output.
If no options are specified, each run of adjacent matching lines is collapsed into its first occurrence.
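As a minimal sketch of the two ways of supplying input and output described above, the following reads once from standard input and once from named files. The scratch directory and the names input.txt and output.txt are placeholders chosen for illustration only.

```shell
# Hypothetical scratch directory; the file names are placeholders.
tmpdir=$(mktemp -d)
printf 'aa\naa\nbb\naa\n' > "$tmpdir/input.txt"

# No file arguments: read standard input, write standard output.
from_stdin=$(uniq < "$tmpdir/input.txt")

# INPUT and OUTPUT given: read input.txt, write output.txt.
uniq "$tmpdir/input.txt" "$tmpdir/output.txt"
from_files=$(cat "$tmpdir/output.txt")

# Both results hold the same three lines: aa, bb, aa.
printf '%s\n' "$from_stdin"
```

The trailing aa survives because it is not adjacent to the first run of aa lines.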
uniq command options
-c, --count Prefix lines with a number representing how many times they occurred.
-d, --repeated Only print duplicated lines, one for each group of repeats.
-D, --all-repeated[=delimit-method] Print all duplicate lines. delimit-method may be one of the following:
none Do not delimit duplicate lines at all. This is the default.
prepend Insert a blank line before each set of duplicated lines.
separate Insert a blank line between each set of duplicated lines.
The -D option is the same as specifying --all-repeated=none.
-f N, --skip-fields=N Avoid comparing the first N fields of a line before determining uniqueness. A field is a group of characters, delimited by whitespace.
This option is useful, for instance, if the lines of your document are numbered and you want to compare everything on each line except the line number. If the option -f 1 is specified, the adjacent lines
1 This is a line.
2 This is a line.
would be considered identical. Without the -f option, they would be considered unique.
-i, --ignore-case Normally, comparisons are case-sensitive. This option performs case-insensitive comparisons instead.
-s N, --skip-chars=N Avoid comparing the first N characters of each line when determining uniqueness. This is like the -f option, but it skips individual characters rather than fields.
-u, --unique Only print unique lines.
-z, --zero-terminated Use a NUL byte (0), rather than a newline, as the line delimiter.
-w N, --check-chars=N Compare no more than N characters in lines.
--help Display a help message and exit.
--version Output version information and exit.
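A few of the options above can be tried on inline data, as in the sketch below. It assumes GNU uniq (the --all-repeated=separate form is a GNU extension), and the sample lines and variable names are made up for illustration.

```shell
# -f 1: ignore the first whitespace-delimited field (here, a line number).
skip_field=$(printf '1 This is a line.\n2 This is a line.\n' | uniq -f 1)

# -i: compare case-insensitively; the first line of each group is kept.
ignore_case=$(printf 'Apple\napple\nAPPLE\npear\n' | uniq -i)

# --all-repeated=separate (GNU): print every duplicated line, with a
# blank line between groups of duplicates; the unique line xx is dropped.
grouped=$(printf 'aa\naa\nbb\nbb\nxx\n' | uniq --all-repeated=separate)

printf '%s\n' "$skip_field" "$ignore_case" "$grouped"
```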
Notes:
uniq does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use sort -u instead of uniq.
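The note above can be checked with a short sketch. The input lines are hypothetical; the point is only that duplicates must be brought together by sort before uniq can see them.

```shell
# Duplicates exist, but none of them are adjacent.
unsorted=$(printf 'bb\naa\nbb\naa\n')

# uniq alone leaves all four lines in place.
plain=$(printf '%s\n' "$unsorted" | uniq)

# Sorting first makes the duplicates adjacent, so uniq can merge them.
sorted_uniq=$(printf '%s\n' "$unsorted" | sort | uniq)

# sort -u gives the same result in a single step.
sort_u=$(printf '%s\n' "$unsorted" | sort -u)

printf '%s\n' "$sorted_uniq"
```

Piping through sort and then uniq is still the form to use when an option such as -c or -d is needed, since sort -u only removes duplicates.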
Examples
The uniq command is helpful for removing or detecting duplicate entries in a file. This tutorial explains a few of the most frequently used uniq command-line options.
The following test file is used in some of the examples to show how the uniq command works.
$ cat test
aa
aa
bb
bb
bb
xx
Basic Usage
Syntax:
$ uniq [-options]
For example, when the uniq command is run without any options, it collapses each run of adjacent duplicate lines into a single line, as shown below.
$ uniq test
aa
bb
xx
Count Number of Occurrences using -c option
This option counts how many times each line occurs in the file.
$ uniq -c test
2 aa
3 bb
1 xx
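A common use of -c is a frequency table: sort the input so that equal lines become adjacent, count them, then sort the counts in descending order. The sketch below uses made-up input; the awk step only normalizes the leading padding that uniq -c puts before each count, which may differ between implementations.

```shell
# Hypothetical unsorted input with non-adjacent duplicates.
counts=$(printf 'bb\naa\nbb\nxx\nbb\naa\n' | sort | uniq -c | sort -rn)

# Strip the padding so each line reads "COUNT LINE".
ranked=$(printf '%s\n' "$counts" | awk '{print $1, $2}')

printf '%s\n' "$ranked"
```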
Print only Duplicate Lines using -d option
This option prints only the lines that are repeated, once each. As you can see below, the line “xx” is not displayed, because it is not duplicated in the test file.
$ uniq -d test
aa
bb
The above example displayed each duplicated line only once. The -D option, in contrast, prints every occurrence of each duplicated line. For example, the line “aa” appears twice in the test file, so the following uniq command displays “aa” twice in its output.
$ uniq -D test
aa
aa
bb
bb
bb
Print only Unique Lines using -u option
This option prints only the lines that are not repeated in the file.
$ uniq -u test
xx
If you would like to delete duplicate lines that match a certain pattern, you can use the sed delete command.
Limit Comparison to ‘N’ characters using -w option
This option restricts the comparison to the first ‘N’ characters of each line. For this example, use the following test2 input file.
$ cat test2
hi Linux
hi LinuxU
hi LinuxUnix
hi Unix
The following uniq command uses the -w option to compare only the first 8 characters of each line, and the -c option to print the number of occurrences of each line.
$ uniq -c -w 8 test2
3 hi Linux
1 hi Unix
The following uniq command uses the -w option to compare only the first 8 characters of each line, and the -D option to print all duplicate lines.
$ uniq -D -w 8 test2
hi Linux
hi LinuxU
hi LinuxUnix
Avoid Comparing first ‘N’ Characters using -s option
This option skips the first ‘N’ characters of each line when comparing. For this example, use the following test3 input file.
$ cat test3
aabb
xxbb
bbc
bbd
The following uniq command uses the -s option to skip the first 2 characters of each line, and the -D option to print all duplicate lines.
Here, the first 2 characters (‘aa’ in the 1st line and ‘xx’ in the 2nd line) are not compared; the next 2 characters, ‘bb’, are the same in both lines, so they are shown as duplicate lines.
$ uniq -D -s 2 test3
aabb
xxbb
Avoid Comparing first ‘N’ Fields using -f option
This option skips the first ‘N’ fields of each line when comparing. For this example, use the following test4 input file.
$ cat test4
hi hello Linux
hi friend Linux
hi hello LinuxUnix
The following uniq command uses the -f option to skip the first 2 fields of each line, and the -D option to print all duplicate lines.
Here, the first 2 fields (‘hi hello’ in the 1st line and ‘hi friend’ in the 2nd line) are not compared; the next field, ‘Linux’, is the same in both lines, so they are shown as duplicate lines.
$ uniq -D -f 2 test4
hi hello Linux
hi friend Linux
Reference
http://www.thegeekstuff.com/2013/05/uniq-command-examples/
http://www.computerhope.com/unix/uuniq.htm