
Bash: Common Command-Line Tools - uniq

Date: 2015-04-16 19:07:07

NAME
       uniq - report or omit repeated lines

SYNOPSIS
       uniq [OPTION]... [INPUT [OUTPUT]]

DESCRIPTION
       Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output).

       With no options, matching lines are merged to the first occurrence.

       Mandatory arguments to long options are mandatory for short options too.

       -c, --count
              prefix lines by the number of occurrences

       -d, --repeated
              only print duplicate lines

       -D, --all-repeated[=delimit-method]
              print all duplicate lines; delimit-method={none(default),prepend,separate}.  Delimiting is done with blank lines

       -f, --skip-fields=N
              avoid comparing the first N fields

       -i, --ignore-case
              ignore differences in case when comparing

       -s, --skip-chars=N
              avoid comparing the first N characters

       -u, --unique
              only print unique lines

       -z, --zero-terminated
              end lines with 0 byte, not newline

       -w, --check-chars=N
              compare no more than N characters in lines

       --help display this help and exit

       --version
              output version information and exit

       A field is a run of blanks (usually spaces and/or TABs), then non-blank characters.  Fields are skipped before chars.

       Note: uniq does not detect repeated lines unless they are adjacent.  You may want to sort the input first, or use sort -u without uniq.  Also, comparisons honor the rules specified by LC_COLLATE.

The above is the output of man uniq.
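
To make a few of the options listed above concrete, here is a minimal sketch. The sample data and variable names are made up for illustration, and the duplicate lines are deliberately placed next to each other so that uniq can see them.

```shell
# Hypothetical sample input; equal lines are already adjacent.
input=$(printf '%s\n' apple apple banana cherry cherry cherry)

# No options: each run of identical adjacent lines collapses to one line.
collapsed=$(printf '%s\n' "$input" | uniq)

# -d: print one copy of each line that occurs more than once in a run.
repeated=$(printf '%s\n' "$input" | uniq -d)

# -u: print only the lines that are not repeated.
unique_only=$(printf '%s\n' "$input" | uniq -u)

# -f 1: skip the first field (here a made-up timestamp) when comparing.
log=$(printf '%s\n' '10:01 start' '10:02 start' '10:03 stop')
skip_field=$(printf '%s\n' "$log" | uniq -f 1)

printf 'uniq:\n%s\n\nuniq -d:\n%s\n\nuniq -u:\n%s\n\nuniq -f 1:\n%s\n' \
  "$collapsed" "$repeated" "$unique_only" "$skip_field"
```

With -f 1, the two "start" lines compare as equal even though their timestamps differ, and uniq keeps the first line of the run.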

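The closing note says that uniq only detects repeated lines when they are adjacent. This is easy to understand: if repeats must be consecutive, uniq only has to compare each line with the previous one, which saves the memory a hash map would otherwise need for counting. To get such input, sort the data first; after sorting, repeated lines are guaranteed to be adjacent.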
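
The effect of sorting first can be sketched as follows. The input is made up, and the awk one-liner is not part of uniq; it is included only as an assumed stand-in for the hash-map approach, to show the trade-off.

```shell
# Hypothetical input in which no two equal lines are adjacent.
input=$(printf '%s\n' b a b a b)

# Without sorting, uniq finds no adjacent repeats and changes nothing.
unsorted=$(printf '%s\n' "$input" | uniq)

# Sorting makes equal lines adjacent, so uniq -c can count each run.
# awk normalizes the count column, whose padding differs between implementations.
counts=$(printf '%s\n' "$input" | sort | uniq -c | awk '{print $1, $2}')

# Hash-based alternative: awk remembers every distinct line in an
# associative array, so no sort is needed, but memory use grows with
# the number of distinct lines.
hashed=$(printf '%s\n' "$input" | awk '!seen[$0]++')

printf 'unsorted uniq:\n%s\n\nsort | uniq -c:\n%s\n\nawk dedup:\n%s\n' \
  "$unsorted" "$counts" "$hashed"
```

The sort-based pipeline reorders the lines, while the awk variant keeps the first occurrence of each line in its original position.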

Original article: http://www.cnblogs.com/lailailai/p/4432579.html
