Counting same-named files in different directories and listing their paths

Date: 2015-11-14 23:26:57


#!/bin/sh
# Find how many files share the same name under the current directory,
# and list the paths of each one.

# Remove leftovers from earlier runs first so they are not matched by "*.txt"
rm -f search.txt times_of_file.txt result.txt

# Collect the base name of every matching file.
# Note: this for loop splits on whitespace, so it assumes file names without spaces.
for file in `find . -name "*.txt"`
do
        basename "${file}" >> search.txt
done

# uniq -c only counts adjacent duplicates, so sort first
sort search.txt | uniq -c > times_of_file.txt

# Each line of times_of_file.txt is "<count> <name>"
while read -r line
do
  file_cnt=`echo "${line}" | awk '{print $1}'`
  file_name=`echo "${line}" | awk '{print $2}'`
  echo "file name:${file_name}, count is:${file_cnt}" >> result.txt
  echo "file paths are:" >> result.txt
  find . -name "${file_name}" >> result.txt
done < times_of_file.txt

rm -f times_of_file.txt search.txt

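The background for this script: a production environment contained several .so files with the same name, and the system picked up the old .so file by default, which caused a component installation to fail.

So I wanted a script that confirms how many same-named files exist in the environment and exactly where each of them is.

Result of running the script above: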

$ cat result.txt
file name:1.txt, count is:4
file paths are:
./2/1.txt
./1/1.txt
./3/1.txt
./4/4_1/1.txt
file name:2.txt, count is:2
file paths are:
./2/2.txt
./1/2.txt
file name:3.txt, count is:1
file paths are:
./3/3.txt
file name:4.txt, count is:1
file paths are:
./4/4.txt

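The script writes its output to a file by default. Even with a large amount of data, the results then do not flood the terminal and make later checking difficult.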

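After sorting the results, uniq -c conveniently counts how many times each identical string occurs. This is the reason I wrote the script down: the command works really well in this scenario.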
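The same idea can be written as a single pipeline without intermediate files. The following is only a minimal sketch: `count_names` is a hypothetical helper name, and it assumes file names contain no newline characters.

```shell
#!/bin/sh
# count_names DIR PATTERN (hypothetical helper): print "<count> <name>" for
# every base name matching PATTERN under DIR, most frequent first.
count_names() {
    find "$1" -type f -name "$2" |
        sed 's|.*/||' |     # keep only the part after the last slash
        sort |              # bring identical names next to each other
        uniq -c |           # prefix each distinct name with its count
        sort -rn            # largest count first
}
```

For the .so case described earlier, something like `count_names /some/dir '*.so' | awk '$1 > 1'` would show only the names that occur more than once; the directory here is a placeholder.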

Many people online recommend grep -c for counting identical strings, but I find it inconvenient.
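A likely reason for the inconvenience is that grep -c counts the lines matching one pattern, so each name has to be known in advance and queried separately. A small comparison, using a made-up list of names in a temporary file:

```shell
#!/bin/sh
# Temporary list of base names, one per line (sample data for illustration)
names=$(mktemp)
printf '%s\n' 1.txt 2.txt 1.txt > "$names"

# grep -c answers "how many lines match this one pattern":
# -x matches whole lines only, -F treats the pattern as a fixed string
count=$(grep -c -x -F '1.txt' "$names")
echo "1.txt appears ${count} times"

# sort | uniq -c answers the same question for every name in one pass
sort "$names" | uniq -c

rm -f "$names"
```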

As for why sort has to come before uniq -c: uniq -c only counts identical strings that are adjacent to each other.

$ cat test.txt 
A
B
A

$ cat test.txt | uniq -c
      1 A
      1 B
      1 A

As shown above, uniq -c on its own does not count the occurrences of A correctly. After sort, the two A lines are adjacent, so the count comes out right.
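The difference can be checked side by side. This sketch recreates the same three lines (A, B, A) in a temporary file rather than relying on test.txt being present:

```shell
#!/bin/sh
# Same content as the test.txt example: A, B, A
tmp=$(mktemp)
printf 'A\nB\nA\n' > "$tmp"

unsorted=$(uniq -c "$tmp")          # three groups: A, B, A
sorted=$(sort "$tmp" | uniq -c)     # two groups: A counted twice, B once

printf 'without sort:\n%s\n' "$unsorted"
printf 'with sort:\n%s\n' "$sorted"
rm -f "$tmp"
```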

The script above uses both loop forms available in shell, the for loop and the while loop. For reading a file, the while loop is the one I normally use.
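One practical difference between the two forms shows up when a file name contains a space. The sketch below uses a scratch directory and a made-up file name to compare how many items each loop sees:

```shell
#!/bin/sh
# Scratch directory with one file whose name contains a space
orig=$(pwd)
d=$(mktemp -d)
cd "$d" || exit 1
touch "my notes.txt"

# for + command substitution: the unquoted result is split on whitespace,
# so "./my notes.txt" arrives as two words, "./my" and "notes.txt"
for_count=0
for f in `find . -name "*.txt"`
do
    for_count=$((for_count + 1))
done

# while + read: one iteration per line of find output
while_count=$(find . -name "*.txt" | while IFS= read -r f; do echo "$f"; done | wc -l | tr -d ' ')

echo "for saw ${for_count} item(s), while saw ${while_count}"
cd "$orig" || exit 1
rm -rf "$d"
```

Here the for loop iterates twice for a single file, while the while loop iterates once.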

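The file read by the while loop should end with a newline (it is safest to leave an empty line at the end). Otherwise, as often happens when a file created on Windows is copied to Linux, the loop does not process the last line.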
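If the input file cannot be changed, a common workaround is to add a second test to the loop condition. This is a sketch of that idiom, not part of the original script; the two-line sample file is made up:

```shell
#!/bin/sh
tmp=$(mktemp)
printf 'first\nlast' > "$tmp"    # note: no newline after "last"

# Plain loop: read reports failure on the unterminated last line,
# so the loop body never runs for it
plain=0
while read -r line
do
    plain=$((plain + 1))
done < "$tmp"

# Guarded loop: read fails at end of file but still stores the partial
# line in $line, so the extra test lets the body run one more time
guarded=0
while IFS= read -r line || [ -n "$line" ]
do
    guarded=$((guarded + 1))
done < "$tmp"

echo "plain loop saw ${plain} line(s), guarded loop saw ${guarded}"
rm -f "$tmp"
```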


Original article: http://www.cnblogs.com/xbh-blog/p/4965242.html
