Week 2 Assignment: WordCount

Date: 2018-03-20 14:03:59


GitHub project link: https://github.com/liqia/WordCount

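1. Project overview

The program counts the characters, words, and lines in programming-language source files and writes the results in a specified format to a default file. It also provides some extended features and can process multiple files quickly.

The executable is named wc.exe, and it accepts user requests in the following form:

wc.exe [parameter] [input_file_name]

The results are stored by default in result.txt, located in the same directory as wc.exe.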

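2. Project PSP table

PSP 2.1 table

PSP2.1	PSP stage	Estimated time (minutes)	Actual time (minutes)
Planning	Planning	60	100
· Estimate	· Estimate how long the task will take	2 days	1.5 days
Development	Development	1 day	1 day
· Analysis	· Requirements analysis (including learning new technologies)	180	240
· Design Spec	· Produce the design document	50	60
· Design Review	· Design review (review the design document with colleagues)	30	30
· Coding Standard	· Coding standard (define suitable conventions for the current development)	20	60
· Design	· Detailed design	100	120
· Coding	· Coding	240	260
· Code Review	· Code review	30	50
· Test	· Testing (self-testing, fixing code, committing changes)	120	60
Reporting	Reporting	140	100
· Test Report	· Test report	50	30
· Size Measurement	· Measure the workload	40	60
· Postmortem & Process Improvement Plan	· Postmortem and process improvement plan	60	60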

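3. Approach

First, I need to know how to handle files in Java, because the program has to both read and write files.

To count characters and words, I need some regular expressions to process the strings.

To make the project easier to manage, it also has to be pushed to GitHub.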

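4. Program design and implementation

(1) First, process the input arguments to work out which functional modules to call.

To satisfy the output order the requirements specify, I use an array with one slot per possible option (canshu in the code below) to record whether each option is present. If an option is present, its slot is set to 1. Once all arguments have been scanned, the options can be handled in the order the requirements prescribe. See the code for details.

Code: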

public static void main(String[] args) {
    // Slots 0-4 record whether each option is present: characters, words, lines,
    // code/blank/comment lines, and recursive processing
    int[] canshu=new int[5];
    String file = new String();
    String outputFile = new String();
    String stopListFile = new String();
    int flag=0; // set to 1 once an input file name has been seen
    for(int i=0;i<args.length;i++) {
        if (args[i].equals("-c")) canshu[0]=1;
        else if (args[i].equals("-w")) canshu[1]=1;
        else if (args[i].equals("-l")) canshu[2]=1;
        else if (args[i].equals("-a")) canshu[3]=1;
        else if (args[i].equals("-s")) canshu[4]=1;
        else if (args[i].equals("-o"))
        {
            // -o must be followed by the name of a .txt output file
            if (i==args.length-1) erro("Arguments do not match");
            if (Pattern.compile("\\w+\\.txt").matcher(args[i+1]).find())
            {
                outputFile=args[i+1];
                i++;
            }
            else {
                erro("Invalid output file name");
            }
        }
        else if (args[i].equals("-e"))
        {
            // -e must be followed by the name of a .txt stop-list file
            if (i==args.length-1) erro("Arguments do not match");
            if (Pattern.compile("\\w+\\.txt").matcher(args[i+1]).find())
            {
                stopListFile=args[i+1];
                i++;
            }
            else {
                erro("Arguments do not match");
            }
        }
        // Anything that looks like name.ext or *.ext is treated as the input file
        else if (Pattern.compile("(\\w+|\\*)\\.\\w+").matcher(args[i]).find()) {
            if (i==0) erro("Invalid input arguments");
            flag=1;
            file=args[i];
        }
        else {
            erro("Arguments do not match");
        }
    }
    if (flag == 0) erro("Arguments do not match");

    execute(canshu,file,stopListFile,outputFile);
}
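The idea of recording options in an array and then reporting in a fixed order can be illustrated in isolation. The following is only a minimal sketch, not the project's code: the class name FlagOrderSketch, the method orderedReports, and the labels are hypothetical, and only the -c, -w and -l options are covered.

```java
import java.util.ArrayList;
import java.util.List;

final class FlagOrderSketch {
    // Fixed report order: characters, words, lines (labels are hypothetical).
    private static final String[] FLAGS = {"-c", "-w", "-l"};
    private static final String[] LABELS = {"chars", "words", "lines"};

    /** Returns the requested statistics in the fixed order, whatever the argument order. */
    static List<String> orderedReports(String[] args) {
        int[] present = new int[FLAGS.length];
        // Pass 1: mark which options appear anywhere on the command line.
        for (String arg : args) {
            for (int i = 0; i < FLAGS.length; i++) {
                if (FLAGS[i].equals(arg)) present[i] = 1;
            }
        }
        // Pass 2: walk the slots in their fixed order and emit the marked ones.
        List<String> reports = new ArrayList<>();
        for (int i = 0; i < present.length; i++) {
            if (present[i] == 1) reports.add(LABELS[i]);
        }
        return reports;
    }

    public static void main(String[] args) {
        // -l is given before -c, but chars is still reported first.
        System.out.println(orderedReports(new String[]{"-l", "-c", "file.c"})); // [chars, lines]
    }
}
```

Separating the scan from the reporting means the order in which the user types the options never affects the order of the output.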

(2) Write each module according to the functional requirements.

Read all the characters of the file into a String in one go:

static public String retext(String fileName) {
    if (fileName == null) {
        return null;
    }
    String text = null;
    // try-with-resources closes the stream even if reading fails
    try (InputStream is = new FileInputStream(fileName)) {
        // available() estimates how many bytes can be read without blocking;
        // for a local file this is usually the number of bytes remaining in the file
        byte[] b = new byte[is.available()];
        is.read(b); // read that many bytes in a single call
        text = new String(b);
    } catch (FileNotFoundException e) {
        e.printStackTrace();
        erro("Specified file not found");
    } catch (IOException e) {
        e.printStackTrace();
    }
    return text;
}
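As an alternative to sizing the buffer with available(), the standard library can read a whole file in one call. The sketch below is not the project's code: the class ReadAllSketch and the method readText are hypothetical names, and it assumes the source files are UTF-8 encoded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class ReadAllSketch {
    /** Reads the whole file as UTF-8 text; returns null if the name is null or the read fails. */
    static String readText(String fileName) {
        if (fileName == null) return null;
        try {
            // Files.readAllBytes reads until end of file, so no size estimate is needed.
            byte[] bytes = Files.readAllBytes(Paths.get(fileName));
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            return null; // missing or unreadable file
        }
    }

    public static void main(String[] args) {
        // Prints the contents of the file named by the first argument, if one is given.
        if (args.length > 0) System.out.println(readText(args[0]));
    }
}
```

Naming the charset explicitly avoids depending on the platform default, which new String(byte[]) would otherwise use.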

Counting words and lines:
Use the split function of the String class, which splits the text wherever a regular expression matches.

String[] strings = text.split("\\.|,|\\s");
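One detail worth noting is that split returns empty strings when two separators are adjacent (for example a comma followed by a space), so the array length alone can overstate the word count. The sketch below covers word counting only and is not the project's implementation: SplitCountSketch and countWords are hypothetical names, and only the separator regex is taken from the line above.

```java
final class SplitCountSketch {
    /** Counts the non-empty tokens left after splitting on '.', ',' or whitespace. */
    static int countWords(String text) {
        if (text == null || text.isEmpty()) return 0;
        int count = 0;
        for (String token : text.split("\\.|,|\\s")) {
            // Adjacent separators such as ", " yield an empty token; skip it.
            if (!token.isEmpty()) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Tokens are "int", "a", "" and "b;", of which three are non-empty.
        System.out.println(countWords("int a, b;")); // 3
    }
}
```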

Write the results to a file:

static public void output(String text, String outputFile) {
    // The second argument, true, appends to the end of the file instead of overwriting it;
    // try-with-resources closes the stream even if the write fails
    try (FileOutputStream fos = new FileOutputStream(outputFile,true)) {
        fos.write(text.getBytes());
    }
    catch (Exception e) {
        System.out.println(e.getMessage());
    }
}

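Stop list:
First read the stop-list file and split it into individual words. Then, while counting words, check whether each word is in the stop list to decide whether it should be counted.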
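The stop-list check can be sketched as follows. This is an illustration under assumptions rather than the project's code: StopListSketch and countWords are hypothetical names, the stop list is assumed to be whitespace-separated, matching is assumed to be case-sensitive, and the texts are passed in as strings instead of being read from files.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class StopListSketch {
    /** Counts words in text, skipping empty tokens and any word found in the stop list. */
    static int countWords(String text, String stopListText) {
        if (text == null) return 0;
        Set<String> stopWords = new HashSet<>();
        if (stopListText != null) {
            // Assumption: stop words are separated by whitespace.
            stopWords.addAll(Arrays.asList(stopListText.split("\\s+")));
        }
        int count = 0;
        for (String token : text.split("\\.|,|\\s")) {
            if (!token.isEmpty() && !stopWords.contains(token)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // "while" appears twice and is on the stop list, leaving "(a)" and "b".
        System.out.println(countWords("while (a) b, while", "while if")); // 2
    }
}
```

A HashSet keeps each membership check at roughly constant time, which matters when the same stop list is applied to many files.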

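Returning code lines / blank lines / comment lines:

Read the file line by line.
To detect a blank line, match \s in a regular expression, and also check whether the line holds nothing but a single brace.
To detect "//" comments, the regular expression can be written like this:

\\s*//.*|\\}//.*
For "/* */" comments, set a marker flag to true when "/*" is matched. Every following line is then a comment, until "*/" is matched and the flag is set back to false.
Detailed code: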


public class CountLine {
    private int cntCode=0, cntNode=0, cntSpace=0;
    private boolean flagNode = false;
    public int[] reA(String fileName) {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            String line;
            while((line = br.readLine()) != null)
                pattern(line);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("Comment lines: " + cntNode);
        System.out.println("Blank lines: " + cntSpace);
        System.out.println("Code lines: " + cntCode);
        System.out.println("Total lines: " + (cntNode+cntSpace+cntCode));
        return new int[]{cntCode, cntSpace, cntNode};
    }

    private  void pattern(String line) {
        String regxNodeBegin = "\\s*/\\*.*";  // line opens a /* block comment
        String regxNodeEnd = ".*\\*/\\s*";    // line closes a block comment with */
        String regx = "\\s*//.*|\\}//.*";     // a // comment, alone or right after a single }
        String regxSpace = "(\\s*)|(\\s*(\\{|\\})\\s*)"; // blank line, or a lone { or }
        if(line.matches(regxNodeBegin) && line.matches(regxNodeEnd)){
            ++cntNode;
            return ;
        }
        if(line.matches(regxNodeBegin)){
            ++cntNode;
            flagNode = true;
        } else if(line.matches(regxNodeEnd)){
            ++cntNode;
            flagNode = false;
        } else if(line.matches(regxSpace)||line.equals("\uFEFF"))
            ++cntSpace;
        else if(line.matches(regx))
            ++cntNode;
        else if(flagNode)
            ++cntNode;
        else ++cntCode;
    }
}
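The line classifier relies on String.matches, which succeeds only when the regular expression covers the entire line, whereas the argument parsing earlier uses Matcher.find, which succeeds on any substring. The sketch below contrasts the two; the class MatchVsFindSketch and its method names are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

final class MatchVsFindSketch {
    /** True only if the regex covers the entire line (String.matches semantics). */
    static boolean matchesWholeLine(String regex, String line) {
        return line.matches(regex);
    }

    /** True if the regex matches any substring of the line (Matcher.find semantics). */
    static boolean foundAnywhere(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    public static void main(String[] args) {
        String lineComment = "\\s*//.*";
        System.out.println(matchesWholeLine(lineComment, "   // note"));      // true
        System.out.println(matchesWholeLine(lineComment, "int a; // note")); // false
        System.out.println(foundAnywhere(lineComment, "int a; // note"));    // true
    }
}
```

This is why a line such as "int a; // note" is not matched by the comment pattern and falls through to the code-line branch.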
5. Test design process
Because a null or empty string can trigger unexpected bugs, I tried to include boundary tests wherever possible during testing.

        // Test the retext function
        String text = WordCount.retext("src/momo/File.txt");
//        String text1 = WordCount.retext(" ");
        System.out.println(text);
        // Test the reWorld function
        int testReWord1 = WordCount.reWorld(null, null);
        int testReWord2 = WordCount.reWorld(text, "whetu.txt");
        // Test the reCount function
        int testReCount = WordCount.reCount(null);
        int testReCount1 = WordCount.reCount(text);
        // Test the reLine function
        int testReLine = WordCount.reLine(null);
        int testReLine1 = WordCount.reLine(text);
        // Test the output function
        WordCount.output(null,"whetu.txt");
        WordCount.output("123","whetu.txt");
6. Reference links

 Several ways to read files in Java: https://www.cnblogs.com/hudie/p/5845187.html
 Java regular expressions: http://www.runoob.com/java/java-regular-expressions.html
 Ways to get a folder or file path in Java: https://www.cnblogs.com/tk55/p/6064160.html
 How to package a jar in IDEA: https://jingyan.baidu.com/article/7e4409531fbf292fc1e2ef51.html

Original post: https://www.cnblogs.com/liqia/p/8608602.html
