Java(面试题)：字符串截取

时间：2017-06-23 23:04:09 阅读：177 评论：0 收藏：0 [点我收藏+]

标签：默认结果 imp blog 字符串截取 class property 处理 length

在Java中，字符串“abcd”与字符串“ab你好”的长度是一样，都是四个字符。
但对应的字节数不同，一个汉字占两个字节。
定义一个方法，按照指定的字节数来取子串。
如：对于“ab你好”，如果取三个字节，那么子串就是ab与“你”字的半个，那么半个就要舍弃。如果取四个字节就是“ab你”，取五个字节还是“ab你”。

上面给出的是在gbk编码下的截取字符串。
下面我写了个代码，可以在utf-8和gbk编码下都能截取字符串。

注意：utf-8下的绝大多数汉字都是3个字节，所以，为了简化，全部当成了3个字节处理。

注意：
在上一个中，我把题意理解错了，其实题目要求的只是输出第一个n字节的字串就可以了。
在上一个中我是把一个字符串按照n拆分了。。。。

package io.app;

import java.io.IOException;

import org.junit.Test;

/**
 * 
 * @author 陈浩翔
 *
 * @version 1.0  2016-4-28
 */
public class StringCut {


    public static void main(String[] args) {

        String str = "ab你好a琲琲";
        byte bf[] = str.getBytes();//这里是采用默认编码，可能是GBK，也可能是UTF-8
        for(int i=0;i<=bf.length;i++){
            String res;
            try {
                res = cutString(str,i);
                System.out.println(i+" : "+res);
            } catch (IOException e) {
                e.printStackTrace();
            }

        }


    }

    /**
     * 根据传入的字符串，来判断是什么编码的，分别导向不同的方法
     * @param str
     * @param len
     * @return
     * @throws IOException 
     */
    private static String cutString(String str, int len) throws IOException {
        //System.getProperty("file.encoding")---获得系统的编码
        if(System.getProperty("file.encoding").equalsIgnoreCase("gbk")){
            return cutStringGbk(str, len);
        }
        if(System.getProperty("file.encoding").equalsIgnoreCase("utf-8")){
            return cutStringUtf8(str, len);
        }
        throw new RuntimeException("不支持当前系统的编码");
    }

    private static String cutStringUtf8(String str, int len) throws IOException {
        byte buf[] = str.getBytes("utf-8");
        int count=0;
        for(int i=len-1;i>=0;i--){
            if(buf[i]<0){
                count++;
            }else{
                break;
            }
        }
        int x = count%3;
        return new String(buf,0,len-x,"utf-8");
    }

    private static String cutStringGbk(String str, int len) throws IOException {
        byte buf[] = str.getBytes("gbk");
        int count=0;
        for(int i=len-1;i>=0;i--){
            if(buf[i]<0){
                count++;
            }else{
                break;
            }
        }
        if(count%2==0){
            return new String(buf,0,len,"gbk");
        }else{
            return new String(buf,0,len-1,"gbk");
        }
    }


    @Test
    /**
     * 可以不需要main方法进行运行单个方法！！！！
     * @throws IOException
     */
    public void analyze() throws IOException {
        //String str ="ab你好";
        String str ="ab你好a琲琲琲";
        //byte buf[] = str.getBytes("gbk");
        byte buf[] = str.getBytes("utf-8");
        for(byte b:buf){
            System.out.print(b+" ");
        }
        System.out.println();
    }


}

GBK下的运行结果：
（汉字为2个字节）

0 : 
1 : a
2 : ab
3 : ab
4 : ab你
5 : ab你
6 : ab你好
7 : ab你好a
8 : ab你好a
9 : ab你好a琲
10 : ab你好a琲
11 : ab你好a琲琲

UTF-8下的运行结果：
（汉字理解为3个字节）

0 : 
1 : a
2 : ab
3 : ab
4 : ab
5 : ab你
6 : ab你
7 : ab你
8 : ab你好
9 : ab你好a
10 : ab你好a
11 : ab你好a
12 : ab你好a琲
13 : ab你好a琲
14 : ab你好a琲
15 : ab你好a琲琲

Java(面试题)：字符串截取

标签：默认结果 imp blog 字符串截取 class property 处理 length

原文地址：http://www.cnblogs.com/ckwblogs/p/7071787.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年07月29日 (22)
2021年07月28日 (40)
2021年07月27日 (32)
2021年07月26日 (79)
2021年07月23日 (29)
2021年07月22日 (30)
2021年07月21日 (42)
2021年07月20日 (16)
2021年07月19日 (90)
2021年07月16日 (35)

周排行