
Java: the relationship between characters and bytes

Date: 2014-11-02 22:36:00


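Problem:

How many bytes do Chinese and English characters each occupy in Java? When the data volume is large or storage is tight, the byte count per character matters, for example when estimating how many machines are needed.

Solution:

The short version: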

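1 Java char = 2 bytes (one 16-bit UTF-16 code unit, which is how strings are counted in memory)

1 byte = 8 bits

1 English (ASCII) character takes 1 byte when encoded as UTF-8 or GBK, which is half the size of a Java char

1 Chinese character takes 2 to 4 bytes, depending on the encoding:

UTF-8: 3 bytes per Chinese character (4 bytes for the rarer characters outside the Basic Multilingual Plane)

GBK: 2 bytes per Chinese character

UTF-16: 2 bytes per Chinese character; some Han characters in the Unicode extension blocks need 4 bytes (a surrogate pair)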
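The difference between Java chars, Unicode code points, and encoded bytes can be checked with a small sketch. The class name EncodingSizes and the helper describe are made up for illustration, and the three sample characters are my own choice rather than the author's. GBK is left out here because it cannot represent the Extension B sample.

```java
import java.nio.charset.StandardCharsets;

class EncodingSizes {

    /** Summarizes how one string is counted in chars, code points, and encoded bytes. */
    static String describe(String label, String s) {
        return label
            + ": chars=" + s.length()
            + ", codePoints=" + s.codePointCount(0, s.length())
            + ", UTF-8=" + s.getBytes(StandardCharsets.UTF_8).length
            // UTF_16BE writes no byte order mark, so this is the raw payload size.
            + ", UTF-16BE=" + s.getBytes(StandardCharsets.UTF_16BE).length
            // UTF_16 prepends a 2-byte byte order mark when encoding.
            + ", UTF-16 with BOM=" + s.getBytes(StandardCharsets.UTF_16).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the result independent of the source file's encoding.
        System.out.println(describe("ASCII 'a'", "a"));
        System.out.println(describe("U+4E2D (common Han character)", "\u4E2D"));
        System.out.println(describe("U+20000 (CJK Extension B)", "\uD840\uDC00"));
    }
}
```

Note that encoding with the charset named UTF-16 adds 2 bytes for the byte order mark, so a quick test with that charset can make a 2-byte character look like it takes 4 bytes.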

These numbers are hard to remember; the most reliable approach is to run a small program and check.

Example:

import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/**
 * Description: test example for Java characters and bytes
 * Class: BytesDemo.java
 * @author dutycode
 * @weibo ideaduty
 * @email dutycode@gmail.com
 * @website http://www.dutycode.com
 * 2014-11-2
 * @version 1.0.1
 */
public class BytesDemo {

    public static void main(String[] args) {
        String e1 = "english";
        String c1 = "中文";

        // getBytes() without arguments uses the platform default charset,
        // which is not guaranteed to be UTF-8 on every JVM.
        byte[] eb1 = e1.getBytes();
        byte[] cb1 = c1.getBytes();

        byte[] ebUTF8 = e1.getBytes(StandardCharsets.UTF_8);
        byte[] cbUTF8 = c1.getBytes(StandardCharsets.UTF_8);

        Charset gbk = Charset.forName("GBK");
        byte[] ebGbk = e1.getBytes(gbk);
        byte[] cbGbk = c1.getBytes(gbk);

        String def = Charset.defaultCharset().name();
        System.out.println("English, default (" + def + "): " + eb1.length);
        System.out.println("Chinese, default (" + def + "): " + cb1.length);
        System.out.println("English, UTF-8: " + ebUTF8.length);
        System.out.println("Chinese, UTF-8: " + cbUTF8.length);
        System.out.println("English, GBK: " + ebGbk.length);
        System.out.println("Chinese, GBK: " + cbGbk.length);
    }
}

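Output:

The 7-letter English string takes 7 bytes in every case. The 2-character Chinese string takes 6 bytes under UTF-8 (and under the default charset when that default is UTF-8) and 4 bytes under GBK.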
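Coming back to the capacity question from the problem statement, one rough way to use these byte counts is to measure a sample of records and scale up. The sketch below is only an illustration: the class StorageEstimate, the helper totalBytes, the sample strings, and the record count of one billion are all assumptions, not part of the original article.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class StorageEstimate {

    /** Total encoded size, in bytes, of all records in the given charset. */
    static long totalBytes(List<String> records, Charset cs) {
        long total = 0;
        for (String r : records) {
            total += r.getBytes(cs).length;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical sample; in practice, draw real records from the data set.
        // "\u4E2D\u6587" is the two-character Chinese string used in the example above.
        List<String> sample = Arrays.asList("english", "\u4E2D\u6587", "mixed \u4E2D\u6587");
        long expectedRecords = 1_000_000_000L; // assumed record count

        for (Charset cs : new Charset[] {StandardCharsets.UTF_8, StandardCharsets.UTF_16BE}) {
            double avg = (double) totalBytes(sample, cs) / sample.size();
            double gib = avg * expectedRecords / (1L << 30);
            System.out.printf("%s: %.1f bytes per record, about %.1f GiB in total%n",
                    cs.name(), avg, gib);
        }
    }
}
```

The estimate covers only the encoded text; any per-record overhead added by the storage format has to be accounted for separately.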


Original article: http://blog.csdn.net/losetowin/article/details/40717009
