Hadoop Study Notes (2)

Date: 2014-12-19 17:19:27


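Hadoop serialization: the variable-length encoding used for long and int values:

If the integer lies in [-112, 127], it takes a single byte, and that byte is the value itself.

If the integer is greater than 127, the first byte lies in [-120, -113], and the number of value bytes that follow is (-112 - first byte), at most eight.

If the integer is less than -112, the first byte lies in [-128, -121], and the number of value bytes that follow is (-120 - first byte), at most eight. In this case the bytes that follow hold the one's complement of the value.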
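The three rules can be sketched as two small helpers. This is only an illustration under the rules stated here, not Hadoop's own code; the class name `VLongSizeSketch` and the methods `valueBytes` and `firstByte` are hypothetical names chosen for this note.

```java
class VLongSizeSketch {
    // Number of value bytes that follow the first byte (0 for single-byte values).
    static int valueBytes(long value) {
        if (value >= -112 && value <= 127) {
            return 0;
        }
        // Negative values are stored as their one's complement, which is non-negative.
        long magnitude = value < 0 ? ~value : value;
        int significantBits = 64 - Long.numberOfLeadingZeros(magnitude);
        return (significantBits + 7) / 8; // round up to whole bytes
    }

    // First byte written for the value, following the three rules.
    static byte firstByte(long value) {
        int n = valueBytes(value);
        if (n == 0) {
            return (byte) value;
        }
        return (byte) (value < 0 ? -120 - n : -112 - n);
    }
}
```

For instance, `firstByte(128)` gives -113 and `valueBytes(128)` gives 1, so 128 occupies two bytes in total.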

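Sign-magnitude, then one's complement (sign bit unchanged, every other bit inverted), then two's complement (sign bit unchanged, add 1 to the one's complement). These conversions apply to negative numbers; for non-negative numbers all three forms are identical.

For example, -5 in 8 bits:

Sign-magnitude (1000 0101), one's complement (1111 1010), two's complement (1111 1011)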
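The -5 example can be checked with a few lines of Java. This is a minimal sketch for 8-bit values in the range -127..127; the class `SignedFormsSketch` and its method names are made up for illustration.

```java
class SignedFormsSketch {
    // Format the low 8 bits of v as "xxxx xxxx".
    static String bits8(int v) {
        String s = String.format("%8s", Integer.toBinaryString(v & 0xFF)).replace(' ', '0');
        return s.substring(0, 4) + " " + s.substring(4);
    }

    // Sign bit plus magnitude (valid for -127..127).
    static int signMagnitude(int v) {
        return v < 0 ? (0x80 | -v) : v;
    }

    // Invert every bit of the magnitude; the sign bit ends up as 1.
    static int onesComplement(int v) {
        return v < 0 ? (~(-v) & 0xFF) : v;
    }

    // Java already stores ints in two's complement, so keep the low 8 bits.
    static int twosComplement(int v) {
        return v & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(bits8(signMagnitude(-5)));   // 1000 0101
        System.out.println(bits8(onesComplement(-5)));  // 1111 1010
        System.out.println(bits8(twosComplement(-5)));  // 1111 1011
    }
}
```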

A look at one's complement:

https://ccrma.stanford.edu/~jos/mdft/One_s_Complement_Fixed_Point_Format.html

One's Complement is a particular assignment of bit patterns to numbers. For example, in the case of 3-bit binary numbers, we have the assignments shown in Table G.2.

Table G.2: Three-bit one's-complement binary fixed-point numbers.

Binary  Decimal
000      0
001      1
010      2
011      3
100     -3
101     -2
110     -1
111     -0

In general, $N$-bit numbers are assigned to binary counter values in the "obvious way" as integers from 0 to $2^{N-1}-1$, and then the negative numbers are assigned in reverse order, as shown in the example.

The term "one's complement" refers to the fact that negating a number in this format is accomplished by simply complementing the bit pattern (inverting each bit).

Note that there are two representations for zero (all 0s and all 1s). This is inconvenient when testing if a number is equal to zero. For this reason, one's complement is generally not used.
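A small 3-bit sketch makes both points concrete: negation is a plain bit inversion, and the patterns 000 and 111 both decode to zero. The class `OnesComplementSketch` and its methods are hypothetical helpers written for this note.

```java
class OnesComplementSketch {
    static final int BITS = 3;
    static final int MASK = (1 << BITS) - 1; // 0b111

    // Negation in one's complement: invert every bit.
    static int negate(int pattern) {
        return ~pattern & MASK;
    }

    // Decode a 3-bit pattern to its value. A leading 1 marks a negative number
    // whose magnitude is the inverted pattern, so 111 decodes to "negative zero".
    static int decode(int pattern) {
        boolean negative = (pattern >> (BITS - 1)) == 1;
        return negative ? -(~pattern & MASK) : pattern;
    }
}
```

Here `negate(0b010)` yields `0b101`, which decodes to -2, while `decode(0b000)` and `decode(0b111)` both give 0, so a zero test has to accept two different bit patterns.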

Two's complement:

https://ccrma.stanford.edu/~jos/mdft/Two_s_Complement_Fixed_Point_Format.html

In two's complement, numbers are negated by complementing the bit pattern and adding 1, with overflow ignored. From 0 to $2^{N-1}-1$, positive numbers are assigned to binary values exactly as in one's complement. The remaining assignments (for the negative numbers) can be carried out using the two's complement negation rule. Regenerating the example in this way gives Table G.3.

Table G.3: Three-bit two's-complement binary fixed-point numbers.

Binary  Decimal
000      0
001      1
010      2
011      3
100     -4
101     -3
110     -2
111     -1

Note that according to our negation rule, $-(-4) = -4$. Logically, what has happened is that the result has "overflowed" and "wrapped around" back to itself. Note that $3 + 1 = -4$ also. In other words, if you compute 4 somehow, since there is no bit-pattern assigned to 4, you get -4, because -4 is assigned the bit pattern that would be assigned to 4 if $N$ were larger. Note that numerical overflows naturally result in "wrap around" from positive to negative numbers (or from negative numbers to positive numbers). Computers normally "trap" overflows as an "exception." The exceptions are usually handled by a software "interrupt handler," and this can greatly slow down the processing by the computer (one numerical calculation is being replaced by a rather sizable program).
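The negation rule and the wrap-around can be reproduced with a 3-bit sketch. The class `TwosComplementSketch` and its methods are illustrative names, not part of any library; patterns are kept in the low three bits of an `int`.

```java
class TwosComplementSketch {
    static final int BITS = 3;
    static final int MASK = (1 << BITS) - 1; // 0b111

    // Keep only the low 3 bits of a value.
    static int encode(int value) {
        return value & MASK;
    }

    // Interpret a 3-bit pattern as a signed value in -4..3.
    static int decode(int pattern) {
        int signBit = 1 << (BITS - 1);
        return (pattern & signBit) != 0 ? pattern - (1 << BITS) : pattern;
    }

    // Negation: complement the bits and add 1, ignoring overflow.
    static int negate(int pattern) {
        return (~pattern + 1) & MASK;
    }

    // Addition with overflow ignored (wraps around).
    static int add(int a, int b) {
        return (a + b) & MASK;
    }
}
```

With these helpers, `decode(negate(encode(-4)))` is still -4, and `decode(add(encode(3), encode(1)))` is also -4, matching the two wrap-around cases in the text.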

Note that temporary overflows are ok in two's complement; that is, if you add $1$ to $3$ to get $-4$, adding $-1$ to $-4$ will give $3$ again. This is why two's complement is a nice choice: it can be thought of as placing all the numbers on a "ring," allowing temporary overflows of intermediate results in a long string of additions and/or subtractions. All that matters is that the final sum lie within the supported dynamic range.
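Java's own `int` type behaves like such a ring, since `int` addition wraps silently on overflow. The sketch below, with the hypothetical class `RingSumSketch`, shows an intermediate overflow that cancels out once the remaining terms are added.

```java
class RingSumSketch {
    // Sum the terms with plain int arithmetic; intermediate overflows wrap around.
    static int sum(int... terms) {
        int acc = 0;
        for (int t : terms) {
            acc += t;
        }
        return acc;
    }
}
```

`sum(Integer.MAX_VALUE, 1)` wraps to `Integer.MIN_VALUE`, yet `sum(Integer.MAX_VALUE, 1, -1)` comes back to `Integer.MAX_VALUE`, because the final result lies within the representable range.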

Computers designed with signal processing in mind (such as so-called "Digital Signal Processing (DSP) chips") generally just do the best they can without generating exceptions. For example, overflows quietly "saturate" instead of "wrapping around" (the hardware simply replaces the overflow result with the maximum positive or negative number, as appropriate, and goes on). Since the programmer may wish to know that an overflow has occurred, the first occurrence may set an "overflow indication" bit which can be manually cleared. The overflow bit in this case just says an overflow happened sometime since it was last checked.
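Saturation with a sticky overflow bit can be imitated in software. The following is only a sketch of the idea on 32-bit values, not a model of any particular DSP chip; the class `SaturatingAdderSketch` and its method names are assumptions made for this note.

```java
class SaturatingAdderSketch {
    // Sticky flag: set on the first overflow and kept until explicitly cleared.
    private boolean overflowed;

    // Add two ints, clamping to the int range instead of wrapping around.
    int add(int a, int b) {
        long wide = (long) a + b; // cannot overflow in 64 bits
        if (wide > Integer.MAX_VALUE) {
            overflowed = true;
            return Integer.MAX_VALUE;
        }
        if (wide < Integer.MIN_VALUE) {
            overflowed = true;
            return Integer.MIN_VALUE;
        }
        return (int) wide;
    }

    // Report whether an overflow happened since the last check, then clear the flag.
    boolean checkAndClearOverflow() {
        boolean was = overflowed;
        overflowed = false;
        return was;
    }
}
```

After `add(Integer.MAX_VALUE, 1)` the result stays at `Integer.MAX_VALUE`, and `checkAndClearOverflow()` returns true once, then false until another overflow occurs.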

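// Requires java.io.DataOutput and java.io.IOException.
public static void writeVLong(DataOutput stream, long i) throws IOException {
    // Values in [-112, 127] fit in a single byte.
    if (i >= -112 && i <= 127) {
      stream.writeByte((byte)i);
      return;
    }

    int len = -112;
    if (i < 0) {
      i ^= -1L; // take one's complement
      len = -120;
    }

    // One loop iteration per value byte: this counts how many bytes are needed.
    long tmp = i;
    while (tmp != 0) {
      tmp = tmp >> 8;
      len--;
    }

    // The first byte encodes both the sign and the byte count.
    stream.writeByte((byte)len);

    // Recover the byte count from the first byte.
    len = (len < -120) ? -(len + 120) : -(len + 112);

    for (int idx = len; idx != 0; idx--) {
      int shiftbits = (idx - 1) * 8;
      long mask = 0xFFL << shiftbits; // the most significant byte is written first
      stream.writeByte((byte)((i & mask) >> shiftbits));
    }
  }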
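To see that the format is reversible, here is a hedged round-trip sketch. The `writeVLong` body repeats the logic of the listing above in compact form so the example compiles on its own; `readVLong` here is a reader written from the encoding rules in this note rather than copied from Hadoop, and the class `VLongRoundTripSketch` with its `encode` and `decode` helpers is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class VLongRoundTripSketch {
    // Same logic as the writeVLong listing, condensed.
    static void writeVLong(DataOutput out, long i) throws IOException {
        if (i >= -112 && i <= 127) {
            out.writeByte((byte) i);
            return;
        }
        int len = -112;
        if (i < 0) {
            i ^= -1L;   // one's complement
            len = -120;
        }
        for (long tmp = i; tmp != 0; tmp >>= 8) {
            len--;      // one step per value byte
        }
        out.writeByte((byte) len);
        len = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = len; idx != 0; idx--) {
            int shift = (idx - 1) * 8;
            out.writeByte((byte) ((i >> shift) & 0xFF)); // most significant byte first
        }
    }

    // Reader derived from the rules: the first byte gives sign and byte count.
    static long readVLong(DataInput in) throws IOException {
        byte first = in.readByte();
        if (first >= -112) {
            return first;                       // single-byte value
        }
        boolean negative = first < -120;
        int len = negative ? -120 - first : -112 - first;
        long value = 0;
        for (int k = 0; k < len; k++) {
            value = (value << 8) | (in.readByte() & 0xFF);
        }
        return negative ? ~value : value;       // undo the one's complement
    }

    static byte[] encode(long v) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            writeVLong(new DataOutputStream(buf), v);
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static long decode(byte[] bytes) {
        try {
            return readVLong(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, `encode(300)` produces three bytes (-114, 0x01, 0x2C), and `decode(encode(-300))` returns -300.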


Original article: http://www.cnblogs.com/dorothychai/p/4174168.html
