Core Python | 2 - Core Python: Getting Started | 2.4 - Introducing Strings, Collections, and Iteration | 2.4.4 - Bytes

时间：2021-03-06 14:52:54 阅读：0 评论：0 收藏：0 [点我收藏+]

标签：contains ble nbsp you ams repr app png opera

Bytes are very similar to strings, except that rather than being sequences of Unicode code points, they are sequences of, well, bytes. As such, they are used for raw binary data and fixed?width single?byte character encodings such as ASCII. As with strings, they have a simple, literal form using quotes, the first of which is prefixed by a lower case b. There is also a bytes constructor, but it‘s an advanced feature and we won‘t cover it in this fundamentals course. At this point, it‘s sufficient for us to recognize bytes literals, and understand that they support most of the same operations as string, such as indexing, which returns the integer value of the specified byte, and splitting, which you‘ll see returns a list of bytes objects. To convert between bytes and strings, we must know the encoding of the byte sequence used to represent the string‘s Unicode code points as bytes. Python supports a wide variety of encodings, a full list of which can be found at python.org. Let‘s start with an interesting Unicode string which contains all the characters of the 29?letter Norwegian alphabet, a pangram. We‘ll now encode that using UTF?8 into a bytes object. See how the Norwegian characters have each been rendered as pairs of bytes. We can reverse that process using the decode method of the bytes object. Again, we must supply the correct encoding. We can check that the result is equal to what we started with, and display it for good measure. This may seem like an unnecessary detail so early in the course, especially if you operate in an anglophone environment, but it‘s a crucial point to understand since files and network resources such as HTTP responses are transmitted as byte streams, whereas we often prefer to work with the convenience of Unicode strings.

字节（bytes）与字符串（str）

字符串就是你说的话，human-reable，比如：I want my mother to call me eight times a day

当你要把这句话存储到计算机硬盘上，你要将它转换为计算机能懂的话，就是机器语言，"01010101..."，它由若干个比特（一个比特就是一个0或者一个1）组成，8个比特组成1个字节。

将huaman-readable转换成机器语言的过程叫做encode编码，反之叫做decode解码。文件和网络资源（如HTTP响应）是以字节流的形式传输的，所以这时候，也需要进行encode和decode转换

在Python3中，当涉及到encode和decode的时候，它会以默认的UTF-8的编码方法自动帮你encode和decode

技术图片

Python3可以直接对字节进行操作，方法类似于字符串，例如索引、切片。不过，为什么不把字节先decode成字符串，操作完之后，再encode回去呢？这样更直观

Core Python | 2 - Core Python: Getting Started | 2.4 - Introducing Strings, Collections, and Iteration | 2.4.4 - Bytes

标签：contains ble nbsp you ams repr app png opera

原文地址：https://www.cnblogs.com/hmlhml/p/14489434.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年07月29日 (22)
2021年07月28日 (40)
2021年07月27日 (32)
2021年07月26日 (79)
2021年07月23日 (29)
2021年07月22日 (30)
2021年07月21日 (42)
2021年07月20日 (16)
2021年07月19日 (90)
2021年07月16日 (35)

周排行