2015/8/31 Python Basics (5): Strings

Posted: 2015-08-31 23:06:43

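Strings are one of the most common types in Python. You create one by enclosing characters in quotes, and single and double quotes are equivalent. Python has no separate character type; a string of length one is used instead.
A Python string is a literal, or scalar, type: the interpreter treats it as a single value that does not contain other Python objects. Strings are also immutable. Individual characters and substrings are accessed by indexing and slicing.
Python 2 has three string types: the regular string (str), the Unicode string (unicode), and the abstract string type (basestring). The first two are subclasses of the last. basestring cannot be instantiated; trying to do so produces the following error.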

>>> basestring('foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: The basestring type cannot be instantiated

Creating and assigning strings
Creating a string is simple: write a literal directly, or call a factory function such as str().

>>> string1 = 'Python'
>>> string2 = "easy" # single and double quotes are equivalent
>>> string3 = str(123)
>>> string4 = str(range(4))
>>> string1
'Python'
>>> string2
'easy'
>>> string3
'123'
>>> string4
'[0, 1, 2, 3]'

Access the characters and substrings of a string with the index and slice operators.

>>> aString = 'Hello World!'
>>> aString[0]
'H'
>>> aString[1:5]
'ello'
>>> aString[6:]
'World!'
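
As a further sketch of indexing and slicing, the lines below use negative indices and a step value, which the session above does not show. The variable names are only for illustration, and the code is written so that it runs on Python 2.7 and Python 3 alike.

```python
a_string = 'Hello World!'

last_char = a_string[-1]        # negative indices count from the end
last_word = a_string[-6:]       # the final six characters
every_other = a_string[::2]     # a step of 2 takes every second character
reversed_str = a_string[::-1]   # a step of -1 walks the string backwards

print(last_char)
print(last_word)
print(every_other)
print(reversed_str)
```

Each slice returns a new string; the original a_string is left untouched.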

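Changing a string means "updating" it by assignment.
Like numeric types, strings are immutable, so every update creates a new string.
Removing characters and strings
Because strings are immutable, the only way to remove a character is to build a new string without it.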

>>> aString = 'Hello World!'
>>> aString = aString[:3] + aString[4:]
>>> aString
'Helo World!'
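
To make the immutability concrete, here is a minimal sketch that first tries an in-place item assignment (which fails) and then wraps the slice-and-join approach in a small function. remove_char is a hypothetical helper name introduced for this example, not something from the standard library.

```python
a_string = 'Hello World!'

# Item assignment is rejected because strings are immutable.
try:
    a_string[0] = 'J'
    assignment_failed = False
except TypeError:
    assignment_failed = True

def remove_char(s, index):
    """Return a copy of s without the character at position index."""
    return s[:index] + s[index + 1:]

shorter = remove_char(a_string, 3)

print(assignment_failed)
print(shorter)
print(a_string)   # the original string is unchanged
```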

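To clear or remove an entire string, assign an empty string to the variable or use the del statement.
In most applications there is no need to delete strings explicitly.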

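Most string operators are the general sequence operators; see the earlier notes.
Below is an example that combines strings with the membership operators.
Python's string module provides the following predefined strings: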

>>> import string
>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
>>> string.ascii_letters
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.digits
'0123456789'
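
A short sketch of the membership operators (in, not in) applied to these constants. is_all_digits is a hypothetical helper added only for illustration.

```python
import string

print('a' in string.ascii_lowercase)     # single-character membership
print('xyz' in string.ascii_lowercase)   # "in" also matches substrings
print('7' not in string.digits)

def is_all_digits(text):
    """True if text is non-empty and every character is an ASCII digit."""
    return len(text) > 0 and all(ch in string.digits for ch in text)

print(is_all_digits('2015'))
print(is_all_digits('20a5'))
```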

We will use these to write a small script that checks whether a string is a valid Python identifier.

import string

alphas = string.letters + '_'
nums = string.digits

print 'Welcome to the Identifier Checker v1.0'
print 'Testees must be at least 2 chars long.'
inp = raw_input('Identifier to test? ')

if len(inp) > 1:
    if inp[0] not in alphas:
        print 'invalid: first symbol must be alphabetic'
    else:
        # for/else: the else branch runs only if the loop never hit break
        for otherChar in inp[1:]:
            if otherChar not in alphas + nums:
                print 'invalid: remaining symbols must be alphanumeric'
                break
        else:
            print 'okay as an identifier'

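Core tip: performance
Generally speaking, it is inefficient to repeat a computation inside a loop condition when its result never changes.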

i = 0
while i < len(myString):
    print 'character %d is:' % i, myString[i]
    i += 1

Here much of the work is wasted recomputing len(myString) on every iteration. Saving the value once makes the loop more efficient.

i = 0
length = len(myString)
while i < length:
    print 'character %d is:' % i, myString[i]
    i += 1

The identifier example above has the same kind of repetition:

for otherChar in inp[1:]:
    if otherChar not in alphas + nums:
        ...

Performing the concatenation on every iteration is inefficient. It can be done once, before the loop:

alphnums = alphas + nums
for otherChar in inp[1:]:
    if otherChar not in alphnums:
        ...
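
In practice the index-based while loop can often be avoided altogether. The sketch below uses enumerate(), so len() is never called and the counter cannot be forgotten. describe_chars is a hypothetical helper name used only for this example.

```python
def describe_chars(my_string):
    """Build one 'character N is: X' line per character of my_string."""
    lines = []
    for i, ch in enumerate(my_string):
        lines.append('character %d is: %s' % (i, ch))
    return lines

for line in describe_chars('abc'):
    print(line)
```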

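This script is the textbook example and is not perfect: first, it requires the identifier to be longer than one character, and second, it does not take Python keywords into account. To address both problems I wrote the following script: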

#!/usr/bin/env python

import string
import keyword

alphas = string.letters + '_'
nums = string.digits

print 'Welcome to the Identifier Checker v2.0'
inp = raw_input('Identifier to test: ')
if inp in keyword.kwlist:
    # keywords pass the character tests but are still not valid identifiers
    print "It can't be a keyword."
elif len(inp) > 0:
    if inp[0] not in alphas:
        print 'invalid: first symbol must be alphabetic'
    else:
        for otherChar in inp[1:]:
            if otherChar not in alphas + nums:
                print 'invalid: remaining symbols must be alphanumeric'
                break
        else:
            print 'okay as an identifier'
else:
    print "It can't be empty."
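
The same checks can also be packaged as a function, which makes them easy to test without typing input each time. This is only a sketch under a few assumptions: check_identifier, ALPHAS and ALPHNUMS are names made up for this example, string.ascii_letters is used instead of string.letters so that the code also runs on Python 3, and the rule is the ASCII-only one from the scripts above (Python 3 itself accepts many non-ASCII identifiers as well).

```python
import keyword
import string

ALPHAS = string.ascii_letters + '_'
ALPHNUMS = ALPHAS + string.digits   # concatenated once, outside any loop

def check_identifier(name):
    """Return a short verdict string for the candidate identifier."""
    if not name:
        return 'invalid: empty string'
    if name in keyword.kwlist:
        return 'invalid: keyword'
    if name[0] not in ALPHAS:
        return 'invalid: first symbol must be alphabetic'
    for other_char in name[1:]:
        if other_char not in ALPHNUMS:
            return 'invalid: remaining symbols must be alphanumeric'
    return 'okay as an identifier'

for candidate in ['x', '_private1', 'for', '2abc', 'a-b', '']:
    print('%r: %s' % (candidate, check_identifier(candidate)))
```

Returning the verdict instead of printing it keeps the checking logic separate from the user interaction.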

Besides joining strings with +, Python has another idiom.

>>> abc = 'Hello' 'World'
>>> abc
'HelloWorld'

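This form lets you write a string in several parts, which is useful when breaking it across lines.
The two kinds of quotes can be mixed freely in this form as well.
When a regular string is concatenated with a Unicode string, the result is converted to a Unicode string.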
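
A small sketch of the adjacent-literal idiom used across lines, with both quote styles mixed. The parentheses are what allow the expression to continue on the next line; the message text is made up for the example. Note that this only works between literals, not between variables.

```python
message = ('Adjacent string literals are joined '
           "by the compiler, "
           'even across lines and quote styles.')

print(message)
```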

String-only operators

1. The format operator %

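Format character	Conversion
%c	Character (an integer ASCII code or a string of length one)
%r	String conversion via repr()
%s	String conversion via str()
%d / %i	Signed decimal integer
%u	Decimal integer (obsolete; identical to %d)
%o	Signed octal integer
%x / %X	Signed hexadecimal integer (x / X selects lowercase or uppercase hex digits)
%e / %E	Scientific notation (e / E selects the exponent letter)
%f / %F	Floating-point decimal (six decimal places by default, rounded)
%g / %G	Chooses between %e and %f (%E and %F for %G) depending on the exponent
%%	A literal %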

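This operator applies only to string types. It closely resembles the string formatting of printf() in C, down to the conversion symbols.
The format operator also accepts the following auxiliary directives: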

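Symbol	Meaning
*	Takes the width or precision from the argument tuple
-	Left-aligns the value
+	Shows a plus sign (+) before positive numbers
<sp>	Shows a space before positive numbers
#	Prefixes octal values with a zero ('0') and hexadecimal values with '0x' or '0X' (depending on whether 'x' or 'X' is used)
%	'%%' outputs a single '%'
(var)	Maps to a variable (dictionary argument)
m.n	m is the minimum total width, n is the number of digits after the decimal point (where applicable)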

Usage examples:

>>> '%x' % 108
'6c'
>>> '%X' % 108
'6C'
>>> '%#X' % 108
'0X6C'
>>> '%f' % 1234.567890
'1234.567890'
>>> '%.2f' % 1234.567890
'1234.57'
>>> '%E' % 1234.567890
'1.234568E+03'
>>> '%e' % 1234.567890
'1.234568e+03'
>>> '%g' % 1234.567890
'1234.57'
>>> '%G' % 1234.567890
'1234.57'
>>> '%e' % (1111111111111)
'1.111111e+12'
>>> '%+d' % 4
'+4'
>>> '%+d' % -4
'-4'
>>> 'we are at %d%%' % 100
'we are at 100%'
>>> 'Your host is: %s' % 'earth'
'Your host is: earth'
>>> 'Host: %s\tPort: %d' % ('mars', 80)
'Host: mars\tPort: 80'
>>> num = 13
>>> 'dec: %d/oct: %#o/hex: %#X' % (num, num, num)
'dec: 13/oct: 015/hex: 0XD'
>>> "MM/DD/YY = %02d/%02d/%d" % (2, 15, 67)
'MM/DD/YY = 02/15/67'
>>> w, p = 'Web', 'page'
>>> 'http://xxx.yyy.zzz/%s/%s.html' % (w, p)
'http://xxx.yyy.zzz/Web/page.html'
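
The session above does not exercise the -, *, <sp>, m.n and (var) directives from the table, so here is a brief sketch of each. The variable names and sample values are made up for illustration; the behaviour shown is the same on Python 2.7 and Python 3.

```python
left = '%-8s|' % 'abc'            # '-' left-aligns within a width of 8
padded = '%*d' % (6, 42)          # '*' reads the width (6) from the tuple
precise = '%.*f' % (3, 3.14159)   # '*' can supply the precision as well
wide = '%10.2f' % 1234.5678       # m.n: minimum width 10, 2 decimal places
spaced = '% d' % 7                # space flag: a blank before positive numbers
mapped = '%(lang)s has %(howmany)d quote types' % {'lang': 'Python',
                                                   'howmany': 3}

for result in (left, padded, precise, wide, spaced, mapped):
    print(repr(result))
```

With the (var) form the right-hand side is a dictionary rather than a tuple, so the same key can be reused several times in one format string.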

The format operator is also a debugging tool. Every Python object has a string representation,
and the print statement automatically calls str() on each object.
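
For debugging, the difference between %s (str()) and %r (repr()) matters most when a value contains characters that are invisible when printed. A minimal sketch, with a sample value chosen only for illustration:

```python
value = 'tab\there'

as_str = '%s' % value    # str(): the human-readable form, tab left as-is
as_repr = '%r' % value   # repr(): quotes added and the tab shown as \t

print(as_str)
print(as_repr)
```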

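2. String templates
The drawback of the format operator is that it is not very intuitive; with the dictionary form, for example, it is easy to forget the conversion type character. To get the conversion right you must remember the conversion type for each argument.
The newer string templates do away with those details and use the dollar sign ($) instead.
A Template object has two methods, substitute() and safe_substitute(). The former is strict and raises a KeyError when a key is missing; the latter leaves the missing placeholder in the string unchanged.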

>>> from string import Template
>>> s = Template('There are ${howmany} ${lang} Quotation Symbols')
>>> print s.substitute(lang='Python', howmany=3)
There are 3 Python Quotation Symbols
>>> print s.substitute(lang='Python')
Traceback (most recent call last):
  ...
KeyError: 'howmany'
>>> print s.safe_substitute(lang='Python')
There are ${howmany} Python Quotation Symbols
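
In a script, the KeyError raised by substitute() would normally be caught rather than left to end the program. The sketch below reuses the template from the session; the names full, missing_key, partial and from_dict are only for illustration.

```python
from string import Template

s = Template('There are ${howmany} ${lang} Quotation Symbols')

full = s.substitute(lang='Python', howmany=3)

# substitute() is strict: a missing key raises KeyError.
try:
    s.substitute(lang='Python')
    missing_key = None
except KeyError as err:
    missing_key = err.args[0]

# safe_substitute() leaves the unknown placeholder in place.
partial = s.safe_substitute(lang='Python')

# Both methods also accept a mapping instead of keyword arguments.
from_dict = s.substitute({'lang': 'Python', 'howmany': 3})

print(full)
print(missing_key)
print(partial)
```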

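3. The raw string operator (r/R)
Some characters form escape sequences, which makes printing them literally awkward.
Python therefore provides raw strings, in which every character is taken literally and has no special meaning.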

>>> '\n'
'\n'
>>> print '\n'
>>> r'\n'
'\\n'
>>> print r'\n'
\n
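
Raw strings are most often seen in regular expressions and Windows-style paths, where backslashes are frequent. A short sketch; the path and the sample text are hypothetical values chosen for the example.

```python
import re

print(len('\n'))     # one character: the newline
print(len(r'\n'))    # two characters: a backslash and the letter n

win_path = r'C:\temp\new'   # hypothetical path; \t and \n stay literal
numbers = re.findall(r'\d+', 'py2.7 and py3.10')

print(win_path)
print(numbers)
```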

4. The Unicode string operator (u/U)
It is used the same way as the raw string operator: the u/U prefix turns a string literal into a Unicode string object.

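Built-in functions
The standard built-in function cmp() is not covered again here.
Sequence type functions
len()
max() and min()
enumerate()
zip()
All of these except zip() were covered in earlier posts. zip() works as follows: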

>>> s, t = 'foa', 'obr'
>>> zip(s, t)
[('f', 'o'), ('o', 'b'), ('a', 'r')]
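
Two further properties of zip() are worth a quick sketch: the pairs can be consumed directly in a loop or generator expression, and the result stops at the shorter input. The list() call is there because zip() returns a lazy iterator on Python 3 (on Python 2 it already returns a list, and wrapping it is harmless). The variable names are illustrative.

```python
s, t = 'foa', 'obr'

pairs = list(zip(s, t))
interleaved = ''.join(a + b for a, b in zip(s, t))
truncated = list(zip('abc', 'xy'))   # stops when the shorter string ends

print(pairs)
print(interleaved)
print(truncated)
```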

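String type functions
raw_input()
Prompts the user with the given string and returns the input.
str() and unicode()
Not covered again here.
chr(), unichr(), and ord()
chr() takes an integer argument in the range 0 to 255 and returns the corresponding one-character string.
unichr() returns a Unicode character and accepts a larger range.
ord() is the inverse of chr(): given a character, it returns its integer code (the ASCII value for a regular string).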
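
As a small sketch of how ord() and chr() pair up, the code below does a round trip and then uses both to rotate lowercase ASCII letters. shift_letter is a hypothetical helper written for this example, and it assumes its input is a lowercase ASCII letter.

```python
code = ord('a')          # character to integer code
next_char = chr(code + 1)   # integer code back to a character

def shift_letter(ch, offset):
    """Rotate a lowercase ASCII letter by offset positions, wrapping at z."""
    base = ord('a')
    return chr((ord(ch) - base + offset) % 26 + base)

shifted = ''.join(shift_letter(c, 13) for c in 'python')

print(code)
print(next_char)
print(shifted)
```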

To be continued


Original article: http://www.cnblogs.com/SRL-Southern/p/4774352.html
