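Comparing markup languages
- XML
The earliest general-purpose markup language for information. It is highly extensible but verbose. HTML is a close relative rather than a subset of XML (both derive from SGML); XHTML is the XML-based variant.
- JSON
Values are typed, which suits processing by programs (it grew out of JavaScript), and it is more concise than XML. Commonly used for communication between mobile apps, cloud services, and nodes. It has no comments.
- YAML
Types are mostly implicit, the proportion of actual text to markup is the highest of the three, and readability is good. Used for configuration files in all kinds of systems; it supports comments and is easy to read.
YAML is well suited to writing configuration files: it is concise and expressive, and usually more convenient to write by hand than JSON.
For Python, YAML support is provided by the PyYAML package, available at https://pypi.python.org/pypi/PyYAML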
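Introduction to YAML
YAML (pronounced /ˈjæməl/) is designed to be easy for humans to read and write. In essence it is a general-purpose data serialization format.
Its basic syntax rules are:
- Case sensitive
- Indentation expresses hierarchy
- Tabs are not allowed for indentation; only spaces may be used
- The number of spaces does not matter, as long as elements at the same level are left-aligned
- # starts a comment: everything from this character to the end of the line is ignored by the parser, just like comments in Python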
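The rules above can be checked with a minimal sketch, assuming PyYAML is installed and imported as yaml. The sample keys (Name, name, nested) are made up for illustration.

```python
import yaml  # PyYAML, assumed installed (pip install pyyaml)

# Keys are case-sensitive, and text after " #" is ignored as a comment.
text = """
Name: upper   # comments run to the end of the line
name: lower
nested:
    deep: 1   # any consistent number of spaces works
"""
config = yaml.safe_load(text)
print(config)

# A tab used for indentation is rejected by the parser.
try:
    yaml.safe_load("nested:\n\tdeep: 1\n")
    tab_rejected = False
except yaml.YAMLError:
    tab_rejected = True
print(tab_rejected)
```

Name and name end up as two separate keys, and the tab-indented input raises an error instead of being parsed.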
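YAML supports three kinds of data structures:
- Mapping: a collection of key-value pairs, also called an object, hash, or dictionary
- Sequence: an ordered group of values, also called an array or list
- Scalar: a single, indivisible value: string, boolean, integer, float, null, time, date
These structures can be laid out in an outline-like style, with indentation expressing the nesting.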
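As a rough illustration of how the three structures map onto Python, the sketch below loads a small made-up document (the keys name, level, items, and stats are hypothetical) and records the Python type of each top-level value. It assumes PyYAML is installed.

```python
import yaml  # PyYAML, assumed installed

text = """
name: Bilbo          # scalar (string)
level: 4             # scalar (integer)
items:               # sequence
  - ring
  - sword
stats:               # mapping
  hp: 34
  sp: 8
"""
data = yaml.safe_load(text)

# Map each top-level key to the name of the Python type it was loaded as.
kinds = {key: type(value).__name__ for key, value in data.items()}
print(kinds)
```

Sequences come back as list, mappings as dict, and scalars as the matching built-in type.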
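Documents
A YAML stream is a collection of zero or more documents. An empty stream contains no documents. Documents are separated with ---. A document may optionally end with .... A single document may or may not be marked with ---.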
Example of an implicit document:
- Multimedia
- Internet
- Education
Example of an explicit document:
---
- Afterstep
- CTWM
- Oroborus
...
Example of several documents in the same stream:
---
- Ada
- APL
- ASP
- Assembly
- Awk
---
- Basic
---
- C
- C# # Note that comments are denoted with ' #' (space then #).
- C++
- Cold Fusion
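A stream with several --- separated documents can be read one document at a time. The following is a minimal sketch assuming PyYAML is installed; the shortened language lists are only sample data.

```python
import yaml  # PyYAML, assumed installed

stream = """\
---
- Ada
- APL
---
- Basic
---
- C
- C++
"""
# safe_load_all yields one Python object per YAML document in the stream.
documents = list(yaml.safe_load_all(stream))
print(len(documents))
print(documents)

# An empty stream contains no documents at all.
empty = list(yaml.safe_load_all(""))
print(empty)
```

Using yaml.safe_load on a stream like this would fail, because it expects exactly one document.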
Block sequences
In the block context, sequence entries are denoted by "- " (dash then space).
# YAML
- The Dagger 'Narthanc'
- The Dagger 'Nimthanc'
- The Dagger 'Dethanc'
# Python
["The Dagger 'Narthanc'", "The Dagger 'Nimthanc'", "The Dagger 'Dethanc'"]
Block sequences can be nested:
# YAML
-
  - HTML
  - LaTeX
  - SGML
  - VRML
  - XML
  - YAML
-
  - BSD
  - GNU Hurd
  - Linux
# Python
[['HTML', 'LaTeX', 'SGML', 'VRML', 'XML', 'YAML'], ['BSD', 'GNU Hurd', 'Linux']]
A nested sequence does not have to start on a new line:
# YAML
- 1.1
- - 2.1
  - 2.2
- - - 3.1
    - 3.2
    - 3.3
# Python
[1.1, [2.1, 2.2], [[3.1, 3.2, 3.3]]]
A block sequence can be nested inside a block mapping. Note that in this case the sequence does not need to be indented.
# YAML
left hand:
- Ring of Teleportation
- Ring of Speed
right hand:
- Ring of Resist Fire
- Ring of Resist Cold
- Ring of Resist Poison
# Python
{'right hand': ['Ring of Resist Fire', 'Ring of Resist Cold', 'Ring of Resist Poison'],
'left hand': ['Ring of Teleportation', 'Ring of Speed']}
Block mappings
In the block context, the keys and values of a mapping are separated by ": " (colon then space).
# YAML
base armor class: 0
base damage: [4,4]
plus to-hit: 12
plus to-dam: 16
plus to-ac: 0
# Python
{
'plus to-hit': 12,
'base damage': [4, 4],
'base armor class': 0,
'plus to-ac': 0,
'plus to-dam': 16
}
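Complex keys are denoted with "? " (question mark then space). The !!python/tuple tag used here is PyYAML-specific and is not accepted by yaml.safe_load.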
# YAML
? !!python/tuple [0,0]
: The Hero
? !!python/tuple [0,1]
: Treasure
? !!python/tuple [1,0]
: Treasure
? !!python/tuple [1,1]
: The Dragon
# Python
{(0, 1): 'Treasure', (1, 0): 'Treasure', (0, 0): 'The Hero', (1, 1): 'The Dragon'}
Block mappings can be nested as well:
# YAML
hero:
  hp: 34
  sp: 8
  level: 4
orc:
  hp: 12
  sp: 0
  level: 2
# Python
{'hero': {'hp': 34, 'sp': 8, 'level': 4}, 'orc': {'hp': 12, 'sp': 0, 'level': 2}}
A block mapping can be nested inside a block sequence:
# YAML
- name: PyYAML
  status: 4
  license: MIT
  language: Python
- name: PySyck
  status: 5
  license: BSD
  language: Python
# Python
[{'status': 4, 'language': 'Python', 'name': 'PyYAML', 'license': 'MIT'},
{'status': 5, 'license': 'BSD', 'name': 'PySyck', 'language': 'Python'}]
Flow collections
The syntax of flow collections in YAML is very close to the syntax of list and dictionary literals in Python:
# YAML
{ str: [15, 17], con: [16, 16], dex: [17, 18], wis: [16, 16], int: [10, 13], chr: [5, 8] }
# Python
{
'dex': [17, 18],
'int': [10, 13],
'chr': [5, 8],
'wis': [16, 16],
'str': [15, 17],
'con': [16, 16]
}
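Flow style and block style are two notations for the same data. The sketch below, which assumes PyYAML is installed and reuses two of the keys from the example above, loads both forms and compares the results.

```python
import yaml  # PyYAML, assumed installed

# The same mapping written once in flow style and once in block style.
flow = "{ str: [15, 17], con: [16, 16] }"
block = """
str:
  - 15
  - 17
con:
  - 16
  - 16
"""
flow_data = yaml.safe_load(flow)
block_data = yaml.safe_load(block)
same = flow_data == block_data
print(same)
```

Which notation to use is a readability choice: flow style is compact for short collections, block style is easier to scan and diff for longer ones.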
Scalars
Scalars are the most basic, indivisible values. The following data types are all YAML scalars.
- String
- Boolean
- Integer
- Float
- Null
- Time
- Date
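To see which Python type each scalar becomes, here is a small sketch assuming PyYAML is installed. The key names are invented for the example, and the exact timezone handling of the timestamp may vary between PyYAML versions, so only its type is checked.

```python
import datetime

import yaml  # PyYAML, assumed installed

text = """
string: hello
boolean: true
integer: 42
float: 3.5
nothing: null
timestamp: 2001-12-14T21:59:43Z
date: 2002-12-14
"""
scalars = yaml.safe_load(text)

# Print each key with the Python type PyYAML chose for its value.
for key, value in scalars.items():
    print(key, type(value).__name__, repr(value))

is_datetime = isinstance(scalars["timestamp"], datetime.datetime)
is_date = scalars["date"] == datetime.date(2002, 12, 14)
```

Because types are inferred from the text, a value that should stay a string (for example a version number such as 1.10) needs quotes.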
PyYAML
The main Python library for parsing and emitting YAML.
Loading YAML
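yaml.safe_load converts a YAML document into basic Python objects. Recent PyYAML releases no longer accept yaml.load without an explicit Loader argument.
>>> import yaml
>>> yaml.safe_load("""
... - Hesperiidae
... - Papilionidae
... - Apatelodidae
... - Epiplemidae
... """)
['Hesperiidae', 'Papilionidae', 'Apatelodidae', 'Epiplemidae']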
PyYAML supports Unicode
>>> yaml.safe_load("""
... hello: Привет!
... """)
{'hello': 'Привет!'}
>>> with open('document.yaml', 'r', encoding='utf-8') as stream:  # 'document.yaml' contains a single YAML document.
...     data = yaml.safe_load(stream)
...
>>> data
[...] # A Python object corresponding to the document.
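Loading from a file can be tried end to end with a temporary file. This is a minimal sketch assuming PyYAML is installed; the file name document.yaml matches the session above, while the temporary directory is just a convenient place to put it.

```python
import os
import tempfile

import yaml  # PyYAML, assumed installed

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "document.yaml")

    # Write a one-document YAML file, then read it back.
    with open(path, "w", encoding="utf-8") as stream:
        stream.write("hello: Привет!\n")

    # The with block closes the file even if parsing fails.
    with open(path, encoding="utf-8") as stream:
        loaded = yaml.safe_load(stream)

print(loaded)
```

Passing the open file object lets PyYAML read the stream itself, so the whole file never has to be held in a separate string first.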
A more complex example: several documents in one string
>>> documents = """
... ---
... name: The Set of Gauntlets 'Pauraegen'
... description: >
...     A set of handgear with sparks that crackle
...     across its knuckleguards.
... ---
... name: The Set of Gauntlets 'Paurnen'
... description: >
...     A set of gauntlets that gives off a foul,
...     acrid odour yet remains untarnished.
... ---
... name: The Set of Gauntlets 'Paurnimmen'
... description: >
...     A set of handgear, freezing with unnatural cold.
... """
>>> for data in yaml.safe_load_all(documents):
...     print(data)
...
{'name': "The Set of Gauntlets 'Pauraegen'", 'description': 'A set of handgear with sparks that crackle across its knuckleguards.\n'}
{'name': "The Set of Gauntlets 'Paurnen'", 'description': 'A set of gauntlets that gives off a foul, acrid odour yet remains untarnished.\n'}
{'name': "The Set of Gauntlets 'Paurnimmen'", 'description': 'A set of handgear, freezing with unnatural cold.\n'}
Conversion to Python types
>>> yaml.safe_load("""
... none: [~, null]
... bool: [true, false, on, off]
... int: 42
... float: 3.14159
... list: [LITE, RES_ACID, SUS_DEXT]
... dict: {hp: 13, sp: 5}
... """)
{'none': [None, None], 'bool': [True, False, True, False], 'int': 42,
 'float': 3.14159, 'list': ['LITE', 'RES_ACID', 'SUS_DEXT'],
 'dict': {'hp': 13, 'sp': 5}}
Even instances of Python classes can be constructed using the !!python/object tag. This requires an unsafe loader, so use it only on input you trust.
>>> class Hero:
...     def __init__(self, name, hp, sp):
...         self.name = name
...         self.hp = hp
...         self.sp = sp
...     def __repr__(self):
...         return "%s(name=%r, hp=%r, sp=%r)" % (
...             self.__class__.__name__, self.name, self.hp, self.sp)
...
>>> yaml.unsafe_load("""
... !!python/object:__main__.Hero
... name: Welthyr Syxgon
... hp: 1200
... sp: 0
... """)
Hero(name='Welthyr Syxgon', hp=1200, sp=0)
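Object tags are the reason untrusted YAML should go through the safe loader. The sketch below, assuming PyYAML is installed, feeds the same !!python/object document to yaml.safe_load and shows that it is refused rather than turned into an object.

```python
import yaml  # PyYAML, assumed installed

text = """
!!python/object:__main__.Hero
name: Welthyr Syxgon
hp: 1200
sp: 0
"""
try:
    yaml.safe_load(text)
    rejected = False
except yaml.constructor.ConstructorError as error:
    # The safe loader has no constructor registered for python/* tags.
    rejected = True
    print(error.problem)

print(rejected)
```

For configuration files and anything received over the network, the safe loader is the reasonable default; the unsafe loaders are for data the program itself produced.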
Dumping YAML
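Serializing a Python object
>>> # default_flow_style=None keeps short inner collections inline, as in the output below
>>> print(yaml.dump({'name': 'Silenthand Olleander', 'race': 'Human',
...     'traits': ['ONE_HAND', 'ONE_EYE']}, default_flow_style=None))
name: Silenthand Olleander
race: Human
traits: [ONE_HAND, ONE_EYE]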
Dumping to a stream
>>> with open('document.yaml', 'w', encoding='utf-8') as stream:
...     yaml.dump(data, stream) # Write a YAML representation of data to 'document.yaml'.
...
>>> print(yaml.dump(data)) # Output the document to the screen.
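Dumping several documents
yaml.dump_all writes each item of a list as a separate document; explicit_start=True marks the start of every document with ---. Depending on the PyYAML version, a closing ... line may follow the last scalar document.
>>> print(yaml.dump([1,2,3], explicit_start=True, default_flow_style=None))
--- [1, 2, 3]
>>> print(yaml.dump_all([1,2,3], explicit_start=True))
--- 1
--- 2
--- 3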
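A quick way to check multi-document output is a round trip: dump a list of objects as separate documents, then load the stream back. This is a minimal sketch assuming PyYAML is installed; the three sample documents are arbitrary.

```python
import yaml  # PyYAML, assumed installed

docs = [{"name": "first"}, [1, 2, 3], "plain text"]

# Each list item becomes its own document, introduced by ---.
text = yaml.safe_dump_all(docs, explicit_start=True)
print(text)

# Loading the stream back yields the same objects in the same order.
restored = list(yaml.safe_load_all(text))
print(restored == docs)
```

Comparing the restored objects instead of the exact text keeps the check independent of formatting details that differ between PyYAML versions.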
Serializing class instances
>>> class Hero:
...     def __init__(self, name, hp, sp):
...         self.name = name
...         self.hp = hp
...         self.sp = sp
...     def __repr__(self):
...         return "%s(name=%r, hp=%r, sp=%r)" % (
...             self.__class__.__name__, self.name, self.hp, self.sp)
...
>>> print(yaml.dump(Hero("Galain Ysseleg", hp=-3, sp=2), default_flow_style=None))
!!python/object:__main__.Hero {hp: -3, name: Galain Ysseleg, sp: 2}
Controlling the output format
yaml.dump accepts keyword arguments that control the layout of the emitted text. In Python 3, range must be converted to a list before dumping.
>>> print(yaml.dump(list(range(50)), default_flow_style=None))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42,
  43, 44, 45, 46, 47, 48, 49]
>>> print(yaml.dump(list(range(50)), width=50, indent=4, default_flow_style=None))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
    16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
    28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
>>> print(yaml.dump(list(range(5)), canonical=True))
---
!!seq [
  !!int "0",
  !!int "1",
  !!int "2",
  !!int "3",
  !!int "4",
]
>>> print(yaml.dump(list(range(5)), default_flow_style=False))
- 0
- 1
- 2
- 3
- 4
>>> print(yaml.dump(list(range(5)), default_flow_style=True, default_style='"'))
[!!int "0", !!int "1", !!int "2", !!int "3", !!int "4"]
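Two further options that often matter for configuration files are key order and non-ASCII text. The sketch below assumes PyYAML 5.1 or later (where the sort_keys argument is available); the sample dictionary is made up.

```python
import yaml  # PyYAML >= 5.1 assumed, for the sort_keys argument

data = {"zeta": 1, "alpha": "Привет"}

# By default keys are sorted and non-ASCII characters are escaped.
default_text = yaml.safe_dump(data)
print(default_text)

# Keep insertion order and write Unicode characters as they are.
custom_text = yaml.safe_dump(data, sort_keys=False, allow_unicode=True)
print(custom_text)

# Both forms load back to the same dictionary.
same = yaml.safe_load(default_text) == yaml.safe_load(custom_text) == data
```

Keeping insertion order and readable Unicode tends to produce files that are easier to review by hand, while the defaults give stable, ASCII-only output.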