标签:array abs tran raw fas round sch example pre
转自: https://github.com/TWKB/Specification/blob/master/twkb.md
Version
Release
0.23
May 1, 2015
TWKB is a multi-purpose format for serializing vector geometry data into a byte buffer, with an emphasis on minimizing size of the buffer.
The original OGC "well-known binary" format is a simple format, and is capable of easily representing complex OGC geometries like nested collections, but it has two important drawbacks for use as a production serialization:
A new serialization format can address the problem of alignment, or the problem of size, but not both. Given that most current client/server perfomance issues are bottlenecked on network transport times, TWKB concentrates on solving the problem of serialization size.
TWKB applies the following principles:
Every TWKB geometry contains standard attributes at the top of the object.
Size: 1 byte, holding geometry type and number of dimensions
The type-byte stores both the geometry type, and the dimensionality of the coordinates of the geometry.
Bits
Role
Purpose
1-4
Unsigned Integer
Geometry type
5-8
Signed Integer
Geometry precision
Bits 1-4 store the geometry type (there is space for 16 and we currently use 7):
Bits 5-8 store the "precision", which refers to the number of base-10 decimal places stored.
A positive precision implies retaining information to the right of the decimal place
A negative precision implies rounding up to the left of the decimal place
In order to support negative precisions, the precision number should be stored using zig-zag encoding (see "ZigZag Encode" below).
type & 0x0F
).
(type & 0xF0) >> 4
).Size: 1 byte
The metadata byte of TWKB is mandatory, and encodes the following information:
Bits
Role
Purpose
1
Boolean
Is there a bounding box?
2
Boolean
Is there a size attribute?
3
Boolean
Is there an ID list?
4
Boolean
Is there extended precision information?
5
Boolean
Is this an empty geometry?
6-8
Boolean
Unused
Size: 1 byte
For some coordinate reference systems, and dimension combinations, it makes no sense to store all dimensions using the same precision.
For example, for data with X/Y/Z/T dimensions, using a geographic coordinate system, it might make sense to store the X/Y (longitude/latitude) dimensions with precision of 6 (about 10 cm), the Z with a precision of 1 (also 10 cm), and the T with a precision of 0 (whole seconds). A single precision number cannot handle this case, so the extended precision slot is optionally available for it.
The extended precision byte holds:
Bits
Role
Purpose
1
Boolean
Geometry has Z coordinates?
2
Boolean
Geometry has M coordinates?
3-5
Unsigned Integer
Precision for Z coordinates.
6-8
Unsigned Integer
Precision for M coordinates.
The extended precision values are always positive (only deal with digits to the left of the decimal point)
(extended & 0x01)
(extended & 0x02)
(extended & 0x1C) >> 2
(extended & 0xE0) >> 5
Size: 1 unsigned varint (so, variable size)
If the size attribute bit is set in the metadata header, a varInt with size information comes next. The value is the size in bytes of the remainder of the geometry after the size attribute.
When encountered in collections, an application can use the size attribute to advance the read pointer the the start of the next geometry. This can be used for a quick scan through a set of geometries, for reading just bounding boxes, or to distibute the read process to different threads.
Size: 2 signed varints per dimension (also variable size)
Each dimension of a bounding box is represented by a pair of varints:
So, for example:
Size: N signed varints, one per sub-geometry
The TWKB collection types (multipoint, multilinestring, multipolygon, geometrycollection)
In the latter case, it makes sense to include a unique identifier for each sub-geometry that is being wrapped into the collection. The "idlist" attribute is an array of varint that has one varint for each sub-geometry in the collection.
Every object contains a point array, which is an array of coordinates, stored as varints. The final values in the point arrays are a result of four steps:
The result is a very compact representation of the coordinates.
All storage using varints, which are integers. However, before being encoded most coordinate are doubles. How are the double coordinates converted into integers? And how are the integers converted ino the final array of varints?
Each coordinate is multiplied by the geometry precision value (from the metadata header), and then rounded to the nearest integer value (round(doublecoord * 10^precision)
). When converting from TWKB back to double precision, the reverse process is applied.
Rather than storing the absolute position of every vertex, we store the difference between each coordinate and the previous coordinate of that dimension. So the coordinate values of a point array are:
x1, y1
(x2 - x1), (y2 - y1)
(x3 - x2), (y3 - y2)
(x4 - x3), (y4 - y3)
...
(xn - xn-1), (yn - yn-1)
Delta values of integer coordinates are nice and small, and they store very compactly as varints.
Every coordinate value in a geometry except the very first one is a delta value relative to the previous value processed. Examples:
Basically, implementors should always keep the value of the previous coordinate in hand while processing a geometry to use to difference against the incoming coordinate. By setting the very first "previous coordinate" value to [0,0] it is possible to use the same differencing code throughout, since the first value processed will thus receive a "delta" value of itself.
The varint scheme for indicating a value larger than one byte involves setting the "high bit", which is the same thing that standard integers use for storing negative numbers.
So negative numbers need to first be encoded using a "zig zag" encoding, which keep absolute values small, but do not set the high bit as part of encoding negative numbers.
The effect is to convert small negative numbers into small positive numbers, which is exactly what we want, since delta values will tend to be small negative and positive numbers:
-1 => 1
1 => 2
-2 => 3
2 => 4
A computationally fast formula to generate the zig-zag value is
/* for 32-bit signed integer n */
unsigned int zz = (n << 1) ^ (n >> 31)
/* for 64-bit signed integer n */
unsigned long zz = (n << 1) ^ (n >> 63)
Variable length integers are a clever way of only using as many bytes as necessary to store a value. The method is described here:
https://developers.google.com/protocol-buffers/docs/encoding#varints
The varint scheme sets the high bit of a byte as a flag to indicate when more bytes are needed to fully represent a number. Decoding a varint accumulates the information in each flagged byte until an unflagged byte is found and the integer is complete and ready to return.
Because points only have one coordinate, the coordinates will be the true values (not relative) except in cases where the point is embedded in a collection.
Bounding boxes are permitted on points, but discouraged since they just duplicate the values already available in the point coordinate. Similarly, unless the point is part of a collection (where random access is a possible use case), the size should also be omitted.
type_and_prec byte
metadata_header byte
[extended_dims] byte
[size] uvarint
[bounds] bbox
pointarray varint[]
A linestring has, in addition to the standard metadata:
The layout is:
type_and_prec byte
metadata_header byte
[extended_dims] byte
[size] uvarint
[bounds] bbox
npoints uvarint
pointarray varint[]
A polygon has, in addition to the standard metadata:
The layout is:
type_and_prec byte
metadata_header byte
[extended_dims] byte
[size] uvarint
[bounds] bbox
nrings uvarint
npoints[0] uvarint
pointarray[0] varint[]
...
npoints[n] uvarint
pointarray[n] varint[]
A multipoint has, in addition to the standard metadata:
The layout is:
type_and_prec byte
metadata_header byte
[extended_dims] byte
[size] uvarint
[bounds] bbox
npoints uvarint
[idlist] varint[]
pointarray varint[]
A multilinestring has, in addition to the standard metadata:
The layout is:
type_and_prec byte
metadata_header byte
[extended_dims] byte
[size] uvarint
[bounds] bbox
nlinestrings uvarint
[idlist] varint[]
npoints[0] uvarint
pointarray[0] varint[]
...
npoints[n] uvarint
pointarray[n] varint[]
A multipolygon has, in addition to the standard metadata:
The layout is:
type_and_prec byte
metadata_header byte
[extended_dims] byte
[size] uvarint
[bounds] bbox
npolygons uvarint
[idlist] varint[]
nrings[0] uvarint
npoints[0][0] uvarint
pointarray[0][0] varint[]
...
nrings[n] uvarint
npoints[n][m] uvarint
pointarray[n][m] varint[]
A geometrycollection has, in addition to the standard metadata:
The layout is:
type_and_prec byte
metadata_header byte
[extended_dims] byte
[size] varint
[bounds] bbox
ngeometries varint
[idlist] varint[]
geom twkb[]
标签:array abs tran raw fas round sch example pre
原文地址:https://www.cnblogs.com/gispathfinder/p/13554506.html