By David Chait, Tegra Developer Technology
Since the dawn of the GPU, developers have been trying to cram bigger and better textures into memory. Sometimes that is accomplished with more RAM but more often it is achieved with native support for compressed texture formats. The objective of texture compression is to reduce data size, while minimizing impact on visual quality.
With the flood of mobile devices came even more urgency to use texture compression. Memory was typically shared with the CPU and thus even more of a scarce resource. In addition, mobile data networks were slow, and storage in the device was small, so smaller data over the wire was important.
Also important to the mobile space is power use. For textures, there are many places where smaller data saves power: reading from storage, any unpack/unzip or transcode step, writing data to memory, and reading that data during rendering.
This guide aims to provide developers with the following:
- An introduction to the modern texture formats available on the newest chips (specifically focusing on ASTC).
- A quick-start guide for applying these new formats to existing assets.
- A list of some texture compression tools.
- Advice as to how developers can write their own internal "design guide".
The overall goal is to assist developers in achieving their desired balance between image quality and texture size. In addition, a more detailed walkthrough of some real-world compression testing results appears at the end.
For developers who want to "get right to it", consider the general guidance below to try first when targeting the latest GPU hardware. This jumps ahead of the discussion in the following chapters, and describes one possible approach to immediately leverage ASTC for your base of assets.
Figure 1: Example of ASTC block compression
Take your textures, and separate them into groups of similar content based on the content categories in the next section. For each category, choose whether you want:
- High image quality with low-but-good compression ("Higher IQ");
- Balanced medium compression with medium image quality, allowing for some minor artifacts ("Medium IQ"); or
- Higher compression with lower image quality, often with loss of fine details and some blurring/blotching ("Lower IQ").
Note: although many of the tools use the term "best compression" we avoid it as it is easily overloaded. The best compression might mean the highest compression levels and lowest bit rates, but with too many artifacts. That said, you may find content which "tolerates" highest compression well, or decide that smaller size is more important than the resulting artifacts. Experiment with increased compression when time permits.
This is only a "quick start guide". You must evaluate your particular needs, desires, and content. It is a starting point from which you can decide to further increase or decrease compression levels to hit your desired balance of size and quality.
One last note: ASTC compression takes time for best results. DXT tools might run 2-10x faster. On the plus side, ASTC tools are 4-20x faster than those of its closest competitor, ETC2, for production output. ASTC tools can also run in faster modes, generating lower quality results in less time, when you are testing and don't yet need or want production quality.
In each case below, the "Higher" quality setting should produce files no larger than DXT, but with significantly better image quality. This is especially relevant if you have previously avoided using DXT on specific assets; you might now try ASTC formats instead of leaving those assets uncompressed.
The following suggestions are based on using the ARM "astcenc.exe" command line tool in "thorough" mode to ASTC-compress a small set of test assets at different bit rates.
| IQ | Higher | Medium | Lower |
|---|---|---|---|
| Codec | ASTC 6x6 | ASTC 8x5 | ASTC 8x6 |
| bpp | 3.56 | 3.2 | 2.67 |
| Size vs Raw | 14.8% | 13.3% | 11.1% |
| Size vs DXT1 | 89% | 80% | 67% |
| IQ | Higher | Medium | Lower |
|---|---|---|---|
| Codec | ASTC 4x4 | ASTC 5x5 | ASTC 8x5 |
| bpp | 8 | 5.12 | 3.2 |
| Size vs Raw | 50% | 32% | 20% |
| Size vs DXT5 | 100% | 64% | 40% |
| IQ | Higher | Medium | Lower |
|---|---|---|---|
| Codec | ASTC 4x4 | ASTC 5x4 | ASTC 5x5 |
| bpp | 8 | 6.40 | 5.12 |
| Size vs Raw | 50% | 40% | 32% |
| Size vs DXT5 | 100% | 80% | 64% |
| IQ | Higher | Medium | Lower |
|---|---|---|---|
| Codec | ASTC 6x6 | ASTC 8x5 | ASTC 10x5 |
| bpp | 3.56 | 3.2 | 2.56 |
| Size vs Raw | 44.5% | 40% | 32% |
| Size vs DXT1 | 89% | 80% | 64% |
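The percentages in these tables follow directly from the bit rates. As a minimal sketch of that arithmetic, the snippet below assumes an uncompressed RGB source at 24 bpp and DXT1 at 4 bpp; the function name `size_ratios` is illustrative and not part of any tool.

```python
def size_ratios(codec_bpp, raw_bpp, dxt_bpp):
    """Return (size vs raw, size vs DXT) as percentages, rounded to one decimal."""
    vs_raw = round(100.0 * codec_bpp / raw_bpp, 1)
    vs_dxt = round(100.0 * codec_bpp / dxt_bpp, 1)
    return vs_raw, vs_dxt

# ASTC 6x6 and 8x5 against assumed baselines: 24 bpp raw RGB, 4 bpp DXT1.
print(size_ratios(3.56, 24, 4))  # (14.8, 89.0)
print(size_ratios(3.2, 24, 4))   # (13.3, 80.0)
```

Swapping in a different raw bit rate (for example 8 bpp for a single-channel source) gives the corresponding figures for other asset types.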
In the modern desktop computing age, there has classically been one answer to texture compression: DXT. It is also referred to as S3TC due to its origin, or BCn for certain DirectX implementations. At its most basic, it is a fixed 4x4 block format that uses 4bpp to encode each RGB block. To store alpha information, there is either 1-bit alpha (using 1bpp of the encoding space), or a second 4bpp alpha-only block for high quality alpha encoding. There have been later variants of this approach that store LA or RG data in two separate 4bpp blocks, for better quality.
In the mobile space, NVIDIA is one of the few vendors enabling easy porting of desktop content to devices, with its support for DXT in the Tegra processors. Other mobile chip vendors created other compression formats.
Imagination develops the PowerVR GPU and created a proprietary PVRTC texture compression format. It is well known for its use on iOS devices for texture compression. It offers both 4 and 2bpp options for greater reduction in size. More recently Imagination has released the PVRTC2 formats, which offer better compression quality in the same footprint.
To avoid the myriad vendor-specific codecs, Khronos defined vendor-neutral codecs. With the introduction of OpenGL ES 2.0, the ETC compression format became an available multi-vendor extension, offering DXT-like compression at better quality levels. However, it had issues that hampered its widespread adoption: it was not a required piece of ES 2.0, there was no alpha support, and it had only one mode, 4bpp RGB.
With the release of OpenGL ES 3.0 in 2012 (and full OpenGL 4.3), the ETC2 format became a standard, with backwards compatibility and important new features. First, it added full alpha support like DXT3/5 (in 8bpp), as well as 1-bit "punch-through" alpha (in 4bpp). Second, it brought the EAC format, which supports 1 and 2-channel data (R and RG, in 4 and 8bpp respectively). Third, it added sRGB data support. And last but not least, it claimed to offer better quality than competitors at the same bitrate. The biggest complaint about ETC2 is its extremely slow compression tools.
In parallel to Khronos defining OpenGL ES 3.0, there was an effort to develop an industry-leading compression format that provided developers with finer grained control. This resulted in the mid-2012 launch of the ASTC texture compression format. The key to ASTC is that while it uses a fixed 128 bits-per-block, each texture can have a different size block fit in those 128 bits, unlike the fixed 4x4 block of prior formats. Leveraging a large variety of square and non-square block sizes, ASTC delivers a wide range of derived compression ratios, scaling from 8bpp down to just under 1bpp, as follows:
| Block Size | Bits Per Pixel |
|---|---|
| 4x4 | 8.00 |
| 5x4 | 6.40 |
| 5x5 | 5.12 |
| 6x5 | 4.27 |
| 6x6 | 3.56 |
| 8x5 | 3.20 |
| 8x6 | 2.67 |
| 10x5 | 2.56 |
| 10x6 | 2.13 |
| 8x8 | 2.00 |
| 10x8 | 1.60 |
| 10x10 | 1.28 |
| 12x10 | 1.07 |
| 12x12 | 0.89 |
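Each bit rate in the table is simply the fixed 128 bits per block divided by the number of pixels in the block footprint. A short sketch of that calculation (the helper name `astc_bpp` is made up for illustration):

```python
ASTC_BLOCK_BITS = 128  # every ASTC block occupies 128 bits, whatever its footprint

def astc_bpp(block_w, block_h):
    """Bits per pixel for an ASTC block footprint of block_w x block_h pixels."""
    return ASTC_BLOCK_BITS / (block_w * block_h)

for w, h in [(4, 4), (6, 6), (8, 8), (12, 12)]:
    print(f"{w}x{h}: {astc_bpp(w, h):.2f} bpp")
# 4x4: 8.00 bpp
# 6x6: 3.56 bpp
# 8x8: 2.00 bpp
# 12x12: 0.89 bpp
```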
Thus, ASTC offers a huge advantage in tuning quality vs size. The alternative formats generally offer 4bpp for RGB or single channel data; some can fit alpha in that footprint, and some offer higher quality alpha or a second single channel data block in a second 4bpp section. PVRTC is the only other format to offer a smaller variant, with its 2bpp mode.
In addition, ASTC offers support for 1-4 channels, including full alpha RGBA, normal RGB, 2-channel RG (LA), and 1-channel R (L/A) support, and custom X+Y and XY+Z normal map support. The net result is that ASTC handles most types of texture.
One other key advantage of ASTC is that the method of encoding endpoints, weighting, etc. is selected per-block rather than globally, so the encoder can adapt on the fly to allocate the 128 bits to better represent the contents in each block. This delivers better image quality than previous formats, even at higher compression.
Hardware supporting ASTC has achieved sufficient market share that developers should seriously consider how to leverage it in their titles: to improve quality, decrease storage size, or both. This is especially true for titles that require a level of graphics hardware high enough that ASTC support is a given.
For your product, you need to decide overall, as well as per asset, whether quality (vs the raw asset) or shrinking the file size is most important. More than likely, you will pick something in between. You should have at least a high-level design goal in mind for your project before starting to look at individual textures.
With so many texture compression standards and tools, finding the best match for each texture can seem daunting for existing large-scale projects.
- Start with what you are willing and able to use.
- Look at the potential for cross-platform sharing, versus different assets per platform.
- Batch compress a small set of test assets (maybe one or two dozen files) and look for where quality starts to drop below the required bar for the product. Try to find a "baseline" for each class of asset, so you can take a "quick pass" at the entire project.
- Do a high-quality compression of all assets overnight (or over days). Then, only do incremental high-quality recompression as assets change (maybe only once per day, using "fast" mode otherwise unless you need production quality assets in-engine).
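A batch test pass like the one described above can be scripted. The sketch below only builds the command lines and does not run them. It assumes the command syntax of recent open-source `astcenc` releases (`-cl` for LDR linear color, then input, output, block size, and a quality preset such as `-fast` or `-thorough`); older builds of the tool use different options, so check the help output of your version. The file names and the `astc_out` folder are hypothetical.

```python
from pathlib import Path

def build_astcenc_commands(files, block_size, preset="-thorough", out_dir="astc_out"):
    """Build one astcenc command line per input file (nothing is executed here)."""
    commands = []
    for f in files:
        out_path = Path(out_dir) / (Path(f).stem + ".astc")
        # Assumed syntax: astcenc -cl <input> <output> <block size> <preset>
        commands.append(["astcenc", "-cl", str(f), str(out_path), block_size, preset])
    return commands

# Hypothetical test assets, tried at three block sizes in the quicker "fast" mode.
test_files = ["gun.png", "diffuse.png"]
for block in ("6x6", "8x5", "8x6"):
    for cmd in build_astcenc_commands(test_files, block, preset="-fast"):
        print(" ".join(cmd))
```

Each command list could be passed to `subprocess.run(cmd, check=True)` once the output folder exists; switching the preset to `-thorough` gives the slower overnight pass.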
Historically, developers would either target larger desktop screens or smaller mobile screens. Today, you must think carefully about your market. While many phones have ever-higher resolution screens, the visual impact of extremely high resolution textures on them may be small. However, with the addition of HDMI output, you could be running on a 50" 1080p television, where pixels are large once again.
Developers porting to mobile devices may immediately look to cut down texture resolution first as a space saving method. But before you shrink the resolution of your textures, consider modern device screens. Then consider whether you want to use higher compression on larger textures where artifacts might be noticed less, or go with lower resolution and lower compression with fewer visual issues: halving the resolution in each dimension means 1/4 the data, but could lose key detail. In striking a balance, you might push for lower bit rates and be more flexible with what artifacts you are willing to accept.
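To make the resolution-versus-bit-rate tradeoff concrete, the sketch below estimates the size of a single ASTC mip level from its dimensions and block footprint (16 bytes per block, with partial blocks rounded up). The texture sizes are hypothetical examples and the helper name is illustrative.

```python
import math

def astc_size_bytes(width, height, block_w, block_h):
    """Size of one ASTC mip level: 16 bytes per block, partial blocks rounded up."""
    blocks_x = math.ceil(width / block_w)
    blocks_y = math.ceil(height / block_h)
    return blocks_x * blocks_y * 16

full_res = astc_size_bytes(2048, 2048, 8, 8)  # full resolution at 2 bpp
half_res = astc_size_bytes(1024, 1024, 4, 4)  # half resolution at 8 bpp
print(full_res, half_res)  # 1048576 1048576
```

In this case both options cost exactly 1 MB, so the choice comes down to which artifacts are more acceptable: stronger compression at full resolution, or lost detail at half resolution.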
File size is critical with modern content that can be a gigabyte or more in size. Thus the on-the-wire, on-disk and in RAM sizes are all key. How can we deal with these topics?
One alternate method that has been used in the industry is to compress assets with something like JPEG, tuning for a particular file-size reduction and trying to not trade off much quality. Assets are sent over the wire in that form, and transcoded to the optimal GPU format for the device either at first run or on-demand. That is a reasonable approach on desktop platforms, but on mobile devices the transcode can be more costly in time and battery. Plus there are additional, possibly undesirable, quality issues both starting with JPEG and resulting from a speed-tuned "fast encoder".
With the coming of ASTC, developers now have a wider range of compression bit rates, and should evaluate dropping ASTC bit rate (and quality) on assets before considering adding significant extra complexity with secondary encoding tools and a loading/transcoding process.
A second approach is to experiment with different content packaging of post-codec content. Some presentations have noted better-than-JPEG sizes achieved using ZIP or LZW compression as a second pass. Decompression on the client is still required but it should be an order of magnitude or two faster than any transcoding process.
In quick testing, we used a tiny set of example ASTC textures (the same used in the next chapter) and compressed them with standard ZIP compression. The result was compelling: just over 20MB of total texture data was reduced to a 9MB zip file.
Assuming you pick texture compression that fits into memory limits well enough, layering standard compression on top can make a difference in both on-the-wire transmission bandwidth as well as on-disk storage. So this is a new wrinkle to factor in: if you can push compression enough to fit your RAM footprint, you may be able to use LZW or ZIP as a final step to achieve file size goals.
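A quick way to check whether a second-pass compressor is worthwhile for your own data is to measure it. This is a minimal sketch using Python's standard `zlib` module (DEFLATE, the algorithm commonly used inside ZIP files); the sample bytes are synthetic stand-ins, not real texture data, and the function name is illustrative.

```python
import zlib

def deflate_ratio(data: bytes, level: int = 9) -> float:
    """Return compressed size divided by original size for a DEFLATE pass."""
    if not data:
        raise ValueError("empty input")
    return len(zlib.compress(data, level)) / len(data)

# Synthetic stand-in: one 16-byte "block" repeated many times. Real ASTC
# payloads are far less regular and will compress less well than this.
sample = bytes(range(16)) * 4096
print(f"compressed to {deflate_ratio(sample):.1%} of original size")
```

For real files, the bytes can be read with `pathlib.Path(name).read_bytes()` and the ratios compared across block sizes.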
To better discuss the effects of different codecs on content, we performed a batch compression run against a small set of textures and assigned a visual image quality (IQ) rating to each. We then considered which compression format seemed best for quality, gave a balance of quality and compression, or delivered a higher level of compression without resulting in extreme artifacts.
In this section we present tables of the acquired data along with some samples of the resulting textures using two of the tested codec variants. The presented textures show the same region under a high zoom to provide a direct visual comparison of the different compression methods.
The tables provide the details and evaluation of the batch runs on each texture for each of the tested compressors. The data includes: the image quality rating for each codec, some brief notes as to the rating (where comparisons are drawn to the original/raw image), and information on each codec in terms of bits per pixel and improved size if we were to compare to some base/assumed DXT format compression.
Figure 1: Compression of normals.png, a normal map, with DXT vs ASTC
| Texture File & Details | Codec | Codec bpp | Size vs Base | IQ | Visual examination notes |
|---|---|---|---|---|---|
| normals | ASTC 4x4 | 8 | 1.00 | 2 | minor artifacts, otherwise very good match |
| 2 channel RG | EAC_RG11 | 8 | 1.00 | 2 | minor acne, otherwise very good match |
| DXT base: 8bpp | ASTC 5x4 | 6.4 | 0.80 | 1 | hint of artifacts, otherwise reasonable |
| | BC3n | 8 | 1.00 | 0 | blotchy artifacts, some edge AA lost, some dithering, barely usable |
| | PVRTC2_4 | 4 | 0.50 | 0 | some blotching, 'acne', on the line of usable |
| | ASTC 5x5 | 5.12 | 0.64 | 0 | hint of block artifacts, barely on the usable line |
| | ASTC 8x5 | 3.2 | 0.40 | -1 | block artifacts, banding |
| | ASTC 6x6 | 3.56 | 0.45 | -2 | block artifacts, banding |
| | BC1n | 4 | 0.50 | -2 | bad block/banding artifacts, loss of AA along edges |
Figure 2: Compression of body.png, an ambient occlusion map, with DXT vs ASTC
| Texture File & Details | Codec | Codec bpp | Size vs Base | IQ | Visual examination notes |
|---|---|---|---|---|---|
| body | ASTC 6x6 | 3.56 | 0.89 | 2 | very minor visual shifts from original |
| 1 channel R | ASTC 8x5 | 3.2 | 0.80 | 1 | minor artifacts/acne |
| DXT base: 4bpp | ASTC 8x6 | 2.67 | 0.67 | 0.5 | minor artifacts/acne |
| | PVRTC2_4 | 4 | 1.00 | 0 | some acne, basic artifacts |
| | EAC_R11 | 4 | 1.00 | -0.5 | banding/blocking in gradients; encoding as 1ch may be improper |
| | ASTC 10x5 | 2.56 | 0.64 | -0.5 | on the line to -1, hints of block artifacts |
| | ASTC 8x8 | 2 | 0.50 | -1 | block artifacts more visible, some detail blurring |
| | BC1 | 4 | 1.00 | -2 | loss of some detail, some block artifacts |
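Ratings like these can feed a simple per-category rule in an internal design guide. As a hedged sketch (one possible rule, not a method prescribed by this guide), the snippet below picks the lowest bit rate codec that still meets a minimum IQ rating, using the body.png figures above; the function name is illustrative.

```python
# (codec, bits per pixel, IQ rating) copied from the body.png results above.
BODY_RESULTS = [
    ("ASTC 6x6", 3.56, 2), ("ASTC 8x5", 3.2, 1), ("ASTC 8x6", 2.67, 0.5),
    ("PVRTC2_4", 4, 0), ("EAC_R11", 4, -0.5), ("ASTC 10x5", 2.56, -0.5),
    ("ASTC 8x8", 2, -1), ("BC1", 4, -2),
]

def smallest_codec(results, min_iq):
    """Return the lowest-bpp entry whose IQ rating is at least min_iq, or None."""
    acceptable = [r for r in results if r[2] >= min_iq]
    return min(acceptable, key=lambda r: r[1]) if acceptable else None

print(smallest_codec(BODY_RESULTS, 1))  # ('ASTC 8x5', 3.2, 1)
print(smallest_codec(BODY_RESULTS, 0))  # ('ASTC 8x6', 2.67, 0.5)
```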
The rest of the results are below, without image annotations.
| Texture File & Details | Codec | Codec bpp | Size vs Base | IQ | Visual examination notes |
|---|---|---|---|---|---|
| gun | ASTC 6x6 | 3.56 | 0.89 | 2 | minor loss of dither details/blurring, minor color shifts |
| 3 channel RGB | ASTC 8x5 | 3.2 | 0.80 | 1 | more blurring, loss of dither details, starting blocking/blotching |
| DXT base: 4bpp | ASTC 8x6 | 2.67 | 0.67 | 0 | loss of fine details, deeper blurring and blotching |
| | ETC2_RGB | 4 | 1.00 | -1 | odd color shifts/desaturation, detail loss esp in AA regions |
| | PVRTC2_4 | 4 | 1.00 | -1 | intensity drops, detail loss esp in AA, blurring/blotching |
| | ASTC 10x5 | 2.56 | 0.64 | -2 | blurring, block artifacts, loss of detail, intensity drop, color shifts |
| | BC1 | 4 | 1.00 | -2 | block artifacts, significant loss of smooth aa/gradients |
| diffuse | ASTC 6x6 | 3.56 | 0.89 | 2 | pretty close match overall |
| 3 channel RGB | ASTC 8x5 | 3.2 | 0.80 | 1 | subtle blocking |
| DXT base: 4bpp | ETC2_RGB | 4 | 1.00 | 1 | subtle blocking |
| | PVRTC2_4 | 4 | 1.00 | 0.5 | minor blurring |
| | BC1 | 4 | 1.00 | 0.5 | minor blocking |
| | ASTC 8x6 | 2.67 | 0.67 | 0.5 | minor blurring |
| | ASTC 10x5 | 2.56 | 0.64 | 0 | minor blurring |
| | ASTC 8x8 | 2 | 0.50 | -1 | blurry, but survives |
| | PVRTC2_2 | 2 | 0.50 | -1 | blurry, but survives |
| decals | ASTC 4x4 | 8 | 1.00 | 2 | very few detail shifts |
| 4 channel ARGB | ETC2_RGBA | 8 | 1.00 | 2 | some minor detail shifts, but otherwise good |
| DXT base: 8bpp | ASTC 5x5 | 5.12 | 0.64 | 1 | minor loss of fine details and subtle shading/highlights |
| | ASTC 6x6 | 3.56 | 0.45 | 1 | loss of some fine highlight details, but otherwise reasonable |
| | ASTC 8x5 | 3.2 | 0.40 | 0 | more block artifacts evident and finer details lost |
| | ASTC 8x6 | 2.67 | 0.33 | -1 | many more artifacts in midrange alpha/color regions and edges |
| | BC3 | 8 | 1.00 | -1 | blurring/block artifacts on certain edges, loss of few fine details |
| | PVRTC2_4 | 4 | 0.50 | -2 | loss of details, some blurring, odd shading/haloing at alpha edges |
| leaves | ASTC 4x4 | 8 | 1.00 | 2 | pretty darn close match |
| 4 channel ARGB | ETC2_RGBA | 8 | 1.00 | 2 | pretty darn close match |
| DXT base: 8bpp | ASTC 5x5 | 5.12 | 0.64 | 1 | slight softening of details, slight expansion of alpha border |
| | BC3 | 8 | 1.00 | 1 | minor saturation shift, minor alpha edge shift |
| | ASTC 8x5 | 3.2 | 0.40 | 0 | softened details but keeps patterns |
| | ASTC 6x6 | 3.56 | 0.45 | -1 | really blurred details |
| | ASTC 8x6 | 2.67 | 0.33 | -1 | really blurred, block artifacts |
| | PVRTC2_4 | 4 | 0.50 | -2 | blurred with odd swirling |
| detailmask | ASTC 6x6 | 3.56 | 0.89 | 2 | very tiny artifacts, very close match |
| 1 channel R | EAC_R11 | 4 | 1.00 | 1 | small block artifacts, some faint banding in gradient areas |
| DXT base: 4bpp | PVRTC2_4 | 4 | 1.00 | 1 | some dithering, acne, otherwise reasonable |
| | ASTC 8x5 | 3.2 | 0.80 | 0 | minor block artifacts, some acne |
| | BC1 | 4 | 1.00 | -1 | lots of faint block artifacts throughout |
| | ASTC 8x6 | 2.67 | 0.67 | -2 | more visible acne, more block artifacts |
| | ASTC 8x8 | 2 | 0.50 | -2 | visible acne, subtle block artifacts |
The following is a brief selection of tools available on the internet to help developers deal with texture compression.
Location: /gpu-accelerated-texture-compression
One of the oldest tools in the compression market, it is no surprise it is still in heavy use today in many projects. The code is open source, and has CUDA compression paths for many formats. It is of course focused on DXTn/BCn format compression.
An interactive GUI for testing codecs and reviewing side-by-side before and after results, plus an error/diff view. With the ability to zoom and pan simultaneously across all three views, it makes it very easy to review the results of a single compression run. The actual codecs are command-line executables, so they can be used for batch processing without the GUI. It has extensive ASTC and ETC2 format support.
Location: http://community.imgtec.com/developers/powervr/tools/pvrtextool/
Imagination's texture compression pack includes a GUI, a separate command line tool, a linkable library, and plugins for major DCC apps (Maya, Max, Photoshop). Its primary focus is the PVRTC format, but it also supports ETC2.
Using ASTC Texture Compression for Game Assets
Original article: http://my.oschina.net/jjyuangu/blog/525946