Compressing Strings and byte[] at Runtime in Unity with LZF
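Recently I have been working on a lockstep (frame synchronization) demo that relays network messages. The earlier LAN version of the demo sent messages wrapped in a second layer of String + Protobuf. On the public internet, however, both bandwidth and traffic cost money, so the messages need to be compressed.
LZMA, GZip, and LZF
Pros and cons of the three compression algorithms:
- LZMA: the default algorithm of 7z. It compresses strongly but takes a long time;
- GZip: compresses less than LZMA but is faster. It is commonly used between web servers and browsers;
- LZF: the compression algorithm bundled with Redis. It prioritizes short compression and decompression times, so naturally it compresses the least of the three.
Google's Snappy is another library that likewise prioritizes speed; it is not covered here.
Why LZF
Compression and decompression are used mainly when sending and receiving lockstep messages, so they effectively run in real time: every second there are 22 compressions and 22 decompressions. Running in real time means CPU time and GC allocations must not be too high, and memory usage must stay low as well.
Actually, my first choice was LZMA. The message strings are not very long, so in principle the CPU time should not have been too high.
I first found an LZMA library written by a foreign developer and called compress and decompress from Update as a test. Speed was acceptable, but memory leaked at about 50 MiB per second, and before long my PC's memory was exhausted. That clearly would not do.
Next I found an LZMA C library wrapped by a Chinese developer and ran the same test. The memory leak was gone and GC allocation was only about 1.2 KiB per frame, which is perfectly acceptable. However, the call cost 10 ms to 20 ms of CPU time per frame. I traced it down to the C library's compress and decompress entry points; going any further would have meant reading the C source, so I gave up. CPU cost that high is unacceptable: if compression and decompression alone take that long, the rest of the logic cannot run at all.
Then I came across the LZF algorithm, which bills itself as a high-performance compression and decompression algorithm, so I brought in a pure C# implementation of it and tested it. It fully met my requirements (on Android and PC).
Utility script
The script's original author posted it on the international Unity forum (not the Unity CN Q&A forum), and it has also been included in the unity-ui-extensions repository (an unofficial but well-known collection of Unity tools).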
//
// http://forum.unity3d.com/threads/lzf-compression-and-decompression-for-unity.152579/
//
/*
* Improved version to C# LibLZF Port:
* Copyright (c) 2010 Roman Atachiants <kelindar@gmail.com>
*
* Original CLZF Port:
* Copyright (c) 2005 Oren J. Maurice <oymaurice@hazorea.org.il>
*
* Original LibLZF Library Algorithm:
* Copyright (c) 2000-2008 Marc Alexander Lehmann <schmorp@schmorp.de>
*
* Redistribution and use in source and binary forms, with or without modifica-
* tion, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* 3. The name of the author may not be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED
* WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MER-
* CHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
* EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPE-
* CIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
* PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
* OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
* WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTH-
* ERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
* OF THE POSSIBILITY OF SUCH DAMAGE.
*
* Alternatively, the contents of this file may be used under the terms of
* the GNU General Public License version 2 (the "GPL"), in which case the
* provisions of the GPL are applicable instead of the above. If you wish to
* allow the use of your version of this file only under the terms of the
* GPL and not to allow others to use your version of this file under the
* BSD license, indicate your decision by deleting the provisions above and
* replace them with the notice and other provisions required by the GPL. If
* you do not delete the provisions above, a recipient may use your version
* of this file under either the BSD or the GPL.
*/
using System;
namespace UnityEngine.UI.Extensions
{
/// <summary>
/// Improved C# LZF Compressor, a very small data compression library. The compression algorithm is extremely fast.
/// Note for strings, ensure you only use Unicode else special characters may get corrupted.
/// </summary>
public static class CLZF2
{
private static readonly uint HLOG = 14;
private static readonly uint HSIZE = (1 << 14);
private static readonly uint MAX_LIT = (1 << 5);
private static readonly uint MAX_OFF = (1 << 13);
private static readonly uint MAX_REF = ((1 << 8) + (1 << 3));
/// <summary>
/// Hashtable, that can be allocated only once
/// </summary>
private static readonly long[] HashTable = new long[HSIZE];
// Compresses inputBytes
public static byte[] Compress(byte[] inputBytes)
{
// Starting guess, increase it later if needed
int outputByteCountGuess = inputBytes.Length * 2;
byte[] tempBuffer = new byte[outputByteCountGuess];
int byteCount = lzf_compress(inputBytes, ref tempBuffer);
// If byteCount is 0, then increase buffer and try again
while (byteCount == 0)
{
outputByteCountGuess *= 2;
tempBuffer = new byte[outputByteCountGuess];
byteCount = lzf_compress(inputBytes, ref tempBuffer);
}
byte[] outputBytes = new byte[byteCount];
Buffer.BlockCopy(tempBuffer, 0, outputBytes, 0, byteCount);
return outputBytes;
}
// Decompresses inputBytes
public static byte[] Decompress(byte[] inputBytes)
{
// Starting guess, increase it later if needed
int outputByteCountGuess = inputBytes.Length * 2;
byte[] tempBuffer = new byte[outputByteCountGuess];
int byteCount = lzf_decompress(inputBytes, ref tempBuffer);
// If byteCount is 0, then increase buffer and try again
while (byteCount == 0)
{
outputByteCountGuess *= 2;
tempBuffer = new byte[outputByteCountGuess];
byteCount = lzf_decompress(inputBytes, ref tempBuffer);
}
byte[] outputBytes = new byte[byteCount];
Buffer.BlockCopy(tempBuffer, 0, outputBytes, 0, byteCount);
return outputBytes;
}
/// <summary>
/// Compresses the data using LibLZF algorithm
/// </summary>
/// <param name="input">Reference to the data to compress</param>
/// <param name="output">Reference to a buffer which will contain the compressed data</param>
/// <returns>The size of the compressed archive in the output buffer</returns>
public static int lzf_compress(byte[] input, ref byte[] output)
{
int inputLength = input.Length;
int outputLength = output.Length;
Array.Clear(HashTable, 0, (int)HSIZE);
long hslot;
uint iidx = 0;
uint oidx = 0;
long reference;
uint hval = (uint)(((input[iidx]) << 8) | input[iidx + 1]); // FRST(in_data, iidx);
long off;
int lit = 0;
for (;;)
{
if (iidx < inputLength - 2)
{
hval = (hval << 8) | input[iidx + 2];
hslot = ((hval ^ (hval << 5)) >> (int)(((3 * 8 - HLOG)) - hval * 5) & (HSIZE - 1));
reference = HashTable[hslot];
HashTable[hslot] = (long)iidx;
if ((off = iidx - reference - 1) < MAX_OFF
&& iidx + 4 < inputLength
&& reference > 0
&& input[reference + 0] == input[iidx + 0]
&& input[reference + 1] == input[iidx + 1]
&& input[reference + 2] == input[iidx + 2]
)
{
/* match found at *reference++ */
uint len = 2;
uint maxlen = (uint)inputLength - iidx - len;
maxlen = maxlen > MAX_REF ? MAX_REF : maxlen;
if (oidx + lit + 1 + 3 >= outputLength)
return 0;
do
len++;
while (len < maxlen && input[reference + len] == input[iidx + len]);
if (lit != 0)
{
output[oidx++] = (byte)(lit - 1);
lit = -lit;
do
output[oidx++] = input[iidx + lit];
while ((++lit) != 0);
}
len -= 2;
iidx++;
if (len < 7)
{
output[oidx++] = (byte)((off >> 8) + (len << 5));
}
else
{
output[oidx++] = (byte)((off >> 8) + (7 << 5));
output[oidx++] = (byte)(len - 7);
}
output[oidx++] = (byte)off;
iidx += len - 1;
hval = (uint)(((input[iidx]) << 8) | input[iidx + 1]);
hval = (hval << 8) | input[iidx + 2];
HashTable[((hval ^ (hval << 5)) >> (int)(((3 * 8 - HLOG)) - hval * 5) & (HSIZE - 1))] = iidx;
iidx++;
hval = (hval << 8) | input[iidx + 2];
HashTable[((hval ^ (hval << 5)) >> (int)(((3 * 8 - HLOG)) - hval * 5) & (HSIZE - 1))] = iidx;
iidx++;
continue;
}
}
else if (iidx == inputLength)
break;
/* one more literal byte we must copy */
lit++;
iidx++;
if (lit == MAX_LIT)
{
if (oidx + 1 + MAX_LIT >= outputLength)
return 0;
output[oidx++] = (byte)(MAX_LIT - 1);
lit = -lit;
do
output[oidx++] = input[iidx + lit];
while ((++lit) != 0);
}
}
if (lit != 0)
{
if (oidx + lit + 1 >= outputLength)
return 0;
output[oidx++] = (byte)(lit - 1);
lit = -lit;
do
output[oidx++] = input[iidx + lit];
while ((++lit) != 0);
}
return (int)oidx;
}
/// <summary>
/// Decompresses the data using LibLZF algorithm
/// </summary>
/// <param name="input">Reference to the data to decompress</param>
/// <param name="output">Reference to a buffer which will contain the decompressed data</param>
/// <returns>Returns decompressed size</returns>
public static int lzf_decompress(byte[] input, ref byte[] output)
{
int inputLength = input.Length;
int outputLength = output.Length;
uint iidx = 0;
uint oidx = 0;
do
{
uint ctrl = input[iidx++];
if (ctrl < (1 << 5)) /* literal run */
{
ctrl++;
if (oidx + ctrl > outputLength)
{
//SET_ERRNO (E2BIG);
return 0;
}
do
output[oidx++] = input[iidx++];
while ((--ctrl) != 0);
}
else /* back reference */
{
uint len = ctrl >> 5;
int reference = (int)(oidx - ((ctrl & 0x1f) << 8) - 1);
if (len == 7)
len += input[iidx++];
reference -= input[iidx++];
if (oidx + len + 2 > outputLength)
{
//SET_ERRNO (E2BIG);
return 0;
}
if (reference < 0)
{
//SET_ERRNO (EINVAL);
return 0;
}
output[oidx++] = output[reference++];
output[oidx++] = output[reference++];
do
output[oidx++] = output[reference++];
while ((--len) != 0);
}
}
while (iidx < inputLength);
return (int)oidx;
}
}
}
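Usage
The static utility class above exposes two methods, Compress and Decompress, for compression and decompression respectively.
1. Compressing or decompressing byte[]
Call Compress and Decompress directly.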
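A minimal round-trip sketch, assuming the CLZF2 script above is compiled into the project under its UnityEngine.UI.Extensions namespace. The class name LzfBytesExample and the length guard are my own additions for illustration. The guard exists because lzf_compress reads input[0] and input[1] before checking the length, so a payload shorter than 2 bytes would throw.

```csharp
using System;
using UnityEngine.UI.Extensions; // namespace of the CLZF2 script above

// Hypothetical helper for illustration; not part of the original script.
public static class LzfBytesExample
{
    // Compresses and decompresses a payload, then reports whether
    // the bytes survived the round trip unchanged.
    public static bool RoundTrip(byte[] payload)
    {
        // Guard added for this sketch: lzf_compress indexes input[0] and
        // input[1] unconditionally, so shorter payloads are rejected here.
        if (payload == null || payload.Length < 2)
            throw new ArgumentException("payload must contain at least 2 bytes", "payload");

        byte[] packed = CLZF2.Compress(payload);
        byte[] unpacked = CLZF2.Decompress(packed);

        if (unpacked.Length != payload.Length)
            return false;

        for (int i = 0; i < payload.Length; i++)
        {
            if (unpacked[i] != payload[i])
                return false;
        }
        return true;
    }
}
```

Note that Compress clears and reuses a single static hash table, so calls coming from several threads would need to be serialized by the caller.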
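2. Compressing or decompressing strings
To compress: first convert the string to a byte[] with System.Text.Encoding.UTF8.GetBytes, then compress it with the Compress method above.
To decompress: first decompress the byte[] with the Decompress method above, then recover the string with System.Text.Encoding.UTF8.GetString.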
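The two string steps can be wrapped in a small helper. This is only a sketch: the class name LzfStringCodec is hypothetical, and it assumes the CLZF2 script above is present in the project.

```csharp
using System.Text;
using UnityEngine.UI.Extensions; // namespace of the CLZF2 script above

// Hypothetical wrapper for illustration; not part of the original script.
public static class LzfStringCodec
{
    // string -> UTF-8 bytes -> LZF-compressed bytes
    // Note: text that encodes to fewer than 2 bytes would make
    // CLZF2.Compress throw, because it reads the first two input bytes.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        return CLZF2.Compress(raw);
    }

    // LZF-compressed bytes -> UTF-8 bytes -> string
    public static string Decompress(byte[] packed)
    {
        byte[] raw = CLZF2.Decompress(packed);
        return Encoding.UTF8.GetString(raw);
    }
}
```

The same encoding has to be used on both sides; mixing, say, UTF-8 on the sender with another encoding on the receiver would corrupt non-ASCII characters.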
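Performance and compression ratio
In my tests, compressing and decompressing a byte[] of length 400 almost always took less than 1 ms, and the GC allocation it caused was negligible, so it does not become a performance bottleneck.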
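A timing like this can be reproduced with a small Stopwatch harness. The sketch below is not the measurement code I used; CodecTimer is a hypothetical name, and it measures only average wall-clock time, not GC allocation.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper for illustration.
public static class CodecTimer
{
    // Returns the average wall-clock milliseconds per call of `action`.
    public static double AverageMilliseconds(Action action, int iterations)
    {
        if (action == null)
            throw new ArgumentNullException("action");
        if (iterations <= 0)
            throw new ArgumentOutOfRangeException("iterations");

        // One untimed warm-up call so one-off costs (such as JIT
        // compilation) are not counted in the average.
        action();

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();

        return watch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

With a sample message in a variable named payload (an assumed name), a call such as `CodecTimer.AverageMilliseconds(() => CLZF2.Decompress(CLZF2.Compress(payload)), 1000)` would give the average cost of one compress-plus-decompress pair on the current device.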
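Given how fast it runs, the compression ratio is also acceptable: with the current message fields, a 400-byte byte[] typically compresses to about 260 bytes.
Of course, the ratio depends on the actual content. If many fields repeat, the data compresses even better; if few do, it compresses worse.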
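To put those numbers in bandwidth terms, here is a rough calculation. It assumes that each of the 22 messages per second is about 400 bytes before compression and 260 bytes after, and it counts one direction for one client only. BandwidthEstimate is a hypothetical name.

```csharp
// Hypothetical helper for illustration; plain integer arithmetic only.
public static class BandwidthEstimate
{
    // Bytes sent per second for a given message size and message rate.
    public static int BytesPerSecond(int bytesPerMessage, int messagesPerSecond)
    {
        return bytesPerMessage * messagesPerSecond;
    }

    // Bytes saved per second by shrinking each message from rawSize to packedSize.
    public static int SavedBytesPerSecond(int rawSize, int packedSize, int messagesPerSecond)
    {
        return BytesPerSecond(rawSize, messagesPerSecond)
             - BytesPerSecond(packedSize, messagesPerSecond);
    }
}
```

Under those assumptions, `BytesPerSecond(400, 22)` is 8800 and `BytesPerSecond(260, 22)` is 5720, so `SavedBytesPerSecond(400, 260, 22)` is 3080 bytes per second, a reduction of 35%.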