facebook/zstd - zstd - Haisto: Git with a cup of tea

mirror of https://github.com/facebook/zstd.git synced 2024-10-23 08:44:28 +08:00

Author	SHA1	Message	Date
Yann Collet	b880f20d52	Merge pull request #4171 from facebook/lvl3_ratio+ Improve compression ratio of levels 3 & 4	2024-10-17 11:39:41 -07:00
Yann Collet	41d870fbbf	updated regression tests results	2024-10-17 11:06:26 -07:00
Yann Collet	ff8e98bebe	enable regression tests at pull request time was transferred from circleci, but was only triggered on push into dev, i.e. after pull request is merged.	2024-10-17 09:45:16 -07:00
Yann Collet	47d4f5662d	rewrite code in the manner suggested by @terrelln	2024-10-17 09:37:23 -07:00
Yann Collet	61d08b0e42	fix test a margin of 4 is insufficient to guarantee compression success.	2024-10-17 09:37:23 -07:00
Yann Collet	6326775166	slightly improved compression ratio at levels 3 & 4 The compression ratio benefits are small but consistent, i.e. always positive. On `silesia.tar` corpus, this modification saves ~75 KB at level 3. The measured speed cost is negligible, i.e. below noise level, between 0 and -1%.	2024-10-17 09:37:23 -07:00
Yann Collet	18a42190c2	Merge pull request #4170 from facebook/dict_cSpeed Improve dictionary compression speed	2024-10-16 17:36:49 -07:00
Yann Collet	730d2dce41	fix test	2024-10-15 18:44:40 -07:00
Yann Collet	c2abfc5ba4	minor improvement to level 3 dictionary compression ratio	2024-10-15 17:58:33 -07:00
Yann Collet	e63896eb58	small dictionary compression speed improvement not as good as small-blocks improvement, but generally positive.	2024-10-15 17:48:35 -07:00
Yann Collet	def3ee9548	Merge pull request #4167 from facebook/ci_m32test_faster attempt to make 32-bit tests faster	2024-10-12 01:57:55 -07:00
Yann Collet	e6740355e3	attempt parallel test running with -j	2024-10-11 18:01:28 -07:00
Yann Collet	6f2e29a234	measure if -O2 makes the test complete faster	2024-10-11 17:30:55 -07:00
Yann Collet	1024aa9252	attempt to make 32-bit tests faster this is the longest CI test, reaching ~40mn on last PR	2024-10-11 16:24:25 -07:00
Yann Collet	8c38bda935	Merge pull request #4165 from facebook/cspeed_cmov Improve compression speed on small blocks	2024-10-11 16:20:19 -07:00
Yann Collet	8e5823b65c	rename variable name findMatch -> matchFound since it's a test, as opposed to an active search operation. suggested by @terrelln	2024-10-11 15:38:12 -07:00
Yann Collet	83de00316c	fixed parameter ordering in `dfast` noticed by @terrelln	2024-10-11 15:36:15 -07:00
Yann Collet	7ba43091b8	Merge pull request #4164 from facebook/spec_043 spec update: huffman prefix code paragraph	2024-10-10 16:56:02 -07:00
Yann Collet	fa1fcb08ab	minor: better variable naming	2024-10-10 16:07:20 -07:00
Yann Collet	3e7c66acd1	added ascending order example	2024-10-09 01:06:24 -07:00
Yann Collet	d45aee43f4	make __asm__ a __GNUC__ specific	2024-10-08 16:38:35 -07:00
Yann Collet	741b860fc1	store dummy bytes within ZSTD_match4Found_cmov() feels more logical, better contained	2024-10-08 16:34:40 -07:00
Yann Collet	197c258a79	introduce memory barrier to force test order suggested by @terrelln	2024-10-08 15:54:48 -07:00
Yann Collet	186b132495	made search strategy switchable between cmov and branch and use a simple heuristic based on wlog to select between them. note: performance is not good on clang (yet)	2024-10-08 13:52:56 -07:00
Yann Collet	2cc600bab2	refactor search into an inline function for easier swapping with a parameter	2024-10-08 11:10:48 -07:00
Yann Collet	3b343dcfb1	refactor huffman prefix code paragraph	2024-10-07 17:15:07 -07:00
Yann Collet	1e7fa242f4	minor refactor zstd_fast make hot variables more local	2024-10-07 11:22:40 -07:00
Yann Collet	da23998e9a	Merge pull request #4160 from facebook/fix_nightly fix dependency for nightly github actions tests	2024-10-03 21:02:39 -07:00
Yann Collet	b84653fc83	fix dependency for nightly github actions tests	2024-10-03 15:10:16 -07:00
Yann Collet	b7e1eef048	Merge pull request #4159 from facebook/spec_refactor_fse specification update	2024-10-03 14:54:16 -07:00
Yann Collet	a8b86d024a	refactor documentation of the FSE decoding table build process	2024-10-02 23:09:06 -07:00
Yann Collet	75b0f5f4f5	Merge pull request #4153 from artem/fix-meson-includes meson: Do not export private headers in libzstd_dep to avoid name clash	2024-10-02 16:51:44 -07:00
Yann Collet	dda3cdfdec	Merge pull request #4156 from facebook/rm_circleci removing nightly tests built on circleci	2024-10-02 16:51:15 -07:00
Yann Collet	751bf1ffd8	Merge pull request #4157 from facebook/fix_result_c fix incorrect pointer manipulation	2024-10-02 16:50:45 -07:00
Yann Collet	dcc8fd0472	Merge pull request #4158 from facebook/benchzstd_fclose fix missing fclose()	2024-10-02 16:49:43 -07:00
Yann Collet	8edd147686	fix missing fclose() fix #4151	2024-10-01 09:52:45 -07:00
Yann Collet	de6cc98e07	fix incorrect pointer manipulation fix #4155	2024-10-01 09:25:26 -07:00
Yann Collet	3d5d3f5630	removing nightly tests built on circleci	2024-09-30 21:38:29 -07:00
Yann Collet	27bf1362fe	Merge pull request #4154 from dearblue/freebsd-14.1 Update FreeBSD VM image to 14.1	2024-09-30 11:54:32 -07:00
Artem Labazov	ccc02a9a77	meson: Fix contrib and tests build	2024-09-30 18:05:57 +03:00
Artem Labazov	d2d49a1161	meson: Do not export private headers in libzstd_dep to avoid name clash This way libzstd_dep does not override, for instance, <xxhash.h>	2024-09-30 17:03:42 +03:00
dearblue	a3b5c4521c	Update FreeBSD VM image to 14.1 FreeBSD 14.0 will reach the end of life on 2024-09-30. The updated 14.1 is scheduled to end-of-life on 2025-03-31. ref. https://www.freebsd.org/releases/14.2R/schedule/	2024-09-30 22:45:17 +09:00
Yann Collet	984d11a4d1	Merge pull request #4146 from facebook/dictBench_Doc update documentation: specify that Dictionary can be used for benchmark	2024-09-27 13:44:42 -07:00
Yann Collet	d2212c680a	Merge pull request #4013 from elasota/spec-clarify-offset-code-overflow Specify that decoders may reject non-zero probabilities for larger offset codes than implementation supports	2024-09-27 13:42:32 -07:00
Yann Collet	039f404faa	update documentation to specify that Dictionary can be used for benchmark fix #4139	2024-09-25 16:56:01 -07:00
inventor500	9215de52c7	Included suggestion from @neheb	2024-09-25 09:51:05 -07:00
inventor500	a8b544d460	Fixed warning when compiling pzstd with CPPFLAGS=-Wunused-result and CXXFLAGS=-std=c++17	2024-09-25 09:51:05 -07:00
Yann Collet	bc96d4b077	Merge pull request #4119 from xionghul/dev Fix zstd-pgo run error	2024-09-24 17:55:43 -07:00
Yann Collet	d27a4cd4ac	Merge pull request #4143 from facebook/fix_dictsizemin_dic fix doc nit: ZDICT_DICTSIZE_MIN	2024-09-24 17:55:25 -07:00
Ilya Tokar	e8fce38954	Optimize compression by avoiding unpredictable branches	2024-09-20 16:07:01 -04:00

    Avoid unpredictable branch. Use conditional move to generate the address that is guaranteed to be safe and compare unconditionally.

    Instead of

        if (idx < limit && x[idx] == val ) // mispredicted idx < limit branch

    Do

        addr = cmov(safe,x+idx)
        if (*addr == val && idx < limit) // almost always false so well predicted

    Using microbenchmarks from https://github.com/google/fleetbench, I get about ~10% speed-up:

        name                                                       old cpu/op   new cpu/op   delta
        BM_ZSTD_COMPRESS_Fleet/compression_level:-7/window_log:15  1.46ns ± 3%  1.31ns ± 7%   -9.88%  (p=0.000 n=35+38)
        BM_ZSTD_COMPRESS_Fleet/compression_level:-7/window_log:16  1.41ns ± 3%  1.28ns ± 3%   -9.56%  (p=0.000 n=36+39)
        BM_ZSTD_COMPRESS_Fleet/compression_level:-5/window_log:15  1.61ns ± 1%  1.43ns ± 3%  -10.70%  (p=0.000 n=30+39)
        BM_ZSTD_COMPRESS_Fleet/compression_level:-5/window_log:16  1.54ns ± 2%  1.39ns ± 3%   -9.21%  (p=0.000 n=37+39)
        BM_ZSTD_COMPRESS_Fleet/compression_level:-3/window_log:15  1.82ns ± 2%  1.61ns ± 3%  -11.31%  (p=0.000 n=37+40)
        BM_ZSTD_COMPRESS_Fleet/compression_level:-3/window_log:16  1.73ns ± 3%  1.56ns ± 3%   -9.50%  (p=0.000 n=38+39)
        BM_ZSTD_COMPRESS_Fleet/compression_level:-1/window_log:15  2.12ns ± 2%  1.79ns ± 3%  -15.55%  (p=0.000 n=34+39)
        BM_ZSTD_COMPRESS_Fleet/compression_level:-1/window_log:16  1.99ns ± 3%  1.72ns ± 3%  -13.70%  (p=0.000 n=38+38)
        BM_ZSTD_COMPRESS_Fleet/compression_level:0/window_log:15   3.22ns ± 3%  2.94ns ± 3%   -8.67%  (p=0.000 n=38+40)
        BM_ZSTD_COMPRESS_Fleet/compression_level:0/window_log:16   3.19ns ± 4%  2.86ns ± 4%  -10.55%  (p=0.000 n=40+38)
        BM_ZSTD_COMPRESS_Fleet/compression_level:1/window_log:15   2.60ns ± 3%  2.22ns ± 3%  -14.53%  (p=0.000 n=40+39)
        BM_ZSTD_COMPRESS_Fleet/compression_level:1/window_log:16   2.46ns ± 3%  2.13ns ± 2%  -13.67%  (p=0.000 n=39+36)
        BM_ZSTD_COMPRESS_Fleet/compression_level:2/window_log:15   2.69ns ± 3%  2.46ns ± 3%   -8.63%  (p=0.000 n=37+39)
        BM_ZSTD_COMPRESS_Fleet/compression_level:2/window_log:16   2.63ns ± 3%  2.36ns ± 3%  -10.47%  (p=0.000 n=40+40)
        BM_ZSTD_COMPRESS_Fleet/compression_level:3/window_log:15   3.20ns ± 2%  2.95ns ± 3%   -7.94%  (p=0.000 n=35+40)
        BM_ZSTD_COMPRESS_Fleet/compression_level:3/window_log:16   3.20ns ± 4%  2.87ns ± 4%  -10.33%  (p=0.000 n=40+40)

    I've also measured the impact on internal workloads and saw similar ~10% improvement in performance, measured by cpu usage/byte of data.
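
The pseudocode in commit e8fce38954 can be illustrated with a small standalone sketch. This is not the code from the zstd repository: the function names `found_branch`, `found_select` and `self_check`, the `uint32_t` element type, and the `safe` fallback variable are assumptions made for illustration. It only shows the idea of picking an always-readable address first and doing the range check after the comparison. A C compiler is free to emit either a branch or a conditional move for the ternary below; the log itself suggests the real implementation needed extra measures (see 197c258a79, "introduce memory barrier to force test order", and d45aee43f4, "make __asm__ a __GNUC__ specific").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branchy form: the range check guards the load, so `idx < limit`
 * is a branch the CPU has to predict before touching x[idx]. */
static int found_branch(const uint32_t* x, size_t idx, size_t limit, uint32_t val)
{
    return idx < limit && x[idx] == val;
}

/* Select form (hypothetical sketch): choose an address that is always
 * safe to read, compare unconditionally, and test the range last.
 * `safe` is an illustrative fallback location, not a zstd identifier. */
static int found_select(const uint32_t* x, size_t idx, size_t limit, uint32_t val)
{
    static const uint32_t safe = 0;
    /* x + idx is only formed when idx is in range, so no out-of-bounds
     * pointer is ever computed or dereferenced. */
    const uint32_t* addr = (idx < limit) ? x + idx : &safe;
    /* The value comparison is almost always false, hence well predicted;
     * the range check only runs on the rare candidate match. */
    return (*addr == val) && (idx < limit);
}

/* Checks that both forms agree, including out-of-range indices where
 * the fallback value happens to equal `val`. */
static void self_check(void)
{
    const uint32_t x[4] = { 10, 20, 30, 40 };
    size_t idx;
    for (idx = 0; idx < 8; idx++) {
        uint32_t val;
        for (val = 0; val <= 40; val += 10) {
            assert(found_branch(x, idx, 4, val) == found_select(x, idx, 4, val));
        }
    }
}
```

Calling `self_check()` from a `main` function exercises both variants. The trailing `idx < limit` test in `found_select` is what keeps the result correct when the fallback value coincidentally matches `val`.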

10740 Commits