44 Commits

Author SHA1 Message Date
Mitchell Hashimoto
f1c42c9f8c synthetic package
This introduces a new package `src/synthetic` for generating synthetic
data, currently primarily for benchmarking but other use cases can
emerge.

The synthetic package exports a runtime-dispatched type `Generator` that
can generate data of various types. To start, we have bytes, UTF-8,
and OSC generators. The goal of each generator is to expose knobs to tune the
probabilities of various outcomes. For example, the UTF-8 generator has
a knob to tune the probability of generating 1, 2, 3, or 4-byte UTF-8
sequences.
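
As an illustration of the kind of knob described above, here is a minimal
sketch in Python (not the Zig code from `src/synthetic`). The function name
`gen_utf8`, the `weights` parameter, and the default weights are all
hypothetical; they only show how a per-length probability could drive
generation.

```python
import random

# Code point ranges grouped by UTF-8 encoded length. Surrogates
# (U+D800..U+DFFF) are excluded because they cannot be encoded as UTF-8.
RANGES = {
    1: [(0x00, 0x7F)],
    2: [(0x80, 0x7FF)],
    3: [(0x800, 0xD7FF), (0xE000, 0xFFFF)],
    4: [(0x10000, 0x10FFFF)],
}

def gen_utf8(n_chars, weights=(0.7, 0.1, 0.1, 0.1), seed=0):
    """Return n_chars random code points encoded as UTF-8.

    weights[i] is the relative probability of emitting a sequence that is
    i + 1 bytes long (hypothetical knob, assumed for this sketch).
    """
    rng = random.Random(seed)
    out = bytearray()
    for _ in range(n_chars):
        # Pick the encoded length first, according to the knob.
        length = rng.choices((1, 2, 3, 4), weights=weights)[0]
        # Then pick a sub-range (uniformly among sub-ranges, not by size)
        # and a code point inside it.
        lo, hi = rng.choice(RANGES[length])
        out += chr(rng.randint(lo, hi)).encode("utf-8")
    return bytes(out)
```

Setting a weight to zero removes that sequence length entirely, so
`weights=(1, 0, 0, 0)` yields pure single-byte output.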

Ultimately, the goal is to be able to collect probability data
empirically that we can then use for benchmarks so we can optimize
various parts of the codebase on real-world data shape distributions.
2025-05-21 10:20:09 -07:00
Mitchell Hashimoto
a74e352726 bench: add --mode=gen-osc to generate synthetic OSC sequences
This commit adds a few new mode flags to the `bench-stream` program
to generate synthetic OSC sequences. The new modes are `gen-osc`,
`gen-osc-valid`, and `gen-osc-invalid`. The `gen-osc` mode generates
equal parts valid and invalid OSC sequences, while the suffixed variants
are for generating only valid or invalid sequences, respectively.
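
The three modes can be pictured as one generator with a single validity
probability. The sketch below is in Python and is not the `bench-stream`
implementation. The name `gen_osc`, the `p_valid` parameter, and the choice
of what counts as "invalid" (a non-numeric command identifier) are
assumptions made for illustration.

```python
import random

ESC = b"\x1b"
BEL = b"\x07"
LETTERS = b"abcdefghijklmnopqrstuvwxyz"

def gen_osc(rng, p_valid=0.5):
    """Return one OSC sequence, valid with probability p_valid.

    Valid:   ESC ] 0 ; <text> BEL  (OSC 0, set window title).
    Invalid: ESC ] <text> BEL      (assumed here: non-numeric command).
    """
    text = bytes(rng.choices(LETTERS, k=rng.randint(1, 16)))
    if rng.random() < p_valid:
        return ESC + b"]0;" + text + BEL
    return ESC + b"]" + text + BEL

# Hypothetical mapping from the mode names to the probability knob.
MODES = {"gen-osc": 0.5, "gen-osc-valid": 1.0, "gen-osc-invalid": 0.0}
```

Because `rng.random()` returns a value in the range [0, 1), a `p_valid` of
1.0 always takes the valid branch and 0.0 never does.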

This commit also fixes our build system to actually be able to build the
benchmarks. It turns out we were just rebuilding the main Ghostty binary
for `-Demit-bench`. And, our benchmarks didn't run under Zig 0.14, which
is now fixed.

An important new design I'm working towards in this commit is to split
out synthetic data generation to a dedicated package in
`src/bench/synth` although I'm tempted to move it to `src/synth` since
it may be useful outside of benchmarks.

The synth package is a work-in-progress, but it contains a hint of
what's to come. I ultimately want to be able to generate all kinds of
synthetic data with a lot of knobs to control dimensionality (e.g. in
the case of OSC sequences: valid/invalid, length, operation types,
etc.).
2025-05-14 12:26:31 -07:00
Mitchell Hashimoto
a416d4236a remove old terminal implementation 2024-03-26 16:14:25 -07:00
Mitchell Hashimoto
37251dca95 fix bench compilation 2024-03-22 20:27:59 -07:00
Mitchell Hashimoto
9b4ab0e209 zig build test with renamed terminal package 2024-03-22 20:27:44 -07:00
Mitchell Hashimoto
100e6ed254 terminal/new => terminal2 so we can figure out what depends on what 2024-03-22 20:27:36 -07:00
Mitchell Hashimoto
9006a3f431 bench/resize 2024-03-22 20:27:33 -07:00
Mitchell Hashimoto
06376fcb0b terminal/new: clone can take a shared pool 2024-03-22 20:27:31 -07:00
Mitchell Hashimoto
2725b7d9b2 bench/screen-copy 2024-03-22 20:27:31 -07:00
Mitchell Hashimoto
0f63cd6f01 terminal/new: scrollDown, top/bot margin tests, fix insertLines bug 2024-03-22 20:27:26 -07:00
Mitchell Hashimoto
e7bf9dc53c vt-insert-lines bench 2024-03-22 20:27:26 -07:00
Mitchell Hashimoto
de3d1e4df7 terminal/new: clean up 2024-03-22 20:27:19 -07:00
Mitchell Hashimoto
396cf5eb7a bench/page-init: page count 2024-03-22 20:27:19 -07:00
Mitchell Hashimoto
7ad94caaeb bench/page-init 2024-03-22 20:27:19 -07:00
Mitchell Hashimoto
f7c597fa95 terminal/new 2024-03-22 20:27:18 -07:00
Mitchell Hashimoto
46b59b4c7d terminal/new: scrollactive 2024-03-22 20:27:18 -07:00
Mitchell Hashimoto
5628fa36d8 terminal/new: scrollDown 2024-03-22 20:27:18 -07:00
Mitchell Hashimoto
dc6de51472 terminal/new: add bench 2024-03-22 20:27:17 -07:00
Qwerasd
58b925d4c3 fix(bench): update std options format 2024-02-10 22:20:24 -05:00
Mitchell Hashimoto
132fbb3a46 unicode: use packed struct for break state 2024-02-09 20:29:36 -08:00
Mitchell Hashimoto
5f3574a4bf unicode: direct port of ziglyph to start 2024-02-09 19:44:57 -08:00
Mitchell Hashimoto
6437623500 bench/grapheme-break 2024-02-09 09:12:05 -08:00
Mitchell Hashimoto
fc459ad827 Merge pull request #1486 from mitchellh/unilut
Use precomputed lookup tables for even faster codepoint width computations
2024-02-08 21:51:33 -08:00
Mitchell Hashimoto
4834b8e925 remove utf8proc 2024-02-08 21:11:11 -08:00
Mitchell Hashimoto
f6e694bf80 bench: update codepoint-width 2024-02-08 21:10:06 -08:00
Mitchell Hashimoto
9755d0696e unicode: generate our own lookup tables 2024-02-08 21:01:11 -08:00
Qwerasd
68c0813397 terminal/stream: Added ESC parsing fast tracks 2024-02-08 21:49:58 -05:00
Mitchell Hashimoto
4ae41579da add utf8proc back for bench 2024-02-08 13:21:36 -08:00
Mitchell Hashimoto
17dc64053e terminal: swap codepointwidth implementations 2024-02-07 09:38:17 -08:00
Mitchell Hashimoto
3c31217f3c simd: minor tweaks 2024-02-07 09:28:56 -08:00
Mitchell Hashimoto
5692d39067 bench/codepoint-width: add wcwidth 2024-02-07 09:17:26 -08:00
Mitchell Hashimoto
88d81602fa simd/codepoint-width: wip 2024-02-06 22:28:26 -08:00
Mitchell Hashimoto
d4fa0fcabf bench/codepoint-width 2024-02-06 17:11:11 -08:00
Qwerasd
d96243fa5b bench/stream: script adjustments 2024-02-06 19:30:27 -05:00
Qwerasd
2db24fdd57 bench/stream: add gen-rand (arbitrary random bytes) 2024-02-06 19:29:06 -05:00
Qwerasd
b31099daf4 bench/stream: only generate benchmark input once, improve utf8 gen 2024-02-06 18:22:59 -05:00
Mitchell Hashimoto
03fceb81a5 move bench script 2024-02-05 21:22:28 -08:00
Mitchell Hashimoto
0c8dd34ea7 build: update build and comments 2024-02-05 21:22:28 -08:00
Mitchell Hashimoto
e88292f096 bench/stream: add utf8 2024-02-05 21:22:28 -08:00
David Rubin
a853277dbf make benchmarks more accurate
By adding a `--size` flag, we make the size of buffers not comptime-known, which prevents certain unrolling optimizations from happening.

Secondly, `noinline` creates a more reproducible environment for benchmarking by isolating the contents of the functions.
2024-02-05 21:22:27 -08:00
Mitchell Hashimoto
caf9405db0 bench/stream: add terminal option 2024-02-05 21:22:27 -08:00
Mitchell Hashimoto
b030663e03 bench/stream: benchmark for stream processing 2024-02-05 21:22:27 -08:00
Mitchell Hashimoto
f1227a3ebd build: get benchmarks building again 2024-02-04 20:27:53 -08:00
Mitchell Hashimoto
81fbc94b3c Add a benchmark exe for testing parser throughput 2022-11-13 11:29:05 -08:00