Back out "perf(styles): greatly improve style.hash performance"
This backs out commit 3bfe4cd25ca7a5ae4d4084818b86ada9236b3bb5, but
keeps the hash algorithm as XxHash3 which showed improvements in
performance.
By switching to one-shot hashing of the raw bytes of the struct with
XxHash3 instead of using `autoHash` with Wyhash, a performance gain of
around 20% can be observed in DOOM-fire-zig.
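A minimal sketch of the before/after, using a hypothetical `Style` layout standing in for the real one (one-shot hashing of the raw bytes assumes a well-defined layout with no undefined padding, hence the explicit padding field; the exact std API may differ by Zig version):
```zig
const std = @import("std");

// Hypothetical stand-in for the real Style struct.
const Style = extern struct {
    fg: u32 = 0,
    bg: u32 = 0,
    flags: u16 = 0,
    _pad: u16 = 0,
};

// Old approach: field-wise hashing via autoHash + Wyhash.
fn hashAuto(style: Style) u64 {
    var hasher = std.hash.Wyhash.init(0);
    std.hash.autoHash(&hasher, style);
    return hasher.final();
}

// New approach: one-shot XxHash3 over the raw bytes of the struct.
fn hashOneShot(style: Style) u64 {
    const bytes: []const u8 = std.mem.asBytes(&style);
    return std.hash.XxHash3.hash(0, bytes);
}
```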
Fixes #2841
We were incorrectly applying the start/end x offset for the first/last
row of every single page. If a selection spanned multiple pages this
would trim data incorrectly.
Unit test updated to cover this case.
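Roughly, the fix is to trim by the x offsets only on the selection's global first and last rows, not on the first/last row of every page. A sketch with hypothetical names (not the real selection API):
```zig
const ColRange = struct { start: usize, end: usize };

// Hypothetical sketch: global_row is the row's index within the whole
// selection, spanning all pages.
fn selectedColumns(
    global_row: usize,
    total_rows: usize,
    cols: usize,
    start_x: usize,
    end_x: usize,
) ColRange {
    return .{
        .start = if (global_row == 0) start_x else 0,
        .end = if (global_row == total_rows - 1) end_x else cols - 1,
    };
}
```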
Fixes #2497
While investigating the issue I added an integrity check that found an
unrelated problem hiding in the insert logic; in fixing that I greatly
simplified the insert logic.
It turns out that #2497 is ultimately just a case of bad luck,
pathological inputs that result in very non-uniform hashes so the
clustering overwhelms things. The solution was just to add a check and
claim we're out of memory.
I tried adding an entropy folding function to fix the hash a little but
it had a measurable negative impact on performance and isn't necessary
so I've not included it here. Currently there's an open PR to Zig to
[add RapidHash](https://github.com/ziglang/zig/pull/22085), which is the
successor to Wyhash and apparently has much better statistical
characteristics on top of being faster. I imagine it will land in time
for 0.14 so whenever we update to 0.14 we should probably switch our
standard hash function to RapidHash, which I imagine should yield
improvements across the board.
Using the AutoHasher may also not be the best idea; I may explore ways
to improve how we generate our style hashes in the future.
Fixes #2958
The y was already bounded, but we allowed any x value and assumed the
caller would handle it. This is not the case, so we now check the x and
return null if it's out of bounds (same as y, which was already doing
this).
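A small sketch of the new check (hypothetical names):
```zig
const Size = struct { cols: usize, rows: usize };

// Hypothetical sketch: bound-check x the same way y was already checked,
// returning null instead of trusting the caller.
fn cellIndex(size: Size, x: usize, y: usize) ?usize {
    if (y >= size.rows) return null;
    if (x >= size.cols) return null; // previously unchecked
    return y * size.cols + x;
}
```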
Poor hash uniformity and/or a crafted or unlucky input could cause the
bounds of the PSL stats array to be exceeded, which caused memory
corruption (not good!) -- we avoid such cases now by returning an
OutOfMemory error if we're about to insert and there's an item with a
PSL in the last slot.
The previous logic had multiple issues hiding in edge cases of edge
cases in the resurrected-item handling, among other things; the added
assertIntegrity method finds these issues, the primary one being an
edge case where an ID is present in two different buckets.
Added more comments to explain logic in more detail and fixed a couple
little things like always using `+%` when incrementing the probe pos,
and replacing a silent return on an integrity issue that should be
impossible (`table[item.meta.bucket] != id`) with an assert.
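A condensed sketch of the guarded probe loop (hypothetical names and bound; the real insert logic does considerably more):
```zig
// Assumes a non-empty table. The probe position wraps with `+%`, and
// instead of exceeding the last PSL slot (and corrupting the stats
// array) we bail with OutOfMemory.
const max_psl = 32; // hypothetical bound matching the PSL stats array

fn findInsertSlot(occupied: []const bool, start: usize) error{OutOfMemory}!usize {
    var pos: usize = start % occupied.len;
    var psl: usize = 0;
    while (occupied[pos]) {
        psl += 1;
        if (psl >= max_psl) return error.OutOfMemory;
        pos = (pos +% 1) % occupied.len;
    }
    return pos;
}
```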
Our semantic prompts are row-based, so the last prompt marker set on a
row "wins". In the case of at least our bash shell integration, this
means that consecutive prompt lines will all be marked as .input (OSC
133;B -- end-of-prompt, start of input).
Previously, clearPrompt() identified the current prompt's "area" by
searching upward from the current row until it encounters a .prompt
marker or some command output. In the bash case, .prompt is never the
dominant ("last") marker, so clearPrompt() would aggressively clear all
immediately preceding consecutive prompts.
With this change, we'll stop searching upwards when we encounter some
command output, a .prompt marker, _or another .input marker_. That last
case prevents clearPrompt() from unintentionally clearing earlier prompt
lines.
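A rough sketch of the upward scan, using a hypothetical one-marker-per-row representation:
```zig
const SemanticPrompt = enum { unknown, prompt, input, command };

// Hypothetical sketch: scan upward from the cursor row and return the
// topmost row of the current prompt's area.
fn promptTop(marks: []const SemanticPrompt, cursor_y: usize) usize {
    var top: usize = cursor_y;
    while (top > 0) : (top -= 1) {
        switch (marks[top - 1]) {
            .command => break, // hit command output above
            .prompt => break, // hit an explicit prompt marker
            .input => break, // hit an earlier prompt line (the new case)
            .unknown => {},
        }
    }
    return top;
}
```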
There may be improvements we can make to the way that our bash shell
integration emits semantic prompt markers, but I think this logic is
generally sound for all cases, and it specifically improves the current
bash prompt-clearing experience.
In multiple tests we create 1 or more pages by growing them 1 row at a
time, which results in an integrity check of the page for each row grown
which is just... horrible. By simply pausing integrity checks while
growing these pages (since growing them is not the point of the test) we
MASSIVELY speed up all of these tests.
Also reduced grapheme bytes during testing and made the Terminal "glitch
text" test actually assert what it intends to achieve, rather than just
blindly assuming 100 copies of the text will be the right amount -- this
lets us stop a lot earlier, making it take practically no time.
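A toy, self-contained illustration of the pattern (the real page type and method names differ):
```zig
const std = @import("std");

// Stand-in demonstrating the idea: integrity checks are skipped while
// paused, so growing a page row-by-row in a test no longer triggers a
// full check per row.
const TestPage = struct {
    rows: usize = 0,
    integrity_paused: bool = false,

    fn pauseIntegrityChecks(self: *TestPage, pause: bool) void {
        self.integrity_paused = pause;
    }

    fn grow(self: *TestPage) void {
        self.rows += 1;
        if (!self.integrity_paused) self.assertIntegrity();
    }

    fn assertIntegrity(self: *const TestPage) void {
        // Stand-in for the real (expensive) integrity check.
        std.debug.assert(self.rows < std.math.maxInt(usize));
    }
};

test "grow many rows without per-row integrity checks" {
    var page: TestPage = .{};
    page.pauseIntegrityChecks(true);
    for (0..10_000) |_| page.grow();
    page.pauseIntegrityChecks(false);
    page.assertIntegrity();
}
```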
… break
Fixes #2941
This fixes the rendering of the text below. For those that can't see it,
it is the following in UTF-32: `0x22 0x1F3FF 0x22`.
```
"🏿"
```
`0x1F3FF` is the Fitzpatrick modifier for dark skin tone. It has the
Unicode property `Emoji_Modifier`. Emoji modifiers are defined in UTS
#51 and are only valid based on ED-13:
```
emoji_modifier_sequence := emoji_modifier_base emoji_modifier
emoji_modifier_base := \p{Emoji_Modifier_Base}
emoji_modifier := \p{Emoji_Modifier}
```
Additional quote from UTS #51:
> To have an effect on an emoji, an emoji modifier must immediately follow
> that base emoji character. Emoji presentation selectors are neither needed
> nor recommended for emoji characters when they are followed by emoji
> modifiers, and should not be used in newly generated emoji modifier
> sequences; the emoji modifier automatically implies the emoji presentation
> style.
Our precomputed grapheme break table was mistakenly not following this
rule. This commit fixes that by adding a check that every
`Emoji_Modifier` character must be preceded by an `Emoji_Modifier_Base`.
This only has a cost during compilation (table generation). The runtime
cost is identical; the table size didn't increase since we had leftover
bits we could use.
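A hedged sketch of the added rule, with the property checks stubbed out for illustration (the real check consults the generated property table):
```zig
// The Emoji_Modifier property covers the Fitzpatrick modifiers.
fn isEmojiModifier(cp: u21) bool {
    return cp >= 0x1F3FB and cp <= 0x1F3FF;
}

// Stub: e.g. U+1F44B WAVING HAND SIGN is an Emoji_Modifier_Base.
fn isEmojiModifierBase(cp: u21) bool {
    return cp == 0x1F44B;
}

// Returns true if a grapheme break exists between cp1 and cp2.
fn graphemeBreak(cp1: u21, cp2: u21) bool {
    if (isEmojiModifier(cp2)) {
        // Break unless cp1 is a valid modifier base: `"` followed by
        // U+1F3FF must not form a single cluster.
        return !isEmojiModifierBase(cp1);
    }
    // ... remaining grapheme break rules elided ...
    return true;
}
```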
Load and display (`T`) was responding even with implicit IDs because
the display was achieved with an early return in the transmit function
that bypassed the logic to silence implicit ID responses. Removing the
early return fixes this.
If a transmit-and-display command does not specify an ID or a number,
it should not be responded to. We currently assign IDs automatically in
this situation, which isn't ideal since it can lead to collisions that
shouldn't happen, but this at least fixes the glaring observable issue
where transmit-and-display commands yield unexpected OK responses.
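A rough sketch of the intended check (hypothetical field names; in the Kitty graphics protocol an unset ID/number is zero):
```zig
const Command = struct {
    image_id: u32 = 0,
    image_number: u32 = 0,
};

// Hypothetical sketch: suppress the OK response whenever both the ID
// and the number are unset, i.e. the ID was assigned implicitly by us.
// This now applies to transmit-and-display (`a=T`) as well, instead of
// being skipped by the old early return.
fn shouldRespond(cmd: Command) bool {
    return cmd.image_id != 0 or cmd.image_number != 0;
}
```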
Fixes #2877
As the comment in the diff states, we rely on `mmap` to zero our memory.
When we reset we are reusing previously allocated memory so we won't hit
an `mmap`. We need to zero the memory ourselves.
This is pretty slow if there is a lot of memory but in every case except
allocation failures, we expect there to be only a few pages allocated.
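The idea, in sketch form:
```zig
// Freshly mmap'd pages arrive zero-filled by the kernel, but memory
// reused across a reset must be zeroed explicitly.
fn resetPageMemory(page_bytes: []u8) void {
    @memset(page_bytes, 0);
}
```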
Fixes #2857
Some terminal modes always reset, but there are others that should be
conditional based on how the terminal's default state is configured.
Primarily from #2857 is the grapheme clustering mode (mode 2027), which
was always resetting to false but should be conditional on the
`grapheme-width-method` configuration.
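Sketched with a hypothetical config type and values:
```zig
const GraphemeWidthMethod = enum { legacy, unicode };

const Config = struct {
    grapheme_width_method: GraphemeWidthMethod = .unicode,
};

// Hypothetical sketch: on reset, mode 2027 goes back to whatever the
// configuration says the default is, rather than always false.
fn defaultGraphemeCluster(config: Config) bool {
    return config.grapheme_width_method == .unicode;
}
```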
Fixes #2817
The test is pretty self-explanatory. I also renamed `end` to `count`
since I think this poor naming was the reason for the bug. In
`eraseChars`, the `count` (née `end`) is the number of cells to erase,
not the index of the last cell to erase.
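In sketch form (hypothetical helper, not the real `eraseChars` signature):
```zig
const EraseRange = struct { start: usize, end: usize };

// `count` is the number of cells to erase starting at the cursor,
// clamped to the row width; it is not the index of the last cell.
fn eraseCharsRange(cursor_x: usize, cols: usize, count: usize) EraseRange {
    const n = @min(count, cols - cursor_x);
    return .{ .start = cursor_x, .end = cursor_x + n }; // end is exclusive
}
```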
When an empty string is given to OSC7, the pwd is reset to nil (as if
the terminal never received a pwd report to begin with). This is
analogous to how OSC0/2 reset the title to nil when given an empty
string.
This is practically useful for macOS because it allows our proxy icon to
also be reset instead of being stuck on the last known path.
This breaks from any known terminal behavior. As far as I can find, this
is totally unspecified so we're somewhat free to do what we want. I
don't think any terminal programs depend on this behavior, so I think
it's safe to change it.
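A rough sketch of the new behavior (hypothetical signature; the real code also manages ownership of the string):
```zig
// An empty OSC 7 report clears the stored pwd rather than being
// ignored, mirroring how empty OSC 0/2 clears the title.
fn setPwdReport(current: *?[]const u8, reported: []const u8) void {
    current.* = if (reported.len == 0) null else reported;
}
```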
Fixes#2651
First, our OSC parser didn't allow blank OSC 0 or 2 requests. This
should be allowed and this fixes that with a test.
Second, it seems many terminals (iTerm2, Kitty) treat setting a blank
title as resetting to whatever the default title is rather than
explicitly setting it as blank. If a program wants a title to be blank
they should send a single space. This commit follows this behavior.
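A rough sketch (hypothetical signature):
```zig
// A blank OSC 0/2 title resets to the default title (matching
// iTerm2/Kitty); a program that wants a visibly blank title should
// send a single space instead.
fn effectiveTitle(requested: []const u8, default_title: []const u8) []const u8 {
    return if (requested.len == 0) default_title else requested;
}
```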