Andrzej Janik
7887237ff0
Minor fixes
2025-07-23 22:47:11 +00:00
Andrzej Janik
0fc1028279
Regenerate lib
2025-07-23 22:18:27 +00:00
Andrzej Janik
aceebf959e
Merge commit '119b635b9dffccc2de699b188897d8077529b0d6' into inst_fixes
2025-07-23 22:17:53 +00:00
Andrzej Janik
63f02c4158
Add tanh
2025-07-23 22:15:23 +00:00
Violet
119b635b9d
Emit correct alignment for loads and stores ( #429 )
v5-preview.59
2025-07-23 14:55:52 -07:00
Violet
a86ba3d642
Remove Type::Pointer ( #428 )
v5-preview.58
2025-07-23 11:22:17 -07:00
Andrzej Janik
eb2d1f81fb
Fix lg2 implementation
2025-07-23 16:48:44 +00:00
Andrzej Janik
224e1ca1da
Improve ex2 handling
2025-07-23 16:13:22 +00:00
Andrzej Janik
14532dc9c1
Fix rsqrt and rcp
2025-07-23 02:11:44 +00:00
Andrzej Janik
321201dd2a
Fix sqrt
2025-07-22 20:28:21 +00:00
Violet
27cfd50ddd
Implement nanosleep.u32 ( #421 )
v5-preview.57
2025-07-21 17:42:04 -07:00
Violet
72e2fe5b9a
Remove unnecessary unsafe block ( #426 )
v5-preview.56
2025-07-21 13:20:12 -07:00
Andrzej Janik
e147d1af43
Use more precise sqrt.approx by default
2025-07-19 01:33:32 +00:00
Andrzej Janik
d837400085
Merge commit '2d582660921913c3adefb37b3d3209c174f434bb' into inst_fixes
2025-07-18 20:47:01 +00:00
Violet
f5712d9d5a
Add parser support for hyphenated IDs in arguments ( #425 )
The syntax description for [`cp.async`](https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async) has several elements not supported by the current parser. One such element is that the `cp-size` and `src-size` operands have hyphens in their IDs. This PR adds support for these IDs and translates them as `cp_size` and `src_size`.
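As a rough illustration of the translation described above, a minimal sketch might look like the following. The function name `normalize_operand_id` is hypothetical and is not taken from the ZLUDA parser; the sketch only assumes that a hyphenated ID maps to the same ID with underscores.

```rust
/// Hypothetical helper (not the actual ZLUDA code): turn a PTX operand ID
/// that may contain hyphens into a name usable as a Rust-style identifier.
fn normalize_operand_id(id: &str) -> String {
    // Assumption: every hyphen is replaced by an underscore and nothing
    // else changes, e.g. `cp-size` becomes `cp_size`.
    id.replace('-', "_")
}
```

Under that assumption, `normalize_operand_id("src-size")` yields `src_size`, and IDs without hyphens pass through unchanged.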
v5-preview.55
2025-07-18 13:45:09 -07:00
Andrzej Janik
72e15a9351
Fix neg.ftz
2025-07-18 20:44:07 +00:00
Andrzej Janik
2d58266092
Update mod.rs
2025-07-18 11:50:27 -07:00
Andrzej Janik
5cb222f042
Add set instruction and add bool post op support for setp
2025-07-18 17:29:57 +00:00
Andrzej Janik
242bef5dca
Add set, make sure fix to abs actually fixes things
2025-07-18 01:00:04 +00:00
Andrzej Janik
e23e3ff114
Add sub with .sat
2025-07-17 22:40:37 +00:00
Andrzej Janik
f04f0ec6e0
Fix bit shifts
2025-07-17 22:21:26 +00:00
Andrzej Janik
38e8a3fe1f
Fix abs.ftz, add test for integer abs
2025-07-17 18:53:06 +00:00
Andrzej Janik
2f27c47acc
Improve error recovery ( #418 )
v5-preview.54
2025-07-17 10:02:03 -07:00
林博仁 Buo-ren Lin
0f8d4bb834
Fix typo in README.md (either) ( #419 )
v5-preview.53
2025-07-17 09:32:41 -07:00
Violet
dc69808e54
Add support for shfl.sync.MODE.b32 ( #409 )
v5-preview.52
v5-preview.51
2025-07-16 17:23:11 -07:00
Andrzej Janik
36f0ba9cbb
Apply rounding mode in fp div ( #416 )
2025-07-16 17:22:59 -07:00
Violet
95d66df18e
Only allow (.u32, .pred) for multiple return ( #417 )
v5-preview.50
2025-07-16 17:03:28 -07:00
Violet
7c6b95a8e3
Allow messages for error_todo ( #415 )
v5-preview.49
2025-07-16 15:54:40 -07:00
林博仁 Buo-ren Lin
039689253d
Fix grammar errors in README.md ( #414 )
v5-preview.48
2025-07-16 12:19:00 -07:00
林博仁 Buo-ren Lin
777392f69f
Fix typo in README.md(self-contained) ( #413 )
v5-preview.47
2025-07-16 11:41:07 -07:00
Violet
6fb09f393a
Handle WARP_SZ ( #412 )
* Add tests for `WARP_SZ`
* Handle WARP_SZ in parser
v5-preview.46
2025-07-16 11:02:17 -07:00
Violet
06b28cfec7
More descriptive message for unknown symbol ( #411 )
v5-preview.45
2025-07-14 15:01:38 -07:00
Violet
373d6d9e6e
Remove duplicate call to linker ( #410 )
v5-preview.44
2025-07-10 12:44:22 -07:00
Andrzej Janik
081f7d0976
Enable sccache in Rust builds, publish prerelease builds ( #408 )
v5-preview.43
2025-07-09 09:20:03 -07:00
Violet
6e27f78ae7
Add support for multiple return arguments ( #406 )
2025-07-09 08:17:15 -07:00
Violet
fa7ecb2e02
Update README.md ( #407 )
2025-07-08 17:13:53 -07:00
Andrzej Janik
059b8ca0f6
Make sure it is possible to log 32bit PhysX ( #374 )
2025-07-08 10:19:49 -07:00
aiwhskruht
9bd8125c53
Implement more CUDA driver API to enable simple cuda-samples ( #405 )
2025-07-08 10:18:30 -07:00
aiwhskruht
8d5b734c30
Add initialized check to protect zluda from cuda driver calls during shutdown ( #404 )
2025-07-07 11:08:09 -07:00
Andrzej Janik
ef0c4afcf9
Run unit tests on every commit ( #401 )
2025-07-03 16:07:00 -07:00
Violet
5cb0a9b8e8
Add support for bar.red.and.pred ( #402 )
Implements bar.red.and.pred and bar.red.or.pred using the undocumented __ockl_wgred functions. Numbered barriers and threadcount are not yet supported, as they are not needed for llm.c.
2025-07-03 11:56:20 -07:00
Violet
7bdd20f0dd
Add warp-wide tests ( #400 )
2025-07-02 18:11:36 -07:00
Andrzej Janik
6d56fa8c34
Fix floating point min/max ( #399 )
2025-07-01 15:58:16 -07:00
Violet
b824424367
Read test files at runtime for development ergonomics ( #395 )
2025-07-01 10:31:06 -07:00
Violet
1cf345329c
Make derive_parser work with all optional arguments ( #397 )
The current implementation using `winnow`'s `opt` does not work for optional arguments that are in the middle of the command. For example, `bar{.cta}.red.op.pred p, a{, b}, {!}c;`. This is because `opt` is greedy and will always match `{, b}` instead of `, {!}c`. This change switches to a custom combinator that handles this properly.
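The greedy-versus-backtracking behaviour described above can be sketched on a pre-tokenized operand list. This is a simplified, dependency-free model, not `winnow` and not the combinator added in the PR; all names here (`Operands`, `parse_greedy`, `parse_backtracking`, and the helpers) are hypothetical.

```rust
/// Operands of the simplified form `a{, b}, {!}c`.
#[derive(Debug, PartialEq)]
struct Operands<'a> {
    a: &'a str,
    b: Option<&'a str>,
    negate_c: bool,
    c: &'a str,
}

type Tokens<'a> = &'a [&'a str];

/// Consume one exact token.
fn expect<'a>(input: Tokens<'a>, tok: &str) -> Option<Tokens<'a>> {
    match input.split_first() {
        Some((first, rest)) if *first == tok => Some(rest),
        _ => None,
    }
}

/// Consume one identifier-like token.
fn ident<'a>(input: Tokens<'a>) -> Option<(&'a str, Tokens<'a>)> {
    match input.split_first() {
        Some((first, rest))
            if !first.is_empty()
                && first.chars().all(|ch| ch.is_ascii_alphanumeric() || ch == '_') =>
        {
            Some((*first, rest))
        }
        _ => None,
    }
}

/// The optional middle part: `, b`.
fn comma_ident<'a>(input: Tokens<'a>) -> Option<(&'a str, Tokens<'a>)> {
    ident(expect(input, ",")?)
}

/// The required tail: `, {!}c`, which must consume all remaining tokens.
fn tail<'a>(input: Tokens<'a>) -> Option<(bool, &'a str)> {
    let rest = expect(input, ",")?;
    let (negate_c, rest) = match expect(rest, "!") {
        Some(after_bang) => (true, after_bang),
        None => (false, rest),
    };
    let (c, rest) = ident(rest)?;
    if rest.is_empty() { Some((negate_c, c)) } else { None }
}

/// Greedy: commits to `, b` as soon as it matches, so `a, c` is rejected.
fn parse_greedy<'a>(input: Tokens<'a>) -> Option<Operands<'a>> {
    let (a, rest) = ident(input)?;
    let (b, rest) = match comma_ident(rest) {
        Some((b, after_b)) => (Some(b), after_b),
        None => (None, rest),
    };
    let (negate_c, c) = tail(rest)?;
    Some(Operands { a, b, negate_c, c })
}

/// Backtracking: keeps `, b` only if the tail still parses afterwards.
fn parse_backtracking<'a>(input: Tokens<'a>) -> Option<Operands<'a>> {
    let (a, rest) = ident(input)?;
    if let Some((b, after_b)) = comma_ident(rest) {
        if let Some((negate_c, c)) = tail(after_b) {
            return Some(Operands { a, b: Some(b), negate_c, c });
        }
    }
    // Retry from the same position with the optional part skipped.
    let (negate_c, c) = tail(rest)?;
    Some(Operands { a, b: None, negate_c, c })
}
```

With the tokens `["a", ",", "c"]`, the greedy version consumes `, c` as the optional `b` and then fails on the missing tail, while the backtracking version retries without `b` and succeeds. Both agree when `b` is actually present.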
2025-06-30 18:54:31 -07:00
aiwhskruht
d4ad17d75a
Unified fatbin versions behind a single iterator. ( #398 )
2025-06-27 15:56:46 -07:00
Violet
80607c07db
Check LLVM IR for test_ptx! with no input/output ( #394 )
2025-06-24 11:53:30 -07:00
Andrzej Janik
22608d7420
Bump dependencies ( #392 )
zip 2.6.1 was yanked and microlp 2.10 has a major bug
2025-06-23 18:04:08 -07:00
Violet
5edfeb04eb
Error instead of infinite loop when parsing enum without a derive attribute in derive_parser! ( #391 )
2025-06-23 16:18:21 -07:00
Violet
74ff9ebf96
Remove trailing zeroes from end of ptx ( #390 )
2025-06-23 16:14:07 -07:00