Violet
dc69808e54
Add support for shfl.sync.MODE.b32 ( #409 )
2025-07-16 17:23:11 -07:00
Andrzej Janik
36f0ba9cbb
Apply rounding mode in fp div ( #416 )
2025-07-16 17:22:59 -07:00
Violet
95d66df18e
Only allow (.u32, .pred) for multiple return ( #417 )
2025-07-16 17:03:28 -07:00
Violet
7c6b95a8e3
Allow messages for error_todo ( #415 )
2025-07-16 15:54:40 -07:00
Violet
6fb09f393a
Handle WARP_SZ ( #412 )
...
* Add tests for `WARP_SZ`
* Handle WARP_SZ in parser
2025-07-16 11:02:17 -07:00
Violet
06b28cfec7
More descriptive message for unknown symbol ( #411 )
2025-07-14 15:01:38 -07:00
Violet
6e27f78ae7
Add support for multiple return arguments ( #406 )
2025-07-09 08:17:15 -07:00
Andrzej Janik
059b8ca0f6
Make sure it is possible to log 32bit PhysX ( #374 )
2025-07-08 10:19:49 -07:00
Andrzej Janik
ef0c4afcf9
Run unit tests on every commit ( #401 )
2025-07-03 16:07:00 -07:00
Violet
5cb0a9b8e8
Add support for bar.red.and.pred ( #402 )
...
Implements bar.red.and.pred and bar.red.or.pred using the undocumented __ockl_wgred functions. Numbered barriers and thread counts are not supported yet, as llm.c does not need them.
2025-07-03 11:56:20 -07:00
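For readers unfamiliar with the instruction, here is a minimal host-side sketch of the reduction that `bar.red.and.pred` and `bar.red.or.pred` perform: every thread in the block contributes one predicate and all threads receive the same combined result. The names `BarRedOp` and `bar_red_pred` are hypothetical and only illustrate the semantics; this is not the code from the commit, and it models neither the synchronization itself nor the `__ockl_wgred` functions.

```rust
/// Reduction operators covered by the commit: `.and` and `.or`.
/// (Hypothetical type, introduced only for this illustration.)
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BarRedOp {
    And,
    Or,
}

/// Host-side model of `bar.red.{and,or}.pred`.
///
/// `predicates` holds one value per participating thread. The returned value
/// is what every thread would observe after the barrier. The barrier's
/// synchronization effect is not modeled here.
pub fn bar_red_pred(op: BarRedOp, predicates: &[bool]) -> bool {
    match op {
        // True only if every thread's predicate is true.
        BarRedOp::And => predicates.iter().all(|&p| p),
        // True if at least one thread's predicate is true.
        BarRedOp::Or => predicates.iter().any(|&p| p),
    }
}
```

With the per-thread predicates `[true, true, false, true]`, the `And` reduction yields `false` and the `Or` reduction yields `true`.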
Violet
7bdd20f0dd
Add warp-wide tests ( #400 )
2025-07-02 18:11:36 -07:00
Andrzej Janik
6d56fa8c34
Fix floating point min/max ( #399 )
2025-07-01 15:58:16 -07:00
Violet
b824424367
Read test files at runtime for development ergonomics ( #395 )
2025-07-01 10:31:06 -07:00
Violet
80607c07db
Check LLVM IR for test_ptx! with no input/output ( #394 )
2025-06-24 11:53:30 -07:00
Andrzej Janik
22608d7420
Bump dependencies ( #392 )
...
zip 2.6.1 was yanked and microlp 2.10 has a major bug
2025-06-23 18:04:08 -07:00
Andrzej Janik
2a374ad880
Add fp saturation, fix various bugs in cvt instruction exposed by ptx_tests ( #379 )
2025-06-16 19:14:16 -07:00
Andrzej Janik
3361046760
Fix mad.wide, replace external CUDA library in test with our own ( #376 )
2025-06-09 21:33:18 -07:00
Andrzej Janik
3d3e38aadc
Fix ROCm 6.4 failures ( #364 )
...
Lazy load comgr and dispatch to different code paths based on the name of the comgr .dll/.so
2025-05-02 00:38:22 +02:00
Joëlle van Essen
7cdab7abc2
Implement mul24 ( #351 )
2025-04-08 12:27:19 +02:00
Andrzej Janik
d704e92c97
Support instruction modes (denormal and rounding) on AMD GPUs ( #342 )
2025-03-17 21:37:26 +01:00
Joëlle van Essen
867e4728d5
LLVM unit tests ( #324 )
...
* LLVM unit tests: add assembly files
* LLVM unit tests: first attempt
* LLVM unit tests: fix - parse bitcode in context
* LLVM unit tests: use pretty_assertions for line-by-line diff
* LLVM unit tests: Write IR to file for failed test
* LLVM unit tests: just use the stack
* LLVM unit tests: use MaybeUninit
* LLVM unit tests: add mul24.ll
* LLVM unit tests: Adjustments after review
* LLVM unit tests: Include emit_llvm::Context in emit_llvm::Module
* LLVM unit tests: Fix typo
* LLVM unit tests: Context need not be pub
2025-02-19 21:21:20 +01:00
Andrzej Janik
646d746e02
Start working on mul24
2025-02-07 19:37:11 +00:00
Andrzej Janik
7ac67a89e9
Enable Geekbench 5 ( #304 )
2024-12-10 21:48:10 +01:00
Andrzej Janik
7a6df9dcbf
Fix host code and update to CUDA 12.4 ( #299 )
2024-12-02 00:29:57 +01:00
Andrzej Janik
970ba5aa25
Fix linking of AMD device libraries ( #296 )
...
It's weird that it fails without `-mno-link-builtin-bitcode-postopt`. I've tested it only on ROCm 6.2, so it might be broken on older or newer ROCm versions
2024-11-02 16:07:44 +01:00
Andrzej Janik
3870a96592
Re-enable all failing PTX tests ( #277 )
...
Additionally remove unused compilation paths
2024-10-16 03:15:48 +02:00
Andrzej Janik
c92abba2bb
Refactor compilation passes ( #270 )
...
The overarching goal is to refactor all passes so they are module-scoped and not function-scoped. Additionally, make improvements to the most egregiously buggy/unfit passes (so the code is ready for the next major features: linking, ftz handling) and continue adding more code to the LLVM backend
2024-09-23 16:33:46 +02:00
Andrzej Janik
46def3e7e0
Connect new parser to LLVM bitcode backend ( #269 )
...
This is very incomplete. Just enough code to emit LLVM bitcode and continue further development
2024-09-13 01:07:31 +02:00
Andrzej Janik
193eb29be8
PTX parser rewrite ( #267 )
...
Replace the traditional LALRPOP-based parser with a winnow-based parser to handle out-of-order instruction modifiers. Generate the instruction type and instruction visitor from a macro instead of writing them by hand. Add a separate compilation path using the new parser that only works in tests for now
2024-09-04 15:47:42 +02:00
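To illustrate what "out-of-order instruction modifiers" means, here is a minimal sketch that accepts the dot-separated modifiers of a load-like instruction in any order. It uses only the standard library rather than winnow, and the `Modifiers` struct, the `parse_modifiers` function and the accepted modifier set are all hypothetical; the project's actual parser is macro-generated and is not reproduced here.

```rust
/// Hypothetical modifier set for an illustrative load-like instruction.
#[derive(Debug, Default, PartialEq)]
pub struct Modifiers {
    pub volatile: bool,
    pub space: Option<String>, // e.g. "global", "shared"
    pub ty: Option<String>,    // e.g. "u32", "f32"
}

/// Parses the dot-separated modifiers that follow an opcode.
///
/// Each modifier is matched by its own spelling rather than by position,
/// so `.global.volatile.u32` and `.u32.volatile.global` give the same result.
/// Unknown or repeated modifiers are reported as errors.
pub fn parse_modifiers(text: &str) -> Result<Modifiers, String> {
    let mut result = Modifiers::default();
    // A leading '.' produces an empty first token, which is skipped.
    for token in text.split('.').filter(|t| !t.is_empty()) {
        match token {
            "volatile" => {
                if result.volatile {
                    return Err("repeated modifier: volatile".to_string());
                }
                result.volatile = true;
            }
            "global" | "shared" | "local" => {
                // `Option::replace` returns the previous value, if any.
                if result.space.replace(token.to_string()).is_some() {
                    return Err(format!("more than one state space: {}", token));
                }
            }
            "u32" | "s32" | "f32" | "b32" => {
                if result.ty.replace(token.to_string()).is_some() {
                    return Err(format!("more than one type: {}", token));
                }
            }
            other => return Err(format!("unknown modifier: {}", other)),
        }
    }
    Ok(result)
}
```

A grammar with fixed positions would need one rule per permutation to accept the same input. Matching each modifier into its own slot avoids that, which is the general idea a combinator-based parser can express directly.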
Andrzej Janik
971951bc9e
Improve reporting of recovered unrecognized statement/directive
2021-12-14 00:02:23 +01:00
Andrzej Janik
0ca14d740f
Better reporting of unrecognized tokens
2021-12-13 22:25:26 +01:00
Andrzej Janik
7ba1586d6c
Make all user errors recoverable
2021-12-13 17:20:06 +01:00
Andrzej Janik
816365e7df
Fix shared munging pass and add fix for cuModuleLoadData
2021-09-29 21:49:47 +00:00
Andrzej Janik
0172dc58e5
Redo shared memory transformation
2021-09-29 02:24:32 +02:00
Andrzej Janik
b763415006
Add CUDA tests showing problems with .shared unification
2021-09-27 00:42:10 +02:00
Andrzej Janik
c23be576e8
Finish fixing shared memory pass
2021-09-26 01:24:14 +02:00
Andrzej Janik
370c0bd09e
Start implementing .shared unification
2021-09-24 01:31:50 +02:00
Andrzej Janik
9609f86033
Fix minor bugs
2021-09-19 00:39:43 +00:00
Andrzej Janik
afe9120868
Fix linkage
2021-09-18 22:49:00 +00:00
Andrzej Janik
04a411fe22
Have an implementation for vprintf
2021-09-18 20:22:47 +00:00
Andrzej Janik
ccf3c02ac1
Minor fixes
2021-09-18 01:36:12 +00:00
Andrzej Janik
3de01b3f8b
Handle ld.volatile/st.volatile
2021-09-17 21:26:15 +00:00
Andrzej Janik
d5a4b068dd
Redo handling of sregs
2021-09-17 20:53:44 +00:00
Andrzej Janik
6ef19d6501
Add early support for more sregs
2021-09-17 18:31:12 +00:00
Andrzej Janik
5b2352723f
Implement function pointers and activemask
2021-09-17 16:24:25 +00:00
Andrzej Janik
ca0d8ec666
Add missing vray instructions
2021-09-16 01:25:09 +02:00
Andrzej Janik
467782b1d0
Fix some unhandled cases in cvt instruction
2021-09-14 23:38:06 +00:00
Andrzej Janik
2cd0fcb650
Parse and test const buffers
2021-09-14 22:41:46 +02:00
Andrzej Janik
986fa49097
Zero out buffer on creation
2021-09-13 23:43:50 +00:00
Andrzej Janik
dbb6f09ffa
Continue HIP conversion
2021-09-13 17:59:40 +00:00