365 Commits

Author SHA1 Message Date
Violet
5edfeb04eb Error instead of infinite loop when parsing enum without a derive attribute in derive_parser! (#391) 2025-06-23 16:18:21 -07:00
Violet
74ff9ebf96 Remove trailing zeroes from end of ptx (#390) 2025-06-23 16:14:07 -07:00
Violet
f4cd545677 Fix bug in get_payload (#389) 2025-06-18 17:29:21 -07:00
Violet
4da3978f94 Implement cuLibraryLoadData (#388) 2025-06-18 16:05:53 -07:00
Violet
8ce70c5095 Add integrity_check implementation to ZLUDA (#387) 2025-06-17 15:00:10 -07:00
Andrzej Janik
2a374ad880 Add fp saturation, fix various bugs in cvt instruction exposed by ptx_tests (#379) 2025-06-16 19:14:16 -07:00
Violet
4d4053194a Implement runtime_callback_hooks_fn6 (#386) 2025-06-16 17:00:47 -07:00
Violet
9c5f1ed9fb Handle new attributes in cuDeviceGetAttribute (#383) 2025-06-16 13:20:04 -07:00
Andrzej Janik
f179868b8e Add automated builds (#358) 2025-06-16 09:53:18 -07:00
Violet
9773d20945 Implement cudart_interface_fn2 (#382) 2025-06-13 14:01:14 -07:00
Violet
1715830d82 Implement cuModuleGetLoadingMode (#381) 2025-06-11 15:54:48 -07:00
Violet
25a9d1c40e Implement runtime_callback_hooks_fn2 (#380) 2025-06-11 15:15:43 -07:00
Violet
62f3e63355 Implement cuGetProcAddress and cuGetProcAddress_v2 (#377) 2025-06-10 16:07:35 -07:00
Andrzej Janik
3361046760 Fix mad.wide, replace external CUDA library in test with our own (#376) 2025-06-09 21:33:18 -07:00
Andrzej Janik
c790ab45ec Redo logging to better log dark API and performance libraries (#372) 2025-06-09 15:29:14 -07:00
Andrzej Janik
5935cfec78 Work around broken AMD Adrenalin 25.5.1 driver (#366)
For reasons unknown, AMD Adrenalin 25.5.1 ships with comgr that presents itself as version 2, but expects ABI for version 3. Add a workaround
2025-05-13 02:20:23 +02:00
Andrzej Janik
3d3e38aadc Fix ROCm 6.4 failures (#364)
Lazy load comgr and dispatch to different code paths based on the name of the comgr .dll/.so
2025-05-02 00:38:22 +02:00
Andrzej Janik
cc83b9f1f6 Create infrastructure for performance libraries (#363) 2025-05-01 22:37:18 +02:00
Andrzej Janik
adc4673a20 Explicitly fail compilation on ROCm 6.4 (#361)
AMD broke comgr ABI in 6.4. This is a temporary solution.
2025-04-20 17:02:05 +02:00
Joëlle van Essen
7cdab7abc2 Implement mul24 (#351) 2025-04-08 12:27:19 +02:00
Andrzej Janik
d704e92c97 Support instruction modes (denormal and rounding) on AMD GPUs (#342) 2025-03-17 21:37:26 +01:00
Joëlle van Essen
867e4728d5 LLVM unit tests (#324)
* LLVM unit tests: add assembly files

* LLVM unit tests: first attempt

* LLVM unit tests: fix - parse bitcode in context

* LLVM unit tests: use pretty_assertions for line-by-line diff

* LLVM unit tests: Write IR to file for failed test

* LLVM unit tests: just use the stack

* LLVM unit tests: use MaybeUninit

* LLVM unit tests: add mul24.ll

* LLVM unit tests: Adjustments after review

* LLVM unit tests: Include emit_llvm::Context in emit_llvm::Module

* LLVM unit tests: Fix typo

* LLVM unit tests: Context need not be pub
2025-02-19 21:21:20 +01:00
Andrzej Janik
646d746e02 Start working on mul24 2025-02-07 19:37:11 +00:00
Andrzej Janik
df5a96d935 Improve build system (#329)
Also fix Dockerfile and Windows build
2025-01-28 01:55:36 +01:00
Alexander Zaitsev
9c0747a5f7 fix: missing inherits in a release-lto profile (#319) 2025-01-03 16:58:19 +01:00
Alexander Zaitsev
fee20e54d9 feat: enable LTO and codegen-units = 1 optimization (#318) 2025-01-02 19:07:39 +01:00
Joëlle van Essen
7399132d5d Fix test in zluda_dump (#316) 2025-01-01 23:02:59 +01:00
Andrzej Janik
ecd61a8e2a Update README for version 4 (#315) 2024-12-31 17:33:59 +01:00
Joëlle van Essen
de870db1f1 Fix build error (#314) v4 2024-12-20 18:33:05 +01:00
Andrzej Janik
7ac67a89e9 Enable Geekbench 5 (#304) 2024-12-10 21:48:10 +01:00
Andrzej Janik
7a6df9dcbf Fix host code and update to CUDA 12.4 (#299) 2024-12-02 00:29:57 +01:00
Rayyan Ul Haq
870fed4bb6 Update README.md (#300) 2024-11-25 00:45:09 +01:00
Andrzej Janik
970ba5aa25 Fix linking of AMD device libraries (#296)
It's weird that it fails without `-mno-link-builtin-bitcode-postopt`. I've tested it only on ROCm 6.2, might be broken on older or newer ROCm
2024-11-02 16:07:44 +01:00
Andrzej Janik
b4cb3ade63 Recover from and report unknown instructions and directives (#295) 2024-11-02 15:57:57 +01:00
Andrzej Janik
3870a96592 Re-enable all failing PTX tests (#277)
Additionally remove unused compilation paths
2024-10-16 03:15:48 +02:00
F. St.
1a63ef62b7 Add note about submodules to README.md (#280) 2024-10-05 15:41:46 +02:00
Andrzej Janik
7b2ecdd725 Update README (#279) 2024-10-04 16:35:40 +02:00
Andrzej Janik
c92abba2bb Refactor compilation passes (#270)
The overarching goal is to refactor all passes so they are module-scoped and not function-scoped. Additionally, make improvements to the most egregiously buggy/unfit passes (so the code is ready for the next major features: linking, ftz handling) and continue adding more code to the LLVM backend
2024-09-23 16:33:46 +02:00
Andrzej Janik
46def3e7e0 Connect new parser to LLVM bitcode backend (#269)
This is very incomplete. Just enough code to emit LLVM bitcode and continue further development
2024-09-13 01:07:31 +02:00
Andrzej Janik
193eb29be8 PTX parser rewrite (#267)
Replaces traditional LALRPOP-based parser with winnow-based parser to handle out-of-order instruction modifiers. Generate instruction type and instruction visitor from a macro instead of writing by hand. Add separate compilation path using the new parser that only works in tests for now
2024-09-04 15:47:42 +02:00
Andrzej Janik
872054ae40 Fix linguist instructions 2024-08-07 13:29:03 +02:00
Andrzej Janik
90a1f77891 Update README 2024-08-06 16:32:23 +02:00
Andrzej Janik
164c172236 Clean up ZLUDA redirection helper 2022-02-04 14:14:51 +01:00
Andrzej Janik
2753d956df Overhaul DLL injection 2022-02-04 00:50:25 +01:00
Andrzej Janik
c869a0d611 Add tests for injecting into CLR process 2022-02-03 12:28:42 +01:00
Andrzej Janik
9923a36b76 Redo DLL injection 2022-02-01 23:57:36 +01:00
Andrzej Janik
89bc40618b Implement static typing for dynamically-loaded CUDA DLLs 2022-01-28 16:44:46 +01:00
Andrzej Janik
07aa1103aa Add OGL interop to cuda proc macros 2022-01-26 11:32:20 +01:00
Andrzej Janik
6f76c8b34c Fix crash when printing arrays 2022-01-08 18:44:59 +01:00
Andrzej Janik
2e56871643 Fix luid printing 2022-01-08 00:33:26 +01:00