Andrzej Janik
2a374ad880
Add fp saturation, fix various bugs in cvt instruction exposed by ptx_tests ( #379 )
2025-06-16 19:14:16 -07:00
Violet
4d4053194a
Implement runtime_callback_hooks_fn6 ( #386 )
2025-06-16 17:00:47 -07:00
Violet
9c5f1ed9fb
Handle new attributes in cuDeviceGetAttribute ( #383 )
2025-06-16 13:20:04 -07:00
Andrzej Janik
f179868b8e
Add automated builds ( #358 )
2025-06-16 09:53:18 -07:00
Violet
9773d20945
Implement cudart_interface_fn2 ( #382 )
2025-06-13 14:01:14 -07:00
Violet
1715830d82
Implement cuModuleGetLoadingMode ( #381 )
2025-06-11 15:54:48 -07:00
Violet
25a9d1c40e
Implement runtime_callback_hooks_fn2 ( #380 )
2025-06-11 15:15:43 -07:00
Violet
62f3e63355
Implement cuGetProcAddress and cuGetProcAddress_v2 ( #377 )
2025-06-10 16:07:35 -07:00
Andrzej Janik
3361046760
Fix mad.wide, replace external CUDA library in test with our own ( #376 )
2025-06-09 21:33:18 -07:00
Andrzej Janik
c790ab45ec
Redo logging to better log dark API and performance libraries ( #372 )
2025-06-09 15:29:14 -07:00
Andrzej Janik
5935cfec78
Work around broken AMD Adrenalin 25.5.1 driver ( #366 )
...
For reasons unknown, AMD Adrenalin 25.5.1 ships with a comgr that presents itself as version 2 but expects the ABI for version 3. Add a workaround.
2025-05-13 02:20:23 +02:00
Andrzej Janik
3d3e38aadc
Fix ROCm 6.4 failures ( #364 )
...
Lazy load comgr and dispatch to different code paths based on the name of the comgr .dll/.so
2025-05-02 00:38:22 +02:00
Andrzej Janik
cc83b9f1f6
Create infrastructure for performance libraries ( #363 )
2025-05-01 22:37:18 +02:00
Andrzej Janik
adc4673a20
Explicitly fail compilation on ROCm 6.4 ( #361 )
...
AMD broke comgr ABI in 6.4. This is a temporary solution.
2025-04-20 17:02:05 +02:00
Joëlle van Essen
7cdab7abc2
Implement mul24 ( #351 )
2025-04-08 12:27:19 +02:00
Andrzej Janik
d704e92c97
Support instruction modes (denormal and rounding) on AMD GPUs ( #342 )
2025-03-17 21:37:26 +01:00
Joëlle van Essen
867e4728d5
LLVM unit tests ( #324 )
...
* LLVM unit tests: add assembly files
* LLVM unit tests: first attempt
* LLVM unit tests: fix - parse bitcode in context
* LLVM unit tests: use pretty_assertions for line-by-line diff
* LLVM unit tests: Write IR to file for failed test
* LLVM unit tests: just use the stack
* LLVM unit tests: use MaybeUninit
* LLVM unit tests: add mul24.ll
* LLVM unit tests: Adjustments after review
* LLVM unit tests: Include emit_llvm::Context in emit_llvm::Module
* LLVM unit tests: Fix typo
* LLVM unit tests: Context need not be pub
2025-02-19 21:21:20 +01:00
Andrzej Janik
646d746e02
Start working on mul24
2025-02-07 19:37:11 +00:00
Andrzej Janik
df5a96d935
Improve build system ( #329 )
...
Also fix Dockerfile and Windows build
2025-01-28 01:55:36 +01:00
Alexander Zaitsev
9c0747a5f7
fix: missing inherits in a release-lto profile ( #319 )
2025-01-03 16:58:19 +01:00
Alexander Zaitsev
fee20e54d9
feat: enable LTO and codegen-units = 1 optimization ( #318 )
2025-01-02 19:07:39 +01:00
Joëlle van Essen
7399132d5d
Fix test in zluda_dump ( #316 )
2025-01-01 23:02:59 +01:00
Andrzej Janik
ecd61a8e2a
Update README for version 4 ( #315 )
2024-12-31 17:33:59 +01:00
Joëlle van Essen
de870db1f1
Fix build error ( #314 )
v4
2024-12-20 18:33:05 +01:00
Andrzej Janik
7ac67a89e9
Enable Geekbench 5 ( #304 )
2024-12-10 21:48:10 +01:00
Andrzej Janik
7a6df9dcbf
Fix host code and update to CUDA 12.4 ( #299 )
2024-12-02 00:29:57 +01:00
Rayyan Ul Haq
870fed4bb6
Update README.md ( #300 )
2024-11-25 00:45:09 +01:00
Andrzej Janik
970ba5aa25
Fix linking of AMD device libraries ( #296 )
...
It's weird that it fails without `-mno-link-builtin-bitcode-postopt`. I've tested it only on ROCm 6.2; it might be broken on older or newer ROCm
2024-11-02 16:07:44 +01:00
Andrzej Janik
b4cb3ade63
Recover from and report unknown instructions and directives ( #295 )
2024-11-02 15:57:57 +01:00
Andrzej Janik
3870a96592
Re-enable all failing PTX tests ( #277 )
...
Additionally remove unused compilation paths
2024-10-16 03:15:48 +02:00
F. St.
1a63ef62b7
Add note about submodules to README.md ( #280 )
2024-10-05 15:41:46 +02:00
Andrzej Janik
7b2ecdd725
Update README ( #279 )
2024-10-04 16:35:40 +02:00
Andrzej Janik
c92abba2bb
Refactor compilation passes ( #270 )
...
The overarching goal is to refactor all passes so they are module-scoped and not function-scoped. Additionally, make improvements to the most egregiously buggy/unfit passes (so the code is ready for the next major features: linking, ftz handling) and continue adding more code to the LLVM backend
2024-09-23 16:33:46 +02:00
Andrzej Janik
46def3e7e0
Connect new parser to LLVM bitcode backend ( #269 )
...
This is very incomplete. Just enough code to emit LLVM bitcode and continue further development
2024-09-13 01:07:31 +02:00
Andrzej Janik
193eb29be8
PTX parser rewrite ( #267 )
...
Replaces the traditional LALRPOP-based parser with a winnow-based parser to handle out-of-order instruction modifiers. Generate instruction type and instruction visitor from a macro instead of writing them by hand. Add a separate compilation path using the new parser that only works in tests for now
2024-09-04 15:47:42 +02:00
Andrzej Janik
872054ae40
Fix linguist instructions
2024-08-07 13:29:03 +02:00
Andrzej Janik
90a1f77891
Update README
2024-08-06 16:32:23 +02:00
Andrzej Janik
164c172236
Clean up ZLUDA redirection helper
2022-02-04 14:14:51 +01:00
Andrzej Janik
2753d956df
Overhaul DLL injection
2022-02-04 00:50:25 +01:00
Andrzej Janik
c869a0d611
Add tests for injecting into CLR process
2022-02-03 12:28:42 +01:00
Andrzej Janik
9923a36b76
Redo DLL injection
2022-02-01 23:57:36 +01:00
Andrzej Janik
89bc40618b
Implement static typing for dynamically-loaded CUDA DLLs
2022-01-28 16:44:46 +01:00
Andrzej Janik
07aa1103aa
Add OGL interop to cuda proc macros
2022-01-26 11:32:20 +01:00
Andrzej Janik
6f76c8b34c
Fix crash when printing arrays
2022-01-08 18:44:59 +01:00
Andrzej Janik
2e56871643
Fix luid printing
2022-01-08 00:33:26 +01:00
Andrzej Janik
869efbe0e2
Move zluda_dump to the new CUDA infrastructure
2022-01-07 04:20:33 +01:00
Andrzej Janik
9390db962b
Start converting everything to the new log formatting
2021-12-20 09:35:13 +01:00
Andrzej Janik
bdcef897cc
Start converting zluda_dump logging to provide more detailed
2021-12-19 01:18:03 +01:00
Andrzej Janik
971951bc9e
Improve reporting of recovered unrecognized statement/directive
2021-12-14 00:02:23 +01:00
Andrzej Janik
0ca14d740f
Better reporting of unrecognized tokens
2021-12-13 22:25:26 +01:00