Violet
dc69808e54
Add support for shfl.sync.MODE.b32 (#409)
2025-07-16 17:23:11 -07:00
Violet
5cb0a9b8e8
Add support for bar.red.and.pred (#402)
Implements bar.red.and.pred and bar.red.or.pred using the undocumented __ockl_wgred functions. Doesn't yet add support for numbered barriers and threadcount, as these are not needed for llm.c.
2025-07-03 11:56:20 -07:00
Andrzej Janik
7ac67a89e9
Enable Geekbench 5 (#304)
2024-12-10 21:48:10 +01:00
Andrzej Janik
7a6df9dcbf
Fix host code and update to CUDA 12.4 (#299)
2024-12-02 00:29:57 +01:00
Andrzej Janik
970ba5aa25
Fix linking of AMD device libraries (#296)
It's weird that it fails without `-mno-link-builtin-bitcode-postopt`. I've tested it only on ROCm 6.2; it might be broken on older or newer ROCm versions.
2024-11-02 16:07:44 +01:00
Andrzej Janik
3870a96592
Re-enable all failing PTX tests (#277)
Additionally remove unused compilation paths
2024-10-16 03:15:48 +02:00
Andrzej Janik
04a411fe22
Have an implementation for vprintf
2021-09-18 20:22:47 +00:00
Andrzej Janik
d5a4b068dd
Redo handling of sregs
2021-09-17 20:53:44 +00:00
Andrzej Janik
6ef19d6501
Add early support for more sregs
2021-09-17 18:31:12 +00:00
Andrzej Janik
5b2352723f
Implement function pointers and activemask
2021-09-17 16:24:25 +00:00
Andrzej Janik
18245be7d5
Make ptx unit tests run on AMD (except denormals)
2021-09-07 23:24:49 +00:00
Andrzej Janik
638786b0ec
Hack enough functionality that AMD GPU code builds
2021-08-03 00:22:47 +02:00
Andrzej Janik
ad2059872a
Regenerate SPIR-V for ptx_impl and fix weird handling of ptr-ptr add or sub
2021-07-03 02:13:38 +02:00
Andrzej Janik
e328ecc550
Be more correct when emitting brev, refactor inst->func call pass
2021-07-02 22:45:09 +02:00
Andrzej Janik
17291019e3
Implement atomic float add
2021-03-03 22:41:47 +01:00
Andrzej Janik
178ec59af6
Implement bfi instruction
2021-03-01 23:01:53 +01:00
Andrzej Janik
bcd1740ba9
Add README and rebuild .spv library
2020-11-23 21:50:21 +01:00
Andrzej Janik
eb7c9aeeee
Rename everything
2020-11-23 20:01:10 +01:00
Andrzej Janik
6e39c4a90c
Fix linking with shl/shr, add memset on host and support __assertfail
2020-11-21 01:53:07 +01:00
Andrzej Janik
ac6265f257
Implement instructions bfe, rem, xor
2020-11-06 00:56:45 +01:00
Andrzej Janik
a82eb20817
Implement atomic instructions
2020-10-31 21:28:15 +01:00