6 Commits

Author SHA1 Message Date
Violet
27cfd50ddd Implement nanosleep.u32 (#421) 2025-07-21 17:42:04 -07:00
Violet
dc69808e54 Add support for shfl.sync.MODE.b32 (#409) 2025-07-16 17:23:11 -07:00
Andrzej Janik
36f0ba9cbb Apply rounding mode in fp div (#416) 2025-07-16 17:22:59 -07:00
Violet
5cb0a9b8e8 Add support for bar.red.and.pred (#402)
Implements bar.red.and.pred and bar.red.or.pred, using the undocumented __ockl_wgred functions. Doesn't yet add support for numbered barriers and thread counts, as these are not needed for llm.c.
2025-07-03 11:56:20 -07:00
Andrzej Janik
2a374ad880 Add fp saturation, fix various bugs in cvt instruction exposed by ptx_tests (#379) 2025-06-16 19:14:16 -07:00
Andrzej Janik
d704e92c97 Support instruction modes (denormal and rounding) on AMD GPUs (#342) 2025-03-17 21:37:26 +01:00