Update README for version 4 (#315)

2025-08-02 14:57:43 +03:00 · 2024-12-31 17:33:59 +01:00
parent de870db1f1
commit ecd61a8e2a
4 changed files with 47 additions and 81 deletions
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,61 +0,0 @@
-# Dependencies
-
-Development builds of ZLUDA require the following dependencies:
-
-* CMake
-* Python 3
-
-Additionally, the repository has to be cloned with Git submodules initialized. If you cloned the repo without initializing submodules, do this:
-```
-git submodule update --init --recursive
-```
-
-# Tests
-
-Tests should be executed with `--workspace` option to test non-default targets:
-```
-cargo test --workspace
-```
-
-# Debugging
-
-## Debugging CUDA applications
-
-When running an application with ZLUDA quite often you will run into subtle bugs or incompatibilities in the generated GPU code. The best way to debug an application's GPU CUDA code is to use ZLUDA dumper.
-
-Library `zluda_dump` can be injected into a CUDA application and produce a trace which, for every launched GPU function contains:
-* PTX source
-* Launch arguments (block size, grid size, shared memory size)
-* Dump of function arguments. Both after and before
-
-Example use with GeekBench:
-```
-set ZLUDA_DUMP_KERNEL=knn_match
-set ZLUDA_DUMP_DIR=C:\temp\zluda_dump
-"<ZLUDA_PATH>\zluda_with.exe" "<ZLUDA_PATH>\zluda_dump.dll" -- "geekbench_x86_64.exe" --compute CUDA
-```
-
-The example above, for every execution of GPU function `knn_match`, will save its details into the directory `C:\temp\zluda_dump`
-
-This dump can be replayed with `replay.py` script from `zluda_dump` source directory. Use it like this:
-```
-python replay.py "C:\temp\zluda_dump\geekbench_x86_64.exe"
-```
-You must copy (or symlink) ZLUDA `nvcuda.dll` into PyCUDA directory, so it will run using ZLUDA. Example output:
-```
-Intel(R) Graphics [0x3e92] [github.com/vosen/ZLUDA]
-C:\temp\zluda_dump\geekbench_x86_64.exe\4140_scale_pyramid
-C:\temp\zluda_dump\geekbench_x86_64.exe\4345_convolve_1d_vertical_grayscale
-    Skipping, launch block size (512) bigger than maximum block size (256)
-C:\temp\zluda_dump\geekbench_x86_64.exe\4480_scale_pyramid
-6: 
-Arrays are not equal
-
-Mismatched elements: 1200 / 19989588 (0.006%)
-Max absolute difference: 255
-Max relative difference: 255.
- x: array([  7,   6,   8, ..., 193, 195, 193], dtype=uint8)
- y: array([  7,   6,   8, ..., 193, 195, 193], dtype=uint8)
-```
-From this output one can observe that in kernel launch 4480, 6th argument to function `scale_pyramid` differs between what was executed on an NVIDIA GPU using CUDA and Intel GPU using ZLUDA.  
-__Important__: It's impossible to infer what was the type (and semantics) of argument passed to a GPU function. At our level it's a buffer of bytes and by default `replay.py` simply checks if two buffers are byte-equal. That means you will have a ton of false negatives when running  `replay.py`. You should override them for your particular case in `replay.py` - it already contains some overrides for GeekBench kernels
--- a/GeekBench_5_2_3.svg
+++ b/GeekBench_5_2_3.svg
--- a/README.md
+++ b/README.md
@@ -4,18 +4,23 @@

 ZLUDA is a drop-in replacement for CUDA on non-NVIDIA GPU. ZLUDA allows to run unmodified CUDA applications using non-NVIDIA GPUs with near-native performance.

+ZLUDA supports AMD Radeon RX 5000 series and newer GPUs (both desktop and integrated).
+
+![GeekBench 5.5.1 chart](geekbench.svg)
+
 ZLUDA is work in progress. Follow development here and say hi on [Discord](https://discord.gg/sg6BNzXuc7). For more details see the announcement: https://vosen.github.io/ZLUDA/blog/zludas-third-life/

-
 ## Usage
-**Warning**: ZLUDA is under heavy development (see news [here](https://vosen.github.io/ZLUDA/blog/zludas-third-life/)). Instructions below might not work.
+**Warning**: This version of ZLUDA is under heavy development (more [here](https://vosen.github.io/ZLUDA/blog/zludas-third-life/)) and right now only supports Geekbench. ZLUDA probably will not work with your application just yet.

 ### Windows
-You should have the most recent ROCm  installed.\
-Run your application like this:
-```
-<ZLUDA_DIRECTORY>\zluda_with.exe -- <APPLICATION> <APPLICATIONS_ARGUMENTS>
-```
+You should have a recent AMD GPU driver ("AMD Software: Adrenalin Edition") installed.\
+To run your application you should either:
+* (Recommended approach) Copy ZLUDA-provided `nvcuda.dll` and `nvml.dll` from `target\release` (if built from sources) or `zluda` (if you downloaded a zip package) into a path which your application uses to load CUDA. Paths vary from application to application, but usually it's the directory where the .exe file is located.
+* Use ZLUDA launcher like below. ZLUDA launcher is known to be buggy and incomplete:
+    ```
+    <ZLUDA_DIRECTORY>\zluda_with.exe -- <APPLICATION> <APPLICATIONS_ARGUMENTS>
+    ```

 ### Linux

@@ -24,33 +29,55 @@ Run your application like this:
 LD_LIBRARY_PATH=<ZLUDA_DIRECTORY> <APPLICATION> <APPLICATIONS_ARGUMENTS>
 ```

+where `<ZLUDA_DIRECTORY>` is the directory which contains ZLUDA-provided `libcuda.so`: `target/release` if you built from sources or `zluda` if you downloaded a prebuilt package.
+
 ### MacOS

 Not supported

 ## Building
-**Warning**: ZLUDA is under heavy development (see news [here](https://vosen.github.io/ZLUDA/blog/zludas-third-life/)). Instructions below might not work.

-_Note_: This repo has submodules. Make sure to recurse submodules when cloning this repo, e.g.: `git clone --recursive https://github.com/vosen/ZLUDA.git`
+### Dependencies

- You should have a relatively recent version of Rust installed, then you just do:
+* Git
+* CMake
+* Python 3
+* Rust compiler (recent version)
+* C++ compiler
+* (Optional, but recommended) [Ninja build system](https://ninja-build.org/)
+
+### Build steps
+
+* Git clone the repo (make sure to use `--recursive` option to fetch submodules):  
+`git clone --recursive https://github.com/vosen/ZLUDA.git`  
+* Enter freshly cloned `ZLUDA` directory and build with cargo (this takes a while):  
+`cargo build --release`

-```
-cargo build --release
-```
-in the main directory of the project.  
 ### Linux

-If you are building on Linux you must also symlink (or rename) the ZLUDA output binaries after ZLUDA build finishes:
+If you are building on Linux you must also symlink the ZLUDA output binaries after ZLUDA build finishes:
 ```
-ln -s libnvcuda.so target/release/libcuda.so
-ln -s libnvcuda.so target/release/libcuda.so.1
-ln -s libnvml.so target/release/libnvidia-ml.so
+cd target/release
+ln -s libnvcuda.so libcuda.so
+ln -s libnvcuda.so libcuda.so.1
+ln -s libnvml.so libnvidia-ml.so
+ln -s libnvml.so libnvidia-ml.so.1
 ```

 ## Contributing

-If you want to develop ZLUDA itself, read [CONTRIBUTING.md](CONTRIBUTING.md), it contains instructions how to set up dependencies and run tests
+The ZLUDA project has commercial backing and _does not_ accept donations.
+The ZLUDA project accepts pull requests and other non-monetary contributions.
+
+If you want to contribute a code fix or documentation update feel free to open a Pull Request.
+
+### Getting started
+
+There's no architecture document (yet). The two most important crates in ZLUDA are `ptx` (PTX compiler) and `zluda` (AMD GPU runtime). A good starting point for tinkering with the project is to run one of the `ptx` unit tests under a debugger and understand what it is doing. `cargo test -p ptx -- ::add_hip` is a simple test that adds two numbers.
+
+GitHub issues tagged with ["help wanted"](https://github.com/vosen/ZLUDA/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) are self-contained tasks. Their level of difficulty varies and they are not always good beginner tasks, but they are defined unambiguously.
+
+If you have questions feel free to ask on [#devtalk channel on Discord](https://discord.com/channels/1273316903783497778/1303329281409159270).


 ## License
--- a/geekbench.svg
+++ b/geekbench.svg