Update README for version 4 (#315)

This commit is contained in:
Andrzej Janik
2024-12-31 17:33:59 +01:00
committed by GitHub
parent de870db1f1
commit ecd61a8e2a
4 changed files with 47 additions and 81 deletions

View File

@ -1,61 +0,0 @@
# Dependencies
Development builds of ZLUDA requires following dependencies:
* CMake
* Python 3
Additionally the repository has to be cloned with Git submodules initalized. If you cloned the repo without initalizing submodules, do this:
```
git submodule update --init --recursive
```
# Tests
Tests should be executed with `--workspace` option to test non-default targets:
```
cargo test --workspace
```
# Debugging
## Debuggging CUDA applications
When running an application with ZLUDA quite often you will run into subtle bugs or incompatibilities in the generated GPU code. The best way to debug an application's GPU CUDA code is to use ZLUDA dumper.
Library `zluda_dump` can be injected into a CUDA application and produce a trace which, for every launched GPU function contains:
* PTX source
* Launch arguments (block size, grid size, shared memory size)
* Dump of function arguments. Both after and before
Example use with GeekBench:
```
set ZLUDA_DUMP_KERNEL=knn_match
set ZLUDA_DUMP_DIR=C:\temp\zluda_dump
"<ZLUDA_PATH>\zluda_with.exe" "<ZLUDA_PATH>\zluda_dump.dll" -- "geekbench_x86_64.exe" --compute CUDA
```
The example above, for every execution of GPU function `knn_match`, will save its details into the directory `C:\temp\zluda_dump`
This dump can be replayed with `replay.py` script from `zluda_dump` source directory. Use it like this:
```
python replay.py "C:\temp\zluda_dump\geekbench_x86_64.exe"
```
You must copy (or symlink) ZLUDA `nvcuda.dll` into PyCUDA directory, so it will run using ZLUDA. Example output:
```
Intel(R) Graphics [0x3e92] [github.com/vosen/ZLUDA]
C:\temp\zluda_dump\geekbench_x86_64.exe\4140_scale_pyramid
C:\temp\zluda_dump\geekbench_x86_64.exe\4345_convolve_1d_vertical_grayscale
Skipping, launch block size (512) bigger than maximum block size (256)
C:\temp\zluda_dump\geekbench_x86_64.exe\4480_scale_pyramid
6:
Arrays are not equal
Mismatched elements: 1200 / 19989588 (0.006%)
Max absolute difference: 255
Max relative difference: 255.
x: array([ 7, 6, 8, ..., 193, 195, 193], dtype=uint8)
y: array([ 7, 6, 8, ..., 193, 195, 193], dtype=uint8)
```
From this output one can observe that in kernel launch 4480, 6th argument to function `scale_pyramid` differs between what was executed on an NVIDIA GPU using CUDA and Intel GPU using ZLUDA.
__Important__: It's impossible to infer what was the type (and semantics) of argument passed to a GPU function. At our level it's a buffer of bytes and by default `replay.py` simply checks if two buffers are byte-equal. That means you will have a ton of false negatives when running `replay.py`. You should override them for your particular case in `replay.py` - it already contains some overrides for GeekBench kernels

File diff suppressed because one or more lines are too long

Before

Width:  |  Height:  |  Size: 259 KiB

View File

@ -4,18 +4,23 @@
ZLUDA is a drop-in replacement for CUDA on non-NVIDIA GPU. ZLUDA allows to run unmodified CUDA applications using non-NVIDIA GPUs with near-native performance.
ZLUDA supports AMD Radeon RX 5000 series and newer GPUs (both desktop and integrated).
![GeekBench 5.5.1 chart](geekbench.svg)
ZLUDA is work in progress. Follow development here and say hi on [Discord](https://discord.gg/sg6BNzXuc7). For more details see the announcement: https://vosen.github.io/ZLUDA/blog/zludas-third-life/
## Usage
**Warning**: ZLUDA is under heavy development (see news [here](https://vosen.github.io/ZLUDA/blog/zludas-third-life/)). Instructions below might not work.
**Warning**: This version ZLUDA is under heavy development (more [here](https://vosen.github.io/ZLUDA/blog/zludas-third-life/)) and right now only supports Geekbench. ZLUDA probably will not work with your application just yet.
### Windows
You should have the most recent ROCm installed.\
Run your application like this:
```
<ZLUDA_DIRECTORY>\zluda_with.exe -- <APPLICATION> <APPLICATIONS_ARGUMENTS>
```
You should have recent AMD GPU driver ("AMD Software: Adrenalin Edition") installed.\
To run your application you should etiher:
* (Recommended approach) Copy ZLUDA-provided `nvcuda.dll` and `nvml.dll` from `target\release` (if built from sources) or `zluda` (if downloaded a zip package) into a path which your application uses to load CUDA. Paths vary application to application, but usually it's the directory where the .exe file is located
* Use ZLUDA launcher like below. ZLUDA launcher is known to be buggy and incomplete:
```
<ZLUDA_DIRECTORY>\zluda_with.exe -- <APPLICATION> <APPLICATIONS_ARGUMENTS>
```
### Linux
@ -24,33 +29,55 @@ Run your application like this:
LD_LIBRARY_PATH=<ZLUDA_DIRECTORY> <APPLICATION> <APPLICATIONS_ARGUMENTS>
```
where `<ZLUDA_DIRECTORY>` is the directory which contains ZLUDA-provided `libcuda.so`: `target/release` if you built from sources or `zluda` if you downloaded prebuilt package.
### MacOS
Not supported
## Building
**Warning**: ZLUDA is under heavy development (see news [here](https://vosen.github.io/ZLUDA/blog/zludas-third-life/)). Instructions below might not work.
_Note_: This repo has submodules. Make sure to recurse submodules when cloning this repo, e.g.: `git clone --recursive https://github.com/vosen/ZLUDA.git`
### Dependencies
You should have a relatively recent version of Rust installed, then you just do:
* Git
* CMake
* Python 3
* Rust compiler (recent version)
* C++ compiler
* (Optional, but recommended) [Ninja build system](https://ninja-build.org/)
### Build steps
* Git clone the repo (make sure to use `--recursive` option to fetch submodules):
`git clone --recursive https://github.com/vosen/ZLUDA.git`
* Enter freshly cloned `ZLUDA` directory and build with cargo (this takes a while):
`cargo build --release`
```
cargo build --release
```
in the main directory of the project.
### Linux
If you are building on Linux you must also symlink (or rename) the ZLUDA output binaries after ZLUDA build finishes:
If you are building on Linux you must also symlink the ZLUDA output binaries after ZLUDA build finishes:
```
ln -s libnvcuda.so target/release/libcuda.so
ln -s libnvcuda.so target/release/libcuda.so.1
ln -s libnvml.so target/release/libnvidia-ml.so
cd target/release
ln -s libnvcuda.so libcuda.so
ln -s libnvcuda.so libcuda.so.1
ln -s libnvml.so libnvidia-ml.so
ln -s libnvml.so libnvidia-ml.so.1
```
## Contributing
If you want to develop ZLUDA itself, read [CONTRIBUTING.md](CONTRIBUTING.md), it contains instructions how to set up dependencies and run tests
ZLUDA project has a commercial backing and _does not_ accept donations.
ZLUDA project accepts pull requests and other non-monetary contributions.
If you want to contribute a code fix or documentation update feel free to open a Pull Request.
### Getting started
There's no architecture document (yet). Two most important crates in ZLUDA are `ptx` (PTX compiler) and `zluda` (AMD GPU runtime). A good starting point to tinkering the project is to run one of the `ptx` unit tests under a debugger and understand what it is doing. `cargo test -p ptx -- ::add_hip` is a simple test that adds two numbers.
Github issues tagged with ["help wanted"](https://github.com/vosen/ZLUDA/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) are tasks that are self-containted. Their level of difficulty varies, they are not always good beginner tasks, but they defined unambiguously.
If you have questions feel free to ask on [#devtalk channel on Discord](https://discord.com/channels/1273316903783497778/1303329281409159270).
## License

1
geekbench.svg Normal file

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 287 KiB