Commits
:bug: Better handle CudaCogReader import logic
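Such an import gate might look roughly like this hedged Python sketch (the module layout and helper function are assumptions for illustration, not the actual implementation):

```python
# Hedged sketch: CudaCogReader may be missing on platforms where the wheel
# was built without the cuda feature, so expose it conditionally instead of
# failing at import time.
try:
    from cog3pio import CudaCogReader  # only present in cuda-enabled builds
except ImportError:
    CudaCogReader = None


def get_cuda_reader(path):
    # Hypothetical helper: fail with a clear message when unavailable.
    if CudaCogReader is None:
        raise ImportError("CudaCogReader requires a cuda-enabled build")
    return CudaCogReader(path)
```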
CudaCogReader might not be available on some platforms, so hide it behind some feature gates.

:alembic: Debug cupy's cuda stream handling
For some reason, calling CudaCogReader twice makes things work, i.e. the returned cupy.ndarray has the correct numbers. Thinking it might be some CUDA stream issue (https://docs.cupy.dev/en/v13.6.0/user_guide/basic.html#current-stream), but cupy should already be using the default null stream.
Putting some print() and dbg!() statements here and there. Bumped cupy-cuda12x to cupy-cuda13x.

:twisted_rightwards_arrows: Merge branch 'main' into dlpack_to_cupy

:construction_worker: Install nvTIFF on Linux CI for python wheel builds
Install nvTIFF binaries from the NVIDIA repos following the instructions at https://developer.nvidia.com/nvtiff-0-5-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=24.04&target_type=deb_network. Could have tried to get it from PyPI following https://docs.nvidia.com/cuda/nvtiff/installation.html#pypi, but then would need to figure out the library paths.

:wrench: Modify cuda repo for aarch64
Xref https://developer.nvidia.com/nvtiff-0-5-0-download-archive?target_os=Linux&target_arch=arm64-sbsa&Compilation=Native&Distribution=Ubuntu&target_version=24.04&target_type=deb_network

:green_heart: Install clang-devel to compile nvtiff-sys
Fix `Unable to find libclang: "couldn't find any valid shared libraries matching: ['libclang.so', 'libclang-*.so', 'libclang.so.*', 'libclang-*.so.*'], set the LIBCLANG_PATH environment variable to a path where one of these files can be found (invalid: [])"`. Need to install this inside the manylinux_2_28 docker container.

:beers: Install nvTIFF and clang-dev with either dnf or apt
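The dnf-or-apt branching might look roughly like this Python sketch of the detection step (a hedged illustration — the actual CI step is a shell script, and the package names here are assumptions):

```python
import shutil

# Hedged sketch: detect whether the manylinux_2_28 container is AlmaLinux
# (dnf) or Ubuntu (apt) based, mirroring the dnf-or-apt install logic.
def detect_package_manager() -> str:
    if shutil.which("dnf"):
        return "dnf"
    if shutil.which("apt-get"):
        return "apt"
    return "unknown"


manager = detect_package_manager()
install_cmd = {
    "dnf": ["dnf", "install", "-y", "nvtiff", "clang-devel"],  # package names assumed
    "apt": ["apt-get", "install", "-y", "nvtiff", "clang-dev"],  # package names assumed
    "unknown": [],
}[manager]
```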
Depending on which manylinux_2_28 docker image is pulled for each target arch, the underlying distribution could either be AlmaLinux or Ubuntu based, so need to handle either way of installing nvTIFF and clang-dev.

:adhesive_bandage: Install cuda deps and patch nvtiff.h file
Fix nvtiff-sys compilation errors by installing missing CUDA runtime dependencies (cuda-crt and cuda-cudart-devel) and patching the nvtiff.h file following https://docs.rs/nvtiff-sys/0.1.2/nvtiff_sys/#instructions

:bug: Patch to build wheels with cuda and pyo3 feature flags
The default `pyo3` flag set in pyproject.toml is overridden when passing the `--features` flag to maturin build, so need to set `cuda,pyo3` instead. Also copy code from e75d17164669c37f43c94dadad213d53520eefb1 to the free-threaded build section.

:necktie: Only build wheels with cuda flag on linux-x86_64 and aarch64
Too tricky to get nvTIFF working on armv7, s390x and ppc64le due to linker errors like `/usr/armv7-unknown-linux-gnueabihf/lib/gcc/armv7-unknown-linux-gnueabihf/7.5.0/../../../../armv7-unknown-linux-gnueabihf/bin/ld: cannot find -lnvtiff`, so disabling the cuda flag on those platforms.

:memo: Add CudaCogReader class to API docs
Need to include the 'cuda' feature flag for maturin on Read the Docs, and get libnvtiff-dev from conda-forge. Added a warning to the docstring indicating that CudaCogReader is experimental, and only available on linux-x86_64 and linux-aarch64 builds.

:beers: Pass include dir to LD_LIBRARY_PATH and BINDGEN_EXTRA_CLANG_ARGS
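Conceptually, the environment wiring amounts to something like this (a hedged Python sketch of the variables and paths named in the commit message; the real setup exports shell variables, and the target directory is architecture-specific):

```python
import os

# Hedged sketch: assemble the env vars described in this commit, assuming a
# conda environment. The x86_64-linux target dir would differ on aarch64.
conda_prefix = os.environ.get("CONDA_PREFIX", "/opt/conda")
env = {
    # So the linker can find libnvtiff at build time
    "LD_LIBRARY_PATH": f"{conda_prefix}/lib",
    # So bindgen/clang can find nvtiff.h, cuda_runtime.h and crt/host_config.h
    "BINDGEN_EXTRA_CLANG_ARGS": (
        f"-I{conda_prefix}/include "
        f"-I{conda_prefix}/targets/x86_64-linux/include"
    ),
}
```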
Point to where the header files are located: nvtiff.h is in $CONDA_PREFIX/include; cuda_runtime.h and crt/host_config.h are in $CONDA_PREFIX/targets/x86_64-linux/include.

:rotating_light: Use pyclass(unsendable) instead of deriving Send/Sync
Not sure if the raw pointer in CudaCogReader is thread-safe enough to do `unsafe impl Send/Sync`, so using unsendable instead for now. Xref https://pyo3.rs/v0.27.1/migration.html#pyclass-structs-must-now-be-send-or-unsendable

:sparkles: Support stream and max_version kwargs to __dlpack__ method
Check that the stream and max_version arguments are valid. Currently only supporting stream=1 or None, and DLPack version 1.x (dlpark uses DLPack 1.1). Added docstrings for these parameters. Not implementing the copy kwarg yet though.
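The argument checks described here might look roughly like the following (a hedged Python sketch — the actual implementation lives in Rust behind pyo3, and the error types and messages are assumptions):

```python
def validate_dlpack_kwargs(stream=None, max_version=None):
    # Hedged sketch of the checks described above: only stream=None or
    # stream=1 (CUDA's legacy default stream) is accepted, and only DLPack
    # major version 1 is supported (dlpark implements DLPack 1.1).
    if stream not in (None, 1):
        raise ValueError(f"Unsupported stream: {stream!r}, only None or 1 allowed")
    if max_version is not None and max_version[0] < 1:
        raise BufferError(f"Cannot export DLPack capsule for max_version {max_version}")


# Accepted combinations
validate_dlpack_kwargs(stream=1, max_version=(1, 1))
validate_dlpack_kwargs()
```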