Commits
:alembic: Benchmark nvTIFF CUDA GPU-based decoding
Run benchmarks reading the LZW-compressed GeoTIFF into CUDA GPU memory via DLPack, using cog3pio's CudaCogReader, which uses bindings to the nvTIFF library.

:truck: Perform host to device copy for CPU benchmarks
When the 'cuda' feature flag is enabled, copy the decoded bytes from host (CPU) to device (GPU) to allow a fair comparison with the nvTIFF benchmark, where the data resides in CUDA memory. Well, not exactly fair since nvTIFF is winning, but this is still needed.
Note that async-tiff's decoded byte length seems longer than expected, not sure why... Added some extra docs and links to the main README.md too.

:recycle: Collapse async-tiff tile decode into single flat_map_iter call
No need for separate `.flat_map` and `.map` calls. Apparently Bytes can be coerced into u8 directly. Still need to figure out whether there's a more efficient way to do multi-threaded decoding to raw bytes though.
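
The collapse described in the last commit can be sketched with plain `std` iterators. This is an illustration only, not the benchmark's code: `Tile`, `decode`, `decode_two_step` and `decode_collapsed` are hypothetical names, `Vec<u8>` stands in for the decoded `Bytes` buffer, and the sequential `flat_map` stands in for the parallel `flat_map_iter` adapter named in the commit title.

```rust
/// Hypothetical stand-in for one compressed tile (not an async-tiff type).
struct Tile {
    compressed: Vec<u8>,
}

/// Hypothetical decoder: "decompresses" by emitting every byte twice.
/// Stands in for a tile decode that returns an owned byte buffer.
fn decode(tile: &Tile) -> Vec<u8> {
    tile.compressed.iter().flat_map(|&b| [b, b]).collect()
}

/// Two-step version: decode each tile, then flatten the buffers
/// into bytes with a separate adapter.
fn decode_two_step(tiles: &[Tile]) -> Vec<u8> {
    tiles
        .iter()
        .map(decode)
        .flat_map(|buf| buf.into_iter())
        .collect()
}

/// Collapsed version: one `flat_map` whose closure returns the decoded
/// buffer, which is already iterable as `u8` values.
fn decode_collapsed(tiles: &[Tile]) -> Vec<u8> {
    tiles.iter().flat_map(decode).collect()
}
```

Both functions return the same flat byte vector for the same input tiles; the collapsed form simply relies on the decoded buffer being iterable byte by byte, so no extra adapter is needed.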