Commits
Fast performing edges for FP32 GEMM of RVV. 2 months ago by ChipKerchner
Add bool types for C. 2 months ago by ChipKerchner
Add K-unrolling to M = 8. Other small changes. 2 months ago by ChipKerchner
Unroll K for N less than or equal to 4. 2 months ago by ChipKerchner
Common unroll code. 2 months ago by ChipKerchner
Preserve K. 2 months ago by ChipKerchner
Better K. 2 months ago by ChipKerchner
Global optimizations. 2 months ago by ChipKerchner
Use mf2 instead of m1. 2 months ago by ChipKerchner
Simplier loops. 2 months ago by ChipKerchner
More global optimzation and clean up. 2 months ago by ChipKerchner
Merge remote-tracking branch 'origin/develop' into fasterRVVEdges 2 months ago by ChipKerchner
Avoid greater than 4 segment load and store penalties by using 2. Fix mf2 length. 2 months ago by ChipKerchner
Only initialize unused variables to prevent GCC warnings. 2 months ago by ChipKerchner
Fix typo. 2 months ago by ChipKerchner
Fix another typo. 1 month ago by ChipKerchner
Convert 2X LMUL1 instructions to 1X LMUL2. Improved FP64 GEMM edges - up to more than 3X faster. 1 month ago by ChipKerchner
Remove shadow variable. 1 month ago by ChipKerchner
Use LMUL2 loads in main block. 1 month ago by ChipKerchner
Use LMUL2 for calculations in main block - just break them apart before last stage. 1 month ago by ChipKerchner
Forgot files from previous check-in. 1 month ago by ChipKerchner