[RISCV] Account for factor in interleave memory op costs #111511

Open
wants to merge 1 commit into
base: main

Commits on Oct 8, 2024

  1. [RISCV] Account for factor in interleave memory op costs

    Currently we cost an interleaved memory op as if it were a load/store of the widened vector type.
    
    However, this doesn't take into account that we'll most likely need to perform at least Factor uops, because we're reading from or writing to Factor registers.
    
    E.g. Today an i8 VF=2 Factor=8 interleave is costed as a single LMUL=1 op with +zvl128b, because the widened type is <16 x i8>.
    
    This changes it to be calculated as <2 x i8> * Factor=8, i.e. 8 LMUL=1 ops.
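
The arithmetic behind the example above can be sketched as follows. This is a simplified model and not the actual RISCVTTIImpl code: `numRegOps`, `oldInterleaveCost` and `newInterleaveCost` are hypothetical helpers. It assumes the cost is just the number of LMUL=1 registers touched, with VLEN=128 as implied by +zvl128b.

```cpp
#include <cassert>

// Hypothetical helper (not an LLVM API): number of LMUL=1 register-sized ops
// needed for a fixed vector of NumElts elements of EltBits bits each,
// assuming VLen bits per vector register (128 for +zvl128b).
constexpr unsigned numRegOps(unsigned NumElts, unsigned EltBits,
                             unsigned VLen = 128) {
  unsigned Bits = NumElts * EltBits;
  unsigned Regs = (Bits + VLen - 1) / VLen; // round up to whole registers
  return Regs == 0 ? 1 : Regs;
}

// Old model: one load/store of the widened type <VF*Factor x iN>.
constexpr unsigned oldInterleaveCost(unsigned VF, unsigned Factor,
                                     unsigned EltBits) {
  return numRegOps(VF * Factor, EltBits);
}

// New model: Factor ops, each on the per-member type <VF x iN>.
constexpr unsigned newInterleaveCost(unsigned VF, unsigned Factor,
                                     unsigned EltBits) {
  return Factor * numRegOps(VF, EltBits);
}

// i8, VF=2, Factor=8: <16 x i8> fits one register, but 8 registers are touched.
static_assert(oldInterleaveCost(2, 8, 8) == 1, "widened type is one LMUL=1 op");
static_assert(newInterleaveCost(2, 8, 8) == 8, "Factor=8 LMUL=1 ops");
```

In this model the two calculations agree once each member vector fills a whole register (e.g. i8, VF=16, Factor=2 gives 2 either way), so the change mainly affects small VFs with large factors.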
    
    Thankfully the FIXME about illegal vectors seems to have been fixed in llvm#100436, and even then I think the LT.first should have been multiplied, not added.
    
    Note we still have a quirk where the loop vectorizer will happily emit interleaved accesses for what could be strided accesses, because the costs are break-even in LoopVectorizationCostModel::setCostBasedWideningDecision:
    
    	void f(int8_t* a, int n) {
    	    for (int i = 0; i < n; i++) {
    	        a[i * 2] += 1;
    	    }
    	}
    
    	vsetvli	t1, zero, e8, m2, ta, ma
    	vlseg2e8.v	v24, (t0)
    	vadd.vi	v24, v24, 1
    	vsse8.v	v24, (a6), a5
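
As a rough illustration of why a strided access would suffice here, the sketch below runs the same loop on a zeroed buffer and counts which bytes change. `countModified` is a hypothetical helper added only for this illustration. Only even indices are written, so the second field fetched by the segment load is never used.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// The loop from the example above.
void f(int8_t *a, int n) {
  for (int i = 0; i < n; i++) {
    a[i * 2] += 1;
  }
}

// Hypothetical helper: run f on a zeroed buffer of 2*n bytes and count how
// many bytes changed, asserting that every changed byte has an even index.
int countModified(int n) {
  std::vector<int8_t> buf(2 * n, 0);
  f(buf.data(), n);
  int changed = 0;
  for (int i = 0; i < 2 * n; i++) {
    if (buf[i] != 0) {
      assert(i % 2 == 0); // odd elements are never touched
      changed++;
    }
  }
  return changed;
}
```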
    
    I think we may need to either adjust the cost or add a hook to get the loop vectorizer to s
    lukel97 committed Oct 8, 2024
    fba93ab