Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verifying C written with C11 features using SAW #2099

Open
pennyannn opened this issue Aug 22, 2024 · 8 comments
Open

Verifying C written with C11 features using SAW #2099

pennyannn opened this issue Aug 22, 2024 · 8 comments

Comments

@pennyannn
Copy link
Contributor

When migrating AWS-LC to C11, all of our existing proofs failed with the following error at LLVM module loading step:

Mismatched value types:
cmp value: ValIdent (Ident "4")
new value: ValIdent (Ident "7")
cmp type:  PrimType (Integer 32)
new type:  PrimType (Integer 32)
from:
	FUNC_CODE_INST_CMPXCHG
	@CRYPTO_refcount_inc
	FUNCTION_BLOCK
	FUNCTION_BLOCK_ID
	value symbol table
	MODULE_BLOCK
	Bitstream

The function CRYPTO_refcount_inc is defined at https://github.com/aws/aws-lc/blob/main/crypto/refcount_lock.c#L29 for C99 and defined at https://github.com/aws/aws-lc/blob/main/crypto/refcount_c11.c#L37 for C11. Note that the C11 version is using the new atomic feature. I'm wondering if this means that SAW does not support new features in C11, therefore the LLVM loader couldn't understand the syntax. Any recommendations for how to get around this problem?

@RyanGlScott
Copy link
Contributor

What versions of SAW and LLVM are you using? I ask since this looks a lot like the bug reported in GaloisInc/llvm-pretty-bc-parser#250, where the LLVM bitcode parser's treatment of the the cmpxchg instruction was incorrect in the presence of opaque pointers (which are enabled by default in LLVM 15 or later). This bug has since been fixed in GaloisInc/llvm-pretty-bc-parser#251 and is included as part of the SAW 1.1 release.

Generally speaking, SAW should support most C11 features, provided that they compile down to LLVM instructions that SAW knows how to reason about. That's usually the case.

@RyanGlScott
Copy link
Contributor

By the way, here is a test case (distilled from GaloisInc/llvm-pretty-bc-parser#251) that you can use to easily tell whether your version of SAW is affected by GaloisInc/llvm-pretty-bc-parser#250 or not:

// test.c
#include <stdint.h>
#include <stdatomic.h>

atomic_int val = 0x928;

int do_atomic_update(atomic_int newval) {
    int old_val = 0x928;
    return atomic_compare_exchange_weak(&val, &old_val, newval);
}
// test.saw
let do_atomic_update_spec = do {
  llvm_alloc_global "val";
  llvm_points_to (llvm_global "val") (llvm_term {{ zext 0x928 : [32] }});
  newval <- llvm_fresh_var "newval" (llvm_int 32);

  llvm_execute_func [llvm_term newval];

  llvm_return (llvm_term {{ 1 : [32] }});
};

m <- llvm_load_module "test.bc";
llvm_verify m "do_atomic_update" [] false do_atomic_update_spec z3;

With SAW 1.1, this test case passes regardless of whether you use a version of LLVM does or does not require opaque pointers:

# LLVM 12 does not use opaque pointers
$ ~/Software/clang+llvm-12.0.0/bin/clang test.c -emit-llvm -g -c
$ ~/Software/saw-1.1/bin/saw test.saw 
[09:21:46.597] Loading file "/home/ryanscott/Documents/Hacking/SAW/test.saw"
[09:21:46.657] Verifying do_atomic_update ...
[09:21:46.658] Simulating do_atomic_update ...
[09:21:46.660] Checking proof obligations do_atomic_update ...
[09:21:46.660] Proof succeeded! do_atomic_update

# LLVM 16 does use opaque pointers
$ ~/Software/clang+llvm-16.0.4/bin/clang test.c -emit-llvm -g -c
$ ~/Software/saw-1.1/bin/saw test.saw 
[09:21:52.497] Loading file "/home/ryanscott/Documents/Hacking/SAW/test.saw"
[09:21:52.558] Verifying do_atomic_update ...
[09:21:52.559] Simulating do_atomic_update ...
[09:21:52.560] Checking proof obligations do_atomic_update ...
[09:21:52.560] Proof succeeded! do_atomic_update

@pennyannn
Copy link
Contributor Author

pennyannn commented Aug 29, 2024

Hi @RyanGlScott , thanks for mentioning this fix PR. It actually fixed the module loading issue, which is great. However, this means that I have to upgrade SAW in our CI and that hasn't been a pleasant experience. Above is the PR I worked on for fixing failures after upgrading SAW. There are a total of three lemmas failing. Two of them I was able to fix without too much trouble. There is one that revealed a change in SAW that caused the term to be much bigger than before, causing Z3 to fail to prove it.

The lemma is called verify_HMAC_Init_ex_array_spec. I looked into the generated term for that theorem using the old SAW and the upgraded SAW. I found that specifically, the subterm for slicing has changed. For example in old SAW, the following is the subterm:

 x@149 = arrayCopy 64 x@2 x@137 x@6 key#1595 x@6 key_len#1594
  x@184 = append 56 8 Bool
                (append 48 8 Bool
                   (append 40 8 Bool
                      (append 32 8 Bool
                         (append 24 8 Bool
                            (append 16 8 Bool
                               (append 8 8 Bool (arrayLookup x@1 x@2 x@149 x@10)
                                  (arrayLookup x@1 x@2 x@149 x@12))
                               (arrayLookup x@1 x@2 x@149 x@17))
                            (arrayLookup x@1 x@2 x@149 x@53))
                         (arrayLookup x@1 x@2 x@149 x@13))
                      (arrayLookup x@1 x@2 x@149 x@16))
                   (arrayLookup x@1 x@2 x@149 x@9))
                (arrayLookup x@1 x@2 x@149 x@6)

I suspect the arrayCopy is coming from OPENSSL_memcpy(key_block, key, key_len); in the C code https://github.com/aws/aws-lc/blob/main/crypto/fipsmodule/hmac/hmac.c#L399, and the slicing is coming from key_block[i] in https://github.com/aws/aws-lc/blob/main/crypto/fipsmodule/hmac/hmac.c#L402. Seems to make sense.
However, using the upgraded SAW, I get this:

  x@283 = ite x@1 x@273 x@14
                (ite x@1 x@274 (append 56 8 Bool x@143 x@153)
                   (ite x@1 x@275
                      (append 48 16 Bool x@276 (append 8 8 Bool x@154 x@153))
                      (ite x@1 x@277
                         (append 40 24 Bool x@278
                            (append 16 8 Bool (append 8 8 Bool x@155 x@154) x@153))
                         (ite x@1 x@279
                            (append 32 32 Bool x@145
                               (append 24 8 Bool
                                  (append 16 8 Bool (append 8 8 Bool x@156 x@155) x@154)
                                  x@153))
                            (ite x@1 x@280
                               (append 24 40 Bool x@144
                                  (append 32 8 Bool
                                     (append 24 8 Bool
                                        (append 16 8 Bool (append 8 8 Bool x@157 x@156) x@155)
                                        x@154)
                                     x@153))
                               (ite x@1 x@281
                                  (append 16 48 Bool x@152
                                     (append 40 8 Bool
                                        (append 32 8 Bool
                                           (append 24 8 Bool
                                              (append 16 8 Bool (append 8 8 Bool x@158 x@157) x@156)
                                              x@155)
                                           x@154)
                                        x@153))
                                  (ite x@1 x@282
                                     (append 8 56 Bool x@147
                                        (append 48 8 Bool
                                           (append 40 8 Bool
                                              (append 32 8 Bool
                                                 (append 24 8 Bool
                                                    (append 16 8 Bool (append 8 8 Bool x@159 x@158) x@157)
                                                    x@156)
                                                 x@155)
                                              x@154)
                                           x@153))
                                     (append 56 8 Bool
                                        (append 48 8 Bool
                                           (append 40 8 Bool
                                              (append 32 8 Bool
                                                 (append 24 8 Bool
                                                    (append 16 8 Bool
                                                       (append 8 8 Bool (arrayLookup x@1 x@3 key#1484 x@18) x@159)
                                                       x@158)
                                                    x@157)
                                                 x@156)
                                              x@155)
                                           x@154)
                                        x@153))))))))

The if conditions are key_len = 0, key_len = 1, ..., which breaks key_len into cases. for each case it tries to assemble the bytes from key back. I'm sorry if these terms are too dense and doesn't make sense to you.

I'm simply wondering if this kind of change is something you could directly recognize. Do you have any suggestions on what I could do? Is there some option that I can use to achieve the old version which is more pleasant to the eyes and easier to prove?

@pennyannn
Copy link
Contributor Author

Hi @RyanGlScott, any ideas for how to resolve this issue?

@RyanGlScott
Copy link
Contributor

Sorry for taking so long to reply.

I don't immediately recognize what is going on, but I suspect that changing the Clang compiler flags in awslabs/aws-lc-verification#162 may cause the LLVM bitcode to be different enough to matter. I'd have to take a closer look to see if there are finer-grained Clang flags that would allow restoring the old behavior.

@pennyannn
Copy link
Contributor Author

Thanks! I would be really surprised if it's the compiler flag causing this issue but I could double check the LLVM generated. Btw, it sounds like there is no change in SAW that would lead to this behaviour?

@RyanGlScott
Copy link
Contributor

No, there hasn't been any deliberate changes on SAW's end (to my knowledge) that would affect the shape of proof goals in such a noticeable way. My understanding is that CMake passes different flags to Clang depending on whether you are compiling for C99 versus C11, so any information you can provide on how your setup might affect the resulting calls to Clang would be helpful.

@pennyannn
Copy link
Contributor Author

Thanks for the confirmation, I will look into it. It's unfortunate that it is very easy to get caught off-guard by compilation instability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants