Project author: mmisono

Project description:
Convert cBPF program to LLVM IR (to compile eBPF program)
Language: Rust
Repository: git://github.com/mmisono/cbpf-to-llvm-ir.git
Created: 2017-10-09T07:20:22Z
Project page: https://github.com/mmisono/cbpf-to-llvm-ir

License: Other


cbpf-to-llvm-ir

Convert cBPF program to LLVM IR

cbpf2ir

This crate provides a program called cbpf2ir, which generates LLVM IR from a libpcap filter expression.

    % cargo run --bin cbpf2ir -- --help
    cbpf2ir 0.1.0
    Convert cBPF program to LLVM IR from libpcap's expression

    USAGE:
        cbpf2ir [FLAGS] [OPTIONS] <expression> --outfile <outfile>

    FLAGS:
        -d, --debug      Activate debug mode
        -h, --help       Prints help information
        -n, --noopt      no optimization
        -V, --version    Prints version information

    OPTIONS:
        -l, --linktype <linktype>    LinkType (http://www.tcpdump.org/linktypes.html) [default: 1]
        -o, --outfile <outfile>      Output file

    ARGS:
        <expression>    cBPF filter expression

Example

    cargo run --bin cbpf2ir -- -o a.ll "tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)"

cBPF program:

    LD H ABS  {code: 28, jt: 00, jf: 00, k: 0000000C} ldh [12]
    JEQ K     {code: 15, jt: 00, jf: 06, k: 000086DD} jeq 34525 0 6
    LD B ABS  {code: 30, jt: 00, jf: 00, k: 00000014} ldb [20]
    JEQ K     {code: 15, jt: 00, jf: 04, k: 00000006} jeq 6 0 4
    LD H ABS  {code: 28, jt: 00, jf: 00, k: 00000036} ldh [54]
    JEQ K     {code: 15, jt: 0E, jf: 00, k: 00000050} jeq 80 14 0
    LD H ABS  {code: 28, jt: 00, jf: 00, k: 00000038} ldh [56]
    JEQ K     {code: 15, jt: 0C, jf: 00, k: 00000050} jeq 80 12 0
    LD H ABS  {code: 28, jt: 00, jf: 00, k: 0000000C} ldh [12]
    JEQ K     {code: 15, jt: 00, jf: 45, k: 00000800} jeq 2048 0 69
    LD B ABS  {code: 30, jt: 00, jf: 00, k: 00000017} ldb [23]
    JEQ K     {code: 15, jt: 00, jf: 43, k: 00000006} jeq 6 0 67
    LD H ABS  {code: 28, jt: 00, jf: 00, k: 00000014} ldh [20]
    JSET K    {code: 45, jt: 41, jf: 00, k: 00001FFF} jset 8191 65 0
    LDX B MSH {code: B1, jt: 00, jf: 00, k: 0000000E} ldxb ([14] & 0xf) << 2
    LD H IND  {code: 48, jt: 00, jf: 00, k: 0000000E} ldh [14+X]
    JEQ K     {code: 15, jt: 03, jf: 00, k: 00000050} jeq 80 3 0
    LDX B MSH {code: B1, jt: 00, jf: 00, k: 0000000E} ldxb ([14] & 0xf) << 2
    LD H IND  {code: 48, jt: 00, jf: 00, k: 00000010} ldh [16+X]
    JEQ K     {code: 15, jt: 00, jf: 3B, k: 00000050} jeq 80 0 59
    LD H ABS  {code: 28, jt: 00, jf: 00, k: 0000000C} ldh [12]
    JEQ K     {code: 15, jt: 00, jf: 39, k: 00000800} jeq 2048 0 57
    LD IMM    {code: 00, jt: 00, jf: 00, k: 00000002} ldw 2
    ST        {code: 02, jt: 00, jf: 00, k: 00000000} st MEM[0]
    LDX MEM   {code: 61, jt: 00, jf: 00, k: 00000000} ldxw MEM[0]
    LD H IND  {code: 48, jt: 00, jf: 00, k: 0000000E} ldh [14+X]
    ST        {code: 02, jt: 00, jf: 00, k: 00000001} st MEM[1]
    LD IMM    {code: 00, jt: 00, jf: 00, k: 00000000} ldw 0
    ST        {code: 02, jt: 00, jf: 00, k: 00000002} st MEM[2]
    LDX MEM   {code: 61, jt: 00, jf: 00, k: 00000002} ldxw MEM[2]
    LD B IND  {code: 50, jt: 00, jf: 00, k: 0000000E} ldb [14+X]
    ST        {code: 02, jt: 00, jf: 00, k: 00000003} st MEM[3]
    LD IMM    {code: 00, jt: 00, jf: 00, k: 0000000F} ldw 15
    ST        {code: 02, jt: 00, jf: 00, k: 00000004} st MEM[4]
    LDX MEM   {code: 61, jt: 00, jf: 00, k: 00000004} ldxw MEM[4]
    LD MEM    {code: 60, jt: 00, jf: 00, k: 00000003} ldw MEM[3]
    AND X     {code: 5C, jt: 00, jf: 00, k: 00000000} and X
    ST        {code: 02, jt: 00, jf: 00, k: 00000004} st MEM[4]
    LD IMM    {code: 00, jt: 00, jf: 00, k: 00000002} ldw 2
    ST        {code: 02, jt: 00, jf: 00, k: 00000005} st MEM[5]
    LDX MEM   {code: 61, jt: 00, jf: 00, k: 00000005} ldxw MEM[5]
    LD MEM    {code: 60, jt: 00, jf: 00, k: 00000004} ldw MEM[4]
    LSH X     {code: 6C, jt: 00, jf: 00, k: 00000000} lsh X
    ST        {code: 02, jt: 00, jf: 00, k: 00000005} st MEM[5]
    LDX MEM   {code: 61, jt: 00, jf: 00, k: 00000005} ldxw MEM[5]
    LD MEM    {code: 60, jt: 00, jf: 00, k: 00000001} ldw MEM[1]
    SUB X     {code: 1C, jt: 00, jf: 00, k: 00000000} sub X
    ST        {code: 02, jt: 00, jf: 00, k: 00000005} st MEM[5]
    LD IMM    {code: 00, jt: 00, jf: 00, k: 0000000C} ldw 12
    ST        {code: 02, jt: 00, jf: 00, k: 00000006} st MEM[6]
    LDX B MSH {code: B1, jt: 00, jf: 00, k: 0000000E} ldxb ([14] & 0xf) << 2
    LD MEM    {code: 60, jt: 00, jf: 00, k: 00000006} ldw MEM[6]
    ADD X     {code: 0C, jt: 00, jf: 00, k: 00000000} add X
    TAX       {code: 07, jt: 00, jf: 00, k: 00000000} tax
    LD B IND  {code: 50, jt: 00, jf: 00, k: 0000000E} ldb [14+X]
    ST        {code: 02, jt: 00, jf: 00, k: 00000007} st MEM[7]
    LD IMM    {code: 00, jt: 00, jf: 00, k: 000000F0} ldw 240
    ST        {code: 02, jt: 00, jf: 00, k: 00000008} st MEM[8]
    LDX MEM   {code: 61, jt: 00, jf: 00, k: 00000008} ldxw MEM[8]
    LD MEM    {code: 60, jt: 00, jf: 00, k: 00000007} ldw MEM[7]
    AND X     {code: 5C, jt: 00, jf: 00, k: 00000000} and X
    ST        {code: 02, jt: 00, jf: 00, k: 00000008} st MEM[8]
    LD IMM    {code: 00, jt: 00, jf: 00, k: 00000002} ldw 2
    ST        {code: 02, jt: 00, jf: 00, k: 00000009} st MEM[9]
    LDX MEM   {code: 61, jt: 00, jf: 00, k: 00000009} ldxw MEM[9]
    LD MEM    {code: 60, jt: 00, jf: 00, k: 00000008} ldw MEM[8]
    RSH X     {code: 7C, jt: 00, jf: 00, k: 00000000} rsh X
    ST        {code: 02, jt: 00, jf: 00, k: 00000009} st MEM[9]
    LDX MEM   {code: 61, jt: 00, jf: 00, k: 00000009} ldxw MEM[9]
    LD MEM    {code: 60, jt: 00, jf: 00, k: 00000005} ldw MEM[5]
    SUB X     {code: 1C, jt: 00, jf: 00, k: 00000000} sub X
    ST        {code: 02, jt: 00, jf: 00, k: 00000009} st MEM[9]
    LD IMM    {code: 00, jt: 00, jf: 00, k: 00000000} ldw 0
    ST        {code: 02, jt: 00, jf: 00, k: 0000000A} st MEM[10]
    LDX MEM   {code: 61, jt: 00, jf: 00, k: 0000000A} ldxw MEM[10]
    LD MEM    {code: 60, jt: 00, jf: 00, k: 00000009} ldw MEM[9]
    SUB X     {code: 1C, jt: 00, jf: 00, k: 00000000} sub X
    JEQ K     {code: 15, jt: 01, jf: 00, k: 00000000} jeq 0 1 0
    RET K     {code: 06, jt: 00, jf: 00, k: 0000FFFF} ret 65535
    RET K     {code: 06, jt: 00, jf: 00, k: 00000000} ret 0
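
The `code` column in the listing above packs the instruction class, operand size and addressing mode into one value, following the classic BPF encoding (class in the low three bits, size in bits 3-4, mode in the top three bits of the low byte). The following is a minimal sketch of that decoding, not code taken from this crate; the function names `class_name`, `size_name`, `mode_name` and `describe_load` are hypothetical.

```rust
/// Instruction class: the low three bits of `code`.
/// (Names are illustrative; they are not taken from this crate.)
pub fn class_name(code: u16) -> &'static str {
    match code & 0x07 {
        0x00 => "LD",
        0x01 => "LDX",
        0x02 => "ST",
        0x03 => "STX",
        0x04 => "ALU",
        0x05 => "JMP",
        0x06 => "RET",
        _ => "MISC",
    }
}

/// Operand size of a load: bits 3-4 of `code`.
pub fn size_name(code: u16) -> &'static str {
    match code & 0x18 {
        0x00 => "W", // 32-bit word
        0x08 => "H", // 16-bit half word
        0x10 => "B", // byte
        _ => "?",    // not used by classic BPF
    }
}

/// Addressing mode of a load: the top three bits of the low byte.
pub fn mode_name(code: u16) -> &'static str {
    match code & 0xe0 {
        0x00 => "IMM",
        0x20 => "ABS",
        0x40 => "IND",
        0x60 => "MEM",
        0x80 => "LEN",
        0xa0 => "MSH",
        _ => "?",
    }
}

/// Describe a load instruction as "CLASS SIZE MODE".
pub fn describe_load(code: u16) -> String {
    format!("{} {} {}", class_name(code), size_name(code), mode_name(code))
}
```

With these helpers, `describe_load(0x28)` yields the same "LD H ABS" mnemonic that the first row of the listing shows for `code: 28`.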

LLVM IR (after optimization)

    LLVM IR:
    ; ModuleID = 'cbpf_ir'
    source_filename = "cbpf_ir"

    ; Function Attrs: norecurse nounwind readonly
    define i32 @main(i8* nocapture readonly) local_unnamed_addr #0 {
    entry:
      %1 = getelementptr inbounds i8, i8* %0, i64 12
      %2 = load i8, i8* %1, align 1
      %3 = zext i8 %2 to i32
      %4 = shl nuw nsw i32 %3, 8
      %5 = getelementptr inbounds i8, i8* %0, i64 13
      %6 = load i8, i8* %5, align 1
      %7 = zext i8 %6 to i32
      %8 = or i32 %4, %7
      %cond = icmp eq i32 %8, 2048
      br i1 %cond, label %insn.10, label %insn.79

    insn.10: ; preds = %entry
      %9 = getelementptr inbounds i8, i8* %0, i64 23
      %10 = load i8, i8* %9, align 1
      %11 = icmp eq i8 %10, 6
      br i1 %11, label %insn.12, label %insn.79

    insn.12: ; preds = %insn.10
      %12 = getelementptr inbounds i8, i8* %0, i64 20
      %13 = load i8, i8* %12, align 1
      %14 = zext i8 %13 to i32
      %15 = shl nuw nsw i32 %14, 8
      %16 = getelementptr inbounds i8, i8* %0, i64 21
      %17 = load i8, i8* %16, align 1
      %18 = zext i8 %17 to i32
      %.masked = and i32 %15, 7936
      %19 = or i32 %.masked, %18
      %20 = icmp eq i32 %19, 0
      br i1 %20, label %insn.14, label %insn.79

    insn.14: ; preds = %insn.12
      %21 = getelementptr inbounds i8, i8* %0, i64 14
      %22 = load i8, i8* %21, align 1
      %23 = zext i8 %22 to i32
      %24 = shl nuw nsw i32 %23, 2
      %25 = and i32 %24, 60
      %26 = add nuw nsw i32 %25, 14
      %27 = zext i32 %26 to i64
      %28 = getelementptr inbounds i8, i8* %0, i64 %27
      %29 = load i8, i8* %28, align 1
      %30 = zext i8 %29 to i32
      %31 = shl nuw nsw i32 %30, 8
      %32 = getelementptr inbounds i8, i8* %28, i64 1
      %33 = load i8, i8* %32, align 1
      %34 = zext i8 %33 to i32
      %35 = or i32 %31, %34
      %36 = icmp eq i32 %35, 80
      br i1 %36, label %insn.22, label %insn.17

    insn.17: ; preds = %insn.14
      %37 = add nuw nsw i32 %25, 16
      %38 = zext i32 %37 to i64
      %39 = getelementptr inbounds i8, i8* %0, i64 %38
      %40 = load i8, i8* %39, align 1
      %41 = zext i8 %40 to i32
      %42 = shl nuw nsw i32 %41, 8
      %43 = getelementptr inbounds i8, i8* %39, i64 1
      %44 = load i8, i8* %43, align 1
      %45 = zext i8 %44 to i32
      %46 = or i32 %42, %45
      %47 = icmp eq i32 %46, 80
      br i1 %47, label %insn.22, label %insn.79

    insn.22: ; preds = %insn.17, %insn.14
      %48 = getelementptr inbounds i8, i8* %0, i64 16
      %49 = load i8, i8* %48, align 1
      %50 = zext i8 %49 to i32
      %51 = shl nuw nsw i32 %50, 8
      %52 = getelementptr inbounds i8, i8* %0, i64 17
      %53 = load i8, i8* %52, align 1
      %54 = zext i8 %53 to i32
      %55 = or i32 %51, %54
      %56 = sub nsw i32 %55, %25
      %57 = add nuw nsw i32 %25, 26
      %58 = zext i32 %57 to i64
      %59 = getelementptr inbounds i8, i8* %0, i64 %58
      %60 = load i8, i8* %59, align 1
      %61 = and i8 %60, -16
      %62 = zext i8 %61 to i32
      %63 = lshr exact i32 %62, 2
      %64 = icmp eq i32 %56, %63
      br i1 %64, label %insn.79, label %insn.78

    insn.78: ; preds = %insn.79, %insn.22
      %merge = phi i32 [ 65535, %insn.22 ], [ 0, %insn.79 ]
      ret i32 %merge

    insn.79: ; preds = %entry, %insn.12, %insn.22, %insn.17, %insn.10
      br label %insn.78
    }

    ; Function Attrs: nounwind readnone
    define i32 @be(i32) local_unnamed_addr #1 {
      %2 = tail call i32 @llvm.bswap.i32(i32 %0)
      ret i32 %2
    }

    ; Function Attrs: nounwind readnone speculatable
    declare i32 @llvm.bswap.i32(i32) #2

    attributes #0 = { norecurse nounwind readonly }
    attributes #1 = { nounwind readnone }
    attributes #2 = { nounwind readnone speculatable }
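
To make the optimized IR easier to follow, here is a hand-written Rust sketch of what `@main` appears to compute, block by block. It is a reading of the listing above, not output of this crate. The names `u8_at`, `be16_at`, `run_filter` and `sample_packet` are hypothetical, and one behaviour is an added assumption: reads past the end of the packet return 0, as a classic BPF interpreter would, whereas the IR shown performs no bounds checks.

```rust
/// Read one byte as u32, or None past the end of the packet.
fn u8_at(pkt: &[u8], off: usize) -> Option<u32> {
    pkt.get(off).map(|&b| u32::from(b))
}

/// Big-endian 16-bit read, built like the IR: two byte loads, shift, or.
fn be16_at(pkt: &[u8], off: usize) -> Option<u32> {
    Some((u8_at(pkt, off)? << 8) | u8_at(pkt, off + 1)?)
}

fn filter(pkt: &[u8]) -> Option<u32> {
    // entry: EtherType at offset 12 must be IPv4 (2048 = 0x0800).
    if be16_at(pkt, 12)? != 0x0800 {
        return Some(0);
    }
    // insn.10: IP protocol at offset 23 must be TCP (6).
    if u8_at(pkt, 23)? != 6 {
        return Some(0);
    }
    // insn.12: fragment offset (low 13 bits at offset 20) must be zero.
    if be16_at(pkt, 20)? & 0x1fff != 0 {
        return Some(0);
    }
    // insn.14: IP header length in bytes, (ip[0] & 0xf) << 2.
    let ihl = ((u8_at(pkt, 14)? & 0x0f) << 2) as usize;
    // insn.14 / insn.17: source port, then destination port, must be 80.
    if be16_at(pkt, 14 + ihl)? != 80 && be16_at(pkt, 16 + ihl)? != 80 {
        return Some(0);
    }
    // insn.22: ip[2:2] - IP header length, compared with the TCP header length.
    let lhs = be16_at(pkt, 16)?.wrapping_sub(ihl as u32);
    let tcp_hdr = (u8_at(pkt, 26 + ihl)? & 0xf0) >> 2;
    Some(if lhs == tcp_hdr { 0 } else { 65535 })
}

/// Returns 65535 to accept the packet and 0 to reject it.
/// Assumption: out-of-range reads reject the packet.
pub fn run_filter(pkt: &[u8]) -> u32 {
    filter(pkt).unwrap_or(0)
}

/// Hypothetical helper: Ethernet + 20-byte IPv4 + 20-byte TCP + payload.
pub fn sample_packet(payload_len: usize, dst_port: u16) -> Vec<u8> {
    let mut pkt = vec![0u8; 54 + payload_len];
    pkt[12] = 0x08; // EtherType 0x0800 (IPv4)
    pkt[14] = 0x45; // IPv4, header length 5 words
    let total = (40 + payload_len) as u16; // IP total length
    pkt[16..18].copy_from_slice(&total.to_be_bytes());
    pkt[23] = 6; // protocol: TCP
    pkt[36..38].copy_from_slice(&dst_port.to_be_bytes());
    pkt[46] = 0x50; // TCP data offset 5 words
    pkt
}
```

Under these assumptions, a port-80 segment with a non-empty payload is accepted, while an empty segment (for example a bare ACK) is rejected, which matches the intent of the filter expression in the example.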

eBPF code (llc -march=bpf -o a.bpf a.ll)

        .text
        .macosx_version_min 10, 12
        .globl main # -- Begin function main
        .p2align 3
    main: # @main
    # BB#0: # %entry
        r2 = *(u8 *)(r1 + 13)
        r3 = *(u8 *)(r1 + 12)
        r3 <<= 8
        r3 |= r2
        if r3 != 2048 goto LBB0_7
    # BB#1: # %insn.10
        r2 = *(u8 *)(r1 + 23)
        if r2 != 6 goto LBB0_7
    # BB#2: # %insn.12
        r2 = *(u8 *)(r1 + 21)
        r3 = *(u8 *)(r1 + 20)
        r3 <<= 8
        r3 &= 7936
        r3 |= r2
        if r3 != 0 goto LBB0_7
    # BB#3: # %insn.14
        r2 = *(u8 *)(r1 + 14)
        r2 <<= 2
        r2 &= 60
        r3 = r1
        r3 += r2
        r4 = *(u8 *)(r3 + 15)
        r3 = *(u8 *)(r3 + 14)
        r3 <<= 8
        r3 |= r4
        if r3 == 80 goto LBB0_5
    # BB#4: # %insn.17
        r3 = r2
        r3 <<= 32
        r3 >>= 32
        r4 = r1
        r4 += r3
        r3 = *(u8 *)(r4 + 17)
        r4 = *(u8 *)(r4 + 16)
        r4 <<= 8
        r4 |= r3
        if r4 != 80 goto LBB0_7
    LBB0_5: # %insn.22
        r3 = *(u8 *)(r1 + 16)
        r3 <<= 8
        r4 = *(u8 *)(r1 + 17)
        r3 |= r4
        r3 -= r2
        r2 <<= 32
        r2 >>= 32
        r1 += r2
        r0 = 65535
        r1 = *(u8 *)(r1 + 26)
        r1 &= 240
        r1 >>= 2
        if r3 == r1 goto LBB0_7
    LBB0_6: # %insn.78
        exit
    LBB0_7: # %insn.79
        r0 = 0
        goto LBB0_6

Conversion Strategy

Each cBPF instruction is converted to a corresponding basic block.
Instructions that are difficult to express directly in LLVM IR are
converted into calls to functions defined in src/ll/util.ll,
which is generated by clang -S -emit-llvm util.c.
All of these functions are inlined by optimization.
(Note that to compile an eBPF program, all functions must be inlined.)
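
As an illustration of the one-block-per-instruction idea, the sketch below computes which blocks each instruction's block would branch to. It assumes the usual cBPF rule that a conditional jump at index i goes to i + 1 + jt or i + 1 + jf, and that an unconditional jump goes to i + 1 + k. The `Insn` struct and the functions `block_label` and `successors` are hypothetical and are not taken from this crate's source; the label format follows the `insn.N` names visible in the generated IR.

```rust
/// One cBPF instruction (field names follow the listing in the example).
#[derive(Debug, Clone, Copy)]
pub struct Insn {
    pub code: u16,
    pub jt: u8,
    pub jf: u8,
    pub k: u32,
}

/// Label of the basic block created for the instruction at `index`.
pub fn block_label(index: usize) -> String {
    format!("insn.{}", index)
}

/// Indices of the instructions that can execute right after `index`.
pub fn successors(index: usize, insn: &Insn) -> Vec<usize> {
    match insn.code & 0x07 {
        // RET: the block ends the function, no successors.
        0x06 => vec![],
        // JMP class.
        0x05 => {
            if insn.code & 0xf0 == 0x00 {
                // JA: unconditional jump, offset taken from k.
                vec![index + 1 + insn.k as usize]
            } else {
                // Conditional jump: true target first, then false target.
                vec![
                    index + 1 + insn.jt as usize,
                    index + 1 + insn.jf as usize,
                ]
            }
        }
        // Everything else falls through to the next instruction.
        _ => vec![index + 1],
    }
}
```

For instance, the tenth instruction of the example (index 9, `jeq 2048 0 69`) has successors 10 and 79, which corresponds to the `br i1 %cond, label %insn.10, label %insn.79` at the end of the `entry` block in the optimized IR.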

Note

The generated code has not been thoroughly verified yet.

  • c2e: converts a cBPF program directly to an eBPF program

License

Dual-licensed under MIT or Apache-2.0.