IR & CFG




Compiler and Static Analyzer

Compiler Structure

How does a compiler work?

Traditional compiler architecture:

  • Frontend

    • Lexical Analysis

    • Syntax Analysis

    • Semantic Analysis

    • Translator: generates the Intermediate Representation (IR)

  • Optimizer

  • Backend

    • Code Generator: generates machine code

Why IR

Why is IR better suited to static analysis than an AST?

AST:

  • high-level and close to the grammar structure

  • usually language dependent

  • suitable for fast type checking

  • lack of control flow information

IR:

  • low-level and close to machine code

  • usually language independent

  • compact and uniform

  • contains control flow information

  • usually considered as the basis for static analysis

Other candidates:

Java source code:

  • statements and classes can be nested

Java bytecode:

  • advantages

    • no nesting; one statement follows the other; looping/branches through jumps

    • nested classes are "flattened" into normal classes

  • disadvantages

    • no named local variables: operations are performed on the operand stack

    • too many bytecodes (more than 200; many exist in several variants that differ only in operand type)
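To make the operand-stack point concrete, here is a small sketch. The class `StackVsThreeAddress` is hypothetical, and the bytecode in the comments is what `javap -c` typically prints for such a method; the exact output may differ between compiler versions.

```java
// Hypothetical demo class; compile it and run `javap -c StackVsThreeAddress`
// to compare with the bytecode sketched in the comments below.
final class StackVsThreeAddress {
    // Source-level view: a single nested expression.
    static int compute(int a, int b, int c) {
        return a + b * c;
    }

    // Typical bytecode for compute (static, so the parameters live in slots 0..2):
    //   iload_0   // push a
    //   iload_1   // push b
    //   iload_2   // push c
    //   imul      // pop b and c, push b * c
    //   iadd      // pop a and (b * c), push the sum
    //   ireturn   // return the int on top of the stack
    // The "i" prefix marks the int variants; long, float, and double have
    // their own copies (ladd, fadd, dadd, ...), which inflates the opcode count.
    //
    // The same method in a 3-address style, with explicit temporaries:
    //   t1 = b * c
    //   t2 = a + t1
    //   return t2

    public static void main(String[] args) {
        System.out.println(compute(2, 3, 4));
    }
}
```

In the 3-address version every intermediate value has a name, which is what makes it easier to analyze than the stack form.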

3-Address Code

3-Address Code (3AC)

  • There is at most one operator on the right side of an instruction

  • Each 3AC instruction contains at most 3 addresses

Note: there is no single fixed form of 3AC.

An address can be one of:

  • Name

  • Constant

  • Compiler-generated temporary variable

Some Common 3AC Forms:
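A minimal sketch of how a nested expression is flattened into 3AC. It assumes the forms commonly listed in compiler textbooks (`x = y bop z`, `x = uop y`, `x = y`, `goto L`, `if x goto L`, `if x rop y goto L`) and only exercises the first and third. `ThreeAddressDemo`, `Expr`, and the temporaries `t1`, `t2` are hypothetical names, not part of Soot or any other framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-translator: lowers a nested arithmetic expression to 3AC.
final class ThreeAddressDemo {
    // Expression tree node: either a leaf (name or constant) or a binary operation.
    static final class Expr {
        final String leaf;   // non-null for a leaf
        final char op;       // operator of a binary node
        final Expr left;
        final Expr right;

        Expr(String leaf) {
            this.leaf = leaf; this.op = 0; this.left = null; this.right = null;
        }

        Expr(char op, Expr left, Expr right) {
            this.leaf = null; this.op = op; this.left = left; this.right = right;
        }
    }

    private final List<String> code = new ArrayList<>();
    private int tempCount = 0;

    // Returns the address (name, constant, or temporary) holding the value of e.
    private String lower(Expr e) {
        if (e.leaf != null) {
            return e.leaf;
        }
        String l = lower(e.left);
        String r = lower(e.right);
        String t = "t" + (++tempCount);           // compiler-generated temporary
        code.add(t + " = " + l + " " + e.op + " " + r);
        return t;
    }

    // Emits 3AC for the assignment `target = e`.
    static List<String> translate(String target, Expr e) {
        ThreeAddressDemo d = new ThreeAddressDemo();
        String result = d.lower(e);
        d.code.add(target + " = " + result);
        return d.code;
    }

    public static void main(String[] args) {
        // x = a + b * 3
        Expr e = new Expr('+', new Expr("a"),
                new Expr('*', new Expr("b"), new Expr("3")));
        translate("x", e).forEach(System.out::println);
    }
}
```

Each emitted instruction has at most one operator on its right side, and the three kinds of address (name, constant, temporary) all appear in the output.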

Soot is one of the most popular static analysis frameworks for Java.

Soot's IR is Jimple: a typed 3-address code.

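JVM supplement (invoke instructions):

invokespecial: calls constructors, superclass methods, and private methods

invokevirtual: calls instance methods (virtual dispatch)

invokeinterface: calls interface methods; the JVM must check that the receiver's class implements the interface

invokestatic: calls static methods

invokedynamic: added in Java 7 to support dynamically typed languages running on the JVM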
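The following sketch shows which Java source constructs usually map to which invoke instruction. All class and method names are hypothetical, and the comments describe what `javac` typically emits, which you can check with `javap -c`. One caveat: newer `javac` versions may emit `invokevirtual` rather than `invokespecial` for private instance methods, so that case is left out here.

```java
// Hypothetical demo; the comments name the invoke instruction javac typically emits.
final class InvokeKinds {
    interface Greeter {
        String greet();
    }

    static class Base {
        String name() { return "base"; }
    }

    static final class Derived extends Base implements Greeter {
        Derived() { super(); }                          // invokespecial Base.<init>

        @Override
        String name() { return "derived/" + super.name(); }  // super call: invokespecial

        @Override
        public String greet() { return "hello"; }
    }

    static int twice(int x) { return 2 * x; }

    public static void main(String[] args) {
        Derived d = new Derived();                      // invokespecial Derived.<init>
        Base b = d;
        System.out.println(b.name());                   // invokevirtual, dispatched to Derived.name
        Greeter g = d;
        System.out.println(g.greet());                  // invokeinterface
        System.out.println(twice(21));                  // invokestatic
        Runnable r = () -> System.out.println("lambda"); // invokedynamic creates the Runnable
        r.run();                                        // invokeinterface on Runnable
    }
}
```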

method signature: <class name: return type method name(param1 type, param2 type, ...)>
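As an illustration of this layout, the sketch below builds such a signature string from a `java.lang.reflect.Method`. The helper `signature` and the class `SignatureDemo` are hypothetical and not part of Soot; the parameter list is joined with a bare comma, which is an assumption about how the framework prints it.

```java
import java.lang.reflect.Method;
import java.util.StringJoiner;

// Hypothetical helper that formats a method in the
// <class name: return type method name(param types)> layout.
final class SignatureDemo {
    static String signature(Method m) {
        StringJoiner params = new StringJoiner(",");
        for (Class<?> p : m.getParameterTypes()) {
            params.add(p.getTypeName());   // getTypeName prints arrays as "int[]", not "[I"
        }
        return "<" + m.getDeclaringClass().getName() + ": "
                + m.getReturnType().getTypeName() + " " + m.getName()
                + "(" + params + ")>";
    }

    public static void main(String[] args) throws NoSuchMethodException {
        Method m = String.class.getMethod("substring", int.class, int.class);
        System.out.println(signature(m));
    }
}
```

Because the signature includes the declaring class and the parameter types, it identifies one method unambiguously even when the name is overloaded.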

Control Flow Graph

  • Building a Control Flow Graph (CFG)

  • The CFG serves as the basic structure for static analysis

  • A node in the CFG can be an individual 3AC instruction or a Basic Block (BB)

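Basic Blocks are maximal sequences of consecutive three-address instructions with two properties:

  • A block can be entered only at its beginning, i.e. only its first instruction may be a jump target

  • A block can be exited only at its end, i.e. only its last instruction may be a jump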
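The two properties above suggest a way to find the blocks. The sketch below assumes the usual textbook "leader" rules: the first instruction, every jump target, and every instruction that directly follows a jump each start a new block. `BasicBlockDemo` and `Instr` are hypothetical names, instructions are addressed by index rather than by label, and a `return` in the middle of the code is not treated as a block end, to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch: split a list of 3AC instructions into basic blocks.
final class BasicBlockDemo {
    // A 3AC instruction: its text plus the index of its jump target (-1 if it is not a jump).
    static final class Instr {
        final String text;
        final int target;

        Instr(String text, int target) {
            this.text = text;
            this.target = target;
        }
    }

    // Returns the blocks as {start, end} index pairs (both inclusive).
    static List<int[]> partition(List<Instr> code) {
        TreeSet<Integer> leaders = new TreeSet<>();
        if (!code.isEmpty()) {
            leaders.add(0);                          // rule 1: the first instruction
        }
        for (int i = 0; i < code.size(); i++) {
            int t = code.get(i).target;
            if (t >= 0) {
                leaders.add(t);                      // rule 2: the target of a jump
                if (i + 1 < code.size()) {
                    leaders.add(i + 1);              // rule 3: the instruction after a jump
                }
            }
        }
        // Each block runs from one leader up to the instruction before the next leader.
        List<int[]> blocks = new ArrayList<>();
        Integer start = null;
        for (int leader : leaders) {
            if (start != null) {
                blocks.add(new int[] {start, leader - 1});
            }
            start = leader;
        }
        if (start != null) {
            blocks.add(new int[] {start, code.size() - 1});
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<Instr> code = new ArrayList<>();
        code.add(new Instr("i = 0", -1));               // 0
        code.add(new Instr("s = 0", -1));               // 1
        code.add(new Instr("if i >= n goto 6", 6));     // 2
        code.add(new Instr("s = s + i", -1));           // 3
        code.add(new Instr("i = i + 1", -1));           // 4
        code.add(new Instr("goto 2", 2));               // 5
        code.add(new Instr("return s", -1));            // 6
        for (int[] b : partition(code)) {
            System.out.println("block " + b[0] + ".." + b[1]);
        }
    }
}
```

For the loop in `main`, the leaders are instructions 0, 2, 3, and 6, so the code splits into four blocks.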

Building the CFG from BBs

The nodes of the CFG are basic blocks

  • There is an edge from block A to block B if and only if one of the following holds:

    • There is a conditional or unconditional jump from the end of A to the beginning of B

    • B immediately follows A in the original order of instructions and A does not end in an unconditional jump

  • If there is an edge from A to B, A is a predecessor of B and B is a successor of A

  • It is normal to replace the jumps to instruction labels by jumps to basic blocks

  • Usually we add two nodes, Entry and Exit

    • An edge from Entry to the BB containing the first instruction of IR

    • An edge to Exit from any BB containing an instruction that could be the last instruction of IR
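The edge rules above can be sketched as follows. `CfgDemo` and `Block` are hypothetical names; each block records only what the rules need (the block its last instruction jumps to, whether that jump is unconditional, and whether the block ends in a return), and edges are returned as plain strings for readability.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: derive CFG edges from an ordered list of basic blocks.
final class CfgDemo {
    static final class Block {
        final String name;
        final int jumpTo;             // index of the block the last instruction jumps to, or -1
        final boolean unconditional;  // true if the last instruction is an unconditional jump
        final boolean returns;        // true if the last instruction is a return

        Block(String name, int jumpTo, boolean unconditional, boolean returns) {
            this.name = name;
            this.jumpTo = jumpTo;
            this.unconditional = unconditional;
            this.returns = returns;
        }
    }

    static List<String> edges(List<Block> blocks) {
        Set<String> out = new LinkedHashSet<>();   // keeps insertion order, drops duplicates
        if (blocks.isEmpty()) {
            return new ArrayList<>(out);
        }
        // Entry goes to the block that holds the first instruction.
        out.add("Entry -> " + blocks.get(0).name);
        for (int i = 0; i < blocks.size(); i++) {
            Block b = blocks.get(i);
            boolean last = (i + 1 == blocks.size());
            // Jump edge: conditional or unconditional jump to the start of another block.
            if (b.jumpTo >= 0) {
                out.add(b.name + " -> " + blocks.get(b.jumpTo).name);
            }
            // Fall-through edge: the next block in program order, unless control cannot reach it.
            boolean fallsThrough = !b.unconditional && !b.returns;
            if (fallsThrough && !last) {
                out.add(b.name + " -> " + blocks.get(i + 1).name);
            }
            // Exit edge: the block may hold the last instruction executed.
            if (b.returns || (fallsThrough && last)) {
                out.add(b.name + " -> Exit");
            }
        }
        return new ArrayList<>(out);
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("B1", -1, false, false)); // i = 0; s = 0
        blocks.add(new Block("B2", 3, false, false));  // if i >= n goto B4
        blocks.add(new Block("B3", 1, true, false));   // s = s + i; i = i + 1; goto B2
        blocks.add(new Block("B4", -1, false, true));  // return s
        edges(blocks).forEach(System.out::println);
    }
}
```

Note that the jump targets are already block indices rather than instruction labels, matching the remark above about replacing jumps to labels with jumps to basic blocks.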
