Java
  • About This Book
  • 🍖Prerequisites
    • 反射
      • 反射基本使用
      • 高版本JDK反射绕过
      • 反射调用命令执行
      • 反射构造HashMap
      • 方法句柄
    • 类加载
      • 动态加载字节码
      • 双亲委派模型
      • BCEL
      • SPI
    • RMI & JNDI
      • RPC Intro
      • RMI
      • JEP 290
      • JNDI
    • Misc
      • Unsafe
      • 代理模式
      • JMX
      • JDWP
      • JPDA
      • JVMTI
      • JNA
      • Java Security Manager
  • 👻Serial Journey
    • URLDNS
    • SerialVersionUID
    • Commons Collection 🥏
      • CC1-TransformedMap
      • CC1-LazyMap
      • CC6
      • CC3
      • CC2
    • FastJson 🪁
      • FastJson-Basic Usage
      • FastJson-TemplatesImpl
      • FastJson-JdbcRowSetImpl
      • FastJson-BasicDataSource
      • FastJson-ByPass
      • FastJson与原生反序列化(一)
      • FastJson与原生反序列化(二)
      • Jackson的原生反序列化利用
    • Other Components
      • SnakeYaml
      • C3P0
      • AspectJWeaver
      • Rome
      • Spring
      • Hessian
      • Hessian_Only_JDK
      • Kryo
      • Dubbo
  • 🌵RASP
    • JavaAgent
    • JVM
    • ByteCode
    • JNI
    • ASM 🪡
      • ASM Intro
      • Class Generation
      • Class Transformation
    • Rasp防御命令执行
    • OpenRASP
  • 🐎Memory Shell
    • Tomcat-Architecture
    • Servlet API
      • Listener
      • Filter
      • Servlet
    • Tomcat-Middlewares
      • Tomcat-Valve
      • Tomcat-Executor
      • Tomcat-Upgrade
    • Agent MemShell
    • WebSocket
    • 内存马查杀
    • IDEA本地调试Tomcat
  • ✂️JDBC Attack
    • MySQL JDBC Attack
    • H2 JDBC Attack
  • 🎨Templates
    • FreeMarker
    • Thymeleaf
    • Enjoy
  • 🎏MessageQueue
    • ActiveMQ CNVD-2023-69477
    • AMQP CVE-2023-34050
    • Spring-Kafka CVE-2023-34040
    • RocketMQ CVE-2023-33246
  • 🛡️Shiro
    • Shiro Intro
    • Request URI ByPass
    • Context Path ByPass
    • Remember Me反序列化 CC-Shiro
    • CB1与无CC依赖的反序列化链
  • 🍺Others
    • Deserialization Twice
    • A New Blazer 4 getter RCE
    • Apache Commons Jxpath
    • El Attack
    • Spel Attack
    • C3P0原生反序列化的JNDI打法
    • Log4j
    • Echo Tech
      • SpringBoot Under Tomcat
    • CTF 🚩
      • 长城杯-b4bycoffee (ROME反序列化)
      • MTCTF2022(CB+Shiro绕过)
      • CISCN 2023 西南赛区半决赛 (Hessian原生JDK+Kryo反序列化)
      • CISCN 2023 初赛 (高版本Commons Collections下其他依赖的利用)
      • CISCN 2021 总决赛 ezj4va (AspectJWeaver写字节码文件到classpath)
      • D^3CTF2023 (新的getter+高版本JNDI不出网+Hessian异常toString)
      • WMCTF2023(CC链花式玩法+盲读文件)
      • 第六届安洵杯网络安全挑战赛(CB PriorityQueue替代+Postgresql JDBC Attack+FreeMarker)
  • 🔍Code Inspector
    • CodeQL 🧶
      • Tutorial
        • Intro
        • Module
        • Predicate
        • Query
        • Type
      • CodeQL 4 Java
        • Basics
        • DFA
        • Example
    • SootUp ✨
      • Intro
      • Jimple
      • DFA
      • CG
    • Tabby 🔦
      • install
    • Theory
      • Static Analysis
        • Intro
        • IR & CFG
        • DFA
        • DFA-Foundation
        • Interprocedural Analysis
        • Pointer Analysis
        • Pointer Analysis Foundation
        • PTA-Context Sensitivity
        • Taint Anlysis
        • Datalog
Powered by GitBook
On this page
  • Intro
  • CHA
  • ICFG
  • Inter DFA

Was this helpful?

  1. 🔍Code Inspector
  2. Theory
  3. Static Analysis

Interprocedural Analysis

PreviousDFA-FoundationNextPointer Analysis

Last updated 7 months ago

Was this helpful?

Intro

all analyses we learnt previously are intraprocedural which can not deal with method calls.

Take Constant Propagation for example:

we make the most conservative assumption for method calls(safe-approximation)

  • x = NAC

  • y = NAC

  • n = NAC

which leads to imprecision.

For better precision, introducing interprocedural analysis(propagate data-flow information along interprocedural control-flow edges). First we need call graph to perform interprocedural analysis.

Call Graph is a representation of calling relationships in the program.

It consists of a set of call edges from call-sites to their target methods(callees)

Some applications of Call Graph:

  • Interprocedural Analyses

  • Program optimization

  • Program understanding

  • Program debugging

  • Program testing

  • .....

Call Graph Construction for OOPLs

Before proceeding, we need to learn some basic knowledge about method calls in Java

Static Call
Special Call
Virtual Call

instruction

invokestatic

invokespecial

invokeinterface、invokevirtual

receiver objects

×

√

√

description

static methods

constructors、private instance methods、superclass instance methods

other instance methods

target methods

1

1

≥1(polymorphism)

determinacy

Compile-time

Compile-time

Run-time

static call and special call can be determined at compile-time, but virtual call can be only determined at run-time. The former is trivial and the latter is non-trivial(key to call graph construction for OOPLs)

During run-time, a virtual call is dynamically resolved based on(method dispatch)

  1. type of the receiver object(caller)

  2. method signature at the call site

We define function Dispatch(c, m) to simulate the procedure of run-time method dispatch

finding the class type which contains the called method is important.

Dispatch(B, A.foo()) = A.foo()

Dispatch(C, A.foo()) = C.foo()

CHA

Class Hierarchy Analysis

  • Require the class hierarchy information (inheritance structure) of the whole program

  • Resolve a virtual call based on the declared type of receiver variable of the call site

A a = ...
a.foo()

Assume the receiver variable a may point to objects of class A or all subclasses of A

Resolve target methods by looking up the class hierarchy of class A

Taking the class inheritance relationship above as an example, let's take a look at the bytecode of the method call.

public class C extends B{
    @Override
    public void foo() {
        super.foo();
        this.secret();
    }
    private void secret(){
    }
    public C() {
    }
}

corresponding jvm bytecode👇

ALOAD 0
INVOKESPECIAL B.foo ()V
ALOAD 0
INVOKESPECIAL C.secret ()V
RETURN

<init>
aload_0
invokespecial <B.<init> : ()V>
return

super.foo() calls the foo method of the superclass of class C. But we know that class B does not have a foo method. So invokespecial also needs to dispatch.

For calls to private methods and constructors, the corresponding class can be found directly in the method signature.

In addition, the constructor first calls the parent class's no-argument

In fact, static methods can also be inherited, so invokestatic also requires dispatch.

So what we need to focus on are calls to methods that can be overridden.

A a = new C();
a.foo();

corresponding jvm bytecode👇

new <C>
dup
invokespecial <C.<init> : ()V>
astore_1
aload_1
invokevirtual <A.foo : ()V>
return

You can see that the class pointed to in the method signature followed by invokevirtual is the declared type of the receiver object.

Given the polymorphic nature of Java, the actual method invoked might be the foo method superseded by a subclass of class A. So we shall find the foo method of class A and its subclasses.

We define function Resolve(cs) to resolve possible target methods of a call site cs by CHA

inheritance structure

Resolve(c.foo()) = {C.foo()}

Resolve(a.foo()) = {A.foo(), C.foo(), D.foo()}

Resolve(b.foo()) = {A.foo(), C.foo(), D.foo()}

NOTE: if

B b = new B();
b.foo();

still, Resolve(b.foo()) = {A.foo(), C.foo(), D.foo()}

CHA produces two spurious call targets

Features of CHA:

  • advantage: fast

    • Only consider the declared type of receiver variable at the call-site, and its inheritance hierarchy

    • Ignore data and control-flow information

  • disadvantage: imprecise

    • Easily introduce spurious target methods

  • common usage: IDE

Navigate -> Call Hierarchy

Call Graph Construction —— Algorithm

  • Start from entry method(main method in java)

  • For each reachable method m, resolve target methods for each call site cs in m via CHA(Resolve(cs))

  • Repeat until no new method is discovered

Here we ignore constructor call

  • Round 0

WL = [A.main()]

  • Round 1

Resolve(a.foo()) = {A.foo()}

WL = [A.foo()] RM = [A.main()]

  • Round 2

Resolve(a.bar()) = {A.bar(), B.bar(), C.bar()}

WL = [A.bar(), B.bar(), C.bar()] RM = [A.main(), A.foo()]

  • Round 3

Resolve(c.bar()) = {C.bar()}

WL = [B.bar(), C.bar(), C.bar()] RM = [A.main(), A.foo(), A.bar()]

  • Round 4

WL = [C.bar(), C.bar()] RM = [A.main(), A.foo(), A.bar(), B.bar()]

  • Round 5

Resolve(A.foo()) = {A.foo()}

WL = [C.bar(), A.foo()] RM = [A.main(), A.foo(), A.bar(), B.bar(), C.bar()]

  • Round 6

WL = [] RM = [A.main(), A.foo(), A.bar(), B.bar(), C.bar()]

ICFG

CFG represents structure of an individual method while ICFG represents structure of the whole program.

An ICFG of a program consists of CFGs of the methods in the program, plus two kinds of additional edges:

  • Call edges: from call sites to the entry nodes of their callees

  • Return edges: from exit nodes of the callees to the statements following their call sites (return site)

The edge from call site to return site is called call-to-return edge.

ICFG = CFGs + call & return edges

We build call & return edges via Call Graph

Inter DFA

How to analyze the whole program with method calls based on ICFG

intraprocedural
interprocedural

representation

CFG

ICFG

transfer functions

node transfer

node transfer+edge transfer

Edge transfer • Call edge transfer: transfer data flow from call site to the entry node of callee (along call edges) • Return edge transfer: transfer data flow from exit node of the callee to the return site (along return edges)

For call site, the transfer function is identity function.

We leave the handling of the LHS(Left-Hand-Side) variable(return value) to edge transfer

Why call-to-return edge exists?

  • allow the analysis to propagate local data-flow

  • we have to propagate local data-flow across target methods without call-to-return edge(inefficient and troublesome)

Why need to kill LHS variable on call-to-return edge?

  • LHS variable’s value will flow to return site along the return edges.

  • merge its previous value with return edge will cause impression.

Interprocedural Constant Propagation In Summary

  • Node transfer

    • Call nodes: identity

    • Other nodes: same as intraprocedural constant propagation

  • Edge transfer

    • Normal edges: identity

    • Call-to-return edges: kill the value of LHS variable of the call site, propagate values of other local variables

    • Call edges: pass argument values

    • Return edges: pass return values

image-20240406223311898
image-20240406223950493
image-20240406224131960
image-20240406225211122
image-20240406225243632
image-20240406225530186
image-20240406230815555
image-20240406231655955
image-20241003141258635
image-20240406234814278
image-20240407104644921
image-20240407105505569
image-20240407114123021
image-20240407114844589