Java
  • About This Book
  • 🍖Prerequisites
    • 反射
      • 反射基本使用
      • 高版本JDK反射绕过
      • 反射调用命令执行
      • 反射构造HashMap
      • 方法句柄
    • 类加载
      • 动态加载字节码
      • 双亲委派模型
      • BCEL
      • SPI
    • RMI & JNDI
      • RPC Intro
      • RMI
      • JEP 290
      • JNDI
    • Misc
      • Unsafe
      • 代理模式
      • JMX
      • JDWP
      • JPDA
      • JVMTI
      • JNA
      • Java Security Manager
  • 👻Serial Journey
    • URLDNS
    • SerialVersionUID
    • Commons Collection 🥏
      • CC1-TransformedMap
      • CC1-LazyMap
      • CC6
      • CC3
      • CC2
    • FastJson 🪁
      • FastJson-Basic Usage
      • FastJson-TemplatesImpl
      • FastJson-JdbcRowSetImpl
      • FastJson-BasicDataSource
      • FastJson-ByPass
      • FastJson与原生反序列化(一)
      • FastJson与原生反序列化(二)
      • Jackson的原生反序列化利用
    • Other Components
      • SnakeYaml
      • C3P0
      • AspectJWeaver
      • Rome
      • Spring
      • Hessian
      • Hessian_Only_JDK
      • Kryo
      • Dubbo
  • 🌵RASP
    • JavaAgent
    • JVM
    • ByteCode
    • JNI
    • ASM 🪡
      • ASM Intro
      • Class Generation
      • Class Transformation
    • Rasp防御命令执行
    • OpenRASP
  • 🐎Memory Shell
    • Tomcat-Architecture
    • Servlet API
      • Listener
      • Filter
      • Servlet
    • Tomcat-Middlewares
      • Tomcat-Valve
      • Tomcat-Executor
      • Tomcat-Upgrade
    • Agent MemShell
    • WebSocket
    • 内存马查杀
    • IDEA本地调试Tomcat
  • ✂️JDBC Attack
    • MySQL JDBC Attack
    • H2 JDBC Attack
  • 🎨Templates
    • FreeMarker
    • Thymeleaf
    • Enjoy
  • 🎏MessageQueue
    • ActiveMQ CNVD-2023-69477
    • AMQP CVE-2023-34050
    • Spring-Kafka CVE-2023-34040
    • RocketMQ CVE-2023-33246
  • 🛡️Shiro
    • Shiro Intro
    • Request URI ByPass
    • Context Path ByPass
    • Remember Me反序列化 CC-Shiro
    • CB1与无CC依赖的反序列化链
  • 🍺Others
    • Deserialization Twice
    • A New Blazer 4 getter RCE
    • Apache Commons Jxpath
    • El Attack
    • Spel Attack
    • C3P0原生反序列化的JNDI打法
    • Log4j
    • Echo Tech
      • SpringBoot Under Tomcat
    • CTF 🚩
      • 长城杯-b4bycoffee (ROME反序列化)
      • MTCTF2022(CB+Shiro绕过)
      • CISCN 2023 西南赛区半决赛 (Hessian原生JDK+Kryo反序列化)
      • CISCN 2023 初赛 (高版本Commons Collections下其他依赖的利用)
      • CISCN 2021 总决赛 ezj4va (AspectJWeaver写字节码文件到classpath)
      • D^3CTF2023 (新的getter+高版本JNDI不出网+Hessian异常toString)
      • WMCTF2023(CC链花式玩法+盲读文件)
      • 第六届安洵杯网络安全挑战赛(CB PriorityQueue替代+Postgresql JDBC Attack+FreeMarker)
  • 🔍Code Inspector
    • CodeQL 🧶
      • Tutorial
        • Intro
        • Module
        • Predicate
        • Query
        • Type
      • CodeQL 4 Java
        • Basics
        • DFA
        • Example
    • SootUp ✨
      • Intro
      • Jimple
      • DFA
      • CG
    • Tabby 🔦
      • install
    • Theory
      • Static Analysis
        • Intro
        • IR & CFG
        • DFA
        • DFA-Foundation
        • Interprocedural Analysis
        • Pointer Analysis
        • Pointer Analysis Foundation
        • PTA-Context Sensitivity
        • Taint Anlysis
        • Datalog
Powered by GitBook
On this page
  • Motivation
  • Intro to Datalog
  • predicates
  • Atoms
  • Rules
  • Recursion
  • Rule Safety
  • Execution
  • Pointer Analysis via Datalog
  • Taint Analysis via Datalog

Was this helpful?

  1. 🔍Code Inspector
  2. Theory
  3. Static Analysis

Datalog

Motivation

Most program languages are imperative (tells how to do)

On the other hand, there are some program languages are declarative (tells what to do, like SQL)

The rules we learn in pointer analysis can be seen as specifications which can be implemented by declarative language.

Compared with imperative languages, declarative language are more:

  • succint

  • readable(logic-based specification)

  • easy to implement

Intro to Datalog

  • Datalog is a declarative logic programming language that is a subset of Prolog

  • It emerged as a database language, later found a variety of applications

    • program analysis

    • declarative networking

    • big data

    • cloud computing

Datalog = Data + Logic

predicates

In datalog, a predicate(relation)(谓词) is a set of statements(陈述)

Essentially, a predicate is a table of data, a fact asserts that a particular tuple belongs to a relation.

Atoms

Atoms are basic elements of Datalog, which represent predicates of the form.

terms can be:

  • variables(stand for any value)

  • constant

relational atom

P(X1,X2,…,Xn) is called relational atom, it evaluates to true when predicate P contains the tuple described by X1,X2,…,Xn

arithmetic atoms

E.g., age >= 18

Rules

  • Rule is a way of expressing logical inferences

  • Rule also serves to specify how facts are deduced

The form of a rule is

, can be read as (logical) and

head is true if all subgoals B1, B2, ..., and Bn are true

# deduce adults via Datalog rule
Adult(person) <-
	Age(person, age),
	age >= 18

There are two ways to express logical or in Datalog

# multiple rules with the same head
SportFan(person) <- Hobby(person, “jogging”).
SportFan(person) <- Hobby(person,“swimming”).

# use logical or operation ";"
SportFan(person) <-
    Hobby(person,“jogging”);
    Hobby(person,“swimming”).

The precedence of “;” (or) is lower than “,” (and), so disjunctions may be enclosed by parentheses, e.g., H <- A,(B;C)

A subgoal can be a negated atom, which negates its meaning, written as !B(...)

# compute the students who need to take a make-up exam
MakeupExamStd(student) <-
    Student(student),
    !PassedStd(student)

How to interpret a rule ?

  • Consider all possible combinations of values of the variables in the subgoals

  • If a combination makes all subgoals true, then the head atom with corresponding values is also true

  • The head predicate consists of all true atoms

Conventionally, predicates in Datalog are divide into two kinds

  • EDB(extensional database)

    • The predicates that are defined in a priori

    • Relations are immutable

    • Can be seen as input relations

  • IDB(intensional database)

    • The predicates that are established only by rules

    • Relations are inferred by rules

    • Can be seen as output relations

Recursion

Datalog supports recursive rules, which allows that an IDB predicate can be deduced (directly/indirectly) from itself

  • Without recursion, Datalog can only express the queries of basic relational algebra. (Just like SQL)

  • With recursion, Datalog becomes much more powerful, and is able to express sophisticated program analyses, such as pointer analysis

Reach(from, to) <-
	Edge(from, to).

Reach(from, to) <-
	Reach(from, node),
	Edge(node, to).

Where Edge(a,b) means that the graph has an edge from node a to node b, and Reach(a,b) means that b is reachable from a.

Rule Safety

A rule is safe if every variable appears in at least one non-negated relational atom

In Datalog, only safe rules are allowed

# below two are not safe
A(x) <- B(y), x > y.

A(x) <- B(y), !C(x,y).

# infinite values of x can satisfy the rule, which makes A an infinite relation

# recursion and negation of head are not allowed
A(x) <- B(x), !A(x).

Execution

  • Datalog engine deduces facts by given rules and EDB predicates until no new facts can be deduced. Some modern Datalog engines:

    • LogicBlox

    • XSB

    • Datomic

    • Flora-2

  • Monotonicity: Datalog is monotone as facts cannot be deleted

  • Termination: A Datalog program always terminates as

    • Datalog is monotone

    • Possible values of IDB predicates are finite (rule safety)

Pointer Analysis via Datalog

  • EDB:pointer-relevant information that can be extracted from program syntactically

  • IDB:pointer analysis results

  • Rules:pointer analysis rules

Modle:

EDB:

Rule:

Taint Analysis via Datalog

Pros & Cons of Datalog-Based Program Analysis:

  • Pros

    • succinct and readable

    • easy to implement

    • benefit from off-the-shelf optimized Datalog engines

  • Cons

    • restricted expressiveness, i.e., it is impossible or inconvenient to express some logics

    • Cannot fully control performance

PreviousTaint Anlysis

Last updated 11 months ago

Was this helpful?

image-20240425145456937
image-20240425145556237
image-20240425150055912
image-20240425150740237
image-20240425151007703
image-20240425152823297
image-20240425154324393
image-20240425154355163
image-20240425154442244
image-20240425160930109
image-20240425161448073
image-20240425161547192
image-20240425162153687
image-20240425165500215
image-20240425165550411