# Intro

## 0x00 Resource

[Static Program Analysis | Tai-e (pascal-lab.net)](http://tai-e.pascal-lab.net/lectures.html)

<https://www.bilibili.com/video/BV1b7411K7P4>

NOTEs Are Taken From Online Shared Course *Static Program Analysis* conducted by Teacher Yue Li and Tian Tan in NJU. Here I appreciate these two teachers for your selfless sharing. 😭

## 0x01 Intro

![image-20240925234445818](https://1239337109-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2HLPOkuOfb7iyCzDJ8vA%2Fuploads%2Fgit-blob-1999b068ec3de8f17f33e21955dec4251bfde0e4%2Fimage-20240925234445818.png?alt=media)

In the last decade, the language cores had few changes, but the programs became significantly larger and more complicated. To ensure the reliability, security, and other promises of large-scale and complex programs become a challenge.

Programming Languages（PL）can not live without Program analysis

Why Need Static Analysis：

* Program Reliability

  Null pointer dereference、memory leak（malloc without free）
* Program Security

  Private information leak、injection attack
* Compiler Optimization

  Dead code elimination、code motion
* Program Understanding

  IDE call hierarchy、type indication

Static analysis analyzes a program P to reason about its behaviors and determines whether it satisfies some properties before running P

> Rice's Theorem：
>
> Any non-trivial property of the behavior of programs in a r.e. language is undecidable
>
> r.e. (recursively enumerable) = recognizable by a Turing-machine
>
> A property is trivial if either it is not satisfied by any r.e. language, or if it is satisfied by all r.e. languages; otherwise it is non-trivial.
>
> non-trivial properties ≈ the properties related with run-time behaviors of programs

![image-20230402140026717](https://1239337109-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2HLPOkuOfb7iyCzDJ8vA%2Fuploads%2Fgit-blob-f79697bec93bfe4d7f6f0b3ec3c7be765eb15cf2%2Fimage-20230402140026717.png?alt=media)

So there is no perfect static analysis strategy

* Sound ≈ 误报
* Complete ≈ 漏报

![image-20230402140323081](https://1239337109-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2HLPOkuOfb7iyCzDJ8vA%2Fuploads%2Fgit-blob-aa4fc85bb99bd532af298355dfde72dc8412a54f%2Fimage-20230402140323081.png?alt=media)

But we can make some compromises to reach a useful static analysis

* Compromise soundness (false negatives)
* Compromise completeness (false positives)

![image-20230402140559500](https://1239337109-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2HLPOkuOfb7iyCzDJ8vA%2Fuploads%2Fgit-blob-e42aa20b19eec5eff928e4e28b90bba64d4142f3%2Fimage-20230402140559500.png?alt=media)

Mostly compromising completeness: Sound but not fully-precise static analysis

Static Analysis: ensure (or get close to) soundness, while making good trade-offs between analysis precision and analysis speed

How to Do Static Analysis：

* Abstraction
* Over-approximation
  * Transfer functions：
    * define how to evaluate different program statements on abstract values.
    * defined according to “analysis problem” and the “semantics” of different program statements
  * Control flows
    * flow merging
