Writeup Authors
- Tom Wang - ayanami at cs.stanford.edu
- Naeim Semsarilar - naeim at stanford.edu
Talk Overview
- Title: Secure Execution via Program Shepherding
- Summary: Professor Amarasinghe began with a brief overview of what he calls the "attack landscape", detailing the lifecycle of a security attack and some of the more common attack types. Program shepherding is implemented as a highly optimized interpreter using code caches, linking, indirect branch handling, and traces. The central observation of program shepherding is that attacks can be thwarted with three techniques:
- Restrict execution privileges (on basis of code origins)
- Restrict control transfer (defined entry points into shared libraries)
- Sandboxing checks (the application cannot subvert program shepherding's address space)
The Attack Landscape
- To defend against attacks, examine application vulnerabilities (human error is a separate issue).
- Usually, an attack is discovered, then many variations of the same attack are carried out by anyone (e.g. script kiddies), and finally the attack dies down to make way for a new one.
- A good metric to classify these variations of attacks is by the type of flaw exploited.
- Memory-based vulnerabilities are the most common type of flaw.
- e.g. according to the MS Security Bulletins for 2003-2004, 100% of critical vulnerabilities in MS Server were memory-based
- Other types:
- Privilege Elevation
- DOS-malformed
- Cross Site Scripting
- Weak or Missing Permission
- Information Leak
Attack Lifecycle
- Most attacks follow this pattern:
- Malicious code enters the vulnerable system masquerading as data
- Control gets hijacked
- Malicious code gets executed by operating system
- System is compromised
- How can we stop the attack from compromising the system? Three possibilities:
- Stop before malicious code enters the system
- Example: Bank guard at door with pictures of all known attacks.
- Flaw: Must know what attacks look like + difficult to keep up with new variations of attacks
- Stop if suspicious behavior detected
- Example: Bank guard tackles all customers who exhibit suspicious behavior (but what if you're just in a hurry to get to your meeting?)
- Flaw: Adequate detection requires some amount of high-level inference to avoid false positives.
- Stop in the act of criminal behavior, e.g. when "bad guy" grabs control
- Losing control of the program counter (PC) to the malicious code gives the attacker the ability to:
- Execute any instruction
- Bypass all protection set within the application
- Do any action the application is allowed to do (with same privileges)
- Insight: The main advantage in defense is that all bad guys need to grab control at some point, while good guys don't. Protecting data is hard, so instead we focus on protecting control transfers and preventing hijacking of the PC. This is the main focus of program shepherding.
- Flaw: Protecting only control transfers means the attacker can still carry out the initial attack on your system (buffer overflow, etc.) and kill the application, but cannot achieve the last step of fully compromising your system.
Program Shepherding
- The idea: Monitor all control flow transfers dynamically during execution, validate them, and catch bad activity. Never give up the PC to the attacker.
- Note: There are few compilers in the world that generate code, so the problem is tractable
- In order to have a usable system, it must be:
- Efficient - small performance overhead
- Transparent - no interference with program semantics
- All-inclusive - all control flow transfers must be monitored
- Robust
- Program Shepherding is built on top of DynamoRIO which has the above properties.
- Restricted Code Origins
- Check against a security policy to see if the origin of code is valid. Code origins include:
- Code from disk, originally loaded
- Code from disk, dynamically loaded
- Dynamically written code, self-contained w/ no system calls
- Track pages that have remained read-only since load time
- Restricted Control Transfers
- The idea:
- Only jump to known function entry points on jumps and indirect calls.
- Only return to the instruction immediately after a call instruction, and ensure that, for a direct call, the function called and the function being returned from are the same.
- On Windows, must handle specific control flows, such as callbacks and exceptions.
- Sandboxing checks
- Only allow control flow transfers to top of basic blocks or traces, so the attacker can't bypass the inserted checks.
- Must protect the shepherding system itself, by keeping all of its data structures in a write-protected segment.
- This is expensive to do, so in the real production system, they only do selective protection (i.e. protection of very critical pieces of the code) and add guard pages.
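The code-origin and control-transfer checks described above can be sketched as a small policy object. This is a minimal illustration and not the actual program shepherding implementation: the names `Origin`, `Page`, `Policy`, `may_execute`, and `may_transfer` are hypothetical, addresses are plain integers, and the default policy (allow only unmodified code from disk) is an assumption chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Origin(Enum):
    """Hypothetical labels for where a page of code came from."""
    DISK_LOADED = auto()      # code from disk, originally loaded
    DISK_DYNAMIC = auto()     # code from disk, dynamically loaded
    DYNAMIC_WRITTEN = auto()  # code written into memory at run time


@dataclass
class Page:
    origin: Origin
    modified_since_load: bool = False  # False means read-only since load time


@dataclass
class Policy:
    """Toy security policy; all field names are assumptions for illustration."""
    allowed_origins: set = field(
        default_factory=lambda: {Origin.DISK_LOADED, Origin.DISK_DYNAMIC}
    )
    entry_points: set = field(default_factory=set)      # known function entry addresses
    after_call_sites: set = field(default_factory=set)  # addresses right after a call

    def may_execute(self, page: Page) -> bool:
        # Restricted code origins: only allowed origins, and only pages
        # that have stayed unmodified since they were loaded.
        return page.origin in self.allowed_origins and not page.modified_since_load

    def may_transfer(self, kind: str, target: int) -> bool:
        # Restricted control transfers: indirect calls and jumps must land on
        # a known entry point; returns must land right after a call instruction.
        if kind in ("indirect_call", "indirect_jump"):
            return target in self.entry_points
        if kind == "return":
            return target in self.after_call_sites
        return False  # unknown transfer kinds are rejected by default
```

For example, with `Policy(entry_points={0x1000}, after_call_sites={0x2004})`, a return to `0x1000` is rejected because that address does not follow a call, while a return to `0x2004` is accepted. A real system would also have to decide where such checks are placed so that they cannot be bypassed, which is the job of the sandboxing checks.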
DynamoRIO
- DynamoRIO is a dynamic code modification/optimization system. It is implemented for IA-32 Windows and Linux.
- Efficiency
- Normal interpreters cause a 300x slow-down in the execution of programs. DynamoRIO makes a number of optimizations to minimize this slow-down enough that it can be used in a real production environment:
- Code cache
- Basic blocks are cached and executed natively. This results in an order of magnitude speed-up.
- Linking
- If the next block is already present in the cache, no context switch (to DynamoRIO and back) is necessary and the two blocks are linked by a direct jump. This results in another order of magnitude speed-up.
- Indirect branch handling
- A hashtable is used to translate original program addresses into their corresponding code cache addresses. Using a fast lookup into this table, indirect branch targets are resolved and handled.
- Traces
- Frequently executed basic blocks are merged together to form a trace. The benefit of traces outweighs their cost, which is incurred when the program leaves the trace at a branch point and the full branch target lookup must be performed.
- Transparency
- Data Transparency: User data is not modified. In order to achieve full data transparency, DynamoRIO uses its own heap and malloc.
- I/O Transparency: The program's I/O buffering is not modified.
- Program Address Transparency: The program is not aware that it is running in the DynamoRIO environment, and only sees original addresses, not code cache addresses. To achieve this, on indirect branches, program addresses must be translated to cache addresses, and self-references and visible program contexts are translated to original program addresses.
- Transparent Exception Handling: See next bullet.
- All-inclusiveness
- Must capture all code execution, including abnormal control flow. This is easy on Linux, where there are only signals to worry about, but on Windows there are exceptions, callbacks, asynchronous procedure calls, setjmp/longjmp, and set thread context to handle.
- Robustness
- There are a lot of corner cases in a real production environment, so in order to make a usable system (realized at Determina Inc.), they made many tweaks and sometimes had to emulate bugs in the system being monitored to maintain transparency.
- e.g. Microsoft Exchange Server does one big malloc and grabs all memory on the machine it's running on, leaving no memory for the Determina system to use.
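The efficiency mechanisms listed under DynamoRIO (code cache, linking, and hashtable lookup for indirect branches) can be illustrated with a toy model. This is only a sketch under strong assumptions: basic blocks are Python callables rather than machine code, the names `CachedBlock`, `ToyCodeCache`, `builds`, and `lookups` are hypothetical, and traces are omitted. It is not DynamoRIO's API.

```python
class CachedBlock:
    def __init__(self, addr, body):
        self.addr = addr   # original program address of the block
        self.body = body   # callable(state) -> (kind, target address)
        self.links = {}    # direct-branch target -> CachedBlock (linked blocks)


class ToyCodeCache:
    def __init__(self, program):
        self.program = program  # original address -> block body
        self.cache = {}         # hashtable: original address -> CachedBlock
        self.builds = 0         # blocks copied into the cache
        self.lookups = 0        # hashtable lookups performed

    def lookup(self, addr):
        # Translate an original address to its cached block,
        # copying the block into the cache on a miss.
        self.lookups += 1
        block = self.cache.get(addr)
        if block is None:
            block = CachedBlock(addr, self.program[addr])
            self.cache[addr] = block
            self.builds += 1
        return block

    def run(self, start, state):
        block = self.lookup(start)
        while True:
            kind, target = block.body(state)
            if kind == "halt":
                return state
            if kind == "direct":
                # Linking: after the first resolution, a direct branch
                # jumps straight to the cached target with no lookup.
                nxt = block.links.get(target)
                if nxt is None:
                    nxt = self.lookup(target)
                    block.links[target] = nxt
                block = nxt
            else:
                # Indirect branch: the target is only known at run time,
                # so it goes through the hashtable every time.
                block = self.lookup(target)


def count_down(state):       # block at 0x10: loop until n reaches 0
    state["n"] -= 1
    return ("direct", 0x10 if state["n"] > 0 else 0x20)


def jump_via_state(state):   # block at 0x20: indirect jump
    return ("indirect", state["target"])


def finish(state):           # block at 0x30: end of the toy program
    state["done"] = True
    return ("halt", None)


PROGRAM = {0x10: count_down, 0x20: jump_via_state, 0x30: finish}
```

Running `ToyCodeCache(PROGRAM).run(0x10, {"n": 3, "target": 0x30})` builds each of the three blocks once. On a second run through the same cache object no blocks are rebuilt, the direct branches follow their links, and only the entry point and the indirect branch still need a hashtable lookup.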
Related Work
- Partial Solutions
- Hardware Solutions
- Language Solutions
- Static Analysis/Model Checking Solutions
- Hybrid Approaches
- Tracking Dataflow
- Runtime Code Modification System (no security)