Smaller exe files from Visual C++

Just a collection of basic stuff to do to decrease the size of your images.

andrewl/crackmes.de

1) use the Visual Studio command line utilities, not the IDE

why?
- your ability to control compilation/linking of your exe's depends on your knowledge of the IDE's dialogs and other GUI crap like "projects" and "solutions"
- with the command line tools, all compiler options are immediately in front of you with "cl /?" and linker options with "link /?"

useful compiler options (cl):

    /c                compile only, don't link (so you can customize linker options later)
    /O1               optimize for size
    /GS-              disable the compiler's security cookie checks
    /Oi-              disable intrinsic functions (if you want to replace strlen(), etc.)

useful linker options (link):

    /NODEFAULTLIB     obvious: don't pull in the default libraries
    /MERGE            merge sections
    /SAFESEH:NO       no extra space allocated for the list of SEH handlers
    /ENTRY            define the entrypoint (away from the default _mainCRTStartup())
    /ALIGN            set alignment of sections
    /FILEALIGN        (undocumented!)
    /INCREMENTAL:NO   saves the call [addr_of_jmp] space

google for: msdn "linker options"
and: msdn "compiler options"
for even better explanations

2) don't statically link against what is already in DLL's

for example, user32.dll already has wvsprintfA(), wvsprintfW(), wsprintfA(), wsprintfW()
just link against user32.lib (has thunks for calling user32.dll)
another example is that _aullshr() exists in msvcrt, GDI32, SHLWAPI, and ntdll

3) bypass the convenience functions

printf() probably does normal formatting stuff to a buffer, then sends this to the kernel (who can actually write on the console window)...
google for "console functions (Windows)" to see the functions involved

now we can hack together a very small printf():

    void printf(char * fmtstr, ...)
    {
        DWORD dwRet;
        CHAR buffer[256];
        va_list v1;
        va_start(v1,fmtstr);
        wvsprintf(buffer,fmtstr,v1);
        WriteConsole(GetStdHandle(STD_OUTPUT_HANDLE), buffer, strlen(buffer), &dwRet, 0);
        va_end(v1);
    }

obvious security problems here (nothing stops the formatted output from overrunning buffer), but we are optimizing for size :)
strlen() is not defined once the CRT is gone, see below for defining your own intrinsics

if you don't like stdarg.h for whatever reason, you can get away with forcefully pushing some amount of parameters, whether they were sent to your printf or not:

    //...
    // push 8 possibly existing parameters
    __asm
    {
        push ecx
        push edi
        mov ecx, 0x08
        lea edi, fmtstr
        add edi, 0x20       // skip to param[7]
    more:
        push [edi]
        sub edi, 4
        dec ecx
        jnz more
    }
    wsprintf(msg_caller, fmtstr);
    // pop 8 possibly existing parameters
    __asm
    {
        add esp, 0x20
        pop edi
        pop ecx
    }

4) make your own intrinsics

many intrinsic functions are optimized for speed. for example, disassemble your memcpy: it will try to find out how many dwords are in your copy, then do rep movsd, then words, then bytes, and may try to get the writes to be word aligned

this is much slower, but much shorter :)

    void * memcpy(void *d, const void * s, unsigned int n)
    {
        __asm
        {
            mov esi, dword ptr[s]
            mov edi, dword ptr[d]
            mov ecx, dword ptr[n]
            rep movsb
        }
        return d;
    }

remember to disable the compiler's own intrinsics (/Oi-) or you will have a name collision.. it can also be difficult to make sure that none of the headers in your chain of includes declare these functions too

5) make your own CRT

MUCH easier than you might think... what do you need from a runtime? to allocate/deallocate mem? to print to the console?

look up the functions required to allocate memory from windows...
now just have some functions that set needed globals:

    HANDLE g_hHeap = 0;

    extern "C" BOOL crt_initialize()
    {
        g_hHeap = HeapCreate(0, 0, 0);
        return g_hHeap != 0;
    }

    extern "C" BOOL crt_uninitialize()
    {
        return HeapDestroy(g_hHeap);
    }

you can now, if you choose, override the default CRT entry's name:

    extern "C" int mainCRTStartup()
    {
        crt_initialize();
        // maybe get the arguments here with GetCommandLine()
        main(/* maybe send args here */);
        return 0;
    }

so how do you do malloc()/free()? how do you do new/delete? it's as simple as passing the requested sizes to the OS functions along with the heap handle made during the initialization

    extern "C" void * malloc(unsigned int size)
    {
        return HeapAlloc(g_hHeap, HEAP_ZERO_MEMORY, size);
    }

    extern "C" void free(void * p)
    {
        HeapFree(g_hHeap, 0, p);
    }

    void * __cdecl operator new(unsigned int size)
    {
        return HeapAlloc(g_hHeap, HEAP_ZERO_MEMORY, size);
    }

    void __cdecl operator delete(void *p)
    {
        HeapFree(g_hHeap, 0, p);
    }

hopefully you can figure out the rest... especially how your crt entry should acquire and supply needed parameters to your main() or winmain()

6) bypass the normal CRT

if you don't want to write a CRT, but also don't want the cruft that comes with the normal CRT, just specify that the linker should jump to your code first, NOT the crt

    /ENTRY:yourfunction

7) rip missing functions

here is _allshl() ripped from ntdll:

    extern "C" void __declspec(naked) _allshl()
    {
        __asm
        {
            loc_7C9016E9: cmp cl,40h
            loc_7C9016EC: jae loc_7C901703
            loc_7C9016EE: cmp cl,20h
            loc_7C9016F1: jae loc_7C9016F9
            loc_7C9016F3: shld edx,eax,cl
            loc_7C9016F6: shl eax,cl
            loc_7C9016F8: ret
            loc_7C9016F9: mov edx,eax
            loc_7C9016FB: xor eax,eax
            loc_7C9016FD: and cl,1Fh
            loc_7C901700: shl edx,cl
            loc_7C901702: ret
            loc_7C901703: xor eax,eax
            loc_7C901705: xor edx,edx
            loc_7C901707: ret
        }
    }

8) putting it together: 1k hello world

    #include <windows.h>

    extern "C" unsigned int strlen(const char *f)
    {
        INT i=0;
        while(*f++)
            i++;
        return i;
    }

    void printf(char * fmtstr, ...)
    {
        DWORD dwRet;
        CHAR buffer[256];
        va_list v1;
        va_start(v1,fmtstr);
        wvsprintf(buffer,fmtstr,v1);
        WriteConsole(GetStdHandle(STD_OUTPUT_HANDLE), buffer, strlen(buffer), &dwRet, 0);
        va_end(v1);
    }

    VOID main()
    {
        printf("hello world!");
    }

build with:

    cl small.cpp /c /O1 /GS- /Oi-
    link /NODEFAULTLIB /ENTRY:main /MERGE:.rdata=.text small.obj user32.lib kernel32.lib

9) from pegasus' pdf, learned about /ALIGN and /FILEALIGN

... FILEALIGN doesn't appear as one of the listed options anywhere! but link.exe accepts it, hmmm...

    link /ALIGN:8 /FILEALIGN:0x8 /NODEFAULTLIB /ENTRY:main /MERGE:.rdata=.text small.obj user32.lib kernel32.lib

reduces the above example to 784 bytes!!!
the argument to FILEALIGN appears to have no effect whatsoever

10) from ufmod page:

Try using the undocumented directive /opt:nowin98 while linking a Visual C++ or MASM32 project to minimize section alignment.

The .rdata section (read-only data, where the IAT and some other constants reside) and .text section (usually contains executable code) could be safely combined together into a single section. Try adding the following directive to the MS LINK.EXE or POLINK.EXE command line:

    /MERGE:.rdata=.text

There's another MS linker-specific known issue. link.exe attaches some unnecessary data between the DOS stub and the beginning of the PE header. It's easy to spot the dead weight in a hex editor - it ends with the magic word 'Rich', followed by the encoded machine compid. If you don't want your executables being signed this way or just don't like to spend some extra bytes (actually, it's half a Kb!) on the signature, there's a couple of workarounds available. First, you can switch to another linker. Or you can search the web to find an article on patching link.exe. Psst! It's written in Russian and available somewhere at wasm.ru.

Delphi likes to include a Relocation Table (.reloc section) inside every single executable. That's not required for a typical exe to run (but not a DLL!)
and you may safely remove that section. Try using StripReloc by Jordan Russell, PE Optimizer by Dr. Golova or a similar tool in case you don't know how to remove relocations by hand.

Visual Basic and Delphi like to create a Resources section (.rsrc) even if it doesn't contain any useful resources. So, it's usually safe to remove the whole resources section if it doesn't contain forms, XMs or anything your program might really need. The same applies to .flat sections inside PureBasic executables. Be careful while performing this kind of operation on your exe!

Packers and exe compressors, such as FSG and UPX, make executables smaller. Anyway, to make things fair, the sample executables are not compressed at all!

When using MS-COFF import libraries (like kernel32.lib, libkernel32.a, etc.) some space is wasted in the executable image to hold the original thunk tables. These tables are only required when binding the executable image. If you don't plan to bind your executables, you can get rid of the original thunks and save up to 512 bytes, or even a couple of kilobytes when importing a large number of symbols. To do so you should replace the original import libraries shipped with your compiler SDK (Visual Studio, masm32, etc.) with modified (stripped) import libraries and rebuild your projects. You can make a "stripped" import library with ImpLib SDK.

That's pretty much everything one should know about optimizing an executable file for size.
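
checking the result

To see which of the tricks above actually took effect (the alignment values from item 9, the 'Rich' block from item 10), it can help to read the numbers straight out of the PE header instead of trusting the file size alone. What follows is only a minimal sketch and not part of the original notes: the names PeSummary, summarize_pe and read_le are made up for illustration, and it is written as portable C++ working on a byte buffer rather than with the Windows headers. It relies on standard PE layout facts: the PE header offset is stored at 0x3C, NumberOfSections sits at PE+6, and SectionAlignment/FileAlignment sit at offsets 32 and 36 of the optional header (which starts at PE+24).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical summary type, for illustration only.
struct PeSummary {
    bool     valid;              // looked like an MZ/PE image
    bool     has_rich;           // "Rich" marker found before the PE header
    uint16_t sections;           // NumberOfSections
    uint32_t section_alignment;  // what /ALIGN controls
    uint32_t file_alignment;     // what /FILEALIGN is supposed to control
};

// Read an n-byte little-endian value (n <= 4) at offset off.
static uint32_t read_le(const std::vector<uint8_t>& b, size_t off, size_t n) {
    assert(n <= 4 && off + n <= b.size());
    uint32_t v = 0;
    for (size_t i = 0; i < n; ++i)
        v |= static_cast<uint32_t>(b[off + i]) << (8 * i);
    return v;
}

PeSummary summarize_pe(const std::vector<uint8_t>& img) {
    PeSummary s = {false, false, 0, 0, 0};

    // DOS header: "MZ" magic, e_lfanew (offset of PE header) at 0x3C.
    if (img.size() < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return s;
    size_t pe = read_le(img, 0x3C, 4);

    // Need the signature + COFF header (24 bytes) and the first
    // 40 bytes of the optional header to be inside the buffer.
    if (pe > img.size() || img.size() - pe < 24 + 40)
        return s;
    if (std::memcmp(&img[pe], "PE\0\0", 4) != 0)
        return s;

    // The linker's extra block lives between the DOS header and the
    // PE header, so only that gap is searched for the marker.
    for (size_t i = 0x40; i + 4 <= pe; ++i) {
        if (std::memcmp(&img[i], "Rich", 4) == 0) {
            s.has_rich = true;
            break;
        }
    }

    size_t opt = pe + 24;  // start of the optional header
    s.sections          = static_cast<uint16_t>(read_le(img, pe + 6, 2));
    s.section_alignment = read_le(img, opt + 32, 4);
    s.file_alignment    = read_le(img, opt + 36, 4);
    s.valid = true;
    return s;
}
```

To use it, read the linked exe into a std::vector<uint8_t> (for example with std::ifstream in binary mode) and pass it to summarize_pe(). Comparing the output for the item 8 and item 9 builds shows what the linker actually wrote for each alignment field, which is one way to investigate the observation that the FILEALIGN argument seems to have no effect.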