Software Optimization Resources

Software optimization resources. C++ and assembly. Windows, Linux, BSD, Mac OS X

Contents

Optimization manuals

Vector class library

ForwardCom: An open standard instruction set for high performance microprocessors

Test programs for measuring clock cycles in C++ and assembly code

Object file converter and disassembler

Assembly function library

Floating point exception tracking and NaN propagation

CPUID manipulation program

Links

My blog on optimization

Optimization manuals

This series of five manuals describes everything you need to know about optimizing code for x86 and x86-64 family microprocessors, including optimization advice for C++ and assembly language, details about the microarchitecture and instruction timings of most Intel, AMD and VIA processors, and details about different compilers and calling conventions.

Operating systems covered: DOS, Windows, Linux, BSD, Mac OS X Intel based, 32 and 64 bits.

Note that these manuals are not for beginners.

1. Optimizing software in C++: An optimization guide for Windows, Linux and Mac platforms

This is an optimization manual for advanced C++ programmers. Topics include: The choice of platform and operating system. Choice of compiler and framework. Finding performance bottlenecks. The efficiency of different C++ constructs. Multi-core systems. Parallelization with vector operations. CPU dispatching. Efficient container class templates. Etc.

File name: optimizing_cpp.pdf, size: 1848238, last modified: 2025-Dec-15.

Download.
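
CPU dispatching, one of the topics listed above, means choosing between several versions of a function at run time according to the instruction sets the CPU supports. The following is only a minimal sketch, not code from the manual. It assumes GCC or Clang on x86 for the feature check (`__builtin_cpu_supports`) and falls back to the generic version everywhere else. The names `sum_generic`, `sum_avx2_placeholder`, `select_sum` and `sum` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Baseline version: runs on any CPU.
float sum_generic(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Stand-in for a version built for a newer instruction set.
// In a real project this would be a separate source file compiled
// with e.g. -mavx2. Here it just forwards to the generic code.
float sum_avx2_placeholder(const float* data, std::size_t n) {
    return sum_generic(data, n);
}

using SumFn = float (*)(const float*, std::size_t);

// Pick the best version that the current CPU can run.
SumFn select_sum() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // GCC and Clang both define __GNUC__ and provide these builtins on x86.
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return sum_avx2_placeholder;
#endif
    return sum_generic;  // Fallback for other compilers and CPUs.
}

// Public entry point: the selection runs once, on the first call.
float sum(const float* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    static const SumFn impl = select_sum();
    return impl(data, n);
}
```

Callers only ever use `sum`, so the cost of the feature check is paid once and every later call goes through a single function pointer.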

2. Optimizing subroutines in assembly language: An optimization guide for x86 platforms

This is an optimization manual for advanced assembly language programmers and compiler makers. Topics include: C++ intrinsic functions, inline assembly and stand-alone assembly. Linking optimized assembly subroutines into high level language programs. Making subroutine libraries compatible with multiple compilers and operating systems. Optimizing for speed or size. Memory access. Loops. Vector programming (XMM, YMM, SIMD). CPU-specific optimization and CPU dispatching.

File name: optimizing_assembly.pdf, size: 1091275, last modified: 2025-Dec-18.

Download.

3. The microarchitecture of Intel, AMD and VIA CPUs: An optimization guide for assembly programmers and compiler makers

This manual contains details about the internal working of various microprocessors from Intel, AMD and VIA. Topics include: Out-of-order execution, register renaming, pipeline structure, execution unit organization and branch prediction algorithms for each type of microprocessor. Describes many details that cannot be found in manuals from microprocessor vendors or anywhere else. The information is based on my own research and measurements rather than on official sources. This information will be useful to programmers who want to make CPU-specific optimizations as well as to compiler makers and students of microarchitecture.

File name: microarchitecture.pdf, size: 1866514, last modified: 2026-May-23.

Download.

4. Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs

Contains detailed lists of instruction latencies, execution unit throughputs, micro-operation breakdowns and other details for all common application instructions of most microprocessors from Intel, AMD and VIA. Intended as an appendix to the preceding manuals. Available as a PDF file and as a spreadsheet (ODS format).

File name: instruction_tables.pdf, size: 2248996, last modified: 2025-Sep-20.

Download.

File name: instruction_tables.ods, size: 557154, last modified: 2025-Sep-20.

Download.
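
When reading tables like these, it helps to keep latency and throughput apart: a chain of instructions that each depend on the previous result is limited by latency, while independent instructions are limited by throughput. The sketch below shows only the arithmetic and ignores every other bottleneck. The numbers in the usage note (a latency of 4 cycles and a reciprocal throughput of 0.5 cycles per instruction) are made up for illustration and are not values from the tables. The function names are hypothetical.

```cpp
#include <cassert>

// Rough cycle count for n instructions where each one needs the
// result of the previous one, so they cannot overlap.
double cycles_dependent_chain(int n, double latency) {
    assert(n >= 0 && latency >= 0.0);
    return n * latency;
}

// Rough cycle count for n independent instructions of the same kind.
// reciprocal_throughput is the average number of cycles per
// instruction when many of them are in flight at once.
double cycles_independent(int n, double latency, double reciprocal_throughput) {
    assert(n >= 0 && latency >= 0.0 && reciprocal_throughput >= 0.0);
    if (n == 0) return 0.0;
    // A new instruction starts every reciprocal_throughput cycles;
    // the last one still needs its full latency to finish.
    return (n - 1) * reciprocal_throughput + latency;
}
```

With the illustrative numbers, 100 dependent instructions come out at 400 cycles under this model, while 100 independent ones come out at 53.5 cycles. This is why breaking long dependency chains can matter more than the instruction count.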

5. Calling conventions for different C++ compilers and operating systems

This document contains details about data representation, function calling conventions, register usage conventions, name mangling schemes, etc. for many different C++ compilers and operating systems. Discusses compatibilities and incompatibilities between different C++ compilers. Includes information that is not covered by the official Application Binary Interface standards (ABIs). The information provided here is based on my own research and is therefore descriptive rather than normative. Intended as a source of reference for programmers who want to make function libraries compatible with multiple compilers or operating systems and for makers of compilers and other development tools who want their tools to be compatible with existing tools.

File name: calling_conventions.pdf, size: 1078737, last modified: 2023-Jul-01.

Download.
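
Name mangling is one reason why C++ object files from different compilers may fail to link together. A common workaround is to export library functions with C linkage. It is sketched here on the assumption that a plain C-style interface is acceptable for the library. The function names `add` and `lib_add` are hypothetical.

```cpp
#include <cstdint>

// C++ linkage: the symbol name encodes the parameter types, and the
// encoding scheme differs between compilers.
std::int32_t add(std::int32_t a, std::int32_t b) {
    return a + b;
}

// C linkage: the symbol name carries no type information, so other
// compilers and other languages can refer to it by a predictable name.
// Fixed-width integer types avoid differences in the size of int and long.
extern "C" std::int32_t lib_add(std::int32_t a, std::int32_t b) {
    return add(a, b);
}
```

The price is that overloading, namespaces and templates cannot be exposed through such an interface, and the calling convention still has to match between the library and its callers.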

All five manuals

Download all the above manuals together in one zip file.

File name: optimization_manuals.zip, size: 7011391, last modified: 2026-May-23.

Download.

C++ vector class library

This is a collection of C++ classes, functions and operators that makes it easier to use the vector instructions (Single Instruction Multiple Data instructions) of modern CPUs without using assembly language. Supports...
