The Netwide Assembler: NASM

This manual documents NASM, the Netwide Assembler: an assembler targeting the Intel x86 series of processors, with portable source.

Chapter 1: Introduction
Section 1.1: What Is NASM?
Section 1.1.1: Why Yet Another Assembler?
Section 1.1.2: Licence Conditions
Section 1.2: Contact Information
Section 1.3: Installation
Section 1.3.1: Installing NASM under MS-DOS or Windows
Section 1.3.2: Installing NASM under Unix

Chapter 2: Running NASM
Section 2.1: NASM Command-Line Syntax
Section 2.1.1: The -o Option: Specifying the Output File Name
Section 2.1.2: The -f Option: Specifying the Output File Format
Section 2.1.3: The -l Option: Generating a Listing File
Section 2.1.4: The -M Option: Generate Makefile Dependencies
Section 2.1.5: The -F Option: Selecting a Debugging Format
Section 2.1.6: The -g Option: Enabling Debug Information
Section 2.1.7: The -E Option: Send Errors to a File
Section 2.1.8: The -s Option: Send Errors to stdout
Section 2.1.9: The -i Option: Include File Search Directories
Section 2.1.10: The -p Option: Pre-Include a File
Section 2.1.11: The -d Option: Pre-Define a Macro
Section 2.1.12: The -u Option: Undefine a Macro
Section 2.1.13: The -e Option: Preprocess Only
Section 2.1.14: The -a Option: Don't Preprocess At All
Section 2.1.15: The -On Option: Specifying Multipass Optimization
Section 2.1.16: The -t Option: Enable TASM Compatibility Mode
Section 2.1.17: The -w Option: Enable or Disable Assembly Warnings
Section 2.1.18: The -v Option: Display Version Info
Section 2.1.19: The NASMENV Environment Variable
Section 2.2: Quick Start for MASM Users
Section 2.2.1: NASM Is Case-Sensitive
Section 2.2.2: NASM Requires Square Brackets For Memory References
Section 2.2.3: NASM Doesn't Store Variable Types
Section 2.2.4: NASM Doesn't ASSUME
Section 2.2.5: NASM Doesn't Support Memory Models
Section 2.2.6: Floating-Point Differences
Section 2.2.7: Other Differences

Chapter 3: The NASM Language
Section 3.1: Layout of a NASM Source Line
Section 3.2: Pseudo-Instructions
Section 3.2.1: DB and friends: Declaring Initialised Data
Section 3.2.2: RESB and friends: Declaring Uninitialised Data
Section 3.2.3: INCBIN: Including External Binary Files
Section 3.2.4: EQU: Defining Constants
Section 3.2.5: TIMES: Repeating Instructions or Data
Section 3.3: Effective Addresses
Section 3.4: Constants
Section 3.4.1: Numeric Constants
Section 3.4.2: Character Constants
Section 3.4.3: String Constants
Section 3.4.4: Floating-Point Constants
Section 3.5: Expressions
Section 3.5.1: |: Bitwise OR Operator
Section 3.5.2: ^: Bitwise XOR Operator
Section 3.5.3: &: Bitwise AND Operator
Section 3.5.4: << and >>: Bit Shift Operators
Section 3.5.5: + and -: Addition and Subtraction Operators
Section 3.5.6: *, /, //, % and %%: Multiplication and Division
Section 3.5.7: Unary Operators: +, -, ~ and SEG
Section 3.6: SEG and WRT
Section 3.7: STRICT: Inhibiting Optimization
Section 3.8: Critical Expressions
Section 3.9: Local Labels

Chapter 4: The NASM Preprocessor
Section 4.1: Single-Line Macros
Section 4.1.1: The Normal Way: %define
Section 4.1.2: Enhancing %define: %xdefine
Section 4.1.3: Concatenating Single-Line Macro Tokens: %+
Section 4.1.4: Undefining Macros: %undef
Section 4.1.5: Preprocessor Variables: %assign
Section 4.2: String Handling in Macros: %strlen and %substr
Section 4.2.1: String Length: %strlen
Section 4.2.2: Sub-strings: %substr
Section 4.3: Multi-Line Macros: %macro
Section 4.3.1: Overloading Multi-Line Macros
Section 4.3.2: Macro-Local Labels
Section 4.3.3: Greedy Macro Parameters
Section 4.3.4: Default Macro Parameters
Section 4.3.5: %0: Macro Parameter Counter
Section 4.3.6: %rotate: Rotating Macro Parameters
Section 4.3.7: Concatenating Macro Parameters
Section 4.3.8: Condition Codes as Macro Parameters
Section 4.3.9: Disabling Listing Expansion
Section 4.4: Conditional Assembly
Section 4.4.1: %ifdef: Testing Single-Line Macro Existence
Section 4.4.2: %ifmacro: Testing Multi-Line Macro Existence
Section 4.4.3: %ifctx: Testing the Context Stack
Section 4.4.4: %if: Testing Arbitrary Numeric Expressions
Section 4.4.5: %ifidn and %ifidni: Testing Exact Text Identity
Section 4.4.6: %ifid, %ifnum, %ifstr: Testing Token Types
Section 4.4.7: %error: Reporting User-Defined Errors
Section 4.5: Preprocessor Loops: %rep
Section 4.6: Including Other Files
Section 4.7: The Context Stack
Section 4.7.1: %push and %pop: Creating and Removing Contexts
Section 4.7.2: Context-Local Labels
Section 4.7.3: Context-Local Single-Line Macros
Section 4.7.4: %repl: Renaming a Context
Section 4.7.5: Example Use of the Context Stack: Block IFs
Section 4.8: Standard Macros
Section 4.8.1: __NASM_MAJOR__, __NASM_MINOR__, __NASM_SUBMINOR__ and __NASM_PATCHLEVEL__: NASM Version
Section 4.8.2: __NASM_VERSION_ID__: NASM Version ID
Section 4.8.3: __NASM_VER__: NASM Version String
Section 4.8.4: __FILE__ and __LINE__: File Name and Line Number
Section 4.8.5: STRUC and ENDSTRUC: Declaring Structure Data Types
Section 4.8.6: ISTRUC, AT and IEND: Declaring Instances of Structures
Section 4.8.7: ALIGN and ALIGNB: Data Alignment
Section 4.9: TASM Compatible Preprocessor Directives
Section 4.9.1: %arg Directive
Section 4.9.2: %stacksize Directive
Section 4.9.3: %local Directive
Section 4.10: Other Preprocessor Directives
Section 4.10.1: %line Directive
Section 4.10.2: %!<env>: Read an Environment Variable

Chapter 5: Assembler Directives
Section 5.1: BITS: Specifying Target Processor Mode
Section 5.1.1: USE16 & USE32: Aliases for BITS
Section 5.2: SECTION or SEGMENT: Changing and Defining Sections
Section 5.2.1: The __SECT__ Macro
Section 5.3: ABSOLUTE: Defining Absolute Labels
Section 5.4: EXTERN: Importing Symbols from Other Modules
Section 5.5: GLOBAL: Exporting Symbols to Other Modules
Section 5.6: COMMON: Defining Common Data Areas
Section 5.7: CPU: Defining CPU Dependencies

Chapter 6: Output Formats
Section 6.1: bin: Flat-Form Binary Output
Section 6.1.1: ORG: Binary File Program Origin
Section 6.1.2: bin Extensions to the SECTION Directive
Section 6.1.3: Multisection Support for the bin Format
Section 6.2: obj: Microsoft OMF Object Files
Section 6.2.1: obj Extensions to the SEGMENT Directive
Section 6.2.2: GROUP: Defining Groups of Segments
Section 6.2.3: UPPERCASE: Disabling Case Sensitivity in Output
Section 6.2.4: IMPORT: Importing DLL Symbols
Section 6.2.5: EXPORT: Exporting DLL Symbols
Section 6.2.6: ..start: Defining the Program Entry Point
Section 6.2.7: obj Extensions to the EXTERN Directive
Section 6.2.8: obj Extensions to the COMMON Directive
Section 6.3: win32: Microsoft Win32 Object Files
Section 6.3.1: win32 Extensions to the SECTION Directive
Section 6.4: coff: Common Object File Format
Section 6.5: elf: Executable and Linkable Format Object Files
Section 6.5.1: elf Extensions to the SECTION Directive
Section 6.5.2: Position-Independent Code: elf Special Symbols and WRT
Section 6.5.3: elf Extensions to the GLOBAL Directive
Section 6.5.4: elf Extensions to the COMMON Directive
Section 6.5.5: 16-bit Code and ELF
Section 6.6: aout: Linux a.out Object Files
Section 6.7: aoutb: NetBSD/FreeBSD/OpenBSD a.out Object Files
Section 6.8: as86: Minix/Linux as86 Object Files
Section 6.9: rdf: Relocatable Dynamic Object File Format
Section 6.9.1: Requiring a Library: The LIBRARY Directive
Section 6.9.2: Specifying a Module Name: The MODULE Directive
Section 6.9.3: rdf Extensions to the GLOBAL Directive
Section 6.10: dbg: Debugging Format

Chapter 7: Writing 16-bit Code (DOS, Windows 3/3.1)
Section 7.1: Producing .EXE Files
Section 7.1.1: Using the obj Format To Generate .EXE Files
Section 7.1.2: Using the bin Format To Generate .EXE Files
Section 7.2: Producing .COM Files
Section 7.2.1: Using the bin Format To Generate .COM Files
Section 7.2.2: Using the obj Format To Generate .COM Files
Section 7.3: Producing .SYS Files
Section 7.4: Interfacing to 16-bit C Programs
Section 7.4.1: External Symbol Names
Section 7.4.2: Memory Models
Section 7.4.3: Function Definitions and Function Calls
Section 7.4.4: Accessing Data Items
Section 7.4.5: c16.mac: Helper Macros for the 16-bit C Interface
Section 7.5: Interfacing to Borland Pascal Programs
Section 7.5.1: The Pascal Calling Convention
Section 7.5.2: Borland Pascal Segment Name Restrictions
Section 7.5.3: Using c16.mac With Pascal Programs

Chapter 8: Writing 32-bit Code (Unix, Win32, DJGPP)
Section 8.1: Interfacing to 32-bit C Programs
Section 8.1.1: External Symbol Names
Section 8.1.2: Function Definitions and Function Calls
Section 8.1.3: Accessing Data Items
Section 8.1.4: c32.mac: Helper Macros for the 32-bit C Interface
Section 8.2: Writing NetBSD/FreeBSD/OpenBSD and Linux/ELF Shared Libraries
Section 8.2.1: Obtaining the Address of the GOT
Section 8.2.2: Finding Your Local Data Items
Section 8.2.3: Finding External and Common Data Items
Section 8.2.4: Exporting Symbols to the Library User
Section 8.2.5: Calling Procedures Outside the Library
Section 8.2.6: Generating the Library File

Chapter 9: Mixing 16 and 32 Bit Code
Section 9.1: Mixed-Size Jumps
Section 9.2: Addressing Between Different-Size Segments
Section 9.3: Other Mixed-Size Instructions

Chapter 10: Troubleshooting
Section 10.1: Common Problems
Section 10.1.1: NASM Generates Inefficient Code
Section 10.1.2: My Jumps are Out of Range
Section 10.1.3: ORG Doesn't Work
Section 10.1.4: TIMES Doesn't Work
Section 10.2: Bugs

Appendix A: Ndisasm
Section A.1: Introduction
Section A.2: Getting Started: Installation
Section A.3: Running NDISASM
Section A.3.1: COM Files: Specifying an Origin
Section A.3.2: Code Following Data: Synchronisation
Section A.3.3: Mixed Code and Data: Automatic (Intelligent) Synchronisation
Section A.3.4: Other Options
Section A.4: Bugs and Improvements

Appendix B: x86 Instruction Reference
Section B.1: Key to Operand Specifications
Section B.2: Key to Opcode Descriptions
Section B.2.1: Register Values
Section B.2.2: Condition Codes
Section B.2.3: SSE Condition Predicates
Section B.2.4: Status Flags
Section B.2.5: Effective Address Encoding: ModR/M and SIB
Section B.3: Key to Instruction Flags
Section B.4: x86 Instruction Set
Section B.4.1: AAA, AAS, AAM, AAD: ASCII Adjustments
Section B.4.2: ADC: Add with Carry
Section B.4.3: ADD: Add Integers
Section B.4.4: ADDPD: ADD Packed Double-Precision FP Values
Section B.4.5: ADDPS: ADD Packed Single-Precision FP Values
Section B.4.6: ADDSD: ADD Scalar Double-Precision FP Values
Section B.4.7: ADDSS: ADD Scalar Single-Precision FP Values
Section B.4.8: AND: Bitwise AND
Section B.4.9: ANDNPD: Bitwise Logical AND NOT of Packed Double-Precision FP Values
Section B.4.10: ANDNPS: Bitwise Logical AND NOT of Packed Single-Precision FP Values
Section B.4.11: ANDPD: Bitwise Logical AND For Double FP
Section B.4.12: ANDPS: Bitwise Logical AND For Single FP
Section B.4.13: ARPL: Adjust RPL Field of Selector
Section B.4.14: BOUND: Check Array Index against Bounds
Section B.4.15: BSF, BSR: Bit Scan
Section B.4.16: BSWAP: Byte Swap
Section B.4.17: BT, BTC, BTR, BTS: Bit Test
Section B.4.18: CALL: Call Subroutine
Section B.4.19: CBW, CWD, CDQ, CWDE: Sign Extensions
Section B.4.20: CLC, CLD, CLI, CLTS: Clear Flags
Section B.4.21: CLFLUSH: Flush Cache Line
Section B.4.22: CMC: Complement Carry Flag
Section B.4.23: CMOVcc: Conditional Move
Section B.4.24: CMP: Compare Integers
Section B.4.25: CMPccPD: Packed Double-Precision FP Compare
Section B.4.26: CMPccPS: Packed Single-Precision FP Compare
Section B.4.27: CMPSB, CMPSW, CMPSD: Compare Strings
Section B.4.28: CMPccSD: Scalar Double-Precision FP Compare
Section B.4.29: CMPccSS: Scalar Single-Precision FP Compare
Section B.4.30: CMPXCHG, CMPXCHG486: Compare and Exchange
Section B.4.31: CMPXCHG8B: Compare and Exchange Eight Bytes
Section B.4.32: COMISD: Scalar Ordered Double-Precision FP Compare and Set EFLAGS
Section B.4.33: COMISS: Scalar Ordered Single-Precision FP Compare and Set EFLAGS
Section B.4.34: CPUID: Get CPU Identification Code
Section B.4.35: CVTDQ2PD: Packed Signed INT32 to Packed Double-Precision FP Conversion
Section B.4.36: CVTDQ2PS: Packed Signed INT32 to Packed Single-Precision FP Conversion
Section B.4.37: CVTPD2DQ: Packed Double-Precision FP to Packed Signed INT32 Conversion
Section B.4.38: CVTPD2PI: Packed Double-Precision FP to Packed Signed INT32 Conversion
Section B.4.39: CVTPD2PS: Packed Double-Precision FP to Packed Single-Precision FP Conversion
Section B.4.40: CVTPI2PD: Packed Signed INT32 to Packed Double-Precision FP Conversion
Section B.4.41: CVTPI2PS: Packed Signed INT32 to Packed Single-Precision FP Conversion
Section B.4.42: CVTPS2DQ: Packed Single-Precision FP to Packed Signed INT32 Conversion
Section B.4.43: CVTPS2PD: Packed Single-Precision FP to Packed Double-Precision FP Conversion
Section B.4.44: CVTPS2PI: Packed Single-Precision FP to Packed Signed INT32 Conversion
Section B.4.45: CVTSD2SI: Scalar Double-Precision FP to Signed INT32 Conversion
Section B.4.46: CVTSD2SS: Scalar Double-Precision FP to Scalar Single-Precision FP Conversion
Section B.4.47: CVTSI2SD: Signed INT32 to Scalar Double-Precision FP Conversion
Section B.4.48: CVTSI2SS: Signed INT32 to Scalar Single-Precision FP Conversion
Section B.4.49: CVTSS2SD: Scalar Single-Precision FP to Scalar Double-Precision FP Conversion
Section B.4.50: CVTSS2SI: Scalar Single-Precision FP to Signed INT32 Conversion
Section B.4.51: CVTTPD2DQ: Packed Double-Precision FP to Packed Signed INT32 Conversion with Truncation
Section B.4.52: CVTTPD2PI: Packed Double-Precision FP to Packed Signed INT32 Conversion with Truncation
Section B.4.53: CVTTPS2DQ: Packed Single-Precision FP to Packed Signed INT32 Conversion with Truncation
Section B.4.54: CVTTPS2PI: Packed Single-Precision FP to Packed Signed INT32 Conversion with Truncation
Section B.4.55: CVTTSD2SI: Scalar Double-Precision FP to Signed INT32 Conversion with Truncation
Section B.4.56: CVTTSS2SI: Scalar Single-Precision FP to Signed INT32 Conversion with Truncation
Section B.4.57: DAA, DAS: Decimal Adjustments
Section B.4.58: DEC: Decrement Integer
Section B.4.59: DIV: Unsigned Integer Divide
Section B.4.60: DIVPD: Packed Double-Precision FP Divide
Section B.4.61: DIVPS: Packed Single-Precision FP Divide
Section B.4.62: DIVSD: Scalar Double-Precision FP Divide
Section B.4.63: DIVSS: Scalar Single-Precision FP Divide
Section B.4.64: EMMS: Empty MMX State
Section B.4.65: ENTER: Create Stack Frame
Section B.4.66: F2XM1: Calculate 2**X-1
Section B.4.67: FABS: Floating-Point Absolute Value
Section B.4.68: FADD, FADDP: Floating-Point Addition
Section B.4.69: FBLD, FBSTP: BCD Floating-Point Load and Store
Section B.4.70: FCHS: Floating-Point Change Sign
Section B.4.71: FCLEX, FNCLEX: Clear Floating-Point Exceptions
Section B.4.72: FCMOVcc: Floating-Point Conditional Move
Section B.4.73: FCOM, FCOMP, FCOMPP, FCOMI, FCOMIP: Floating-Point Compare
Section B.4.74: FCOS: Cosine
Section B.4.75: FDECSTP: Decrement Floating-Point Stack Pointer
Section B.4.76: FxDISI, FxENI: Disable and Enable Floating-Point Interrupts
Section B.4.77: FDIV, FDIVP, FDIVR, FDIVRP: Floating-Point Division
Section B.4.78: FEMMS: Faster Enter/Exit of the MMX or Floating-Point State
Section B.4.79: FFREE: Flag Floating-Point Register as Unused
Section B.4.80: FIADD: Floating-Point/Integer Addition
Section B.4.81: FICOM, FICOMP: Floating-Point/Integer Compare
Section B.4.82: FIDIV, FIDIVR: Floating-Point/Integer Division
Section B.4.83: FILD, FIST, FISTP: Floating-Point/Integer Conversion
Section B.4.84: FIMUL: Floating-Point/Integer Multiplication
Section B.4.85: FINCSTP: Increment Floating-Point Stack Pointer
Section B.4.86: FINIT, FNINIT: Initialise Floating-Point Unit
Section B.4.87: FISUB: Floating-Point/Integer Subtraction
Section B.4.88: FLD: Floating-Point Load
Section B.4.89: FLDxx: Floating-Point Load Constants
Section B.4.90: FLDCW: Load Floating-Point Control Word
Section B.4.91: FLDENV: Load Floating-Point Environment
Section B.4.92: FMUL, FMULP: Floating-Point Multiply
Section B.4.93: FNOP: Floating-Point No Operation
Section B.4.94: FPATAN, FPTAN: Arctangent and Tangent
Section B.4.95: FPREM, FPREM1: Floating-Point Partial Remainder
Section B.4.96: FRNDINT: Floating-Point Round to Integer
Section B.4.97: FSAVE, FRSTOR: Save/Restore Floating-Point State
Section B.4.98: FSCALE: Scale Floating-Point Value by Power of Two
Section B.4.99: FSETPM: Set Protected Mode
Section B.4.100: FSIN, FSINCOS: Sine and Cosine
Section B.4.101: FSQRT: Floating-Point Square Root
Section B.4.102: FST, FSTP: Floating-Point Store
Section B.4.103: FSTCW: Store Floating-Point Control Word
Section B.4.104: FSTENV: Store Floating-Point Environment
Section B.4.105: FSTSW: Store Floating-Point Status Word
Section B.4.106: FSUB, FSUBP, FSUBR, FSUBRP: Floating-Point Subtract
Section B.4.107: FTST: Test ST0 Against Zero
Section B.4.108: FUCOMxx: Floating-Point Unordered Compare
Section B.4.109: FXAM: Examine Class of Value in ST0
Section B.4.110: FXCH: Floating-Point Exchange
Section B.4.111: FXRSTOR: Restore FP, MMX and SSE State
Section B.4.112: FXSAVE: Store FP, MMX and SSE State
Section B.4.113: FXTRACT: Extract Exponent and Significand
Section B.4.114: FYL2X, FYL2XP1: Compute Y times Log2(X) or Log2(X+1)
Section B.4.115: HLT: Halt Processor
Section B.4.116: IBTS: Insert Bit String
Section B.4.117: IDIV: Signed Integer Divide
Section B.4.118: IMUL: Signed Integer Multiply
Section B.4.119: IN: Input from I/O Port
Section B.4.120: INC: Increment Integer
Section B.4.121: INSB, INSW, INSD: Input String from I/O Port
Section B.4.122: INT: Software Interrupt
Section B.4.123: INT3, INT1, ICEBP, INT01: Breakpoints
Section B.4.124: INTO: Interrupt if Overflow
Section B.4.125: INVD: Invalidate Internal Caches
Section B.4.126: INVLPG: Invalidate TLB Entry
Section B.4.127: IRET, IRETW, IRETD: Return from Interrupt
Section B.4.128: Jcc: Conditional Branch
Section B.4.129: JCXZ, JECXZ: Jump if CX/ECX Zero
Section B.4.130: JMP: Jump
Section B.4.131: LAHF: Load AH from Flags
Section B.4.132: LAR: Load Access Rights
Section B.4.133: LDMXCSR: Load Streaming SIMD Extension Control/Status
Section B.4.134: LDS, LES, LFS, LGS, LSS: Load Far Pointer
Section B.4.135: LEA: Load Effective Address
Section B.4.136: LEAVE: Destroy Stack Frame
Section B.4.137: LFENCE: Load Fence
Section B.4.138: LGDT, LIDT, LLDT: Load Descriptor Tables
Section B.4.139: LMSW: Load Machine Status Word
Section B.4.140: LOADALL, LOADALL286: Load Processor State
Section B.4.141: LODSB, LODSW, LODSD: Load from String
Section B.4.142: LOOP, LOOPE, LOOPZ, LOOPNE, LOOPNZ: Loop with Counter
Section B.4.143: LSL: Load Segment Limit
Section B.4.144: LTR: Load Task Register
Section B.4.145: MASKMOVDQU: Byte Mask Write
Section B.4.146: MASKMOVQ: Byte Mask Write
Section B.4.147: MAXPD: Return Packed Double-Precision FP Maximum
Section B.4.148: MAXPS: Return Packed Single-Precision FP Maximum
Section B.4.149: MAXSD: Return Scalar Double-Precision FP Maximum
Section B.4.150: MAXSS: Return Scalar Single-Precision FP Maximum
Section B.4.151: MFENCE: Memory Fence
Section B.4.152: MINPD: Return Packed Double-Precision FP Minimum
Section B.4.153: MINPS: Return Packed Single-Precision FP Minimum
Section B.4.154: MINSD: Return Scalar Double-Precision FP Minimum
Section B.4.155: MINSS: Return Scalar Single-Precision FP Minimum
Section B.4.156: MOV: Move Data
Section B.4.157: MOVAPD: Move Aligned Packed Double-Precision FP Values
Section B.4.158: MOVAPS: Move Aligned Packed Single-Precision FP Values
Section B.4.159: MOVD: Move Doubleword to/from MMX Register
Section B.4.160: MOVDQ2Q: Move Quadword from XMM to MMX Register
Section B.4.161: MOVDQA: Move Aligned Double Quadword
Section B.4.162: MOVDQU: Move Unaligned Double Quadword
Section B.4.163: MOVHLPS: Move Packed Single-Precision FP High to Low
Section B.4.164: MOVHPD: Move High Packed Double-Precision FP
Section B.4.165: MOVHPS: Move High Packed Single-Precision FP
Section B.4.166: MOVLHPS: Move Packed Single-Precision FP Low to High
Section B.4.167: MOVLPD: Move Low Packed Double-Precision FP
Section B.4.168: MOVLPS: Move Low Packed Single-Precision FP
Section B.4.169: MOVMSKPD: Extract Packed Double-Precision FP Sign Mask
Section B.4.170: MOVMSKPS: Extract Packed Single-Precision FP Sign Mask
Section B.4.171: MOVNTDQ: Move Double Quadword Non Temporal
Section B.4.172: MOVNTI: Move Doubleword Non Temporal
Section B.4.173: MOVNTPD: Move Aligned Two Packed Double-Precision FP Values Non Temporal
Section B.4.174: MOVNTPS: Move Aligned Four Packed Single-Precision FP Values Non Temporal
Section B.4.175: MOVNTQ: Move Quadword Non Temporal
Section B.4.176: MOVQ: Move Quadword to/from MMX Register
Section B.4.177: MOVQ2DQ: Move Quadword from MMX to XMM Register
Section B.4.178: MOVSB, MOVSW, MOVSD: Move String
Section B.4.179: MOVSD: Move Scalar Double-Precision FP Value
Section B.4.180: MOVSS: Move Scalar Single-Precision FP Value
Section B.4.181: MOVSX, MOVZX: Move Data with Sign or Zero Extend
Section B.4.182: MOVUPD: Move Unaligned Packed Double-Precision FP Values
Section B.4.183: MOVUPS: Move Unaligned Packed Single-Precision FP Values
Section B.4.184: MUL: Unsigned Integer Multiply
Section B.4.185: MULPD: Packed Double-FP Multiply
Section B.4.186: MULPS: Packed Single-FP Multiply
Section B.4.187: MULSD: Scalar Double-FP Multiply
Section B.4.188: MULSS: Scalar Single-FP Multiply
Section B.4.189: NEG, NOT: Two's and One's Complement
Section B.4.190: NOP: No Operation
Section B.4.191: OR: Bitwise OR
Section B.4.192: ORPD: Bitwise Logical OR of Double-Precision FP Data
Section B.4.193: ORPS: Bitwise Logical OR of Single-Precision FP Data
Section B.4.194: OUT: Output Data to I/O Port
Section B.4.195: OUTSB, OUTSW, OUTSD: Output String to I/O Port
Section B.4.196: PACKSSDW, PACKSSWB, PACKUSWB: Pack Data
Section B.4.197: PADDB, PADDW, PADDD: Add Packed Integers
Section B.4.198: PADDQ: Add Packed Quadword Integers
Section B.4.199: PADDSB, PADDSW: Add Packed Signed Integers With Saturation
Section B.4.200: PADDSIW: MMX Packed Addition to Implicit Destination
Section B.4.201: PADDUSB, PADDUSW: Add Packed Unsigned Integers With Saturation
Section B.4.202: PAND, PANDN: MMX Bitwise AND and AND-NOT
Section B.4.203: PAUSE: Spin Loop Hint
Section B.4.204: PAVEB: MMX Packed Average
Section B.4.205: PAVGB, PAVGW: Average Packed Integers
Section B.4.206: PAVGUSB: Average of Unsigned Packed 8-bit Values
Section B.4.207: PCMPxx: Compare Packed Integers
Section B.4.208: PDISTIB: MMX Packed Distance and Accumulate with Implied Register
Section B.4.209: PEXTRW: Extract Word
Section B.4.210: PF2ID: Packed Single-Precision FP to Integer Convert
Section B.4.211: PF2IW: Packed Single-Precision FP to Integer Word Convert
Section B.4.212: PFACC: Packed Single-Precision FP Accumulate
Section B.4.213: PFADD: Packed Single-Precision FP Addition
Section B.4.214: PFCMPxx: Packed Single-Precision FP Compare
Section B.4.215: PFMAX: Packed Single-Precision FP Maximum
Section B.4.216: PFMIN: Packed Single-Precision FP Minimum
Section B.4.217: PFMUL: Packed Single-Precision FP Multiply
Section B.4.218: PFNACC: Packed Single-Precision FP Negative Accumulate
Section B.4.219: PFPNACC: Packed Single-Precision FP Mixed Accumulate
Section B.4.220: PFRCP: Packed Single-Precision FP Reciprocal Approximation
Section B.4.221: PFRCPIT1: Packed Single-Precision FP Reciprocal, First Iteration Step
Section B.4.222: PFRCPIT2: Packed Single-Precision FP Reciprocal/Reciprocal Square Root, Second Iteration Step
Section B.4.223: PFRSQIT1: Packed Single-Precision FP Reciprocal Square Root, First Iteration Step
Section B.4.224: PFRSQRT: Packed Single-Precision FP Reciprocal Square Root Approximation
Section B.4.225: PFSUB: Packed Single-Precision FP Subtract
Section B.4.226: PFSUBR: Packed Single-Precision FP Reverse Subtract
Section B.4.227: PI2FD: Packed Doubleword Integer to Single-Precision FP Convert
Section B.4.228: PI2FW: Packed Word Integer to Single-Precision FP Convert
Section B.4.229: PINSRW: Insert Word
Section B.4.230: PMACHRIW: Packed Multiply and Accumulate with Rounding
Section B.4.231: PMADDWD: MMX Packed Multiply and Add
Section B.4.232: PMAGW: MMX Packed Magnitude
Section B.4.233: PMAXSW: Packed Signed Integer Word Maximum
Section B.4.234: PMAXUB: Packed Unsigned Integer Byte Maximum
Section B.4.235: PMINSW: Packed Signed Integer Word Minimum
Section B.4.236: PMINUB: Packed Unsigned Integer Byte Minimum
Section B.4.237: PMOVMSKB: Move Byte Mask To Integer
Section B.4.238: PMULHRWC, PMULHRIW: Multiply Packed 16-bit Integers With Rounding, and Store High Word
Section B.4.239: PMULHRWA: Multiply Packed 16-bit Integers With Rounding, and Store High Word
Section B.4.240: PMULHUW: Multiply Packed 16-bit Integers, and Store High Word
Section B.4.241: PMULHW, PMULLW: Multiply Packed 16-bit Integers, and Store
Section B.4.242: PMULUDQ: Multiply Packed Unsigned 32-bit Integers, and Store
Section B.4.243: PMVccZB: MMX Packed Conditional Move
Section B.4.244: POP: Pop Data from Stack
Section B.4.245: POPAx: Pop All General-Purpose Registers
Section B.4.246: POPFx: Pop Flags Register
Section B.4.247: POR: MMX Bitwise OR
Section B.4.248: PREFETCH: Prefetch Data Into Caches
Section B.4.249: PREFETCHh: Prefetch Data Into Caches
Section B.4.250: PSADBW: Packed Sum of Absolute Differences
Section B.4.251: PSHUFD: Shuffle Packed Doublewords
Section B.4.252: PSHUFHW: Shuffle Packed High Words
Section B.4.253: PSHUFLW: Shuffle Packed Low Words
Section B.4.254: PSHUFW: Shuffle Packed Words
Section B.4.255: PSLLx: Packed Data Bit Shift Left Logical
Section B.4.256: PSRAx: Packed Data Bit Shift Right Arithmetic
Section B.4.257: PSRLx: Packed Data Bit Shift Right Logical
Section B.4.258: PSUBx: Subtract Packed Integers
Section B.4.259: PSUBSxx, PSUBUSx: Subtract Packed Integers With Saturation
Section B.4.260: PSUBSIW: MMX Packed Subtract with Saturation to Implied Destination
Section B.4.261: PSWAPD: Swap Packed Data
Section B.4.262: PUNPCKxxx: Unpack and Interleave Data
Section B.4.263: PUSH: Push Data on Stack
Section B.4.264: PUSHAx: Push All General-Purpose Registers
Section B.4.265: PUSHFx: Push Flags Register
Section B.4.266: PXOR: MMX Bitwise XOR
Section B.4.267: RCL, RCR: Bitwise Rotate through Carry Bit
Section B.4.268: RCPPS: Packed Single-Precision FP Reciprocal
Section B.4.269: RCPSS: Scalar Single-Precision FP Reciprocal
Section B.4.270: RDMSR: Read Model-Specific Registers
Section B.4.271: RDPMC: Read Performance-Monitoring Counters
Section B.4.272: RDSHR: Read SMM Header Pointer Register
Section B.4.273: RDTSC: Read Time-Stamp Counter
Section B.4.274: RET, RETF, RETN: Return from Procedure Call
Section B.4.275: ROL, ROR: Bitwise Rotate
Section B.4.276: RSDC: Restore Segment Register and Descriptor
Section B.4.277: RSLDT: Restore LDTR and Descriptor
Section B.4.278: RSM: Resume from System-Management Mode
Section B.4.279: RSQRTPS: Packed Single-Precision FP Square Root Reciprocal
Section B.4.280: RSQRTSS: Scalar Single-Precision FP Square Root Reciprocal
Section B.4.281: RSTS: Restore TSR and Descriptor
Section B.4.282: SAHF: Store AH to Flags
Section B.4.283: SAL, SAR: Bitwise Arithmetic Shifts
Section B.4.284: SALC: Set AL from Carry Flag
Section B.4.285: SBB: Subtract with Borrow
Section B.4.286: SCASB, SCASW, SCASD: Scan String
Section B.4.287: SETcc: Set Register from Condition
Section B.4.288: SFENCE: Store Fence
Section B.4.289: SGDT, SIDT, SLDT: Store Descriptor Table Pointers
Section B.4.290: SHL, SHR: Bitwise Logical Shifts
Section B.4.291: SHLD, SHRD: Bitwise Double-Precision Shifts
Section B.4.292: SHUFPD: Shuffle Packed Double-Precision FP Values
Section B.4.293: SHUFPS: Shuffle Packed Single-Precision FP Values
Section B.4.294: SMI: System Management Interrupt
Section B.4.295: SMINT, SMINTOLD: Software SMM Entry (CYRIX)
Section B.4.296: SMSW: Store Machine Status Word
Section B.4.297: SQRTPD: Packed Double-Precision FP Square Root
Section B.4.298: SQRTPS: Packed Single-Precision FP Square Root
Section B.4.299: SQRTSD: Scalar Double-Precision FP Square Root
Section B.4.300: SQRTSS: Scalar Single-Precision FP Square Root
Section B.4.301: STC, STD, STI: Set Flags
Section B.4.302: STMXCSR: Store Streaming SIMD Extension Control/Status
Section B.4.303: STOSB, STOSW, STOSD: Store Byte to String
Section B.4.304: STR: Store Task Register
Section B.4.305: SUB: Subtract Integers
Section B.4.306: SUBPD: Packed Double-Precision FP Subtract
Section B.4.307: SUBPS: Packed Single-Precision FP Subtract
Section B.4.308: SUBSD: Scalar Double-FP Subtract
Section B.4.309: SUBSS: Scalar Single-FP Subtract
Section B.4.310: SVDC: Save Segment Register and Descriptor
Section B.4.311: SVLDT: Save LDTR and Descriptor
Section B.4.312: SVTS: Save TSR and Descriptor
Section B.4.313: SYSCALL: Call Operating System
Section B.4.314: SYSENTER: Fast System Call
Section B.4.315: SYSEXIT: Fast Return From System Call
Section B.4.316: SYSRET: Return From Operating System
Section B.4.317: TEST: Test Bits (notional bitwise AND)
Section B.4.318: UCOMISD: Unordered Scalar Double-Precision FP Compare and Set EFLAGS
Section B.4.319: UCOMISS: Unordered Scalar Single-Precision FP Compare and Set EFLAGS
Section B.4.320: UD0, UD1, UD2: Undefined Instruction
Section B.4.321: UMOV: User Move Data
Section B.4.322: UNPCKHPD: Unpack and Interleave High Packed Double-Precision FP Values
Section B.4.323: UNPCKHPS: Unpack and Interleave High Packed Single-Precision FP Values
Section B.4.324: UNPCKLPD: Unpack and Interleave Low Packed Double-Precision FP Data
Section B.4.325: UNPCKLPS: Unpack and Interleave Low Packed Single-Precision FP Data
Section B.4.326: VERR, VERW: Verify Segment Readability/Writability
Section B.4.327: WAIT: Wait for Floating-Point Processor
Section B.4.328: WBINVD: Write Back and Invalidate Cache
Section B.4.329: WRMSR: Write Model-Specific Registers
Section B.4.330: WRSHR: Write SMM Header Pointer Register
Section B.4.331: XADD: Exchange and Add
Section B.4.332: XBTS: Extract Bit String
Section B.4.333: XCHG: Exchange
Section B.4.334: XLATB: Translate Byte in Lookup Table
Section B.4.335: XOR: Bitwise Exclusive OR
Section B.4.336: XORPD: Bitwise Logical XOR of Double-Precision FP Values
Section B.4.337: XORPS: Bitwise Logical XOR of Single-Precision FP Values

Index