Newsgroups: comp.lang.asm.x86,alt.msdos.programmer
From: terra@diku.dk (Morten Welinder)
Subject: Intel Documentation Errata (LONG)
Date: Fri, 28 Oct 1994 09:41:00 GMT


The Unofficial Intel Documentation Errata version 1.00
Copyright (C) 1994 Morten Welinder

Permission is granted to anyone to make or distribute verbatim copies of this
document, in any medium, provided that the copyright notice and permission
notice are preserved, and that the distributor grants the recipient permission
for further redistribution as permitted by this notice.

Modified versions may not be made.
-------------------------------------------------------------------------------
This is a list of errata in Intel's Cpu documentation.  I only have the
i486 manual, but I expect most of the mistakes to be present in the i386
and Pentium manuals also.

A page number like "page 486: 22-1" is a page number in "i486(tm) Processor
Programmer's Reference Manual", ISBN 1-55512-101-2, Intel 1990.

I decide what counts as errata!  Suggestions are always welcome -- send
email to me at <terra@diku.dk>.
-------------------------------------------------------------------------------
Figure 3-23: CPU_ID Detection Code, page 486: 3-42

This code is not guaranteed to work in protected mode (including V86)
since the flag instructions may be emulated.  However, only old system
software (like Borland Turbo Debugger TD386) will deny flipping of the
AC flag.
-------------------------------------------------------------------------------
Figure 5-8: Segment Descriptors, page 486: 5-11

The blank 4-bit field should read "LIMIT 19:16".
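
To show where that field sits, here is a small Python sketch that unpacks
an 8-byte descriptor.  The function name and the sample bytes are my own
inventions for illustration; the layout assumed is the usual 386/486 one
with LIMIT 19:16 in the low nibble of byte 6.

```python
def decode_descriptor(raw: bytes) -> dict:
    """Unpack an 8-byte segment descriptor (lowest-addressed byte first)."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    limit_15_0 = raw[0] | (raw[1] << 8)
    base_23_0 = raw[2] | (raw[3] << 8) | (raw[4] << 16)
    access = raw[5]
    limit_19_16 = raw[6] & 0x0F      # the field left blank in Figure 5-8
    flags = raw[6] >> 4              # G, D/B, 0, AVL (high bit to low bit)
    base_31_24 = raw[7]
    return {
        "base": base_23_0 | (base_31_24 << 24),
        "limit": limit_15_0 | (limit_19_16 << 16),
        "access": access,
        "granularity": (flags >> 3) & 1,
        "d_or_b": (flags >> 2) & 1,
    }

# Hypothetical flat code segment: base 0, limit 0xFFFFF, G=1, D=1.
example = bytes([0xFF, 0xFF, 0x00, 0x00, 0x00, 0x9A, 0xCF, 0x00])
fields = decode_descriptor(example)
```

Without the upper four limit bits the decoded limit of the sample would
come out as 0xFFFF instead of 0xFFFFF.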
-------------------------------------------------------------------------------
D-bit in data descriptors, page 486: 5-12

The D-bit as described is for code segments.  For data segments the same
bit is sometimes called the B-bit and it is only used for stack segments
where it determines whether ESP (B=1) or SP (B=0) is used.
-------------------------------------------------------------------------------
Causes of divide-error fault, page 486: 9-14

The divide-error fault is raised not only when dividing by zero, but
also when an overflow is detected in a divide instruction.
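
A rough model of the unsigned 8-bit divide may make the overflow case
concrete.  This is only a sketch: DivideError and div8 are names I made
up, and the exception class merely stands in for exception 0.

```python
class DivideError(Exception):
    """Stands in for exception 0 (divide error)."""

def div8(ax: int, divisor: int) -> tuple:
    """Model unsigned "DIV r/m8": AX / divisor -> (AL quotient, AH remainder)."""
    ax &= 0xFFFF
    divisor &= 0xFF
    if divisor == 0:
        raise DivideError("division by zero")
    quotient, remainder = divmod(ax, divisor)
    if quotient > 0xFF:
        # The quotient must fit in AL; if it does not, the cpu faults.
        raise DivideError("quotient does not fit in AL")
    return quotient, remainder
```

For example div8(100, 7) gives (14, 2), while div8(0x0100, 1) raises the
error even though the divisor is non-zero.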
-------------------------------------------------------------------------------
Returning to Real Mode, page 486: 22-4

It seems that SS (and probably CS) must be loaded with selectors that have
their "Big" bit cleared.  Otherwise strange errors can occur later when
the stack pointer is loaded.
-------------------------------------------------------------------------------
"AAD" instruction, page 486: 26-19

Note that this is a two-byte instruction.  The second byte can be any
constant and it will take the place of "10".  This is undocumented and
reportedly does not work on some early 8086-clones.
-------------------------------------------------------------------------------
"AAM" instruction, page 486: 26-20

Note that this is a two-byte instruction.  The second byte can be any
constant and it will take the place of "10".  This is undocumented.
If the constant is zero then exception 0 is raised.
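
The effect of the second byte on "AAD" and "AAM" can be sketched as
follows.  The function names and the `base' parameter are mine, flags
are not modelled, and Python's ZeroDivisionError stands in for
exception 0.

```python
def aad(ah: int, al: int, base: int = 10) -> tuple:
    """Model "AAD": AL = (AL + AH * base) mod 256, AH = 0.  Returns (AH, AL)."""
    al = ((al & 0xFF) + (ah & 0xFF) * (base & 0xFF)) & 0xFF
    return 0, al

def aam(al: int, base: int = 10) -> tuple:
    """Model "AAM": AH = AL / base, AL = AL mod base.  Returns (AH, AL)."""
    base &= 0xFF
    if base == 0:
        raise ZeroDivisionError("AAM with a zero constant raises exception 0")
    return divmod(al & 0xFF, base)
```

With the default constant aam(25) gives (2, 5) and aad(2, 5) gives
(0, 25).  With 16 as the constant aam(0x12, 16) splits AL into its two
nibbles, (1, 2), and aad(1, 2, 16) packs them again into 0x12.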
-------------------------------------------------------------------------------
"BT" instruction, page 486: 26-36

The opcode for "BT r/m16,imm8" should be "0F BA /4 ib".
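
For readers not used to the "/digit" notation: the digit goes in the
middle three bits of the ModR/M byte.  A minimal sketch, restricted to
register operands (the helper name is made up):

```python
def encode_digit_form(opcode: bytes, digit: int, rm: int, imm8=None) -> bytes:
    """Encode "<opcode> /digit [ib]" with a register as the r/m operand."""
    if not (0 <= digit <= 7 and 0 <= rm <= 7):
        raise ValueError("digit and rm are 3-bit fields")
    modrm = 0xC0 | (digit << 3) | rm     # mod=11: r/m names a register
    encoded = opcode + bytes([modrm])
    if imm8 is not None:
        encoded += bytes([imm8 & 0xFF])
    return encoded

# "BT AX,3" assembled for a 16-bit code segment (AX is register number 0).
bt_ax_3 = encode_digit_form(b"\x0f\xba", 4, 0, 3)
```

Here bt_ax_3 is the byte sequence 0F BA E0 03.  The same helper gives
FF C8 for "FF /1" and FF C0 for "FF /0" with register number 0.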
-------------------------------------------------------------------------------
"CALL" instruction, page 486: 26-44

Note that the 16-bit displacement version will clear the upper half of
EIP even in a 32-bit code segment.  Therefore it should not be used in
such a segment unless you are certain that the destination will never
exceed 64K.
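
The truncation can be modelled like this.  It is a sketch only; the
function and parameter names are mine, and `eip_next' means the address
of the instruction following the branch.

```python
def near_branch_target(eip_next: int, displacement: int, operand_size: int) -> int:
    """Compute the new EIP for a relative near branch."""
    if operand_size == 16:
        # 16-bit operand size: the upper half of EIP is cleared.
        return (eip_next + displacement) & 0xFFFF
    if operand_size == 32:
        return (eip_next + displacement) & 0xFFFFFFFF
    raise ValueError("operand size must be 16 or 32")
```

With eip_next = 0x00012345 and a displacement of 0x10 the 32-bit form
lands on 0x00012355 but the 16-bit form lands on 0x00002355.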
-------------------------------------------------------------------------------
"CLTS" instruction, page 486: 26-55

Under V86 mode an #GP(0) is always generated.
-------------------------------------------------------------------------------
"CMPXCHG" instruction, page 486: 26-62

Note that some versions of Borland's Turbo Assembler get this instruction
wrong.
-------------------------------------------------------------------------------
"DEC" instruction, page 486: 26-67

The opcode for "DEC r/m32" should be "FF /1".
-------------------------------------------------------------------------------
"ENTER" instruction, page 486: 26-70

The value assigned to `frame-ptr' should be ESP if the stack is "Big"
and SP if not.

Great care should be taken with "ENTER" and "LEAVE" when the stack segment
size is 32 bits and the code segment size is 16 bits.  It is easy to
destroy the upper half of ESP.  Use equivalent simpler (and usually faster)
instructions instead.
-------------------------------------------------------------------------------
"FSTSW" instruction, page 486: 26-136

The second form of the instruction should be "FSTSW AX".
-------------------------------------------------------------------------------
"INC" instruction, page 486: 26-164

The opcode for "INC r/m32" should be "FF /0".
-------------------------------------------------------------------------------
"INT" instructions, page 486: 26-167

Page 23-8 states that interrupt instructions are IOPL sensitive in V86
mode.
-------------------------------------------------------------------------------
"Jcc" conditional jump instructions, page 486: 26-180

Note that the 16-bit displacement versions will clear the upper half of
EIP even in a 32-bit code segment.  Therefore they should not be used in
such a segment unless you are certain that the destination will never
exceed 64K.  The same is true for "JCXZ" (as opposed to "JECXZ").

It is usually faster and simpler to test CX/ECX with ordinary instructions.
-------------------------------------------------------------------------------
"JMP" instruction, page 486: 26-183

Note that the 16-bit displacement version will clear the upper half of
EIP even in a 32-bit code segment.  Therefore it should not be used in
such a segment unless you are certain that the destination will never
exceed 64K.  (The same is true for the 8-bit displacement version, but
who would add an operand size prefix for that?)
-------------------------------------------------------------------------------
"LEAVE" instruction, page 486: 26-193

Great care should be taken with "ENTER" and "LEAVE" when the stack segment
size is 32 bits and the code segment size is 16 bits.  It is easy to
destroy the upper half of ESP.  Use equivalent simpler (and faster)
instructions instead.
-------------------------------------------------------------------------------
"LIDT"/"LGDT" instructions, page 486: 26-195

Under V86 mode an #GP(0) is always generated.
-------------------------------------------------------------------------------
"LGS" instruction, page 486: 26-196

In case `LGS:' the "DS" should read "GS".
-------------------------------------------------------------------------------
"LLDT" instruction, page 486: 26-199

Under V86 mode an #GP(0) is always generated.
-------------------------------------------------------------------------------
"LOOP" instructions, page 486: 26-206

The specification fails to state that when the operand size is 16 bits (and
the condition is satisfied) then the upper half of EIP is cleared.

For this reason the CX versions should not be used in a 32-bit segment
unless you are certain that the destination will never exceed 64K. It
is beneficial anyway to use a decrement and standard jump.
-------------------------------------------------------------------------------
"MOV" instructions, page 486: 26-211

The specification fails to mention that moves into CS are not allowed.
-------------------------------------------------------------------------------
"POPAD" instruction, page 486: 26-234

The "POPAD" instruction is slightly buggy on most 386s.  To avoid problems
always have a non-memory referencing instruction (e.g., "NOP") immediately
following.
-------------------------------------------------------------------------------
"PUSH" instruction, page 486: 26-237

The specification fails to mention that an 8-bit operand is sign-extended
before being pushed.
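
In other words the value that lands on the stack is computed roughly as
below.  The function name is made up and only the value is modelled,
not the stack itself.

```python
def push_imm8_value(imm8: int, operand_size: int) -> int:
    """Return the operand-size value pushed for an 8-bit immediate."""
    if operand_size not in (16, 32):
        raise ValueError("operand size must be 16 or 32")
    imm8 &= 0xFF
    signed = imm8 - 0x100 if imm8 & 0x80 else imm8   # sign-extend bit 7
    return signed & ((1 << operand_size) - 1)
```

So an immediate of 0x80 is pushed as 0xFF80 with a 16-bit operand size
and as 0xFFFFFF80 with a 32-bit one, while 0x7F stays 0x7F.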
-------------------------------------------------------------------------------
"RCL"/"RCR"/"ROL"/"ROR" instructions, page 486: 26-242

As stated elsewhere in the documentation these instructions do absolutely
nothing when the effective shift count is zero.  In particular, and in spite
of what is written here, no flags are affected.  If you are not sure that
CL is non-zero (mod 32) then you should not assume that "R[CO][LR] ...,CL"
sets any flags.

Note that there are two different versions of the instructions shifting
by one bit.  Assemblers usually select the shortest (implied operand) which
is important to know because the two different forms set different flags.
-------------------------------------------------------------------------------
"REP"/"REPE"/"REPNE" instructions, page 486: 26-245

Note that ECX is used exactly when ESI is.  Note also that "REP LODSB"
does make sense with certain memory mapped hardware like VGA cards.

The description for "CMPS" and "SCAS" saying that "JCXZ" can be used
for testing why the looping stopped is *wrong*.  Use the Z flag.  When
the condition ends up false (meaning (E)CX did not cause the exiting)
note that (E)SI/(E)DI have been stepped.
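
A toy model of "REPE CMPSB" shows why the count cannot tell you the
reason for stopping.  Assumptions: the direction flag is clear, both
operands are plain byte strings, and all names are mine.

```python
def repe_cmpsb(src: bytes, dst: bytes, si: int, di: int, cx: int, zf: bool = True):
    """Model "REPE CMPSB".  Returns (SI, DI, CX, ZF) after the loop."""
    while cx != 0:
        zf = src[si] == dst[di]   # the compare sets Z
        si += 1                   # pointers are stepped even on a mismatch
        di += 1
        cx -= 1
        if not zf:
            break
    return si, di, cx, zf

# The strings differ only in their last byte.
last_differs = repe_cmpsb(b"abc", b"abd", 0, 0, 3)
all_equal = repe_cmpsb(b"abc", b"abc", 0, 0, 3)
```

Both calls end with CX equal to zero, so "JCXZ" cannot tell them apart;
only Z can.  When the mismatch comes earlier, as in
repe_cmpsb(b"axc", b"ayc", 0, 0, 3), SI and DI come back as 2, one past
the bytes that differed.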
-------------------------------------------------------------------------------
"SAL"/"SAR"/"SHL"/"SHR" instructions, page 486: 26-253

As stated elsewhere in the documentation these instructions do absolutely
nothing when the effective shift count is zero.  In particular, and in spite
of what is written here, no flags are affected.  If you are not sure that
CL is non-zero (mod 32) then you should not assume that "S[AH][LR] ...,CL"
sets any flags.

Note that there are two different versions of the instructions shifting
by one bit.  Assemblers usually select the shortest (implied operand) which
is important to know because the two different forms set different flags.
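
The zero-count case for a 32-bit "SHL ...,CL" can be sketched like
this.  Only the carry flag is modelled and the names are made up.

```python
def shl32(value: int, cl: int, carry_in: int) -> tuple:
    """Model "SHL r32,CL".  Returns (result, carry flag)."""
    value &= 0xFFFFFFFF
    count = cl & 0x1F                  # the count is taken mod 32
    if count == 0:
        return value, carry_in         # nothing happens, flags untouched
    carry_out = (value >> (32 - count)) & 1   # last bit shifted out
    return (value << count) & 0xFFFFFFFF, carry_out
```

Note that shl32(0x80000000, 32, 1) returns (0x80000000, 1): a count of
32 is an effective count of zero, so neither the operand nor the carry
changes.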
-------------------------------------------------------------------------------
"SHLD" instruction, page 486: 26-264

Contrary to the stated information this instruction seems to work also
in the case where the shift amount is exactly the same as the operand
size.
-------------------------------------------------------------------------------
"SHRD" instruction, page 486: 26-264

Contrary to the stated information this instruction seems to work also
in the case where the shift amount is exactly the same as the operand
size.
-------------------------------------------------------------------------------
"WAIT" instruction, page 486: 26-281

(Cpus with on-chip Fpu only.)  If the instruction immediately following
this instruction is an Fpu-instruction then no exception is raised.
-------------------------------------------------------------------------------
"XADD" instruction, page 486: 26-283

Note that some versions of Borland's Turbo Assembler get this instruction
wrong.
-------------------------------------------------------------------------------
"XCHG" instruction, page 486: 26-285

The timings are incorrect for "XCHG AX,AX" and "XCHG EAX,EAX" which
execute in one cycle.
-------------------------------------------------------------------------------
Standard OPCODE map, page 486: A-4

There should be no instruction at 0x82.
-------------------------------------------------------------------------------
0F OPCODE map, page 486: A-6

The instructions at 0x20 to 0x24 and 0x26 should have their operands reversed,
i.e., the first should read "MOV Rd,Cd".
-------------------------------------------------------------------------------
Extension OPCODE map, page 486: A-8

The instruction at (7,111) should read "INVLPG EW".
-------------------------------------------------------------------------------
Index.

B bit ("big bit", stack), see D bit.