x86 cpus' Guide

4-operand Fused Multiply-Add instructions (FMA4)

The FMA instruction set is an extension to the 128-bit and 256-bit Streaming SIMD Extensions (SSE) instructions in the x86 microprocessor instruction set. It performs fused multiply–add (FMA) operations. There are two variants:

- FMA4 is supported in AMD processors starting with the Bulldozer architecture. FMA4 was implemented in hardware before FMA3.
- FMA3 is supported in AMD processors starting with the Piledriver architecture and in Intel processors starting with Haswell (2013) and Broadwell (2014).

FMA3 and FMA4 instructions have almost identical functionality, but are not compatible. Both contain fused multiply–add instructions for floating-point scalar and SIMD operations, but FMA3 instructions have three operands, while FMA4 instructions have four.

The FMA operation has the form d = round(a · b + c), where the round function rounds the result so that it fits in the destination register when the exact value has too many significant bits. Because the product a · b is not rounded before the addition, the whole operation rounds only once.

The four-operand form (FMA4) allows a, b, c and d to be four different registers, while the three-operand form (FMA3) requires that d be the same register as a, b or c. The three-operand form makes the code shorter and the hardware implementation slightly simpler, while the four-operand form provides more programming flexibility.
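The single rounding in d = round(a · b + c) and the difference between the two operand forms can be illustrated with a small software sketch. It does not execute any FMA3 or FMA4 instruction: `fma_emulated`, `fma4`, `fma3_accumulate` and the register names `r0` to `r3` are hypothetical helpers introduced here for illustration. The sketch assumes finite double-precision inputs and ignores special cases such as signed zeros, infinities and NaNs.

```python
from fractions import Fraction


def fma_emulated(a, b, c):
    """Emulate d = round(a*b + c) with one rounding (finite inputs only)."""
    # Fractions hold a*b + c exactly; float() then rounds a single time.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that the exact product 1 - 2**-54 does not fit in a double.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = a * b + c             # product rounded to 1.0 first, then added
fused = fma_emulated(a, b, c)   # rounded once, keeps the -2**-54 term

print("unfused:", unfused)
print("fused:  ", fused)


# Toy register file (a dict) to contrast the two operand forms.
def fma4(regs, d, a, b, c):
    """Four-operand form: the destination d may differ from all sources."""
    regs[d] = fma_emulated(regs[a], regs[b], regs[c])


def fma3_accumulate(regs, d, a, b):
    """Three-operand form (one possible case): d doubles as the addend c."""
    regs[d] = fma_emulated(regs[a], regs[b], regs[d])


regs = {"r0": 0.0, "r1": 2.0, "r2": 3.0, "r3": 4.0}

fma4(regs, "r0", "r1", "r2", "r3")       # r0 = r1*r2 + r3, sources untouched
addend_after_fma4 = regs["r3"]

fma3_accumulate(regs, "r3", "r1", "r2")  # r3 = r1*r2 + r3, old r3 overwritten
print(regs)
```

In the three-operand case the old value of the addend is lost, so code that still needs it must copy it to another register first. That extra copy is the kind of flexibility cost the paragraph above refers to.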


Used for : AMD APU, AMD Athlon, AMD FX, AMD Opteron, AMD Sempron