Rudometov.COM               


The beginning of the 64-bit era

Evgenie Rudometov.
authors@rudometov.com

Rapid progress in semiconductor and computer technology has driven the fast evolution of processor architecture. As a result, it has become possible to extend the functionality of 32-bit processors, which have gained special modes for running 64-bit applications.

The present stage of civilisation's development rests on efficient information processing carried out by computer systems. At the centre of these systems are high-performance processors, which have passed through several stages of evolution in a short period. Each of these historical stages is characterised by a number of features: not only the frequency and architectural parameters of the electronic components, but also the nature of the tasks being solved, whose complexity and variety keep growing, as does the volume of information processed.

Devices built on new architectural principles are needed to work effectively with large volumes of data. Indeed, the ever wider spread of computer technology generates large quantities of new data that must be converted and consumed in a timely way.

To process huge streams of information effectively, a considerable share of which consists of multimedia files, the converging computer and communications industry has to rework its hardware and applications substantially. The potential for growth in functionality and performance will therefore come not only from raising the clock frequency, which sets the operating speed of semiconductor components, but also from introducing promising architectural innovations.

The future of computing equipment, however, is largely defined by the work now under way in laboratories and in production. As before, the main focus of scientists and engineers is the processor, whose most important parameter is performance, which depends on process technology, microarchitecture and clock frequency.

Processor performance

Processor performance is defined by the following relations:

Processor performance = Instructions / Time,

Instructions / Second = (Instructions / Clock cycle) × (Clock cycles / Second),

The parameter Instructions / Clock cycle is the number of instructions executed per clock cycle, IPC (Instructions per Cycle), and Clock cycles / Second is the clock frequency at which the processor core runs.

IPC is a function of the processor architecture and of the process technology used, which can be written as IPC = f(architecture, process technology). The clock frequency F, which sets the operating speed of the processor core, is in turn a function of both the process technology and the design of the circuits used in the processor, that is, F = f(process technology, circuits).
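
The relation above can be illustrated with a short Python sketch. The function name and the IPC and frequency figures are hypothetical, chosen only to show that a lower-clocked core with a higher IPC can execute more instructions per second than a higher-clocked one.

```python
def instructions_per_second(ipc: float, clock_hz: float) -> float:
    """Instructions/second = (instructions/cycle) x (cycles/second)."""
    return ipc * clock_hz


# Hypothetical figures, for illustration only.
wide_core = instructions_per_second(ipc=2.0, clock_hz=2.0e9)    # 2 GHz, IPC 2
narrow_core = instructions_per_second(ipc=1.0, clock_hz=3.0e9)  # 3 GHz, IPC 1

print(wide_core > narrow_core)  # True
```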

From 32 to 64

So, based on the relations above, processor performance is defined by two parameters: the number of instructions executed per clock cycle, IPC, and the clock frequency, F.

If questions of process technology and the implementation of the core's internal circuits are set aside, what remains is the dependence of processor performance on its architectural solutions. These solutions, which define the IPC parameter, include many features of the architecture, in particular the word width of the data units processed.

Obviously, the more bits are allotted to instructions and data, the fewer instructions need to be executed to solve a given task. Indeed, for calculations demanding high precision, a processor with a wide word needs fewer machine words to hold operands and instructions than a narrow-word device does.
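
As a rough illustration of the word-count argument, the following Python sketch counts how many machine words are needed to hold one operand. The function name is hypothetical, and the model ignores instruction encoding and carry handling; it only shows the ceiling division involved.

```python
def words_needed(operand_bits: int, word_bits: int) -> int:
    """Number of machine words needed to hold an operand of the given width."""
    if operand_bits <= 0 or word_bits <= 0:
        raise ValueError("bit widths must be positive")
    return -(-operand_bits // word_bits)  # ceiling division


# One 64-bit operand on processors of increasing word width.
for word_bits in (4, 8, 16, 32, 64):
    print(f"{word_bits:2d}-bit word: {words_needed(64, word_bits)} word(s)")
```

A 64-bit operand takes two words on a 32-bit processor and one on a 64-bit processor, so arithmetic on it needs correspondingly fewer instructions.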

This could be observed in the transition from 4-bit processors to 8-bit ones, then to 16-bit and finally to 32-bit models. It should be added that 32-bit processors are the dominant ones today.

Despite the prevalence of these high-performance models, it should be noted that more complex solutions oriented to data of double that length have been on the market for a number of years. Until recently, however, 64-bit processors were intended only for the server market. Because of the high cost of the process technology used to produce them, and their extremely complex internal structure, 64-bit server processors are rather expensive. Their price level is considerably higher than that of desktop models.

Besides, existing models designed for 64-bit data require either software created specially for the specific architecture or recompilation of existing applications. Appropriate hardware is also needed. All this creates certain difficulties in introducing promising wide-word processor architectures.

However, thanks to the rapid improvement of semiconductor technologies and the introduction of numerous architectural solutions, not only is the cost of previously developed models falling, but it is also becoming possible to realise ever more complex designs. Among these are plans to develop and bring to the desktop market high-performance processors with a word width greater than the traditional 32 bits. This is becoming more and more relevant because further frequency growth is, for objective reasons, increasingly difficult owing to rising leakage current and the corresponding increase in heat generation.

AMD x86-64

AMD was the first company to exploit modern semiconductor technologies to release mass-market processors with 64-bit support for the desktop sector.

Building on the architecture of the 32-bit AMD Athlon processors, the company's developers created processors that support 64-bit calculations. This architecture was named AMD64. The architecture of the AMD Athlon XP processors was chosen as the prototype for AMD's new design.

The innovations of the new architecture supporting 64-bit calculations were first tried in server solutions. The corresponding processors were named AMD Opteron and were well received by the computer market. Moreover, a number of well-known manufacturers announced systems based on these processors.

However, models with the AMD64 architecture gained their greatest popularity in the desktop sector. These models were named AMD Athlon 64. The server AMD Opteron served as the basis for their development.

The main difference between the new desktop processors and the earlier AMD Athlon XP is support for 64-bit as well as 32-bit code, while retaining full compatibility with existing applications. This allows a smooth transition from 32-bit to 64-bit software and provides compatibility with the next generation of Microsoft Windows XP for 64-bit platforms.

In assessing the practical usefulness of the new architecture, it should be noted that hardware implementation of 64-bit instructions makes it possible to raise performance when processing data of that length. In addition, the 64-bit modes extend the address space.

However, exploiting the potential of the extended architecture requires software that supports the 64-bit modes. Otherwise the hardware extensions cannot be used.
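
To give a sense of scale for the address space extension, the sketch below compares the number of bytes reachable with 32-bit and 64-bit addresses. The function name is hypothetical, and the 64-bit figure is the architectural upper bound; an actual processor may implement fewer address bits.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(address_bits: int) -> int:
    """Upper bound on bytes reachable with an address of the given width."""
    return 2 ** address_bits


print(addressable_bytes(32) // GIB)                    # 4
print(addressable_bytes(64) // addressable_bytes(32))  # 4294967296
```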

To implement the capabilities of an architecture oriented to 64-bit calculations, AMD's developers doubled the number of general-purpose registers and increased their width to 64 bits.

In addition, the basic refinements of the AMD Athlon 64 architecture include:

        an integrated memory controller, previously found only in the North Bridge chips of system chipsets;

        a HyperTransport bus for the link to the chipset, which raises bandwidth and reduces data-transfer delays;

        an integer pipeline lengthened to 12 stages and a floating-point pipeline lengthened to 17 stages, which makes higher clock frequencies possible;

        an increased second-level cache size;

        support for the SSE2 instruction set;

        support for Cool'n'Quiet technology, which saves energy and reduces heat output;

        enhanced virus protection (blocking of buffer overflows);

        efficient execution of 32-bit applications.

Among the design features, note the appearance of a lid that protects the die of the processor, which also includes an improved thermal protection circuit.

Other parameters of the AMD Athlon 64 processors include more than 100 million transistors in the core, the Socket 754 package, a 0.13-micron process, and so on. DDR PC3200/2700/2100/1600 memory modules (transfer rates of 3.2/2.7/2.1/1.6 GB/s), up to four registered DIMMs or three unbuffered (ordinary) DIMMs, are connected over a 64-bit bus (+8 ECC bits) with a bandwidth of up to 3.2 GB/s. The processor is connected to the chipset by a HyperTransport bus (up to 1600 MHz, full duplex) with a bandwidth of 6.4 GB/s. The L1 cache is 128 KB, and the size of the second-level (L2) cache depends on the model: the model rated 3400+ (core clock 2.2 GHz) has 1024 KB of L2 cache, the 3200+ (2.0 GHz) has 1024 KB, and the 3000+ (2.0 GHz) has 512 KB.

The flagship processor model, named AMD Athlon 64 FX, is built on the AMD64 architecture. It differs from the AMD Athlon 64 above all in its dual-channel memory subsystem. As a result, registered DDR PC3200/2700/2100/1600 memory modules are connected over a 128-bit bus (+16 ECC bits). This gives the memory subsystem a bandwidth of up to 6.4 GB/s.
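
The bandwidth figures quoted above follow from the bus width and the transfer rate, as this small sketch shows. The function name is hypothetical; the rate of 400 million transfers per second is the nominal figure for PC3200 (DDR400) modules, and the result is a theoretical peak, not a measured throughput.

```python
def peak_bandwidth_bytes_per_s(bus_bits: int, transfers_per_s: int) -> int:
    """Peak bandwidth = bus width in bytes x transfers per second."""
    return (bus_bits // 8) * transfers_per_s


DDR400_TRANSFERS = 400_000_000  # PC3200 (DDR400): 400 million transfers/s

single_channel = peak_bandwidth_bytes_per_s(64, DDR400_TRANSFERS)   # Athlon 64
dual_channel = peak_bandwidth_bytes_per_s(128, DDR400_TRANSFERS)    # Athlon 64 FX

print(single_channel / 1e9, dual_channel / 1e9)  # 3.2 6.4
```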

Socket 940, the same socket as the server solutions, was originally chosen for this processor line, but it is now being replaced by Socket 939. The dual-channel memory subsystem is retained, but the processors now use ordinary standard DDR400 modules, which eases their integration into desktop systems.

 

Intel EM64T

Intel has also backed the trend toward developing and releasing processors of hybrid architecture. This was first announced in the spring of this year at the IDF forum in San Francisco by Craig Barrett, Chief Executive Officer of Intel. The topic was also continued at other sessions of the forum.

The announcement at IDF Spring 2004

Among the corporation's many advanced developments, Intel's CEO highlighted planned support for 64-bit instructions in 32-bit processors. This architecture, originally called IA32e, was later named EM64T (Extended Memory 64 Technology).

Photo 1. Presentation by Craig Barrett, Chief Executive Officer of Intel

Speaking of this feature, new to Intel-architecture processors, Craig Barrett stressed that at the initial stage it concerns only the market for servers and powerful workstations. The technology for supporting 64-bit calculations, which extends the IA32 architecture, will be implemented for that sector as early as 2004. The next generations of Intel Xeon server processors will be the first to use it. Later, as the market for software oriented to 64-bit calculations grows, in the form of suitable versions of operating systems and applications, this support will also appear in desktop processors, where 32-bit processors dominate (Table 1).

Table 1. Intel's line of high-performance processors

Processor | Purpose | Parameters (130 nm) | Parameters (90 nm)

Intel Itanium 2 | Servers, back-end | 64 bits; 1.3/1.4/1.5 GHz (MP); L3 cache 3/4/6 MB (MP); 1.4/1.6 GHz (DP); L3 cache 1.5/3 MB (DP); L2 cache 256 KB; L1 cache 32 KB; 64-bit addressing; 128 INT + 128 FP registers; FSB 400 MHz, 128 bits; up to 512 processors | -

Intel Xeon | Servers, mid/back-end | 32 bits; up to 4 MB L3 cache; up to 3.4 GHz; up to 4 processors | -

Intel Pentium 4 Extreme Edition | Desktop PCs for office and home | 32 bits; 2 MB L3 cache; 512 KB L2 cache; 3.2-3.4 GHz; 800 MHz FSB; Hyper-Threading technology; NetBurst microarchitecture | -

Intel Pentium 4 | Desktop PCs for office and home | 32 bits; 512 KB L2 cache; 2.4-3.4 GHz; 533/800 MHz FSB; Hyper-Threading technology; NetBurst microarchitecture; Northwood core | 32 bits; 1 MB L2 cache; 2.8-3.4+ GHz; 533/800 MHz FSB; Hyper-Threading technology; NetBurst microarchitecture; Prescott core

Intel Pentium M | Mobile PCs | 32 bits; ? MB L2 cache; 1.3-1.7 GHz; 400 MHz FSB; Banias core | 32 bits; 2 MB L2 cache; 1.7-2.0 GHz; 400 MHz FSB; Dothan core

It should be noted that Craig Barrett, the subsequent speakers and the experts taking part in numerous open round-table discussions all stressed repeatedly that the IA32e architecture, which implements 64-bit instructions, is not a copy of existing AMD architectures. Moreover, it was repeatedly emphasised that the instruction set named x86-64 is not the property of the competing corporation. In addition, the processor architectures offered by the two largest manufacturers of this class of semiconductor device are different and have their own specific implementation features. These features will be reflected in both the 32-bit and the 64-bit instruction sets, roughly as was done with MMX, SSE, SSE2, SSE3 and so on. Nevertheless, many instructions will be compatible.

There will undoubtedly be differences as well. Differences between Intel and AMD products will stem mainly from different architectures, different approaches to processor design and different semiconductor technologies used in production. Examples of unique features include Hyper-Threading (HT) technology and the SSE3 instruction set, both implemented in Intel products and so far absent from the competitor's, even as clones. In the near future other innovations will be implemented too, such as the LaGrande, Vanderpool and Foxton technologies. At the same time, AMD processors use a number of the company's own unique developments, for example 3DNow!, which have no clones in Intel products.

Nevertheless, despite possible differences between implementations of 64-bit calculation on 32-bit processors, Steve Ballmer, speaking on behalf of Microsoft, said that a beta version of the corresponding operating system supporting the 64-bit instruction extensions is ready. He noted that, according to Microsoft's experts, the new instructions will make it possible to raise the performance and accuracy of some calculations.

This topic was continued in a presentation at the same IDF forum by Michael Fister, Senior Vice President and General Manager of Intel's Enterprise Platforms Group.

Photo 2. Presentation by Michael Fister, Senior Vice President and General Manager of Intel's Enterprise Platforms Group

He told the IDF audience that the 32-bit Intel Xeon processor built on the Nocona core using a 90-nanometer process will be the first processor to support 64-bit instructions. The core of the entry-level model of this processor, with 1 MB of cache and designed for an 800 MHz bus, will run at 3.6 GHz. Later, as the semiconductor process is refined, more powerful variants of this processor, based on the Prescott core and built on the 90-nanometer process, will be released.

Along this path there will be many manufacturing difficulties linked to the lithography scale used. Developers now work with features measured in nanometers. Unfortunately, developing and producing processors on the newest process technologies not only shrinks the lithography scale and increases the number of transistors on a die, but is also accompanied by a number of negative effects. These include a considerable rise in leakage current and a corresponding increase in heat generation. Intel's management, however, is convinced that all the problems will be solved successfully. As evidence, long before the release of the corresponding processors, a document describing the 64-bit extension of the 32-bit processor architecture, Extended Memory 64 Technology, was published on Intel's site.

 

Features of the EM64T architecture

The 64-bit extension technology is an extension of the 32-bit architecture.

As a result, the updated IA-32 architecture supports 64-bit addressing. The extension includes new operating modes and new extended instructions that increase the processors' functionality (Fig. 1).

As a result of these innovations, 32-bit processors supporting the 64-bit extension technology remain compatible with existing software. They are designed to support both 32-bit addressing and 64-bit direct addressing of large amounts of main memory.

Fig. 1. Main innovations of EM64T

A processor implementing the 64-bit extension technology fully supports all existing IA-32 features. In addition, a new operating mode named IA-32e is introduced. This mode includes two sub-modes.

The first sub-mode, compatibility mode, is available under a 64-bit operating system and is intended to support the existing legacy of unmodified 32-bit software.

The second sub-mode, called 64-bit mode, is available under a 64-bit operating system and runs applications written specifically for 64-bit addressing of the memory space.

In 64-bit mode, applications can use the following capabilities provided by the 64-bit extension technology:

        64-bit linear addressing,

        8 new general-purpose registers (GPRs),

        8 new 128-bit registers for streaming SIMD extension instructions (SSE, SSE2 and SSE3),

        64-bit GPRs and instruction pointer.

The 64-bit extensions also add uniform byte-register addressing, a fast interrupt-prioritisation mechanism and a new relative addressing mode.

So a processor implementing the 64-bit extension technology can operate either in IA-32 mode or in IA-32e mode.

The traditional IA-32 mode lets the processor run in protected mode, real-address mode or virtual-8086 mode.

IA-32e mode is a processor mode used only under a 64-bit operating system; it gives access to the resources and advantages of the 64-bit extension technology.

Table 2 describes the basic characteristics of the sub-modes supported in IA-32e mode and the differences between them.

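Table 2. IA-32e modes

Mode | Sub-mode | OS | Recompilation | Address size (default), bits | Operand size (default), bits | Register extensions | GPR width, bits

IA-32e | 64-bit | 64-bit OS | Yes | 64 | 32 | Yes | 64

IA-32e | Compatibility | 64-bit OS | No | 32 | 32 | No | 32

IA-32e | Compatibility | 64-bit OS | No | 16 | 16 | No | 16, 8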

IA-32e mode

IA-32e mode contains two sub-modes: 64-bit mode and compatibility mode. IA-32e mode can be entered only by loading a 64-bit operating system.

64-bit mode

64-bit mode is used by 64-bit applications running under a 64-bit operating system.

To implement 64-bit mode, the following changes were made to the architecture:

        support for 64-bit linear addressing was introduced,

        register extensions are accessible through a new instruction prefix (REX),

        the existing GPRs are widened to 64 bits (RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP),

        eight new GPRs (R8-R15),

        eight new 128-bit registers for the SIMD extensions (XMM8-XMM15),

        a 64-bit instruction pointer (RIP),

        a new relative addressing mode (RIP-relative data addressing),

        a flat address space,

        extended and new instructions,

        physical addressing of more than 64 GB (implementation-dependent),

        a new interrupt-priority control mechanism.
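
The REX prefix mentioned in the list can be sketched as follows. The bit layout (a fixed high nibble 0100 followed by the W, R, X and B bits) comes from the published x86-64 instruction format rather than from this article, and the helper name is hypothetical.

```python
def rex_prefix(w: bool = False, r: bool = False,
               x: bool = False, b: bool = False) -> int:
    """Build a REX prefix byte with the layout 0100WRXB.

    w selects a 64-bit operand size; r, x and b each supply the fourth
    bit of a register number, which is how R8-R15 and XMM8-XMM15 are reached.
    """
    return 0x40 | (int(w) << 3) | (int(r) << 2) | (int(x) << 1) | int(b)


print(hex(rex_prefix(w=True)))          # 0x48
print(hex(rex_prefix(w=True, b=True)))  # 0x49
```

Because the prefix occupies a single byte placed before the opcode, existing 32-bit encodings stay unchanged, and only instructions that need 64-bit operands or the new registers carry it.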

Compatibility mode

Compatibility mode allows 16-bit and 32-bit applications to run under a 64-bit OS without recompilation. However, applications that run in virtual-8086 mode will not work. Like 64-bit mode, compatibility mode must be supported by the operating system. In particular, this means that 64-bit applications can run simultaneously with non-recompiled 32-bit applications started in compatibility mode.
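
The choice between the two sub-modes can be summarised in a small lookup. This is only a sketch of the rules stated in the text and in Table 2; the class and function names are hypothetical, and it is not a description of how an operating system actually switches modes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubMode:
    name: str
    recompilation_needed: bool
    default_address_bits: int
    default_operand_bits: int
    register_extensions: bool


# Values taken from Table 2 (32-bit row for compatibility mode).
SIXTY_FOUR_BIT = SubMode("64-bit", True, 64, 32, True)
COMPATIBILITY = SubMode("compatibility", False, 32, 32, False)


def sub_mode_for(app_bits: int, virtual_8086: bool = False) -> SubMode:
    """Pick the IA-32e sub-mode for an application under a 64-bit OS."""
    if virtual_8086:
        raise ValueError("virtual-8086 applications do not run in IA-32e mode")
    if app_bits == 64:
        return SIXTY_FOUR_BIT
    if app_bits in (16, 32):
        return COMPATIBILITY
    raise ValueError(f"unsupported application width: {app_bits}")


print(sub_mode_for(32).name)  # compatibility
print(sub_mode_for(64).name)  # 64-bit
```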

Table 3. Registers in processors with the 64-bit extension technology

Program-accessible registers | 64-bit mode: name | 64-bit mode: quantity | 64-bit mode: size, bits | Legacy and compatibility modes: name | Legacy and compatibility modes: quantity | Legacy and compatibility modes: size, bits

General-purpose registers | RAX, RBX, RCX, RDX, RBP, RSI, RDI, RSP, R8-15 | 16 | 64 | EAX, EBX, ECX, EDX, EBP, ESI, EDI, ESP | 8 | 32

Instruction pointer | RIP | 1 | 64 | EIP | 1 | 32

Flags | EFLAGS | 1 | 32 | EFLAGS | 1 | 32

FP registers | ST0-7 | 8 | 80 | ST0-7 | 8 | 80

Multimedia registers | MM0-7 | 8 | 64 | MM0-7 | 8 | 64

Streaming SIMD registers | XMM0-15 | 16 | 128 | XMM0-7 | 8 | 128

Stack size | - | - | 64 | - | - | 16 or 32

 

Development of the EM64T ecosystem

Intel is working with key market players to get the 64-bit extension technology supported in their solutions.

The technology is supported by the Microsoft Windows Server 2003 and Windows XP Pro operating systems. A beta version is already available from Microsoft (under NDA), and Microsoft Server 2003 SP1 RTM is expected in the third quarter of 2004.

In addition, the first announcements from hardware manufacturers about support for the 64-bit extension technology have appeared. The platforms developed and the corresponding drivers are being tested.

As Craig Barrett repeatedly stressed, Intel invests considerable resources in developing technologies oriented not only to current needs but also to the future.

 

Performance

Programs on the AMD Athlon 64 run more slowly in 64-bit mode than in 32-bit mode. It should be recalled here that there are three operating modes: 32, 32/64 and 64 bits.

In the 32/64 and 64 modes, AMD Athlon 64 processors show no significant gains either in games or in existing tasks. Moreover, in 64-bit mode a considerable drop is usually observed. The gain in the traditional mode is explained by the maturity of the Athlon processor architecture. One can only applaud AMD's engineers, who, thanks to the NexGen team, have learned to build efficient execution units.

But this does not mean that the Athlon 64 architecture is bad. Not at all. AMD's engineers managed to create efficient 32-bit units, as numerous tests of systems based on AMD Athlon 64 processors confirm. But what do 64-bit instructions have to do with it...

The modest results shown by AMD processors in 64-bit calculations have various explanations. One of them is probably that it is fundamentally difficult to make a 64-bit ALU so efficient that it is faster than the traditional 32-bit one, at least for now.

Incidentally, for the same reason Intel Itanium 2 is not much faster than Intel Xeon (Fig. 2). It should be recalled that the advantages of the Intel Itanium architecture lie not only in data-processing speed but also in stability and reliability. In addition, the Intel Itanium 2 architecture allows large numbers of these processors to be combined without resorting to complex cluster solutions.

Fig. 2. Comparison of the performance of traditional multiprocessor server platforms

Testing of the corresponding Intel processors will be carried out once those models appear on the market. It will then be possible to compare the results of the two manufacturers' implementations.

Returning to the question of the demand for 64-bit instructions in 32-bit processor architecture, it should be noted that, in the opinion of most experts, these instructions are needed today primarily to support direct addressing of main memory larger than the maximum for traditional processors. This means that potential users will notice a performance gain in the new modes mainly when systems with very large amounts of memory are required, for example double that size. In such cases a considerable share of programs, at least a third, will be located at addresses that cannot be reached by direct addressing in the traditional mode of 32-bit processors.
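
The share of memory that falls outside direct 32-bit addressing can be estimated with a short sketch. It is a simplified model under stated assumptions: the function name is hypothetical, the memory sizes are illustrative, and reserved address ranges and paging extensions are ignored.

```python
from fractions import Fraction

GIB = 2 ** 30
DIRECT_LIMIT_32 = 2 ** 32  # bytes reachable with a 32-bit address


def share_out_of_reach(installed_bytes: int,
                       limit: int = DIRECT_LIMIT_32) -> Fraction:
    """Fraction of installed memory lying above the direct-addressing limit."""
    if installed_bytes <= 0:
        raise ValueError("installed memory must be positive")
    beyond = max(0, installed_bytes - limit)
    return Fraction(beyond, installed_bytes)


for gib in (4, 6, 8):
    print(f"{gib} GiB installed: {share_out_of_reach(gib * GIB)} out of reach")
```

Under this model 6 GiB of installed memory leaves one third beyond the limit, and doubling the 4 GiB maximum to 8 GiB leaves one half.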

The usefulness of 64-bit addressing for working with large amounts of main memory is not in doubt. In time, specially developed applications able to use the new features of 32-bit processors will appear. But that is in the future. Today's promotional claims are for the most part naked marketing, aimed at those who do not understand processor architecture well enough and do not use vendor documentation and the reasoned opinion of experts as a source of objective information. Fundamental knowledge of semiconductor technologies and processor architecture can help in establishing the truth and forecasting the prospects for development, especially when it rests on objective information. In that case it helps shield potential users from the intensive advertising campaigns generated by marketing departments and from numerous rumours.


The article was published in Byte magazine (http://www.bytemag.ru).

© 2009 Rudometov.COM