Monday, August 2, 2010

I decided to update the CVector library before writing the tutorial. While coding, I realized that Intel had added a new instruction, DPPS (Dot Product of Packed Single-Precision Floating-Point Values), which is part of SSE4.1.
I was fascinated by this new instruction. Previously my dotProduct function took 15 to 20 instructions; now it would take 5. That means less code, fewer memory accesses, and a function that is easier to understand.

But my happiness ended when I got this error at run time: 0xC000001D: Illegal Instruction.
This was because my Turion64X2 does not support that instruction.
After a while I thought of a solution: check beforehand which instructions my processor supports and, depending on the result, run the right version of the function so that there are no exceptions.

This tutorial focuses on that.
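
To make the idea concrete before getting into CPUID, here is a minimal sketch of choosing an implementation at run time through a function pointer. All the names (dot4_fn, dot4_scalar, dot4_dpps_placeholder, select_dot4) are made up for this illustration and are not part of CVector. The "DPPS" version is only a placeholder that calls the scalar code, so the sketch stays portable.

```c
#include <stddef.h>

/* Signature shared by every dot-product implementation (hypothetical name). */
typedef float (*dot4_fn)(const float a[4], const float b[4]);

/* Portable fallback: runs on any CPU. */
static float dot4_scalar(const float a[4], const float b[4])
{
    float sum = 0.0f;
    for (size_t i = 0; i < 4; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Placeholder for a DPPS-based version. A real build would use the
 * instruction here; this stand-in just forwards to the fallback. */
static float dot4_dpps_placeholder(const float a[4], const float b[4])
{
    return dot4_scalar(a, b);
}

/* Pick an implementation once, from a feature flag obtained elsewhere
 * (for example from the CPUID checks later in this tutorial). */
static dot4_fn select_dot4(int has_sse41)
{
    return has_sse41 ? dot4_dpps_placeholder : dot4_scalar;
}
```

The caller would store the result of select_dot4 once at startup and call through that pointer afterwards, so the unsupported instruction is never reached on an older CPU.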

CPUID Instruction



Intel documentation states that:

CPUID returns processor identification and feature information in the EAX, EBX, ECX, and EDX registers. The instruction’s output is dependent on the contents of the EAX register upon execution (in some cases, ECX as well).


If you wish to know more, you can check Intel documents 253666 and 253665, or go directly to the CPUID document: http://www.intel.com/Assets/PDF/appnote/241618.pdf

So this means that if we put 0x00 in EAX and then call CPUID:
__asm
{
mov EAX, 00h
cpuid
}

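we will see that EBX, EDX and ECX have changed and now hold the vendor ID, four characters in each register. Viewed as 32-bit values, most significant byte first, they read:
EBX = uneG
EDX = Ieni
ECX = letn

Reading each register from its least significant byte instead gives "Genu", "ineI" and "ntel", that is, GenuineIntel.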
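
As a small sketch of how those three registers turn into a readable string, the function below rebuilds the vendor ID from register values passed in as plain numbers. The name vendor_from_regs is hypothetical, and the constants in the usage note are simply the ASCII codes of "GenuineIntel" and "AuthenticAMD" packed four characters per register, lowest byte first.

```c
#include <stdint.h>

/* Hypothetical helper: writes the 12-character vendor ID plus a
 * terminating '\0' into out, taking the registers in the order
 * EBX, EDX, ECX and each register from its lowest byte upward. */
static void vendor_from_regs(uint32_t ebx, uint32_t edx, uint32_t ecx,
                             char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };

    for (int r = 0; r < 3; ++r) {
        for (int i = 0; i < 4; ++i) {
            /* shift the wanted byte down to bits 0..7, then mask it */
            out[r * 4 + i] = (char)((regs[r] >> (8 * i)) & 0xFF);
        }
    }
    out[12] = '\0';
}
```

Calling it with 0x756e6547, 0x49656e69, 0x6c65746e fills the buffer with GenuineIntel; with 0x68747541, 0x69746e65, 0x444d4163 it gives AuthenticAMD.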

The next code gets the vendor ID, and prints it:
char format[] = "%c%c%c%c%c%c%c%c%c%c%c%c\n";
__asm
{
mov eax,0x00
CPUID
// Prepare the arguments for printf
// Save the original value of ESP
mov esi,esp
push BL
SHR EBX,8
push BL
SHR EBX,8
push BL
SHR EBX,8
push BL

push DL
SHR EDX,8
push DL
SHR EDX,8
push DL
SHR EDX,8
push DL

push CL
SHR ECX,8
push CL
SHR ECX,8
push CL
SHR ECX,8
push CL
lea eax,format
push eax
call DWORD PTR printf
mov ESP,ESI
}


As you can see, I used ESI
mov esi,esp
to store ESP, the stack pointer, so that even though I make a lot of pushes for printf, I can easily restore the correct value of ESP afterwards, as if I had made all the necessary pops.

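Next,
push BL
pushes the 8 least significant bits of EBX onto the stack; in the Intel example this is the character G (0x47). (x86 has no 8-bit push encoding, so if your assembler rejects push BL, push EBX has the same effect here: printf's %c only looks at the low byte.)
To get the next 8 bits of EBX, we shift the register 8 bits to the right:
SHR EBX,8
and use BL again to push the next character onto the stack:
push BL
And so on, until we have pushed all the characters of the 3 registers onto the stack, followed by the address of the format string: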

lea eax,format
push eax

To finish, we only restore the value of the stack pointer:
mov ESP,ESI

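On my computer the output was:
DMAcitnehtuA
which, read backward, spells:
AuthenticAMD
The string comes out reversed because printf takes its arguments in the opposite order from the pushes: the value pushed last becomes the first %c.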

Isn't this fun???
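
If you want to see where the backward string comes from without touching assembly, here is a small simulation in plain C. It assumes nothing about the real CPU: the registers are passed in as numbers, a 12-entry array plays the role of the stack, and simulate_push_order is a name invented for this sketch.

```c
#include <stdint.h>

/* Mimics the listing: push the low byte, shift right by 8, repeat,
 * for EBX, then EDX, then ECX. The result is then read back the way
 * printf sees its arguments: most recent push first. */
static void simulate_push_order(uint32_t ebx, uint32_t edx, uint32_t ecx,
                                char out[13])
{
    uint32_t regs[3] = { ebx, edx, ecx };
    char stack[12];
    int top = 0;

    for (int r = 0; r < 3; ++r) {
        for (int i = 0; i < 4; ++i) {
            stack[top++] = (char)(regs[r] & 0xFF); /* push BL / DL / CL */
            regs[r] >>= 8;                         /* SHR reg,8 */
        }
    }

    /* the first %c receives the value that was pushed last */
    for (int i = 0; i < 12; ++i)
        out[i] = stack[--top];
    out[12] = '\0';
}
```

With the AMD values 0x68747541, 0x69746e65, 0x444d4163 this produces DMAcitnehtuA, the same as my output. To print the name in reading order, the pushes would have to go the other way around: ECX first, then EDX, then EBX, each one from its highest byte down to its lowest.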

SSE Support


To check for SSE support we must load EAX with 0x01 and then call CPUID:
mov eax,01h
cpuid

As we can read in Intel document 253665, after this call bit 25 of the EDX register tells us whether SSE is supported. In the same way:
bit 26 of EDX for SSE2 support
bit 0 of ECX for SSE3 support
bit 19 of ECX for SSE4.1 support (the level that includes DPPS; the code below just calls it SSE4)
The next code turns on the relevant bit in each of the sse, sse2, sse3 and sse4 variables (the << operator is a left shift):

int sse =0x01<<25; // turn on bit 25
int sse2=0x01<<26; // turn on bit 26
int sse3=0x01; // turn on bit 0
int sse4=0x01<<19; // turn on bit 19

Remember that:

0 and 0 = 0
1 and 0 = 0
0 and 1 = 0
1 and 1 = 1

This means that if we mask (AND) any number with the sse variable, only bit 25 of the original number is kept and all the other bits become 0.
The next code masks the registers and leaves only the relevant bit in each variable, to check later.

and sse,edx
and sse2,edx
and sse3,ecx
and sse4,ecx
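
The same masking can be written in plain C, which may be easier to follow than the and instructions. This is only a sketch: the struct sse_flags and the functions bit_is_set and decode_sse_flags are names I made up, and the EDX and ECX values are passed in as parameters instead of coming from a real CPUID call.

```c
#include <stdint.h>

/* One flag per extension; sse41 is what this tutorial calls SSE4. */
struct sse_flags {
    int sse;
    int sse2;
    int sse3;
    int sse41;
};

/* Returns 1 if the given bit of value is on, 0 otherwise. */
static int bit_is_set(uint32_t value, unsigned bit)
{
    return (value & (UINT32_C(1) << bit)) != 0;
}

/* Decodes the EDX and ECX values returned by CPUID with EAX = 1. */
static struct sse_flags decode_sse_flags(uint32_t edx, uint32_t ecx)
{
    struct sse_flags f;
    f.sse   = bit_is_set(edx, 25);
    f.sse2  = bit_is_set(edx, 26);
    f.sse3  = bit_is_set(ecx, 0);
    f.sse41 = bit_is_set(ecx, 19);
    return f;
}
```

For example, an EDX value with only bits 25 and 26 on and an ECX value of 1 decode to SSE, SSE2 and SSE3 supported and SSE4.1 not supported.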


Remember that in C/C++ the if statement has the form
if (condition) statement
and if the condition is a numeric value different from zero, the statement is executed.
If it is zero, the statement is skipped.
The next code checks the value of each variable to determine whether the CPU supports the extension.

if(sse){
printf("SSE is Supported\n");
}
else {
printf("SSE Not Supported\n");
}
if(sse2){
printf("SSE2 is Supported\n");
}
else {
printf("SSE2 Not Supported\n");
}
if(sse3){
printf("SSE3 is Supported\n");
}
else{
printf("SSE3 Not Supported\n");
}
if(sse4){
printf("SSE4 is Supported\n");
}
else {
printf("SSE4 Not Supported\n");
}
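
The __asm blocks in this tutorial are MSVC inline assembly, which the compiler only accepts in 32-bit builds. As a possible alternative, the sketch below wraps the compiler-provided helpers instead: __cpuid from <intrin.h> on MSVC and __get_cpuid from <cpuid.h> on GCC and Clang. The wrapper names query_cpuid and cpu_has_sse41 are invented for this example, and on any other compiler or a non-x86 target the wrapper just reports that CPUID is not available.

```c
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define HAVE_CPUID 1
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_CPUID 1
#else
#define HAVE_CPUID 0
#endif

/* Hypothetical wrapper: runs CPUID for the given leaf (the value that
 * goes in EAX) and stores EAX, EBX, ECX, EDX in regs[0..3].
 * Returns 1 on success, 0 if CPUID or the leaf is not available. */
static int query_cpuid(unsigned int leaf, unsigned int regs[4])
{
#if HAVE_CPUID && defined(_MSC_VER)
    int tmp[4];
    __cpuid(tmp, (int)leaf);
    for (int i = 0; i < 4; ++i)
        regs[i] = (unsigned int)tmp[i];
    return 1;
#elif HAVE_CPUID
    return __get_cpuid(leaf, &regs[0], &regs[1], &regs[2], &regs[3]);
#else
    (void)leaf;
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    return 0;
#endif
}

/* Returns 1 if SSE4.1 (bit 19 of ECX, leaf 1) is reported, 0 if it is
 * not, and -1 if CPUID could not be queried at all. */
static int cpu_has_sse41(void)
{
    unsigned int regs[4];
    if (!query_cpuid(1, regs))
        return -1;
    return (int)((regs[2] >> 19) & 1u);
}
```

The other bits (25 and 26 of EDX, 0 of ECX) can be tested the same way from regs[3] and regs[2].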


The Final Code:


#include <stdio.h>

void checkSSEsuport()
{
int sse =0x01<<25;
int sse2=0x01<<26;
int sse3=0x01;
int sse4=0x01<<19;
__asm
{
// CPUID with EAX = 1 returns the feature bits in EDX and ECX
mov eax,01h
cpuid
and sse,edx
and sse2,edx
and sse3,ecx
and sse4,ecx
}
if(sse){
printf("SSE is Supported\n");
}
else {
printf("SSE Not Supported\n");
}
if(sse2){
printf("SSE2 is Supported\n");
}
else {
printf("SSE2 Not Supported\n");
}
if(sse3){
printf("SSE3 is Supported\n");
}
else{
printf("SSE3 Not Supported\n");
}
if(sse4){
printf("SSE4 is Supported\n");
}
else {
printf("SSE4 Not Supported\n");
}
char format[] = "%c%c%c%c%c%c%c%c%c%c%c%c\n";
__asm
{
mov eax,0x00
CPUID
// Prepare the arguments for printf
// Save the original value of ESP
mov esi,esp
push BL
SHR EBX,8
push BL
SHR EBX,8
push BL
SHR EBX,8
push BL

push DL
SHR EDX,8
push DL
SHR EDX,8
push DL
SHR EDX,8
push DL
push CL
SHR ECX,8
push CL
SHR ECX,8
push CL
SHR ECX,8
push CL
lea eax,format
push eax
call DWORD PTR printf
mov ESP,ESI
}
}

int main()
{
checkSSEsuport();
getchar();
return 0;
}

My output: [screenshot]

Please comment!