This tutorial is intended to explain simple Vector Algebra.
Then it will show how to create a simple Vector Library overloading the operators to perform dot product, scalar product, cross product, addition, subtraction, magnitude, scalar addition, and scalar subtraction. 
This tutorial will be introductory, so that in later tutorials you will modify the class making it more efficient using inline assembly and Streaming SIMD Extensions (SSE) of Intel.
Simple Vector Algebra
This tutorial will make the assumption that you already have taken a course of vector algebra and you only need to brush up on that(in my Mexico they teach you vector algebra like in kindergarten). So I am going strait forward with the definitions (if you Wish a Tutorial of Linear Algebra, just mail me, or leave a comment).
Quaternions
What are quaternions? and why use them?
A quaternion is, a ordered list of elements(Tuple) of 4 elements. 
we will use them because of the advantages they offer compered to 3d dimensional vectors. Like no lost of a degree of freedom(http://en.wikipedia.org/wiki/Gimbal_lock). And its more efficient when operating with 4x4 Matrix, because one matrix multiplication may scale,rotate and translate the vector(if working with 3d vectors, 3 matrix multiplications would be necessary one for each transformation).
Header File of the CVector Class
#ifndef _CVECTOR_ // Check if C Vector is defined 
#define _CVECTOR_ // the next time CVector will be define
#define ZEROVECTOR CVector()
class  CVector    
{       // Private 
 float x,y,z,w; // Order list of 4 elements |x|y|z|w|
public:
 static char sBuffer[38];  // holds the string of given vector
 char* toString();    // Return sBuffer with (x,y,z,w) values
 CVector (void);      // zero Vector Constructor
 CVector (float,float,float);   // Constructor
 CVector (CVector&);            // Copy Vector Constructor
 ~CVector();                    // Destructor
 CVector  operator+(CVector&);  // Addition
 CVector  operator-(CVector&);  // Subtraction
 float    dotProduct(CVector&);  
 void  operator=(CVector&);     // Copy Vector
 int  operator==(CVector&);     // Comparison Vector
 CVector  operator*(CVector&);  // Cross Product
 CVector  operator*(float);     // Scalar Multiplication
 float length();                // Magnitude
 void normalize();
};
#endif // _CVECTOR_
If you understand this completely jump to theCVector Memory Organization 
Preprocessor Directive
C++ Tutorial says:
Preprocessor directives are lines included in the code of our programs that are not program statements but directives for the preprocessor. These lines are always preceded by a hash sign (#). The preprocessor is executed before the actual compilation of code begins, therefore the preprocessor digests all these directives before any code is generated by the statements.
These preprocessor directives extend only across a single line of code. As soon as a newline character is found, the preprocessor directive is considered to end. No semicolon (;) is expected at the end of a preprocessor directive. The only way a preprocessor directive can extend through more than one line is by preceding the newline character at the end of the line by a backslash (\).Conditional inclusions #ifndef
C++ Tutorial says:
These directives allow to include or discard part of the code of a program if a certain condition is met#ifndef something
// some mystifying  code...
// that will be include if something is Not defined
#endif
Preprocessor Macro #define
#define identifier replacementC++ Tutorial says:
When the preprocessor encounters this directive, it replaces any occurrence of identifier in the rest of the code by replacement. This replacement can be an expression, a statement, a block or simply anything. The preprocessor does not understand C++, it simply replaces any occurrence of identifier by replacement.Error C2011: 'CVector' :'class' type redefinition
This is a common error among programmers. This error is because the class could of got defined twice when using it in multiple files.
The solution to this problem is to always check if the class has already been defined, if is not define(#ifndef). Define (#define) the hole Class(#endif)
This is the reason for this code:
#ifndef _CVECTOR_ // Check if C Vector is defined 
#define _CVECTOR_ // the next time CVector will be define
// All the header declaration!
// And Class here
#endif // _CVECTOR_
Organization of the program in memory
I find this interesting article that explain how is the memory organized from the 
University of Hawaii When a program is loaded into memory, it is organized into three areas of memory, called segments: the text segment, stack segment, and heap segment. The text segment (sometimes also called the code segment) is where the compiled code of the program itself resides. This is the machine language representation of the program steps to be carried out, including all functions making up the program, both user defined and system. 
The remaining two areas of system memory is where storage may be allocated by the compiler for data storage. The stack is where memory is allocated for automatic variables within functions. A stack is a Last In First Out (LIFO) storage device where new storage is allocated and deallocated at only one ``end'', called the Top of the stack.
The heap segment provides more stable storage of data for a program; memory allocated in the heap remains in existence for the duration of a program. Therefore, global variables (storage class external), and static variables are allocated on the heap. The memory allocated in the heap area, if initialized to zero at program start, remains zero until the program makes use of it. Thus, the heap area need not contain garbage.
It's important to notice that the functions are defined in the code segment and not in the stack(the diagram of Hawaii University might be a little bit confusing at first).
Memory organization of a C++ class
All variables(attributes) of a class are stored in continues memory and ordered by how you declared them(this is the object essence). The static variables are stored else where(in the heap)
The Function(methods) definitions of the class are in the code segment
The location of the object essence may vary depending of where you declared the object(if inside a function in the stack; Outside or static definition will place it in the heap) 

Remember that in a 32 bit architecture(x86) the size of a float is 32 bits(4 bytes[8 bit],4*8=32)
This means that the total size of our objetc in memory is 128 bits(4 byte[8 bits] floats * 4 vectors).
sizeof(float);   // should return 4
sizeof(CVector); // should return 4*4=16
All this information will be important when designing our vector class functions(methods) in inline ASM. Because we will use the xmm register(128 bit size registers to hold 4 floats).
class  CVector    
{       // Private 
 float x,y,z,w; // Order list of 4 elements |x|y|z|w|
public:
 static char sBuffer[38];  // holds the string of given vector
...
Remember that static variables are in the heap elsewhere. sBuffer and toString method are only implemented for debugging reasons and will not be in the final code.
CVector Constructors
CVector::CVector (float xi,float yi,float zi){  // Constructor
 x=xi;
 y=yi;
 z=zi;
 w=1.0f; // normalize the vector
}
this first constructor initialize the inner variables(attributes) of the object. It receives 3 floats(4 bytes[32 bits] in a 32 bit architecture), total 3*4bytes= 12 bytes(96 bits). And homogenize the vector(w=1.0f).
CVector::CVector (CVector& a){   // Copy Vector Constructor
 x=a.x;
 y=a.y;
 z=a.z;
 w=a.w;
}
As you may see, here we receive CVector by reference(the & denotes that). This means that we are only passing the reference of the object and not the object it self(only 4 byte pointer in a 32 bit architecture insted of 128 bits(object size)). its recommended that you pass always by reference(pointer, address).
CVector Operator definition
As you may recall from your kindergarten courses, the sum of two vector defined in 3 dimensional space is the sum of each component individually (eg.(1, 2, 3) + (−2, 0, 4) = (1 − 2, 2 + 0, 3 + 4) = (−1, 2, 7)).
CVector  CVector::operator+(CVector& a){  // Addition
 return CVector(x +a.x,y +a.y,z +a.z);
}
We are passing by reference CVector& a, and we use the constructor we defined earlier to build a new temporary object that will be copied to eax(return accumulator). this is inefficient because we fill up the stack with 3 floats when we call CVector constructor, pop the 3 floats create a temporary vector and return it.
In the next tutorial we will solve this problem, although it will lose the elegance.
Copy function
Copy function
void CVector::operator=(CVector& a){  // Copy Vector
 x=a.x;
 y=a.y;
 z=a.z;
 w=a.w;
}
as you can see here we also are passing by refence CVector&a, but here we are not returning the new copy of CVector because we want to change the inner values of the object(its attributes).
in C/C++ like: 
v=v+w;
Our program would fist call the function of addition and will return a new CVector who's address will be stored in eax accumulator register, and later on would call the copy function and copy each component to the desired object. This is very inefficient as we will see later all this can be done with out creating a temporary register, and filling up the stack.
v+=w;
This statement would do the same but with out calling the copy function and with out creating a new CVector.
CVector  CVector::operator-(CVector& a){  // Subtraction
 return CVector(x -a.x,y -a.y,z -a.z);
}
The subtraction is define as the addition(we only substract)
float    CVector::dotProduct(CVector& a){
 return x*a.x+y*a.y+z*a.z;
}
The Dot product of two vectors a = [a1, a2, ... , an] and b = [b1, b2, ... , bn] is defined as:

CVector Class definition C++ File
This Code is deprecated.. Soon I will post the new one 
#include "CVector.h"
#include <math.h>
#include <stdio.h>
CVector::CVector (float xi,float yi,float zi){  // Constructor
 x=xi;
 y=yi;
 z=zi;
 w=1.0f; // normalize the vector
}
CVector::CVector (CVector& a){   // Copy Vector Constructor
 x=a.x;
 y=a.y;
 z=a.z;
 w=a.w;
}
CVector::~CVector(){}    // Destructor
CVector  CVector::operator+(CVector& a){  // Addition
 return CVector(x +a.x,y +a.y,z +a.z);
}
CVector  CVector::operator-(CVector& a){  // Subtraction
 return CVector(x -a.x,y -a.y,z -a.z);
}
float    CVector::dotProduct(CVector& a){
 return x*a.x+y*a.y+z*a.z;
}
char CVector::sBuffer[38];// definition outside class declaration
char* CVector::toString(){// 8 bytes per float*4+5+1=38 bytes per vector
 sprintf_s(sBuffer,38,"(%f,%f,%f,%f)",x,y,z,w);
 return sBuffer;
}
int  CVector::operator==(CVector& a){  // Comparison Vector
 
 return (x==a.x && y==a.y && z==a.z);
}
CVector::CVector (void){
 x=0.0f;
 y=0.0f;
 z=0.0f;
 w=1.0f;
}
void CVector::operator=(CVector& a){  // Copy Vector
 x=a.x;
 y=a.y;
 z=a.z;
 w=a.w;
}
CVector  CVector::operator*(CVector& a){  // Cross Product
 return CVector(y*a.z-z*a.y,z*a.x-x*a.z,x*a.y-a.x*y);
}
CVector CVector::operator*(float a){  // Scalar Multiplication
 return CVector(x*a,y*a,z*a);
}
 float CVector::length(){   // Magnitude
 return sqrt(x*x+y*y+z*z);
 }
 void CVector::normalize()
 {
 float len = length();
 x = x/len;
 y = y/len;
 z = z/len;
 w=1.0f;
 }
Main Test File
#include "CVector.h"
#include "CVector.h"
#include <stdio.h>
#include <string>
#define TEST_NORMALIZE_ERROR -0x01
#define TEST_CROSSPRODUCT_ERROR -0x02
#define TEST_SCALAR_ERROR -0x04
#define TEST_SUBTRACTION_ERROR -0x08
#define TEST_ADDITION_ERROR -0x10
#define TEST_COPY_ERROR -0x20
int main()
{
 CVector i,j,k;
 CVector a,b,c;
 i = CVector(5.0f,0.0f,0.0f);
 j = CVector(0.0f,6.0f,0.0f);
 printf("CVector Example\n");
 // Unit test.
 i.normalize();
 j.normalize();
 k = i*j;
 if(i.length() != 1.0f){
  printf("Failed to Normileze i");
  getchar();
  return TEST_NORMALIZE_ERROR;
 }
 if(j.length() != 1.0f){
  printf("Failed to Normileze j");
  getchar();
  return TEST_NORMALIZE_ERROR;
 }
 if(k.length() != 1.0f){
  printf("Failed k Should be normalized");
  getchar();
  return TEST_CROSSPRODUCT_ERROR;
 }
 if(!(k == CVector(0.0f,0.0f,1.0f))){
  printf("Failed Cross Product, k=i*j");
  getchar();
  return TEST_CROSSPRODUCT_ERROR;
 }
 if(!(((i*k)*5.0f)==j*(-5.0f))){
  printf("Failed Scalar Multiplication or Cross Product");
  getchar();
  return TEST_CROSSPRODUCT_ERROR|TEST_SCALAR_ERROR;
 }
 if(!(c-c==ZEROVECTOR)){
  printf("Failed to Substract");
  getchar();
  return TEST_SUBTRACTION_ERROR;
 }
 c = CVector(7.0f,2.0f,-4.0f);
 b = c;
 if(!(b==c)){
  printf("Failed to Copy Vector");
  getchar();
  return TEST_COPY_ERROR;
 }
 c = CVector(7.0f,2.0f,-4.0f);
 if(a=c+c,b=c*2.0f,!(a==b)){
 //if(!(CVector(4.0f,2.0f,-4.0f)+CVector(4.0f,2.0f,-4.0f)==(CVector(4.0f,2.0f,-4.0f)*2.0f))){
  printf("Failed to Addition or Scalar Multiplication");
  getchar();
  return TEST_ADDITION_ERROR|TEST_SCALAR_ERROR;
 }
 printf("All test Pass. \n");
 getchar();
}
Download project File  for Visual Studio 2008