Code Optimization

Interesting things in software development and code optimization

Multiplication and Division - could be even faster?

Hi friends,


today we are going to work on something that looks and sounds crazy, we are going to speed up multiplication and division.

First, lets look into a code that we will try to speed up: 

double p1 = 0, p2 = 0;

int count = 100000000;

Stopwatch sw = new Stopwatch();

Int32 a = 0, b = 0;

sw.Start();

for (int i = 0; i < count; i++)

{

a = i;

a *= 2;

b += a;

}

sw.Stop();

p1 = sw.ElapsedMilliseconds;

Console.WriteLine(sw.ElapsedMilliseconds + " : " + a);

As you can see the code just measures time that multiplication takes, on my PC this cycle takes about 160 ms.

Now lets add similar code but optimized and we are expecting it will be faster: 

-

static void Main(string[] args)

{

double p1 = 0, p2 = 0;

int count = 100000000;


Stopwatch sw = new Stopwatch();

Int32 a = 0, b = 0;

sw.Start();

for (int i = 0; i < count; i++)

{

a = i;

//multiply by 2

a *= 2;

b += a;

}

sw.Stop();

p1 = sw.ElapsedMilliseconds;

Console.WriteLine(sw.ElapsedMilliseconds + " : " + a + " : " + b);

sw.Reset();

sw.Start();

b = 0;

for (int i = 0; i < count; i++)

{

a = i;

//multiply by 2

a <<= 1;

b += a;

}

sw.Stop();

p2 = sw.ElapsedMilliseconds;

Console.WriteLine(sw.ElapsedMilliseconds + " : " + a + " : " + b);

Console.WriteLine("percent faster: " + 100.0 / p1 * (p1 - p2));

Console.WriteLine("times faster: " + (p1 / p2));

Console.ReadLine();

}

If you will compile it and run (make sure you compile it in Release mode and run outside of Visual Studio) you will see almost no difference

in speed between these two cases of multiplication. On my PC it shows something like that:


161 : 199999998 : 1774919424
157 : 199999998 : 1774919424
percent faster: 2.48447204968944
times faster: 1.02547770700637


but we can say that the result almost the same. Interesting and not really, but do not be  sad, every next step will amaze you ;)


Another type


So, the next step is to test it with another type of integer - Int64 (or long), and what we see? We see that now it is about two times faster, here is my result:


846 : 199999998 : 9999999900000000
342 : 199999998 : 9999999900000000
percent faster: 59.5744680851064
times faster: 2.47368421052632


ha, interesting? So using simple bit scrolling to multiply integer value by 2, 4, 8, 16 ... much faster than multiplication itself.

Lets continue with division.


Division


Ok, now, lets go through the same steps but this time with division, here is the code:

static void Main(string[] args)

{

double p1 = 0, p2 = 0;

int count = 100000000;

Stopwatch sw = new Stopwatch();

Int32 a = 0, b = 0;

sw.Start();

for (int i = 0; i < count; i++)

{

a = i;

//div by 2

a /= 2;

b += a;

}

sw.Stop();

p1 = sw.ElapsedMilliseconds;

Console.WriteLine(sw.ElapsedMilliseconds + " : " + a + " : " + b);

sw.Reset();

sw.Start();

b = 0;

for (int i = 0; i < count; i++)

{

a = i;

//div by 2

a >>= 1;

b += a;

}

sw.Stop();

p2 = sw.ElapsedMilliseconds;

Console.WriteLine(sw.ElapsedMilliseconds + " : " + a + " : " + b);

Console.WriteLine("percent faster: " + 100.0 / p1 * (p1 - p2));

Console.WriteLine("times faster: " + (p1 / p2));

Console.ReadLine();

}

Compile and run it and you will see that the speed also almost the same:


180 : 49999999 : -1728753792
158 : 49999999 : -1728753792
percent faster: 12.2222222222222
times faster: 1.13924050632911


And as you may expect changing from Int32 to Int64 should be even more faster, and this is true, and in my case it more than 8 times faster:


2978 : 49999999 : 2499999950000000
344 : 49999999 : 2499999950000000
percent faster: 88.4486232370719
times faster: 8.65697674418605


So, the main thing here is that multiplication and division on non-native types are complex operations that involve more processor consumption than simple bit scrolling.

On the other hand the Int64 type is not native type for 32 bit environment and it takes more time to operate with it.

But if you will compile this code as x64 and run it under 64 bit operating system, you will see completely another result, here is mine:


229 : 49999999 : 2499999950000000
151 : 49999999 : 2499999950000000
percent faster: 34.061135371179
times faster: 1.51655629139073


this is because of under 64 bit operating system the Int32 and Int64 are native types (means a CPU register may handles whole value).

One more thing is the Any CPU compilation - it seems not so good as you may expect.


Thank you, and do not hesitate to comment it.


1vqHSTrq1GEoEF7QsL8dhmJfRMDVxhv2y



Bits, bytes and masks

Hello,


Today we are going to learn how to apply mask to any type of data. We will look into mask itself, and apply a mask to an image.

Lets look into the image below to get general idea of mask:


As you can see we took the star mask and an image and got imaged star in output. Cool? :)

Hmm, what is going on and how to do it? Everything is very simple if you have learned my post about bit manipulations.

So, to do it we need the following things:

- bit mask;

- an image;

- go through each byte and apply AND operation for each couple of bytes (actually Int32 in my case as it is much faster).

Here is how:

Int32[] rgba_source = new Int32[256 * 256];

Int32[] rgba_dest = new Int32[256 * 256];

for (int i = 0; i < rgba_source.Length; i++)

{

rgba_dest[i] = rgba_dest[i] & rgba_source[i];

}

Here is the result of masking two images:


That's it, you have got like textured object/figure :).


Now, just imagine where you can use such approach and with which kind of data, as well as other operations like XOR and OR.

In short - it can be used in hash, crypto, visualization algorithms and much more.

One easiest encryption algorithm is just apply XOR for each char of string, for example:

string origin = "hello world";

string encoded = string.Empty;

string decoded = string.Empty;

for (int i = 0; i < origin.Length; i++)

{

encoded += (char)(origin[i] ^ 0x003F);

}

And we will get "WZSSPHPMS[" in the encoded variable, that is equal to hello world but just encoded. To decrypt/decode it, just make the same XOR:

for (int i = 0; i < encoded.Length; i++)

{

decoded += (char)(encoded[i] ^ 0x003F);

}

And you got your "hello world" string back again in the decoded variable.

Think about it and a lot of things you can do with that ;)


Thank you and see you next time.


1vqHSTrq1GEoEF7QsL8dhmJfRMDVxhv2y



Bit manipulations

Hi,

Today, I will explain bit manipulations. As you know each bit set in its own position from 0 to 7.

As an example, Lets use the following byte 00000010 which is 0x02 in hex-decimal and 2 in decimal.

It has "1" at 1st place (lets count it from zero place to 7 place).


So shifting bits to the left will move all bits to the left:

00000010 << 2 = 00001000

As you see now 1 on the 3rd place, instead of 1st place, because of we have shifted all bits to the left 2 times. Same rule is for the right shifting.


Another type of bit manipulation is boolean operations like XOR, OR and AND.

OR just sets 1 on its place:

0010 OR 0001 = 0011
0010 OR 0010 = 0010


XOR makes the same as OR and inverts bits at the same places if they both equal to 1  (outputs 1 only when both bits differ):

0010 XOR 0001 = 0011
0010 XOR 0010 = 0000


AND leaves bits that equal to 1 at the same place only:

0010 AND 0001 = 0000
0010 AND 0010 = 0010


Ha, seems nothing too hard, doesn't? :)


You may ask - why do we need it? I will explain and show an example where and how to use it.
Lets imagine that you work with bitmaps in C#, do you want it to be really fast? I'm guess yes for sure. So lets figure out how to make it very fast.
As I said, graphic memory and images itself can be RGB and RGBA (in most general cases), and we will work with RGBA as it has transparent component with is called Alpha. So RGBA is just a group of 4 bytes, do you remember what type is represented by 4 bytes in C#? Yes, it is Int32. So we are going to use Int32 to work with our images. And here is how:

Rectangle rect = new Rectangle(0, 0, tmpi.Width, tmpi.Height);

BitmapData bmpd = tmpi.LockBits(rect, ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);

Int32[] src = new Int32[bmpd.Stride / 4 * tmpi.Height];

System.Runtime.InteropServices.Marshal.Copy(bmpd.Scan0, src, 0, src.Length);

Do you think I'm crazy? :)
No, because of reading and writing 4 bytes instead of 1 byte is much faster, thus I do not use byte[].
But now we need to split Int32 onto RGBA and here is how we will do it:

public void SetPointAsIs(byte r, byte g, byte b, byte a, int x, int y)

{

src[Stride * Y + x] = (a << 24) + (r << 16) + (g << 8) + b;

}

public ColorBgra GetPoint(int x, int y)

{

Int32 argb = src[Stride * y + x];

ColorBgra c = new ColorBgra();

c.A = (byte)(argb >> 24);

c.R = (byte)(argb >> 16 & 0xff);

c.G = (byte)(argb >> 8 & 0xff);

c.B = (byte)(argb & 0xff);

return c;

}

This code will compiled into command like MOV ds:[edx + offset], 0x11223344, instead of four MOV op-codes that will move each byte separately.



Thank you, and see you next time :)

1vqHSTrq1GEoEF7QsL8dhmJfRMDVxhv2y



Hardware and Bits

Hello,


As I said we would go from basic to complex things step by step and this post will clarify some important things that you must know about hardware and bits.


Each PC consists of different devices like monitor, HDD, keyboard, mouse, etc., and each device has its own controller (a chip) that controls device. For example, mouse has a small chip that controls laser and gets and sends commands and data to and from PC.


Almost each device has its own IRQ - this is interrupt number that assigned to a device by system to be able to get and send commands. So when you move your mouse it triggers a IRQ and provides data like DX and DY and system knows what direction should it move a cursor to and how many points the cursor should be moved.

This is general explanation and to be able to understand more you have to find a book and read it.


So devices talk to each other or to system via Bytes. They send a lot of bytes and each byte consists of 8 bits, like:

...

0 0 0 0   0 0 0 0

0 1 1 0   1 1 1 1

...

To understand what these bits represent in a more human readable way you have to know how to convert binary numeral system into hex-decimal or decimal numeral system. You can find it on internet or a book.


So, when I started to learn programming my first language was Assembler for Z-80 CPU. There was a book that had description for all 255 op-codes (commands of CPU) and some additional information, but it was not enough for me to understand anything and more over to create at least such simple program like the popular "Hello World!".

I did learn bits, bytes, bit operations like shift to the left/right, conditional operations like jump here or there, CPU registers, etc. And that was much but not enough to make the "Hello World!".

Then I did figure out memory structure (my brother did help me in this) and at some point there was like a flash in my mind - "Or God! That is how it all works!".


Main thing is to understand memory structure and how controllers works. In my case I had video memory mapped from 16384 address to 32767, and one part of this memory did represent pixels, and another part - colors.

Filling just pixel memory with correct bits will lead to this:


that is represented by bits:

1 1 0 0 1 1 0 0    = 0xCC
1 1 0 0 1 1 0 0    = 0xCC
1 1 0 0 1 1 0 0    = 0xCC
1 1 1 1 1 1 0 1    = 0xFD
1 1 1 1 1 1 1 1    = 0xFF
1 1 0 0 1 1 1 0    = 0xCE
1 1 0 0 1 1 0 1    = 0xCD
1 1 0 0 1 1 0 0    = 0xCC
....

So, as you can see you have to set bits in bytes in corresponding places and bytes into correct memory address.

Then you can color it as you need by setting color-bytes into right place in memory.


Nova days graphic memory structure is more simple and represented by 3 or 4 bytes RGB or RGBA. This leads to that, that modern PCs have to have more memory and resources and speed to provide such simplicity. 


Next step we will look into bits and bit shifting as well as bit operations - that is very important in software development.


Thank you.


1vqHSTrq1GEoEF7QsL8dhmJfRMDVxhv2y



Do you want to be best software developer?

Hi everyone,

I'm going to share two simple things you need to know to be one of the best software developers.

Really two things and they are very simple:

- You must know English language at least;

- You must understand hardware, bits, bytes and bit operations;


To be honest, one more and very useful thing is Math, thats all what you need.

Huh, seems not too much but not too little.


Lets go through all of this step by step, I will post articles here with everything you have to go through to get success.

Thank you, and be patient ;)




1vqHSTrq1GEoEF7QsL8dhmJfRMDVxhv2y