14 Nov, 2009

RAM Speed

Posted by Bhavin Turakhia

To test the speed of RAM, I got Ramki to run a small program that writes a set of bytes into memory a billion times. We ran four instances of it simultaneously on a dual-processor, quad-core machine. The results for each instance are below.

Result

output.1:       User time (seconds): 545.99
output.1:       System time (seconds): 1.33
output.1:       Elapsed (wall clock) time (h:mm:ss or m:ss): 9:07.38
output.1:       Involuntary context switches: 820

output.2:       User time (seconds): 250.90
output.2:       System time (seconds): 1.18
output.2:       Elapsed (wall clock) time (h:mm:ss or m:ss): 4:12.12
output.2:       Involuntary context switches: 378

output.3:       User time (seconds): 250.30
output.3:       System time (seconds): 1.15
output.3:       Elapsed (wall clock) time (h:mm:ss or m:ss): 4:11.49
output.3:       Involuntary context switches: 373

output.4:       User time (seconds): 563.62
output.4:       System time (seconds): 1.31
output.4:       Elapsed (wall clock) time (h:mm:ss or m:ss): 9:25.00
output.4:       Involuntary context switches: 845

Observations

  • The write speed ranged from roughly 0.25 to 0.56 seconds per million writes (user time divided by the billion writes)
  • Output.2 and .3 took a little less than half the time of .1 and .4
  • I don’t have a specific theory on why two of the instances did better than the other two
  • No processor affinity was set, so the scheduler was free to move each process to a different processor after every context switch.
  • Seemingly the processes were accessing RAM simultaneously. In my limited knowledge that could mean a few things: a multi-channel (dual) FSB, and additionally that while one process was computing, the other processes could access the FSB. The program used lrand48 to generate a random number for each write, so that data went to random locations and we did not rely too much on the L1/L2 cache
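
For reference, here is a minimal sketch of what such a test might look like in C. This is not the program Ramki ran. The chunk size `CHUNK_BYTES` and the function names `random_writes`, `time_random_writes` and `seconds_per_million` are assumptions made for illustration. It only shows the general idea: use `lrand48` to pick a random offset, keep it inside the buffer, write a few bytes there, and time the loop.

```c
#define _XOPEN_SOURCE 600  /* expose lrand48/srand48 under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

#define CHUNK_BYTES 8  /* hypothetical size of the "set of bytes" per write */

/* Write `writes` chunks at pseudo-random offsets inside buf[0..size).
 * The modulo keeps every chunk fully inside the buffer. */
void random_writes(volatile unsigned char *buf, size_t size, unsigned long writes)
{
    assert(buf != NULL && size >= CHUNK_BYTES);
    size_t span = size - CHUNK_BYTES + 1;  /* number of valid start offsets */
    for (unsigned long i = 0; i < writes; i++) {
        /* lrand48() returns a value in [0, 2^31), so only the first
         * 2 GiB of a larger buffer would ever be touched. */
        size_t off = (size_t)lrand48() % span;
        for (size_t k = 0; k < CHUNK_BYTES; k++)
            buf[off + k] = (unsigned char)(i + k);
    }
}

/* Run the write loop and return the CPU time it consumed, in seconds. */
double time_random_writes(unsigned char *buf, size_t size, unsigned long writes)
{
    clock_t start = clock();
    random_writes(buf, size, writes);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Convert a total time into seconds per million writes. */
double seconds_per_million(double seconds, unsigned long writes)
{
    return seconds * 1e6 / (double)writes;
}
```

A caller would allocate a buffer much larger than the L2 cache, seed the generator with `srand48`, and pass a billion as `writes`. By the same conversion, the 250.90 seconds of user time reported for output.2 comes to about 0.25 seconds per million writes.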


Comments
ashok hingorani
November 15, 2009

just a thought Bhavin
While much “seems” to happen simultaneously, unless the OS is specifically supporting parallel processing, things are still going to happen sequentially, because the OS is handing out time slots / processor cycles.

Similarly, even on a quad-core not all cores are equal: the primary usually has to host the OS itself, and often one core is delegated the I/O processes. These two cores will always run “slower” than the others on a test such as the one you describe.

question 1: were you generating a new random number for each write? It is not relevant to the speed test. The same number could be written repeatedly, removing one of the steps being run and giving a clearer picture of the others.

question 2: were you just testing for MIPS, or the load balancing across the cores etc.?

warm regards

ashok

ashok hingorani
November 15, 2009

……
figures indicate that the involuntary context switching was probably the cause of the asymmetrical performance of the cores, as the OS switches from its own needs to those of the user program.

brgds

ashok

Bhavin Turakhia
November 15, 2009

@ashok: You are right about the context switching. Your note about the asymmetric distribution of workload between cores explains the difference in the context-switch counts. We were looking at MIPS. We were generating a new random number for each write, but the random number was not what was being written. It was only used to pick a random location to write to, so that the L2 cache is not used.

ashok hingorani
November 15, 2009

understood about the random number Bhavin, thanks. But you have to ensure it does not fall outside the addressable area, hence more calculations…

I don’t pretend to be the hardware expert but I do experiment with every new thing that comes out, software or hardware, so I did some extensive testing with the quad-core before recommending it to clients.

in my case the Groove Sync process runs continuously in the background and a UI in foreground so was vital to see if i could dedicate resources as i wanted, and the speeds i would get with diff configs.

I did feel that the cache is used only for reads – all writes seemed to write-through which is what they state in the spec somewhere.

However in real world config i found that it was the HDD speed not the RAM that played a significant role – processors and memory have become too fast for the device and you never see optimum performance for the quadcore in business computing.

if I can contribute to your efforts in any way please feel free to call – my cell – +91-98927-99391

brgds

ashok

ashok hingorani
November 15, 2009

Bhavin,

just to clarify, while I understand hardware, my real business is software development: ERPs for the last 25 years for the Apparel and PoS verticals around the world. In 2001 my company was busted up in an M&A style JV with a major Indian company and I had to start again.

At that time Ray Ozzie released a technology he personally wrote as an answer to Lotus Notes, the 800 lb gorilla. His own product but he felt IBM made a mess of it. Too expensive, too difficult, too demanding of resources – people and hardware.

He called this new platform Groove. A collaboration tool that needs no servers no special comms or infrastructure or even IT admin to connect thousands of people across multiple countries, keeping all manner of digital information in sync, with zero effort on the part of the user.

Given your company profile you would find this just perfect for your operations at every level from project control to just day to day management. If i have not already bored you at some BTC dinner then allow me to share this with you now.

I don’t sell it, just evangelise what I believe is the perfect solution for India and SMBs. Bear in mind it is also used by GSK, Pfizer etc. for all R&D, and by the United Nations for peacekeeping: anything that requires scattered people to be on the same page, with total security of information. So it is also an enterprise product, just easy and cost effective enough for even a 5 man company to use.

It is significant that Groove is the only platform certified by the US DoD, NSA, HIPAA you name it, for 100% reliability and security. Never been cracked never fails to sync.

I use the word platform not tool because while you can get 80% of your collaboration needs out of the box, you can also customise it for any business process that needs distributed computing, data capture at source, last mile solutions etc.

I could go on but i have pasted a link to some presentations i wrote and a number of very interesting case studies, if you have the time.
http://computact.web.officelive.com/default.aspx

I have been a Groove development partner for the last 8 years, and after the MS acquisition, I am the MS MVP (Most Valuable Professional) for Groove and the only one in India. It will be my pleasure to show you this remarkable capability in person any time you say.

or we can catch up at the next BTC dinner :-)

warm regards

ashok

Deepak
December 9, 2009

Hi Bhavin,

Sorry for asking an off-topic question, but I was going through your blog post of Sept 2008 on the software developer community in India, in which you gave a city-wise breakup of the number of developers.

I was curious to know whether you can tell me the total number of developers in India as of today. This will help me in one of my projects.

You can reply to my email id: deepak.rajgarhia@gmail.com

Thanks in advance
Deepak
