{"id":342,"date":"2012-11-21T18:33:30","date_gmt":"2012-11-21T18:33:30","guid":{"rendered":"http:\/\/www.starcoder.com\/wordpress\/?p=342"},"modified":"2021-10-30T19:54:07","modified_gmt":"2021-10-30T19:54:07","slug":"overhead-while-using-gcd","status":"publish","type":"post","link":"https:\/\/www.starcoder.com\/wordpress\/2012\/11\/overhead-while-using-gcd\/","title":{"rendered":"Overhead while using GCD"},"content":{"rendered":"<p>Today I spent some time optimizing the Particle Mode simulation code in Seasonality Core.  While doing some measurements, I discovered that quite a bit of time was spent in GCD code while starting new tasks.  I use dispatch_apply to iterate through the particles and run the position and color calculations for the next frame.  In the tests below, I was simulating approximately 200,000 particles on the Macs, and 11,000 particles on the iPad.<\/p>\n<p>I decided to try breaking the tasks up into fewer blocks, and run the dispatch_apply for groups of around 50 particles instead of running it for each particle.  After making this change, the simulation ran in up to 59% less CPU time than before.  
Here are some informal numbers, roughly estimated from Activity Monitor:<\/p>\n<table style=\"padding: 10px;\" border=\"0\" cellspacing=\"5\">\n<tbody>\n<tr>\n<td colspan=\"2\">\u00a0<\/td>\n<td colspan=\"3\" align=\"center\"><b>CPU Usage<\/b><\/td>\n<\/tr>\n<tr>\n<td><b>Device<\/b><\/td>\n<td>\u00a0<\/td>\n<td align=\"center\"><i>Before<\/i><\/td>\n<td>\u00a0<\/td>\n<td align=\"center\"><i>After<\/i><\/td>\n<td>\u00a0<\/td>\n<td align=\"center\"><b>Time Savings<\/b><\/td>\n<\/tr>\n<tr>\n<td>Mac Pro (2009, 8-core 2.26 GHz Xeon)<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">390%<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">160%<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">59%<\/td>\n<\/tr>\n<tr>\n<td>Retina MBP (2012, quad-core 2.6 GHz i7)<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">110%<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">90%<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">18%<\/td>\n<\/tr>\n<tr>\n<td>MacBook Air (2011, dual-core 1.8 GHz i7)<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">130%<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">110%<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">15%<\/td>\n<\/tr>\n<tr>\n<td>\u00a0<\/td>\n<\/tr>\n<tr>\n<td>iPad 3 (fewer particles)<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">85%<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">85%<\/td>\n<td>\u00a0<\/td>\n<td align=\"center\">0%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>As you can see, the benefits from the new code running on the Mac Pro are substantial.  With my earlier code, I had wondered why the simulation used so much more CPU on the Mac Pro than on the laptops.  Evidently the overhead of starting each GCD task was a lot higher on the older Xeon CPU.  The change brings the Mac Pro&#8217;s CPU usage closer to that of the newer processors.<\/p>\n<p>An even more surprising result is the lack of a speedup on the iPad.  While measuring both runs, the two versions averaged about the same usage.  
Perhaps if I had a more formal way to measure the processing time, a small difference might become apparent, but overall the difference was minimal.  I&#8217;m guessing that Apple has built logic into the A-series CPUs that allows for near-zero cost in context switching.  It makes you wonder how much quicker something like this would run if Apple built their own desktop-class CPUs.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Today I spent some time optimizing the Particle Mode simulation code in Seasonality Core. While doing some measurements, I discovered that quite a bit of time was spent in GCD code while starting new tasks. I use dispatch_apply to iterate through the particles and run the position and color calculations for the next frame. In [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8,49,27],"tags":[],"class_list":["post-342","post","type-post","status-publish","format-standard","hentry","category-coding","category-ipad","category-seasonality","post-preview"],"_links":{"self":[{"href":"https:\/\/www.starcoder.com\/wordpress\/wp-json\/wp\/v2\/posts\/342","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.starcoder.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.starcoder.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.starcoder.com\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.starcoder.com\/wordpress\/wp-json\/wp\/v2\/comments?post=342"}],"version-history":[{"count":6,"href":"https:\/\/www.starcoder.com\/wordpress\/wp-json\/wp\/v2\/posts\/342\/revisions"}],"predecessor-version":[{"id":556,"href":"https:\/\/www.starcoder.com\/wordpress\/wp-json\/wp\/v2\/posts\/342\/revisions\/556"}],"wp:attachment":[{"href":"https:\/\/www.starcoder.com\/wordpress\/wp-json\/wp\/
v2\/media?parent=342"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.starcoder.com\/wordpress\/wp-json\/wp\/v2\/categories?post=342"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.starcoder.com\/wordpress\/wp-json\/wp\/v2\/tags?post=342"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}