performance - Empty loop is slower than a non-empty one in C


While trying to figure out how long it takes to execute a single line of C code, I noticed something that looked strange:

  #include <stdio.h>
  #include <stdint.h>
  #include <time.h>

  int main(int argc, char *argv[]) {
      clock_t start, end;
      uint64_t i;
      double free_time;
      int A = 1;
      int B = 1;

      start = clock();
      for (i = 0; i < (1 << 31) - 1; i++)
          ;
      end = clock();
      free_time = (double)(end - start) / CLOCKS_PER_SEC;
      printf("%f\n", free_time);

      start = clock();
      for (i = 0; i < (1 << 31) - 1; i++) {
          A += B % 2;
      }
      end = clock();
      free_time = (double)(end - start) / CLOCKS_PER_SEC;
      printf("%f\n", free_time);

      return 0;
  }

which, when executed, prints:

  5.873425
  4.826874

Why does the empty loop use more time than the second one, which has an instruction inside it? Of course I have tried many variants, but every time the empty loop takes more time than the one with a single instruction inside.

Note that I have tried swapping the order of the loops and adding some warm-up code, and it does not change the result at all.

I am using Code::Blocks as IDE with the GNU gcc compiler, on Linux Ubuntu 14.04, and I have a quad-core Intel i5 at 2.3 GHz (I have tried running the program on a single core; it does not change the result).

The fact is that modern processors are complicated: all the instructions being executed interact with each other in complicated and interesting ways. Thanks to the OP and to another user, it was apparently found that the short loop takes 11 cycles per iteration while the long one takes 9 cycles (note that 11/9 ≈ 1.22, which roughly matches the measured ratio of 5.87 s to 4.83 s). For the long loop, 9 cycles is plenty of time even though there are lots of operations. For the short loop, there must be some stall caused by it being so short, and just adding a nop makes the loop long enough to avoid the stall.

One thing that happens, if we look at the code:

  0x00000000004005af <+50>:    addq   $0x1,-0x20(%rbp)
  0x00000000004005b4 <+55>:    cmpq   $0x7fffffff,-0x20(%rbp)
  0x00000000004005bc <+63>:    jb     0x4005af <main+50>

We read i and write it back (addq). We immediately read it again and compare it (cmpq), and then we loop. But the loop uses branch prediction, so at the time the addq is executed the processor is not actually sure that it is allowed to write to i (because the branch prediction could be wrong).

Then we compare with i. The processor will try to avoid reading i from memory, because reading it takes a long time. Instead, a bit of hardware will remember that we just wrote to i, and instead of reading i, the cmpq instruction gets its data from the store instruction. Unfortunately, at this point we are not yet sure whether the write to i has actually happened or not! So that could introduce a stall here.

The problem here is that the conditional jump, the addq which leads to a conditional store, and the cmpq which is not sure where to get its data from are all very close together. They are unusually close together. It could be that they are so close together that the processor cannot figure out at this point whether to take i from the store instruction or to read it from memory, so it reads it from memory, which is slower because it has to wait for the store to finish. Adding just one nop gives the processor enough time.
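
As a concrete illustration, here is a minimal sketch (my own, not from the original post) of what "adding a nop" means in practice, assuming GCC's extended inline assembly on x86-64 and an unoptimized (-O0) build so that i stays in memory as in the disassembly above:

  #include <stdint.h>

  int main(void) {
      uint64_t i;
      /* Pad the otherwise-empty body with a single nop; the extra
         instruction lengthens each iteration just enough for the store
         to i to finish before the compare needs to read it. */
      for (i = 0; i < 0x7fffffffu; i++) {
          __asm__ volatile ("nop");
      }
      return 0;
  }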

Usually you think that there is RAM, and there is cache. On a modern Intel processor, a memory read can get its data from (slowest to fastest):

  1. Memory (RAM)
  2. L3 cache (optional)
  3. L2 cache
  4. L1 cache
  5. A previous store instruction that has not yet been written to the L1 cache.

So this is what the processor does internally in the short, slow loop:

  1. Read i from the L1 cache
  2. Add 1 to i
  3. Write i to the L1 cache
  4. Wait until i is written to the L1 cache
  5. Read i from the L1 cache
  6. Compare i with INT_MAX
  7. Branch to (1) if it is less

And this is what the processor does in the long, fast loop:

  1. Lots of stuff
  2. Read i from the L1 cache
  3. Add 1 to i
  4. Do a "store" instruction which will write i to the L1 cache
  5. Read i directly from the "store" instruction, without touching the L1 cache
  6. Compare i with INT_MAX
  7. Branch to (1) if it is less
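
To tie this together, here is a small self-contained harness I sketched (not part of the original question or answer) that times the empty loop, the nop-padded loop, and the loop with A += B % 2 side by side. It assumes GCC on x86-64 Linux and an unoptimized build (e.g. gcc -O0 loops.c -o loops), since with optimization enabled the compiler may remove the loops entirely:

  #include <stdint.h>
  #include <stdio.h>
  #include <time.h>

  #define LIMIT 0x7fffffffu   /* same iteration count as the question: 2^31 - 1 */

  static double time_empty(void) {
      uint64_t i;
      clock_t start = clock();
      for (i = 0; i < LIMIT; i++)
          ;                                 /* empty body: the slow case */
      return (double)(clock() - start) / CLOCKS_PER_SEC;
  }

  static double time_nop(void) {
      uint64_t i;
      clock_t start = clock();
      for (i = 0; i < LIMIT; i++)
          __asm__ volatile ("nop");         /* one nop per iteration */
      return (double)(clock() - start) / CLOCKS_PER_SEC;
  }

  static double time_work(int *a, int b) {
      uint64_t i;
      clock_t start = clock();
      for (i = 0; i < LIMIT; i++)
          *a += b % 2;                      /* the "non-empty" loop from the question */
      return (double)(clock() - start) / CLOCKS_PER_SEC;
  }

  int main(void) {
      int A = 1, B = 1;
      printf("empty: %f s\n", time_empty());
      printf("nop:   %f s\n", time_nop());
      printf("work:  %f s\n", time_work(&A, B));
      return 0;
  }

If the explanation above is right, the expectation is that the empty loop comes out slowest, while the nop-padded and non-empty loops take roughly the same time per iteration.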
