{"id":294,"date":"2018-01-02T07:58:50","date_gmt":"2018-01-02T07:58:50","guid":{"rendered":"http:\/\/damrakoc.com\/blog\/?p=294"},"modified":"2025-05-17T10:49:56","modified_gmt":"2025-05-17T10:49:56","slug":"msvc-mutex-slower-might-expect","status":"publish","type":"post","link":"https:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/","title":{"rendered":"MSVC mutex is slower than you might expect"},"content":{"rendered":"<p>The 2015 MSVC C++ runtime\u2019s std::mutex implementation (and potentially others) is significantly slower than the equivalent code written by hand. The reason is that the runtime is built with a Windows feature called &#8220;<a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/windows\/desktop\/mt637065%28v=vs.85%29.aspx\"><strong>Control Flow Guard<\/strong><\/a>&#8221; and uses function pointers.<\/p>\n<p>While profiling some code in our <strong><a href=\"http:\/\/coherent-labs.com\/colibri-sign-up-beta\/\">new HTML5 renderer<\/a><\/strong>, I noticed that the \u201cmutex\u201d implementation we use on Windows was unexpectedly slow compared to other platforms. On Windows we were using the std::mutex implementation in the MSVC 2015 runtime. I decided to dig through the code and see what was going on. Everything was running on Windows 10.<\/p>\n<p>Looking at the code revealed that the implementation of the mutex depends on the version of Windows (Microsoft Visual Studio 14.0\\VC\\crt\\src\\stl\\primitives.h). This makes good sense, because newer versions of Windows include faster synchronization primitives. On Vista+ the locking is implemented with a <strong><a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/windows\/desktop\/aa904937%28v=vs.85%29.aspx\">SRWLOCK<\/a><\/strong>, which is very fast.<\/p>\n<p>At that point I thought that the implementation they have is pretty good and there might be an issue with my profiling, so I did a simple test. 
In an auxiliary application I ran 3 test suites, each locking a mutex 10000 times, performing a simple computation, unlocking, and measuring the total time for all those operations.<\/p>\n<p>I measured 3 scenarios:<\/p>\n<ul>\n<li>std::mutex<\/li>\n<li>manual CRITICAL_SECTION<\/li>\n<li>manual SRWLOCK<\/li>\n<\/ul>\n<p>The SRWLOCK implementation was slightly faster than the CRITICAL_SECTION (~5-10%), which is expected, but the std::mutex was 30-40% slower than the rest. At that moment I was really surprised, because essentially the implementation was the same \u2013 just locking and unlocking the SRWLOCK. The std::mutex has some additional code \u2013 it gets the current thread id (I simulated this too and it\u2019s very fast \u2013 no noticeable change in perf.), it calls virtual methods, and the SRWLOCK functions are executed through function pointers. None of those should incur such a large difference though.<\/p>\n<p>So I dug into the assembly. It turned out that when the CRT calls <em>__crtAcquireSRWLockExclusive<\/em> (and all other SRWLOCK methods), the calls don\u2019t go directly into the Windows Kernel32.dll! A lot of checks and other code are executed between entering the method and actually arriving in <em>AcquireSRWLockExclusive<\/em>, which is where we want to go. The reason is a Windows feature called &#8220;<em><a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/windows\/desktop\/mt637065%28v=vs.85%29.aspx\"><strong>Control Flow Guard<\/strong><\/a><\/em>&#8221;. Essentially this is a security feature that instructs the compiler to check every jump through a function pointer and verify that it goes to a valid target. This is possible because on modern Windows, all function addresses are annotated and known to the loader. The feature prevents jumping \u201cinto the middle\u201d of functions and makes hacking more difficult.<\/p>\n<p>While CFG might be very important for some classes of applications, its performance impact on a game is unacceptably high. 
Unfortunately, when using the MSVC runtime you have no choice, because it\u2019s pre-built with the feature. While normal function calls are fine and bypass the mechanism, the std::mutex implementation falls flat there. In order to support multiple Windows versions the authors have to rely on function pointers for routines that might be missing from older versions of Windows.<\/p>\n<p>It\u2019s unfortunate that the authors have thought carefully to use different approaches on different Windows versions to squeeze out the best performance (the 2015 version is much, much better than older std::mutex implementations) but this compiler feature has essentially defeated their efforts.<\/p>\n<p>Fixing this in your application is really trivial \u2013 creating a custom mutex implementation with SRWLOCK is less than 2 minutes of work. I didn\u2019t investigate other STL classes that might suffer from the same problem, but I assume there are some among the synchronization mechanisms (unfortunately, where you want the most performance).<\/p>\n<p>It\u2019d be great if Microsoft provided more versions of the runtime libraries \u2013 especially \u201cfast\u201d ones without the security features. 
It would be even better if they provided all the source for the runtime and allowed developers to compile it themselves.<\/p>\n<h4>References:<\/h4>\n<ul>\n<li><a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/windows\/desktop\/mt637065%28v=vs.85%29.aspx\">https:\/\/msdn.microsoft.com\/en-us\/library\/windows\/desktop\/mt637065%28v=vs.85%29.aspx<\/a><\/li>\n<li><a href=\"http:\/\/lifeinhex.com\/tag\/control-flow-guard\/\">http:\/\/lifeinhex.com\/tag\/control-flow-guard\/<\/a><\/li>\n<li><a href=\"https:\/\/blogs.msdn.microsoft.com\/vcblog\/2014\/12\/08\/visual-studio-2015-preview-work-in-progress-security-feature\/\">https:\/\/blogs.msdn.microsoft.com\/vcblog\/2014\/12\/08\/visual-studio-2015-preview-work-in-progress-security-feature\/<\/a><\/li>\n<li><a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/windows\/desktop\/aa904937%28v=vs.85%29.aspx\">https:\/\/msdn.microsoft.com\/en-us\/library\/windows\/desktop\/aa904937%28v=vs.85%29.aspx<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>The 2015 MSVC C++ runtime\u2019s std::mutex (and potentially other) implementation is significantly slower than the equivalent code written by hand. The reason is that the runtime is built with a Windows feature called &#8220;Control Flow Guard&#8220;\u00a0and uses function pointers. 
While profiling some code in our new HTML5 renderer, I noticed that the \u201cmutex\u201d implementation we [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":350,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13],"tags":[],"class_list":["post-294","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-c"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.1 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>MSVC mutex is slower than you might expect - Damra KO\u00c7<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"MSVC mutex is slower than you might expect - Damra KO\u00c7\" \/>\n<meta property=\"og:description\" content=\"The 2015 MSVC C++ runtime\u2019s std::mutex (and potentially other) implementation is significantly slower than the equivalent code written by hand. The reason is that the runtime is built with a Windows feature called &#8220;Control Flow Guard&#8220;\u00a0and uses function pointers. 
While profiling some code in our new HTML5 renderer, I noticed that the \u201cmutex\u201d implementation we [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/\" \/>\n<meta property=\"og:site_name\" content=\"Damra KO\u00c7\" \/>\n<meta property=\"article:published_time\" content=\"2018-01-02T07:58:50+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-05-17T10:49:56+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/damrakoc.com\/blog\/wp-content\/uploads\/2018\/01\/Screenshot_3.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1017\" \/>\n\t<meta property=\"og:image:height\" content=\"912\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"damrakoc\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@damra_koc\" \/>\n<meta name=\"twitter:site\" content=\"@damra_koc\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"damrakoc\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/\",\"url\":\"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/\",\"name\":\"MSVC mutex is slower than you might expect - Damra KO\u00c7\",\"isPartOf\":{\"@id\":\"http:\/\/damrakoc.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/#primaryimage\"},\"image\":{\"@id\":\"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/damrakoc.com\/blog\/wp-content\/uploads\/2018\/01\/Screenshot_3.png\",\"datePublished\":\"2018-01-02T07:58:50+00:00\",\"dateModified\":\"2025-05-17T10:49:56+00:00\",\"author\":{\"@id\":\"http:\/\/damrakoc.com\/blog\/#\/schema\/person\/c0aef33e15396f85a26d08495c742b8b\"},\"breadcrumb\":{\"@id\":\"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/#primaryimage\",\"url\":\"https:\/\/damrakoc.com\/blog\/wp-content\/uploads\/2018\/01\/Screenshot_3.png\",\"contentUrl\":\"https:\/\/damrakoc.com\/blog\/wp-content\/uploads\/2018\/01\/Screenshot_3.png\",\"width\":1017,\"height\":912,\"caption\":\"MSVC mutex is slower than you might expect\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"http:\/\/damrakoc.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"MSVC mutex is slower 
than you might expect\"}]},{\"@type\":\"WebSite\",\"@id\":\"http:\/\/damrakoc.com\/blog\/#website\",\"url\":\"http:\/\/damrakoc.com\/blog\/\",\"name\":\"Damra KO\u00c7\",\"description\":\"Software Developer\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"http:\/\/damrakoc.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"http:\/\/damrakoc.com\/blog\/#\/schema\/person\/c0aef33e15396f85a26d08495c742b8b\",\"name\":\"damrakoc\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"http:\/\/damrakoc.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/1a5d82872160ecc5a366412de9d017ead27f16fcfce7c8e46532199f18145f06?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/1a5d82872160ecc5a366412de9d017ead27f16fcfce7c8e46532199f18145f06?s=96&d=mm&r=g\",\"caption\":\"damrakoc\"},\"url\":\"https:\/\/damrakoc.com\/blog\/author\/damrakoc\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"MSVC mutex is slower than you might expect - Damra KO\u00c7","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/","og_locale":"en_US","og_type":"article","og_title":"MSVC mutex is slower than you might expect - Damra KO\u00c7","og_description":"The 2015 MSVC C++ runtime\u2019s std::mutex (and potentially other) implementation is significantly slower than the equivalent code written by hand. The reason is that the runtime is built with a Windows feature called &#8220;Control Flow Guard&#8220;\u00a0and uses function pointers. 
While profiling some code in our new HTML5 renderer, I noticed that the \u201cmutex\u201d implementation we [&hellip;]","og_url":"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/","og_site_name":"Damra KO\u00c7","article_published_time":"2018-01-02T07:58:50+00:00","article_modified_time":"2025-05-17T10:49:56+00:00","og_image":[{"width":1017,"height":912,"url":"http:\/\/damrakoc.com\/blog\/wp-content\/uploads\/2018\/01\/Screenshot_3.png","type":"image\/png"}],"author":"damrakoc","twitter_card":"summary_large_image","twitter_creator":"@damra_koc","twitter_site":"@damra_koc","twitter_misc":{"Written by":"damrakoc","Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/","url":"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/","name":"MSVC mutex is slower than you might expect - Damra KO\u00c7","isPartOf":{"@id":"http:\/\/damrakoc.com\/blog\/#website"},"primaryImageOfPage":{"@id":"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/#primaryimage"},"image":{"@id":"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/#primaryimage"},"thumbnailUrl":"https:\/\/damrakoc.com\/blog\/wp-content\/uploads\/2018\/01\/Screenshot_3.png","datePublished":"2018-01-02T07:58:50+00:00","dateModified":"2025-05-17T10:49:56+00:00","author":{"@id":"http:\/\/damrakoc.com\/blog\/#\/schema\/person\/c0aef33e15396f85a26d08495c742b8b"},"breadcrumb":{"@id":"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/#primaryimage","url":"https:\/\/damrakoc.com\/blog\/wp-content\/uploads\/2018\/01\/Screenshot_3.png","contentUrl":"https:\/\/damrakoc.com\/blog\/wp-content\/uploads\
/2018\/01\/Screenshot_3.png","width":1017,"height":912,"caption":"MSVC mutex is slower than you might expect"},{"@type":"BreadcrumbList","@id":"http:\/\/damrakoc.com\/blog\/msvc-mutex-slower-might-expect\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"http:\/\/damrakoc.com\/blog\/"},{"@type":"ListItem","position":2,"name":"MSVC mutex is slower than you might expect"}]},{"@type":"WebSite","@id":"http:\/\/damrakoc.com\/blog\/#website","url":"http:\/\/damrakoc.com\/blog\/","name":"Damra KO\u00c7","description":"Software Developer","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"http:\/\/damrakoc.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"http:\/\/damrakoc.com\/blog\/#\/schema\/person\/c0aef33e15396f85a26d08495c742b8b","name":"damrakoc","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"http:\/\/damrakoc.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/1a5d82872160ecc5a366412de9d017ead27f16fcfce7c8e46532199f18145f06?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1a5d82872160ecc5a366412de9d017ead27f16fcfce7c8e46532199f18145f06?s=96&d=mm&r=g","caption":"damrakoc"},"url":"https:\/\/damrakoc.com\/blog\/author\/damrakoc\/"}]}},"_links":{"self":[{"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/posts\/294","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/comments?post=294"}],"version-history":[{"count":3,"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/po
sts\/294\/revisions"}],"predecessor-version":[{"id":349,"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/posts\/294\/revisions\/349"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/media\/350"}],"wp:attachment":[{"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/media?parent=294"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/categories?post=294"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/damrakoc.com\/blog\/wp-json\/wp\/v2\/tags?post=294"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}