Maxim Dounin
May 05, 2012 05:42PM
Hello!

On Sat, May 05, 2012 at 08:10:30PM +0800, Simon Liu wrote:

> Hello!
>
> The cache line calculation is hardcoded in ngx_cpuinfo, which gives the wrong
> result on some Intel processors. For example, the cache line is 64 bytes on Sandy Bridge;
> its family code is 0110 and its model number is 1010 or 1101 (see this document:
> http://www.intel.com/content/www/us/en/processors/processor-identification-cpuid-instruction-note.html).
> But the code in ngx_cpuinfo is:
>
>     /* Pentium Pro, II, III */
>     case 6:
>         ngx_cacheline_size = 32;
>
>         model = ((cpu[0] & 0xf0000) >> 8) | (cpu[0] & 0xf0);
>
>         if (model >= 0xd0) {
>             /* Intel Core, Core 2, Atom */
>             ngx_cacheline_size = 64;
>         }
>
>         break;
>
> If the model number is 1010, ngx_cacheline_size will be 32, which is wrong.

Note that the model variable in the above code includes the extended
model field as well, so for Sandy Bridge it will be 0x2a0 (extended
model 0010, model 1010). Since 0x2a0 >= 0xd0, the cache line size is
correctly detected as 64.
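
To make the arithmetic concrete, here is a minimal standalone sketch of the same computation. The function names are hypothetical and not part of nginx, and 0x000206a7 is used only as an assumed example of a Sandy Bridge CPUID(1) EAX signature (family 6, extended model 2, model 0xa).

```c
#include <stdint.h>

/*
 * Combine the extended model (EAX bits 16-19) and the model
 * (EAX bits 4-7) the same way the quoted ngx_cpuinfo code does.
 * Hypothetical helper, not an nginx function.
 */
static uint32_t
family6_model(uint32_t eax)
{
    return ((eax & 0xf0000) >> 8) | (eax & 0xf0);
}

/*
 * Family 6 rule from the quoted code: 32 bytes by default,
 * 64 bytes once the combined value reaches 0xd0.
 */
static uint32_t
family6_cacheline_size(uint32_t eax)
{
    return (family6_model(eax) >= 0xd0) ? 64 : 32;
}
```

With the assumed signature 0x000206a7, the extended model contributes 0x200 and the model contributes 0xa0, so the combined value is 0x2a0 and the 64-byte branch is taken.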

> Below is a patch (for nginx trunk) that fixes this problem and uses cpuid(2)
> instead of the hardcoded values.
>
> Index: src/core/ngx_cpuinfo.c
> ===================================================================
> --- src/core/ngx_cpuinfo.c (revision 4615)
> +++ src/core/ngx_cpuinfo.c (working copy)
> @@ -12,9 +12,93 @@
> #if (( __i386__ || __amd64__ ) && ( __GNUC__ || __INTEL_COMPILER ))
>
>
> +#define NGX_CACHE_LVL_1_DATA 1
> +#define NGX_CACHE_LVL_2 2
> +#define NGX_CACHE_LVL_3 3
> +#define NGX_CACHE_PREFETCHING 4
> +
> +
> +typedef struct ngx_cache_table {
> + u_char descriptor;
> + u_char type;
> + ngx_uint_t size;
> +} ngx_cache_table_t;
> +
> +
> static ngx_inline void ngx_cpuid(uint32_t i, uint32_t *buf);
>
>
> +static ngx_cache_table_t cache_table[] = {
> + { 0x0a, NGX_CACHE_LVL_1_DATA, 32 }, /* 32 byte line size */
> + { 0x0c, NGX_CACHE_LVL_1_DATA, 32 }, /* 32 byte line size */
> + { 0x0d, NGX_CACHE_LVL_1_DATA, 64 }, /* 64 byte line size */
> + { 0x0e, NGX_CACHE_LVL_1_DATA, 64 }, /* 64 byte line size */

[...]

I don't really think we need full Intel cache descriptor decoding.
It's rather large, and I suspect it might cause more harm than good,
especially in virtualized environments.

And if we decide we need one, we probably want something simpler
(i.e. one that ignores cache levels and the like, since that
information is clearly not needed here).
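
As a rough illustration of what "simpler" could look like, the sketch below maps a descriptor byte straight to a line size and drops the cache level entirely. It is only a sketch under assumptions: the type and function names are hypothetical, and the table holds just the four descriptors visible in the quoted part of the patch, so it is deliberately incomplete.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry type: descriptor byte and line size only. */
typedef struct {
    uint8_t   descriptor;
    uint32_t  line_size;
} line_size_entry_t;

/* Only the entries shown in the quoted patch; not a full table. */
static const line_size_entry_t  line_sizes[] = {
    { 0x0a, 32 },
    { 0x0c, 32 },
    { 0x0d, 64 },
    { 0x0e, 64 },
};

/* Return the line size for a descriptor, or 0 if it is unknown. */
static uint32_t
line_size_for_descriptor(uint8_t descriptor)
{
    size_t  i;

    for (i = 0; i < sizeof(line_sizes) / sizeof(line_sizes[0]); i++) {
        if (line_sizes[i].descriptor == descriptor) {
            return line_sizes[i].line_size;
        }
    }

    return 0;
}
```

A caller would treat a return value of 0 as "unknown" and keep whatever default was chosen from the family and model.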

Maxim Dounin

_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx-devel
Subject Author Views Posted

[PATCH] optimization of Intel processor cacheline calculation Attachments

Simon Liu 1272 May 05, 2012 08:12AM

Re: [PATCH] optimization of Intel processor cacheline calculation

Maxim Dounin 435 May 05, 2012 05:42PM


