Friday, May 24, 2013

Shared memory run sample 1.0

Let's step through shared memory code sample 1.0 in a debugger and watch what happens.
This should clear up any remaining doubts and misunderstandings.
For an additional example, check shared memory code sample 2.0.

The particular environment in which the code runs isn't very relevant.
I believe it suffices to say that it consists of:

  • Solaris 11.1 SRU 7.5 running on HP ProLiant BL460c G7.
     
  • 24 GiB of physical memory and 18 GiB available virtual memory.
    The difference is due to some physical memory being reserved for VirtualBox.
    VirtualBox is otherwise unrelated to this post; it just happens to be there.
    There are, of course, VirtualBox guests running and consuming memory.
       
  • Disk-based swap is set to 4 GiB and is not in use at first.
    Naturally, the available virtual memory includes this figure.
       
  • The binary will be run under the group.staff project resource controls.
    The default for max-shm-memory was around 5.99 GiB, ¼ of physical memory.
    Hence, it was raised to 20 GiB so that it doesn't interfere here.

So here are the essential starting settings and figures:

# tail /etc/system
...
* 24GB - 1.5GB -> 22GB - 5% -> 20GB due to VirtualBox
set zfs:zfs_arc_max=0x400000000

# swap -lh; swap -sh
swapfile                   dev    swaplo   blocks     free
/dev/zvol/dsk/rpool/swap 214,2        4K     4.0G     4.0G
total: 436M allocated + 211M reserved = 648M used, 18G available

# projmod \
  -K "project.max-shm-memory=(privileged,20G,deny)" \
  group.staff

# projects -l group.staff
group.staff
 projid : 10
 comment: ""
 users  : (none)
 groups : (none)
 attribs: project.max-shm-memory=(privileged,21474836480,deny)
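
The byte figure reported by projects -l is simply 20 GiB spelled out. A minimal sketch of the conversion follows; gib is a hypothetical helper introduced only for illustration and is not part of the code sample.

```cpp
#include <cstdint>

// Hypothetical helper: converts a count of GiB (2^30 bytes) to bytes.
constexpr std::uint64_t gib( std::uint64_t n )
{
    return n << 30;
}

// 20 GiB is the value given to project.max-shm-memory above.
static_assert( gib( 20 ) == 21474836480ULL, "20 GiB in bytes" );

// 17 GiB is the segment size requested in the sample runs.
static_assert( gib( 17 ) == 18253611008ULL, "17 GiB in bytes" );
```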



The 1st sample run will require 17 GiB of ISM.
As seen, this requires enough available lockable physical memory.
This particular run will fail, but note how the swap figures are affected.
Virtual memory isn't directly involved, but its availability is recomputed.
This is because virtual memory is partially backed by free physical memory.
As physical memory is reserved, less of it is free for backing virtual memory.
This was also discussed in the swap analysis post.

The essential excerpts are as follows:

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 448M allocated + 212M reserved = 660M used, 18G available

// size == 17 GiB
int id = ::shmget( 1, size, IPC_CREAT | IPC_EXCL | 0660 );

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 448M allocated + 17G reserved = 18G used, 772M available

// Failure!
void * p = ::shmat( id, 0, SHM_SHARE_MMU );

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 448M allocated + 17G reserved = 18G used, 816M available

::shmctl( id, IPC_RMID, 0 );

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 448M allocated + 212M reserved = 660M used, 18G available
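
For reference, the calls above can be put together with error checking roughly as follows. This is a sketch under assumptions, not the original sample: try_attach and kIsmFlag are illustrative names, IPC_PRIVATE replaces key 1 so that no existing segment is disturbed, and where SHM_SHARE_MMU isn't defined (it is specific to Solaris) the flag falls back to a plain attach.

```cpp
#include <sys/ipc.h>
#include <sys/shm.h>
#include <cerrno>
#include <cstddef>
#include <cstdio>

#ifdef SHM_SHARE_MMU
const int kIsmFlag = SHM_SHARE_MMU;   // Solaris: request ISM
#else
const int kIsmFlag = 0;               // assumption: plain attach elsewhere
#endif

// Creates a segment, tries to attach it with the given flags and always
// removes it afterwards. Returns 0 on success, or the errno of the first
// call that failed.
int try_attach( std::size_t size, int flags )
{
    int id = ::shmget( IPC_PRIVATE, size, IPC_CREAT | IPC_EXCL | 0660 );
    if ( id == -1 )
    {
        int e = errno;                // save before perror can change it
        std::perror( "shmget" );
        return e;
    }

    int result = 0;
    void * p = ::shmat( id, 0, flags );
    if ( p == (void *) -1 )
    {
        result = errno;
        std::perror( "shmat" );
    }
    else if ( ::shmdt( p ) == -1 )
    {
        result = errno;
        std::perror( "shmdt" );
    }

    // As in the run above, the reservation made by shmget is only
    // released once the segment is removed.
    if ( ::shmctl( id, IPC_RMID, 0 ) == -1 && result == 0 )
    {
        result = errno;
        std::perror( "shmctl" );
    }
    return result;
}
```

On the system described here, try_attach( size, kIsmFlag ) with size at 17 GiB would be expected to fail at shmat, as in the run above, while still cleaning up the segment.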



The 2nd sample run will also require 17 GiB, this time of the DISM kind.
As seen, this relies entirely on virtual memory and directly affects swap.
This particular run will succeed as no physical memory is initially required.

The essential excerpts are as follows:

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 448M allocated + 212M reserved = 660M used, 18G available
 
// size == 17 GiB
int id = ::shmget( 1, size, IPC_CREAT | IPC_EXCL | 0660 );

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 448M allocated + 17G reserved = 18G used, 800M available

// Success!
void * p = ::shmat( id, 0, SHM_PAGEABLE );

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 448M allocated + 17G reserved = 18G used, 792M available

# ipcs -mA
IPC status ...
T ID KEY ... NATTCH       SEGSZ  CPID ... ISMATTCH     PROJECT
Shared Memory:
m 24 0x1 ...      1 18253611008 10385 ...        1 group.staff

# pmap -x 10385
10385:    /.../shm_v01/dist/Debug/...
         Address    Kbytes       RSS ... Mapped File
0000000000400000         8         8     shm_v01
0000000000411000         4         4     shm_v01
0000000000412000       160        68       [ heap ]
FFFF80FB40000000  17825792         -       [ dism shmid=0x18 ]
...

// Somewhat lengthy run...
::memset( p, '*', size );

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 448M allocated + 17G reserved = 18G used, 792M available

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 8.6G allocated + 9.0G reserved = 18G used, 792M available

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 15G allocated + 2.7G reserved = 18G used, 792M available

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     3.7G
total: 17G allocated + 214M reserved = 18G used, 792M available

# pmap -x 10385
10385:    /.../shm_v01/dist/Debug/...
         Address    Kbytes       RSS ... Mapped File
0000000000400000         8         8     shm_v01
0000000000411000         4         4     shm_v01
0000000000412000       160        68       [ heap ]
FFFF80FB40000000  17825792  17510200       [ dism shmid=0x18 ]
...

::shmdt( p );

# pmap -x 10385
10385:    /.../shm_v01/dist/Debug/...
         Address    Kbytes       RSS ... Mapped File
0000000000400000         8         8     shm_v01
0000000000411000         4         4     shm_v01
0000000000412000       160        68       [ heap ]
...

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     3.7G
total: 17G allocated + 214M reserved = 18G used, 792M available

::shmctl( id, IPC_RMID, 0 );

# swap -lh; swap -sh
swapfile             dev    swaplo   blocks     free
/dev/swap             -         4K     4.0G     4.0G
total: 460M allocated + 201M reserved = 664M used, 18G available
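
The whole DISM sequence can be sketched in one function, again under assumptions rather than as the original sample: run_dism, RunInfo and kDismFlag are illustrative names, IPC_PRIVATE replaces key 1, and where SHM_PAGEABLE isn't defined (it is specific to Solaris) the attach is a plain one. The memset is done in chunks only to give convenient points at which to sample the swap figures.

```cpp
#include <sys/ipc.h>
#include <sys/shm.h>
#include <algorithm>
#include <cstddef>
#include <cstring>

#ifdef SHM_PAGEABLE
const int kDismFlag = SHM_PAGEABLE;   // Solaris: request DISM
#else
const int kDismFlag = 0;              // assumption: plain attach elsewhere
#endif

// Hypothetical summary of one run.
struct RunInfo
{
    bool          ok;        // true if every call succeeded
    std::size_t   segsz;     // shm_segsz, listed as SEGSZ by ipcs -mA
    unsigned long nattch;    // shm_nattch while attached (NATTCH)
    std::size_t   touched;   // bytes written by the chunked memset
};

RunInfo run_dism( std::size_t size, std::size_t chunk )
{
    RunInfo info = { false, 0, 0, 0 };
    if ( chunk == 0 )
        return info;

    int id = ::shmget( IPC_PRIVATE, size, IPC_CREAT | IPC_EXCL | 0660 );
    if ( id == -1 )
        return info;

    void * p = ::shmat( id, 0, kDismFlag );
    if ( p != (void *) -1 )
    {
        // Same data that ipcs -mA reports for the segment.
        struct shmid_ds ds;
        bool stat_ok = ::shmctl( id, IPC_STAT, &ds ) == 0;
        if ( stat_ok )
        {
            info.segsz  = ds.shm_segsz;
            info.nattch = ds.shm_nattch;
        }

        // Touch the pages chunk by chunk; this is what turns
        // "reserved" into "allocated" in the swap -sh figures.
        char * base = static_cast< char * >( p );
        for ( std::size_t off = 0; off < size; off += chunk )
        {
            std::size_t n = std::min( chunk, size - off );
            std::memset( base + off, '*', n );
            info.touched += n;
        }

        bool detached = ::shmdt( p ) == 0;
        info.ok = stat_ok && detached;
    }

    // Detaching alone doesn't give the memory back; removal does.
    if ( ::shmctl( id, IPC_RMID, 0 ) == -1 )
        info.ok = false;
    return info;
}
```

With a breakpoint inside the loop, a call such as run_dism( size, chunk ) with a 17 GiB size gives natural stops for running swap -sh and pmap -x, much like the successive outputs above.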

From the above output we can draw a few interesting conclusions, which I believe will help improve the understanding of shared memory:
    
  • ISM relies exclusively on the ability to lock pages of physical memory. It doesn't really participate in virtual memory (virtual swap), although its usage affects the total amount of available virtual memory, since any unused physical memory is also used to form virtual memory (virtual swap). As such, disk-based swap space (swap -lh) is never impacted by ISM. At least in theory, ISM can offer better performance as it isn't subject to paging.
       
  • DISM shares the optimizations of ISM but relies on a completely distinct layer, the virtual memory (virtual swap), not on free physical memory. The main advantages of DISM over ISM are its ability to be larger than the lockable available physical memory and to be paged out of memory if it's not being used. So it can make active use of disk-based swap space (swap -lh). If paging becomes frequent, then performance will certainly suffer.
       
  • Judging by the last output of pmap -x 10385 before ::shmdt(), of the 17825792 Kbytes reserved (= SEGSZ of 18253611008 bytes listed by ipcs -mA), 17510200 KB (or 17930444800 bytes) fit in physical memory and the rest was accommodated on disk-based swap (18253611008 - 17930444800 = 323166208 bytes ≈ 0.3 GiB). I suspect that the largest lockable available physical memory at that time was that figure of 17510200 KB. In general, I'm not sure if there's an objective method for finding this out.
        
  • Although DISM, unlike ISM, doesn't lock any memory by default, applications can judiciously do so. Unfortunately, in the previous output samples I omitted the Locked column of pmap -x, since its values were null. When ISM succeeds, the three columns Kbytes, RSS and Locked list the same figure. For DISM, the last two generally vary or float, with Kbytes ≥ RSS ≥ Locked, which is precisely the advantage of DISM.
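
The arithmetic in the third item can be checked mechanically. The sketch below only restates figures taken from the pmap -x and ipcs -mA output; the constant and function names are made up for illustration.

```cpp
#include <cstdint>

// Figures copied from the pmap -x output of the DISM run.
constexpr std::uint64_t kSegmentKiB  = 17825792;   // Kbytes column
constexpr std::uint64_t kResidentKiB = 17510200;   // RSS column after memset

constexpr std::uint64_t kib_to_bytes( std::uint64_t kib )
{
    return kib * 1024;
}

constexpr std::uint64_t kSegmentBytes  = kib_to_bytes( kSegmentKiB );
constexpr std::uint64_t kResidentBytes = kib_to_bytes( kResidentKiB );
constexpr std::uint64_t kPagedOutBytes = kSegmentBytes - kResidentBytes;

static_assert( kSegmentBytes  == 18253611008ULL, "matches SEGSZ from ipcs -mA" );
static_assert( kResidentBytes == 17930444800ULL, "RSS in bytes" );
static_assert( kPagedOutBytes == 323166208ULL,   "about 0.3 GiB on disk swap" );
```

The roughly 308 MiB difference is consistent with the free column of swap -lh dropping from 4.0G to 3.7G.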
      
For very specialized applications such as Oracle Database, I'd say that if the server hardware is dedicated to an instance of it, then DISM would be advantageous thanks to the ability to dynamically adjust its memory structures up to the defined DISM limit, avoiding application restarts.
 
In terms of resource controls, beyond max-shm-memory, there is max-shm-ids, which refers to how many pointers or segments of shared memory can be obtained. What's important to know is that the larger the memory page size, the fewer of them are needed. Thus, SPARC architectures, possessing larger memory pages, tend to consume fewer of them. In addition, there is max-locked-memory, which affects both ISM and DISM. Remember that DISM may also lock physical memory pages. Of course, max-rss, max-swap and max-address-space may also affect limits. I expect to explore some or all of these resource controls in a later post.
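
As for an application locking part of a DISM segment by itself, a minimal sketch using mlock could look like the following. It is an illustration under assumptions, not code from the sample: try_lock_prefix and kDismFlag are made-up names, IPC_PRIVATE is used as the key, and where SHM_PAGEABLE isn't defined a plain attach is used instead. Locking typically requires the appropriate privilege and is subject to limits such as max-locked-memory, so a failure here is a legitimate outcome. Some older headers may declare the address parameter of mlock as caddr_t, in which case a cast would be needed.

```cpp
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>
#include <cerrno>
#include <cstddef>

#ifdef SHM_PAGEABLE
const int kDismFlag = SHM_PAGEABLE;   // Solaris: request DISM
#else
const int kDismFlag = 0;              // assumption: plain attach elsewhere
#endif

// Attaches a fresh segment and tries to lock its first lock_len bytes.
// Returns 0 if the lock succeeded, otherwise the errno of the call that
// failed (EINVAL if lock_len exceeds the segment size).
int try_lock_prefix( std::size_t size, std::size_t lock_len )
{
    if ( lock_len > size )
        return EINVAL;

    int id = ::shmget( IPC_PRIVATE, size, IPC_CREAT | IPC_EXCL | 0660 );
    if ( id == -1 )
        return errno;

    int result = 0;
    void * p = ::shmat( id, 0, kDismFlag );
    if ( p == (void *) -1 )
    {
        result = errno;
    }
    else
    {
        // The address returned by shmat is page aligned, so the
        // range can be handed to mlock as is.
        if ( ::mlock( p, lock_len ) == -1 )
            result = errno;
        else
            ::munlock( p, lock_len );

        ::shmdt( p );
    }

    ::shmctl( id, IPC_RMID, 0 );
    return result;
}
```

While the lock is held, the Locked column of pmap -x would be the place to look for the locked amount.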