Shared memory with node.js  

This is a short tutorial on writing a simple node.js add-on that shares memory among node.js processes.

One limitation of node.js/io.js is that they are single threaded. The only way to use multiple cores of the processor is to run multiple processes [1]. But then each process works in its own memory space, which doesn't help if you want multiple processes working on the same memory block. This is required for memory-intensive tasks that cannot be efficiently sharded.

All the source code is available on GitHub.

Node addon

You need node-gyp installed to build the node module.

npm install node-gyp -g

The node.js version probably matters, since the addon API has changed across releases. I tested this on node 0.12.2.

binding.gyp is required by node-gyp to build the addon.

Then comes shm_addon.cpp. This is a very basic addon with a single export, createSHM, which creates a shared memory block of 800,000 bytes (or attaches to it if it already exists) with read and write permission for all users [2].

Shared memory is allocated with shmget and attached to the address space of the process with shmat.

shmid = shmget( key, MEM, IPC_CREAT | 0666 );
data = (char *)shmat( shmid, NULL, 0 );
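Both calls can fail, so here is a minimal sketch of the same two steps with error checks, plus the matching cleanup. The names attach_segment and release_segment are hypothetical and are not taken from shm_addon.cpp; the addon itself may handle errors differently.

```cpp
#include <sys/ipc.h>
#include <sys/shm.h>
#include <cstddef>

// Hypothetical helper: creates the segment for `key` if it is missing,
// otherwise opens the existing one, then maps it into this process.
// Returns nullptr on failure; on success stores the segment id in *shmid_out.
char* attach_segment(key_t key, std::size_t size, int* shmid_out) {
    // IPC_CREAT | 0666: create if needed, read/write for all users.
    int shmid = shmget(key, size, IPC_CREAT | 0666);
    if (shmid == -1) return nullptr;

    // shmat signals failure with (void*)-1, not with a null pointer.
    void* addr = shmat(shmid, nullptr, 0);
    if (addr == reinterpret_cast<void*>(-1)) return nullptr;

    if (shmid_out != nullptr) *shmid_out = shmid;
    return static_cast<char*>(addr);
}

// Hypothetical helper: unmaps the block from this process and marks the
// segment for removal once every process has detached from it.
bool release_segment(char* addr, int shmid) {
    bool detached = (shmdt(addr) == 0);
    bool removed = (shmctl(shmid, IPC_RMID, nullptr) == 0);
    return detached && removed;
}
```

Calling attach_segment twice with the same key yields the same segment id, which is what lets separate processes see each other's writes.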

The addon keeps a pointer to the memory block and returns the same block if createSHM is called more than once by the same node.js program [3]. createSHM returns an ArrayBuffer, initialized with the pointer to the shared memory.

Local<ArrayBuffer> buffer = ArrayBuffer::New(isolate, (void *)data, MEM);

The node module shm_addon is built with node-gyp using the following commands.

node-gyp configure
node-gyp build

The node addon will be created in build/Release/shm_addon.node.

Parent and child programs for testing

This is a simple counting problem to illustrate how shared memory can be used. We populate an array of 200,000 32-bit integers with the sequence 0,1,2,...,998,999,0,1,2,...,998,999,0,1,2,.... So each integer between 0 and 999 appears at 200 positions. Each of the child programs (workers) counts the number of occurrences of each integer between 0 and 999 by inefficiently traversing the array 1,000 times.
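To make the workload concrete, here is a rough C++ sketch of the fill and the deliberately inefficient count. It uses an ordinary vector instead of shared memory, and the function names make_sequence and count_by_repeated_scan are made up for illustration; the real logic lives in the CoffeeScript programs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fills an array with 0,1,...,modulus-1 repeated until `length` is reached.
std::vector<std::int32_t> make_sequence(std::size_t length, std::int32_t modulus) {
    std::vector<std::int32_t> data(length);
    for (std::size_t i = 0; i < length; ++i) {
        data[i] = static_cast<std::int32_t>(i % static_cast<std::size_t>(modulus));
    }
    return data;
}

// Counts occurrences of every value in [0, modulus) the slow way:
// one full pass over the array per value, so `modulus` passes in total.
std::vector<int> count_by_repeated_scan(const std::vector<std::int32_t>& data,
                                        std::int32_t modulus) {
    std::vector<int> counts(static_cast<std::size_t>(modulus), 0);
    for (std::int32_t value = 0; value < modulus; ++value) {
        for (std::int32_t x : data) {
            if (x == value) ++counts[static_cast<std::size_t>(value)];
        }
    }
    return counts;
}
```

With a length of 200,000 and a modulus of 1,000, every entry of the result is 200, and the scan performs 200 million comparisons, which is what makes the benchmark take measurable time.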

spawn.coffee is the parent program that starts the child processes. child.coffee is the child program.

Both the parent program and the child program attach the shared memory by calling the node addon.

shm = require './build/Release/shm_addon'
a = new Int32Array shm.createSHM()

We measure the time the child processes take to count. The time it takes for the processes to spawn and exit is excluded, so the child processes start counting only when they receive something on standard input. The number of child processes can be set with CHILDREN.

process.stdin.on 'data', (msg) ->
 start()
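The same wait-for-a-start-signal pattern can be sketched in plain C++ with fork and pipes. This is an illustration under different assumptions from the article's setup: the children inherit the attached segment across fork instead of calling the addon, the segment uses IPC_PRIVATE instead of a fixed key, and the function name count_in_children is hypothetical.

```cpp
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>
#include <vector>

// Forks `workers` children that share one memory block. Each child blocks on
// its pipe until the parent sends one byte, then counts how often `target`
// occurs and stores the result in its own slot of the shared block.
std::vector<std::int32_t> count_in_children(int workers, int n, std::int32_t target) {
    // Layout: `workers` result slots followed by `n` data elements.
    std::size_t bytes = sizeof(std::int32_t) * (static_cast<std::size_t>(workers) + n);
    int shmid = shmget(IPC_PRIVATE, bytes, IPC_CREAT | 0600);
    if (shmid == -1) return {};
    void* raw = shmat(shmid, nullptr, 0);
    if (raw == reinterpret_cast<void*>(-1)) { shmctl(shmid, IPC_RMID, nullptr); return {}; }

    std::int32_t* results = static_cast<std::int32_t*>(raw);
    std::int32_t* data = results + workers;
    for (int w = 0; w < workers; ++w) results[w] = -1;   // -1 means "no result"
    for (int i = 0; i < n; ++i) data[i] = i % 1000;

    std::vector<int> start_fds;
    std::vector<pid_t> pids;
    for (int w = 0; w < workers; ++w) {
        int fds[2];
        if (pipe(fds) == -1) break;
        pid_t pid = fork();
        if (pid == 0) {                      // child
            close(fds[1]);
            for (int fd : start_fds) close(fd);
            char go;
            if (read(fds[0], &go, 1) == 1) { // wait for the start signal
                std::int32_t count = 0;
                for (int i = 0; i < n; ++i) if (data[i] == target) ++count;
                results[w] = count;
            }
            _exit(0);
        }
        close(fds[0]);
        if (pid == -1) { close(fds[1]); break; }
        start_fds.push_back(fds[1]);
        pids.push_back(pid);
    }

    // All children exist now; a timer started here would exclude spawn time.
    for (int fd : start_fds) {
        if (write(fd, "g", 1) != 1) { /* that child keeps its -1 slot */ }
        close(fd);
    }
    for (pid_t pid : pids) waitpid(pid, nullptr, 0);

    std::vector<std::int32_t> out(results, results + workers);
    shmdt(raw);
    shmctl(shmid, IPC_RMID, nullptr);
    return out;
}
```

Because the children only begin work after the byte arrives, a timer placed around the write loop and the waitpid loop leaves out the spawn cost. It still includes the exit cost, unlike the article's measurement, which excludes both.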

Running coffee spawn.coffee starts the processes, does the counting, and shows the time it took to complete.

You can take a look at the allocated shared memory by running the ipcs command.

IPC status from <running system> as of Tue Apr 14 13:58:16 IST 2015
T     ID     KEY        MODE       OWNER    GROUP
Shared Memory:
m  65536 0x000019a5 --rw-rw-rw- varunajayasiri    staff
m  65537 0x000019a4 --rw-rw-rw- varunajayasiri    staff
m  65538 0x000019a2 --rw-rw-rw- varunajayasiri    staff

Results

bench.coffee was used to find the time a single process takes to count.

@chethiyaa did some testing on a quad core i7.

# children   single process (ms)   multi process (ms)
1            398                   430
2            782                   394
4            1626                  415
8            3300                  799
16           6285                  1594
32           -                     3183
64           -                     6372
128          -                     13049

[1] Node modules like parallel.js fork new processes when running on node and use web workers in the browser.

[2] shmget (documentation) allocates shared memory and shmat (documentation) attaches the shared memory block.

[3] Since the ArrayBuffer is constructed with a memory pointer, it will be external. That is, the memory will not be garbage collected, and the addon will have to free the memory. Here’s the v8 documentation for ArrayBuffer.

[4] Shared memory limits are quite small by default. So trying to allocate a lot of shared memory will give errors. This article gives details on changing and viewing these settings.

 