Thursday, August 8, 2013

Simple C++ smart pointers

C++ smart pointers are a wonderful thing.
I suspect that the previous sentence freaks us out, doesn't it?

Plain C/C++ pointers can easily become a nightmare as well, right?
Garbage collection may be ecological, but it sucks for computer science!

There are probably many incarnations of smart pointers, such as the deprecated C++03 (ISO/IEC 14882) auto_ptr and Boost's. Nevertheless, I dare say it is (or should be) clear that none of them are fully satisfactory in terms of a good balance between software engineering and performance. I'm not yet familiar with C++11, but I believe that by incorporating many of the TR1 proposals (originally/derived from Boost) the issue persists.

The C++03 auto_ptr fails by design. There are plenty of excellent authors thoroughly explaining why. Although a good attempt, it was a disaster and also very disappointing.

Boost smart pointers fail on performance because their implementation relies on virtual functions (which introduces inefficiencies) for coping with the optional deleter constructor argument. In fact, the extensive and complex implementation more than outweighs the convenience of a single class object with an optional deleter.

In my opinion it's much better to have two distinct classes of smart pointers objects, with and without the deleter, but no optionals. But wait a minute, I believe what I really need just the version with the deleter. There's no gain mixing up the automatic and manual lifetime management of the pointed to object. Isn't that obvious? This way there's no additional indirection on every access due to virtual functions calls.

I'm glad that Solaris Studio doesn't offer TR1.
I don't care anymore building Boost on Solaris either.
I prefer to implement my own lean and mean smart pointers.

I start with the most fundamental issue: automatic deallocation.
On another post, I attempt the more advanced version using reference counting.
Here's one implementation:

typedef void ( * deleter ) ( void * ); 
 
templatetypename T, deleter D = ::operator delete > 
struct pointer
{
    explicit pointer( T * const p ) throw() : 
        p( p ) 
    {
    }

    ~pointer() throw()
    {
        D( p );
    }

    operator T * () const throw()
    {
        return p;
    }

    T * operator -> () const throw()
    {
        return p;
    }

private:

    T * const p;

    pointer();
    pointer( pointer const & );
    pointer & operator = ( pointer const & );
};

Even on this simple lean and mean (elegant) smart pointer implementation there are many important aspects involved, such as:

  • The constructor takes a T * although many allocators do return void *.
    This is intentional for taking advantage of the C++ strong type checking.
    The trade-offs simply don't justify supporting void * at all.
      
  • It doesn't use reference counting.
    As such, copying & assignment must naturally be prevented.
    My convention is to specify these constraints as the last declarations.
       
  • It isn't intended to be extended by derivation.
    Its destructor isn't virtual which buys some extra performance.
       
  • By default, I'm storing a deleter as a simple function pointer instead of a wrapping function object. Of course, there's a reason for this decision: most of times, internally, a raw pointer is all that's necessary. An exception to this case is shared memory, which is manipulated via an integer handle, thus consisting on a special case. With this decision there's absolutely no waste of space with state information ( sizeof( pointer< T > ) == sizeof( T * ) ), nor any performance degradation upon deletion. If needed, to get rid of extern "C", an appropriate wrapping C++ function suffices. At least in Oracle Solaris Studio 12.3, the wrapping C++ function can be declared inline (against the C++03 Standard requiring non-type template arguments to have external linkage) as long as the pointer class template using it as a deleter isn't reused as a base class of another class template declaration. In time, I must also mention that the deleter could also be a custom T::operator delete from a class object.
       
  • It isn't intended to cripple programming or obfuscate implementation.
    The absolute minimal requirement are operators -> and T *.
    Hence, most operations are naturally granted, except &.
    Of course operator T * is danger if misused.
    
Of course this implementation is not bullet proof against an evil or undisciplined programmer, if any other current implementation out there is so, after all. In my opinion, it's part of the price to pay for top performance, efficiency and simplicity. But this implementation is also reasonably flexible. I say so because having a custom deleter as a template parameter, causes different deleters to imply different smart pointers types. Therefore, if used with STL containers (the upcoming variant with reference counting and copy & assignment semantics) it won't support "heterogeneous" elements such as Boost smart pointers. But that kind of Boost's flexibility doesn't pay the price for the added complexity and overhead necessary to support such functionality.

Even though the copy & assignment is prevented (as there's no reference counting) the above class is very useful for simple cases at local scopes, similarly to Boost's scoped_ptr and scoped_array or C++11 unique_ptr (without move semantics, of course).

Example 1:

std::size_t largest_pagesize()
{
    std::size_t largest = ::sysconf( _SC_PAGESIZE );

    int n = ::getpagesizes( NULL, 0 );

    // The following declaration is just for demonstration.
    // Particularly, tmp_array would still be better.
    pointer< std::size_t, ::operator delete [] > 
    size( new std::size_t[ n ] );

    if ( ::getpagesizes( size, n ) != -1 )
        while ( --n >= 0 )
            if ( size[ n ] > largest )
                largest = size[ n ];

    return largest;
}

Example 2: 

struct B
{
    virtual ~B() {}
    virtual void f() = 0;
};

struct D : B
{
    void f()
    {
        pointer< std::string > s( new std::string( "Hello!" ) );

        std::cout << *s << std::endl;
        std::cout << s->length() << std::endl;
    }
};

void f()

    // Polimorfism.
    pointer< B > p( new D );
    p->f();

}

Example 3:

// A C++ wrapping of the extern "C" free function.
// Can be inlined: pointer< T, std_c_free >
// isn't a base class of any other class template.
inline void std_c_free( void * p ) throw()
{
    ::free( p ); 
}

void f()

    std::size_t const n = 10;

    // Standard C allocation & deallocation.
    pointer< double, std_c_free >
   
p( (double *) ::malloc( n * sizeof( double ) ) );

    std::fill_n( static_cast< double * >( p ), n, 0.0 );
}

Example 4:

struct A
{

    ...

    void operator delete ( void * p ) throw()
    {
        ...
    }


    ....
};
 

void f()
{
    pointer< A, A::operator delete > p( new A );

    ...
}



Note that if the 1st template argument has the const qualifier applied, then a typecast modification inside the destructor is required, even if the destruction is to be made a kind of placebo (such as when dealing with read-only memory). But thinking twice, perhaps constant types isn't worthwhile as mere raw pointers to constants should suffice. Thus, I won't incorporate the following change just to support constant types:

~pointer() throw()
{
    D( const_cast< void * >
        ( static_cast< void const * >( p ) ) );
}



Regarding arrays, I think their occurrence is frequent enough to justify getting a simplified notation, for instance, by introducing a convenient derived type, such as:
  
template< typename T, deleter D = ::operator delete [] >
struct array : pointer< T, D >
{
    explicit array( T * const p ) throw()
        pointer( p )
    {
    }
};

Example:

void f( std::size_t const n )
{
    array< int > v( new int[ n ] );

    for ( std::size_t i = 0; i < n; i++ )
        v[ i ] = rand();

    // Again, note the trade-offs.
    // Staying raw provides the best performance.
    // But never sacrificing exception-safety robustness.
    std::sort( (int*)v, (int*)v + n, std::less< int >() );
}



Of course, so far it's not suited to the following kind of coding:
(anyway, a raw pointer return type isn't exception-safe at all)

inline char * f()
{
    // Don't do this!
    // Upon returning the pointer will be invalidated.
    return array< char >( new char[ 16 ] );
}

Fortunately, C++ Templates: The Complete Guide by Nicolai Josuttis and Daveed Vandevoorde proposes mitigate the previous limitation with the introduction of a helper class.

I propose a slight variation in a way I think it's neater.
The required additions to the previous pointer and array structs are:

templatetypename T, deleter D = ::operator delete >
struct pointer
{
    ...

    struct relay
    {
        friend struct pointer;

        relay( pointer const & source )
throw()
            p( source.p )
        {
            const_cast< T * >( source.p ) = 0;
        }

        relay( relay const & source )
throw()
            p( source.p )
        {
            const_cast< T * >( source.p ) = 0;
        }

        ~relay()
throw()
        {
            D( p );
        }

    private:

        T *
const p;

        relay();
        relay & operator = ( relay const & );
    };

    explicit pointer( relay const & source )
throw() :  
        p( source.p )
    {
        const_cast< T * >( source.p ) = 0;
    }

    friend struct relay;


    ...

private:

    ...
};

template< typename T, deleter D = ::operator delete [] >
struct array : pointer< T, D >
{
    ...

    explicit array( relay const & source ) throw()
        pointer( source )
    {
    }
};
 
The trade-off shall be clear: with the added overhead, it's now possible to use the smart pointer across function calls while being exception safe:

inline array< char >::relay f( std::size_t const n )
{
    array< char > a( new char[ n ] );
    std::memset( a, '\0', n );  
    return a;


void f()
{

    array< char > a( f( 22 ) );

    ...
}



If the deleter requires more parameters than a single void *, then we have a new trade-off between such a required functionality and the overhead of space and performance. A stateful function object is required. The new implementation (with another name or in another namespace) could be:

template< typename T, typename D >
struct pointer
{
    explicit pointer( T * const p, D const & d ) throw() : 
        p( p ), d( d )
    {
    }

    ~pointer() throw()
    {
        d( p );
    }

    operator T * () const throw()
    {
        return p;
    }

    T * operator -> () const throw()
    {
        return p;
    }

    struct relay
    {
        friend struct pointer;

        relay( pointer const & source ) throw() : 
            p( source.p ), d( source.d )
        {
            const_cast< T * >( source.p ) = 0;
        }

        relay( relay const & source ) throw() : 
            p( source.p ), d( source.d )
        {
            const_cast< T * >( source.p ) = 0;
        }

        ~relay() throw()
        {
            d( p );
        }

    private:

        T * const p;
        D d;

        relay();
        relay & operator = ( relay const & );
    };

    explicit pointer( relay const & source ) throw() :  
        p( source.p )
    {
        const_cast< T * >( source.p ) = 0;
    }

    friend struct relay;

private:

    T * const p;
    D d;

    pointer();
    pointer( pointer const & );
    pointer & operator = ( pointer const & );
};

template< typename T, typename D >
struct array : pointer< T, D >
{
    explicit array( T * const p, D const & d ) throw() : 
        pointer( p, d )
    {
    }

    explicit array( relay const & source ) throw() : 
        pointer( source )
    {
    }
};

Example:

struct stateful_deleter
{
    explicit stateful_deleter( int flags ) throw() : 
        flags( flags )
    {
    }

    template< typename T >
    void operator () ( T * const p ) const throw()
    { 
        special_free( p, flags );
    }

private:

    int flags;
};

void f()
{
    int flags = ...
    std::sizet_t size = ...
 
    pointer< char, stateful_deleter > p
    (
        special_alloc< char >( flags, size ),
        stateful_deleter( flags )
    );

    ...
}



When attempting to adapt shared memory operations to the previous smart pointer ideas, I was initially considering the stateful deleter version due to the associated shared memory handle that has to be maintained. Nevertheless, due to the required allocation & deallocation protocol I ended up with a refurbished solution. It's based on the relay-able non-stateful smart pointer version (now in namespace simple) and the C++ multiple inheritance (to factor out the management of the shmid handle). Having criticized Boost's engineering, I must admit the solution isn't as elegant as I would like to, but it's still acceptable in terms of performance, space and exception-safety. I'm not convinced that an alternative approach could be simpler than the following one I present next:

namespace shared_memory
{

struct bad_alloc 
{
    bad_alloc( int const error ) throw() : 
        error( error )
    {
    }

    int const error;
};

namespace internal
{

template< typename T >
inline T * alloc( int const handle, int const type ) throw()
{
    return static_cast< T * >( ::shmat( handle, 0, type ) );
}


inline bool bad_pointer( void * p ) throw()
{
    return reinterpret_cast< intptr_t >( p ) == -1;
}


// Exception safety metadata object. 
struct identifier
{
    identifier 
    ( 
        ::key_t const key, 
        std::size_t const size, 
        int const flags 
    )  
        throw( bad_alloc ) : 
            handle( ::shmget( key, size, flags ) )
    {
        if ( handle == -1 )
        {
            // Log the error.   

            throw bad_alloc( errno );
        }
    }

    identifier( identifier const & source ) throw() : 
        handle( source.handle )
    {
        // Relaying.
        * const_cast< int * >( & source.handle ) = -1;
    }

    ~identifier() throw()
    {
        // If not relayed, destroy.
        if ( handle != -1 )
            if ( ::shmctl( handle, IPC_RMID, 0 ) == 0 )
                return;

        // Log the error.
    }

protected:

    int const handle;

private:

    identifier();
    identifier & operator = ( identifier const & );
};

} // namespace internal

// Can't be inlined: simple::pointer< T, sys_v_shmdt >
// is a base class of another class template.
void sys_v_shmdt( void * p ) throw()
{
    // If not relayed, destroy.
    if ( !internal::bad_pointer( p ) )
        if ( ::shmdt( p ) == 0 )
            return;

    // Log the error.
}

template< typename T, simple::deleter D = sys_v_shmdt > 
struct pointer : 
    private internal::identifier
    simple::pointer< T, D >
{
    // High-level constructor for exception safety.
    pointer
   
        ::key_t const key, 
        std::size_t const size, 
        int const flags, 
        int const type 
    ) 
        throw ( bad_alloc ) : 
            internal::identifier( key, size, flags ), 
            simple::pointer( internal::alloc< T >( handle, type ) )
    {
        if ( internal::bad_pointer( *this ) )
        {
            // Log the error.  

            throw bad_alloc( errno );
        }
    }

    struct relay :  
        private internal::identifier
        simple::pointer::relay
    {
        friend struct pointer;

        relay( pointer const & source ) throw() :
            internal::identifier( source ), 

            simple::pointer::relay( source )
        {
        }

        relay( relay const & source ) throw() :
            internal::identifier( source ), 

            simple::pointer::relay( source )
        {
        }

    private:

        relay();
        relay & operator = ( relay const & );
    };

    explicit pointer( relay const & source ) throw() :
        internal::identifier( source ), 

        simple::pointer( source )
    {
    }

    friend struct relay;

};

} // namespace shared_memory

Example:

shared_memory::pointer< char >::relay 
shm_alloc
(
    ::key_t const key,
    std::size_t const size,
    int const type
)
{
    int const flags =  
        IPC_CREAT | IPC_EXCL
        SHM_R | SHM_W | SHM_R >> 3 | SHM_W >> 3

    shared_memory::pointer< char >
        p( key, size, flags, type );

    // Touch the shared memory segment.
    std::memset( p, '\xFF', size );

    return p;
}

void f()
{
    try
    {
        std::size_t const size = 
            1UL * 1024UL * 1024UL * 1024UL;

        shared_memory::pointer< char
            p( shm_alloc( 1, size, SHM_PAGEABLE ) );

        ::mlock( p, size );

        // Use the shared memory segment.
        p->...

        ::munlock( p, size );
    } 

    catch ( shared_memory::bad_alloc const & exception )
    {
        ...
    }
}