|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Announcements
Chapters
Services
Feature Zones
|
Note: This is an unedited contribution. If this article is inappropriate,
needs attention or copies someone else's work without reference then please
Report This Article
Introduction64 bit is wonderful - no longer we are limited to a just 2GB (per Win32 process) - you can now address a full 8TB of memory (per Win64 process)... which is great if your application really needs 8TB of memory. If you don't need the full address space, 64-bit can be wasteful: pointers take twice the memory, which may actually result in a slower application, especially when using pointer-laden data structure such as a binary tree (larger structures take more cache space). A simple scheme is presented that allows using 32-bit "pointers" to access up to 64GB of memory on a 64-bit machine Encoding 64-bit pointers in 32-bitsThe main idea of
Therefore, at least for pointers we allocated on the heap, we can just "delete" the 4 lowest bits (by shifting right) without losing information. If all memory addresses were allocated from the "bottom" (the highest bits are also 0), we could just encode a pointer as follows: #include <stddef.h> // for uintptr_t typedef unsigned __int32 uint32_t; uint32_t encode(void* p) { uintptr_t i = reinterpret_cast<uintptr_t>(p) >> 4; return static_cast<uint32_t>(i); } ( This will encode any (16-byte aligned) address in the range However, the operating system is free to map physical memory into other virtual memory segments - for example, mapping kernel memory "at the top", so it is preferable to keep such a map ourselves: #include <boost/thread/mutex.hpp> #include <boost/thread/locks.hpp> class sptr_base { protected: static const uint32_t ALIGNMENT_BITS = 4; static const uint32_t ALIGNMENT = (1<<ALIGNMENT_BITS); static const uintptr_t ALIGNMENT_MASK = ALIGNMENT - 1; protected: static uintptr_t _seg_map[ALIGNMENT]; static uintptr_t _segs; static boost::mutex _m; inline static uintptr_t ptr2seg(uintptr_t p) { p &= ~0xFFFFFFFFULL; // Keep only high part uintptr_t s = _segs; uintptr_t i = 0; for (; i < s; ++i) if (_seg_map[i] == p) return i; // Not found - now we do it the "right" way (mutex and all) boost::lock_guard<boost::mutex> lock(_m); for (i = 0; i < s; ++i) if (_seg_map[i] == p) return i; i = _segs++; if (_segs > ALIGNMENT) throw bad_segment("Segment out of range"); _seg_map[i] = p; return i; } };
The actual pointer encode/decode is then as follows: #include <boost/pool/detail/singleton.hpp> typedef boost::details::pool::singleton_default<segment_mapper> the_segment_mapper; uint32_t encode(const_native_pointer_type ptr) { uintptr_t p = reinterpret_cast<uintptr_t>(ptr); if ((p & ALIGNMENT_MASK) != 0) throw bad_alignment("Pointer is not aligned"); return (uint32_t)(ptr2seg(p) + p); } inline native_pointer_type decode(uint32_t e) { uintptr_t el = e; uintptr_t ptr = (_seg_map[el & ALIGNMENT_MASK] + el) & ~ALIGNMENT_MASK; return reinterpret_cast<native_pointer_type>(ptr); } sptr
sptr<int> p = new int; *p = 5; delete p; // Conversions from/to pointers and pointer arithmetics work int* q = new int; p = q; p++; q = p; Limitations
First, Second, although an address allocated on the heap will be aligned properly, pointer arithmetics may complicate things: sptr<char> buf = new char[100]; will work just fine as However, something like: for (sptr<char> p = buf; p != buf + 100; ++p) *p = '0'; will not work as p will now point to an unaligned address (the However, in most cases a standard pointer can be used: for (char* p = buf; p != buf + 100; ++p) *p = '0'; Third, mix For example, given: struct A { char i; }; struct B { int j; }; struct C : public A, public B { }; This may or may not work (depending on structure alignement): sptr<C> c = new C; sptr<B> b = c; This is because C* c = new C; B* b = c; std::cout << ((uintptr_t)b - (uintptr_t)c) << std::endl;
There is no problem accessing the object via a standard pointer: sptr<C> c = new C; B* b = c; PerformanceA small benchmark is provided with the code, implementing a simple linked-list which is repeatedly traversed. The list's node is: struct node { int val; node_ptr prev; node_ptr next; }; The size of the node is therefore 20 bytes with standard pointers vs. 12 bytes with However, this comes at a tradeoff - the extra bit shuffling impacts performance - the standard pointer version will perform 1.5x faster on the linked-list benchmark. Accessing the map accounts for much of this performance hit. If it could be guaranteed that all addresses are allocated from the region
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||