MANAGING MEMORY WITH AIX MALLOCs
When I started as an AIX performance practitioner in 1999, memory was a hot commodity. Most AIX systems held no more than a few gigabytes of RAM, and it was very expensive. Memory speeds were slow, paging was common, and kernel panics (intrusions into protected memory regions) were an accepted fact of life. If you called support due to a problem with a database, application or middleware, technicians would generally first check how that software was interacting with the system's RAM. Memory was a key resource, and we were all understandably focused on it.
In those days, "tuning memory" simply meant adding more paging space using an old formula that was dependent upon the amount of RAM you had installed. This started to change with the creation of vmtune options, which, as I recall, debuted somewhere around AIX 4.1 to 4.3. While these more sophisticated options provided a better method of tuning system memory behavior, they weren't well understood initially, and it took years before boilerplate recommendations for the vmtune options were published.
Obviously, things have changed dramatically over the past 15-20 years. Today's systems are loaded with memory. Databases have grown to enormous size and become clustered, and application programming techniques have advanced light years beyond where they were. Languages like Java require a high level of memory-usage sophistication even to operate correctly. 64-bit apps have replaced 16- and 32-bit programs, and memory speeds are orders of magnitude faster.
Along with these advancements comes an ever-more critical need to control memory across complex computing environments. Fortunately for AIX administrators, the MALLOCs offer a virtual toolbox of memory management utilities.
Most AIX admins know that, in computer programming, “malloc” is a method used to allocate a block of memory on the heap (the portion of a system’s memory where dynamically allocated blocks reside). However, the role of a “MALLOC” in AIX is less widely understood. A MALLOC is one of several methods AIX uses to parcel out memory to applications, databases and middleware, and each MALLOC has options that can be invoked to fine-tune its usage. In AIX, MALLOC stands for “Memory ALLOCator,” and there are four basic types. The first is the default -- or “Yorktown” -- allocator, which is active once an AIX system is installed. Then there are the Watson, MALLOC 3.1 and Watson2 allocators. While the default allocator doesn't require any environment variables to be set, the others do. This is accomplished by exporting the MALLOCTYPE variable with the appropriate value.
MALLOCs debuted back in AIX 4.3, but even today, too few admins make use of them -- particularly the MALLOC options (MALLOCOPTIONS). This is unfortunate, because MALLOCs, when properly applied, can noticeably enhance system performance. The remainder of this article lists and describes the MALLOCs and selected MALLOCOPTIONS. A quick caveat: I cannot tell you which MALLOCs are best suited for your environment. Those determinations can only be made after an extensive performance study.
The Memory Allocators (subhead)
Default/Yorktown -- The default memory allocator, which is also called Yorktown, is selected when the MALLOCTYPE environment variable is unset. It's the most common MALLOC found in AIX systems, and, as noted, it's the allocator that's active after you install an AIX system. The default/Yorktown allocator delivers consistent performance and is effective at handling the memory requests of badly behaving applications. However, it may not be as efficient as the other allocators. Default/Yorktown functions best with 32-bit applications that make only infrequent calls to malloc(). The simplest way to determine whether you have these types of applications is to either check your programming code or examine kernel trace data.
Watson -- The Watson allocator is efficient and scalable, and provides good performance. It's also specifically designed for 64-bit applications, which of course account for most programming efforts these days. But if you have any 32-bit applications, utilities or databases on your AIX system, be very careful deploying Watson. From my experience at least, the Watson MALLOC doesn't play well with a lot of 32-bit code. Watson is designed for a 64-bit world, and my advice is to use it strictly for that.
To implement Watson, place the following stanza in your /etc/profile and reboot. I usually place it at the top of the file:
    export MALLOCTYPE=Watson
Note: A reboot is always required when implementing any allocator or sub-option.
MALLOC 3.1 -- With this allocator, you're gaining performance at the expense of memory consumption. In most cases, MALLOC 3.1 will consume twice as much memory as the other allocators, so don't use it unless your system has at least twice the memory that your capacity study estimated you need. MALLOC 3.1 reduces the overhead of memory reallocation because it likely allocates more memory to code than was needed to begin with.
Implement this allocator with the following export:
    export MALLOCTYPE=3.1
    (the generic export, for 32-bit programs)
    export MALLOCTYPE=3.1_64BIT
    (for 64-bit programs)
Watson2 -- I think of this as the adaptable memory allocator. When applications change from single-threaded to multithreaded operations or vice versa, Watson2 uses a varying number of heap structures, depending on the behavior of that application. This means you shouldn’t need to rely upon the MALLOCOPTIONS as much with Watson2 as you generally would with the other allocators.
Implement this allocator with this:
    export MALLOCTYPE=Watson2
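To double-check which allocator a given MALLOCTYPE value selects, a small shell function can map the values described in this section back to their names. This is only a sketch: describe_malloctype is a hypothetical helper of my own, not an AIX command, and it recognizes only the values named in this article.

```shell
# Sketch: report which allocator a MALLOCTYPE value selects,
# based on the descriptions in this article.
# describe_malloctype is a hypothetical helper, not an AIX command.
describe_malloctype() {
    case "${1-}" in
        "")        echo "default (Yorktown)" ;;
        Watson)    echo "Watson" ;;
        Watson2)   echo "Watson2" ;;
        3.1)       echo "MALLOC 3.1 (32-bit programs)" ;;
        3.1_64BIT) echo "MALLOC 3.1 (64-bit programs)" ;;
        *)         echo "unrecognized value: $1" ;;
    esac
}

# Report on the value currently set in this shell (empty if unset).
describe_malloctype "${MALLOCTYPE-}"
```

Running it from a login shell is a quick way to confirm that the export in /etc/profile was picked up. It only inspects the environment variable; it does not prove which allocator a running process is using.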
The MALLOCOPTIONS (subhead)
The MALLOC sub-options are all set by exporting the MALLOCOPTIONS environment variable. There are many sub-options; I'll only list a few:
Multiheap -- By default, the memory allocation subsystem uses a single heap. When you're running a multithreaded application, you can run into lock contention with one heap. By exporting multiheap, you can control the number of parallel heaps used by the default/Yorktown, Watson and MALLOC 3.1 allocators (Watson2 handles this on its own). You can have between 1 and 32 heaps. Export this option in your /etc/profile. I would place it below the stanza you use to export one of the MALLOCs:
    export MALLOCOPTIONS=[multiheap:n] | [considersize]
Note: The brackets and the vertical bar are syntax notation, not characters to type. n is the number of heaps. If n isn't specified, the multiheap option defaults to 32. considersize, an optional setting, tells the allocator to select a heap that has enough free space to handle the allocation request.
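As a concrete illustration of the multiheap syntax, the sketch below builds the option string and rejects heap counts outside the 1-to-32 range before exporting it. The multiheap_option function is a hypothetical helper, and joining the two settings with a comma reflects my understanding of how IBM's documentation combines MALLOCOPTIONS values, so verify it against your AIX level.

```shell
# Sketch: build a multiheap setting and validate the heap count.
# multiheap_option is a hypothetical helper, not an AIX command.
# Assumption: multiple MALLOCOPTIONS values are comma-separated.
multiheap_option() {
    n=${1-}
    opt="multiheap"
    if [ -n "$n" ]; then
        # Reject anything that is not a whole number.
        case "$n" in
            *[!0-9]*) echo "heap count must be a whole number" >&2; return 1 ;;
        esac
        # The article gives 1 to 32 as the valid range.
        if [ "$n" -lt 1 ] || [ "$n" -gt 32 ]; then
            echo "heap count must be between 1 and 32" >&2
            return 1
        fi
        opt="multiheap:$n"
    fi
    # Optionally ask for size-aware heap selection.
    if [ "${2-}" = "considersize" ]; then
        opt="$opt,considersize"
    fi
    echo "$opt"
}

# Example: four heaps with size-aware heap selection.
MALLOCOPTIONS=$(multiheap_option 4 considersize) && export MALLOCOPTIONS
echo "MALLOCOPTIONS=$MALLOCOPTIONS"
```

The heap count of four is arbitrary. The right number for your workload is something only your performance study can tell you.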
Buckets -- Watson uses buckets of memory for allocation requests. This sub-option is similar to Watson's built-in bucket function, but it allows for more finely grained control over the number of buckets that can be configured, as well as the number of blocks per bucket and the size of each bucket.
This sub-option is enabled with:
    export MALLOCOPTIONS=buckets
Note: Enabling this sub-option turns off Watson's built-in bucket function.
MALLOC pools -- This sub-option is similar to Watson's built-in bucket allocator, but MALLOC pools creates a bucket for each application thread, which is designed to improve the performance of multithreaded applications. This provides lock-free allocation for blocks smaller than 513 bytes. (Note that pool sizes are limited to 512MB.) Keep in mind that the effectiveness of MALLOC pools is reduced if one thread of your application allocates memory while another thread frees that memory.
Export MALLOC pools like this:
    export MALLOCOPTIONS=pool<:max_size>
MALLOC disclaim -- Exporting this sub-option automatically disclaims -- or frees -- memory. MALLOC disclaim is most useful for reducing paging space requirements in your system. Its primary use is to combat high memory usage that may occur in application processes even after you call free(). And if you do encounter this situation, be sure to check for memory leaks.
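One practical point about the exports in this section: each "export MALLOCOPTIONS=..." line replaces whatever was set before it, so two separate exports in /etc/profile leave only the last one in effect. The sketch below appends to the variable instead. add_malloc_option is a hypothetical helper, and the comma separator is my understanding of the documented format. Not every combination of sub-options is supported by every allocator, so check IBM's compatibility notes before combining them.

```shell
# Sketch: append a sub-option to MALLOCOPTIONS without clobbering
# what is already there.
# add_malloc_option is a hypothetical helper, not an AIX command.
# Assumption: multiple MALLOCOPTIONS values are comma-separated.
add_malloc_option() {
    if [ -z "${MALLOCOPTIONS-}" ]; then
        MALLOCOPTIONS=$1
    else
        MALLOCOPTIONS="$MALLOCOPTIONS,$1"
    fi
    export MALLOCOPTIONS
}

# Start from a clean slate for the demonstration.
unset MALLOCOPTIONS
add_malloc_option multiheap:4
add_malloc_option disclaim
echo "MALLOCOPTIONS=$MALLOCOPTIONS"   # prints MALLOCOPTIONS=multiheap:4,disclaim
```

If disclaim is the only sub-option you want, a plain "export MALLOCOPTIONS=disclaim" does the same job.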
Which Option is Right for You? (subhead)
As I pointed out in the introduction, determining which MALLOC and which sub-options are best for your environment requires study. By diving deeper into your kernel trace data, you'll be able to brainstorm with your application owners/developers about what's right for you. The time and effort you invest will pay off, because the performance gains realized by choosing the correct MALLOC can be considerable.
Once you’ve implemented a particular MALLOC and/or any of its MALLOCOPTIONS, check to make sure they’re functioning correctly. I'm sure you're all familiar with the dbx utility. dbx is an all-purpose debugger that ships with AIX. In this example, I'll implement the Watson allocator, run a program that uses Watson, and finally invoke dbx to examine my MALLOC statistics. Have your HEX calculator at the ready. If you run the malloc subcommand at the dbx prompt repeatedly over time, you'll have a very good idea how much heap space your application needs.
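If you'd rather not reach for a hex calculator, the shell itself can convert the hexadecimal values you read from dbx. This is a minimal sketch: hex_to_dec and hex_to_mb are hypothetical helpers, and the sample value is made up for illustration, not taken from real dbx output.

```shell
# Sketch: convert hexadecimal sizes to decimal bytes and whole megabytes.
# hex_to_dec and hex_to_mb are hypothetical helpers, not AIX commands.
hex_to_dec() {
    # printf accepts C-style hex constants such as 0x1000.
    printf '%d\n' "$1"
}

hex_to_mb() {
    # Shell arithmetic also understands the 0x prefix; 1048576 = 1MB.
    echo $(( $1 / 1048576 ))
}

# Made-up sample value for illustration only.
sample=0x20000000
echo "$sample = $(hex_to_dec "$sample") bytes"
echo "$sample = $(hex_to_mb "$sample") MB"
```

Note that hex_to_mb uses integer division, so anything smaller than 1MB reports as 0.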
So here, I’ve exported Watson in my /etc/profile on an AIX 7.1 LPAR and started running vmstat, the performance data collection tool that uses Watson. I’ve done a "ps -ef | grep vmstat" and, as root, passed the resulting PID to dbx with the -a flag, like this:
ps -ef | grep vmstat

    root 1900584 7012554   0 13:00:10  pts/0  0:00 vmstat -w 2

dbx -a 1900584
Waiting to attach to process 1900584 ...
Successfully attached to vmstat.

…lines omitted …

stopped in _p_nsleep at 0x9000000011d3910 ($t1)
0x9000000011d3910 (_p_nsleep+0x10) e8410028            ld  r2,0x28(r1)
(dbx)
Next, I enter "malloc" at the dbx command prompt:
(dbx) malloc
The following options are enabled:

       Implementation Algorithm........ Watson Allocator
       Malloc Adaptive Buckets
       Malloc Multiheap
               Number of Heaps......... 32
We see that Watson is indeed active. Also note the last line ("Number of Heaps"). In this particular system, I’ve also exported the multiheap option. This gives me 32 memory heaps as opposed to the default of one. Now I don’t need to worry so much about threads competing for heap space.
Before exiting the dbx utility, make very sure to issue a “detach” as your last command at the dbx command prompt. If you don’t take this step, you’ll kill the process you’re working with!
Memory may not be as vital as it was all those years ago when I was starting out, but it's still very important. If memory doesn’t function correctly in your AIX system, nothing else will function well, either. So experiment on test or sandbox systems with the different memory allocators and MALLOCOPTIONS. See how your applications behave under different memory scenarios. Fortunately, there is a ton of documentation on the MALLOCs written for all levels of experience and skill, so read and learn. Again, your efforts should pay off in terms of improved system performance.