MANAGING MEMORY WITH AIX MALLOCs

When I started as an AIX performance practitioner in 1999, memory was a hot commodity. Most AIX systems held no more than a few gigabytes of RAM, and it was very expensive. Memory speeds were slow, paging was common, and kernel panics (intrusions into protected memory regions) were an accepted fact of life. If you called support due to a problem with a database, application or middleware, technicians would generally first check how that software was interacting with the system's RAM. Memory was a key resource, and we were all understandably focused on it.

In those days, "tuning memory" simply meant adding more paging space using an old formula that was dependent upon the amount of RAM you had installed. This started to change with the creation of vmtune options, which, as I recall, debuted somewhere around AIX 4.1 to 4.3. While these more sophisticated options provided a better method of tuning system memory behavior, they weren't well understood initially, and it took years before boilerplate recommendations for the vmtune options were published.

Obviously, things have changed dramatically over the past 15-20 years. Today's systems are loaded with memory. Databases have grown to enormous size and become clustered, and application programming techniques have advanced light years beyond where they were. Languages like Java require a high level of memory-usage sophistication even to operate correctly. 64-bit apps have replaced 16- and 32-bit programs, and memory speeds are orders of magnitude faster.

Along with these advancements comes an ever-more critical need to control memory across complex computing environments. Fortunately for AIX administrators, the MALLOCs offer a virtual toolbox of memory management utilities.

Most AIX admins know that a “malloc” in computer programming is a method used to allocate a block of memory on the heap (a portion of a system’s memory where blocks of dynamically allocated memory reside). However, the role of a “MALLOC” in AIX is less widely understood. A MALLOC is one of several methods AIX uses to parcel out memory to applications, databases and middleware, and each MALLOC has options that can be invoked to fine-tune its usage. In AIX, MALLOC stands for “Memory ALLOCator,” and there are four basic types. The first is the default, or “Yorktown,” allocator, which is active once an AIX system is installed. Then there are the Watson, MALLOC 3.1 and Watson2 allocators. While the default allocator doesn't require any environment variables to be set, the others do. This is accomplished by exporting the MALLOCTYPE variable with the appropriate value.

MALLOCs debuted back in AIX 4.3, but even today, too few admins make use of them -- particularly the MALLOC options (MALLOCOPTIONS). This is unfortunate, because MALLOCs, when properly applied, can noticeably enhance system performance. The remainder of this article lists and describes the MALLOCs and selected MALLOCOPTIONS. A quick caveat: I cannot tell you which MALLOCs are best suited for your environment. Those determinations can only be made after an extensive performance study.

The Memory Allocators

Default/Yorktown -- The default memory allocator, which is also called Yorktown, is selected when the MALLOCTYPE environmental variable is unset. It's the most common MALLOC found in AIX systems, and, as noted, it's the allocator that's active after you install an AIX system. The default/Yorktown allocator maintains a consistent performance and is effective at handling the memory requests of badly behaving applications. However, it may not be as efficient as other allocators. Default/Yorktown functions best when implemented for 32-bit applications that make only infrequent calls to malloc(). The simplest way to determine if you have these types of applications is to either check your programming code or examine kernel trace data.

Watson -- The Watson allocator is efficient and scalable, and provides good performance. It's also specifically designed for 64-bit applications, which of course account for most programming efforts these days. But if you have any 32-bit applications, utilities or databases on your AIX system, be very careful deploying Watson. From my experience at least, the Watson MALLOC doesn't play well with a lot of 32-bit code. Watson is designed for a 64-bit world, and my advice is to use it strictly for that.

To implement Watson, place the following stanza in your /etc/profile and reboot. I usually place it at the top of the file:

    export MALLOCTYPE=Watson

Note: Processes read these variables only when they start, so reboot after implementing any allocator or sub-option to make sure everything on the system picks up the change.

MALLOC 3.1 -- With this allocator, you're gaining performance at the expense of memory consumption. In most cases, MALLOC 3.1 will consume twice as much memory as the other allocators, so don't use it without having at least 2X the memory installed on your system that you've estimated is needed in your capacity study. MALLOC 3.1 reduces the overhead of memory reallocation because it likely allocates more memory to code than was needed to begin with.

Implement this allocator with the following export:

    export MALLOCTYPE=3.1

    (the generic export for 32-bit programs)

    export MALLOCTYPE=3.1_64BIT

    (for 64-bit programs)

Watson2 -- I think of this as the adaptable memory allocator. When applications change from single-threaded to multithreaded operations or vice versa, Watson2 uses a varying number of heap structures, depending on the behavior of that application. This means you shouldn’t need to rely upon the MALLOCOPTIONS as much with Watson2 as you generally would with the other allocators.

Implement this allocator with this:

    export MALLOCTYPE=Watson2
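For trying an allocator against one program on a test system, the variable can also be set for just that command instead of for every login shell. The following is a minimal sketch, assuming the allocator is chosen from the process environment at startup; run_with_malloc is a hypothetical helper name, not an AIX utility, and it accepts only the four allocator values described above.

```shell
#!/bin/sh
# Hypothetical helper: run one command under a chosen allocator.
# "default" means Default/Yorktown, which is selected when MALLOCTYPE is unset.
run_with_malloc() {
    alloc=$1
    shift
    case "$alloc" in
        default)
            # Clear any inherited setting inside a subshell only.
            ( unset MALLOCTYPE; "$@" ) ;;
        Watson|Watson2|3.1|3.1_64BIT)
            # Export the setting inside a subshell so the caller's
            # environment is left untouched.
            ( MALLOCTYPE=$alloc; export MALLOCTYPE; "$@" ) ;;
        *)
            echo "unknown allocator: $alloc" >&2
            return 2 ;;
    esac
}

# Usage: show what the child process sees in its environment.
run_with_malloc Watson2 sh -c 'echo "MALLOCTYPE=${MALLOCTYPE:-unset}"'
```

Because the variable is set inside a subshell, nothing changes for the calling shell or for other processes, which keeps an experiment contained.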

The MALLOCOPTIONS

The MALLOC sub-options are all set by exporting the MALLOCOPTIONS environment variable. There are many sub-options; I'll only list a few:

Multiheap -- By default, the memory allocation subsystem uses a single heap. When you're running a multithreaded application, you can run into lock contention with one heap. By exporting multiheap, you can control the number of parallel heaps used by the default/Yorktown, Watson and MALLOC 3.1 allocators (Watson2 handles this on its own). You can have between 1 and 32 heaps. Export this option in your /etc/profile. I would place it below the stanza you use to export one of the MALLOCs:

    export MALLOCOPTIONS=[multiheap:n] | [considersize]

Note: n is the number of heaps. If n isn't specified, the multiheap option defaults to 32. considersize, an optional setting, tells the allocator to select a heap that has enough free space to handle the allocation request. The brackets and the vertical bar are syntax notation, not characters to type.
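As a concrete illustration of the syntax above, here is a small sketch that builds a multiheap value and enforces the 1-to-32 range. The helper name multiheap_opts is hypothetical, and joining the two settings with a comma is an assumption about how multiple MALLOCOPTIONS values are combined, so check it against the AIX documentation for your release.

```shell
#!/bin/sh
# Hypothetical helper: build a MALLOCOPTIONS value for multiheap.
#   $1 = number of heaps (1 to 32)
#   $2 = optional literal "considersize"
multiheap_opts() {
    heaps=$1
    consider=${2:-}
    # Reject empty or non-numeric heap counts.
    case "$heaps" in
        ''|*[!0-9]*)
            echo "heap count must be an integer" >&2
            return 1 ;;
    esac
    if [ "$heaps" -lt 1 ] || [ "$heaps" -gt 32 ]; then
        echo "heap count must be between 1 and 32" >&2
        return 1
    fi
    opts="multiheap:$heaps"
    if [ "$consider" = "considersize" ]; then
        # Assumption: options are comma-separated.
        opts="$opts,considersize"
    fi
    echo "$opts"
}

# Usage: eight heaps, with size-aware heap selection.
MALLOCOPTIONS=$(multiheap_opts 8 considersize) && export MALLOCOPTIONS
echo "MALLOCOPTIONS=$MALLOCOPTIONS"
```

Validating the heap count before exporting catches a typo in the shell rather than leaving you to wonder later why the setting had no visible effect.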

Buckets -- Watson uses buckets of memory for allocation requests. This sub-option is similar to Watson's built-in bucket function, but it allows for more finely grained control over the number of buckets that can be configured, as well as the number of blocks per bucket and the size of each bucket.

This sub-option is enabled with:

    export MALLOCOPTIONS=buckets

Note: Enabling this sub-option turns off Watson's built-in bucket function.

MALLOC pools -- This sub-option is similar to Watson's built-in bucket allocator, but MALLOC pools creates a bucket for each application thread, which is designed to improve the performance of multithreaded applications. This provides lock-free allocation for blocks smaller than 513 bytes. (Note that pool sizes are limited to 512MB.) Keep in mind that the effectiveness of MALLOC pools is reduced if one thread of your application allocates memory while another thread frees that memory.

Export MALLOC pools like this:

    export MALLOCOPTIONS=pool<:max_size>

MALLOC disclaim -- Exporting this sub-option automatically disclaims -- or frees -- memory. MALLOC disclaim is most useful for reducing paging space requirements in your system. Its primary use is to combat high memory usage that may occur in application processes even after you call free(). And if you do encounter this situation, be sure to check for memory leaks.
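The disclaim sub-option is exported through MALLOCOPTIONS like the others, using the keyword disclaim. The sketch below adds it to whatever is already set rather than overwriting it. The helper name add_malloc_option is hypothetical, and the comma used to join options is an assumption to verify against the AIX documentation for your release.

```shell
#!/bin/sh
# Hypothetical helper: append an option to a comma-separated
# MALLOCOPTIONS value unless it is already present.
#   $1 = current value (may be empty)
#   $2 = option to add
add_malloc_option() {
    current=$1
    new=$2
    case ",$current," in
        *",$new,"*) echo "$current" ;;        # already present: unchanged
        ,,)         echo "$new" ;;            # nothing set yet
        *)          echo "$current,$new" ;;   # append to the existing list
    esac
}

# Usage: keep any existing sub-options and add disclaim.
MALLOCOPTIONS=$(add_malloc_option "${MALLOCOPTIONS:-}" disclaim)
export MALLOCOPTIONS
echo "MALLOCOPTIONS=$MALLOCOPTIONS"
```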

Which Option is Right for You?

As I pointed out in the introduction, determining which MALLOC and which sub-options are best for your environment requires study. By diving deeper into your kernel trace data, you'll be able to brainstorm with your application owners/developers about what's right for you. The time and effort you invest will pay off, because the performance gains realized by choosing the correct MALLOC can be considerable.

Once you’ve implemented a particular MALLOC and/or any of its MALLOCOPTIONS, check to make sure they’re functioning correctly. I'm sure you're all familiar with the dbx utility, the all-purpose debugger that ships with AIX. In this example, I'll implement the Watson allocator, run a program that uses Watson, and finally invoke dbx to examine my MALLOC statistics. Have your hex calculator at the ready. If you run the malloc subcommand at the dbx prompt repeatedly while your application works, you'll get a very good idea of how much heap space it needs.

So here, I’ve exported Watson in my /etc/profile on an AIX 7.1 LPAR and started vmstat, a performance data collection tool that now runs under Watson. I ran "ps -ef | grep vmstat" and, as root, passed the resulting PID to dbx with the -a flag, like this:

    ps -ef | grep vmstat

        root 1900584 7012554   0 13:00:10  pts/0  0:00 vmstat -w 2

    dbx -a 1900584
    Waiting to attach to process 1900584 ...
    Successfully attached to vmstat.

    … lines omitted …

    stopped in _p_nsleep at 0x9000000011d3910 ($t1)
    0x9000000011d3910 (_p_nsleep+0x10) e8410028            ld  r2,0x28(r1)
    (dbx)

Next, I enter "malloc" at the dbx command prompt:

    (dbx) malloc
    The following options are enabled:

        Implementation Algorithm........ Watson Allocator
        Malloc Adaptive Buckets
        Malloc Multiheap
            Number of Heaps......... 32

We see that Watson is indeed active. Also note the last line ("Number of Heaps"). In this particular system, I’ve also exported the multiheap option. This gives me 32 memory heaps as opposed to the default of one. Now I don’t need to worry so much about threads competing for heap space.

Before exiting the dbx utility, make very sure to issue a “detach” as your last command at the dbx command prompt. If you don’t take this step, you’ll kill the process you’re working with!
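Since the heap figures call for a hex calculator, the shell itself can stand in for one. This is a minimal sketch: hex_to_bytes and hex_to_mb are hypothetical helper names, and the value 0x2000000 is a made-up example, not output from the dbx session above.

```shell
#!/bin/sh
# Hypothetical helpers for converting hexadecimal sizes.

# Print a 0x-prefixed hexadecimal value as decimal bytes.
hex_to_bytes() {
    printf '%d\n' "$1"
}

# Print a 0x-prefixed hexadecimal byte count as whole megabytes
# (1 MB = 1048576 bytes, rounded down).
hex_to_mb() {
    bytes=$(hex_to_bytes "$1")
    echo $(( bytes / 1048576 ))
}

# Usage with a made-up heap size of 0x2000000 bytes.
hex_to_bytes 0x2000000   # prints 33554432
hex_to_mb 0x2000000      # prints 32
```

Converting a few samples taken at different points in an application's run makes it easier to see whether the heap is growing.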

Memory may not be as vital as it was all those years ago when I was starting out, but it's still very important. If memory doesn’t function correctly in your AIX system, nothing else will function well, either. So experiment on test or sandbox systems with the different memory allocators and MALLOCOPTIONS. See how your applications behave under different memory scenarios. Fortunately, there is a ton of documentation on the MALLOCs written for all levels of experience and skill, so read and learn. Again, your efforts should pay off in terms of improved system performance.
