Memory Leak: Caused by a program that repeatedly allocates memory without freeing it.
When a process exits, its working storage is freed up immediately and its associated memory frames are put back on the free list.
However any files the process may have opened can stay in memory.
AIX tries to use the maximum amount of free memory for file caching.
High levels of file system cache usually means that is the way the application runs and likes it (you have to decide if this is expected by understanding the workload) or AIX can't find anything else to do with the memory and so thinks it might as well save disk I/O CPU cycles by caching - this is normal and a good idea.
Link to similar article about Oracle AIX memory issue: http://aix4admins.blogspot.co.uk/2011/09/memory-leak-caused-by-program-that.html
There are couple commands to run to find issues:
- ps gv | head -n 1; ps gv | egrep -v "RSS" | sort +6b -7 -n -r
- svmon
- svmon -P -O summary=basic,unit=MB
- nmon m
Examples (this is after-fix readings, and fix was to upgrade to another version of JDK):
ps gv | head -n 1; ps gv | egrep -v "RSS" | sort +6b -7 -n -r | head -n 10
$ ps gv | head -n 1; ps gv | egrep -v "RSS" | sort +6b -7 -n -r | head -n 10 PID TTY STAT TIME PGIN SIZE RSS LIM TSIZ TRS %CPU %MEM COMMAND 4587574 - A 4943:32 8010 679324 679452 32768 80 128 0.9 1.0 /opt/W 7077924 - A 5435:52 1930 666140 666268 32768 80 128 1.0 1.0 /opt/W 15794204 - A 233:22 219 550876 551004 32768 80 128 0.0 1.0 /opt/W 5439520 - A 236:28 164 530844 530972 32768 80 128 0.0 1.0 /opt/W 12845156 - A 132:39 423 470300 470428 32768 80 128 0.0 1.0 /opt/W 12321022 - A 131:39 271 406428 406556 32768 80 128 0.0 1.0 /opt/W 9764870 - A 68:25 157 255896 256024 32768 80 128 0.0 0.0 /opt/W 7340098 - A 14:18 387 63356 63428 32768 39 72 0.0 0.0 java - 7995524 - A 11:54 3049 59212 59332 32768 69 120 0.0 0.0 /opt/t
This command gives you info about current memory usage by processes (top 10).
RSS: The amount of RAM used for the text and data segments per process. (RSS:resident set size)
%MEM: The actual amount of the RSS / Total RAM. Watch for processes that consume 40-70 percent of %MEM.
TRS: The amount of RAM used for the text segment of a process in kilobytes.
SIZE: The actual amount of paging space (virtual mem. size) allocated for this process (text and data).
On size/rss column you can see current usage at level of 680MB (for to processes), when I investigated leak I saw usage at level of 8 GB even when my max heap was set to 4GB!
svmon
What is really going on
$ svmon size inuse free pin virtual mmode memory 16777216 16138874 638342 1431202 7262359 Ded pg space 1572864 220721 work pers clnt other pin 923826 0 0 507376 in use 7230007 0 8908867 PageSize PoolSize inuse pgsp pin virtual s 4 KB - 9824554 16721 652946 915687 m 64 KB - 394645 12750 48641 396667
Here we have all memory used by system (and all applications) - just keep in mind its in 4KB pages.
Top line:
size: total memory available: 16777216 * 4 KB = 64GB = 65536 MB
inuse: currently used memory () 16138874 * 4KB = 63042 MB
free: free memory: 638342 * 4KB = 2493 MB
So looks like over 95% of memory is used.
Real value of used pages are on bottom 2 lines, here we have two types of pages: s - small 4KB one and m - medium 64KB, together:
9824554 * 4KB + 394645 *64KB = 39298216 KB + 25257280 KB = 64555496 KB =63042 MB
svmon -P -O summary=basic,unit=MB
Lets look what processes are using:$ svmon -P -O summary=basic,unit=MB Unit: MB ------------------------------------------------------------------------------- Pid Command Inuse Pin Pgsp Virtual 4587574 java 5222.52 32.0 0.94 4920.19 7077924 java 5212.62 31.9 0.94 4914.32 12845156 java 4914.87 31.4 0.94 4620.29 5439520 java 2832.61 31.7 0.94 2522.76 15794204 java 2646.86 31.7 0.94 2331.34 12321022 java 2064.07 31.3 0.94 1762.70 9764870 java 965.89 31.3 0.94 673.54 8650818 webservd 178.28 30.8 2.31 150.90 11403468 svmon 115.71 30.6 0.94 116.89 15532206 ksh 113.70 30.5 2.31 120.28 5243046 ksh 113.67 30.5 2.31 120.26 10616930 sleep 113.00 30.5 2.31 119.88
As you can see I have 7 java processes; I'm interested only on first 6 - as that my application. On all of them max heap is set to 4GB, question is: why first of them are getting over 5GB?
In AIX IBM java version: process memory is divided into java managed (heap), native (calls to native system: threads, JNI, JDBC...), file caching.
svmon -P <pid>
Look into details:------------------------------------------------------------------------------- Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB 4587574 java 1336964 8183 240 1259569 Y Y N PageSize Inuse Pin Pgsp Virtual s 4 KB 96484 343 0 18689 m 64 KB 77530 490 15 77555 Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual 5a32d8 72 work text data BSS heap m 4096 0 0 4096 5e90de 78 work text data BSS heap m 4096 0 0 4096 a74427 7f work text data BSS heap m 4096 0 0 4096 db1d5b 7a work text data BSS heap m 4096 0 0 4096 d74357 76 work text data BSS heap m 4096 0 0 4096 cd444d 7e work text data BSS heap m 4096 0 0 4096 b63234 70 work text data BSS heap m 4096 0 0 4096 2634a4 75 work text data BSS heap m 4096 0 0 4096 5034d2 73 work text data BSS heap m 4096 0 0 4096 c49444 77 work text data BSS heap m 4096 0 0 4096 f33471 71 work text data BSS heap m 4096 0 0 4096 7296f2 7c work text data BSS heap m 4096 0 0 4096 c62c46 74 work text data BSS heap m 4096 0 0 4096 9c451c 7d work text data BSS heap m 4096 0 0 4096 4190c1 79 work text data BSS heap m 4096 0 0 4096 a72e25 1002 work text data BSS heap m 4076 0 0 4076 6916e9 7b work text data BSS heap m 3916 0 0 3916 5996d9 1001 work text data BSS heap m 3837 0 0 3837 cc2b4c 1003 work text data BSS heap m 2640 0 0 2640 990019 90000000 work shared library text m 1033 0 8 1052 c34343 70000000 work default shmat/mmap sm 12800 0 0 12800 730ef3 - clnt /dev/fslv00:197305 s 9919 0 - - 20002 0 work kernel segment m 532 487 7 538 c79247 - clnt /dev/fslv00:193648 s 5021 0 - - c40f44 - clnt /dev/fslv00:197307 s 3058 0 - - e60f66 - clnt /dev/fslv00:197320 s 2580 0 - - 396cbb - clnt /dev/fslv00:172631 s 2505 0 - - c89648 - clnt /dev/fslv00:193602 s 2438 0 - - 50005 9ffffffd work shared library sm 2251 0 0 2251 cc0e4c - clnt /dev/fslv00:197330 s 2158 0 - - b99639 - clnt /dev/fslv00:193657 s 2144 0 - - dc275c - work s 1831 343 0 1831 2b0aab - clnt /dev/fslv00:197268 s 1762 0 - - 9b001b 90020014 work shared library s 1633 0 0 1633 a30d23 - clnt /dev/fslv00:173466 s 1523 0 - - .....Full detailed information about process and memory usage.
How to read it:
There are 3 areas that we are interesting in are:
1) Processes that using small esid numbers (70, 71, 72 ... 7e) - that's java heap
2) Processes using 1001, 1002, 1003 - that's native memory
3) -clnt - that files caching
So if you want to know how much memory is used by java heap - you need to summarize all 7x processes and multiply if by 64KB (PSize = m)
Calculations for native memory:
a72e25 1002 work text data BSS heap m 4076 0 0 4076
5996d9 1001 work text data BSS heap m 3837 0 0 3837
cc2b4c 1003 work text data BSS heap m 2640 0 0 2640
(4076 + 3837 + 2640) * 64 KB = 675456 KB = 659 MB
nmon m t
Every day tool to check memory usage
-->m [memory usage] --> t (top processes) --> 4 (order in process size)│ Memory ────────────────────────────────────────────────────────────────────────│ │ Physical PageSpace | pages/sec In Out | FileSystemCache │ │% Used 96.2% 14.0% | to Paging Space 0.0 0.0 | (numperm) 52.8% │ │% Free 3.8% 86.0% | to File System 0.0 0.0 | Process 36.3% │ │MB Used 63028.8MB 861.4MB | Page Scans 0.0 | System 7.0% │ │MB Free 2507.2MB 5282.6MB | Page Cycles 0.0 | Free 3.8% │ │Total(MB) 65536.0MB 6144.0MB | Page Steals 0.0 | ------ │ │ | Page Faults 12.5 | Total 100.0% │ │------------------------------------------------------------ | numclient 52.8% │ │Min/Maxperm 1905MB( 3%) 57163MB( 90%) <--% of RAM | maxclient 90.0% │ │Min/Maxfree 960 1088 Total Virtual 70.0GB | User 86.1% │ │Min/Maxpgahead 2 8 Accessed Virtual 27.7GB 39.6%| Pinned 8.5% │ │ | lruable pages 1│ │ Top-Processes-(140) ─────Mode=4 [1=Basic 2=CPU 3=Perf 4=Size 5=I/O 6=Cmds]────│ │ PID %CPU Size Res Res Res Char RAM Paging Command │ │ Used KB Set Text Data I/O Use io other repage │ │ 4587574 0.8 663M 663M 128 663M 8 1% 0 0 0 java │ │ 7077924 0.7 650M 650M 128 650M 7 1% 0 0 0 java │ │15794204 1.2 538M 538M 128 537M 131 1% 0 0 0 java │ │ 5439520 0.7 518M 518M 128 518M 7 1% 0 0 0 java │ │12845156 0.7 459M 459M 128 459M 14 1% 0 0 0 java │ │12321022 0.7 396M 397M 128 396M 26 1% 0 0 0 java │ │ 9764870 0.4 249M 250M 128 249M 62 0% 0 0 0 java │ │ 7340098 0.0 63392 63428 72 63356 0 0% 0 0 0 java │ │ 7995524 0.0 59280 59332 120 59212 0 0% 0 0 0 java │
Size KB: program on disk size
ResSize: Resident Set Size - how big it is in memory (excluding the pages still in the file system (like code) and some parts on paging disks)
ResText: code pages of the Resident Set
ResData: data and stack pages of the Resident Set
What interested here is:
Physical
% Used 96.2%
% Free 3.8%
and split:FileSystemCache
(numperm) 52.8%
Process 36.4%
System 7.0%
Free 3.8%
-----
Total 100.0%
Over 50% of memory is used by file system cache.Additional:
topas -P This does not tell how much of the application is paged out but how much of the application memory is backed by paging space.(things in memory (working segment) should be backed by paging space by the actual size in memory of the process.)
svmon -Pt15 | perl -e 'while(<>){print if($.==2||$&&&!$s++);$.=0 if(/^-+$/)}' top 15 processes using the most memory
ps aux | head -1 ; ps aux | sort -rn +3 | head -20 top memory processes (the above is better)
ps -ef | grep -c LOCAL=NO shows the number of client connections (each connection takes up memory, so if it is high then...)
paging:
svmon -Pg -t 1 |grep Pid ; svmon -Pg -t 10 |grep "N" top 10 processes using the most paging space
svmon -P -O sortseg=pgsp shows paging space usage of processes
ipcs -bm - it will show our memory segments, who owns them and what their size (in bytes). It shows the maximum allocated size, that a memory segment can go to.