Magic with memusage library in linux

Today I started writing a tool for finding memory leaks in my programs. I began by defining my own versions of the memory functions malloc, realloc and free, with exactly the same signatures as the originals. Inside these functions I explicitly load the original library, call the original functions, and keep a count of how many times each one is called. So, to find memory leaks in a program called test, instead of running it as $ ./test, I run it as $ LD_PRELOAD=$PWD/libmemleaks.so ./test.
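The bookkeeping part of this idea can be sketched separately from the LD_PRELOAD mechanics. The names counted_malloc, counted_free, outstanding_blocks and report_counts are hypothetical and exist only for illustration. In a real preload library the wrappers would be named malloc and free and would look up the original functions dynamically (on glibc this is typically done with dlsym and RTLD_NEXT), which this sketch deliberately leaves out.

```c
#include <stdio.h>
#include <stdlib.h>

/* Call counters (hypothetical names, not from the original tool). */
static unsigned long malloc_calls;
static unsigned long free_calls;

/* Forwards to the real malloc and counts the call. */
void *counted_malloc(size_t size)
{
    malloc_calls++;
    return malloc(size);
}

/* Forwards to the real free; free(NULL) releases nothing, so it is not counted. */
void counted_free(void *ptr)
{
    if (ptr != NULL)
        free_calls++;
    free(ptr);
}

/* Blocks allocated but not yet released; non-zero at exit hints at a leak. */
long outstanding_blocks(void)
{
    return (long)malloc_calls - (long)free_calls;
}

/* Prints a plain (not colorful) summary of the counters. */
void report_counts(FILE *out)
{
    fprintf(out, "malloc: %lu  free: %lu  outstanding: %ld\n",
            malloc_calls, free_calls, outstanding_blocks());
}
```

Calling report_counts(stderr) just before the program exits gives a rough leak indication: any positive outstanding count means some block was never freed.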

By mistake, I was explicitly loading /lib/libmemusage.so instead of libc.so. At first I got many compilation errors in my libmemleaks.so code, so I removed a lot of code and it then compiled successfully. But when I finally ran it, I got a table showing the heap/stack sizes and the number of times malloc/calloc/free were called. The table is shown below.

[Screenshot: memory_report]

Surprise!! I was not expecting this multi-colored table, and I had not written any code to print it. I searched on Yahoo/Google and did not find any references to this behaviour [no documentation, a problem with open source??]. Then I used Google Code Search to look for “Memory usage summary” and got a link to glibc-2.2.5/malloc/memusage.c. After reading that file, I understood that the memory usage report is written from the _fini function. _fini is the function in a shared library that is called when the system unloads the library.

After that, I downloaded the glibc 2.2.5 source code and started looking for clues. In the same directory, glibc-2.2.5/malloc, I found a shell script, memusage.sh. I copied it to my linux machine, modified one line and executed it [sh memusage.sh ./test]. I got the same multi-colored table. That is how I came to know that libmemusage.so is a library written for collecting memory statistics. Internally, memusage.sh runs the program as LD_PRELOAD=/lib/libmemusage.so <prog.name>. I had planned to write my memory leak checking tool on the same concept, and I have now completed it, although my program is not as colorful as memusage.sh. Using memusage.sh, we can get memory statistics for any program [including java programs]. Run sh memusage.sh --help for command help.

Normally, linux distributions don't ship memusage.sh. You can see the shell script here, and you can download my copy from here [I have modified it in 2 places, replacing @SLIBDIR@ and @BINDIR@ with /lib and /bin. To get graphs, you need the memusagestat executable in /bin]. Finally, I am very happy that I found this utility after researching this for 3 1/2 days.

long and long long

I am sure that in java the size of long is 64 bits and there is no long long datatype. So I used to think that in C/C++ too long is 64 bits and long long is bigger than that [128 bits]. Recently I began to doubt this, so I wrote a C program and compiled it on many systems. The size of long long was always 64 bits, but the size of long varied with the compiler flags. On solaris, a plain cc longsize.c uses the ILP32 data model by default, and long is 32 bits. With cc -xarch=v9 longsize.c, the code is compiled for 64-bit SPARC machines and the LP64 data model is used. In the ILP32 model, int, long and pointers are all 32 bits. In the LP64 model, long and pointers are 64 bits while int stays 32 bits. So, independent of machine and data model, if we want a 64-bit data type, we have to use long long [the C standard guarantees that it is at least 64 bits wide].

Then I read in an article that long long was introduced fairly recently [it became part of standard C with C99] to support very big numbers. We use %ld in printf to print a long and %lld to print a long long.
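As a small illustration of the two format specifiers, here is a sketch with two hypothetical helpers, format_long and format_long_long, that write a value into a caller-supplied buffer with snprintf. The helper names are made up for this example.

```c
#include <stdio.h>

/* Formats a long with %ld; returns what snprintf returns. */
int format_long(char *buf, size_t len, long value)
{
    return snprintf(buf, len, "%ld", value);
}

/* Formats a long long with %lld; returns what snprintf returns. */
int format_long_long(char *buf, size_t len, long long value)
{
    return snprintf(buf, len, "%lld", value);
}
```

Passing a long long to %ld is undefined behaviour on a model where the two types differ in size, which is why the matching specifier matters.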

http://developers.sun.com/prodtech/cc/articles/ILP32toLP64Issues.html

Update: I recently learnt that on windows the size of long is the same in both 32-bit and 64-bit builds: 32 bits. 64-bit builds follow the LLP64 data model [long long and pointers are 64 bits].
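A check along these lines can be sketched as follows. This is not my original longsize.c; the function names (bits_of_long, data_model_name and so on) are hypothetical, and the classification only covers the three data models mentioned in this post.

```c
#include <limits.h>
#include <stdio.h>

/* Width in bits of each type for the current compiler and flags. */
int bits_of_int(void)       { return (int)(sizeof(int) * CHAR_BIT); }
int bits_of_long(void)      { return (int)(sizeof(long) * CHAR_BIT); }
int bits_of_long_long(void) { return (int)(sizeof(long long) * CHAR_BIT); }
int bits_of_pointer(void)   { return (int)(sizeof(void *) * CHAR_BIT); }

/* Names the data model, or "other" if it is none of the three discussed. */
const char *data_model_name(void)
{
    int i = bits_of_int();
    int l = bits_of_long();
    int p = bits_of_pointer();

    if (i == 32 && l == 32 && p == 32) return "ILP32";
    if (i == 32 && l == 64 && p == 64) return "LP64";
    if (i == 32 && l == 32 && p == 64) return "LLP64";
    return "other";
}

/* Prints one summary line with all four widths and the detected model. */
void print_sizes(void)
{
    printf("int=%d long=%d long long=%d pointer=%d bits (%s)\n",
           bits_of_int(), bits_of_long(), bits_of_long_long(),
           bits_of_pointer(), data_model_name());
}
```

Building the same file with and without the 64-bit compiler flag and calling print_sizes() from main should show long changing size while long long stays the same.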

Multi line greps

In linux/solaris, we can use fgrep/grep for pattern matching in files. One limitation of these commands is that they match one line at a time; they do not search for a pattern that spans multiple lines. Then I found an open source tool, pcregrep [also available as an SDK]. pcregrep provides multi-line grep through its -M (--multiline) option. I found this command on many recent linux and solaris machines, but in older versions [maybe the stable ones] that do not have the multi-line functionality. So I downloaded the latest source and built it.

One use case: I have a list of function names and need to find their declarations in header files. A declaration can be spread over many lines, so this is where I used pcregrep.
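To show why matching across lines matters for this use case, here is a rough sketch that does not use pcregrep at all. It uses the POSIX regex API from <regex.h> on a buffer holding the whole header text. The helper name find_declaration is hypothetical, the pattern is a simplification (it would also match a call such as foo(1, 2);), and it assumes the function name contains no regex metacharacters.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/*
 * Copies the first "name(...);" found in text into out.
 * Returns 1 on a match that fits in out, 0 otherwise.
 * Without REG_NEWLINE, a bracket expression like [^;{] also matches
 * newlines, so the parameter list may span several lines.
 */
int find_declaration(const char *text, const char *name,
                     char *out, size_t outlen)
{
    char pattern[256];
    regex_t re;
    regmatch_t m;
    int found = 0;

    /* name, optional space, "(", anything but ';' or '{', ")", ";" */
    int n = snprintf(pattern, sizeof pattern,
                     "%s[[:space:]]*\\([^;{]*\\)[[:space:]]*;", name);
    if (n < 0 || (size_t)n >= sizeof pattern)
        return 0;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;

    if (regexec(&re, text, 1, &m, 0) == 0) {
        size_t len = (size_t)(m.rm_eo - m.rm_so);
        if (len < outlen) {
            memcpy(out, text + m.rm_so, len);
            out[len] = '\0';
            found = 1;
        }
    }
    regfree(&re);
    return found;
}
```

A line-at-a-time grep would see only the first line of such a declaration, which is exactly the limitation described above.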

Problems with linking of shared libraries in AIX

Normally, on Solaris, Linux and other common platforms, shared libraries have a .so/.sl suffix and static libraries have a .a suffix. On AIX, static libraries have the .a suffix, but shared libraries can have either the .so or the .a suffix.

When we try to compile a C file that uses a shared library with the .so suffix, it won't succeed by default; the build fails with an error. We additionally have to pass the “-Wl,-brtl” flag to the compiler. “-Wl” says that what follows is a flag for the linker, so “-brtl” is passed on internally to the linker [ld]. “-brtl” tells the linker to treat files with the .so suffix as shared libraries too. There is no need to pass this flag when the shared library has the .a suffix. This type of linking is load-time linking.

When we access a shared library at runtime using the dlopen & dlsym calls, it is called runtime linking. In this case, we won't get any compilation errors. If the shared library has the .a suffix, we won't get any errors at runtime either. But if the shared library has the .so suffix, we get a segmentation fault at runtime. The confusing thing is that the dlopen call succeeds, but the program exits with a segmentation fault at the dlsym call. If we give the “-Wl,-brtl” flag to the compiler at compilation time, runtime linking works fine.

Using opaque data types in C

Suppose we have to write an SDK and don't want others to know the details of the function parameters; we can use opaque data types in C. In most cases you may not need them. But if you use them, it will be much harder for hackers/crackers to know what is stored in the datatype. I have made a sample library and a program which uses it. You can download them here. I have written this code just to demonstrate; it is not optimized.

test1.h and test1.c are used to generate the libtest1.so file. test2.c is used to test the library. We give only test1.h and libtest1.so to our customers. Let's assume test2.c is the program written by the customer. From test1.h alone, the customer does not know what “opaque_type1” is, but can still use it in test2.c and get the result.

If we use C structures to make opaque data types, we can't use them directly, because the compiler does not know their size and so cannot allocate them [in my code, I have implemented the opaque data types as C structures that are only declared, not defined]. We always have to use them through pointers.
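The downloadable files are not reproduced here, so the following is only a minimal sketch of the technique under my own assumptions. The type name opaque_type1 comes from the text above; the functions opaque_create, opaque_get and opaque_destroy and the field secret are hypothetical. In a real SDK the first part would live in the header and the second part only in the library source.

```c
#include <stdlib.h>

/* ---- What a header like test1.h might expose (hypothetical API) ---- */
typedef struct opaque_type1 opaque_type1;   /* declared, not defined */

opaque_type1 *opaque_create(int value);
int opaque_get(const opaque_type1 *obj);
void opaque_destroy(opaque_type1 *obj);

/* ---- What stays private inside the library source ---- */
struct opaque_type1 {
    int secret;   /* illustrative field; callers never see this layout */
};

/* Allocates an object; only the library knows its size. */
opaque_type1 *opaque_create(int value)
{
    opaque_type1 *obj = malloc(sizeof *obj);
    if (obj != NULL)
        obj->secret = value;
    return obj;
}

/* Reads the hidden field on behalf of the caller. */
int opaque_get(const opaque_type1 *obj)
{
    return obj->secret;
}

/* Releases an object created by opaque_create. */
void opaque_destroy(opaque_type1 *obj)
{
    free(obj);
}
```

Customer code that sees only the declarations can hold an opaque_type1 pointer and pass it around, but writing sizeof(opaque_type1) or declaring a variable of that type fails to compile, which is the point of the last paragraph.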

Version scripts & Function aliasing

On windows, there is a problem called “DLL HELL” which has been around for many years. I heard that Vista is going to provide a complete solution to it. Unix people thought about this problem long ago and provided a solution through “version scripts” [first implemented in Solaris 2.5 and later also implemented in the GNU linker on linux]. Through version scripts, we can easily maintain versions inside shared libraries. When we link an application against a shared library that has versioned symbols, the application itself knows which version of each symbol it requires, and it also knows which version nodes it needs from each shared library it is linked against. Thus, at runtime, the dynamic loader can make a quick check that the libraries you have linked against do in fact supply all of the version nodes the application needs to resolve all of its dynamic symbols.

A simple version script looks like the following:

	VERS_1.1 {
		global:
			foo1;
		local:
			old*;
			original*;
			new*;
	};
	VERS_1.2 {
		foo2;
	} VERS_1.1;

Version scripts are explained in more detail in the reference links at the end of this post.

GCC has added an extension to this and provides an excellent feature for function aliasing. We can use it by adding assembler directives like the following to the C code.

		__asm__(".symver old_foo,foo@VERS_1.1");
		__asm__(".symver old_foo1,foo@VERS_1.2");
		__asm__(".symver new_foo,foo@@VERS_2.0");

So, for executables linked against VERS_1.2, a call to foo is redirected internally to old_foo1. Similarly, for executables linked against VERS_1.1, a call to foo is redirected internally to old_foo. The @@ in the last line marks new_foo as the default version of foo, which is what newly linked executables get.

You can find more description about these at

http://people.freebsd.org/~deischen/symver/library_versioning.txt

www.redhat.com/docs/manuals/enterprise/RHEL-4-Manual/gnu-linker/version.html

http://snipurl.com/wavf   [This original URL is too long.. so snipped it]

Make a shared library which can be executable in linux

Great day!! I finally learnt how to create a file which can act as both a shared library and an executable. You might not have noticed this on your unix machines. On linux, execute /lib/libc.so.6 as a command and it prints all its version information. I have been trying to get the same feature into my own shared libraries. I looked at the glibc source code and learnt that we have to use the entry option [-e] of the linker. With that knowledge, I wrote a small program and tried for a complete working day, with every attempt ending in an illegal instruction or a segmentation fault. Then I sent a mail to the gcc mailing list and got the answers [1, 2, 3].

I am attaching a simple 5-line C program here. The commands I used are:

    # gcc -g -W -Wall -fPIC -o libtest.so -shared -Wl,-e,test1 test.c
    # ./libtest.so
    Hi dp! you finally made it
    #

In our C program, we have to declare a char array placed in the .interp section to hold the absolute path of the dynamic linker. In our entry function test1, after running our logic, we should call exit() instead of returning. I don't know the reasoning behind these 2 points.

We can have this feature on Solaris too, but I have not tried or tested it. If you are interested, you can look at this. Read the whole page and see the comment named “solaris 9 compile”.

Using TUI commands when debugging with GDB

Today, while using GDB, I typed refresh by mistake. Immediately my window split in two and the upper half was loaded with the source code I was debugging. It ended up looking like the Visual Studio debugger [source in the top frame and commands in the bottom frame]. I like this feature very much, but I did not know that it was available in GDB.

Then I did some research and came to know about TUI commands. TUI stands for Text User Interface. We can use TUI commands like layout, info and refresh in GDB. We can see the source code of the program being debugged by executing the “layout src” command in GDB. Similarly, we can see the assembly instructions and register values [“layout asm” and “layout regs”]. You can read more about TUI commands at this link. I have attached a screenshot of my GDB session, which gives an overview of how it looks. In the src layout, we can see our breakpoints [marked with B+ on the left side] and the line our program is currently executing [marked with > on the left side and highlighted].
