Introduction to cross-compiling and using Valgrind

Posted Jun 27, 2020 · 8 min read

Valgrind

Overview

Valgrind is a tool framework for building dynamic analysis tools. It comes with a set of tools, each of which performs certain debugging, analysis, or similar tasks to help improve the program. Valgrind's architecture is modular, so new tools can be easily created without disturbing existing structures. Many useful tools are provided as standard:

  1. Memcheck is a memory error detector. It helps you make your programs, especially those written in C and C++, more correct.
  2. Cachegrind is a cache and branch prediction analyzer. It can help you make your program run faster.
  3. Callgrind is a call-graph generating cache profiler. It has some overlap with Cachegrind, but also collects some information that Cachegrind does not.
  4. Helgrind is a thread error detector. It can help you make multi-threaded programs more correct.
  5. DRD is also a thread error detector. It is similar to Helgrind, but uses different analysis techniques, so you may find different problems.
  6. Massif is a heap analyzer. It helps reduce the memory used by the program.
  7. DHAT is another heap analyzer. It helps you understand issues of block lifetimes, block utilization, and layout inefficiencies.
  8. BBV is an experimental SimPoint basic block vector generator. It is useful for people engaged in computer architecture research and development.

Practice

Toolchain: gcc-9.1.0-2019.11-x86_64_arm-linux-gnueabihf

Download the code

git clone https://sourceware.org/git/valgrind.git

Compile

  1. ./autogen.sh
  2. Modify configure, changing the host CPU pattern armv7* to arm*, as follows:

image-20200618104839275

  3. ./configure --host=arm-linux-gnueabihf --prefix=$PWD/output
     make -j5 && make install

The resulting layout is:

image-20200618105735856
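Why the pattern change in step 2 is needed can be illustrated with a small sketch. With --host=arm-linux-gnueabihf, the host CPU that configure sees is plain arm, which a pattern written for armv7* does not match. The function arm_check below is a hypothetical stand-in for that check, not code taken from configure; it only shows how the shell case patterns behave.

```shell
# Hypothetical stand-in for the host CPU check in configure.
# $1 = glob pattern accepted for ARM, $2 = host CPU derived from --host.
arm_check() {
    pattern=$1
    host_cpu=$2
    case "$host_cpu" in
        $pattern) echo "ok ($host_cpu)" ;;   # unquoted, so it acts as a glob
        *)        echo "no ($host_cpu)" ;;
    esac
}

arm_check 'armv7*' arm      # prints: no (arm)
arm_check 'arm*'   arm      # prints: ok (arm)
arm_check 'arm*'   armv7l   # prints: ok (armv7l)
```

An alternative that avoids editing configure would be to pass a host triplet whose CPU part already starts with armv7, provided the toolchain binaries are named accordingly; that is an assumption and was not tried here.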

Run

Environment preparation

  1. Mount the shared directory on the board

    ifconfig eth0 up
    ifconfig eth0 xxx.xxx.xxx.xxx
    mount -t nfs xxx.xxx.xxx.xxx:/xxx/nfs /tmp -o nolock

  2. Copy the generated output/bin and output/lib to the corresponding location in the shared directory. The layout on the board is as follows:

image-20200618110311558

  3. Valgrind needs the unstripped versions of ld*.so, libc*.so, and libdl*.so. These can be found in the corresponding toolchain release package; for example, for 9.1.0 they are under gcc-9.1.0-2019.11-x86_64_arm-linux-gnueabihf/arm-linux-gnueabihf/libc/lib:

image-20200618111037810

Copy them to the corresponding location in the shared directory. The layout on the board is as follows:

image-20200618111226655

  4. You can specify the location of libc*.so and libdl*.so by setting LD_LIBRARY_PATH, but this has no effect on the dynamic loader ld*.so. The loader's location is determined neither by system configuration nor by environment variables, but by the ELF executable itself. A dynamically linked ELF executable contains a special section called .interp, which can be viewed with readelf:

    arm-linux-gnueabihf-readelf -p .interp $your_prog

The output is:

image-20200618122854706

How can the path of the dynamic loader be changed? A rather geeky approach is to rewrite the .interp section of the ELF executable directly with patchelf.

After building patchelf, the resulting layout is:

image-20200618135613076

Run patchelf to change the loader path recorded in the ELF executable:

~/patchelf/output/bin/patchelf --set-interpreter /tmp/gcc_9.1.0_lib/ld-2.30.so test

Check the .interp section to confirm:

image-20200618140215363

You can see that the modification has been successful.
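To make the same check scriptable, a sketch along the following lines may help. It assumes that readelf -l prints a "Requesting program interpreter" line, as binutils readelf does; the function name interp_of and the READELF variable are invented for this example.

```shell
# Print the loader path recorded in an ELF executable.
# READELF is a hypothetical override; it defaults to the cross readelf.
interp_of() {
    prog=$1
    if [ ! -f "$prog" ]; then
        echo "no such file: $prog" >&2
        return 1
    fi
    "${READELF:-arm-linux-gnueabihf-readelf}" -l "$prog" |
        sed -n 's/.*\[Requesting program interpreter: \(.*\)\].*/\1/p'
}
```

For the patched binary above, interp_of test would be expected to print /tmp/gcc_9.1.0_lib/ld-2.30.so.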

  5. Set the Valgrind lib path and the C library path:

    export VALGRIND_LIB=/tmp/valgrind/lib/valgrind
    export LD_LIBRARY_PATH=/tmp/gcc_9.1.0_lib:$LD_LIBRARY_PATH
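One detail worth noting: if LD_LIBRARY_PATH is empty beforehand, the form above leaves a trailing colon, and the loader treats an empty entry as the current directory. A slightly more careful variant might look like the sketch below; prepend_path is a helper name made up for this example.

```shell
# Prepend a directory to a colon-separated path list without leaving
# a trailing colon when the list is empty.
# $1 = directory to prepend, $2 = current value (may be empty)
prepend_path() {
    printf '%s\n' "$1${2:+:$2}"
}

export VALGRIND_LIB=/tmp/valgrind/lib/valgrind
LD_LIBRARY_PATH=$(prepend_path /tmp/gcc_9.1.0_lib "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
```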

Memcheck

Memcheck is a memory error detector. It can detect the following common problems in C and C++ programs:

  • Accessing memory that should not be accessed, such as overrunning heap blocks, overrunning the top of the stack, and accessing memory after it has been freed.
  • Using undefined values, that is, uninitialized values or values derived from other undefined values.
  • Incorrect freeing of heap memory, such as double-freeing heap blocks, or mismatched use of malloc/new/new[] versus free/delete/delete[].
  • Overlapping src and dst pointers in memcpy and related functions.
  • Passing a suspicious (possibly negative) value to the size parameter of a memory allocation function.
  • Memory leaks.

Usage:

./valgrind --tool=memcheck --leak-check=full [memcheck options] your-program [program options]
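On a board it is often handy to wrap the invocation so the report goes to a file and the exit status reflects whether errors were found. The sketch below is one possible wrapper; memcheck_run and the VALGRIND variable are names invented here, while --track-origins, --error-exitcode and --log-file (with %p expanding to the process ID) are standard Valgrind options.

```shell
# Path to the valgrind binary; override VALGRIND if it lives elsewhere.
VALGRIND=${VALGRIND:-./valgrind}

# Run a program under Memcheck, logging to memcheck.<pid>.log and
# exiting with status 1 if Memcheck reported any errors.
memcheck_run() {
    "$VALGRIND" --tool=memcheck --leak-check=full \
        --track-origins=yes \
        --error-exitcode=1 \
        --log-file=memcheck.%p.log \
        "$@"
}
```

Usage would then be, for example, memcheck_run ./your-program arg1 arg2.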

Cachegrind

Cachegrind simulates how the program interacts with the machine's cache hierarchy and (optionally) branch predictor. It simulates a machine with independent first-level instruction and data caches (I1 and D1), backed by a unified second-level cache (L2). This exactly matches the configuration of many modern machines. However, some modern machines have three or four levels of cache. For these machines (in the cases where Cachegrind can auto-detect the cache configuration), Cachegrind simulates the first-level and last-level caches. The reason for this choice is that the last-level cache has the most influence on runtime, because it masks accesses to main memory. In addition, first-level caches often have low associativity, so simulating them can detect cases where the code interacts badly with the cache (for example, traversing a matrix column-wise when the row length is a power of 2).

Usage:

./valgrind --tool=cachegrind [cachegrind options] your-program [program options]

Callgrind

Callgrind is a profiling tool that records the call history among functions in the program as a call graph. By default, the collected data consists of the number of instructions executed, their relationship to source lines, the caller/callee relationship between functions, and the number of such calls. Optionally, cache simulation and/or branch prediction (similar to Cachegrind) can produce further information about the runtime behavior of the application.

Usage:

./valgrind --tool=callgrind [callgrind options] your-program [program options]

This generates a file named callgrind.out.<pid>.

The file can be copied to the server and parsed with callgrind_annotate (located under valgrind/output/bin):

./callgrind_annotate [options] callgrind.out.<pid>
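Since the file name depends on the process ID, several runs leave several files behind. A small helper for picking the most recent one could look like this sketch; latest_output is a name invented here, and it assumes the output files sit in the current directory.

```shell
# Print the most recently modified file whose name starts with the
# given prefix followed by a dot (e.g. callgrind.out or massif.out).
# Prints nothing if no such file exists.
latest_output() {
    ls -t "$1".* 2>/dev/null | head -n 1
}
```

It could then be used as ./callgrind_annotate "$(latest_output callgrind.out)".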

Helgrind

Helgrind is a Valgrind tool used to detect synchronization errors in C, C++, and FORTRAN programs that use POSIX pthreads thread primitives.

The main abstractions in POSIX pthreads are: a set of threads sharing a common address space, thread creation, thread joining, thread exit, mutexes (locks), condition variables (inter-thread event notification), reader-writer locks, spinlocks, semaphores, and barriers.

Helgrind can detect three types of errors as follows:

  1. Misuse of the POSIX pthreads API.
  2. Potential deadlocks arising from lock ordering problems.
  3. Data races, that is, accessing memory without adequate locking or synchronization.

Usage:

./valgrind --tool=helgrind [helgrind options] your-program [program options]

DRD

DRD is a Valgrind tool for detecting errors in multi-threaded C and C++ programs. It works for any program that uses the POSIX pthreads threading primitives, or that uses threading concepts built on top of them.

Usage:

./valgrind --tool=drd [drd options] your-program [program options]

Massif

Massif is a heap analyzer. It measures the amount of heap memory used by the program. This includes both useful space and extra bytes allocated for alignment purposes. It can also measure the size of the program stack, although it does not do this by default.

Heap analysis can help you reduce the amount of memory used by the program. On modern computers with virtual memory, this provides the following benefits:

  • It can speed up the program: a smaller program interacts better with the machine's caches and avoids paging.
  • If the program uses a lot of memory, this will reduce the chance of running out of machine swap space.

In addition, some space leaks cannot be detected by traditional leak checkers (such as Memcheck). The memory is never actually lost, since a pointer to it still exists, but it is no longer used. Programs with such leaks can needlessly increase the amount of memory they use over time. Massif can help identify these leaks. Importantly, Massif tells you not only how much heap memory the program uses, but also gives very detailed information about which parts of the program are responsible for allocating it.

Usage:

./valgrind --tool=massif [massif options] your-program [program options]

This generates a file named massif.out.<pid>.

The file can be copied to the server and parsed with ms_print (located under valgrind/output/bin):

./ms_print [options] massif.out.<pid>

DHAT

DHAT is a tool for examining how a program uses its heap allocations. It tracks the allocated blocks and inspects every memory access to find which block, if any, it targets. It presents information about these blocks, organized by allocation point, such as sizes, lifetimes, numbers of reads and writes, and read and write patterns. This information can be used to identify allocation points with the following characteristics:

  • Potential process-lifetime leaks: blocks allocated at this point only accumulate and are freed only at the end of the run.
  • Excessive turnover: blocks allocated with extremely short lifetimes.
  • Useless or underused allocations: blocks that are allocated but not completely filled in, or that are filled in but not subsequently read.
  • Blocks with inefficient layout: areas that are never accessed, or hot fields scattered throughout the block.

Usage:

./valgrind --tool=dhat [dhat options] your-program [program options]

After execution, a file named dhat.out.<pid> is generated. This is actually a JSON file. Open valgrind/libexec/valgrind/dh_view.html in a browser on the PC, click Load, and select the generated dhat.out.<pid> file, as follows:

image-20200618180234124

If the following error occurs:

image-20200618180557992

It means the generated JSON file is malformed. You can paste the file's contents into an online JSON validator to locate the problem:

image-20200618180848546

Fix the file accordingly, then click Load again to see the corresponding content:

image-20200618181324967
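Instead of a website, the file can also be checked locally. The sketch below assumes python3 is installed on the PC and uses its standard json.tool module, which reports the line and column of the first syntax error; check_json is a helper name invented for this example.

```shell
# Validate a JSON file with Python's built-in json.tool module.
# Parser errors (with line and column) go to stderr.
check_json() {
    if python3 -m json.tool "$1" > /dev/null; then
        echo "valid JSON: $1"
    else
        echo "invalid JSON: $1" >&2
        return 1
    fi
}
```

Running check_json on the generated dhat.out.<pid> file before loading it in dh_view.html should point to the spot that needs fixing.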

