Reverse Analysis of Dingxiang (Top Image) Hardening

Posted May 24, 2020 · 10 min read

This article merges two earlier articles, "Analysis of Dingxiang Hardening and a Partial Restoration" and "Reverse Analysis of Dingxiang Enterprise Edition Hardening". Together they cover the free version and the Enterprise Edition of Dingxiang hardening.

"Analysis of Dingxiang Hardening and a Partial Restoration"

One: First, dump the .so that holds the core functions

Reference: see the featured Kanxue post "Reverse analysis of an Android DEX VMP hardening".

As seen above, the memory addresses are not contiguous, and the code region disappears once execution reaches .init_array. Some unpacked data may sit in the gaps, or IDA may simply not have read it. Setting that aside, move on to the application-level functions.

At the application-level function onCreate, dump the library from memory using the addresses observed at the .init stage, then inspect the dump in 010 Editor.

After fixing the dump and opening it in IDA, the onCreate and test methods are clearly visible.
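Since the mapped regions are not contiguous, it helps to work out the dump base, total size, and gaps before dumping. The following is a minimal sketch that does this from text in the /proc/<pid>/maps format. The library name libexample.so and the sample addresses are made up for illustration; reading the actual bytes from the process is not shown.

```python
import re

# One line of /proc/<pid>/maps: start-end perms offset dev inode [path]
MAPS_LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s+\S+\s+\S+\s+\S+\s*(.*)$"
)

def module_ranges(maps_text, name):
    """Return sorted (start, end, perms) tuples for mappings whose path ends with name."""
    ranges = []
    for line in maps_text.splitlines():
        m = MAPS_LINE.match(line.strip())
        if m and m.group(4).endswith(name):
            ranges.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return sorted(ranges)

def dump_span(ranges):
    """Return (base, size, gaps) covering every range of the module."""
    base = ranges[0][0]
    end = max(r[1] for r in ranges)
    # A gap is any hole between the end of one mapping and the start of the next.
    gaps = [(a[1], b[0]) for a, b in zip(ranges, ranges[1:]) if b[0] > a[1]]
    return base, end - base, gaps

# Hypothetical sample; the path and addresses are not from the analyzed app.
SAMPLE_MAPS = """\
75a10000-75a14000 r-xp 00000000 b3:1c 1234 /data/app-lib/demo/libexample.so
75a16000-75a17000 r--p 00004000 b3:1c 1234 /data/app-lib/demo/libexample.so
75a17000-75a18000 rw-p 00005000 b3:1c 1234 /data/app-lib/demo/libexample.so
75b00000-75b01000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    base, size, gaps = dump_span(module_ranges(SAMPLE_MAPS, "libexample.so"))
    print(hex(base), hex(size), [(hex(a), hex(b)) for a, b in gaps])
```

The gaps list marks the holes that a debugger may fail to read; they can be zero-filled in the dump so that offsets in the file still line up with the in-memory layout.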

Two: Continue the analysis and try a manual restoration

  1. NewLocalRef is sometimes used to make sure a utility function returns a local reference. Keep that in mind, then read on.

The sub_206C function handles exceptions thrown through JNI; its body is shown in the figure below:

Next, look at the sub_2170 function:

The class object is obtained through FindClass, and the method ID of onCreate is then obtained through GetMethodID. Next, sub_13C8 calls CallNonvirtualVoidMethod to invoke the method whose ID was just obtained.

This is the process of a JNI reflection call. The corresponding Smali code is as follows:

Finally, the local reference is released once sub_1980 has executed.

So the onCreate function is rebuilt mainly on the principle of JNI reflection calls into the Android SDK.

Through the above analysis, we can speculate that the corresponding smali call process is:

invoke-super {parameters still to be determined}, Landroid/support/v7/app/AppCompatActivity;->onCreate(Landroid/os/Bundle;)V

  2. Next, setContentView is called the same way as onCreate above, but static analysis alone does not reveal its argument.

Dynamic debugging is needed to determine it.

What can be determined is that this block executes the following smali code:

const v2, 0x7f040019

invoke-virtual {p0, v2}, Lcom/example/zbb/dingxiangdemo01/MainActivity;->setContentView(I)V

  3. Continue reading down.

The principle is the same as above, but here the returned value is assigned to v6, so the call must be followed by a step that moves the result into a register. Dynamic debugging shows the process:

Therefore, the corresponding smali code can be obtained as:

const vX, 0x7f0c0051

invoke-virtual {p0, vX}, Lcom/example/zbb/dingxiangdemo01/MainActivity;->findViewById(I)Landroid/view/View;

move-result-object vX

  4. Further analysis shows the following.

These two functions exist because the object from the previous step comes from findViewById, so the returned reference naturally has to be cast to the concrete type. The corresponding smali is therefore:

check-cast v1, Landroid/widget/EditText;

  5. The next part is similar to the above and is not described again.

The smali code can be roughly inferred as:

const v2, 0x7f0c0052

invoke-virtual {p0, v2}, Lcom/example/zbb/dingxiangdemo01/MainActivity;->findViewById(I)Landroid/view/View;

move-result-object v0

check-cast v0, Landroid/widget/Button;

  6. Here a new class appears:

new-instance v2, Lcom/example/zbb/dingxiangdemo01/MainActivity$1;

invoke-direct {v2, p0, v1}, Lcom/example/zbb/dingxiangdemo01/MainActivity$1;-><init>(Lcom/example/zbb/dingxiangdemo01/MainActivity;Landroid/widget/EditText;)V

invoke-virtual {v0, v2}, Landroid/widget/Button;->setOnClickListener(Landroid/view/View$OnClickListener;)V

Three: Pure java2C

For the test I wrote a pure Java method. Dingxiang turned it into a C function; the figures below show the original Java, the assembly, and the pseudo-C:

I have no good way to restore this at present. One can only read the assembly, understand the pseudo-C, and write the corresponding Java by hand. I think this approach is hard to carry out in practice.

Four: Manual or automatic restoration

The analysis above is a rough manual restoration. It is tedious and yields little for the effort, and it still leaves the restoration of MainActivity$1 to be done. This was only a simple demo; restoring a mature APK this way would certainly not work, and if the .so is obfuscated as well, the workload grows immediately. As for automatic restoration, I have not yet worked out how to write a good custom processing script.
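As a starting point for such a script, the mapping used in the manual steps above (JNI call kind to smali instruction) can be written down as a small table-driven function. This is only a sketch under assumptions: the input record format is hypothetical, a JNI trace does not contain register names (they are supplied by the analyst here), and CallObjectMethod could equally correspond to invoke-interface in other code.

```python
def to_smali(call):
    """Translate one recorded JNI call (hypothetical dict format) into smali lines."""
    cls = "L" + call["cls"] + ";"
    target = cls + "->" + call["name"] + call["sig"]
    regs = list(call["regs"])
    kind = call["jni"]
    if kind == "CallNonvirtualVoidMethod":
        # Non-virtual call on the superclass method, as seen for onCreate.
        return ["invoke-super {%s}, %s" % (", ".join(regs), target)]
    if kind == "CallVoidMethod":
        return ["invoke-virtual {%s}, %s" % (", ".join(regs), target)]
    if kind == "CallObjectMethod":
        # An object result has to be moved into a register afterwards.
        return [
            "invoke-virtual {%s}, %s" % (", ".join(regs), target),
            "move-result-object %s" % call["dst"],
        ]
    if kind == "NewObject":
        # Allocation plus constructor call; the new object is the first argument.
        return [
            "new-instance %s, %s" % (call["dst"], cls),
            "invoke-direct {%s}, %s" % (", ".join([call["dst"]] + regs), target),
        ]
    raise ValueError("unsupported JNI call kind: " + kind)

if __name__ == "__main__":
    trace = [
        {"jni": "CallVoidMethod",
         "cls": "com/example/zbb/dingxiangdemo01/MainActivity",
         "name": "setContentView", "sig": "(I)V", "regs": ["p0", "v2"]},
        {"jni": "CallObjectMethod",
         "cls": "com/example/zbb/dingxiangdemo01/MainActivity",
         "name": "findViewById", "sig": "(I)Landroid/view/View;",
         "regs": ["p0", "v2"], "dst": "v0"},
    ]
    for record in trace:
        print("\n".join(to_smali(record)))
```

Constants (const), casts (check-cast), and control flow are not recoverable from JNI calls alone, so a script like this only covers the reflective-call part of a method.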

Five: Summary

I currently regard this kind of protection as fairly ideal; its effect should be better than the dex virtual machine approach represented by 360. First, all of the smali is translated into C and compiled, so in many cases the original semantics are thoroughly transformed. An attacker can only convert the code back to smali after understanding the ARM code, and understanding ARM assembly is the harder part. The protection strategy is therefore quite valuable.

For the protector, once smali has been converted to C there is more that can be done: obfuscate the dex before compilation, pack and encrypt the protected .so, or apply white-box transformations to the C code, which I find impressive. VMP protection for .so files is currently the most popular option. So there is a lot of room to work with.

Six: Speculation (YY)

As the hardener, how would you protect an APK?

First decompile it to get the Smali code. Converting Smali to C by plain translation is probably not realistic.

The key is to understand how the Dalvik or ART virtual machine executes these instructions, and how to simulate the execution of Smali in C. This needs further thought and analysis.

"Reverse Analysis of Dingxiang Enterprise Edition Hardening"

First of all, how does Dingxiang itself describe this virtualized source-code protection?

Combined with the analysis of the free version, the hardening process is probably as follows:

  1. First, as in the free version, the corresponding methods in the dex are converted to cpp files using JNI reflection (the Enterprise Edition converts more methods than the free version; the inner class MainActivity$1 is not converted in the free version);

  2. Then the code is compiled with a proprietary toolchain, most likely a custom compiler based on LLVM;

  3. Virtualization protection is applied: the corresponding VMdata and Handler interpreter are generated, and at run time a dispatcher reads the VMdata to drive the interpretation.

Analysis:

1. Unpacking the .so:

From the earlier analysis of the free version, we know that Dingxiang protects its .so with a modified UPX packer.

Once the linker has executed the first init function, which is the UPX loader, the dump can begin.

Set a breakpoint at offset 274E:

Dump the module according to its base address and size, then fix the dump.

After a simple repair of the LOAD and DYNAMIC segments, basically all functions can be seen statically, as the figure shows:
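One common way to do this kind of repair on a memory dump is to rewrite the program headers so that file offsets equal virtual addresses and file sizes equal memory sizes. The sketch below does that for PT_LOAD and PT_DYNAMIC entries. It assumes a 32-bit little-endian ELF dumped from its load base with the first segment at virtual address 0; it is not necessarily the exact fix applied in this analysis, and section headers and relocations are left untouched.

```python
import struct

PT_LOAD, PT_DYNAMIC = 1, 2
# Elf32_Phdr: p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags, p_align
PHDR = struct.Struct("<8I")

def fix_dumped_phdrs(image):
    """Return (patched_bytes, [(p_type, p_vaddr, p_memsz), ...]) for a dumped ELF32 image."""
    if image[:4] != b"\x7fELF" or image[4] != 1 or image[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF image")
    (e_phoff,) = struct.unpack_from("<I", image, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 42)
    out = bytearray(image)
    touched = []
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        (p_type, p_offset, p_vaddr, p_paddr,
         p_filesz, p_memsz, p_flags, p_align) = PHDR.unpack_from(out, off)
        if p_type in (PT_LOAD, PT_DYNAMIC):
            # In a memory dump, bytes live at their virtual address, so the
            # file offset becomes p_vaddr and the file size becomes p_memsz.
            PHDR.pack_into(out, off, p_type, p_vaddr, p_vaddr, p_paddr,
                           p_memsz, p_memsz, p_flags, p_align)
            touched.append((p_type, p_vaddr, p_memsz))
    return bytes(out), touched
```

A typical use would be to read the dumped file into bytes, call fix_dumped_phdrs, and write the patched bytes back out before loading the result in IDA.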

2. Direct analysis:

Static analysis:

Two core issues need to be understood: the virtual instructions of the core code, and the virtualized secure runtime environment.

OK, let's get started! Opening the library in IDA shows that JNI_OnLoad, onCreate, wanwan01, wanwan02, wanwan0, Testloop and the other functions all follow basically the same pattern:

So it is reasonable to guess that the functions they call are the key functions that make up the virtual runtime environment, and they are the focus of this analysis.

Entering the sub_43B0 function shows the following:

A quick analysis shows:

The F5 pseudocode reveals a control-flow-flattened structure. There are about 250 cases in total, but only a little over 50 are displayed; about 200 are not shown.

During hardening, the cpp file produced from the dex (or the intermediate binary mentioned above) is presumably parsed, and about 50 "Handlers" are generated according to its characteristics (there are 253 Handlers in total, of which only about 50 are used here). At run time the interpreter reads a virtual instruction on each step and then calls the corresponding Handler to carry it out.
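The fetch-and-dispatch pattern described here can be illustrated with a toy interpreter. This is a generic sketch of a handler-table VM, not a reconstruction of Dingxiang's: the opcode numbers, handler names, and VMdata below are all invented, and the junk handlers only stand in for cases that carry no semantics.

```python
def run(vmdata, handlers):
    """Fetch one opcode from vmdata, call its handler, and repeat until the data ends."""
    state = {"pc": 0, "stack": [], "trace": []}
    while state["pc"] < len(vmdata):
        opcode = vmdata[state["pc"]]
        state["pc"] += 1
        state["trace"].append(opcode)   # record which case was taken
        handlers[opcode](state, vmdata)
    return state

def h_push(state, vmdata):
    # Operand follows the opcode in the data stream.
    state["stack"].append(vmdata[state["pc"]])
    state["pc"] += 1

def h_add(state, vmdata):
    b = state["stack"].pop()
    a = state["stack"].pop()
    state["stack"].append(a + b)

def h_junk(state, vmdata):
    # No effect: models a handler that exists only to pad the switch.
    pass

def h_halt(state, vmdata):
    state["pc"] = len(vmdata)

# Invented opcode numbers; 7 and 9 share the same do-nothing handler.
HANDLERS = {0: h_halt, 1: h_push, 2: h_add, 7: h_junk, 9: h_junk}
VMDATA = [1, 2, 7, 1, 40, 9, 2, 0]

if __name__ == "__main__":
    final = run(VMDATA, HANDLERS)
    print(final["stack"], final["trace"])
```

Recording the trace of opcodes, as done here, is the same idea as logging which case the real dispatcher enters on each iteration: the frequently hit cases stand out and the junk ones can be filtered.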

Dynamic debugging analysis:

The figure above shows the process of fetching a virtual instruction. The figure below shows the return path taken after each call finishes.

What follows appears to maintain an address table:

During debugging, cases 94, 249, 107, 222, and 197 turned out to have identical contents; they may be useless junk instructions.
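Spotting identical cases by eye does not scale to about 250 handlers, so one option is to hash each handler body and group the matches. The sketch below assumes the handler bytes have already been extracted into a dict keyed by case number; the byte strings in the sample are placeholders, not bytes from the analyzed library.

```python
import hashlib
from collections import defaultdict

def group_identical(handler_bytes):
    """Return sorted groups of case numbers whose handler bodies are byte-identical."""
    groups = defaultdict(list)
    for case, body in handler_bytes.items():
        groups[hashlib.sha256(body).hexdigest()].append(case)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

# Placeholder bodies for illustration only.
SAMPLE = {
    94: b"\x00\xbf\x70\x47",
    249: b"\x00\xbf\x70\x47",
    107: b"\x00\xbf\x70\x47",
    116: b"\x10\xb5\x04\x46",
    210: b"\x2d\xe9\xf0\x41",
}

if __name__ == "__main__":
    print(group_identical(SAMPLE))
```

In real code, handlers that do the same thing may still differ in relative branch targets, so some normalization would be needed before hashing; exact byte equality is only the simplest case.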

Debugging also shows a large number of repeated virtual instructions whose Handlers have no specific semantics, as shown below:

So, changing approach, start from the JNI system functions instead:

3. Indirect analysis:

The new approach: hook JNI interface functions such as FindClass and GetMethodID, or locate the addresses of these functions and trace back up the call chain to find the Handler and the corresponding virtual instructions.

Based on the earlier analysis of the free version, only the two key functions FindClass and GetMethodID are considered here.

The address of FindClass is 41520778 and the address of GetMethodID is 4151FC58; these addresses do not change as long as the phone is not restarted.

Then continue the analysis:

Continuing with F9, execution stopped in case 116 every time; only then did I realize that this branch is where the dispatcher does its real work.

This offset holds strings for various class and method names.

Continuing with F9 again shows that the ExceptionCheck system function is executed next.

This process does not execute the sub_2170 function seen in the free version, as shown in the following figure:

For reflection-based execution, once one of these system functions is understood, the overall logic is essentially clear and no further analysis is needed. In essence, the case 116 branch runs through the following sub_XX functions again:

The above used onCreate, a function that calls into the Android SDK, as the main example. Next is Testloop, a method that uses the Java SDK:

The analysis shows that case 116 is again the Handler involved. With it, essentially all of the reflection-based flow and framework can be reversed, so it is not demonstrated further here.

In the previous unprotected app the core branch was 210, and the corresponding "virtual instructions" were different as well. This may be what Dingxiang means when it says the virtual instructions differ from device to device.

This shows that within one app the core branch stays the same, while across different apps both the "virtual instructions" and the core branch differ.

That essentially covers the analysis of reflection calls into the Android SDK or the Java SDK. In practice, you can hook the JNI interface functions and log their calls; with that log and the analysis above, the overall flow of reflective execution becomes easier to understand, and a manual restoration can then be attempted.

Summary:

  1. Compile-time virtualization solves the platform compatibility problem neatly, unlike instruction-based virtualization, which requires a lot of adaptation.

  2. The logic is hidden, and each hardened app gets its own "virtual instructions" and its own key case branch.

  3. Adding control-flow obfuscation and junk branches to the generated cpp makes direct analysis more difficult.

Dex protection:

The Java layer is first translated by java2cpp and then placed under virtual machine protection (cpp2vmp). Because java2cpp has to go through JNI interfaces, the key points of the core logic can be found indirectly through them.

Cpp protection:

For the Cpp layer, DX provides its own private NDK toolchain, and strong hardening is applied directly at compile time. The indirect approach above does not apply here; direct analysis is required.

Finally:

I analyzed this myself a few years ago, though not completely. Many other things came up afterwards, so I did not have time to share it with everyone. I hope we can all exchange ideas and learn together.

DX hardening is really powerful; getting every step of the processing complete and right is very difficult!
