Laziest maldev: when your compiler does everything
0xkylm
Whoami
Student @ 2600
VR @ FuzzingLabs
Hypervisor, compilation and maldev enthusiast
Agenda
- Introduction
- LLVM Deep Dive
- IR level pass
- Evasion
- Machine level pass
Introduction
Why polymorphism and obfuscation
Some YARA rules match exact bytes or strings. If the binary changes on every compile, those naive checks no longer match.
It also increases the effort required for reverse engineering.

Introduction
Lots of different projects



LLVM Deep Dive
What's LLVM

LLVM Deep Dive
LLVM IR


LLVM Deep Dive
How optimization works
https://zhuanlan.zhihu.com/p/618817970


LLVM Deep Dive
Simplest pass
#include "llvm/Transforms/MyObfsPass/Obf.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instruction.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

PreservedAnalyses ObfsPass::run(Function &F, FunctionAnalysisManager &AM) {
    // Print the function name
    outs() << "Processing function: " << F.getName() << "\n";
    // Only main is transformed in this demo
    if (F.getName() != "main") {
        outs() << "Skipping function: " << F.getName() << "\n";
        return PreservedAnalyses::all();
    }
    bool Changed = false;
    // Walk every basic block of the current function
    for (BasicBlock &BB : F) {
        // For each basic block, walk every instruction
        for (Instruction &I : BB) {
            // Look at every operand of the instruction
            for (unsigned i = 0; i < I.getNumOperands(); i++) {
                Value *Op = I.getOperand(i);
                // If it is a ConstantInt, replace it with 42
                // (demo only: this also rewrites alloca sizes and the like)
                if (ConstantInt *CI = dyn_cast<ConstantInt>(Op)) {
                    errs() << "Found constant: " << CI->getValue() << " in instruction: " << I << "\n";
                    I.setOperand(i, ConstantInt::get(CI->getType(), 42));
                    Changed = true;
                }
            }
        }
    }
    return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
}
[PassPluginLibraryInfo.... ]
LLVM Deep Dive
Simplest pass
// add.c
#include <stdio.h>

int main() {
    int a = 10;
    int b = 12;
    printf("a + b = %d", a + b);
    return 1;
}
----------------------
Processing function: main
Found constant: 1 in instruction: %1 = alloca i32, align 4
Found constant: 1 in instruction: %2 = alloca i32, align 4
Found constant: 1 in instruction: %3 = alloca i32, align 4
Found constant: 0 in instruction: store i32 0, ptr %1, align 4
Found constant: 10 in instruction: store i32 10, ptr %2, align 4, !dbg !33
Found constant: 12 in instruction: store i32 12, ptr %3, align 4, !dbg !35
Found constant: 1 in instruction: ret i32 1, !dbg !37
Processing function: _vsprintf_l


IR lvl pass
Back2Hack : Encrypt Strings


IR lvl pass
Back2Hack : Encrypt Strings
Objectives:
- What is a string?
- Find them
- XOR them at compile time
- Add a stub to decrypt them and encrypt them again
- Enjoy
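The objectives above can be approximated in plain C++, without any LLVM pass, as a minimal sketch. This is not the pass shown in the talk: `XorString`, `makeXor`, `demo` and the key `kKey` are hypothetical names chosen for illustration, and a single fixed XOR byte is an assumption. It assumes a C++14 or later compiler.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical single-byte key, for illustration only.
constexpr unsigned char kKey = 0x5A;

// Holds the N bytes of a literal (terminator included), XORed with kKey.
template <std::size_t N>
struct XorString {
    char data[N]{};

    constexpr explicit XorString(const char (&plain)[N]) {
        for (std::size_t i = 0; i < N; ++i)
            data[i] = static_cast<char>(plain[i] ^ kKey);
    }

    // XOR is its own inverse: the same routine decrypts and re-encrypts.
    void toggle() {
        for (char &c : data)
            c = static_cast<char>(c ^ kKey);
    }
};

// Helper so that N is deduced from the literal.
template <std::size_t N>
constexpr XorString<N> makeXor(const char (&plain)[N]) {
    return XorString<N>(plain);
}

// Encrypt at compile time, then decrypt, use and re-encrypt at run time.
inline void demo() {
    constexpr auto enc = makeXor("hello");   // evaluated by the compiler
    static_assert(enc.data[0] != 'h', "stored bytes are not the plaintext");

    XorString<6> s = enc;                    // mutable run-time copy
    s.toggle();                              // decrypt in place
    assert(s.data[0] == 'h' && s.data[5] == '\0');
    s.toggle();                              // encrypt again
    assert(s.data[0] != 'h');
}
```

Declaring the encrypted value `constexpr` forces the XOR to happen during compilation. Whether the plaintext literal still ends up somewhere in the binary depends on the compiler and optimization level, which is one reason to do the rewrite on the IR instead.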
IR lvl pass
Back2Hack : Encrypt Strings
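We'll cheat a little bit
// StringUsage is assumed to be an aggregate of
// {Instruction *, GlobalVariable *, unsigned operand index}.
std::vector<StringUsage> collectStringUsages(Function &F) {
    std::vector<StringUsage> Usages;
    for (BasicBlock &BB : F) {
        for (Instruction &I : BB) {
            for (unsigned i = 0; i < I.getNumOperands(); ++i) {
                Value *Op = I.getOperand(i);
                // Only operands that are directly a global variable
                auto *GV = dyn_cast<GlobalVariable>(Op);
                if (!GV)
                    continue;
                // The global must be constant and have an initializer
                if (!GV->isConstant() || !GV->hasInitializer())
                    continue;
                // Keep only initializers that are arrays of 8-bit characters
                auto *CA = dyn_cast<ConstantDataArray>(GV->getInitializer());
                if (!CA)
                    continue;
                if (!CA->isString())
                    continue;
                Usages.push_back({&I, GV, i});
[...]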
IR lvl pass
Back2Hack : Encrypt Strings
Creating stub

// B, DeobfFunc, StrPtrArg and EntryBB are defined earlier (not shown)
BasicBlock *LoopCond = BasicBlock::Create(Ctx, "loop.cond", DeobfFunc);
BasicBlock *LoopBody = BasicBlock::Create(Ctx, "loop.body", DeobfFunc);
BasicBlock *LoopEnd = BasicBlock::Create(Ctx, "loop.end", DeobfFunc);
B.CreateBr(LoopCond);
// cond: load the current byte and stop at the null terminator
B.SetInsertPoint(LoopCond);
PHINode *PtrPhi = B.CreatePHI(Type::getInt8Ty(Ctx)->getPointerTo(), 2, "ptr");
PtrPhi->addIncoming(StrPtrArg, EntryBB);
Value *Cur = B.CreateLoad(Type::getInt8Ty(Ctx), PtrPhi, "cur");
Value *IsNotNull = B.CreateICmpNE(Cur, B.getInt8(0));
B.CreateCondBr(IsNotNull, LoopBody, LoopEnd);
IR lvl pass
Back2Hack : Encrypt Strings
Enjoy


Decrypt the string, use it, then encrypt it again
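The decrypt, use, encrypt-again cycle can be sketched in plain C++ with a scope guard. This is an illustration, not the stub the pass emits: `xorInPlace`, `ScopedPlaintext` and `kKey` are hypothetical names, and the XOR in the loop body is an assumption because the slides only show the loop condition. The loop stops at the first zero byte, so the sketch also assumes that no plaintext byte equals the key.

```cpp
#include <cassert>

// Hypothetical key, for illustration only.
constexpr char kKey = 0x2A;

// Same shape as loop.cond / loop.body / loop.end: walk until a zero byte.
// The terminator itself is left untouched.
inline void xorInPlace(char *p) {
    while (*p != 0) {                        // loop.cond
        *p = static_cast<char>(*p ^ kKey);   // loop.body (assumed)
        ++p;
    }
}                                            // loop.end

// Decrypts the buffer on construction and encrypts it again on destruction,
// so the plaintext only exists in memory while the guard is alive.
class ScopedPlaintext {
public:
    explicit ScopedPlaintext(char *buf) : buf_(buf) {
        assert(buf_ != nullptr);
        xorInPlace(buf_);
    }
    ~ScopedPlaintext() { xorInPlace(buf_); }

    ScopedPlaintext(const ScopedPlaintext &) = delete;
    ScopedPlaintext &operator=(const ScopedPlaintext &) = delete;

    const char *c_str() const { return buf_; }

private:
    char *buf_;
};
```

A caller would wrap each use of an encrypted buffer in a block such as `{ ScopedPlaintext s(buf); puts(s.c_str()); }`; the buffer is encrypted again when the block ends.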
Evasion
Modularity is the key
LLVM works on IR modules: each compilation unit produces one IR module. Because of this, a pass can link in a module that is not part of the project, which lets us inject bitcode through LLVM passes.
At this stage all symbols are still available (nothing has been stripped yet), and we have already practiced creating IR instructions with the string encryption pass.
Evasion
Loading bitcode
bool ObfsPass::loadByteCode(llvm::Module &M, const unsigned char Bc[], unsigned int BcLen) {
    llvm::LLVMContext &Context = M.getContext();
    // Wrap the embedded bytes without copying them
    auto MemBuffer = llvm::MemoryBuffer::getMemBuffer(
        llvm::StringRef(reinterpret_cast<const char*>(Bc), BcLen),
        "embedded_bitcode",
        false
    );
    auto ModuleOrErr = llvm::parseBitcodeFile(MemBuffer->getMemBufferRef(), Context);
    if (!ModuleOrErr) {
        // Consume the error: an unhandled llvm::Error can abort in assert-enabled builds
        llvm::errs() << "Error parsing embedded bitcode :( "
                     << llvm::toString(ModuleOrErr.takeError()) << "\n";
        return false;
    }
    std::unique_ptr<llvm::Module> ExternalMod = std::move(*ModuleOrErr);
    ExternalMod->setModuleIdentifier("embedded_module");
    // Link the external module into the module being compiled
    llvm::Linker L(M);
    if (L.linkInModule(std::move(ExternalMod))) {
        llvm::errs() << "[-] Failed to link :(\n";
        return false;
    }
    return true;
}
Evasion
Replace funcs
if (!loadByteCode(M, sub_bc, sub_bc_len)) {
    return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
}
// Now let's find it (because we linked it into the current module)
Function *SubFunction = M.getFunction("sub");
Function *AddFunction = M.getFunction("add");
// Bail out if either function is missing
if (!SubFunction || !AddFunction) {
    return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
}
// Replace every call to "add" with a call to "sub"
// (assumes both functions have the same signature).
// make_early_inc_range keeps the iteration valid while uses are rewritten.
for (auto &U : llvm::make_early_inc_range(AddFunction->uses())) {
    if (CallInst *CI = dyn_cast<CallInst>(U.getUser())) {
        // Only rewrite uses where "add" is the callee, not an argument
        if (CI->isCallee(&U)) {
            CI->setCalledFunction(SubFunction);
            Changed = true;
        }
    }
}
Evasion
CallstackSpoof


Machine level pass
Optimizing machine code
Optimizations also run during code generation, late in the compilation process; these are machine-level passes. Let's play with them.
Machine level pass
Breaking Ghidra
REX is a prefix used in x86-64 to access the extended registers and 64-bit operand sizes. But what if we use 2, 3 or 10 REX prefixes?
Not all instructions require a REX prefix. The prefix is necessary only if an instruction references one of the extended registers or uses a 64-bit operand. If a REX prefix is used when it has no meaning, it is ignored.
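As a rough illustration of the idea, the sketch below pads an already-encoded instruction with bare REX bytes. It is not how the machine pass in this talk is implemented: `padWithRex`, `isRex` and `kMaxInsnLen` are hypothetical names. It assumes a legacy-encoded instruction (no VEX/EVEX) that does not use AH, BH, CH or DH, because any REX prefix changes how those registers are encoded. How a given CPU or disassembler treats stacked REX bytes should be verified on the target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Architectural limit on the length of a single x86 instruction.
constexpr std::size_t kMaxInsnLen = 15;

// In 64-bit mode, the bytes 0x40..0x4F are REX prefixes.
inline bool isRex(std::uint8_t b) { return (b & 0xF0) == 0x40; }

// Prepend up to `extra` bare REX bytes (0x40: W=R=X=B=0) to an encoded
// instruction, never letting the result exceed 15 bytes.
inline std::vector<std::uint8_t> padWithRex(const std::vector<std::uint8_t> &insn,
                                            std::size_t extra) {
    assert(!insn.empty() && insn.size() <= kMaxInsnLen);
    const std::size_t room = kMaxInsnLen - insn.size();
    const std::size_t n = extra < room ? extra : room;

    std::vector<std::uint8_t> out(n, 0x40);           // redundant prefixes
    out.insert(out.end(), insn.begin(), insn.end());  // original encoding
    return out;
}
```

For example, `padWithRex({0x48, 0x89, 0xD8}, 3)` turns the three-byte encoding of `mov rax, rbx` into `40 40 40 48 89 D8`. The cap matters because an instruction longer than 15 bytes is rejected by the CPU.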
Machine level pass
Breaking Ghidra
Buuuuuuut Ghidra and other decompilers don't work like a CPU, and sometimes bugs can occur

Machine level pass
Breaking classical code path

Machine level pass
VT test
Okay, this is not insane because it's not a very famous sample, but
(machine-level pass only)


Questions ?