Problems for nested GEP instructions #30

Open
cpumask opened this issue Feb 5, 2019 · 0 comments

cpumask commented Feb 5, 2019

Hi, @Machiry

I tried to run taint analysis with Dr.Checker on a sample program. Part of the code is shown below:

typedef struct {
int time;
int data;
} Info;

int foo(Info *info){
int t = info->data;
.....
}

The whole "info" structure is set as the taint source (including all of its fields), so "t" should also be tainted, since it is assigned "info->data". But it is not. After further inspection, I found that "int t = info->data;" is translated into the following IR:

%0 = load i32, i32* getelementptr inbounds (%struct.Info, %struct.Info* @info, i32 0, i32 1)

You can see that a constant GEP operator is embedded in the "load" instruction, which causes trouble for Dr.Checker's alias analysis. The points-to information for this embedded GEP operator is not recorded in the global state, so the taint analysis reports that no objects are pointed to by this GEP operator, and "t" is therefore not tainted.

I then checked Dr.Checker's AliasAnalysis source code. In visitLoadInstruction(), if the load instruction has an embedded GEP operator, "srcPointer" is simply set to "gep->getPointerOperand()", which is "@info" in this example. This way we lose the field information, right? So the analysis is field-insensitive in this case? Besides, the points-to information of this GEP is neither analyzed nor recorded here.
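
To make the concern concrete, here is a minimal C++ sketch of the difference between keeping only the base pointer and keeping the (object, field) pair. It is a toy model, not Dr.Checker's or LLVM's actual code: GepOperand, Target, baseOnly, fieldTarget, and loadIsTainted are hypothetical names, and it only handles a two-index struct GEP whose leading index is 0.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a constant GEP operand such as
// getelementptr (%struct.Info, %struct.Info* @info, i32 0, i32 1).
struct GepOperand {
    std::string base;              // base pointer, e.g. "@info"
    std::vector<int64_t> indices;  // GEP indices, e.g. {0, 1}
};

// Hypothetical points-to target: (object name, field index).
using Target = std::pair<std::string, int64_t>;

// The handling described above: only the base pointer survives,
// so the field index is lost.
std::string baseOnly(const GepOperand& gep) {
    return gep.base;
}

// Field-sensitive alternative: keep the struct field index as well.
// Assumption: exactly two indices and a leading index of 0.
Target fieldTarget(const GepOperand& gep) {
    assert(gep.indices.size() == 2 && gep.indices[0] == 0);
    return Target(gep.base, gep.indices[1]);
}

// A load through the GEP is tainted if the field it reads is tainted.
bool loadIsTainted(const GepOperand& gep, const std::set<Target>& tainted) {
    return tainted.count(fieldTarget(gep)) > 0;
}
```

With both fields of "@info" marked as tainted, loadIsTainted() on the GEP for "info->data" returns true in this model, which is the result I expected for "t".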

Another related problem is in visitGEPInstruction() of the alias analysis. Consider the following source code:

typedef struct {
int a;
int b;
} Extra;

typedef struct {
int time;
int data;
Extra p;
} Info;

int foo(Info *info){
int t = info->p.b;
....
}

The resulting LLVM IR for "int t = info->p.b" may look like this:

%0 = getelementptr inbounds %struct.Extra, %struct.Extra* getelementptr inbounds (%struct.Info, %struct.Info* @info, i32 0, i32 2), i32 0, i32 1
%1 = load i32, i32* %0

So we now have a nested GEP. From my understanding of visitGEPInstruction() in Dr.Checker's alias analysis, in this case "srcPointer" is set to "@info", but "structFieldId" is still "1" (i.e. the index of "b" in the "Extra" structure). The final result is that %0 points to field 1 of the @info structure, which does not seem correct? Field 1 of "Info" is "data", whereas "info->p.b" is field 1 inside field 2 ("p").
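
A small C++ sketch of what I would expect instead: walk the nested GEPs from the innermost one outwards and collect the whole field path. Again this is a toy model and not Dr.Checker's or LLVM's code; Gep, FieldPath, naiveTarget, and flatten are hypothetical names, and every GEP is assumed to have a leading index of 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of a GEP whose pointer operand is either a global
// or another (nested) GEP.
struct Gep {
    const Gep* inner;              // nested GEP used as pointer operand, or nullptr
    std::string global;            // base global, used when inner == nullptr
    std::vector<int64_t> indices;  // GEP indices, leading index first
};

// Hypothetical result: base object plus the chain of field indices.
struct FieldPath {
    std::string object;
    std::vector<int64_t> fields;
};

// Mirrors the behaviour described above: the base comes from the innermost
// GEP, but only the outermost GEP's last index is kept as the field.
FieldPath naiveTarget(const Gep& gep) {
    assert(!gep.indices.empty());
    const Gep* g = &gep;
    while (g->inner) g = g->inner;
    FieldPath path;
    path.object = g->global;
    path.fields.push_back(gep.indices.back());
    return path;
}

// Collects the field indices of every GEP level, innermost first.
// Assumption: each level has a leading index of 0 (no pointer arithmetic).
FieldPath flatten(const Gep& gep) {
    assert(!gep.indices.empty() && gep.indices[0] == 0);
    FieldPath path;
    if (gep.inner) {
        path = flatten(*gep.inner);
    } else {
        path.object = gep.global;
    }
    // Skip the leading index and append the struct field indices.
    for (std::size_t i = 1; i < gep.indices.size(); ++i) {
        path.fields.push_back(gep.indices[i]);
    }
    return path;
}
```

For the IR above, naiveTarget() gives (@info, field 1), while flatten() gives (@info, field 2, then field 1), which corresponds to "info->p.b".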

Please correct me if this is not actually an issue, as I am not as familiar with Dr.Checker's source code as you are. Thanks in advance for your kind help!

Best,
Hang
