Discussion:
Add a rpath to an already compiled binary
Quentin Godfroy
2007-05-31 21:29:01 UTC
Hi all,

I was told by a friend that a tool which could add an rpath to
an already linked binary would help a lot of people.
So I wrote a small utility (in C) to do it, and it works quite well,
apart from a very crude interface.
The real problem is that, to do things properly, it moves the
program header table, which triggers a bug in the Linux kernel
(2.6.21.1 has it)(*). I have a kernel patch which corrects
the problem(#). There is also a compile-time option so that the binaries
it produces work on an unpatched kernel, but it doesn't work on PIE
binaries for a reason I don't understand (whereas on patched kernels it
works); maybe it is a bug in ld.so.
You will find a page which explains a bit how it works, and which
provides a link to the source.

http://www.eleves.ens.fr/home/godfroy/addrpathen.html

I would be happy to receive any comments on this (or on the kernel
patch).

Thanks.

(*) And the kernel is not the only software in which the produced
binaries trigger bugs. libbfd seems to make questionable assumptions
about ELF binaries, as does eu-strip.

(#) I have submitted it to linux-kernel, but it seems to have been
forgotten. I will resubmit it when I receive enough reviews on it.

--
Quentin Godfroy
Tim S
2007-05-31 22:05:52 UTC
Post by Quentin Godfroy
Hi all,
I was told by a friend that having a tool which could add a rpath to
an already linked binary would help a lot of people.
Hi,

It already exists:

man chrpath

(but you may have to install it - it's not always lying around on a default
install).

Why not have a look at its source too - I presume it does what you are
trying to achieve.

HTH

Tim
Quentin Godfroy
2007-05-31 22:18:36 UTC
Post by Tim S
Post by Quentin Godfroy
I was told by a friend that having a tool which could add a rpath to
an already linked binary would help a lot of people.
man chrpath
(but you may have to install it - it's not always lying around on a default
install).
Why not have a look at its source too - I presume it does what you are
trying to achieve.
I don't think so.
chrpath can shrink a rpath, remove one but not add one, nor expand
one.
(at least on the version I have on my debian)

Cheers,
Quentin
Tim S
2007-05-31 22:53:40 UTC
Post by Quentin Godfroy
Post by Tim S
Post by Quentin Godfroy
I was told by a friend that having a tool which could add a rpath to
an already linked binary would help a lot of people.
man chrpath
(but you may have to install it - it's not always lying around on a
default install).
Why not have a look at its source too - I presume it does what you are
trying to achieve.
I don't think so.
chrpath can shrink a rpath, remove one but not add one, nor expand
one.
(at least on the version I have on my debian)
Cheers,
Quentin
You're right - having bothered to actually read the man page myself *blush*.

I see the problem now... Unfortunately I'm not in a position to help :(

Cheers

Tim
Paul Pluzhnikov
2007-06-01 01:55:18 UTC
Post by Quentin Godfroy
I was told by a friend that having a tool which could add a rpath to
an already linked binary would help a lot of people.
So I wrote a small utility (in C) to do it, and it works quite well,
apart from a very crude interface.
Not diminishing your work in any way, I wonder why these people
can't either set LD_LIBRARY_PATH [1], or use chrpath? [2]

[1] Yes, I know it doesn't work for set-uid executables. But how
often do such executables need their RPATH modified?

[2] The limitation that new RPATH must not be longer than the
original has a trivial workaround -- just create a short symlink
to the (desired) long path.
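
To make the length limitation in [2] concrete, here is a minimal sketch in C of an in-place replacement. It is not chrpath's actual code: `replace_rpath_in_place` and the `slot` buffer are hypothetical names used only for illustration, and the sketch assumes the path is stored as a NUL-terminated string that is immediately followed by other data which must not be overwritten.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite the path stored in `slot` with `new_path`
 * without growing it. `slot` must hold a NUL-terminated string; the bytes
 * after its terminator are assumed to belong to something else. */
bool replace_rpath_in_place(char *slot, const char *new_path)
{
    assert(slot != NULL && new_path != NULL);

    size_t old_len = strlen(slot);
    size_t new_len = strlen(new_path);

    /* A longer path would spill into whatever follows the old string. */
    if (new_len > old_len)
        return false;

    memcpy(slot, new_path, new_len);
    /* Zero the remainder, including the position of the old terminator. */
    memset(slot + new_len, 0, old_len - new_len + 1);
    return true;
}
```

A longer path is refused because writing it would run into the data stored after the original string, which is why growing an rpath requires moving things around in the file rather than patching bytes.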

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
Rainer Weikusat
2007-06-01 07:03:09 UTC
Post by Paul Pluzhnikov
Post by Quentin Godfroy
I was told by a friend that having a tool which could add a rpath to
an already linked binary would help a lot of people.
So I wrote a small utility (in C) to do it, and it works quite well,
apart from a very crude interface.
Not diminishing your work in any way, I wonder why these people
can't either set LD_LIBRARY_PATH [1], or use chrpath? [2]
And I wonder why people obviously not interested in developing
something keep posting this ...
Quentin Godfroy
2007-06-01 07:04:22 UTC
Post by Paul Pluzhnikov
Post by Quentin Godfroy
I was told by a friend that having a tool which could add a rpath to
an already linked binary would help a lot of people.
So I wrote a small utility (in C) to do it, and it works quite well,
apart from a very crude interface.
Not diminishing your work in any way, I wonder why these people
can't either set LD_LIBRARY_PATH [1], or use chrpath? [2]
LD_LIBRARY_PATH is not really suitable when you have tons of
directories where programs & libraries are compiled. Personally I
never compile things in the same root, because it very quickly
becomes a real mess.
And shell wrappers are not really reliable.

But that's a question of taste I suppose.
Post by Paul Pluzhnikov
[1] Yes, I know it doesn't work for set-uid executables. But how
often do such executables need their RPATH modified?
Quite rarely indeed. But you don't always have the sources or wish to
recompile everything to move a library. Or even try to understand why
the makefile or libtool decides to put a bad rpath, or something.
Post by Paul Pluzhnikov
[2] The limitation that new RPATH must not be longer than the
original has a trivial workaround -- just create a short symlink
to the (desired) long path.
That supposes you have access to large parts of the filesystem, which
is not always the case, and personally I would not like to have my
libraries looked up through a symlink such as /var/tmp/libfoo.so. IMHO,
making a symlink is really worse than anything else.

But anyway, nobody asks you to use the tool. I'm just suggesting it.
Nix
2007-06-04 09:28:30 UTC
Post by Quentin Godfroy
LD_LIBRARY_PATH is not really adapted when you have tons of
directories where programs & libraries are compiled. Personally I
never compile things in the same root, for the reason that it becomes
really rapidly a real mess.
And shell wrappers are not really reliable.
But that's a question of taste I suppose.
Why can't you use /etc/ld.so.conf?
Post by Quentin Godfroy
Post by Paul Pluzhnikov
[1] Yes, I know it doesn't work for set-uid executables. But how
often do such executables need their RPATH modified?
Quite rarely indeed. But you don't always have the sources or wish to
recompile everything to move a library. Or even try to understand why
the makefile or libtool decides to put a bad rpath, or something.
Again, that's what /etc/ld.so.conf is for.
--
`On a scale of one to ten of usefulness, BBC BASIC was several points ahead
of the competition, scoring a relatively respectable zero.' --- Peter Corlett
Quentin Godfroy
2007-06-04 17:05:38 UTC
Post by Nix
Post by Quentin Godfroy
LD_LIBRARY_PATH is not really adapted when you have tons of
directories where programs & libraries are compiled. Personally I
never compile things in the same root, for the reason that it becomes
really rapidly a real mess.
And shell wrappers are not really reliable.
But that's a question of taste I suppose.
Why can't you use /etc/ld.so.conf?
Because a user may not wish to ask the administrator to add their
libraries to ld.so.conf.
Post by Nix
Post by Quentin Godfroy
Post by Paul Pluzhnikov
[1] Yes, I know it doesn't work for set-uid executables. But how
often do such executables need their RPATH modified?
Quite rarely indeed. But you don't always have the sources or wish to
recompile everything to move a library. Or even try to understand why
the makefile or libtool decides to put a bad rpath, or something.
Again, that's what /etc/ld.so.conf is for.
For root. Not for a simple user.
Rainer Weikusat
2007-06-01 08:02:03 UTC
Post by Quentin Godfroy
The actual big problem is that to do the things properly, it moves the
program headers table, which triggers a bug in the linux kernel
(2.6.21.1 has it)(*).
[...]
Post by Quentin Godfroy
http://www.eleves.ens.fr/home/godfroy/addrpathen.html
I would be happy to receive any comments on this. (or on the kernel
patch)
I'll confine myself to the kernel patch. An introductory remark: If
everybody else is doing it in the different way and there is a reason
to assume that 'everybody else' may have been doing it quite some time
longer, it is not an unreasonable assumption that 'the different way'
may actually be the correct one.

+++ linux-2.6.21.1-patch/fs/binfmt_elf.c 2007-05-12 21:10:46.000000000 -0400
@@ -133,7 +133,7 @@ static int padzero(unsigned long elf_bss

static int
create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
- int interp_aout, unsigned long load_addr,
+ int interp_aout, unsigned long phdr_addr,
unsigned long interp_load_addr)

The person who called this 'load address' instead of 'program header
address' presumably thought 'load address' would make more
sense. Since you didn't change the caller, you shouldn't change the
parameter name. It doesn't matter if you think it makes more sense
this way, because you might be wrong and, apart from that, the
maintainer, if he or she isn't the very person with the differing
opinion, will consider this to be a pointless change (if you were
designing a language called C++, doing it wrong from the start and
then doing it wrong again by changing to what should have been used
from the beginning after everybody got accustomed to the other would be
normal behaviour ...).

- NEW_AUX_ENT(AT_PHDR, load_addr + exec->e_phoff);
+ NEW_AUX_ENT(AT_PHDR, phdr_addr);

The e_phoff member of an ELF header is defined as

This member holds the program header table's file offset in bytes.
If the file has no program header table, this member holds
zero.
(SysV ABI, p. 50)

This means that 'load_addr + exec->e_phoff' is decidedly the location
where the program header should be.

- for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++)
+ for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++) {
+ if (elf_ppnt->p_type == PT_PHDR) {
+ phdr_addr = elf_ppnt->p_vaddr;
+ continue;
+ }

At this point, you overwrite the value passed as 'load address' into
the subroutine with the virtual address stored in the program header.
The p_vaddr member of the program header structure is defined as

This member gives the virtual address at which the first byte of
the segment resides in memory.
(ib, p. 75)

Since the exact details of program loading are architecture specific,
the gabi specification refers to a 'processor specific
supplement'. For i386, this contains the following text (on page 2-3)

Base Address

The virtual addresses in the program headers might not
represent the actual virtual addresses of the program's memory
image. Executable files typically contain absolute code. To
let the process execute correctly, the segments must reside at
the virtual addresses used to build the executable file. On
the other hand, shared object segments typically contain
position-independent code. This lets a segment's virtual
address change from one process to another, without
invalidating execution behavior. Though the system chooses
virtual addresses for individual processes, it maintains the
segments' relative positions. Because

Conclusion: The kernel is right and your code is wrong.
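
The disagreement here comes down to two ways of deriving the value reported as AT_PHDR. The following sketch, which is not the kernel code, puts them side by side for 32-bit ELF using the types from glibc's <elf.h>. The function names `at_phdr_from_offset` and `at_phdr_from_pt_phdr` and the explicit `load_bias` parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Computation quoted from the unpatched kernel: assume the program header
 * table is mapped e_phoff bytes after the start of the loaded image. */
uint32_t at_phdr_from_offset(uint32_t load_addr, const Elf32_Ehdr *ehdr)
{
    assert(ehdr != NULL);
    return load_addr + ehdr->e_phoff;
}

/* Computation in the spirit of the patch: take p_vaddr of the PT_PHDR entry
 * and add the bias applied when the image was mapped (0 for an executable
 * loaded at its link-time address). Returns false if there is no PT_PHDR. */
bool at_phdr_from_pt_phdr(const Elf32_Phdr *phdr, size_t phnum,
                          uint32_t load_bias, uint32_t *out)
{
    assert(phdr != NULL && out != NULL);
    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type == PT_PHDR) {
            *out = phdr[i].p_vaddr + load_bias;
            return true;
        }
    }
    return false;
}
```

For a conventionally linked executable, where the table directly follows the ELF header inside the first loaded segment, both functions return the same address. They can diverge once the table has been moved into a segment whose p_vaddr - p_offset differs from that of the first one.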
Quentin Godfroy
2007-06-01 19:30:50 UTC
Post by Rainer Weikusat
Post by Quentin Godfroy
The actual big problem is that to do the things properly, it moves the
program headers table, which triggers a bug in the linux kernel
(2.6.21.1 has it)(*).
[...]
Post by Quentin Godfroy
http://www.eleves.ens.fr/home/godfroy/addrpathen.html
I would be happy to receive any comments on this. (or on the kernel
patch)
I'll confine myself to the kernel patch. An introductory remark: If
everybody else is doing it in the different way and there is a reason
to assume that 'everybody else' may have been doing it quite some time
longer, it is not an unreasonable assumption that 'the different way'
may actually be the correct one.
Agreed.
Post by Rainer Weikusat
+++ linux-2.6.21.1-patch/fs/binfmt_elf.c 2007-05-12 21:10:46.000000000 -0400
@@ -133,7 +133,7 @@ static int padzero(unsigned long elf_bss
static int
create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
- int interp_aout, unsigned long load_addr,
+ int interp_aout, unsigned long phdr_addr,
unsigned long interp_load_addr)
The person who called this 'load address' instead of 'program header
address' presumably thought 'load address' would make more
sense.
Since you didn't change the caller, you shouldn't change the
parameter name. It doesn't matter if you think it makes more sense
this way, because you might be wrong and apart from that, the
maintainer, if he or she isn't the very person with the differing
opinion, will consider this to be a pointless change (if you were
designing a language called C++, doing it wrong from the start and
then doing it wrong again by changing to what should have been used
from the beginning after everybody got accustomed to the other would be
normal behaviour ...).
- NEW_AUX_ENT(AT_PHDR, load_addr + exec->e_phoff);
+ NEW_AUX_ENT(AT_PHDR, phdr_addr);
The e_phoff member of an ELF header is defined as
This member holds the program header table's file offset in bytes.
If the file has no program header table, this member holds
zero.
(SysV ABI, p. 50)
This means that 'load_addr + exec->e_phoff' is decidedly the location
where the program header should be.
Yes, in the file, not in the memory map.

To me, passing load_addr + exec->e_phoff to ld.so seems more like a
guess than actually looking up where the table is in memory.

Well, if the program headers are NOT in the first loaded segment, this
code is wrong. And I don't see where the norm prohibits this.
Post by Rainer Weikusat
- for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++)
+ for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++) {
+ if (elf_ppnt->p_type == PT_PHDR) {
+ phdr_addr = elf_ppnt->p_vaddr;
+ continue;
+ }
At this point, you overwrite the value passed as 'load address' into
the subroutine with the virtual address stored in the program header.
The p_vaddr member of the program header structure is defined as
Yes, but it is corrected later by adding load_bias, which makes
Post by Rainer Weikusat
This member gives the virtual address at which the first byte of
the segment resides in memory.
(ib, p. 75)
Since the exact details of program loading are architecture specific,
the gabi specification refers to a 'processor specific
supplement'. For i386, this contains the following text (on page 2-3)
Base Address
The virtual addresses in the program headers might not
represent the actual virtual addresses of the program's memory
image. Executable files typically contain absolute code. To
let the process execute correctly, the segments must reside at
the virtual addresses used to build the executable file. On
the other hand, shared object segments typically contain
position-independent code. This lets a segment's virtual
address change from one process to another, without
invalidating execution behavior. Though the system chooses
virtual addresses for individual processes, it maintains the
segments' relative positions. Because
Conclusion: The kernel is right and your code is wrong.
This patch works for PIE; I know perfectly well that elf_ppnt->p_vaddr
might not be the address where the program headers are in the actual
map.
Rainer Weikusat
2007-06-03 12:08:35 UTC
[...]
Post by Quentin Godfroy
Post by Rainer Weikusat
- NEW_AUX_ENT(AT_PHDR, load_addr + exec->e_phoff);
+ NEW_AUX_ENT(AT_PHDR, phdr_addr);
The e_phoff member of an ELF header is defined as
This member holds the program header table's file offset in bytes.
If the file has no program header table, this member holds
zero.
(SysV ABI, p. 50)
This means that 'load_addr + exec->e_phoff' is decidedly the location
where the program header should be.
Yes, in the file, not in the memory map.
To me, giving to the ld.so load_addr + exec->e_phoff seems more like a
guess than looking for where is exactly in memory
load_addr is calculated in the caller as p_vaddr - p_offset of the
first text segment found in the file. This is by definition the
beginning of the loaded file (see TIS/ ELF, p 2-2). Since e_phoff is
the offset of the program header from the beginning of the file,
e_phoff + (p_vaddr - p_offset) is the location of the program header
if the code that actually maps the respective pages works like the
code used here does (which freaked-outedly loads the file into some
memory from the start ...).
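
As an illustration of the calculation described above, the sketch below derives load_addr from the first loadable segment and checks whether the program header table lies inside that segment's file range, which is the situation in which e_phoff + (p_vaddr - p_offset) points at the mapped table. It assumes 32-bit ELF and glibc's <elf.h>, and it takes "the first text segment" to mean the first PT_LOAD entry, which is an assumption. `first_load`, `first_load_addr` and `phdr_table_in_first_load` are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return the first PT_LOAD entry, or NULL if there is none. */
static const Elf32_Phdr *first_load(const Elf32_Phdr *phdr, size_t phnum)
{
    for (size_t i = 0; i < phnum; i++)
        if (phdr[i].p_type == PT_LOAD)
            return &phdr[i];
    return NULL;
}

/* load_addr as described in the post: p_vaddr - p_offset of the first
 * loadable segment. Returns false if the file has no PT_LOAD entry. */
bool first_load_addr(const Elf32_Phdr *phdr, size_t phnum, uint32_t *load_addr)
{
    assert(phdr != NULL && load_addr != NULL);
    const Elf32_Phdr *seg = first_load(phdr, phnum);
    if (seg == NULL)
        return false;
    *load_addr = seg->p_vaddr - seg->p_offset;
    return true;
}

/* True when the whole program header table lies within the bytes of the
 * file that the first PT_LOAD segment maps. */
bool phdr_table_in_first_load(const Elf32_Ehdr *ehdr, const Elf32_Phdr *phdr)
{
    assert(ehdr != NULL && phdr != NULL);
    const Elf32_Phdr *seg = first_load(phdr, ehdr->e_phnum);
    if (seg == NULL)
        return false;
    uint64_t table_start = ehdr->e_phoff;
    uint64_t table_end =
        table_start + (uint64_t)ehdr->e_phnum * ehdr->e_phentsize;
    uint64_t seg_start = seg->p_offset;
    uint64_t seg_end = seg_start + seg->p_filesz;
    return table_start >= seg_start && table_end <= seg_end;
}
```

When the second function returns false, the sum e_phoff + load_addr no longer has to coincide with the address at which the table is actually mapped.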
Post by Quentin Godfroy
Well, if the program headers are NOT in the first loaded segment, this
code is wrong. And I don't see where the norm prohibits this.
Reading it could help here. Try the figure on page 1-1.
Quentin Godfroy
2007-06-03 21:36:50 UTC
Post by Rainer Weikusat
beginning of the loaded file (see TIS/ ELF, p 2-2). Since e_phoff is
the offset of the program header from the beginning of the file,
e_phoff + (p_vaddr - p_offset) is the location of the program header
No. This is a guess.
Post by Rainer Weikusat
Post by Quentin Godfroy
Well, if the program headers are NOT in the first loaded segment, this
code is wrong. And I don't see where the norm prohibits this.
Reading it could help here. Try the figure on page 1-1.
You should read it too:

NOTE
Although the figure shows the program header table immediately after
the ELF header, and the section header table following the sections,
actual files may differ. Moreover, sections and segments have no
specified order. Only the ELF header has a fixed position in the file.
(Book I: ELF, 1-2, Tool Interface Standard, Executable and Linking
Format Specification, Version 1.2)

Has your version of the PDF got holes??
Rainer Weikusat
2007-06-04 08:14:26 UTC
Post by Quentin Godfroy
Post by Rainer Weikusat
beginning of the loaded file (see TIS/ ELF, p 2-2). Since e_phoff is
the offset of the program header from the beginning of the file,
e_phoff + (p_vaddr - p_offset) is the location of the program header
No. This is a guess.
It is the required file format for SVR4 ELF executables on
Intel. Which happens to be the executable file format that is used on
Linux, too. Even if this wasn't true, the kernel works the way it
works and your program modifies a working executable in a way that it
can no longer be executed. It is, of course, possible to modify the
ELF code in the kernel to work with QG-ELF, too, but why?

BTW, don't bother to respond. You are too much of a nuisance to
continue to read your posts.
Quentin Godfroy
2007-06-04 17:04:21 UTC
Post by Rainer Weikusat
Post by Quentin Godfroy
Post by Rainer Weikusat
beginning of the loaded file (see TIS/ ELF, p 2-2). Since e_phoff is
the offset of the program header from the beginning of the file,
e_phoff + (p_vaddr - p_offset) is the location of the program header
No. This is a guess.
It is the required file format for SVR4 ELF executables on
Intel.
So SVR4 ELF files do not respect the norm itself.
By the way, why bother writing code like load_address + header->e_phoff
when we all know that the header is 52 bytes long? Why not
load_address + sizeof(Elf32_Ehdr)?
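
The rhetorical question can be made concrete. On a conventionally linked 32-bit binary the two expressions coincide, but only e_phoff stays valid once the table is moved. A small sketch, assuming glibc's <elf.h>; `phdr_table_follows_header` is a hypothetical name:

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* True when the program header table starts immediately after the ELF
 * header, i.e. when e_phoff equals sizeof(Elf32_Ehdr) (52 bytes). */
bool phdr_table_follows_header(const Elf32_Ehdr *ehdr)
{
    assert(ehdr != NULL);
    return ehdr->e_phoff == sizeof(Elf32_Ehdr);
}
```

A loader that hard-coded the header size would only handle files for which this function returns true.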
Post by Rainer Weikusat
Which happens to be the executable file format that is used on
Linux, too. Even if this wasn't true, the kernel works the way it
works and your program modifies a working executable in a way that it
can no longer be executed. It is, of course, possible to modify the
ELF code in the kernel to work with QG-ELF, too, but why?
That is NOT true. The patch works perfectly for every executable and
shared library produced by binutils. Please show me an example
where the code fails instead of saying that I am wrong.
Quentin Godfroy
2007-06-04 17:46:29 UTC
Post by Rainer Weikusat
Post by Quentin Godfroy
Post by Rainer Weikusat
beginning of the loaded file (see TIS/ ELF, p 2-2). Since e_phoff is
the offset of the program header from the beginning of the file,
e_phoff + (p_vaddr - p_offset) is the location of the program header
No. This is a guess.
It is the required file format for SVR4 ELF executables on
Intel. Which happens to be the executable file format that is used on
Linux, too. Even if this wasn't true, the kernel works the way it
works and your program modifies a working executable in a way that it
can no longer be executed. It is, of course, possible to modify the
ELF code in the kernel to work with QG-ELF, too, but why?
Because it adds a lot of generality at the cost of changing TEN lines
of code, and it makes the kernel respect the norm.
k***@gmail.com
2018-04-20 02:19:04 UTC
Post by Quentin Godfroy
Post by Rainer Weikusat
Post by Quentin Godfroy
Post by Rainer Weikusat
beginning of the loaded file (see TIS/ ELF, p 2-2). Since e_phoff is
the offset of the program header from the beginning of the file,
e_phoff + (p_vaddr - p_offset) is the location of the program header
No. This is a guess.
It is the required file format for SVR4 ELF executables on
Intel. Which happens to be the executable file format that is used on
Linux, too. Even if this wasn't true, the kernel works the way it
works and your program modifies a working executable in a way that it
can no longer be executed. It is, of course, possible to modify the
ELF code in the kernel to work with QG-ELF, too, but why?
Because it adds a lot of generality at the cost of changing TEN lines
of code, and it makes the kernel respect the norm.
bumping an old thread because godfroy seems quite reasonable and i just feel like possibly annoying that cranky bastard who insisted that the kernel is sacred or something.
christ coding in a kernel is abusing everything holy from time to time, you literally are writing the ticket... and you give him hell for what comes down to adherence to the spec? or rather his failure to adhere to your lazy, broken and more importantly conveniently illiterateness... it's appropriate that this fellow in the end was so damned arrogant he couldn't even hold himself to his pedantic and irritating standard. in another decade maybe someone else will get a laugh out of it and the curmudgeon will never forget...that PDF bit is a real gas. holes in it...maybe the wall afterwards. also thanks, a decade later, i'm dealing with some proprietary binaries in a closed system and this is helping me keep it working. if any of you larp at me about proprietary binaries, i swear to christ this thread will be open through the year 3000. the funny part is some people don't mind doing this sort of thing and aren't bitter trolls about it when someone does something useful even if they don't quite see it as such.
personally i'm just probably mildly sadistic in certain ways and i'm really just enjoying the thought of this jackass blowing his top ten years after being a tremendous blowhard.
good day to you kind sir, as for the rest of you lot, i've a sentiment for you today but not that one. :D
k***@gmail.com
2018-04-20 02:21:24 UTC
also, dynamic linking is a plague and if i am bitter over anything it is the way your mentality regarding linkage has salted the earth with this abomination so thoroughly.
except fixes rpath, i like his style. the rest of you, that sentiment was you can go suck a barrel of eggs.
Quentin Godfroy
2007-06-01 19:37:08 UTC
Post by Rainer Weikusat
Post by Quentin Godfroy
http://www.eleves.ens.fr/home/godfroy/addrpathen.html
I would be happy to receive any comments on this. (or on the kernel
patch)
I'll confine myself to the kernel patch.
[snip]
Post by Rainer Weikusat
+++ linux-2.6.21.1-patch/fs/binfmt_elf.c 2007-05-12 21:10:46.000000000 -0400
@@ -133,7 +133,7 @@ static int padzero(unsigned long elf_bss
static int
create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
- int interp_aout, unsigned long load_addr,
+ int interp_aout, unsigned long phdr_addr,
unsigned long interp_load_addr)
The person who called this 'load address' instead of 'program header
address' presumably thought 'load address' would make more
sense. Since you didn't change the caller,
I did change it. Or I do not understand what you mean.
Post by Rainer Weikusat
you shouldn't change the
parameter name. It doesn't matter if you think it makes more sense
this way, because you might be wrong and apart from that, the
maintainer, if he or she isn't the very person with the differing
opinion, will consider this to be a pointless change