How to bind variables in C ++
Ok, my problem is this: I am trying to translate x86 assembly source code to C ++ source code.
Explanation as to what registers are.
skip this if you know what they are and how they work.
As you may or may not know, assembly language makes use of "general purpose registers".
In x86 assembly these registers are, and can be considered as "4 bytes" in length variables ( int var in c++ ), their names are: eax, ebx, ecx and edx.
Now, these registers are each respectively broken down into ax, bx, cx and dx that represent the 2 bytes less significant value of each register.
ax, bx, cx and dx are also broken down into ah, bx, ch and dh ( most significant byte ) and al, bl, cl and dl ( less significant byte ).
So, for example:
If I set eax:
EAX = 0xAB12CDEF
that would automatically change ax, al and ah
AX would become 0xCDEF
AH would become 0xCD
AL would become 0xEF
My question is, how do I do this in C ++?
int eax, ax, ah, al; eax = 0xAB12CDEF
How can I make ax, ah and al change at the same time? Or can we make them pointers to different parts, if so, how? Thank you! Postscript Also how can I use another char variable? How can I make the variable the new variable "char chAL" point to al that points to eax. So when I make changes to chAL, the changes are automatically reflected in eax, ah and al.
source to share
If your goal is to emulate X86 assembly code, you really need to support X86 register behavior.
Here's a simple implementation using union
:
#include <iostream>
#include <cstdint>
using namespace std;
union reg_t {
uint64_t rx;
uint32_t ex;
uint16_t x;
struct {
uint8_t l;
uint8_t h;
};
};
int main(){
reg_t a;
a.rx = 0xdeadbeefcafebabe;
cout << "rax = " << hex << a.rx << endl;
cout << "eax = " << hex << a.ex << endl;
cout << "ax = " << hex << a.x << endl;
cout << "al = " << hex << (uint16_t)a.l << endl;
cout << "ah = " << hex << (uint16_t)a.h << endl;
cout << "ax & 0xFF = " << hex << (a.x & 0xFF) << endl;
cout << "(ah << 8) + al = " << hex << (a.h << 8) + a.l << endl;
}
output:
rax = deadbeefcafebabe eax = cafebabe ax = babe al = be ah = ba ax & 0xFF = be (ah << 8) + al = babe
You will get the correct result on the correct platform (little-endian). You will have to swap bytes and / or add additions for other platforms.
This is a basic down-to-earth solution that will certainly work on many x86 platforms (at least X86 / linux / g ++ works great), but the behavior this approach uses is similar to undefined in C ++.
Here's a different approach using a byte array to store the contents of the register:
class x86register {
uint8_t bytes[8];
public:
x86register &operator =(const uint64_t &v){
for (int i = 0; i < 8; i++)
bytes[i] = (v >> (i * 8)) & 0xff;
return *this;
}
x86register &operator =(const uint32_t &v){
for (int i = 0; i < 4; i++)
bytes[i] = (v >> (i * 8)) & 0xff;
return *this;
}
x86register &operator =(const uint16_t &v){
for (int i = 0; i < 2; i++)
bytes[i] = (v >> (i * 8)) & 0xff;
return *this;
}
x86register &operator =(const uint8_t &v){
bytes[0] = v;
return *this;
}
operator uint64_t(){
uint64_t res = 0;
for (int i = 7; i >= 0; i--)
res = (res << 8) + bytes[i];
return res;
}
operator uint32_t(){
uint32_t res = 0;
for (int i = 4; i >= 0; i--)
res = (res << 8) + bytes[i];
return res;
}
operator uint16_t(){
uint16_t res = 0;
for (int i = 2; i >= 0; i--)
res = (res << 8) + bytes[i];
return res;
}
operator uint8_t(){
return bytes[0];
}
};
This simple class should work no matter what platform it is running on. Also, you probably want to add a few other accessories / mutators to handle HSB word registers (AH, BH, etc.).
source to share
You can extract parts of eax using bitwise operations like:
void main()
{
int eax, ax, ah, al;
eax = 0xAB12CDEF;
ax = eax & 0x0000FFFF;
ah = (eax & 0x0000FF00) >> 8;
al = eax & 0x000000FF;
printf("ax = eax & 0x0000FFFF = 0x%X\n", ax);
printf("ah = (eax & 0x0000FF00) >> 8 = 0x%X\n", ah);
printf("al = eax & 0x000000FF = 0x%X\n", al);
}
Output
ax = eax & 0x0000FFFF = 0xCDEF
ah = (eax & 0x0000FF00) >> 8 = 0xCD
al = eax & 0x000000FF = 0xEF
You can also define a macro:
#define AX(dw) ((dw) & 0x0000FFFF)
#define AH(dw) ((dw) & 0x0000FF00) >> 8)
#define AL(dw) ((dw) & 0x000000FF)
void main()
{
int eax = 0xAB12CDEF;
cout << "ax = " << hex << AX(eax) << endl; // prints ax = 0xCDEF
}
source to share
If you want it to work as simply as you put the ints example, you can get around it with reinterpretations, although this violates the pointer alias rules, so the behavior is undefined.
std::uint32_t eax = 0xAB12CDEF;
std::uint16_t& ax = reinterpret_cast<std::uint16_t*>(&eax)[1];
std::uint8_t& ah = reinterpret_cast<std::uint8_t&>(ax);
std::uint8_t& al = (&ah)[1];
The second line passes the address eax in std::uint16_t*
, applying to it [1]
you get the second half of 32 bits.
The third line is just a cast to uint8_t
, which works because ah will be the same as before the ax.
Indexing into an address ah
by 1 gives the next byte, which is al
.
What you are trying to do seems quite insecure and strange. Thus, to get the most similar behavior in the safest way, you can simply use your own type. However, the results will be consistent with machine to machine below, but they will not be in the above due to different endian schemes.
class Reg {
private:
std::uint32_t data_;
public:
Reg(std::uint32_t in) : data_{in} { }
std::uint32_t ex() const {
return data_;
}
std::uint16_t x() const {
return static_cast<std::uint16_t>(data_ & 0xFFFF);
}
std::uint8_t h() const {
return static_cast<std::uint8_t>((data_ & 0xFF00) >> 8);
}
std::uint8_t l() const {
return static_cast<std::uint8_t>(data_ & 0xFF);
}
};
source to share