Bypassing encapsulation in C++

One of the most popular programming paradigms is object-oriented programming, in which related data and code are grouped together into structures called classes. But we are not here to discuss object-oriented programming as a whole; we are here to talk about encapsulation, one of its features. Encapsulation is basically data hiding. Sometimes we want to control access to the internal state of an object so that it moves from one state to another only in predictable ways. To control this access, we separate the internally visible state and logic from the externally accessible state and logic. C++ provides this level of control over code and data visibility through three keywords: public, private and protected.

When data or logic is marked as public, it can be accessed from outside the class, as in the example below.

#include <iostream> 

class SomeClass 
{
public:
    void somefunction(void)  
    {
       std::cout << "SomeClass::somefunction() called\n";
    }
};        

With the snippet above, the somefunction method can be called from outside the class. (A method is an action an object performs, usually one that modifies its state. Methods that do not modify state are called const member functions in C++, and methods that only return copies of internal data are accessor methods, commonly called getters.) Also, the public member functions and variables of SomeClass remain accessible in its subclasses under public inheritance. The private keyword is another keyword for encapsulation in C++. It prohibits data and logic from being accessed outside the class, including from any of its subclasses; only the class's own members and its friends may use them. It is used for data and logic that are not exposed to the external world, something like an internal interface. The protected keyword is like private, except that subclasses can also access the protected data members and methods of the base class. We are not interested in the public and protected keywords here; our keyword of interest is private.
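The three access levels described above can be sketched in a few lines. The class names Base and Derived and their member functions are hypothetical, chosen only for this illustration; the commented-out line shows the kind of call the compiler would reject.

```cpp
#include <string>

// Hypothetical classes, used only to illustrate the three access levels.
class Base
{
public:
    std::string public_name() const { return "public"; }

protected:
    std::string protected_name() const { return "protected"; }

private:
    std::string private_name() const { return "private"; }
};

class Derived : public Base
{
public:
    std::string describe() const
    {
        // private_name();  // would not compile: private_name is private in Base
        // Public and protected members of Base are both accessible here.
        return public_name() + "+" + protected_name();
    }
};
```

Code outside both classes can call public_name() and describe() on a Derived object, but not protected_name() or private_name().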

What many people don't know is that these visibility rules are enforced only at compile time; at runtime there is no such thing as a private or public member variable or method. With that in mind, we can modify a private variable from outside a class using tricks that operate on the object's memory at runtime. Before we do that, let's talk about how C++ represents objects in memory.

The non-static data members of a class are laid out by the compiler as a single block of memory, much like a plain struct, with each member aligned as the platform requires. A pointer to that block is passed as a hidden first parameter (the this pointer) to every non-static member function; static member variables and functions are the exception, since they do not belong to any one object. The member functions themselves are placed once in the code section of the binary (executable), separate from the data of each object. On many platforms the calls and jumps the compiler emits between them are relative: the instruction encodes a signed offset from the current position in the code to the destination, rather than an absolute address (I cannot go into detail here). Let's look at an example.
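As a warm-up, the hidden first parameter can be pictured with a tiny example. Counter and Counter_add are hypothetical names introduced only for this sketch; real compilers generate something equivalent internally rather than a visible free function.

```cpp
// Hypothetical counter type, used only for illustration.
struct Counter
{
    int value = 0;

    // Written as a member function: `this` is supplied implicitly by the call.
    void add(int n) { this->value += n; }
};

// Roughly what the member call amounts to: a free function that receives
// the address of the object explicitly as its first parameter.
inline void Counter_add(Counter *self, int n) { self->value += n; }
```

Calling c.add(2) and Counter_add(&c, 2) on a Counter c has the same effect on c.value. The article's own example follows.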

struct Struct
{
    Struct(int x, float y)
        : x_(x),
          y_(y)
    {}

    friend std::ostream& operator<<(std::ostream &oss, const Struct &st)
    {
        oss << "(x: " << st.x_ << " , y: " << st.y_ << ")\n";
        return oss;
    }

private:
    int x_;
    float y_;
};

Ignoring the member functions, the compiler collects the data members into something that looks like the struct below (this is a mental model of what the compiler does, not real code).

// Mental model only: `this` is a keyword, so this is not valid C++.
struct Struct_Data {
   int x_;
   float y_;
} *this;

This picture might not be exact, because some processors require data to be aligned at particular boundaries; misaligned accesses are slow on some processors and fault outright on others (as is the case with some RISC processors). To prevent this, compilers insert platform-specific padding between and after the members of a struct. Members are not shuffled freely, though: members declared under the same access specifier are laid out in declaration order, and it is only the relative order of groups under different access specifiers that older C++ standards left unspecified. The C++ standard does not define an ABI, so the exact sizes, offsets and padding depend on the compiler and the platform. The encapsulation bypass is therefore a best-effort strategy, and I highly doubt it is portable across compilers and OS platforms. Strictly speaking, writing to an object through a pointer to an unrelated struct type is also undefined behavior according to the standard, even when it appears to work. With that workable understanding of how member data is represented in memory, there is a small subset of cases where we are very sure of the alignment and order of a class's data members, and there we can use a very simple trick to bypass encapsulation. We take the address of the object of choice and cast that pointer to a pointer to a struct that models the layout we expect. If the guess is correct, we can use this pointer to change private member variables.
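Before trying the cast, it can help to check what the compiler actually did with a layout. The sketch below uses a hypothetical struct named Padded with a deliberately awkward member order; padding_bytes and declaration_order_kept are helper names invented for this illustration. The amount of padding is platform-specific, so no particular number is assumed.

```cpp
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout

// Hypothetical type with a deliberately awkward member order.
struct Padded
{
    char tag;    // 1 byte
    int  count;  // usually aligned to a multiple of its own size
    char flag;   // 1 byte
};

static_assert(std::is_standard_layout<Padded>::value,
              "offsetof is only guaranteed for standard-layout types");

// Bytes the compiler added beyond the raw sum of the member sizes.
constexpr std::size_t padding_bytes()
{
    return sizeof(Padded) - (sizeof(char) + sizeof(int) + sizeof(char));
}

// Members under one access specifier keep their declaration order:
// each offset is strictly larger than the previous one.
constexpr bool declaration_order_kept()
{
    return offsetof(Padded, tag) < offsetof(Padded, count)
        && offsetof(Padded, count) < offsetof(Padded, flag);
}
```

Printing padding_bytes() on two different compilers or targets is a quick way to see why a guessed layout can silently stop matching.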

In the example below we change the value of the member variable x_, so we can see a private member variable being changed from outside the class. This is not portable, so use it only for demonstration.

#include <iostream>

struct Struct {

    Struct(int x, float y)
        : x_(x),
          y_(y)
    {}

    friend std::ostream& operator<<(std::ostream &oss, const Struct &st)
    {
        oss << "(x: " << st.x_ << " , y: " << st.y_ << ")\n";
        return oss;
    }

private:
    int x_;
    float y_;
};

// Our guess at how the compiler lays out Struct's data members.
struct Guess {
    int a;
    float b;
};

int main()
{
    Struct s(0, 10.0f);
    std::cout << s;   // state before the write

    // Treat the object's memory as a Guess and overwrite the slot where x_ lives.
    reinterpret_cast<Guess*>(&s)->a = 50;

    std::cout << s;   // x_ now reads 50 if the guess was right
}
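For comparison, here is a sketch of two variants of the same trick that, to my understanding of the standard, stay within defined behavior for this particular struct. The helper names set_first_member, set_second_member and to_text are hypothetical and not part of the original example. The first relies on the rule that a pointer to a standard-layout object can be converted to a pointer to its first data member; the second copies the bytes out, edits them and copies them back, guarded by compile-time checks on the assumptions it makes.

```cpp
#include <cstring>      // std::memcpy
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

struct Struct {
    Struct(int x, float y) : x_(x), y_(y) {}

    friend std::ostream& operator<<(std::ostream &oss, const Struct &st)
    {
        oss << "(x: " << st.x_ << " , y: " << st.y_ << ")\n";
        return oss;
    }

private:
    int x_;
    float y_;
};

// Guessed layout of Struct, as in the example above.
struct Guess {
    int a;
    float b;
};

// Assumptions checked at compile time instead of being hoped for at runtime.
static_assert(std::is_standard_layout<Struct>::value, "first-member cast needs standard layout");
static_assert(std::is_trivially_copyable<Struct>::value, "byte copies need a trivially copyable type");
static_assert(sizeof(Struct) == sizeof(Guess), "guessed layout has a different size");

// Hypothetical helper: the address of a standard-layout object is also the
// address of its first data member, so this writes x_ directly.
inline void set_first_member(Struct &s, int value)
{
    *reinterpret_cast<int *>(&s) = value;
}

// Hypothetical helper: copy the bytes into a Guess, edit, and copy back.
inline void set_second_member(Struct &s, float value)
{
    Guess g;
    std::memcpy(&g, &s, sizeof g);
    g.b = value;
    std::memcpy(static_cast<void *>(&s), &g, sizeof g);
}

// Hypothetical helper: render a Struct through its own operator<<.
inline std::string to_text(const Struct &s)
{
    std::ostringstream out;
    out << s;
    return out.str();
}
```

Both helpers still bypass the intent of private, so the same caution applies: they are for demonstration, and the static_assert lines only turn some wrong guesses into compile errors.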


I hope this post was informative. Comments, suggestions and corrections are welcome. Thank you.

Kicki Frisch

Software and knitting.

1 year ago

Think about why that encapsulation was put in place in the first place. It’s been put there by design by the original dev team as an indication that there are other things going on, and that this data should not be accessed directly. E.g. there can be internal logic that works on the data when an object is created. By breaking the encapsulation, we basically say that we know better than the original developer how to use this class. If we feel the need to break encapsulation, it’s a signal to me that we’re trying to do things that the original software design does not intend. Rather than go out of our way to use low-level hacks to get around the encapsulation, it usually results in better code to sit down, think about the design, consider why it has been put in place, and either work with it, or redesign it to better fit the evolution of the software purpose. In this case, it sounds like we could introduce a simpler data transfer object that is suitable for serialisation and deserialisation.

Kicki Frisch

Software and knitting.

1 year ago

Abdul Hameed Oluwashegu Tade can you give a real life example where you would use this?

Bernard Djangbah

Software engineer. Physics graduate. Gamer.

1 year ago

Nice article Abdul Hameed Oluwashegu Tade. This reminds me of reflection in VM languages.
