Research Insights On JVM (Java Virtual Machine)

What is Java Virtual Machine?

The JVM (Java Virtual Machine) is the run-time engine that runs Java applications. It is the JVM that actually calls the main method of a Java program. The JVM is part of the JRE (Java Runtime Environment). Java is often described as WORA (Write Once, Run Anywhere): a programmer can develop Java code on one system and expect it to run on any other Java-enabled system or platform without modification. The JVM is what makes this possible.

JVM Architecture

Java source code is first compiled into bytecode by the Java compiler (javac); the JVM then executes that bytecode. Bytecode is an intermediate language between the Java source and the host system, so the same bytecode can be interpreted on different machines. The JVM is also responsible for allocating memory.

[Figure: JVM overview]

As the figure above shows, each Java application runs inside a runtime instance of some concrete implementation of the abstract specification of the Java virtual machine. The term JVM therefore has three senses: specification, implementation, and instance.

1. Specification: A document that describes what a JVM implementation is required to do.

2. Implementation: A concrete program that meets the specification, shipped as part of a JRE (Java Runtime Environment).

3. Instance: Whenever you run a Java class file, an instance of the JVM is created.
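The three senses can be observed from inside a running program. The following is a minimal sketch, not part of the original article: it reads standard system properties for the specification and implementation, and uses the runtime management bean to identify the current instance. The class name JvmIdentity is made up for illustration.

```java
import java.lang.management.ManagementFactory;

final class JvmIdentity {
    /** Returns the specification, implementation, and instance views as text. */
    static String describe() {
        // Specification: the abstract document this JVM claims to implement.
        String spec = System.getProperty("java.vm.specification.name")
                + " " + System.getProperty("java.vm.specification.version");
        // Implementation: the concrete program, for example a HotSpot build.
        String impl = System.getProperty("java.vm.name")
                + " (" + System.getProperty("java.vm.vendor") + ")";
        // Instance: this running process; the name format is implementation-specific.
        String instance = ManagementFactory.getRuntimeMXBean().getName();
        return "Specification: " + spec + "\n"
                + "Implementation: " + impl + "\n"
                + "Instance: " + instance;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running the program twice starts two separate instances of the same implementation, so only the last line is expected to differ between runs.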


  • The JVM architecture consists of a Class Loader Subsystem, Runtime Data Areas (memory areas), an Execution Engine, and related components.


1. Class Loader Subsystem:

The Java virtual machine has a flexible class loader architecture that allows a Java application to load classes in custom ways. In a JVM, every class is loaded either by the built-in bootstrap loader or by some instance of java.lang.ClassLoader. A class loader is a special Java class that is responsible for loading other classes into the Java Virtual Machine. When a class is first needed, a class loader finds it, reads its class file (for example from the file system), and hands the bytecode to the JVM, which can then execute it.


The Class Loader Subsystem loads, links, and initializes a class file when the class is first referenced at run time. It can load class files from the file system, the network, or any other source. Java has three default class loaders: Bootstrap, Extension, and System (or Application). Since Java 9, the Extension class loader has been replaced by the Platform class loader.

> Loading:-

  • Bootstrap Class Loader: When a JVM starts up, a special chunk of machine code runs that loads the core Java classes, including the system class loader. This machine code is known as the Bootstrap (or Primordial) class loader. It consists of platform-specific machine instructions that kick off the whole class-loading process.
  • Extension Class Loader: Loads classes from the JRE's extension directories, such as lib/ext.
  • System/Application Class Loader: Loads classes from the application-level classpath, for example the path given in the CLASSPATH environment variable.
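The loader hierarchy described above can be inspected at run time. The sketch below is illustrative only (the class name ClassLoaderDemo is invented). It relies on the documented behaviour that Class.getClassLoader() may return null for classes loaded by the bootstrap loader, which is what mainstream JVMs do for core classes such as String.

```java
final class ClassLoaderDemo {
    /** Names the loader of a class; a null loader is reported as "bootstrap". */
    static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    /** Walks the parent chain from the given loader up to the bootstrap loader. */
    static String parentChain(ClassLoader loader) {
        StringBuilder chain = new StringBuilder();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            chain.append(cl.getClass().getName()).append(" -> ");
        }
        return chain.append("bootstrap").toString();
    }

    public static void main(String[] args) {
        System.out.println("String loaded by: " + loaderOf(String.class));
        System.out.println("This class loaded by: " + loaderOf(ClassLoaderDemo.class));
        System.out.println(parentChain(ClassLoader.getSystemClassLoader()));
    }
}
```

The concrete loader class names printed for the application and extension (or platform) loaders vary between Java versions, so they are not listed here.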

> Classloader - Linking:-

Linking is the process of incorporating the loaded bytecode into the Java runtime system so that the loaded type can be used by the JVM. It involves verifying and preparing the class or interface, its direct superclass, its direct superinterfaces, and its element type (if it is an array type), if necessary. Linking has three steps:

  • Verify: The bytecode verifier checks that the loaded bytecode is well formed; if verification fails, a java.lang.VerifyError is thrown.
  • Prepare: Memory is allocated for all static variables, and they are assigned their default values.
  • Resolve: All symbolic references are replaced with direct references into the Method Area.

> Initialization:-

This is the final phase of class loading: all static variables are assigned their declared initial values and the static blocks are executed, in the order in which they appear in the source.
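The difference between preparation (default values) and initialization (declared values and static blocks) can be made visible with a small experiment. This is a sketch with invented names (InitOrderDemo, Lazy): it loads a nested class without initializing it, then triggers initialization by reading a static field.

```java
final class InitOrderDemo {
    static final StringBuilder LOG = new StringBuilder();

    static class Lazy {
        // Runs first during initialization: peek() reads 'later' before its
        // initializer has run, so it sees the default value from preparation.
        static int early = peek();
        static int later = 7;
        static { LOG.append("initialized;"); }

        static int peek() { return later; }
    }

    static String run() {
        try {
            String name = InitOrderDemo.class.getName() + "$Lazy";
            // initialize=false: the class is loaded, but its static code does not run yet.
            Class.forName(name, false, InitOrderDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        LOG.append("loaded;");
        // The first read of a non-constant static field triggers initialization.
        int early = Lazy.early;
        LOG.append("early=").append(early).append(";later=").append(Lazy.later);
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "loaded;initialized;early=0;later=7"
    }
}
```

The value early=0 shows that the field later already existed with its default value before its initializer assigned 7.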


2. Runtime Data Areas:

The Java Virtual Machine (JVM) defines various run-time data areas that are used during the execution of a program. Some of these data areas are created when the Java Virtual Machine starts and are destroyed only when it exits. Other data areas are per thread: they are created when a thread is created and destroyed when the thread exits.



  1. Method area: Stores all class-level information, such as the class name, the immediate parent class name, and method and variable information. There is only one method area per JVM, and it is a shared resource. Conceptually it also covers static variables; in HotSpot from Java 8 onward, class metadata is held in Metaspace (native memory) and static variables are stored on the heap.
  2. Heap area: Stores all objects. There is also only one heap area per JVM, and it too is a shared resource.
  3. Stack area: For every thread, the JVM creates one run-time stack, which is stored here. Each entry of this stack is called an activation record or stack frame and corresponds to one method call. All local variables of that method are stored in its frame. After a thread terminates, its run-time stack is destroyed by the JVM. The stack is not a shared resource.
  4. PC registers: Store the address of the instruction a thread is currently executing. Each thread has its own PC register.
  5. Native method stacks: For every thread, a separate native stack is created. It stores native method information.
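The split between shared and per-thread areas can be illustrated with two threads. In this sketch (the class name StackHeapDemo is invented) each thread keeps a local counter in its own stack frame, while both update one counter object on the shared heap.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class StackHeapDemo {
    /** Returns { local total of thread 0, local total of thread 1, shared total }. */
    static int[] run() {
        AtomicInteger shared = new AtomicInteger(); // one object on the shared heap
        int[] localTotals = new int[2];
        Thread[] workers = new Thread[2];
        for (int i = 0; i < workers.length; i++) {
            final int slot = i;
            workers[i] = new Thread(() -> {
                int local = 0; // lives in this thread's own stack frame
                for (int n = 0; n < 1000; n++) {
                    local++;
                    shared.incrementAndGet();
                }
                localTotals[slot] = local;
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join(); // join also makes the workers' writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { localTotals[0], localTotals[1], shared.get() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "1000 1000 2000"
    }
}
```

Neither thread can see the other's local variable, so the local counters need no synchronization; the shared heap object does, which is why an AtomicInteger is used.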


3. Execution Engine:

The execution engine is the core of the JVM. It can communicate with the various memory areas of the JVM. Each thread of a running Java application is a distinct instance of the virtual machine's execution engine. The execution engine executes the bytecode that the class loader has placed in the runtime data areas.

  • Interpreter: Reads, interprets, and executes the bytecode instructions one by one. Because it works one instruction at a time, it can interpret a single bytecode quickly, but it executes the interpreted result slowly. This is the general disadvantage of interpreted languages, and bytecode is by default run through an interpreter.
  • JIT Compiler: The JIT compiler converts the bytecode to an intermediate representation (IR), optimizes it, and then converts it to native code. The main purpose of the JIT compiler is to improve performance. Internally, the JIT compiler maintains a separate counter for every method. Whenever the JVM comes across a method call, the method is first interpreted normally and its counter is incremented; once the counter reaches a threshold, the method is compiled to native code.
  • Garbage Collector: Garbage collection (GC) is the process of freeing memory occupied by objects that are no longer reachable. All Java objects automatically obtain the memory they need when they are created, and when an object is no longer needed, the garbage collection process reclaims its memory. In other words, the garbage collector tracks live objects and treats everything else as garbage.
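The JIT compiler and garbage collectors of a running JVM can be queried through the standard java.lang.management API. The sketch below (class name ExecutionEngineInfo invented for illustration) only reports what the current JVM exposes; the names and counts it prints depend on the JVM and its settings, so no sample output is given.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class ExecutionEngineInfo {
    static String describe() {
        StringBuilder out = new StringBuilder();
        // getCompilationMXBean() returns null if the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        out.append("JIT compiler: ").append(jit == null ? "none" : jit.getName()).append('\n');
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            out.append("Time spent compiling (ms): ")
               .append(jit.getTotalCompilationTime()).append('\n');
        }
        // One bean per collector, for example a young and an old generation collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append("Collector ").append(gc.getName()).append(": ")
               .append(gc.getCollectionCount()).append(" collections\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```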


Recent Innovation and research in JVM


  • With the rapid development of information technology, Java virtual machine technology is becoming more and more important, and so are distributed algorithms. The study summarized here analyzes the Java language and distributed architectures, combines research on the Java virtual machine with distributed computing, and designs a distributed computing architecture for the Java virtual machine based on thread migration. It implements a prototype of this virtual machine together with its distributed thread migration mechanism, and finally summarizes the prospects for distributed applications of Java virtual machine migration. The study not only surveys the state of distributed systems and the Java virtual machine, but also proposes that a Java virtual machine can achieve distributed computing through thread migration. On that basis it puts forward a distributed Java virtual machine controlled by a single control node: the control node manages the virtual machine, while the general nodes execute the threads. The design allows multiple threads to execute concurrently and lets the nodes communicate with one another. The object (heap) module provides access to objects through a unified interface, whether or not they reside on a remote physical node. The threading module is mainly responsible for creating and migrating threads, and also for balancing the allocation of tasks. Much of the current research on applications of the JVM concerns distributed thread migration.

The many conveniences of the Java virtual machine have made it increasingly important, and it has become a universal computing platform. Today the Java virtual machine can run more and more programming and scripting languages, which run well thanks to its characteristics. There are usually two ways to achieve this: one is to implement the language with an interpreter, and the other is to use a specialized compiler that compiles the language into Java code to be executed. The design of the Java virtual machine is very flexible, with no fixed format or process. Different Java virtual machines can be designed for different scenarios to adapt to changes in the underlying platform, so code can generally migrate seamlessly without any change to the upper-layer application interface.


DISTRIBUTED COMPUTING:-

A distributed system is a system made up of several autonomous computers whose nodes communicate over a network and work on a common task. Distributed computing has two common scenarios. In the first, the nature of the application itself requires several computers to take part and share data across the network. In the second, a combination of multiple computers completes a task more effectively than a single computer could. Common distributed architectures include the following models:

  • Client/server (C/S) model: widely used, for example in distributed Web services. Clients mainly use it to obtain data from the server.
  • Closely coupled system: multiple subtasks run in parallel and their results are combined at the end.
  • Unified addressing: a virtual, unified address space is created so that data can be copied between nodes.
  • Three-tier architecture: an intermediate layer between the client side and the server side provides certain services, which simplifies deployment and slims down the client.
  • P2P system: no special node manages the network, and tasks are distributed evenly across the nodes.


DESIGN CONCEPTS:-

Based on the characteristics of the Java virtual machine and of distributed computing, this study sets out to design a new type of distributed virtual machine that supports the migration and concurrent execution of concurrent tasks. To serve as a distributed computing platform, it must provide fully transparent distributed facilities: programmers should be able to write, release, and run code exactly as on a normal Java virtual machine, and unmodified legacy code should run directly. A dynamic code deployment mechanism is designed for this purpose. The system must preserve the Java programming model completely, which requires a distributed Java heap: every node accesses the distributed environment through its local Java heap, can reach any object in the system, and synchronizes reads and writes so that objects remain consistent and intact. The system should also be simple, using only one control node, which is responsible solely for cluster management and takes no part in running Java code. The system belongs to the closely coupled category of distributed architectures described above. Communication between its nodes is very frequent, so the throughput between nodes must be high, and an asynchronous response mechanism is introduced for requests that cannot be answered immediately. The final model is shown below.

[Figure: final model of the distributed Java virtual machine]


COMMUNICATIONS:-

In a distributed system, communication is particularly important. The communication module is mainly responsible for data communication between nodes. It hides the implementation details of network communication and provides network services to the object module and the threading module in the layer above. The implementation uses Linux sockets over TCP for point-to-point communication and UDP for group communication. Communication mainly follows a request-and-answer pattern: a request is sent from one node to another, while a multicast request can produce more than one answer. We set up a control node (master) that listens on port 10000, while the other nodes choose their listening ports independently and report them to the control node only when registering. Before registration a node sends its requests from a random port; once registration is complete, it must use a fixed port. The following figure shows the procedure.

[Figure: node registration procedure]
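The registration step can be sketched with plain TCP sockets. The code below is an illustration under stated assumptions, not the study's implementation: the text message "REGISTER <port>" and the "OK" reply are invented, everything runs on the loopback interface in one process, and the control node binds to an OS-assigned port (port 0) instead of the fixed port 10000 so that the example cannot collide with a port already in use. UDP group communication is not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class RegistrationSketch {
    /** Control node: accepts one registration and returns the port the node reported. */
    static int acceptRegistration(ServerSocket master) throws IOException {
        try (Socket conn = master.accept();
             BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
             PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
            String line = in.readLine(); // hypothetical message, e.g. "REGISTER 40123"
            int nodePort = Integer.parseInt(line.substring("REGISTER ".length()));
            out.println("OK");
            return nodePort;
        }
    }

    /** General node: reports its own listening port to the control node. */
    static String register(int masterPort, int ownListeningPort) throws IOException {
        try (Socket conn = new Socket(InetAddress.getLoopbackAddress(), masterPort);
             PrintWriter out = new PrintWriter(conn.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            out.println("REGISTER " + ownListeningPort);
            return in.readLine();
        }
    }

    /** Runs one registration; true if the control node recorded the node's port. */
    static boolean demo() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket master = new ServerSocket(0, 1, loopback);
             ServerSocket nodeListener = new ServerSocket(0, 1, loopback)) {
            int[] reported = { -1 };
            Thread controlNode = new Thread(() -> {
                try {
                    reported[0] = acceptRegistration(master);
                } catch (IOException | RuntimeException e) {
                    reported[0] = -1;
                }
            });
            controlNode.start();
            String reply = register(master.getLocalPort(), nodeListener.getLocalPort());
            controlNode.join();
            return "OK".equals(reply) && reported[0] == nodeListener.getLocalPort();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("registered: " + demo());
    }
}
```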

Every node, whether control node or general node, runs a listening thread that monitors its port all the time. The listening thread ignores other requests and is responsible only for monitoring the port and processing distributed requests. When a request arrives, the thread looks up the handling function that corresponds to the type of the data packet and passes it the source address of the request, the network data, the data length, and the descriptor of the corresponding network connection. The diagram below illustrates this procedure.

[Figure: request handling by the listening thread]
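The lookup by packet type amounts to a dispatch table. The following sketch shows one possible shape; the Handler interface, the integer packet types, and the string results are assumptions made for illustration, with the four handler arguments taken from the description above.

```java
import java.util.HashMap;
import java.util.Map;

final class RequestDispatcher {
    /** Hypothetical handler signature mirroring the four arguments named in the text. */
    interface Handler {
        String handle(String sourceAddress, byte[] data, int length, int connectionDescriptor);
    }

    private final Map<Integer, Handler> handlers = new HashMap<>();

    /** Associates a packet type with its request handling function. */
    void register(int packetType, Handler handler) {
        handlers.put(packetType, handler);
    }

    /** Called by the listening thread for every packet that arrives. */
    String dispatch(int packetType, String sourceAddress, byte[] data, int connectionDescriptor) {
        Handler handler = handlers.get(packetType);
        if (handler == null) {
            return "ignored"; // not a known distributed request
        }
        return handler.handle(sourceAddress, data, data.length, connectionDescriptor);
    }

    public static void main(String[] args) {
        RequestDispatcher dispatcher = new RequestDispatcher();
        dispatcher.register(1, (src, data, len, fd) -> "type 1 from " + src + ", " + len + " bytes");
        System.out.println(dispatcher.dispatch(1, "10.0.0.2", new byte[4], 7));
        System.out.println(dispatcher.dispatch(99, "10.0.0.2", new byte[0], 7));
    }
}
```

A table keeps the listening thread itself small: supporting a new packet type means registering one more handler rather than editing the receive loop.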

When a request is answered immediately, the handling thread stays connected until the response has been sent, and other requests are blocked in the meantime. If a request takes a long time, delays follow, and if its processing time cannot be predicted, the impact is even greater. To eliminate this effect, we introduced an asynchronous response mechanism. The side that receives the request first generates an ID and sends it to the client. The client then waits, and while it waits the corresponding thread is not scheduled. When the server has finished processing, it sends the response to the client in the form of a request. On receiving it, the client wakes the corresponding thread. In this way unnecessary delay is avoided.
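The asynchronous response mechanism can be sketched as a table of pending requests keyed by ID. This is a single-process illustration with invented names (AsyncResponseSketch, acceptRequest, deliver, await). A CompletableFuture stands in for the parked client thread, and a plain method call stands in for the server's later message; the original system uses network messages between nodes.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

final class AsyncResponseSketch {
    private final AtomicInteger nextId = new AtomicInteger(1);
    private final Map<Integer, CompletableFuture<String>> waiting = new ConcurrentHashMap<>();

    /** Server produces an ID for the request; the client records a slot to wait on. */
    int acceptRequest() {
        int id = nextId.getAndIncrement();
        waiting.put(id, new CompletableFuture<>());
        return id;
    }

    /** The server's later response arrives and wakes the thread waiting on this ID. */
    boolean deliver(int id, String result) {
        CompletableFuture<String> slot = waiting.get(id);
        return slot != null && slot.complete(result);
    }

    /** Client thread blocks here, without holding a connection, until the result arrives. */
    String await(int id, long timeoutMillis) {
        CompletableFuture<String> slot = waiting.get(id);
        if (slot == null) {
            throw new IllegalArgumentException("unknown request id " + id);
        }
        try {
            return slot.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            waiting.remove(id);
        }
    }

    static String demo() {
        AsyncResponseSketch link = new AsyncResponseSketch();
        int id = link.acceptRequest();
        // The "server" finishes the work on another thread and reports back by ID.
        new Thread(() -> link.deliver(id, "result-for-" + id)).start();
        return link.await(id, 5000);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "result-for-1"
    }
}
```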


OBJECTS AND THREADS:-

The object module, also known as the heap module, gives the execution engine transparent access to objects through a unified interface, regardless of whether an object resides on a remote physical node. The system uses 32-bit identifiers assigned by the control node. The same reference denotes the same class or the same object on every node in the system, and a field of an object on another node can be accessed through the object's global reference. References to classes are distributed by the control node. When a general node needs to refer to a class, it must first apply to the control node; once the application is approved, all nodes can access the class.

Java supports multithreading natively, so the memory model of the Java virtual machine is designed from the point of view of the thread. Threads in Java can access data directly. The Java heap is the main memory of the Java virtual machine, and the private memory of each thread is its working memory. The threading module is responsible for creating and migrating threads. When a Java program is executed, the Java virtual machine first creates a new thread. The tasks of the virtual machine are distributed among multiple nodes for concurrent execution, with the thread as the unit of distribution. At startup, the content that general nodes need in order to execute threads is added to the control node, and the ID of the main thread is set to 0x00000001. Whenever a general node creates a new thread, the start function of the class is called, and the general node sends a request containing the thread object to the control node. After accepting the request, the control node allocates an ID to the thread and selects a node to run it. It then adds 1 to the number of threads and sends the request to the selected node. After receiving the request, that node obtains a copy of the object from objects, gets the ClassBloc from classes, and begins to execute the Start() function. This completes the migration of a thread.
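The control node's part in thread creation (allocate an ID, pick a running node, increase the thread count) can be sketched as follows. Several details are assumptions, because the text does not specify them: IDs are handed out sequentially after the main thread's 0x00000001, the node with the fewest assigned threads is chosen, and nodes are identified by plain strings. The class and method names are invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ControlNodeSketch {
    static final int MAIN_THREAD_ID = 0x00000001;

    /** Result of placing one thread: its ID and the node chosen to run it. */
    static final class Placement {
        final int threadId;
        final String node;

        Placement(int threadId, String node) {
            this.threadId = threadId;
            this.node = node;
        }
    }

    private int nextThreadId = MAIN_THREAD_ID + 1; // assumption: sequential IDs
    private int threadCount = 1;                   // the main thread already exists
    private final Map<String, Integer> loadByNode = new LinkedHashMap<>();

    synchronized void registerNode(String node) {
        loadByNode.put(node, 0);
    }

    /** Handles a "new thread" request from a general node. */
    synchronized Placement place() {
        if (loadByNode.isEmpty()) {
            throw new IllegalStateException("no general nodes registered");
        }
        // Assumed policy: least-loaded node, ties broken by registration order.
        String target = null;
        for (Map.Entry<String, Integer> entry : loadByNode.entrySet()) {
            if (target == null || entry.getValue() < loadByNode.get(target)) {
                target = entry.getKey();
            }
        }
        int id = nextThreadId++;
        loadByNode.merge(target, 1, Integer::sum);
        threadCount++;
        return new Placement(id, target);
    }

    synchronized int threadCount() {
        return threadCount;
    }

    public static void main(String[] args) {
        ControlNodeSketch control = new ControlNodeSketch();
        control.registerNode("node-a");
        control.registerNode("node-b");
        for (int i = 0; i < 3; i++) {
            Placement p = control.place();
            System.out.println("thread " + p.threadId + " -> " + p.node);
        }
    }
}
```

In the described system the control node would next forward the request, including the thread object, to the selected node; that network step is left out here.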



MEMORY MANAGEMENT:-

Memory management is responsible for space allocation and garbage collection. At startup, each node asks the system to allocate memory space, which is aligned to 8 bytes. For every allocation, the space is first unlocked, the allocation is performed, and the space is then locked again. The Java virtual machine is also responsible for automatically reclaiming unused space. Reclamation is started in one of two ways: either the user explicitly calls the mechanism to make the Java virtual machine reclaim space, or garbage is collected automatically when a space allocation fails. Garbage recovery is divided into local recovery and overall recovery. Local recovery reclaims garbage on a single node only and is triggered when space allocation fails. Overall recovery reclaims garbage on all nodes; it is triggered by the control node after a general node sends it a request. Overall recovery is also triggered when local recovery still cannot satisfy a heap allocation.
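The allocation path described above (8-byte alignment, local recovery on failure, then overall recovery) can be sketched with a toy heap that only tracks byte counts. The two collectors are passed in as hypothetical callbacks that report how many bytes they freed, and the synchronized keyword stands in for the lock around each allocation. None of these names come from the study.

```java
import java.util.function.IntSupplier;

final class AllocationSketch {
    private final int capacity;
    private int used;
    private final IntSupplier localRecovery;   // bytes reclaimed on this node only
    private final IntSupplier overallRecovery; // bytes reclaimed by a cluster-wide pass
    private final StringBuilder trace = new StringBuilder();

    AllocationSketch(int capacity, IntSupplier localRecovery, IntSupplier overallRecovery) {
        this.capacity = capacity;
        this.localRecovery = localRecovery;
        this.overallRecovery = overallRecovery;
    }

    /** Rounds a size up to the next multiple of 8 bytes. */
    static int align8(int size) {
        return (size + 7) & ~7;
    }

    /** Reserves space and returns the aligned number of bytes actually reserved. */
    synchronized int allocate(int size) {
        int need = align8(size);
        if (used + need > capacity) {          // allocation failed: try local recovery
            used -= Math.min(used, localRecovery.getAsInt());
            trace.append("local;");
        }
        if (used + need > capacity) {          // still not enough: overall recovery
            used -= Math.min(used, overallRecovery.getAsInt());
            trace.append("overall;");
        }
        if (used + need > capacity) {
            throw new OutOfMemoryError("toy heap exhausted");
        }
        used += need;
        return need;
    }

    synchronized String trace() {
        return trace.toString();
    }

    public static void main(String[] args) {
        // Local recovery frees nothing here; overall recovery frees 16 bytes.
        AllocationSketch heap = new AllocationSketch(32, () -> 0, () -> 16);
        System.out.println(heap.allocate(13)); // prints 16
        System.out.println(heap.allocate(10)); // prints 16
        System.out.println(heap.allocate(1));  // prints 8
        System.out.println(heap.trace());      // prints "local;overall;"
    }
}
```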



CONCLUSION:-

At present, distributed computing through thread migration in the Java virtual machine is still rarely used in practice, and relatively little research addresses it. Some of the ideas raised by this study still have flaws; for example, a faulty node may bring down the whole system. Future work should focus on a standby mechanism for general nodes, so that a general node on which an error occurs exits automatically and is replaced by a standby node that quickly continues executing its tasks, and on dynamic management of the system topology, thereby improving the reliability and MTTR of the whole system.

Finally, for Java to be applied in real-time and safety-critical domains, an analysis of the worst-case execution times of primitive Java operations is necessary. All primitive operations must either execute in constant time or have a reasonable upper bound on their execution time. This poses difficulties for a Java virtual machine and a Java compiler, including the implementation of Java's class and interface model, class initialization, monitors, and automatic memory management. A Java virtual machine and compiler that solve these difficulties have been implemented, and their performance has been analyzed.


