Java ClassLoader and jar path parsing

Note: this article is a repost. Original link:

Java classloader and jar path analysis - Jianshu https://www.jianshu.com/p/546a7e3dc427

1, Basic principle of class loader

 

In JDK 8, which this article uses, the virtual machine provides three class loaders: the Bootstrap loader, the Ext loader, and the App loader. They load classes using the parent delegation model.

Bootstrap class loader: it mainly loads the classes that the JVM itself needs. It is implemented in C++ and is part of the virtual machine. It loads into memory the core class libraries under the {jdk}/lib path, or the jars on the path given by the -Xbootclasspath option. Note that the virtual machine recognizes these jars by file name, for example rt.jar. A jar whose file name the virtual machine does not recognize is not loaded, even if you drop it into the lib directory. (For security reasons, the bootstrap class loader only loads classes whose package names begin with java, javax, sun, and so on.)

Ext class loader: this is the sun.misc.Launcher$ExtClassLoader class. It is written in Java and is a static nested class of Launcher. It loads the class libraries in the {jdk}/lib/ext directory, or in the directories given by the system property -Djava.ext.dirs. Developers can use the extension class loader directly.

App class loader: sun.misc.Launcher$AppClassLoader. It loads the class libraries on the system classpath, that is, the paths given by java -classpath or -Djava.class.path. This is the classpath we use all the time. Developers can use the system class loader directly, and it is generally the default class loader of a program.

Bootstrap is the top-level class loader. Ext has Bootstrap as its parent, and App has Ext as its parent. (In Java code, ExtClassLoader.getParent() returns null, which stands for Bootstrap.) When a class is to be loaded, the request goes to the parent loader first. Only when the parent cannot load the class does the loader try to load it itself. (For details, search for the parent delegation model.)
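The parent chain described above can be inspected at run time. The following is a minimal sketch, not part of the original article: the class name LoaderChainDemo and the label "bootstrap" are made up for illustration. On JDK 8 the loader names should match the ones in this article. On later JDKs the class names differ.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    /** Returns the class name of the loader and of each ancestor, ending with "bootstrap". */
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // getParent() returns null once we reach the bootstrap loader,
        // which has no Java object of its own. "bootstrap" is just a label.
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chain(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
    }
}
```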

ExtClassLoader and AppClassLoader are subclasses of URLClassLoader (Bootstrap is implemented in C++, so it is not). When we define our own ClassLoader, we generally extend URLClassLoader.

ClassLoader

ClassLoader is the parent class of all class loaders. It has three main methods: loadClass (load a class), findClass (locate the class file, on disk or from a network stream), and defineClass (turn the class bytes into a Class inside the JVM).
Every class load goes through loadClass, whose main logic is as follows:

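// Simplified from java.lang.ClassLoader (JDK 8); only the core logic is kept
protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    Class<?> c = findLoadedClass(name);  // Has this class already been loaded?
    if (c == null) {
        try {
            if (parent != null) {
                c = parent.loadClass(name, false);   // Delegate to the parent loader first
            } else {
                c = findBootstrapClassOrNull(name);  // No parent: ask the bootstrap loader
            }
        } catch (ClassNotFoundException e) {
            // The parent could not find the class; fall through and try ourselves
        }
        if (c == null) {
            c = findClass(name);    // Locate the class ourselves and load it into memory
        }
    }
    return c;
}

// Default implementation; subclasses override it
protected Class<?> findClass(String name) throws ClassNotFoundException {
    throw new ClassNotFoundException(name);
}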
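The delegation order in loadClass can be observed with a small experiment. The sketch below is not from the original article. RecordingLoader is a hypothetical loader that records every name that reaches findClass, and com.example.DoesNotExist is a made-up class name chosen because no loader can find it.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {
    /** Hypothetical loader that records every class name that reaches findClass. */
    static class RecordingLoader extends ClassLoader {
        final List<String> findClassCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls.add(name);  // only reached after the parents have failed
            throw new ClassNotFoundException(name);
        }
    }

    /** Tries to load a class and reports whether it succeeded. */
    static boolean canLoad(ClassLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        System.out.println(canLoad(loader, "java.lang.String"));          // true
        System.out.println(loader.findClassCalls);                        // []
        System.out.println(canLoad(loader, "com.example.DoesNotExist"));  // false
        System.out.println(loader.findClassCalls);                        // [com.example.DoesNotExist]
    }
}
```

java.lang.String is resolved by a parent, so findClass is never called for it. Only the unknown class falls through to the loader's own findClass.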

URLClassLoader

URLClassLoader extends ClassLoader. Its main job is to locate a class file from the fully qualified class name (package name + class name).
Let's look at the URLClassLoader constructors:

URLClassLoader(URL[] urls, ClassLoader parent) 
URLClassLoader(URL[] urls, ClassLoader parent,AccessControlContext acc) 
public URLClassLoader(URL[] urls)
URLClassLoader(URL[] urls, AccessControlContext acc)
Each constructor takes a URL[] parameter, which lists the locations where classes can be found (a file path, a network location, or a jar path). When a class is loaded, the loader searches these locations.
Therefore, when we write a custom class loader, we generally extend URLClassLoader. We pass the URLs of the class locations to URLClassLoader, and it finds and loads the classes under them for us, so we do not have to write the lookup logic ourselves.
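To see the URL[] search path at work without compiling extra classes, the following sketch looks up a plain resource instead of a class. It is an illustration under stated assumptions, not code from the original article: the class name UrlSearchPathDemo and the file hello.txt are made up, and the parent loader is set to null so that only the given directory (and the bootstrap loader) is searched.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class UrlSearchPathDemo {
    /** Writes hello.txt into a fresh temp directory, then finds it through a URLClassLoader. */
    static String findThroughLoader() {
        try {
            Path dir = Files.createTempDirectory("loader-demo");
            Files.write(dir.resolve("hello.txt"), "hello".getBytes(StandardCharsets.UTF_8));

            // toUri() on an existing directory yields a URL that ends with "/",
            // which marks the entry as a directory rather than a jar.
            URL[] searchPath = { dir.toUri().toURL() };
            try (URLClassLoader loader = new URLClassLoader(searchPath, null)) {
                URL found = loader.getResource("hello.txt");
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(found.openStream(), StandardCharsets.UTF_8))) {
                    return reader.readLine();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findThroughLoader());  // hello
    }
}
```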
URLClassLoader overrides findClass. The main logic is as follows:
protected Class<?> findClass(final String name) throws ClassNotFoundException {
    String path = name.replace('.', '/').concat(".class");  // name is the fully qualified class name
    Resource res = ucp.getResource(path, false);   // ucp wraps the URL[]; search each URL for the class file
    if (res == null) {
        throw new ClassNotFoundException(name);
    }
    try {
        return defineClass(name, res);  // Load the class into JVM memory
    } catch (IOException e) {
        throw new ClassNotFoundException(name, e);
    }
}

As you can see, URLClassLoader does two things: it finds the class file under its paths and loads it into memory.
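The first step of findClass, turning a class name into a resource path, can be tried in isolation. This is a small sketch rather than JDK code. The helper name toResourcePath is made up, and the URL printed for java.lang.String depends on the JDK version and installation, so no output is claimed for it.

```java
import java.net.URL;

class ClassFilePathDemo {
    /** Converts a fully qualified class name into the resource path of its class file. */
    static String toResourcePath(String className) {
        return className.replace('.', '/').concat(".class");
    }

    public static void main(String[] args) {
        String path = toResourcePath("java.lang.String");
        System.out.println(path);  // java/lang/String.class

        // Ask the class loaders where that resource lives.
        // The printed URL depends on the JDK version and installation.
        URL location = ClassLoader.getSystemResource(path);
        System.out.println(location);
    }
}
```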
Now let's demonstrate

Examples

Example 1:

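import java.net.URL;
import java.net.URLClassLoader;

public class TestClass {
    public static void main(String[] args) {
        TestClass testClass = new TestClass();
        ClassLoader classLoader = testClass.getClass().getClassLoader();
        // On JDK 8 the application class loader is a URLClassLoader, so this cast succeeds
        URL[] urls = ((URLClassLoader) classLoader).getURLs();
        for (URL url : urls) {
            System.out.println(url);
        }
    }
}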

From the above, we know that our code is loaded by AppClassLoader by default, which is a URLClassLoader. Printing its URLs gives the following result:

file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/charsets.jar
file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/deploy.jar
file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/ext/cldrdata.jar
file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/ext/dnsns.jar
file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/ext/jaccess.jar
file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/ext/jfxrt.jar
file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/ext/localedata.jar
.......

As you can see, they are all jars on our classpath.
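Example 1 relies on the application class loader being a URLClassLoader. That holds on JDK 8, but on JDK 9 and later the application class loader is no longer a URLClassLoader and the cast throws ClassCastException. A version-independent way to list the same entries is to read the java.class.path system property, as in the sketch below. The class and method names are made up for illustration, and the entries are printed as plain paths rather than URLs.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class ClassPathEntriesDemo {
    /** Splits a class path string into its entries, skipping empty ones. */
    static List<String> entries(String classPath) {
        List<String> result = new ArrayList<>();
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows
        for (String entry : classPath.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String entry : entries(System.getProperty("java.class.path"))) {
            System.out.println(entry);
        }
    }
}
```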


Example 2:
We put this test code into a Spring Boot project, build it into a jar, and run it. The code above then prints the following:

jar:file:/Users/yt/test/spring-boot-test.jar!/BOOT-INF/classes!/
jar:file:/Users/yt/test/spring-boot-test.jar!/BOOT-INF/lib/api-core-0.0.4-SNAPSHOT.jar!/
jar:file:/Users/yt/test/spring-boot-test.jar!/BOOT-INF/lib/raptor-es-common-1.0.3-SNAPSHOT.jar!/
jar:file:/Users/yt/test/spring-boot-test.jar!/BOOT-INF/lib/httpclient-4.5.7.jar!/
jar:file:/Users/yt/test/spring-boot-test.jar!/BOOT-INF/lib/httpmime-4.5.7.jar!/
jar:file:/Users/yt/test/spring-boot-test.jar!/BOOT-INF/lib/httpcore-4.4.11.jar!/
These are the third-party jars that our project depends on. These jar paths are passed to the class loader, which searches them when it loads a class.
Notice that these paths contain the !/ symbol. It is a separator specific to Java's jar URL syntax, and the part before it denotes a jar file. When Java reads such a URL, it opens the jar and reads the entry from inside it. (A jar cannot be read like an ordinary file: it is a compressed archive and must be decompressed.)
Now a question: why does the URL in example 1 have the form file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/charsets.jar rather than file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/charsets.jar!/, ending with !/? If !/ marks a jar file that the JVM must open and read, as in example 2 where every jar path ends with !/, why is there no !/ here?

2, jar file path analysis

URL class resolution

URLClassLoader searches for classes through its URL[]. Let's look at how URL is implemented, starting with the constructors:

public URL(String spec) throws MalformedURLException {
    this(null, spec);
}
public URL(URL context, String spec) throws MalformedURLException {
    this(context, spec, null);
}
// Abridged: only the steps relevant here are shown
public URL(URL context, String spec, URLStreamHandler handler) {
    protocol = getProto(spec);  // The characters before the first ':' are the protocol
    this.handler = getURLStreamHandler(protocol);  // Look up the handler class for this protocol; it does the reading and writing
    this.handler.parseURL(this, spec, start, limit); // Parse and validate the spec
}

Let's take a look at getURLStreamHandler:

static URLStreamHandler getURLStreamHandler(String protocol) {
        URLStreamHandler handler = null;
        // Read the JVM system property "java.protocol.handler.pkgs".
        // This means we can define our own URL protocol: implement a Handler class
        // for it and add the package prefix of that class to this property.
        String packagePrefixList = java.security.AccessController.doPrivileged(
                new sun.security.action.GetPropertyAction("java.protocol.handler.pkgs", "")
        );
        if (packagePrefixList != "") {
            packagePrefixList += "|";
        }

        packagePrefixList += "sun.net.www.protocol";

        StringTokenizer packagePrefixIter =
                new StringTokenizer(packagePrefixList, "|");

        while (handler == null && packagePrefixIter.hasMoreTokens()) {

            String packagePrefix = packagePrefixIter.nextToken().trim();
            try {
                String clsName = packagePrefix + "." + protocol +
                        ".Handler";
                Class<?> cls = null;
                try {
                    cls = Class.forName(clsName);
                } catch (ClassNotFoundException e) {
                    ClassLoader cl = ClassLoader.getSystemClassLoader();
                    if (cl != null) {
                        cls = cl.loadClass(clsName);
                    }
                }
                if (cls != null) {
                    handler =
                            (URLStreamHandler) cls.newInstance();
                }
            } catch (Exception e) {
                // any number of exceptions can get thrown here
            }
        }
        return handler;

    }

The code above shows that when we construct a URL with the jar protocol, such as new URL("jar:file:/yt/test/test.jar!/"), the handler responsible for the jar file is sun.net.www.protocol.jar.Handler. (There are also sun.net.www.protocol.file.Handler, sun.net.www.protocol.http.Handler, and so on.) Whenever we read from or write to the URL, this handler is used internally. So reading a jar file goes through the jar Handler, and reading an http URL goes through the http Handler.
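To make the role of the handler concrete, here is a sketch of a custom protocol. It is an illustration, not part of the original article: the "demo" protocol and the DemoHandler class are made up, and the handler is passed directly through the three-argument URL constructor instead of being registered through java.protocol.handler.pkgs. On recent JDKs the URL constructors are deprecated, so the compiler may print a warning.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class CustomHandlerDemo {
    /** Hypothetical handler for a made-up "demo" protocol; it serves the URL path as content. */
    static class DemoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL url) {
            return new URLConnection(url) {
                @Override
                public void connect() {
                    // nothing to connect to in this sketch
                }

                @Override
                public InputStream getInputStream() {
                    String body = "content of " + getURL().getPath();
                    return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    /** Reads the whole content of a "demo:" URL through DemoHandler. */
    static String read(String spec) {
        try {
            URL url = new URL(null, spec, new DemoHandler());
            // openStream() asks the URL's handler for a connection and reads from it
            try (InputStream in = url.openStream();
                 Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
                return scanner.hasNext() ? scanner.next() : "";
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(read("demo:/hello.txt"));  // content of /hello.txt
    }
}
```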

this.handler.parseURL(this, spec, start, limit); mainly validates the URL. For the jar protocol, it checks that the spec contains !/, and reports an error if it is missing. So we have to write new URL("jar:file:/yt/test/test.jar!/") to avoid the error. The main logic of the check is as follows (decompiled code, where var1 is the spec):

int var6;
if ((var6 = indexOfBangSlash(var1)) == -1) {
    // A jar URL must contain the "!/" separator
    throw new NullPointerException("no !/ in spec");
} else {
    try {
        // The part before "!/" must itself be a valid URL (the location of the jar file)
        String var4 = var1.substring(0, var6 - 1);
        new URL(var4);
        return var1;
    } catch (MalformedURLException var5) {
        throw new NullPointerException("invalid url: " + var1 + " (" + var5 + ")");
    }
}
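The effect of this check can be verified from the outside. In the sketch below (the class name JarUrlSyntaxDemo is made up), constructing a URL only parses the string and does not open the jar, so the jar file need not exist. The URL constructor reports the failed check as a MalformedURLException. On recent JDKs the URL constructors are deprecated, so the compiler may print a warning.

```java
import java.net.MalformedURLException;
import java.net.URL;

class JarUrlSyntaxDemo {
    /** Returns true if the JDK accepts the string as a URL. */
    static boolean isAccepted(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        System.out.println(isAccepted("jar:file:/yt/test/test.jar"));    // false: no "!/"
        System.out.println(isAccepted("jar:file:/yt/test/test.jar!/"));  // true

        URL url = new URL("jar:file:/yt/test/test.jar!/BOOT-INF/classes/");
        System.out.println(url.getProtocol());  // jar
        System.out.println(url.getFile());      // file:/yt/test/test.jar!/BOOT-INF/classes/
    }
}
```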

URLClassLoader

The most important job of URLClassLoader is to find the class to be loaded in the URL[] list. This is the findClass method:

protected Class<?> findClass(final String name) {
        String path = name.replace('.', '/').concat(".class");  // name is the fully qualified class name
        Resource res = ucp.getResource(path, false);   // ucp wraps the URL[]; search each URL for the class to load
        return defineClass(name, res);  // Load the class into JVM memory
}

ucp is a URLClassPath object. Let's look at the ucp.getResource method (the original is quite complex, so it is summarized here):

public Resource getResource(final String var1, final boolean var2) {
    for (URL url : urls) {     // urls is the URLClassLoader's URL[], the list of paths searched for classes
        URLClassPath.Loader loader = getLoader(url);
        Resource res = loader.getResource(var1, var2);
        if (res != null) return res;
    }
    // The original method also caches loaders and applies other optimizations here
    return null;
}

private URLClassPath.Loader getLoader(final URL var1) throws IOException {
        String var1x = var1.getFile();
        if (var1x != null && var1x.endsWith("/")) {
            return (URLClassPath.Loader)("file".equals(var1.getProtocol()) ? new URLClassPath.FileLoader(var1) : new URLClassPath.Loader(var1));
        } else {
            return new URLClassPath.JarLoader(var1, URLClassPath.this.jarHandler, URLClassPath.this.lmap);
        }
}

URLClassPath.Loader is an inner class of URLClassPath. Its getResource logic mainly checks whether the class exists under the loader's URL.
Now let's return to the question above:

1. When we run outside a packaged jar, the class path entries look like this (they belong to AppClassLoader): file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/charsets.jar

In getLoader, such a URL takes the new URLClassPath.JarLoader() branch. This is a JarLoader, so the entry is read the way a jar is read.

 

2. When we run a jar packaged by Spring Boot, the class path entries look like this: jar:file:/Users/yt/test/spring-boot-test.jar!/BOOT-INF/lib/api-core-0.0.4-SNAPSHOT.jar!/

In getLoader, such a URL takes the new URLClassPath.Loader(var1) branch. The URL itself uses the jar protocol, so it is read through the jar protocol handler.
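The branch taken by getLoader depends only on the URL's protocol and on whether its file part ends with "/". The sketch below mirrors that decision and returns the name of the branch instead of a real loader, since URLClassPath is an internal class. The class name LoaderChoiceDemo, the method choose, and the directory path /Users/yt/classes/ are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;

class LoaderChoiceDemo {
    /** Mirrors the branch structure of getLoader, returning the branch name instead of a loader. */
    static String choose(String spec) {
        URL url;
        try {
            url = new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
        String file = url.getFile();
        if (file != null && file.endsWith("/")) {
            return "file".equals(url.getProtocol()) ? "FileLoader" : "Loader";
        }
        return "JarLoader";
    }

    public static void main(String[] args) {
        // A plain jar on the class path: no trailing "/", so it is opened as a jar.
        System.out.println(choose(
            "file:/Library/Java/JavaVirtualMachines/jdk1.8.0_65.jdk/Contents/Home/jre/lib/charsets.jar"));  // JarLoader
        // A nested Spring Boot entry: ends with "/" and uses the jar protocol.
        System.out.println(choose(
            "jar:file:/Users/yt/test/spring-boot-test.jar!/BOOT-INF/lib/api-core-0.0.4-SNAPSHOT.jar!/"));  // Loader
        // A directory of class files (hypothetical path).
        System.out.println(choose("file:/Users/yt/classes/"));  // FileLoader
    }
}
```

This is why the plain jar in example 1 needs no !/ suffix: a URL that does not end with "/" is assumed to point at a jar.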

3, getResource

We create a project whose directory is as follows:

src/main/java: TestClass.java
src/main/resources: /res.txt

import java.net.URL;

public class TestClass {
    public static void main(String[] args) {
        TestClass testClass = new TestClass();
        // A leading "/" means the name is resolved from the root of the classpath
        URL fileURL = testClass.getClass().getResource("/res.txt");
        System.out.println(fileURL.getFile());
    }
}

Running this program prints:
/Users/yt/test/res.txt
We then package the project as a jar (test.jar). Running it prints:
/Users/yt/test.jar!/res.txt

Therefore, a file inside a jar has a URL of the form jar:file:{path of the jar}!/{path inside the jar}
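A resource inside a jar is not a file on disk, so it has to be read as a stream through its jar URL rather than opened as a File. The following sketch builds a small temporary jar and reads one entry back through a URL of the form above. It is an illustration only: the class name JarEntryReadDemo, its helper methods, and the entry content are made up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

class JarEntryReadDemo {
    /** Creates a temporary jar that contains a single text entry. */
    static Path createJar(String entryName, String content) {
        try {
            Path jar = Files.createTempFile("demo", ".jar");
            try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
                out.putNextEntry(new JarEntry(entryName));
                out.write(content.getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the first line of an entry through a jar:file:{jar}!/{entry} URL. */
    static String readFirstLine(Path jar, String entryName) {
        try {
            // jar.toUri() supplies the "file:" part of the URL
            URL url = new URL("jar:" + jar.toUri() + "!/" + entryName);
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                return reader.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path jar = createJar("res.txt", "hello from inside the jar");
        System.out.println(readFirstLine(jar, "res.txt"));  // hello from inside the jar
    }
}
```

In application code the same idea usually appears as getResourceAsStream("/res.txt"), which works both from a directory and from a jar.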



Author: one day's
Link: https://www.jianshu.com/p/546a7e3dc427
Source: Jianshu
The copyright belongs to the author. For commercial reprint, please contact the author for authorization. For non-commercial reprint, please indicate the source.

 

Posted by anybody99 on Tue, 10 May 2022 13:47:43 +0300