Apache Maven Part 1 - Introduction and basics

This multi-post series will be looking at Apache Maven. In this first part, we will provide a high-level overview, explain some of the basics and show an example of a simple Maven-managed project.

But since Apache Maven is a complex tool with a lot of possibilities, this post will focus on using Maven for a simple project without a complex hierarchical structure or a complex build process.
If you already understand the concepts of artifacts, repositories, dependencies and Maven properties, you can skip to the second part, which will cover more complex setups and configurations.


This post is part of a series about Apache Maven.
You can read part two - Maven plugins here.
You can read part three - Maven inheritance and aggregation here.


What is Apache Maven?

Apache Maven is a software tool that helps developers manage and build any Java-based project. The idea behind Maven was to solve the issue of different build results on different machines, which were caused by subtle differences in the versions of the tools used.
It solved this problem by providing a central place to formally define and configure the build process, which is called the Project Object Model (POM).
The project object model controls the versions of the tools used and the steps taken, and the result is a replicable build process. This eliminates the problem of different build results on different machines and also simplifies developers’ lives, because we do not have to learn a new build process every time we switch between projects.

How to configure a project to be managed by Maven

A Maven project is a normal Java project that also contains a POM file (pom.xml), which describes and defines different things related to the build process.
It is always located in the project root folder and it defines the current project’s build process.
We can also nest the projects and pom files to create complex hierarchical structures and multi-module projects, where each module has a specific build process but the build process for all of them is triggered by running a single Maven build from the parent project. We will look more thoroughly at these kinds of setups later in this series.
To sum up, in order to have a project managed by Maven, you just need a pom.xml file in your project folder that describes the build process and Maven can take it from there.
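
As a rough sketch of the nesting mentioned above, a parent project could list its modules in its own pom.xml. The artifactId and the module names (core, web) are hypothetical and only illustrate the shape; the details are left for a later post in this series.

```xml
<!-- Parent pom.xml: a minimal sketch with hypothetical names -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
	<modelVersion>4.0.0</modelVersion>

	<groupId>com.devflection</groupId>
	<artifactId>example-parent</artifactId>
	<version>1.0</version>
	<!-- A project that aggregates modules uses pom packaging -->
	<packaging>pom</packaging>

	<!-- Each entry is a sub-folder that contains its own pom.xml -->
	<modules>
		<module>core</module>
		<module>web</module>
	</modules>
</project>
```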

POM files

As mentioned, the Project Object Model or POM is the unit of work for Maven and pom.xml files are where everything is defined and configured.
An important note is that every Maven version has a super POM from which every POM inherits unless it has a parent specified. The super POM file has a lot of defaults configured, such as the central repository, default output directory for compiled classes, the final name of the artifact, etc.
What is important to know is that everything is inherited from the super POM and can be overridden if needed.
We can find the full content of the super POM in the reference documentation of Maven or in our local Maven install directory inside the jar lib/maven-model-builder-{version}.jar in the folder org/apache/maven/model.

What does a POM file look like?

Since all POM files inherit their defaults from the super POM, a project POM only has to declare what it needs, so POM files vary significantly from project to project, depending on the build complexity.
POM files are XML files that start with a base project element. The project element can be empty or we can add schema references to it to have validation of the file. Nested inside the project element are all the elements that are used to configure Maven.
As mentioned, since the super POM defines a lot of default values, only four elements are mandatory for every Maven project; the rest are optional and can be added as needed.
The ordering of the elements in the POM file is not important so we can order them as we like.
An example of a pom.xml with the schema defined and all the elements looks like this:

<project xmlns="http://maven.apache.org/POM/4.0.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
					  http://maven.apache.org/xsd/maven-4.0.0.xsd">

	<!-- Mandatory elements -->
	<modelVersion>4.0.0</modelVersion>	 
	<groupId>...</groupId>
	<artifactId>...</artifactId>
	<version>...</version>
	
	<!-- Optional elements -->		
	<dependencies>...</dependencies>
	<dependencyManagement>...</dependencyManagement>
	<modules>...</modules>
	<packaging>...</packaging>
	<parent>...</parent>
	<properties>...</properties>
	 
	<!-- Build Settings -->
	<build>...</build>
	<reporting>...</reporting>
	 
	<!-- Information about the project -->
	<contributors>...</contributors>
	<description>...</description>
	<developers>...</developers>
	<inceptionYear>...</inceptionYear>
	<licenses>...</licenses>
	<name>...</name>
	<organization>...</organization>
	<url>...</url>
	 
	<!-- Environment Settings -->
	<ciManagement>...</ciManagement>
	<distributionManagement>...</distributionManagement>
	<issueManagement>...</issueManagement>
	<mailingLists>...</mailingLists>
	<pluginRepositories>...</pluginRepositories>
	<prerequisites>...</prerequisites>
	<profiles>...</profiles>
	<repositories>...</repositories>
	<scm>...</scm>
	
</project>

The mandatory fields are modelVersion, groupId, artifactId and version, and the rest are optional.
modelVersion is used to specify which POM model version the file uses; currently only model version 4.0.0 is supported in Maven 2 and 3.
The other three mandatory fields are used to uniquely identify artifacts in the Maven repositories.
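
To make this concrete, a minimal POM that contains only the four mandatory elements could look like the sketch below. The coordinate values are placeholders chosen for illustration; everything else falls back to the defaults from the super POM.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
	<!-- The only elements Maven insists on -->
	<modelVersion>4.0.0</modelVersion>
	<groupId>com.devflection</groupId>
	<artifactId>minimal-example</artifactId>
	<version>1.0</version>
</project>
```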

Maven repository

A Maven repository holds build artifacts and dependencies of varying types. It is basically a file directory where the artifacts are located.
There are two types of repositories - local and remote. The local repository is located on the computer where Maven is run and it caches the dependencies that have already been downloaded. The default location of the local repository is:

// Windows
C:\Users\<User_Name>\.m2\repository
// Linux
/home/<User_Name>/.m2/repository
// Mac
/Users/<user_name>/.m2/repository

But we can also modify this default location by changing the localRepository field in the settings.xml file, which is located in the conf directory of the Maven install directory (or in a user-specific settings.xml in the .m2 folder), like this:

<localRepository>C:\Users\Devflection\newMavenRepositoryDirectory</localRepository>
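
For context, the localRepository element sits directly under the root settings element. A minimal settings.xml sketch, reusing the illustrative path from above, might look like this:

```xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
		  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
		  xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
							  http://maven.apache.org/xsd/settings-1.0.0.xsd">
	<!-- Overrides the default local repository location -->
	<localRepository>C:\Users\Devflection\newMavenRepositoryDirectory</localRepository>
</settings>
```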

Any other repository is a remote repository; these are usually published on a server and accessed via HTTP.
When Maven finds a dependency it does not have in the local repository, it tries to find it in the specified remote repository, downloads it and caches it in the local repository. If there is no repository specified in the repositories field in the POM file, Maven defaults to the Maven Central repository, which is specified in the super POM.
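
As an illustration, an additional remote repository could be declared in the POM roughly like this. The id, name and URL below are made up for the example and would have to be replaced with those of a real repository server.

```xml
<repositories>
	<repository>
		<!-- Hypothetical company repository; id, name and url are placeholders -->
		<id>devflection-releases</id>
		<name>Devflection Releases</name>
		<url>https://repo.example.com/maven/releases</url>
	</repository>
</repositories>
```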

Identifying artifacts

To locate artifacts in the Maven repositories the three mandatory fields are used - groupId, artifactId and version.
This trio appears often in the Maven landscape, and it is used to uniquely identify a project (groupId and artifactId) at a specific point in time (version).

  • groupId represents the organization or the project top level name (e.g. com.devflection).
  • artifactId represents the name of the project (e.g. maven-example-project).
  • version represents the version of the project (e.g. 1.2.3).

The values inside all three fields point to a location in the directory structure in the repository where we can find the artifact or where it should be uploaded.
For example, a POM with the values:

<groupId>com.devflection</groupId>
<artifactId>maven-example-project</artifactId>
<version>1.0</version>

will mean that in the Maven repository ($M2_REPO), our maven-example-project artifact version 1.0 should be located in the folder

$M2_REPO/com/devflection/maven-example-project/1.0

Another important field is the packaging field since it tells Maven how we want our project to be packaged. The value of the element can be any of the following: pom, jar, maven-plugin, ejb, war, ear, rar, and if nothing is present it defaults to jar.
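
For example, a hypothetical web application could ask for a WAR instead of the default JAR like this (the artifactId is an assumption made for illustration):

```xml
<groupId>com.devflection</groupId>
<artifactId>maven-example-webapp</artifactId>
<version>1.0</version>
<!-- Omitting this element would be the same as writing jar -->
<packaging>war</packaging>
```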

Dependencies

Now that we know what a Maven repository is and how Maven locates artifacts, we can begin to imagine how Maven manages dependencies for projects. We can list dependencies of our project in the POM file, and then Maven searches for the artifacts in the Maven repository and includes them in our build process at the specified point. The drawback of this is that Maven only supports dependencies that are in a Maven repository.
The project’s dependencies are listed in the dependencies field in the POM file using the identifying trio (groupId, artifactId and version), which are again mandatory.
The dependencies can also have other fields set:

  • type
    The kind of dependency we want to include. If nothing is present, it defaults to jar.

  • classifier
    With the classifier, we can distinguish between artifacts that were built from the same POM but differ in content.
    For example, some projects use the classifiers sources and javadoc to share the project source files and the API documentation.

  • scope
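    The scope element controls which classpaths (compile, test, runtime) the dependency is added to and whether it is passed on transitively. There are five scope values available for regular dependencies (a sixth, import, is only used inside dependencyManagement):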

    • compile
      This scope is the default one. Compile dependencies are available in all classpaths.

    • provided
      With this scope, we indicate that we expect the JDK or container to provide the dependency. The dependency is available in compile and test classpaths, and it is not transitive.

    • runtime
      With this scope, we indicate that the dependency is not needed for compiling, but is for running. The dependency is available in the runtime and test classpaths, but not in compile classpath.

    • test
      With this scope, we indicate that the dependency is not needed for normal application execution and is only available when compiling and running tests. This scope is also not transitive.

    • system
      With this scope, we indicate that the dependency is always there and we have to supply the jar which contains it.

  • systemPath
    This element is used only if the dependency scope is system. The value of the element should be the absolute path to the jar containing this dependency.

  • optional
    With the optional element, we mark dependencies that are not necessary for users of our project. This is used in specific cases where our project is itself used as a dependency of another project, and our project depends on some other project only to compile code that is not used at runtime. We can then mark that dependency as optional, since the project that depends on ours does not need it.

  • exclusions
    With this element, we tell Maven we do not want to include a specific transitive dependency in our project. We can also exclude all transitive dependencies of a dependency by putting the wildcard character (*) in both the groupId and artifactId.

A complete dependency example looks like this:

<dependencies>
	<dependency>
	  <groupId>...</groupId>
	  <artifactId>...</artifactId>
	  <version>...</version>
	  <type>...</type>
	  <scope>...</scope>
	  <classifier>...</classifier>
	  <optional>...</optional>
	  <systemPath>...</systemPath>
	  <exclusions>
		<exclusion>
			<groupId>...</groupId>
			<artifactId>...</artifactId>
		</exclusion>
	  </exclusions>
	</dependency>
	...

</dependencies>
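
To make the fields more tangible, here is a sketch of a dependencies section with concrete values. The junit, servlet API and PostgreSQL coordinates refer to real public artifacts, but the versions are only illustrative; the com.devflection artifact and the excluded logging library are hypothetical.

```xml
<dependencies>
	<!-- compile scope (the default): available in all classpaths -->
	<dependency>
		<groupId>com.devflection</groupId>
		<artifactId>utils-io</artifactId>
		<version>1.0.2</version>
		<exclusions>
			<!-- Hypothetical transitive dependency we do not want -->
			<exclusion>
				<groupId>com.example</groupId>
				<artifactId>legacy-logging</artifactId>
			</exclusion>
		</exclusions>
	</dependency>

	<!-- provided: we expect the servlet container to supply this -->
	<dependency>
		<groupId>javax.servlet</groupId>
		<artifactId>javax.servlet-api</artifactId>
		<version>4.0.1</version>
		<scope>provided</scope>
	</dependency>

	<!-- runtime: not needed to compile, only to run -->
	<dependency>
		<groupId>org.postgresql</groupId>
		<artifactId>postgresql</artifactId>
		<version>42.2.5</version>
		<scope>runtime</scope>
	</dependency>

	<!-- test: only used when compiling and running tests -->
	<dependency>
		<groupId>junit</groupId>
		<artifactId>junit</artifactId>
		<version>4.11</version>
		<scope>test</scope>
	</dependency>
</dependencies>
```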

Properties

One final thing we are going to look at in this post is Maven properties.
We can add a properties field to the POM file where we can define all the properties in one place and reuse them in other places in the same POM or the child POM files.
To access a property we listed in the properties field we can use the notation ${PROPERTY_NAME}.
For example, if we define a property like this:

<properties>
	<deflection-utils.version>1.0.2</deflection-utils.version>
</properties>

We can reuse it in the dependencies list like so:

<dependencies>
	<dependency>
		<groupId>com.devflection</groupId>
		<artifactId>utils-collections</artifactId>
		<version>${deflection-utils.version}</version>
	</dependency>
	<dependency>
		<groupId>com.devflection</groupId>
		<artifactId>utils-io</artifactId>
		<version>${deflection-utils.version}</version>
	</dependency>
</dependencies>

And then later, when we want to upgrade to a newer version of utils, we just update the property value and every dependency that references it will use the new version.

There are also four other groups of properties we can access in the POM files and they are accessed by using a prefix in the notation. The groups are:

  • Environment
    Environment properties are accessible under the prefix env; e.g. ${env.PATH}.

  • Project
    Project properties are accessible under the prefix project; e.g. ${project.version}.

  • Settings
    Properties that are defined in a settings.xml file are accessible with the prefix settings; e.g. ${settings.localRepository}.

  • Java
    Java system properties are accessible by their name, which typically starts with the prefix java; e.g. ${java.home}.
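
A small sketch of how these prefixed properties might be used inside a POM follows. The property names being defined (report.directory, build.java.home, build.user.path, local.repo) are invented for this example.

```xml
<properties>
	<!-- Project property: a folder inside the build output directory -->
	<report.directory>${project.build.directory}/reports</report.directory>
	<!-- Java system property -->
	<build.java.home>${java.home}</build.java.home>
	<!-- Environment variable of the machine running the build -->
	<build.user.path>${env.PATH}</build.user.path>
	<!-- Value taken from settings.xml -->
	<local.repo>${settings.localRepository}</local.repo>
</properties>
```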

Example

Now that we have learned some of the basics of Maven, let us put it all together and look at a POM file of a simple example project. The POM file that we are using for our example project is:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
		 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
		 xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>

	<groupId>com.devflection</groupId>
	<artifactId>simple-maven-example</artifactId>
	<version>1.0-SNAPSHOT</version>
	<packaging>jar</packaging>

	<properties>        
		<apache.commons.lang3.version>3.9</apache.commons.lang3.version>
		<junit.version>4.11</junit.version>
	</properties>

	<dependencies>
		<dependency>
			<groupId>junit</groupId>
			<artifactId>junit</artifactId>
			<version>${junit.version}</version>
			<scope>test</scope>
		</dependency>
		<dependency>
			<groupId>org.apache.commons</groupId>
			<artifactId>commons-lang3</artifactId>
			<version>${apache.commons.lang3.version}</version>
		</dependency>
	</dependencies>

</project>

This is almost the simplest project setup we might see. We have the mandatory trio to identify the project - groupId, artifactId and version, together with the packaging, which specifies that we want to build a jar file. Then we have a couple of properties that are used later in the dependencies to set the versions of the dependencies.
Our POM file relies a lot on the super POM to provide the default settings.
For example, some of the settings in the super POM that we are counting on are:

  • Maven Central repository
    The Maven Central repository is used to search for the listed dependencies.
  • Output location for the build
    Everything is built into the target folder in the project root folder.
  • Output location for compiled class files
    All the classes are compiled into the classes folder in the target folder.
  • Final artifact build name
    The final build artifact name is formed by joining the artifactId and version fields with a - between them.

The part of the super POM that defines this is:

<project>
	<modelVersion>4.0.0</modelVersion>

	<repositories>
		<repository>
			<id>central</id>
			<name>Central Repository</name>
			<url>https://repo.maven.apache.org/maven2</url>
			<layout>default</layout>
			<snapshots>
				<enabled>false</enabled>
			</snapshots>
		</repository>
	</repositories>

	...

	<build>
		<directory>${project.basedir}/target</directory>
		<outputDirectory>${project.build.directory}/classes</outputDirectory>
		<finalName>${project.artifactId}-${project.version}</finalName>
		...
	</build>

	...
</project>

These are just some of the settings in the super POM that we are relying on and there are more, so make sure to check the super POM in your Maven version to see what is defined there and to get a better understanding of why Maven does things a particular way.
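
As a sketch of how one of these defaults could be overridden, the example project might set its own finalName in its POM. The chosen name is arbitrary and only serves as an illustration.

```xml
<build>
	<!-- Overrides the super POM default of ${project.artifactId}-${project.version} -->
	<finalName>simple-maven-example</finalName>
</build>
```

With this in place, the built jar would be named simple-maven-example.jar instead of simple-maven-example-1.0-SNAPSHOT.jar.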

As always, the full project is on GitHub.


This is it for our Maven basics post.
Thank you for reading through. I hope you found it useful, and be sure to check out the next post, where we will dive deeper into Maven plugins.

