XML editing with Bash script

December 5, 2013
Romain Pelisse
Related topics:
DevOps, Java, Linux
Related products:
Red Hat Enterprise Linux


    Countless products use XML files, whether for data persistence, serialization, or mere configuration. This is even more true of the Red Hat middleware portfolio: the JBoss projects have always been keen on this format for configuration files, on top of the ones specified by JEE such as the famous (or infamous?) web.xml. While the XML format has some definite qualities, it is not the easiest format to parse, and this often causes issues when integrating a product inside an RPM or designing an automated installation procedure.

    As I've been working on such automation for most of my career, I've picked up a bunch of nifty tricks and also designed some useful practices that I wanted to share on this blog.

    Command lines

    While one can use 'sed' or 'awk' to process XML files, it is always a tricky job. Indeed, those tools, contrary to the XML standard, assume that spacing and line breaks within the file are structured and meaningful. For instance, if a 'sed' statement assumes that the XML attribute to edit is on the same line as the node tag, it will break as soon as the file spacing is modified, even though the XML file remains valid.

    Along with this, it is also extremely difficult to rely on such editing tools to perform rather crucial XML changes, such as adding a child node or removing a complete block. The bottom line is: those awesome and standard tools are simply not the best ones for the job.
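To make this fragility concrete, here is a minimal sketch of the failure mode described above. The helper name `set_offset_with_sed` and the two sample fragments are made up for the illustration: the same edit is applied to a one-line element and to an equivalent element whose attribute sits on the next line.

```shell
#!/bin/bash
# Hypothetical helper: naive, line-based edit of a port-offset attribute.
# It only works when the tag name and the attribute share one line.
set_offset_with_sed() {
  sed -e 's;<socket-bindings port-offset="[0-9]*";<socket-bindings port-offset="'"$1"'";'
}

# Two fragments that mean exactly the same thing to an XML parser.
one_line='<socket-bindings port-offset="100"/>'
two_lines=$(printf '<socket-bindings\n    port-offset="100"/>')

# The one-line form is edited as expected...
printf '%s\n' "${one_line}" | set_offset_with_sed 250
# ...while the two-line form goes through unchanged, without any error.
printf '%s\n' "${two_lines}" | set_offset_with_sed 250
```

The second call is the dangerous one: nothing fails, the file is simply left as it was.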

    In this section, we will therefore introduce a couple of tools, available on any good Linux distribution (or easy to install), that will provide better support to handle XML content.

    Validation

    One of the good things about XML is that it's a structured format. The bad thing, however, is that it's quite easy to break that structure. For this reason, when editing such a file within a script, it's important to check both before and after editing that the content is still proper XML.

    I quite recently discovered the command 'xmlwf', which comes with the 'expat' package and performs such a check:

    $ xmlwf /tmp/index.xml
    /tmp/index.xml:825:2: mismatched tag

    While quite old and not perfect (for instance, an invalid file does not cause the command to return a non-zero status), this command is still quite handy to me on a daily basis.
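Since the exit status cannot be relied upon here, a script needs another way to detect a broken file. The sketch below assumes that 'xmlwf' prints a diagnostic for a malformed file (as in the example above) and prints nothing for a well-formed one. The function name `is_well_formed` is hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper around xmlwf: any output is treated as a failure,
# so the caller does not depend on the command's exit status.
is_well_formed() {
  local file="$1" report
  report=$(xmlwf "${file}" 2>&1)
  if [ -n "${report}" ]; then
    # forward the diagnostic to stderr and signal the failure
    echo "${report}" >&2
    return 1
  fi
  return 0
}
```

A script running under 'set -e' can then call `is_well_formed /tmp/index.xml` and stop on the first malformed file.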

    XML editing

    While 'xmlwf' is helpful, the hard part of handling XML files certainly does not lie in their validation, but in their editing. As stated previously, adding or removing child elements, or tweaking attributes, is simply not easy to achieve with the regular script tricks. Fortunately, another useful command comes to our rescue for this purpose: 'xsltproc', shipped with the 'libxslt' package (itself built on 'libxml2').

    This command processes an XML file using an XSLT style sheet, which makes it easy to modify the file's structure while ensuring it remains valid. As the command also accepts parameters for the style sheet (through its '--stringparam' option), it is quite a handy tool for script usage. Let's look at a concrete example to see how one can leverage this.

    Adding a server to a server group in JBoss AS host definition

    Editing the XML structure using XSLT

    Since the release of JBoss AS 7 (which is used as a base for JBoss EAP 6), the JEE application server offers a new mode of operation, called domain mode, which allows you to run several instances of the server, even across several systems, as a whole. One key configuration file of this feature is the 'domain/configuration/host.xml' file, which describes how many instances should be run on one host.

    This example will focus on editing this file, within a script, to add server definitions to it.

    The first step here consists of writing an appropriate style sheet. Sadly (and no one will get an argument from me about that), XSLT instructions are not that easy to understand, especially if you come from a regular RHEL administrator background and never learned the basics. While I would love to provide some enlightenment to the reader on this topic, it is simply out of scope here, so I will just show the content of the style sheet I designed to add a server entry to the host.xml:

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:domain="urn:jboss:domain:1.4">
    
      <xsl:param name="server-name"/>
      <xsl:param name="server-group"/>
      <xsl:param name="port-offset"/>
    
      <xsl:template match="*" priority="-1">
        <xsl:element name="{name()}">
          <xsl:apply-templates select="node()|@*"/>
        </xsl:element>
      </xsl:template>
    
      <xsl:template match="node()|@*" priority="-2">
        <xsl:copy/>
      </xsl:template>
    
      <xsl:template match="domain:servers">
        <xsl:element name="servers">
          <xsl:apply-templates select="node()|@*"/>
          <xsl:text>	</xsl:text>
          <xsl:message>Adding server '<xsl:value-of select="$server-name"/>' to '<xsl:value-of select="$server-group"/>'</xsl:message>
          <xsl:element name="server">
            <xsl:attribute name="name"><xsl:value-of select="$server-name"/></xsl:attribute>
            <xsl:attribute name="group"><xsl:value-of select="$server-group"/></xsl:attribute>
            <xsl:attribute name="auto-start">true</xsl:attribute>
            <xsl:text>
    </xsl:text><xsl:text>	</xsl:text><xsl:text>	</xsl:text>
            <xsl:element name="socket-bindings">
                <xsl:attribute name="port-offset"><xsl:value-of select="$port-offset"/></xsl:attribute>
            </xsl:element>
            <xsl:text>
    </xsl:text><xsl:text>	</xsl:text><xsl:text>	</xsl:text>
    	<profile name="full-ha"/>
            <xsl:text>
    </xsl:text><xsl:text>	</xsl:text>
          </xsl:element>
          <xsl:text>
    </xsl:text>
        </xsl:element>
      </xsl:template>
    </xsl:stylesheet>

    Here are some key points regarding the style sheet above:

    1. Three XSL parameters - those values need to be provided to the style sheet - define the server's name, the server group it belongs to, and the port offset value. This last value is quite important as, to peacefully share the same network interface, each instance of the JBoss AS server needs to start its services (HTTP, JMS, and so on) on a different set of ports. The port offset is therefore used to shift the default port values for each instance.
    2. A couple of 'xsl:template' instructions are then used to define how the style sheet should treat ANY element (node, text,...) of the files it processes. In our case, the default behavior is to simply copy them as they are to the resulting document. Of course, we override this behavior for the 'servers' node, to add our server definition, in the last 'xsl:template' instruction.
    3. The last instruction contains all the required code to append a new server definition. An important point to note in this part of the code is the use of our three parameters described above with the instruction 'xsl:value-of'.

    Let's now see how we can edit this file, using 'xsltproc', to add a server definition:
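The effect of the port offset described in the first point boils down to a simple addition. The sketch below is only an illustration: the helper name `shifted_port` is hypothetical, and 8080 is assumed to be the default HTTP port of the server.

```shell
#!/bin/bash
# Hypothetical helper: the port an instance really listens on is the
# default port plus the port offset of that instance.
shifted_port() {
  local default_port="$1" offset="$2"
  echo $(( default_port + offset ))
}

# Assumption: 8080 is the default HTTP port. Each instance on the same
# network interface gets its own offset, hence its own HTTP port.
for offset in 0 100 200; do
  echo "offset ${offset}: HTTP on port $(shifted_port 8080 "${offset}")"
done
```

Any two instances sharing an interface must therefore use different offsets, otherwise their services would compete for the same ports.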

    add-server.sh
    #!/bin/bash
    
    readonly JBOSS_HOME=/opt/jboss-eap-6
    readonly INSTANCE_ID=1
    readonly PORT_OFFSET=100
    readonly SERVER_GROUP=${SERVER_GROUP:-'main'}
    
    # the parameter names must match the xsl:param names of the style sheet
    xsltproc --stringparam server-name "server${INSTANCE_ID}" \
             --stringparam server-group "${SERVER_GROUP}" \
             --stringparam port-offset "${PORT_OFFSET}" \
             add-server.xsl \
             "${JBOSS_HOME}/domain/configuration/host.xml"

    Running this script, one will get the resulting new document on the standard output:

    <host>
        ...
        <servers>
            <server name="server-one" group="main-server-group" auto-start="true">
                <socket-bindings port-offset="100"/>
            </server>
            <server name="server-two" group="main-server-group" auto-start="true">
                <socket-bindings port-offset="200"/>
            </server>
            <server name="server-three" group="main-server-group" auto-start="true">
                <socket-bindings port-offset="300"/>
            </server>
        <server name="server1" group="main" auto-start="true">
            <socket-bindings port-offset="100"/>
            <profile name="full-ha"/>
        </server>
    </servers>
    </host>

    Adding several server definitions

    From here, it is quite easy to enhance the script to create as many instances as needed, automatically calculating the required port offset value:

    add-server.sh:
    #!/bin/bash
    
    readonly ORIGINAL_FILE=${1}
    readonly TARGET_FILE=${2:-$(mktemp)}
    readonly SERVER_GROUP=${SERVER_GROUP:-'main'}
    
    current_file="${ORIGINAL_FILE}"
    
    for instanceId in {0..2}
    do
      result_file=$(mktemp)
      # each run reads the previous result, so the definitions accumulate
      xsltproc --stringparam server-name "server${instanceId}" \
               --stringparam server-group "${SERVER_GROUP}" \
               --stringparam port-offset "$(( instanceId * 100 ))" \
               'add-server.xsl' \
               "${current_file}" > "${result_file}"
      current_file=${result_file}
    done
    cp "${current_file}" "${TARGET_FILE}"
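One side effect of this loop is that every iteration leaves a temporary file behind. As a possible variation (a sketch, not the script used in this article), the chaining can be isolated in a function that keeps its intermediate files in a scratch directory and removes it when done. The name `chain_transform` and its calling convention are hypothetical: the step command receives the instance number and the current file as its last two arguments, and writes the new document on its standard output.

```shell
#!/bin/bash
# Hypothetical helper: run "<step command> <index> <input-file>" COUNT times,
# feeding the output of each run into the next one, then copy the last
# result to OUTPUT. Intermediate files live in a scratch directory.
chain_transform() {
  local input="$1" output="$2" count="$3"
  shift 3
  local workdir current next i status=0
  workdir=$(mktemp -d) || return 1
  current="${input}"
  i=0
  while [ "${i}" -lt "${count}" ]; do
    next="${workdir}/step-${i}.xml"
    if ! "$@" "${i}" "${current}" > "${next}"; then
      rm -rf "${workdir}"   # do not leave partial results behind
      return 1
    fi
    current="${next}"
    i=$(( i + 1 ))
  done
  cp "${current}" "${output}" || status=$?
  rm -rf "${workdir}"
  return "${status}"
}
```

With the style sheet from this article, the step command could be a small function that calls 'xsltproc' with the instance number (its first argument) to build the server name and port offset, and the current file (its second argument) as input.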

    Removing previous server definitions

    The provided 'host.xml' file comes with a set of predefined servers, given as an example. Before adding our own server definitions, using the script presented above, we'll need to remove all the existing ones. This is rather easy to implement - we just need to copy the whole XML structure except the content of the 'servers' element:

    rm-all-servers.xsl:
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:domain="urn:jboss:domain:1.4">
    
      <xsl:template match="domain:host">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>
    
      <xsl:template match="*" priority="-1">
        <xsl:element name="{name()}">
          <xsl:apply-templates select="node()|@*"/>
        </xsl:element>
      </xsl:template>
    
      <xsl:template match="node()|@*" priority="-2">
        <xsl:copy/>
      </xsl:template>
    
      <xsl:template match="domain:servers">
        <xsl:element name="servers"/>
      </xsl:template>
    </xsl:stylesheet>

    Trouble with name spaces...

    While the command itself is certainly not to blame here, it is worth mentioning that one can run into trouble with name spaces, especially with their associated 'xmlns' attribute. Indeed, style sheet processing may sometimes add or remove such attributes, and it is sadly quite difficult to work around. Nevertheless, this is probably the only XML idiosyncrasy that I have not defeated in a satisfactory manner using 'xsltproc'.

    The example above is also a good illustration of this situation. If you run the command yourself and compare the resulting file with the original, you'll see that the name space attribute of the 'host' node has been removed - which will sadly cause JBoss to refuse to start. After some investigation, I failed to come up with an elegant solution to this problem, so I simply fell back on a good old 'sed' statement. Once the configuration file has been edited, the statement just adds the missing attribute, as we'll see in the script below.

    readonly JBOSS_HOME=${JBOSS_HOME:-'/opt/jboss/jboss-eap'}
    readonly HOST_FILE="${JBOSS_HOME}/domain/configuration/host.xml"
    readonly EDITED_FILE=$(mktemp)
    readonly RESULT_FILE=$(mktemp)
    
    set -e # fails on the first error
    
    echo -n 'Checking if original host.xml is valid... '
    xmlwf "${HOST_FILE}"
    echo 'Done.'
    
    echo -n 'Deleting all previous server definitions... '
    xsltproc 'rm-all-servers.xsl' "${HOST_FILE}" > "${EDITED_FILE}"
    echo 'Done.'
    
    echo -n 'Adding server instances to host.xml... '
    # first argument: file to edit, second argument: where to write the result
    ./add-server.sh "${EDITED_FILE}" "${RESULT_FILE}"
    echo 'Done.'
    
    echo -n 'Adding missing name space attribute... '
    # -i edits the file in place; without it, sed only prints to stdout
    sed -i -e 's;<host ;<host xmlns="urn:jboss:domain:1.4" ;' "${RESULT_FILE}"
    echo 'Done.'
    
    echo -n 'Checking if resulting file is still valid... '
    xmlwf "${RESULT_FILE}"
    echo 'Done.'
    
    echo -n 'Replacing host.xml... '
    cp "${HOST_FILE}" "${HOST_FILE}.bck" # backing up never hurts...
    cp "${RESULT_FILE}" "${HOST_FILE}"
    echo 'Done.'
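One caveat with the 'sed' statement: applied to a file that already carries the attribute, it would add it a second time, and a duplicated attribute makes the file invalid. A possible guard, sketched below with a hypothetical `restore_host_namespace` helper and a sample 'host' tag, is to skip the lines that already declare a name space. It remains a line-based trick, with the limits discussed at the beginning of this article.

```shell
#!/bin/bash
# Hypothetical helper: add the name space attribute to the <host> tag only
# when the line holding that tag does not already declare one.
# Reads the document on stdin and writes the result on stdout.
restore_host_namespace() {
  sed -e '/<host[^>]*xmlns=/!s;<host ;<host xmlns="urn:jboss:domain:1.4" ;'
}

# Sample tags for the illustration: the first one gets the attribute,
# the second one already has it and is left untouched.
echo '<host name="master">' | restore_host_namespace
echo '<host xmlns="urn:jboss:domain:1.4" name="master">' | restore_host_namespace
```

Both calls print the same line, so running the fix twice on the same file is harmless.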

    Final words

    As one can see in the script above, the resulting procedure, which automates the addition of server definitions to a host.xml, is pretty simple to both understand and maintain. It could easily be integrated into an RPM, or simply run by a deployment tool or a configuration management tool such as Puppet (or by Kickstart when the host is set up).

    One thing is certain now: having an XML configuration file is no longer a blocker to properly automating deployment or designing maintenance scripts. With those two command-line tools and a fair understanding of XSLT, the sky is the limit (well, it's not an excuse to go crazy on this...).

    Last updated: February 22, 2024
