As part of our internal security work at Codethink, we've been working on repeatedly deploying and configuring openvas to constantly scan and report on our internal system.
There are various roles out there on ansible-galaxy that will take care of installing this application for you, but their main advantage is the ability to install on platforms we don't use, and the ones we looked over didn't provide any configuration of openvas itself.
Initial installation was easy enough, but when we came to do the configuration we hit an issue - openvas is very... XML. There aren't any existing modules to control openvas' configuration from ansible, so we need to implement this ourselves.
Typically when we do this, we want our playbooks to remain as idempotent as possible. This means that if we're running manual commands, we should check the current state and modify it to meet our desired state only when necessary, so that we're not changing things and restarting services unnecessarily. To do this, we need the ability to look at the output of the command module, check what entries are currently there, and add / remove / edit any entries that need modification.
Our initial assumption was that we could just throw the XML for reading and asserting state through a from_xml filter, and receive a nice object to manipulate at ease, similar to various json-talking applications. Unfortunately this doesn't exist, because XML doesn't really work like the json / yaml / ini files we're used to parsing.
parse_xml
Ansible does have an XML parsing filter though! It was written for the network orchestration subsystem, and that is where it is documented:
To convert the XML output of a network device command into structured JSON output, use the parse_xml filter:
We aren't controlling a networking device, but we do need to parse XML. We can probably use this to make our playbook idempotent! Let's have a go at configuring schedules:
Grabbing some XML
- name: Get all openvas schedules
  command: "/usr/bin/omp --username=adminuser --password=adminuser -p 9390 -X '<get_schedules/>'"
  check_mode: no    # always run this task, even when the play is in check mode
  changed_when: no  # never show as having updated anything
  register: openvas_existing_schedules_raw
This returns us a chunk of XML showing all the schedules currently configured within openvas. I'm not going to paste it here, because it's giant, but we can pull up the specification for the XML in the omp protocol documentation.
We're going to be taking the XML returned by this and turning it into a dict of objects that we can use within ansible to check what we need to do to assert state. First, we need to identify which bits of this are useful to us. As we plan on only having specific schedules set up, we're going to make an assumption here, namely that if a schedule exists with the correct name, it is the correct schedule. So we need the name. If we're going to modify a schedule, we need its id. We can't remove a schedule that's currently being used, so in_use is useful for us, and the friendly name in comment might be useful for displaying to the user while running the playbook.
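To make those four fields concrete, here is a minimal sketch in Python using the standard library's ElementTree. The sample response is hand-written and heavily trimmed: its element names follow the fields discussed above, but the ids, the second schedule and the response wrapper are assumptions for illustration, not real omp output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed response; real omp output has many more elements.
SAMPLE_RESPONSE = """
<get_schedules_response status="200" status_text="OK">
  <schedule id="example-id-1">
    <name>daily</name>
    <comment>Daily @ 2am</comment>
    <in_use>1</in_use>
  </schedule>
  <schedule id="example-id-2">
    <name>scratch</name>
    <comment>Left over from testing</comment>
    <in_use>0</in_use>
  </schedule>
</get_schedules_response>
"""


def useful_fields(xml_text):
    """Return name, id, in_use and comment for every <schedule> element."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "name": schedule.findtext("name"),        # child element text
            "id": schedule.get("id"),                 # attribute on <schedule>
            "in_use": schedule.findtext("in_use"),    # still a string here
            "comment": schedule.findtext("comment"),
        }
        for schedule in root.findall("schedule")
    ]


for fields in useful_fields(SAMPLE_RESPONSE):
    print(fields)
```

Note that three of the fields are child elements while id is an attribute of the schedule element itself, which is why it needs different handling later on.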
parse_xml configuration
Once we've identified the above, we can get on with using the parse_xml filter to... parse the XML. We need to set up a new YAML file containing two entries: keys:, which is the map of how to extract the data from the XML, and vars:, which is the object we'll be creating. These are a bit interleaved, as they reference each other, so keeping them both in the same file appears to be good practice.
For the schedules, we have a file in our role, xml_spec/schedule, containing the following:
keys:
  schedules:
    value: "{{ schedule }}"
    top: "schedule"
    items:
      name: name
      comment: comment
      id: ".[@id]"
      in_use: in_use
vars:
  schedule:
    key: "{{ item.name }}"
    values:
      name: "{{ item.name }}"
      comment: "{{ item.comment }}"
      in_use: "{{ item.in_use == 1 }}"
      id: "{{ item.id.get('id') }}"
keys: is being used here to define how to extract data from the XML:
- value is the object described in vars: we'll be unpacking the XML into, in this case schedule - this is applied when doing the parsing as a jinja filter
- top is an xpath expression pointing to the element in the XML which contains the elements we want to turn into our dict. In this case, schedule searches under the root node for the element <schedule>...</schedule>
- items is a dict of items within those elements we want to extract. These can be specified using xpath expressions as above.
  - name: name, comment: comment and in_use: in_use all refer directly to the tags of elements within our schedule element
  - id: ".[@id]" is an xpath expression that grabs the id attribute of the top level element, in our case the schedule element
vars: is being used to describe the objects we'll be unpacking the XML into, in this case schedule. They unpack into a top-level dict, containing our specified object as a sub-dict.
- key is the key that should be used in the top level dict for each object. In this case, we're setting it to item.name, which refers to the name field of the schedule we set up in keys:
- values are the values we're loading into each dict entry. We're using jinja expressions to extract them, so for simple text elements like name and comment we can grab them directly from item, and for in_use we're using a comparison to have a boolean available in our dict. id is a bit more complicated as we have to extract the actual id from the xpath return object.
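The two halves of the spec can be easier to follow when written out as plain code. The sketch below is a rough Python imitation of the two steps, not the filter's actual implementation: extract_items plays the part of keys: and build_schedules plays the part of vars:. The sample response and both function names are hypothetical, and the rule that an xpath ending in ] hands back the element's attribute dict is our reading of why the spec needs item.id.get('id'), so treat it as an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed response used only for illustration.
SAMPLE_RESPONSE = """<get_schedules_response status="200">
  <schedule id="example-id-1">
    <name>daily</name><comment>Daily @ 2am</comment><in_use>1</in_use>
  </schedule>
  <schedule id="example-id-2">
    <name>scratch</name><comment>Left over</comment><in_use>0</in_use>
  </schedule>
</get_schedules_response>"""

# The same values as top: and items: in the spec file above.
TOP = "schedule"
ITEMS = {"name": "name", "comment": "comment", "id": ".[@id]", "in_use": "in_use"}


def extract_items(xml_text, top, items):
    """Imitate keys: one raw `item` dict per element matched by `top`."""
    root = ET.fromstring(xml_text)
    extracted = []
    for element in root.findall(top):
        item = {}
        for field, xpath in items.items():
            matches = element.findall(xpath)
            if xpath.endswith("]"):
                # Attribute-style xpath: keep the attribute dict, not the text.
                item[field] = dict(matches[0].attrib) if matches else {}
            else:
                item[field] = matches[0].text if matches else None
        extracted.append(item)
    return extracted


def build_schedules(raw_items):
    """Imitate vars: a dict keyed by schedule name, with tidied values."""
    return {
        item["name"]: {
            "name": item["name"],
            "comment": item["comment"],
            "in_use": item["in_use"] == "1",  # element text arrives as a string
            "id": item["id"].get("id"),       # unwrap the attribute dict
        }
        for item in raw_items
    }


schedules = build_schedules(extract_items(SAMPLE_RESPONSE, TOP, ITEMS))
print(sorted(schedules))
```

Keying the result by name is what later makes a simple membership test enough to decide whether a schedule already exists.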
Actually using parse_xml
- name: Parse XML to extract schedules
  set_fact:
    openvas_existing_schedules: "{{ (openvas_existing_schedules_raw.stdout | parse_xml('roles/openvas/xml_spec/schedule'))['schedules'] }}"
This takes the stdout from the command we ran previously, filters it using our schedule specification, and extracts the schedules key.
We only want two schedules for our system: a daily run at 2am, and a weekly run at 6am on Sunday. We can now use the above variable to create our new schedules if they don't already exist, and remove any existing schedules that shouldn't exist (omp_command below is a variable standing in for the omp invocation and connection options used in the first task):
- name: Create daily schedule if it doesn't already exist
  command: "{{ omp_command }} -X '<create_schedule><name>daily</name><comment>Daily @ 2am</comment><first_time><day_of_month>7</day_of_month><hour>2</hour><minute>0</minute><month>6</month><year>2020</year></first_time><duration>6<unit>hour</unit></duration><period>1<unit>day</unit></period></create_schedule>'"
  when:
    - "'daily' not in openvas_existing_schedules"

- name: Create weekly schedule if it doesn't already exist
  command: "{{ omp_command }} -X '<create_schedule><name>weekly</name><comment>Weekly @ 6am Sunday</comment><first_time><day_of_month>7</day_of_month><hour>6</hour><minute>0</minute><month>6</month><year>2020</year></first_time><duration>12<unit>hour</unit></duration><period>7<unit>day</unit></period></create_schedule>'"
  when:
    - "'weekly' not in openvas_existing_schedules"

- name: Remove any schedules not listed above
  command: "{{ omp_command }} -X '<delete_schedule schedule_id=\"{{ item.value.id }}\"/>'"
  when:
    - item.key not in ['daily', 'weekly']
  loop: "{{ openvas_existing_schedules | dict2items }}"
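The decision these three tasks make between them can be summarised as a small piece of set logic. The following Python sketch is only an illustration of that reasoning: the plan function and its sample data are hypothetical, and the in_use check is an addition that the tasks above do not perform, included because of the earlier note that a schedule currently in use can't be removed.

```python
# The only schedules we want to exist.
DESIRED = {"daily", "weekly"}


def plan(existing, desired=DESIRED):
    """Return (names to create, ids to delete, names skipped as in use).

    `existing` has the shape of openvas_existing_schedules: a dict keyed
    by schedule name, each value holding at least `id` and `in_use`.
    """
    to_create = sorted(desired - set(existing))
    to_delete = []
    skipped = []
    for name, schedule in sorted(existing.items()):
        if name in desired:
            continue  # correct name is assumed to mean correct schedule
        if schedule["in_use"]:
            skipped.append(name)  # assumption: in-use schedules can't be removed
        else:
            to_delete.append(schedule["id"])
    return to_create, to_delete, skipped


# Hypothetical current state, in the same shape the parsed fact would have.
existing = {
    "daily": {"id": "example-id-1", "in_use": True},
    "scratch": {"id": "example-id-2", "in_use": False},
    "busy": {"id": "example-id-3", "in_use": True},
}
print(plan(existing))
```

When the existing state already matches the desired one, all three lists come back empty, which is the behaviour that keeps a second run of the playbook from changing anything.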
Conclusion
As you can see above, the parse_xml filter is a significantly more useful addition to the ansible toolbox than the documentation would have you believe. Given our occasional need to deploy XML-speaking applications using modern devops tooling, I'm quite glad to have it available!
Photo by Richard Clyborne of Music Strive