I had just settled down to write some ScalaCheck tests when suddenly my peace and harmony was shattered by the news that the “download PDF” functionality on one of our sites had recently ceased functioning, and it would be quite nice if it could be restored to good working order. And so begins our tale.
First step – can I reproduce it? Yes – if I visit the page in the supplied screenshot and attempt to download it as a PDF I get an error message “Failed to create a PDF for the document”
Although some documents on the site have PDFs already available for download, there are some that don’t, and in those cases we invoke the application wkhtmltopdf to create one.
So I took a look at the site logs, and attempted to download a PDF:
2024-11-19 13:05:26,089 INFO controllers.DocumentController :: Displaying document 2018ewhc818fam_TNA in index card 2018000893 as a generated pdf
2024-11-19 13:05:26,094 INFO com.sixtysevenbricks.iclr.documents.DocumentService :: Allowing access to 2018ewhc818fam_TNA for reporter/editor/admin user REDACTED
2024-11-19 13:05:26,094 DEBUG com.sixtysevenbricks.iclr.pdf.WkHtmlToPdfGenerator :: generateUrl(): http://127.0.0.1:9000/document/2018000893/2018ewhc818fam_TNA/html?name=REDACTED
2024-11-19 13:05:26,096 WARN controllers.DocumentController :: displayDocumentAsPdf(): Failed to create PDF for document '2018ewhc818fam_TNA'
java.io.IOException: Cannot run program "/usr/local/bin/wkhtmltopdf": error=0, Failed to exec spawn helper: pid: 2403844, exit value: 1
at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1143)
at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1073)
at com.sixtysevenbricks.iclr.pdf.WkHtmlToPdfGenerator.$anonfun$generatePdf$1(WkHtmlToPdfGenerator.scala:26)
So there’s a problem running wkhtmltopdf
. Check it exists at the given path (it does) and can be run and produces output (it does). The interesting thing is “Failed to exec spawn helper”
It’s going to be fiddly to debug this code in situ but I have the cunning plan of writing the simplest possible Java program that will do what the actual application code is doing:
public class RunProcess {
public static void main(String[] args) {
File pdfFile = new File("/tmp/daniel.pdf");
File outputFile = new File("/tmp/daniel.txt");
try {
Process process = new ProcessBuilder("/usr/local/bin/wkhtmltopdf",
"--print-media-type",
"--no-background",
"http://127.0.0.1:9000/document/2018000893/2018ewhc818fam_TNA/html?displayIndexCardOverlay=false&pdfGenerate=true&name=REDACTED",
pdfFile.getAbsolutePath())
.redirectErrorStream(true)
.redirectOutput(ProcessBuilder.Redirect.appendTo(outputFile))
.start();
int result = process.waitFor();
System.out.println("Status " + result);
} catch (InterruptedException ex) {
ex.printStackTrace();
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
Put this on the server as RunProcess.java
, compile and run and… it works. The PDF is generated as expected.
So my next thought was to wonder whether I could get more information about what was going wrong. It turns out that the log messages from the site log weren’t the whole story; there were more tantalising bits to be found using journalctl -f
and repeating the “download PDF” test:
Nov 19 13:05:26 [debug] - com.sixtysevenbricks.iclr.pdf.WkHtmlToPdfGenerator - generateUrl(): http://127.0.0.1:9000/document/2018000893/2018ewhc818fam_TNA/html?displayIndexCardOverlay=false&pdfGenerate=true
Nov 19 13:05:26 Incorrect number of arguments: 2
Nov 19 13:05:26 jspawnhelper version 17.0.13+11-Ubuntu-2ubuntu120.04
Nov 19 13:05:26 This command is not for general use and should only be run as the result of a call to
Nov 19 13:05:26 ProcessBuilder.start() or Runtime.exec() in a java application
Nov 19 13:05:26 [warn] - controllers.DocumentController - displayDocumentAsPdf(): Failed to create PDF for document '2018ewhc818fam_TNA'
Nov 19 13:05:26 java.io.IOException: Cannot run program "/usr/local/bin/wkhtmltopdf": error=0, Failed to exec spawn helper: pid: 2403844, exit value: 1
Nov 19 13:05:26 at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1143)
I wonder what jspawnhelper
is? Turns out it is /usr/lib/jvm/java-17-openjdk-amd64/lib/jspawnhelper
. What happens if I run it?
l$ /usr/lib/jvm/java-17-openjdk-amd64/lib/jspawnhelper
Incorrect number of arguments: 1
jspawnhelper version 17.0.13+11-Ubuntu-2ubuntu120.04
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
If I run it, I get a similar “Incorrect number of arguments” message. So, naturally, the next question is “what do the arguments look like”?
One quick Google DuckDuckGo later, and I am aware of the strace
utility, and after a bit of faff, I discover that I can find what happens when my simple Java application tries to call this jspawnhelper
thing:
$ strace -s 255 -f -etrace=execve -everbose=execve java RunProcess
execve("/usr/bin/java", ["java", "RunProcess"], 0x7ffe88a48a80 /* 26 vars */) = 0
...
[pid 2418068] execve("/usr/lib/jvm/java-17-openjdk-amd64/lib/jspawnhelper", ["/usr/lib/jvm/java-17-openjdk-amd64/lib/jspawnhelper", "17.0.13+11-Ubuntu-2ubuntu120.04", "9:10:12"], 0x7ffdd659f6d0 /* 26 vars */) = 0
[pid 2418068] execve("/usr/local/bin/wkhtmltopdf", ["/usr/local/bin/wkhtmltopdf", "--print-media-type", "--no-background", "http://127.0.0.1:9000/document/2018000893/2018ewhc818fam_TNA/html?displayIndexCardOverlay=false&pdfGenerate=true&name=REDACTES", "/tmp/daniel.pdf"], 0x7ffd72d56088 /* 26 vars */) = 0
...
We see in the execve
call that there are 3 arguments; the zeroth (in UNIX tradition) is the name of the binary used to call it, the first looks like a Java version number, and I’m not sure what the third is. But it means I can experiment by calling it:
$ /usr/lib/jvm/java-17-openjdk-amd64/lib/jspawnhelper first second
Incorrect Java version: first
jspawnhelper version 17.0.13+11-Ubuntu-2ubuntu120.04
It seems that the first version is supposed to be the build version reported when running java
without parameters.
$ /usr/lib/jvm/java-17-openjdk-amd64/lib/jspawnhelper 17.0.13+11-Ubuntu-2ubuntu120.04 second
Incorrect FD array data: second
jspawnhelper version 17.0.13+11-Ubuntu-2ubuntu120.04
So the last parameter is presumably a set of file descriptors that tell this utility what it’s supposed to be doing. I’m not concerned about them, but let’s try to see what the failing application is doing. It turns out one can strace
a running process, so if I try sudo strace -s 255 -f -etrace=execve -everbose=execve -p 1593594
and download the PDF, I see:
strace: Process 2414425 attached
[pid 2414425] execve("/usr/lib/jvm/java-17-openjdk-amd64/lib/jspawnhelper", ["/usr/lib/jvm/java-17-openjdk-amd64/lib/jspawnhelper", "307:311:313"], 0x7ffe34173620 /* 17 vars */) = 0
[pid 2414425] +++ exited with 1 +++
Only two arguments! What happened to the Java version?
At this point, a sixth sense kicked in, and I thought about checking to see if there had been any shenanigans with updating the installed Java version. Another quick Bing DuckDuckGo and I find that apt
stores history at /var/log/apt/history.log
so I take a look at that and see:
Start-Date: 2024-11-13 06:08:43
Commandline: /usr/bin/unattended-upgrade
Upgrade: openjdk-17-jre-headless:amd64 (17.0.12+7-1ubuntu2~20.04, 17.0.13+11-2ubuntu1~20.04), openjdk-17-jre:amd64 (17.0.12+7-1ubuntu2~20.04, 17.0.13+11-2ubuntu1~20.04)
End-Date: 2024-11-13 06:08:51
Interesting; I presume this is an update from version 17.0.12+7...
to 17.0.3+11...
on 13th November. How long has the server been running?
$ ps -ef | grep java
ubuntu 1593594 1 2 Nov12 ? 04:51:23 java -Duser.dir=REDACTED -XX:MaxRAMPercentage=65 -Dhttp.address=0.0.0.0 -Dhttp.port=9000 -Dplay.http.secret.key=REDACTED -Dconfig.file=REDACTED -cp REDACTED play.core.server.ProdServerStart
Since November 12th! So I’m guessing the java
binary has had its directory unceremoniously removed from under it, and my hypothesis is that one manifestation of this is that when it tries to get hold of the current version to pass to jspawnhelper
it is failing, but without raising an error (or, alternatively, maybe the .13
version added the requirement for the version to be passed as an argument). Either way, my hypothesis now is that killing the java
process will fix the issue; systemd
will note its demise and start a new one almost instantly, downtime will be unavoidable but minimal, and PDF downloads should start working again…
…and they did. And we all lived happily ever after.
I think the moral of this story is, as is so often the case, the solution really is to turn it off and on again.