There’s a saying in the OCR industry: “garbage in, garbage out”. In short, it is very hard to get accurate OCR output from documents that even humans struggle to read. Thankfully, advances in computer vision now offer solutions to these challenges.
Recently, the U.S. Department of Justice (DOJ) released the 400+ page Mueller Report (a 140 MB PDF document). If you are interested in the workings of the U.S. government, you may have downloaded it, expecting to be able to educate yourself. Unfortunately, your attempts to search for specific topics with the trusty Control-F shortcut probably ended in utter frustration. You may have just given up and looked for a summary write-up, but others may not have that luxury.