Installing PDFtotext in R


I am trying to run the PDFtotext package in R.


When I run these commands:

library(tm)
pdf = readPDF(control = list(text = "-layout"))(elem = list(uri = uri), language = "en", id = "idi")

I get this error:

Error in system2("pdftotext", c(control$text, shQuote(x), "-"), stdout = TRUE) : "pdftotext" not found

In addition: Warning message: running command "pdfinfo" "C:*****NCLR AR 2005.pdf" had status 127

Does anyone know what the problem might be?

Does Sys.which("pdftotext") return ""? If so, the executable is not found on your PATH. Have you installed it? You may want to try the package pdftools as an alternative for reading PDFs.
– lukeA
Apr 6 '16 at 14:38
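
The check suggested in the comment above can be extended to cover both programs named in the error output. This is a minimal sketch that uses only base R; the variable names are illustrative and not part of tm.

```r
# Look up the two external programs that appear in the error and the warning.
# Sys.which() returns "" for any program it cannot find on the PATH.
tools <- Sys.which(c("pdftotext", "pdfinfo"))
not_found <- names(tools)[tools == ""]

if (length(not_found) > 0) {
  message("Not found on PATH: ", paste(not_found, collapse = ", "))
} else {
  message("Both programs found:\n", paste(tools, collapse = "\n"))
}
```

If either name is reported as not found, R cannot launch it through system2(), which matches the error shown in the question.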




This function of the tm library requires that pdftotext and pdfinfo are installed on your computer. You can download precompiled binaries for the most common operating systems here. These programs are not installed in or by R, as the title of your question suggests. They need to be installed as separate programs on your system.
– RHertel
Apr 6 '16 at 15:52




1 Answer

On Windows, add the binary to your PATH: System --> Advanced --> Environment Variables, then add the directory containing pdftotext.exe.
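
As a possible alternative to editing the system settings, the directory can be appended to PATH for the current R session only. This is a sketch under assumptions: the folder below is a hypothetical placeholder, not a real install location, and must be replaced with the directory that actually contains pdftotext.exe.

```r
# Hypothetical placeholder: replace with the folder holding pdftotext.exe.
xpdf_dir <- "C:/path/to/xpdf/bin"

# Append the folder to PATH for this R session only.
# .Platform$path.sep is ";" on Windows and ":" on Unix-alikes.
old_path <- Sys.getenv("PATH")
Sys.setenv(PATH = paste(old_path, xpdf_dir, sep = .Platform$path.sep))

# Returns the full path if the program is now visible, "" otherwise.
Sys.which("pdftotext")
```

The change is lost when R is closed, so the Environment Variables route described in the answer remains the permanent fix.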
