Archive for July, 2009|Monthly archive page
Read PDF and Word DOC Files Using PHP
Reading PDF Files
To read PDF files, you will need to install the XPDF package, which includes “pdftotext.” Once you have XPDF/pdftotext installed, you run the following PHP statement to get the PDF text:
$content = shell_exec('/usr/local/bin/pdftotext ' . escapeshellarg($filename) . ' -'); // the trailing dash sends the text to stdout; escapeshellarg() protects filenames with spaces or shell characters
Reading DOC Files
As with the PDF example above, you’ll need to install another package, this one called Antiword. Here’s the code to grab the Word DOC content:
$content = shell_exec('/usr/local/bin/antiword ' . escapeshellarg($filename)); // antiword prints the extracted text to stdout
The above code does NOT read DOCX files and, by design, does not preserve formatting. There are other libraries that will preserve formatting, but in our case we just want to get at the text.
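Both PHP snippets hard-code /usr/local/bin, which may not be where your system puts the binaries. Here is a minimal shell sketch for checking where (or whether) the two tools are installed. Note that find_tool is a hypothetical helper name made up for this example, not part of XPDF or Antiword:

```shell
#!/bin/sh
# find_tool NAME: print the full path of NAME if it is on PATH,
# otherwise print nothing and return non-zero.
# (find_tool is a hypothetical helper used only for illustration.)
find_tool() {
    command -v "$1" 2>/dev/null
}

# Check the two converters the PHP code shells out to.
for tool in pdftotext antiword; do
    if path=$(find_tool "$tool"); then
        echo "$tool found at $path"
    else
        echo "$tool not found; install it before calling shell_exec()"
    fi
done
```

If a tool turns up somewhere other than /usr/local/bin, adjust the path in the shell_exec() call to match.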
Install Fonts in Fedora
- Log in as root or use su at the command line:
$ su
- Go to the font storage directory:
# cd /usr/share/fonts
- Create a subdirectory for the Arial fonts:
# mkdir arial
- Copy the Arial fonts into this directory from font sites or the Windows Fonts folder.
- Make the font files accessible systemwide:
# chmod 0775 -R arial
- Run fc-cache to cache the Arial fonts on the system:
# fc-cache arial
http://www.myvirtualdisplay.com/2009/06/28/installing-fonts-in-fedora/
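The font steps above can be rolled into one small shell function. This is only a sketch: install_fonts is a hypothetical name, it assumes the fonts are .ttf files sitting in a directory you pass in, and it skips the cache refresh if fc-cache is not available.

```shell
#!/bin/sh
# install_fonts SRC_DIR [DEST_DIR]
# Copies the .ttf files from SRC_DIR into DEST_DIR (default:
# /usr/share/fonts/arial), applies the same permissions as the steps
# above, and refreshes the font cache for that directory.
# (install_fonts is a hypothetical helper used only for illustration.)
install_fonts() {
    src=$1
    dest=${2:-/usr/share/fonts/arial}

    mkdir -p "$dest" || return 1
    cp "$src"/*.ttf "$dest"/ || return 1
    chmod -R 0775 "$dest" || return 1

    if command -v fc-cache >/dev/null 2>&1; then
        fc-cache "$dest" || echo "fc-cache failed for $dest" >&2
    else
        echo "fc-cache not found; skipping cache refresh" >&2
    fi
    return 0
}
```

Run as root, a call such as install_fonts /path/to/your/fonts would install into the default directory. Pass a second argument to use a different destination.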
Install the Alternative PHP Cache (APC)
The Alternative PHP Cache (APC) is a free, open, and robust framework for caching and optimizing PHP intermediate code.
yum install php-pear
yum install php-devel httpd-devel
yum groupinstall 'Development Tools'
yum groupinstall 'Development Libraries'
pecl install apc
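After the build finishes, PHP still has to be told to load the extension. A minimal sketch of that last step, assuming your PHP reads extra ini files from /etc/php.d (the usual layout for Fedora's PHP packages, but check your own setup); write_apc_ini is a hypothetical helper name:

```shell
#!/bin/sh
# write_apc_ini [INI_DIR]
# Writes an apc.ini that loads the APC extension.
# Assumption: PHP scans INI_DIR (default /etc/php.d) for extra ini files.
# (write_apc_ini is a hypothetical helper used only for illustration.)
write_apc_ini() {
    ini_dir=${1:-/etc/php.d}
    printf 'extension=apc.so\n' > "$ini_dir/apc.ini"
}
```

Run write_apc_ini as root, restart Apache (for example with service httpd restart), and check that apc shows up in the output of php -m.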