PHP Classes

How to Implement a PHP DOCX to Text Converter to Access Microsoft Word Documents from PHP Applications


Author:

Updated on: 2021-11-18

Posted on: 2021-11-18

Viewers: 226 (November 2021)

Package: PHP DOCX to Text

DOCX is a popular document format used by the Microsoft Word program to save and load word processing documents.

DOCX documents can be complex because they can contain many types of content, such as text, images, and styles.

As a result, extracting just the text from a DOCX document can be a complex task.

Read this article to learn how to extract text from DOCX documents so that you can process that text in any PHP application.





In this article you will learn:

Introduction to the PHP DOCX to Text Package

What the PHP DOCX to Text Package Does in Practice

How Can the PHP DOCX to Text Convert Microsoft Word Documents in Practice

Download and Install the PHP DOCX to Text Package Using PHP Composer


Introduction to the PHP DOCX to Text Package

Recently I developed and published a PHP class to parse DOCX to HTML with images.

Then I thought that some sections of the code would be very useful as standalone classes for other uses.

With Microsoft Word files being commonly used to transfer and store information, I thought that a class that enabled manipulation and searching on the text of a Word document could be useful.

What the PHP DOCX to Text Package Does in Practice

I created this PHP DOCX to Text class to extract all the text contained in a Microsoft Word DOCX document. The extracted text includes all footnotes and endnotes, together with list and paragraph numbering.

The output of this class is an array with each element containing a paragraph of text from the original document.

This array can be easily manipulated in PHP to carry out text searches of Word documents, to extract certain sections of text, or to save the text of a Word document to a database for subsequent use.
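As an illustration of the text search use case, here is a minimal sketch. The function name findParagraphs and the sample array are my own assumptions for this example and are not part of the package. The sample array is a hand-written stand-in for the array returned by the class, and the sketch skips element 0 because, as explained in this section, it holds a summary rather than document text.

```php
<?php
// Hypothetical helper, not part of the package: return the paragraphs
// that contain $needle, keyed by their element number in the array.
function findParagraphs(array $text, string $needle): array
{
    $matches = [];
    foreach ($text as $index => $paragraph) {
        if ($index === 0) {
            continue; // element 0 is the 'number:length' summary, not text
        }
        // stripos() is case-insensitive for ASCII letters only.
        if (stripos((string) $paragraph, $needle) !== false) {
            $matches[$index] = $paragraph;
        }
    }
    return $matches;
}

// Hand-written stand-in for the array produced from a real document.
$text = [
    '3:24',
    'Quarterly report',
    'Sales rose in the north.',
    'Costs fell in the south.',
];

print_r(findParagraphs($text, 'north'));
```

Keeping the original element numbers as array keys makes it possible to point back to the matching paragraph in the full array. For text with non-ASCII letters, a multibyte-aware comparison such as mb_stripos() may be a better fit.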

For convenience, the first element of the array shows the number of text elements contained in the array, together with the maximum length of an element of the array in the format 'number:length'.

Knowing the maximum length of a text element of the array could be useful if the text is being saved to a database.
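The 'number:length' element can be split into two integers before it is used, for example to size a database column. The sketch below is only an illustration: parseSummary is a hypothetical helper name, the sample value '12:340' is made up, and whether the length counts bytes or characters is not stated here, so check that before relying on it for a column size.

```php
<?php
// Hypothetical helper, not part of the package: split the
// 'number:length' summary element into two integers.
function parseSummary(string $summary): array
{
    // Accept only "<digits>:<digits>"; the D modifier makes $ match
    // strictly at the end of the string.
    if (preg_match('/^(\d+):(\d+)$/D', $summary, $m) !== 1) {
        throw new InvalidArgumentException("Unexpected summary format: $summary");
    }
    return ['count' => (int) $m[1], 'maxLength' => (int) $m[2]];
}

// Made-up sample value standing in for the first element of the array.
$info = parseSummary('12:340');
echo "Elements: {$info['count']}, longest element: {$info['maxLength']}\n";
```

Validating the format first means a missing or malformed summary fails with a clear exception instead of producing undefined array offsets later.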

How Can the PHP DOCX to Text Convert Microsoft Word Documents in Practice

The example textdemo.php below shows how to use this class to extract the text from a Word document. The resulting array is then processed to display each text element (paragraph) on screen along with its element number.

The example script expects a DOCX file named sample.docx.

The number of elements and the maximum length of a text element are also displayed.

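<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>

<body>
<?php

require_once('wordtext.php');

// Create the converter object with UTF-8 as the encoding.
$rt = new WordTEXT(false, 'UTF-8');

// Read the document; the result is an array of text elements.
$text = $rt->readDocument('sample.docx');

// Element 0 holds 'number:length': element count and maximum element length.
$det = explode(':', $text[0]);

echo "No of text elements in the array - ".$det[0]."<br>";
echo "Max length of a text element in the array - ".$det[1]."<br>&nbsp;<br>";

// Display each text element (paragraph) with its element number.
$LC = 1;
while ($LC <= $det[0]) {
    echo "Element ".$LC." : ".$text[$LC]."<br>";
    $LC++;
}

?>
</body>
</html>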
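The demo echoes the extracted text straight into the page. If a document contains characters such as < or &, the browser may interpret them as markup. The sketch below builds the same listing with the text escaped first. It assumes the class returns plain, unescaped text, which is not stated in this article, and renderParagraphs and the sample array are hypothetical names and data used only for illustration.

```php
<?php
// Hypothetical helper, not part of the package: build the demo's
// "Element N : text" listing with each paragraph HTML-escaped.
function renderParagraphs(array $text): string
{
    $html = '';
    $count = count($text) - 1; // element 0 is the summary, not document text
    for ($i = 1; $i <= $count; $i++) {
        // Convert <, >, & and quotes to HTML entities before output.
        $safe = htmlspecialchars((string) $text[$i], ENT_QUOTES, 'UTF-8');
        $html .= "Element " . $i . " : " . $safe . "<br>\n";
    }
    return $html;
}

// Hand-written stand-in for the array produced from a real document.
$text = ['2:12', 'Use <b> tags', 'Fish & chips'];
echo renderParagraphs($text);
```

If the class turns out to return text that is already escaped, this step would double-escape it, so it is worth checking the output for a document that contains these characters first.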

Download and Install the PHP DOCX to Text Package Using PHP Composer

You can download the PHP DOCX to Text package, or install it with the PHP Composer tool, by going to this download page to get the package code. That page also contains instructions on how to install the package using PHP Composer from the PHP Classes site.








