PHP Classes

PHP Stanford NLP Datastore: Analyse text with NLP and store the results in a database

Version: php-stanford-nlp-dat 1.0.3
License: MIT/X Consortium ...
PHP version: 5
Categories: Databases, Text processing, Language, A...
Description 

Author

This package analyses text with NLP and stores the results in a database.

It performs the text analysis using a local installation of the Stanford CoreNLP (natural language processing) server.

The results are stored in a local SQLite database for further analysis.

Name: Dennis de Swart <contact>
Country: The Netherlands
Innovation award nominee: 2x

Example

<?php

/**
 * Instantiate constants and the database
 */
require_once __DIR__.'/bootstrap.php';

/**
 * Init template
 * Init CoreNLP Adapter
 * Init datastore
 */
$template  = new Template();
$coreNLP   = new CorenlpAdapter();
$datastore = new Datastore($db->conn);

/**
 * Init variables
 */
$text         = '';
$search       = '';
$enterButton  = '';
$searchButton = '';
$helpButton   = '';

/**
 * POST procedure
 */
if ($_SERVER["REQUEST_METHOD"] == "POST") {

    // clean up the post array (FILTER_SANITIZE_STRING is deprecated as of PHP 8.1);
    // fall back to an empty array when no POST fields were sent
    $_POST = filter_input_array(INPUT_POST, FILTER_SANITIZE_STRING) ?: [];

    // check if "clear database" checkbox is set
    if (array_key_exists("emptyDB", $_POST)) {
        $db->clearAllTables();
    }

    // Analyze the text
    if (!empty($_POST['text'])) {
        $text = $_POST['text'];

        // runs the CoreNLP Adapter and saves result in "$coreNLP->serverMemory"
        $coreNLP->getOutput($text);

        // Save result to database
        $datastore->storeNLP($coreNLP);

    } elseif (!empty($_POST['helpButton'])) {
        $helpButton = $_POST['helpButton'];

    } elseif (!empty($_POST['enterButton'])) {
        $enterButton = $_POST['enterButton'];

    } elseif (!empty($_POST['searchButton'])) {
        $searchButton = $_POST['searchButton'];

    } elseif (!empty($_POST['search'])) {
        $search = $_POST['search'];
    }
}

// display the form
$template->getForm($text, $searchButton, $search);

if ($helpButton) {
    $template->getHelp();
    die;
}

if (!empty($text) || !empty($search) || $searchButton == '1' || $enterButton == '1') {

?>
<!-- RESULTS -->
<table>
    <tr>
        <td>
            <?php
                // OpenIE triples
                $oie = new OpenIE($db->conn);

                if ($search) {
                    $ieSearch = $oie->openieSearch($search);
                    $searchWord = 'for text containing the word "'.$search.'"';
                } else {
                    $ieSearch = $oie->openieSearch();
                    $searchWord = 'for all words';
                }

                $template->getTable($ieSearch, 'OpenIE', $searchWord);
            ?>
        </td>
    </tr>
    <tr>
        <td>
            <?php
                // Named entities
                $ner = new NER($db->conn);

                if ($search) {
                    $nerWords = $ner->searchEntities($search);
                    $searchWord = 'for text containing the word "'.$search.'"';
                } else {
                    $nerWords = $ner->searchEntities();
                    $searchWord = 'for all words';
                }

                $template->getTable($nerWords, 'NER Entities', $searchWord);

                /**
                 * The separate NER tokens
                 */
                if ($search) {
                    $nerTokens = $ner->searchTokens($search);
                    $searchWord = 'for text containing the word "'.$search.'"';
                } else {
                    $nerTokens = $ner->searchTokens();
                    $searchWord = 'for all words';
                }

                $template->getTable($nerTokens, 'NER Tokens', $searchWord);
            ?>
        </td>
    </tr>
    <tr>
        <td>
            <?php
                // Coreferences
                $coref = new Coreference($db->conn);

                if ($search) {
                    $corefs = $coref->corefSearch($search);
                    $searchWord = 'that refer to the word "'.$search.'"';
                } else {
                    $corefs = $coref->corefSearch();
                    $searchWord = 'for all words';
                }

                $template->getTable($corefs, 'Corefs', $searchWord);
            ?>
        </td>
    </tr>
</table>
<!-- END RESULTS -->
<?php
}

echo '</div></body></html>';


Details

PHP Stanford NLP Datastore


Stores NLP data from Stanford CoreNLP server.

What does it do?

It analyses a text using Stanford CoreNLP server, then stores the result.

Which data gets stored?

  • OpenIE: these are "Subject-Relation-Object" triples. The concept is similar to "Subject-Verb-Object" triples.
    http://stanfordnlp.github.io/CoreNLP/openie.html
    
  • Named Entities: if a word is a "Named Entity", such as a location, name or time, this data is stored.
    http://stanfordnlp.github.io/CoreNLP/ner.html
    
  • Coreference: references to a word from another sentence (for example, a pronoun that refers back to a name) are stored.
    http://stanfordnlp.github.io/CoreNLP/coref.html
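
To make the OpenIE triples more concrete, here is a minimal sketch of collecting them from a decoded CoreNLP JSON response. It is not the package's own code: the function name extractTriples and the sample array are hypothetical, and the sketch assumes each sentence carries an "openie" list whose entries have "subject", "relation" and "object" keys.

```php
<?php

/**
 * Collect "Subject-Relation-Object" triples from a decoded CoreNLP response.
 * Assumed shape (not confirmed by this package):
 * ['sentences' => [['openie' => [['subject' => ..., 'relation' => ..., 'object' => ...]]]]]
 */
function extractTriples(array $response)
{
    $triples = [];
    $sentences = isset($response['sentences']) ? $response['sentences'] : [];

    foreach ($sentences as $sentence) {
        // sentences without OpenIE results are skipped
        $openie = isset($sentence['openie']) ? $sentence['openie'] : [];

        foreach ($openie as $t) {
            $triples[] = [$t['subject'], $t['relation'], $t['object']];
        }
    }

    return $triples;
}

// Made-up sample data, for illustration only
$sample = [
    'sentences' => [
        ['openie' => [
            ['subject' => 'Obama', 'relation' => 'was born in', 'object' => 'Hawaii'],
        ]],
        ['tokens' => []], // a sentence without triples
    ],
];

foreach (extractTriples($sample) as $triple) {
    echo implode(' | ', $triple), PHP_EOL; // prints "Obama | was born in | Hawaii"
}
```

Returning plain arrays keeps the sketch independent of the package's classes; in the package itself the stored data is read back through the OpenIE, NER and Coreference classes shown in the example script.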
    

How does it work?

  • You submit a text.
  • The text is analyzed by the Stanford CoreNLP server.
  • Results are stored in a file-based SQLite database. The database file is called "datastore.db".
    https://sqlite.org/
    https://github.com/sqlitebrowser
    
  • The results are displayed on screen
  • There is also a search form to find data
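
The store-then-search flow above can be sketched with plain PDO and an in-memory SQLite database. This is an illustration only: the package itself uses Doctrine DBAL and its own schema in "datastore.db", so the table name "triple", its columns and the helper searchTriples are all assumptions made for this example. It requires the pdo_sqlite extension.

```php
<?php

// In-memory database so the sketch leaves no file behind
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema, not the package's real one
$pdo->exec('CREATE TABLE triple (
    id INTEGER PRIMARY KEY,
    subject TEXT,
    relation TEXT,
    object TEXT
)');

// Store two made-up analysis results
$insert = $pdo->prepare('INSERT INTO triple (subject, relation, object) VALUES (?, ?, ?)');
$insert->execute(['Obama', 'was born in', 'Hawaii']);
$insert->execute(['Paris', 'is capital of', 'France']);

/**
 * Find all triples in which any part contains the search word.
 */
function searchTriples(PDO $pdo, $word)
{
    $stmt = $pdo->prepare(
        "SELECT subject, relation, object FROM triple
         WHERE subject || ' ' || relation || ' ' || object LIKE ?"
    );
    $stmt->execute(['%' . $word . '%']);

    return $stmt->fetchAll(PDO::FETCH_NUM);
}

foreach (searchTriples($pdo, 'Hawaii') as $row) {
    echo implode(' | ', $row), PHP_EOL; // prints "Obama | was born in | Hawaii"
}
```

The search word is passed as a bound parameter rather than concatenated into the SQL string, which avoids SQL injection from the search form.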

This package depends on Stanford CoreNLP Server

http://stanfordnlp.github.io/CoreNLP/index.html#download

This package also depends on PHP-Stanford-CoreNLP-Adapter

https://github.com/DennisDeSwart/php-stanford-corenlp-adapter

Note: since this package contains a full version of the CoreNLP Adapter, you can use all of its features with this package.

Installation

This package depends on these packages:

http://stanfordnlp.github.io/CoreNLP/index.html#download
https://github.com/DennisDeSwart/php-stanford-corenlp-adapter
https://github.com/doctrine/dbal
https://github.com/guzzle/guzzle

Install procedure using the ZIP files

  • Install Stanford CoreNLP Server. Check the "php-stanford-corenlp-adapter" package for an installation walkthrough
  • Download and unpack the files from this package.
  • Copy the files to your webserver directory, usually "htdocs" or "/var/www".
  • Run a Composer update to install the dependencies.

Install as part of another project

  • Install Stanford CoreNLP Server. Check the "php-stanford-corenlp-adapter" package for an installation walkthrough
  • Add the following lines to your main project's "composer.json" require section:
    {
        "require": {
            "dennis-de-swart/php-stanford-nlp-datastore": "*"
        }
    }

  • Run a Composer update to install the dependencies.
  • Copy these files from "/vendor/dennis-de-swart/php-stanford-nlp-datastore" to your webserver directory, usually "htdocs" or "/var/www":
      • datastore.db
      • bootstrap.php
  • Example code for your main project:
    // instantiate constants and the database
    require_once __DIR__.'/bootstrap.php';
    
    // startup Corenlp Adapter
    $coreNLP = new CorenlpAdapter();
    $coreNLP->getOutput($yourText);
    print_r($coreNLP->serverMemory); // result from CoreNLP Adapter
    
    // Save result to database
    $datastore = new Datastore($db->conn);
    $datastore->storeNLP($coreNLP);
    

Requirements

  • PHP 5.6 or higher; it also works on PHP 7
  • Java SE Runtime Environment, version 1.8
  • Stanford CoreNLP Server 3.7.0
  • Windows or Linux/Unix 64-bit OS; 8 GB or more memory recommended
  • Composer for PHP
    https://getcomposer.org/
    

SQLite Browser

If you need a SQLite browser, see:

http://sqlitebrowser.org/

Important notes

  • Starting the CoreNLP server for the first time takes a while, because it loads a large amount of data.
  • After the first startup, the server will be much faster.
  • In my experience the Stanford CoreNLP server runs best with 8 GB of memory or more. Start the server with "-mx8g" instead of "-mx4g".
  • Use version 3.7.0 of the server; in my experience it gives the best and quickest results.
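
As an illustration of the memory setting, the sketch below builds the Java command that starts the server. The helper name buildServerCommand is hypothetical; the port (9000) and timeout (15000) values are the ones used in the CoreNLP server documentation and are assumptions here, not settings taken from this package. The command is meant to be run from the directory that holds the CoreNLP jar files.

```php
<?php

/**
 * Build the start command for the Stanford CoreNLP server.
 * $memoryGb sets the Java heap size flag, e.g. 8 gives "-mx8g".
 */
function buildServerCommand($memoryGb = 8, $port = 9000)
{
    return sprintf(
        'java -mx%dg -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port %d -timeout 15000',
        $memoryGb,
        $port
    );
}

echo buildServerCommand(8), PHP_EOL;
// prints: java -mx8g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
```

Copy the printed line into a terminal to start the server; the sketch only builds the string and does not launch anything itself.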

Example output

See the screenshots "datastore_result_a.PNG", "datastore_result_b.PNG" and "datastore_result_search.PNG",

and "example.db", which shows what a filled database looks like.

Any questions?

Let me know. You can create an issue on GitHub. Any bugs will be fixed ASAP.


Files

File                          Role      Description
src (11 files)                Folder
bootstrap.php                 Aux.      Bootstrap script
composer.json                 Data      Auxiliary data
datastore_result_a.PNG        Data      Auxiliary data
datastore_result_b.PNG        Data      Auxiliary data
datastore_result_search.PNG   Data      Auxiliary data
index.js                      Data      Auxiliary data
index.php                     Example   Example script
README.md                     Doc.      Documentation

Files / src

File                          Role      Description
Coreference.php               Class     Class source
Database.php                  Class     Class source
Datastore.php                 Class     Class source
NER.php                       Class     Class source
Object.php                    Class     Class source
OpenIE.php                    Class     Class source
Relation.php                  Class     Class source
Sentence.php                  Class     Class source
Subject.php                   Class     Class source
Template.php                  Class     Class source
Word.php                      Class     Class source
