PHP Classes

Improving the use of a MongoDB database with the help of Symfony Listeners



Categories: PHP Tutorials

Sometimes applications need to filter large amounts of information to show the user a small subset of relevant data.

However, when the amount of data to filter is too large, it may not be feasible to load all of it into memory just to filter it.

Read this article to learn about an alternative approach using a MongoDB document and Symfony listeners to limit the amount of data that needs to be traversed in memory.





The Problem

Recently, in a real project, I faced the challenge of handling large amounts of data in HTTP requests. To put this in context, imagine a large number of records categorized by two different taxonomies: catalog and bundle.

To let the user filter by catalog or bundle, the application has to read all the records, build an array from them, and send it to the view, for instance to show this information in the sidebar.

Because this work is repeated on every request and grows with the number of records, the approach stops being viable once the amount of data becomes large.
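To make the cost concrete, here is a minimal sketch of the naive approach described above. The function name buildFilters and the record layout (plain arrays with 'catalog' and 'bundle' keys) are assumptions made for illustration; the real project works with Translation documents. It assumes PHP 7.1 or later.

```php
<?php

/**
 * Naive approach: walk every translation record of a project and count
 * how many records belong to each catalog and each bundle.
 * The record layout (arrays with 'catalog' and 'bundle' keys) is an
 * assumption made for this sketch.
 */
function buildFilters(iterable $translations): array
{
    $catalogs = [];
    $bundles  = [];
    foreach ($translations as $translation) {
        $catalog = $translation['catalog'];
        $bundle  = $translation['bundle'];
        $catalogs[$catalog] = ($catalogs[$catalog] ?? 0) + 1;
        $bundles[$bundle]   = ($bundles[$bundle] ?? 0) + 1;
    }
    // Sort by name so the sidebar shows a stable order
    ksort($catalogs);
    ksort($bundles);

    return ['catalogs' => $catalogs, 'bundles' => $bundles];
}

$filters = buildFilters([
    ['catalog' => 'messages',   'bundle' => 'AcmeSampleBundle'],
    ['catalog' => 'validators', 'bundle' => 'AcmeSampleBundle'],
    ['catalog' => 'messages',   'bundle' => 'AcmeUserBundle'],
]);
```

Every request that renders the sidebar has to run this loop over all the records of the project, which is the cost the rest of the article tries to avoid.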

Using Listeners to Process Data

Thinking about how this problem could be solved, I remembered what I had learned about Doctrine listeners and event subscribers. Now it was time to apply that knowledge.

A better solution is to store the summarized data in the database, keep it updated, and load it from there. As mentioned above, to get all the catalogs and bundles of a given project I had to read all the translations of that project and prepare an array to send to the view.

[Figure: Translation load/persist diagram]

The first step is to create a new document in MongoDB that holds the information we need: the catalogs and bundles of each project.

[Figure: ProjectInfo document]
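The summary document could look roughly like the sketch below. This is a plain PHP class, not the author's actual document: the Doctrine ODM mapping is omitted, and the counter layout (name => number of translations) is an assumption inferred from the addBundle/subBundle/addCatalog/subCatalog calls in the listener and from the ksort() calls in the repository.

```php
<?php

/**
 * Plain-PHP sketch of the summary document. The Doctrine ODM mapping is
 * omitted, and the counter layout (name => count) is an assumption.
 */
class ProjectInfo
{
    /** @var int */
    private $projectId;

    /** @var array catalog name => number of translations using it */
    private $catalogs = [];

    /** @var array bundle name => number of translations using it */
    private $bundles = [];

    public function setProjectId($projectId)
    {
        $this->projectId = (int) $projectId;
    }

    public function getProjectId()
    {
        return $this->projectId;
    }

    public function addCatalog($name)
    {
        $this->catalogs = self::increment($this->catalogs, $name);
    }

    public function subCatalog($name)
    {
        $this->catalogs = self::decrement($this->catalogs, $name);
    }

    public function addBundle($name)
    {
        $this->bundles = self::increment($this->bundles, $name);
    }

    public function subBundle($name)
    {
        $this->bundles = self::decrement($this->bundles, $name);
    }

    public function getCatalogs()
    {
        return $this->catalogs;
    }

    public function getBundles()
    {
        return $this->bundles;
    }

    /** Adds one to the counter for $name, creating it if needed. */
    private static function increment(array $counters, $name)
    {
        $counters[$name] = (isset($counters[$name]) ? $counters[$name] : 0) + 1;

        return $counters;
    }

    /** Subtracts one from the counter for $name and drops it at zero. */
    private static function decrement(array $counters, $name)
    {
        if (isset($counters[$name]) && --$counters[$name] <= 0) {
            unset($counters[$name]);
        }

        return $counters;
    }
}

$info = new ProjectInfo();
$info->setProjectId(1);
$info->addCatalog('messages');
$info->addCatalog('messages');
$info->addCatalog('validators');
$info->subCatalog('validators'); // last "validators" translation removed
$info->addBundle('AcmeSampleBundle');
```

Keeping a counter per name, rather than a plain list of names, lets a catalog or bundle disappear from the summary only when the last translation that uses it is removed.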

To update this document when a Translation is added, updated or removed, we need to prepare a listener to do the hard work.

[src/Acme/SampleBundle/Resources/config/services.xml]

<?xml version="1.0" ?>
<container ...>
 <services>
  <service id="translations_listener"
   class="Acme\SampleBundle\Listener\SampleListener">
   <tag name="doctrine_mongodb.odm.event_listener" 
    event="postLoad" connection="default" />
   <tag name="doctrine_mongodb.odm.event_listener" 
    event="postPersist" connection="default" />
   <tag name="doctrine_mongodb.odm.event_listener" 
    event="postUpdate" connection="default" />
   <tag name="doctrine_mongodb.odm.event_listener" 
    event="preRemove" connection="default" />
  </service>
 </services>
</container>

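The listener class below implements these lifecycle events. postLoad remembers the bundle and catalog of the loaded Translation so that postUpdate can subtract the old values before adding the new ones: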

<?php

namespace Acme\SampleBundle\Listener;

use Doctrine\ODM\MongoDB\Event\LifecycleEventArgs;
use Acme\SampleBundle\Document\ProjectInfo;
use Acme\SampleBundle\Document\Translation;

class SampleListener
{
 /**
  * Id, bundle and catalog of the last loaded Translation, kept so that
  * postUpdate can subtract the old values before adding the new ones
  */
 protected $cache;
 /**
  * @param LifecycleEventArgs $eventArgs
  */
 public function postUpdate(LifecycleEventArgs $eventArgs)
 {
  $document = $eventArgs->getDocument();
  $dm    = $eventArgs->getDocumentManager();
  if ($document instanceof Translation) {
   /** @var Translation $document */
   $projectId   = $document->getProjectId();
   $projectInfo = $dm->getRepository('SampleBundle:ProjectInfo')
 ->getProjectInfo($projectId);
   if(!$projectInfo){
    $projectInfo = new ProjectInfo();
    $projectInfo->setProjectId($projectId);
   }
   if(isset($this->cache["id"]) && 
   ($this->cache["id"]==$document->getId())){
    $projectInfo->subBundle($this->cache['bundle']);
    $projectInfo->subCatalog($this->cache['catalog']);
   }
   $projectInfo->addBundle($document->getBundle());
   $projectInfo->addCatalog($document->getCatalog());
   $dm->persist($projectInfo);
   $dm->flush();
  }
 }
 /**
  * @param LifecycleEventArgs $eventArgs
  */
 public function postPersist(LifecycleEventArgs $eventArgs)
 {
  $document = $eventArgs->getDocument();
  $dm    = $eventArgs->getDocumentManager();
  if ($document instanceof Translation) {
   /** @var Translation $document */
   $projectId   = $document->getProjectId();
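   $projectInfo = $dm->getRepository('SampleBundle:ProjectInfo')
    ->getProjectInfo($projectId);
   if(!$projectInfo){
    // first translation of this project: create its summary
    $projectInfo = new ProjectInfo();
    $projectInfo->setProjectId($projectId);
   }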
   $projectInfo->addBundle($document->getBundle());
   $projectInfo->addCatalog($document->getCatalog());
   $dm->persist($projectInfo);
   $dm->flush();
  }
 }
 /**
  * @param LifecycleEventArgs $eventArgs
  */
 public function preRemove(LifecycleEventArgs $eventArgs)
 {
  $document = $eventArgs->getDocument();
  $dm    = $eventArgs->getDocumentManager();
  if ($document instanceof Translation) {
   /** @var Translation $document */
   $projectId   = $document->getProjectId();
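   $projectInfo = $dm->getRepository('SampleBundle:ProjectInfo')
    ->getProjectInfo($projectId);
   if(!$projectInfo){
    // no summary exists for this project, so there is nothing to subtract
    return;
   }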
   $projectInfo->subBundle($document->getBundle());
   $projectInfo->subCatalog($document->getCatalog());
   $dm->persist($projectInfo);
   $dm->flush();
  }
 }
 /**
  * @param LifecycleEventArgs $eventArgs
  */
 public function postLoad(LifecycleEventArgs $eventArgs)
 {
  $document = $eventArgs->getDocument();
  if ($document instanceof Translation) {
   $this->cache = array(
    'id'   => $document->getId(),
    'bundle'  => $document->getBundle(),
    'catalog' => $document->getCatalog(),
   );
   return;
  }
  // if document ...
 }
}
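One thing to keep in mind with the listener above: its $cache property holds only the most recently loaded Translation. If several translations are loaded before one of them is updated, the id check in postUpdate would seem to fail for every document except the last one, and the old bundle and catalog would not be subtracted. A possible variation, sketched here under that assumption, is to remember the loaded values per document id. The class LoadedValuesCache and its methods are hypothetical helpers, not part of the original code.

```php
<?php

/**
 * Remembers the bundle and catalog each document had when it was loaded,
 * keyed by document id. Hypothetical helper, not part of the article's code.
 */
class LoadedValuesCache
{
    /** @var array document id => array('bundle' => ..., 'catalog' => ...) */
    private $values = [];

    /** Stores (or overwrites) the values seen for the given document id. */
    public function remember($id, $bundle, $catalog)
    {
        $this->values[(string) $id] = [
            'bundle'  => $bundle,
            'catalog' => $catalog,
        ];
    }

    /** Returns the remembered values, or null if the id was never seen. */
    public function get($id)
    {
        $key = (string) $id;

        return isset($this->values[$key]) ? $this->values[$key] : null;
    }
}

$cache = new LoadedValuesCache();
// postLoad would call remember() for every Translation that is loaded
$cache->remember('a1', 'AcmeSampleBundle', 'messages');
$cache->remember('b2', 'AcmeUserBundle', 'validators');

// postUpdate would look up the old values of the document being updated
$old     = $cache->get('a1');
$unknown = $cache->get('c3');
```

In postUpdate, after the counters have been adjusted, calling remember() again with the new values would keep the cache correct for a second update of the same document within the same request.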

Now the only remaining step is to create a repository that reads from the new summary document, so the full set of translations never has to be loaded.

<?php

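namespace Acme\SampleBundle\Document\Repository;

// Without this import the instanceof checks below never match
use Acme\SampleBundle\Document\ProjectInfo;
use Doctrine\ODM\MongoDB\DocumentRepository;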

class ProjectInfoRepository extends DocumentRepository
{
 /**
  * @param $projectId
  * @param bool $sorted
  * @return mixed
  */

 public function getCatalogs($projectId, $sorted = true)
 {
  $projectInfo = $this->getProjectInfo($projectId);
  if(!$projectInfo instanceof ProjectInfo){
   return array();
  }
  $result = $projectInfo->getCatalogs();
  if ($sorted && is_array($result)) {
   ksort($result);
  }
  return $result;
 }

 /**
   * @param $projectId
   * @param bool $sorted
   * @return mixed
   */
 public function getBundles($projectId, $sorted = true)
 {
  $projectInfo = $this->getProjectInfo($projectId);
  if(!$projectInfo instanceof ProjectInfo){
   return array();
  }
  $result = $projectInfo->getBundles();
  if ($sorted && is_array($result)) {
   ksort($result);
  }
  return $result;
 }

 /**
   * @param $projectId
   * @return ProjectInfo
   */
 public function getProjectInfo($projectId)
 {
  // This class is already the ProjectInfo repository, so query it directly
  return $this->findOneBy(array('projectId' => intval($projectId)));
 }
}

This diagram demonstrates what happens now when you persist one translation record:

To get the bundles or catalogs of a project, you no longer need to read all its translation records and process them in memory. You only need to read the project's ProjectInfo document, which the listener keeps up to date.

All the code presented here runs in a real project called tradukoj, of which I recently released a public version.

If you have questions or comments, just post a comment here.



