Help creating a flexible base 'find' metho

2020-06-06 01:31发布

问题:

For years now I've been reimplementing the same code over and over (with evolution) without finding some method of cleanly, and efficiently, abstracting it out.

The pattern is a base 'find[Type]s' method in my service layers which abstracts select query creation to a single point in the service, but supports the ability to quickly create easier to use proxy methods (see the example PostServivce::getPostById() method way below).

Unfortunately, so far, I've been unable to satisfy these goals:

  1. Reduce possibility for errors introduced by distinct re-implementation
  2. Expose valid/invalid parameter options to IDEs for autocompletion
  3. Follow the DRY principle

My most recent implementation usually looks something like the following example. The method takes an array of conditions, and an array of options, and from these creates and executes a Doctrine_Query (I mostly rewrote this out here today, so there may be some typos/syntax errors, it's not a direct cut and paste).

class PostService
{
    /* ... */

    /**
     * Return a set of Posts
     *
     * @param Array $conditions Optional. An array of conditions in the format
     *                          array('condition1' => 'value', ...)
     * @param Array $options    Optional. An array of options 
     * @return Array An array of post objects or false if no matches for conditions
     */
    public function getPosts($conditions = array(), $options = array()) {
        $defaultOptions =  = array(
            'orderBy' => array('date_created' => 'DESC'),
            'paginate' => true,
            'hydrate' => 'array',
            'includeAuthor' => false,
            'includeCategories' => false,
        );

        $q = Doctrine_Query::create()
                        ->select('p.*')
                        ->from('Posts p');

        foreach($conditions as $condition => $value) {
            $not = false;
            $in = is_array($value);
            $null = is_null($value);                

            $operator = '=';
            // This part is particularly nasty :(
            // allow for conditions operator specification like
            //   'slug LIKE' => 'foo%',
            //   'comment_count >=' => 1,
            //   'approved NOT' => null,
            //   'id NOT IN' => array(...),
            if(false !== ($spacePos = strpos($conditions, ' '))) {
                $operator = substr($condition, $spacePost+1);
                $conditionStr = substr($condition, 0, $spacePos);

                /* ... snip validate matched condition, throw exception ... */
                if(substr($operatorStr, 0, 4) == 'NOT ') {
                  $not = true;
                  $operatorStr = substr($operatorStr, 4);
                }
                if($operatorStr == 'IN') {
                    $in = true;
                } elseif($operatorStr == 'NOT') {
                    $not = true;
                } else {
                    /* ... snip validate matched condition, throw exception ... */
                    $operator = $operatorStr;
                }

            }

            switch($condition) {
                // Joined table conditions
                case 'Author.role':
                case 'Author.id':
                    // hard set the inclusion of the author table
                    $options['includeAuthor'] = true;

                    // break; intentionally omitted
                /* ... snip other similar cases with omitted breaks ... */
                    // allow the condition to fall through to logic below

                // Model specific condition fields
                case 'id': 
                case 'title':
                case 'body':
                /* ... snip various valid conditions ... */
                    if($in) {
                        if($not) {
                            $q->andWhereNotIn("p.{$condition}", $value);
                        } else {
                            $q->andWhereIn("p.{$condition}", $value);
                        }
                    } elseif ($null) {
                        $q->andWhere("p.{$condition} IS " 
                                     . ($not ? 'NOT ' : '') 
                                     . " NULL");
                    } else {
                        $q->andWhere(
                            "p.{condition} {$operator} ?" 
                                . ($operator == 'BETWEEN' ? ' AND ?' : ''),
                            $value
                        );
                    }
                    break;
                default:
                    throw new Exception("Unknown condition '$condition'");
            }
        }

        // Process options

        // init some later processing flags
        $includeAuthor = $includeCategories = $paginate = false;
        foreach(array_merge_recursivce($detaultOptions, $options) as $option => $value) {
            switch($option) {
                case 'includeAuthor':
                case 'includeCategories':
                case 'paginate':
                /* ... snip ... */
                    $$option = (bool)$value;
                    break;
                case 'limit':
                case 'offset':
                case 'orderBy':
                    $q->$option($value);
                    break;
                case 'hydrate':
                    /* ... set a doctrine hydration mode into $hydration */ 
                    break;
                default:
                    throw new Exception("Invalid option '$option'");
            }
        }

        // Manage some flags...
        if($includeAuthor) {
            $q->leftJoin('p.Authors a')
              ->addSelect('a.*');
        } 

        if($paginate) {
            /* ... wrap query in some custom Doctrine Zend_Paginator class ... */
            return $paginator;
        }

        return $q->execute(array(), $hydration);
    }

    /* ... snip ... */
}

Phewf

The benefits of this base function are:

  1. it allows me to quickly support new conditions and options as the schema evolves
  2. it allows me a means to quickly implement global conditions on the query (for example, adding an 'excludeDisabled' option with a default of true, and filtering all disabled = 0 models, unless a caller explictly says differently).
  3. it allows me to quickly create new, simpler to use, methods which proxy calls back to the findPosts method. For example:
class PostService
{
    /* ... snip ... */

    // A proxy to getPosts that limits results to 1 and returns just that element
    public function getPost($conditions = array(), $options()) {
        $conditions['id'] = $id;
        $options['limit'] = 1;
        $options['paginate'] = false;
        $results = $this->getPosts($conditions, $options);
        if(!empty($results) AND is_array($results)) {
            return array_shift($results);
        }
        return false;
    }

    /* ... docblock ...*/       
    public function getPostById(int $id, $conditions = array(), $options()) {
        $conditions['id'] = $id;
        return $this->getPost($conditions, $options);
    }

    /* ... docblock ...*/
    public function getPostsByAuthorId(int $id, $conditions = array(), $options()) {
        $conditions['Author.id'] = $id;
        return $this->getPosts($conditions, $options);
    }

    /* ... snip ... */
}

The MAJOR drawbacks with this approach are:

  • The same monolithic 'find[Model]s' method gets created in every model-accessing service, with mostly only the condition switch construct and base table names changing.
  • No simple way to perform AND/OR conditon operations. All conditions explicitly ANDed.
  • Introduces many opportunities for typo errors
  • Introduces many opportinuties for breaks in the convention-based API (for example a later service may require implementing a different syntax convention for specifying the orderBy option, which becomes tedious to back-port to all previous services).
  • Violates DRY principles.
  • Valid conditions and options are hidden to IDE auto-completion parsers and the options and conditions parameters require lengthy doc block explanation to track allowed options.

Over the last few days I've attempted to develop a more OO solution to this problem, but have felt like I'm developing TOO complex a solution which will be too rigid and restrictive to use.

The idea I was working towards was something along the lines of the following (current project will be Doctrine2 fyi, so slight change there)...

namespace Foo\Service;

use Foo\Service\PostService\FindConditions; // extends a common \Foo\FindConditions abstract
use Foo\FindConditions\Mapper\Dql as DqlConditionsMapper;

use Foo\Service\PostService\FindOptions; // extends a common \Foo\FindOptions abstract
use Foo\FindOptions\Mapper\Dql as DqlOptionsMapper;

use \Doctrine\ORM\QueryBuilder;

class PostService
{
    /* ... snip ... */
    public function findUsers(FindConditions $conditions = null, FindOptions $options = null) {

        /* ... snip instantiate $q as a Doctrine\ORM\QueryBuilder ... */

        // Verbose
        $mapper = new DqlConditionsMapper();
        $q = $mapper
                ->setQuery($q)
                ->setConditions($conditions)
                ->map();

        // Concise
        $optionsMapper = new DqlOptionsMapper($q);        
        $q = $optionsMapper->map($options);


        if($conditionsMapper->hasUnmappedConditions()) {
            /* .. very specific condition handling ... */
        }
        if($optionsMapper->hasUnmappedConditions()) {
            /* .. very specific condition handling ... */
        }

        if($conditions->paginate) {
            return new Some_Doctrine2_Zend_Paginator_Adapter($q);
        } else {
            return $q->execute();
        }
    }

    /* ... snip ... */
}

And lastly, a sample of the Foo\Service\PostService\FindConditions class:

namespace Foo\Service\PostService;

use Foo\Options\FindConditions as FindConditionsAbstract;

class FindConditions extends FindConditionsAbstract {

    protected $_allowedOptions = array(
        'user_id',
        'status',
        'Credentials.credential',
    );

    /* ... snip explicit get/sets for allowed options to provide ide autocompletion help */
}

Foo\Options\FindConditions and Foo\Options\FindOptions are really quite similar, so, for now at least, they both extend a common Foo\Options parent class. This parent class handles intializing allowed variables and default values, accessing the set options, restricting access to only defined options, and providing an iterator interface for the DqlOptionsMapper to loop through options.

Unfortunately, after hacking at this for a few days now, I'm feeling frustrated with the complexity of this system. As is, there is still no support in this for condition groups and OR conditions, and the ability to specify alternate condition comparison operators has been a complete quagmire of creating a Foo\Options\FindConditions\Comparison class wrap around a value when specifying an FindConditions value ($conditions->setCondition('Foo', new Comparison('NOT LIKE', 'bar'));).

I'd much rather use someone else's solution if it existed, but I've yet to come across anything that does what I'm looking for.

I'd like to get beyond this process and back to actually building the project I'm working on, but I don't even see an end in sight.

So, Stack Overflowers: - Is there any better way that provides the benefits I've identified without including the drawbacks?

回答1:

I think you're overcomplicating things.

I've worked on a project using Doctrine 2 which has quite a lot of entities, different uses for them, various services, custom repositories etc. and I've found something like this works rather well (for me anyway)..

1. Repositories for queries

Firstly, I don't generally do queries in services. Doctrine 2 provides the EntityRepository and the option of subclassing it for each entity for this exact purpose.

  • Whenever possible, I use the standard findOneBy... and findBy... style magic methods. This saves me from having to write DQL myself and works rather nicely out of the box.
  • If I need more complicated querying logic, I usually create use-case specific finders in the repositories. These are things like UserRepository.findByNameStartsWith and things like that.
  • I generally don't create a super fancy "I can take any args you give me!" type of magic finders. If I need a specific query, I add a specific method. While this may seem like it requires you to write more code, I think it's a much simpler and easier to understand way of doing things. (I tried going through your finder code and it was rather complicated looking in places)

So in other words...

  • Try to use what doctrine already gives you (magic finder methods)
  • Use custom repository classes if you need custom querying logic
  • Create a method per query type

2. Services for combining non-entity logic

Use services to combine "transactions" behind a simple interface you can use from your controllers or test easily with unit tests.

For example, let's say your users can add friends. Whenever a user friends someone else, an email is dispatched to the other person to notify. This is something you would have in your service.

Your service would (for example) include a method addNewFriend which takes two users. Then, it could use a repository to query for some data, update the users' friend arrays, and call some other class which then sends the email.

You can use the entitymanager in your services for getting repository classes or persisting entities.

3. Entities for entity-specific logic

Lastly you should try to put your business logic that is specific to an entity directly into the entity class.

A simple example for this case could be that maybe the email sending in the above scenario uses some sort of a greeting.. "Hello Mr. Anderson", or "Hello Ms. Anderson".

So for example you would need some logic to determine the appropriate greeting. This is something you could have in the entity class - For example, getGreeting or something, which could then take into account the user's gender and nationality and return something based on that. (assuming gender and nationality would be stored in the database, but not the greeting itself - the greeting would be calculated by the function's logic)

I should probably also point out that the entities should generally not know of either the entitymanager or the repositories. If the logic requires either of these, it probably doesn't belong into the entity class itself.

Benefits of this approach

I have found the approach I've detailed here works rather well. It's maintainable because it generally is quite "obvious" to what things do, it doesn't depend on complicated querying behavior, and because things are split clearly into different "areas" (repos, services, entities) it's quite straightforward to unit test as well.