Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long "simple search" queries get "stuck" in optimizer (using internal search) #120

Closed
drtjmb opened this issue Aug 13, 2013 · 1 comment
Closed

Comments

@drtjmb
Copy link
Member

drtjmb commented Aug 13, 2013

Internal simple search for N terms (eg. "foo bar baz" N = 3) produces an SQL query that performs multiple table joins (see below).

For N > 10 I have reports of the query remaining in the "statistics" state (ie. developing a query execution plan) for a very long time (hours).

MySQL chooses the optimum plan by looking at each possible plan; the number of plans is N! - so with a large number of joins there are a very large number of possible plans to be evaluated, in fact MySQL spends longer evaluating query plans than it would to execute the query.

Server setting optimizer_search_depth controls depth of plan evaluations (default = 62). Setting this to a low value limits the number of plans that will be evaluated - in the case of simple search setting this to 3 or 4 drastically reduces query times (to a few seconds). Obviously the trade off is that MySQL might miss the most optimal plan - but time saved by not looking for an optimal plan outweighed any performance benefit in this case.

Ideal solution is obviously to produce better SQL :)

Workaround is to set optimizer_search_depth at the global MySQL level or when EPrints connects to the DB - Database::connect():

--- perl_lib/EPrints/Database/mysql.pm.orig     2013-08-13 13:19:21.965667487 +0100
+++ perl_lib/EPrints/Database/mysql.pm  2013-08-13 13:20:35.421666314 +0100
@@ -239,6 +239,8 @@
        if( $rc )
        {
                $self->do("SET NAMES 'utf8'");
+               $self->do('SET @@session.optimizer_search_depth = 3;');
        }

        return $rc;

Simplified SQL extract from running a simple search with N terms:

SELECT `eprint`.`eprintid` FROM 
    `eprint` LEFT JOIN `eprint__ordervalues_en` ON `eprint`.`eprintid`=`eprint__ordervalues_en`.`eprintid`, 
    (SELECT `eprint`.`eprintid` AS `eprintid` FROM <UNIONs searching for "foo">) AS `and_140576402459664_0`, 
    (SELECT `eprint`.`eprintid` AS `eprintid` FROM <UNIONs searching for "bar">) AS `and_140576402459664_1`, 
    (SELECT `eprint`.`eprintid` AS `eprintid` FROM <UNIONs searching for "baz">) AS `and_140576402459664_2`, 
    ...
    (SELECT `eprint`.`eprintid` AS `eprintid` FROM <UNIONS searching for Nth term>) AS `and_140576402459664_N',

(Thanks to DL, DBA Administrator @ UEA)

@sebastfr
Copy link

Ideal solution is obviously to produce better SQL

Or use Xapian?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants