Sunday, October 28, 2007

MyISAM full-text indexed have weight

As exposed at Usenet I still haven't found the cause of MySQL reading MYD files when querying only index information, such as number of results, or id fields. I consider that next queries shouldn't force MySQL to extract information from the MYD file because the index MYI content should be enough:

SELECT COUNT(*) FROM articles WHERE MATCH (title,body) AGAINST ('keyword');
SELECT id FROM articles WHERE MATCH (title,body) AGAINST ('keyword');

But the fact is that on several versions of MySQL I have tested, MYD files are being reading exhaustively when performing these queries. If anybody could give me a hint on this issue, I would appreciate it.

Studying the MyISAM index structure when doing full-text matching, I have found that Jeremy D. Zawodny and Derek J. Balling hold that full-text indexes are implemented with a two-part MyISAM B-tree index. First field is the VARCHAR that stores the keyword and second one is a FLOAT that measures his weight.

I am confused. If the index is weighted, why does MySQL reads MYD file to find out the MATCHING order in a query like this:

SELECT id, MATCH (title,body) AGAINST ('keyword') AS score FROM articles ORDER BY score DESC LIMIT 0,10;

These kind of queries are slow under MySQL because MYD files are consulted, and that is the paradox. Why to consult them if it isn't required? According to Jeremy D. Zawodny and Derek J. Balling the index know the keyword and its weight. Isn't it enough?

1 comment:

dsims said...

try using IN BOOLEAN MODE on your searches and set a relevance score