Friday, October 19, 2007

First day - Introduction

MySQL is a powerful database engine. I had used it in several of my projects. It is fast and simple. That is what I need.

Nevertheless, while using MySQL in a big LAMP project I found slow queries (in a range from 30 seconds to 5 minutes long). And that is really weird if we are talking about MySQL. First explanation that comes to my head is that I have done something wrong. MySQL cannot be so slow.

I admit the situation is a bit uncommon. I have indexed 100.000 documents with 10.000 words each one. This means the index created to manage the full-text search is around 1 Gb in size.

I think I have discovered that the query response times are good (around 1 second) when the server RAM memory is greater than the index size. But when the index size is bigger that the RAM, the response times growth reaching benchmarks of 30 seconds and more per query.

I say again, that could be only a misconfiguration or a lack of knowledge in the MySQL internals for my part. I have been testing several queries and different versions of MySQL in diverse Linux servers. All these tests are leading me in one direction: index must be full cached in RAM in order to achieve fast query results.

I don't feel confortable in that situation. 30 seconds per query is unacceptable. What if I try to index bigger databases? I imagine the internal process of looking for a word in the index and then match it against the text that contains it; I don't understand why is it necessary the index to be cached.

I accept that random access reads to the disk are needed to find the matched document: one of them to find the index and another one to find the document. But two disk reads are not taking 30 seconds. That is sure.

So I need to understand this problem in-deph. I would like to know my error, or otherwise why MySQL Full-text is slow when the index is bigger than the amount of RAM available in the server.

Here, today I begin my investigation. If you have some idea about this problem, any hint that could help me to solve the issue, please contact me. Thanks in advance.

No comments: