Database Indexes: Selects Only!

Good day,

I have about 4GB of data spread across about 10 different tables. Each table has many columns, and each column can be a search term in a query. I'm not a DBA at all and I don't know much about indexes, but I want to speed up searches as much as possible. The important point is that there will be no update, insert or delete at any time (the tables will be refilled every 4 months). Is it appropriate to create an index on each column? Remember: no insert, update or delete, only selects! Also, if I can make all of these columns integer instead of varchar, would it make a difference in speed?

Many thanks!

+1




5 answers


Answer: No. Indexing each column separately is not a good design. In many cases, indexes must contain multiple columns, and there are different types of indexes for different requirements.

The tuning wizard mentioned in other answers is a good first cut (especially for a learner).



Don't try to guess your way through it or hope you will understand complex analyses; get advice specific to your situation. There seem to be several quite active threads here on specific situations and query optimization.

+6




Have you looked at the Index Tuning Wizard? It will give you index suggestions based on a workload.



+4




Absolutely not.

You need to understand how indexes work. Say you have a table of 1000 records and one column is a BIT that can hold only one of two values. An index on that column alone will be useless because it is not selective enough. When you index a column, be very aware of what kinds of selects will be run against the table. Ask yourself: will an index on this column be selective enough for the optimizer to use it effectively?
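One rough way to put a number on "selective enough" is the ratio of distinct values to total rows. The sketch below uses Python's built-in sqlite3 module as a stand-in DBMS; the table `records`, its columns, and the helper `selectivity` are made up for illustration, and what counts as a good ratio depends on your optimizer.

```python
import sqlite3

# Hypothetical table: 1000 rows, a two-valued flag and a unique code.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, is_active INTEGER, code TEXT)"
)
conn.executemany(
    "INSERT INTO records (is_active, code) VALUES (?, ?)",
    [(i % 2, f"C{i:04d}") for i in range(1000)],
)

def selectivity(conn, table, column):
    """Return distinct values / total rows (1.0 = unique, near 0 = poor)."""
    # Identifiers are interpolated directly, so only pass trusted names.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total

flag_selectivity = selectivity(conn, "records", "is_active")
code_selectivity = selectivity(conn, "records", "code")
print(flag_selectivity, code_selectivity)
```

The two-valued flag comes out close to zero, which is the BIT case described above; the unique code comes out at 1.0.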

At that point, you may well find that a few carefully chosen composite indexes vastly outperform many separate single-column indexes. Golden rule: how the database is queried determines how you should design your indexes.
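As a small illustration of a composite index, here is a sketch using Python's sqlite3 module. SQLite is only a stand-in, and the `orders` table, its columns, and the `plan` helper are assumptions for the example. It prints the query plan for three filters so you can see which ones can use an index on (region, status).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, status TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (region, status, amount) VALUES (?, ?, ?)",
    [(f"R{i % 50}", f"S{i % 7}", i) for i in range(5000)],
)
# One composite index instead of one index per column.
conn.execute("CREATE INDEX idx_orders_region_status ON orders (region, status)")

def plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

both = plan(conn, "SELECT amount FROM orders WHERE region = ? AND status = ?", ("R1", "S1"))
leading = plan(conn, "SELECT amount FROM orders WHERE region = ?", ("R1",))
trailing = plan(conn, "SELECT amount FROM orders WHERE status = ?", ("S1",))

for label, detail in (("both", both), ("leading", leading), ("trailing", trailing)):
    print(f"{label}: {detail}")
```

In this setup the index serves the query on both columns and the query on the leading column alone, but not the query on the trailing column alone, which is why the column order has to follow the queries you actually run.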

+3




Two pieces of missing information: how many different values are in each column, and which DBMS you are using. If you are using Oracle and have fewer than a few thousand different values for each column, you can create bitmap indexes. They are very efficient in terms of space and execution for exact matches.

Otherwise, it's a trade-off: each index will take roughly as much space as a single-column table containing the same data, so you will essentially double (probably 2.5x) your space requirements. That means maybe 10GB, which isn't much data.
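The estimate above is simply 4GB x 2.5 = 10GB. If you would rather measure than guess, a sketch like the following shows the idea on a throwaway SQLite file via Python's sqlite3 module. The table `t`, its columns, and the helper `db_bytes` are hypothetical, and the growth factor you see will differ from DBMS to DBMS and from data set to data set.

```python
import os
import sqlite3
import tempfile

# Throwaway database file so the size can be read from page counts.
path = os.path.join(tempfile.mkdtemp(), "sample.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(i, i * 7 % 1000, i * 13 % 100) for i in range(20000)],
)
conn.commit()

def db_bytes(conn):
    """Database size in bytes, computed as page count times page size."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    return page_count * page_size

size_before = db_bytes(conn)
# Index every column, as the question proposes.
for col in ("a", "b", "c"):
    conn.execute(f"CREATE INDEX idx_t_{col} ON t ({col})")
conn.commit()
size_after = db_bytes(conn)

growth_factor = size_after / size_before
print(f"before: {size_before} bytes, after: {size_after} bytes, x{growth_factor:.2f}")
```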

The question then becomes whether your DBMS will efficiently merge multiple index-based selections. It quite possibly won't, unless you are joining on every column you select.

Best answer: try it on a smaller dataset (so you don't spend all your time creating indexes) and see how it works.
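A trial on a small sample could look something like this. It is a minimal sketch using Python's sqlite3 module, with a made-up `sample` table and a hypothetical `timed` helper; substitute your own DBMS, a slice of your real data, and your real queries.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (id INTEGER PRIMARY KEY, city TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO sample (city, score) VALUES (?, ?)",
    [(f"city{i % 500}", i % 100) for i in range(50000)],
)

QUERY = "SELECT COUNT(*) FROM sample WHERE city = ?"

def timed(conn, sql, params, repeats=20):
    """Run the query several times; return its result and the total seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = conn.execute(sql, params).fetchone()[0]
    return result, time.perf_counter() - start

count_before, secs_before = timed(conn, QUERY, ("city42",))
conn.execute("CREATE INDEX idx_sample_city ON sample (city)")
count_after, secs_after = timed(conn, QUERY, ("city42",))

# The answer must not change; only the time taken should.
print(f"rows matched: {count_before} / {count_after}")
print(f"no index: {secs_before:.4f}s, with index: {secs_after:.4f}s")
```

Timings on a toy table say little about 4GB of real data, so treat the numbers as a sanity check of the method rather than a prediction.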

+1




If a query selects columns from a table beyond those contained in the indexes it uses, you will inevitably run into bookmark lookups in the query plan: the query processor has to retrieve the uncovered columns from the clustered index, using the row identifier stored in the leaf rows of the associated nonclustered index.

In my experience, bookmark lookups can really kill query performance because of the extra reads they need and because each row has to be resolved individually in the clustered index. This is why I try to make NC indexes covering whenever possible. That is easier on smaller tables where the required query plans are well known, but if you have large tables with a lot of columns and arbitrary queries are expected, it probably won't be feasible.

This means you only get bang for your buck from any type of NC index if the index is covering, or if it selects a small enough dataset to keep the cost of the bookmark lookups low. Indeed, you might find that the query optimizer won't even look at your indexes if the cost is prohibitive compared to a clustered index scan, where all columns are already available.
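The difference between a covering and a non-covering index can be seen in a query plan. The sketch below uses SQLite through Python's sqlite3 module purely as a stand-in: SQLite has no clustered and nonclustered indexes in the SQL Server sense, but its extra trip from the index back to the table row is a similar idea. The `people` table and the `plan` helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT, notes TEXT)"
)
conn.execute("CREATE INDEX idx_people_last_first ON people (last_name, first_name)")

def plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

# Every column the query needs is in the index: no trip back to the table.
covered = plan(conn, "SELECT first_name FROM people WHERE last_name = ?", ("Smith",))
# notes is not in the index, so each matching row is fetched from the table too.
not_covered = plan(conn, "SELECT notes FROM people WHERE last_name = ?", ("Smith",))

print("covered:    ", covered)
print("not covered:", not_covered)
```

SQLite labels the first plan as using a covering index and the second as using a plain index, which is where the extra per-row reads come from.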

Thus, there is no point in creating an index unless you know it will speed up a given query. The value of an index is therefore proportional to the percentage of queries on a given table that it can optimize, and this can only be determined by analyzing the queries actually being executed, which is what the Index Tuning Wizard does for you.
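To make "percentage of queries" concrete, here is a toy tally in Python. It is not what the Index Tuning Wizard does internally; it only counts how often each combination of filter columns shows up in a workload. The `workload` list and the column names are invented for the example.

```python
from collections import Counter

# Hypothetical captured workload: the columns each query filters on.
workload = [
    ("region", "status"),
    ("region", "status"),
    ("region",),
    ("customer_id",),
    ("region", "status"),
]

# Count each distinct combination, ignoring the order columns were written in.
column_sets = Counter(frozenset(cols) for cols in workload)
total = len(workload)

# Fraction of the workload that filters on each combination.
share = {tuple(sorted(cols)): count / total for cols, count in column_sets.items()}

for cols, fraction in sorted(share.items(), key=lambda item: -item[1]):
    print(f"{', '.join(cols)}: {fraction:.0%} of queries")
```

Here an index on (region, status) would be the first candidate, since that combination accounts for most of the sample workload.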

So, in summary:

1) Don't index every column. This is a classic premature optimization. You cannot optimize a large table with indexes for all possible query plans in advance.

2) Do not index any column until you have captured and run the underlying workload using the Index Tuning Wizard. This workload should be representative of your application's usage patterns so that the wizard can determine which indexes will actually help your queries.

0








