Advertisement
If you have a new account but are having problems posting or verifying your account, please email us on hello@boards.ie for help. Thanks :)
Hello all! Please ensure that you are posting a new thread or question in the appropriate forum. The Feedback forum is overwhelmed with questions that are having to be moved elsewhere. If you need help to verify your account contact hello@boards.ie

SQL Index - Help needed

Options
  • 21-04-2005 10:31am
    #1
    Moderators, Recreation & Hobbies Moderators Posts: 10,912 Mod ✭✭✭✭


    I've inherited a database (MSSQL 2000) at work from someone who's just left and it looks like it needs a little cleaning up. It was at 45Gb before I reindexed it and now it's down to 32Gb, though I only have 5Gb of actual data.

    The rest of the space is due to the amount of indexes created.

    I have a table called 'numbers' and it has several indexes with different names but point to the same data;

    Index 1 : Phone Number
    Index 2 ; Name, Address
    Index 3 : Phone Number

    In this case can I just delete the Index 3 seing as there's already an index created which points to the same data or will I end up breaking something if I do that.

    Do procedures look for a specific index by name when SQL is running queries?


Comments

  • Registered Users Posts: 15,443 ✭✭✭✭bonkey


    Ponster wrote:
    The rest of the space is due to the amount of indexes created.
    Heh. I'm wondering if this is suggesting the classic indexing "strategy" of "if its a field, build an index on it".
    Index 1 : Phone Number
    Index 2 ; Name, Address
    Index 3 : Phone Number

    In this case can I just delete the Index 3 seing as there's already an index created which points to the same data or will I end up breaking something if I do that.
    It should be ok to drop them....but....

    ...there's an outside chance you could break something.

    I'll explain....
    Do procedures look for a specific index by name when SQL is running queries?
    By default - no. You can use what are called compiler hints, though, to force the use of a specific index. So first thing is to read up on compiler hints and make sure whatever application uses this DB isn't usign compiler hints in its queries (or, if it is, that your index isn't being referenced).

    Secondly, you could have stored procedures who's cached execution plans reference the index you drop. This is easy to get around, and you'll probably run a full database maintenance job when you're done (right?), which would remove that issue anyway.

    Also, the choice of which index to delete isn't always trivial. If there's a Primary Key or Unique Key Constraint on a field (or set of fields) there will be an index created "automagically" to enforce that. You don't want to drop these indexes (or, if you did, you'd actually drop the PKC / UKC which would then drop the index). In my experience, this is the most typical cause of duplicated indices...people define a field as PK/Unique, and then also define a Unique Index on it.

    My SQL Index Optimisation is a bit rusty, but I also seem to recall that composite indexes are a special case worth paying attention to. An Index on columns A, B, C will also serve as an index on A, B, and as an index on A (but not on any other mix or match). So you could have more needless duplication there. For example, if you have an index on Name, then its probably rendered obsolete by the index you have on Name & Address. If you have an index on Address, however, it is not rendered obsolete. Hopefully you're seeing how that all works?

    After that, you'll be getting into the more difficult questions - which index should be the clustered one....are indexes actually being used....is there need for / benefit from various indices....and so on and so forth.

    Its probbly worth reading up on the Index Tuning stuff which ships with MSSQL 2000. The wizard won't give you a perfect system, but its usually quite a good starting point.

    HTH,

    jc


  • Registered Users Posts: 2,781 ✭✭✭amen


    How are you determing that the indexs are causing your space issues ?

    The first thing I would do is look at the actucal space used by the data file and logs. You may find that the log file is very large. If so has it ever been backed up and has it been set to backup on a regulare basis and truncate after back up ? Is the db set to backup on a regular basis ?
    If there is a large amount of logs I would truncate the log (after backing up)
    backup log DBNAME with no_log
    once completed I would then shrink the data files and log
    truncate the log and then reshrink the data files and log
    Make sure you only do this when there is no one on the system as this may take a long time.

    As for the Full Database maintenace job this should be set to run at least once a week. I normally daily maintenance jobs which rebuild the indexs each night and also recompile the sps.

    I would be interested to know the number of tables in your system and the volume of data ( daily data growth)

    As for which indexes are need as bonkey says this is very subjective and requires some careful monitoring.
    A code thing to do before changing anything would be to create a test database and also script all databases objects and make the changes to the test database first.
    I would probally run a SQL profile and get it to trace all sps calls over a day and then run them on test database to ensure you get comparible results.
    Then as the indexes are changed rerun sps and verify in profiler to ensure you are not degrading db performance.

    As an ongoing item I would run profiler and trace any sps that take more than 0.5 secs ( or a time appropriate to your system) to complete and determine why they are slow. You could run the query in query analyaser with Execution Plan turned on.

    You should be careful changing indexes in particulare clustered as it is very easy to decrease db performance my choosing the wrong column/value.

    What are the fill factors like on the indexes ?

    How comfortable are you with SQL in generally ? Was the person who left a DBA ? I was just thinking 45GB seems very large to me what type of industry is the db for and how often is it updated ?

    I'm be interested in how you get on. If you don't want to post here pm me


  • Moderators, Recreation & Hobbies Moderators Posts: 10,912 Mod ✭✭✭✭Ponster


    amen wrote:
    How are you determing that the indexs are causing your space issues ?

    Microsoft told me :)

    One of their people came around last week and spent a day working on it to tell me that I had a defrag problem and too many indexes.

    There's a db backup every night. Simple mode. No logs.

    I would be interested to know the number of tables in your system and the volume of data ( daily data growth)

    Hmm, there are about 40 tables in all. Out of the 32Gb, about 2Gb is data that never changes and the other 30Gb is made up of stats with about 1Gb per day being added to the database. We keep about 32 days worth of info so it's normally around the 35Gb mark.



    I know hardly anything about SQL but that's more than the person that I'm replacing ! The database is all the telephone transactions between customers
    of SG, France's largest bank (I think). With about 250,000 calls per day going through the system it quickly fills up the tables :)


    Thanks for the help. I have a lot of reading to do before I start playing around but at least now I know the chapters that I should be looking at !


  • Closed Accounts Posts: 756 ✭✭✭Zaph0d


    You don't say why you want to make the database smaller. Is it causing performance problems? Are you running out of disk space? Is it just for the sake of neatness? Are the backups taking too long?
    I am going to assume it is performance.

    Have you taken a days worth of trace logs for analysis?

    Once you do this you can run the index tuning wizard against them in query analyser.

    Have you used a tool to reverse engineer a database map from the db? Maybe there is a schema around already. You should be aware of the top 10 tables, their keys and foreign keys. Other indexes are usually of minor importance.

    Do you know the slowest/most common queries on the system? As amen said, you should use profiler to discover these and then experiment with how to increase their performance, this may involve adding indexes and also rewriting sql for queries to work more efficiently. You should also speak to end users and discover their perception of the slowest running functions in the system and then see what can be done on the database to improve things.

    if you drop indexes you should run a search of syscomments to check that the index isn't explicitly referenced by a hint as mentioned by bonkey.
    SELECT so.name 
    FROM syscomments sc INNER JOIN sysobjects so ON so.id=sc.id
    WHERE sc.text like '%myIndex%'
    
    This SQL returns a list of database objects that reference myIndex.


  • Registered Users Posts: 2,781 ✭✭✭amen


    so your company is processing a lot of data for what is presumably a large customer but they are willing just give the database to someone to look after with no thought of training or support for that person. When you say MS staff did he work for MS or was he just a guy with the MS exams ? Did he have real world experience ? Did he have his own custom scripts for determining DB performance or did he just use a few wizards ?
    The rant is not aimed at you.
    <RANT>
    This is why SQL Server gets a bad name. Companies through staff at it and think ahh sure they are a great programmer they will figure SQL out in no time. Can't be that hard. SQL should be treated just like any other programming language taking the time to learn and understand it. I have seen large projects designed with weeks taken up with the presentation and business layers but scant regard to the database. Then when the app runs slow or is performing badly due to database access everyone wonders why.
    As for the standard of sql programming. It is very easy to write sql that works and the programmer thinks he is doing a great job. But often time the sql is inefficent and written in a procedural format instead of looking at the advantages of writing set based data. Pet hates include the use of Temp Tables, Cursors in SPS which are very prevelant. The other thing is searching for data on non indexed columns.
    </RANT>

    That said if you need any help suggestions continue to post away


  • Advertisement
  • Registered Users Posts: 7,468 ✭✭✭Evil Phil


    You don't like the use of cursors? How come?


  • Closed Accounts Posts: 7,230 ✭✭✭scojones


    amen wrote:
    so your company is processing a lot of data for what is presumably a large customer but they are willing just give the database to someone to look after with no thought of training or support for that person. When you say MS staff did he work for MS or was he just a guy with the MS exams ? Did he have real world experience ? Did he have his own custom scripts for determining DB performance or did he just use a few wizards ?
    The rant is not aimed at you.
    <RANT>
    This is why SQL Server gets a bad name. Companies through staff at it and think ahh sure they are a great programmer they will figure SQL out in no time. Can't be that hard. SQL should be treated just like any other programming language taking the time to learn and understand it. I have seen large projects designed with weeks taken up with the presentation and business layers but scant regard to the database. Then when the app runs slow or is performing badly due to database access everyone wonders why.
    As for the standard of sql programming. It is very easy to write sql that works and the programmer thinks he is doing a great job. But often time the sql is inefficent and written in a procedural format instead of looking at the advantages of writing set based data. Pet hates include the use of Temp Tables, Cursors in SPS which are very prevelant. The other thing is searching for data on non indexed columns.
    </RANT>

    That said if you need any help suggestions continue to post away


    Oh buddy, I feel the exact same as you do. I've had to deal with an un-indexed, un related database for the last 3 weeks with over 70 databases containing roughly 40 tables each. Everyone was wondering why queries were taking an entire weekend on a p4 machine. :mad:

    It's not just badly written SQL, it's giving the job of db designer to someone who is not right for the job.


  • Registered Users Posts: 15,443 ✭✭✭✭bonkey


    Evil Phil wrote:
    You don't like the use of cursors? How come?

    At a guess, what Amen dislikes (and I'd join him on this) is that so many people use cursors where there's absolutely no need. Most often, its caused because people want to code procedurally, and process each record in a classical loop.

    One typical example is the use of an if-statement inside a loop where CASE statements on a single block-operation would have done just as well.

    This approach (using cursors where you can avoid them) is almost always less efficient then using "direct" SQL (in MSSQL databases at least), and typically consumes more resources.

    Don't get me wrong. There are cases where cursors are the best or even the only solution, and there's nothing wrong with them when thats the case. However, they are all-too-often a significant factor in poorly-performing databases.

    The database-design issue that Amen went on about is - in my experience - by no means unique to MSSQL, but in my experience, the number of companies who have a programmer looking after their MSSQL databases instead of someone who knows MSSQL administration is one or more orders of magnitude greater than one would find for DB2, Sybase, or Oracle.

    Ponster...let me reiterate - this is not meant to be insulting or denigrating to you in any way. You are trying to learn how to do your new job right, which I applaud.

    You've also reminded me that I really should get back to writing that SQL manual...and why.

    jc


  • Closed Accounts Posts: 756 ✭✭✭Zaph0d


    2 other problems with cursors:

    less readable/maintainable than SQL: you need to declare, open, loop through each record, test for completion, close and deallocate as opposed to a single INSERTor UPDATE

    not a transaction: when a cursor driven sp fails midway through its loop, you're left with a half completed operation as opposed to an UPDATE, INSERT or DELETE that are all implicit transactions (you can get around this by wrapping your cursor code in an explicit transaction)


  • Registered Users Posts: 2,781 ✭✭✭amen


    bonkey thats exactly what I meant to say. Ponster this is not personal. Keep posting and we will help if we can.


  • Advertisement
  • Registered Users Posts: 7,468 ✭✭✭Evil Phil


    The reason I ask is I've just inherited a db as well. One of the batch jobs has a sproc which uses a cursor. People not liking cursors sent up a little flag over this particular procedure - I'll go and have a look.


  • Closed Accounts Posts: 756 ✭✭✭Zaph0d


    Evil Phil, post up the cursor code if you want any help converting it to SQL statements

    One reason people use cursors is to repeatedly call other sprocs. This can be avoided by using a WHILE loop. A WHILE loop is probably implemented internally in the same way as a cursor but at least its easier to read.


  • Registered Users Posts: 7,468 ✭✭✭Evil Phil


    Well if it ain't broken don't fix it. It's live code so it can stay as is until somebody complains - I might do the analysis out of interest though (if I get the time).


  • Moderators, Recreation & Hobbies Moderators Posts: 10,912 Mod ✭✭✭✭Ponster


    >You don't say why you want to make the database smaller. Is it causing performance problems? Are you running out of disk space? Is it just for the sake of neatness? Are the backups taking too long?
    I am going to assume it is performance.



    The database is on a RAID-5 made up of 3 60Gb disks and at the moment takes up about 35Gb of space but at the moment we're at 75% capacity (25% of the banks aren't hooked up yet). So I'll either have to reconfigure the disks replacing them with 140Gb models or figure out why the database is so big.

    > Have you taken a days worth of trace logs for analysis?


    I used the SQL profiler (I think that's what it's called in English) for 24 hours, ran every procedure I could find and then looked to see what indexes were being used/called. It seems that I have one table that's about 20Gb in size and 18Gb of that is made up of 6 indexes. Out of those 6, I can only see 1 that's actually ever used. I have no idea why the others exist.

    It is possible that as newer versions of the solution were delivered (we're on version 3) that new procedures were written and new indexes created without someone checking to see if there wasn't one already made that would have served the purpose.


    > When you say MS staff did he work for MS or was he just a guy with the MS exams?

    It was MS-France. they sent out 2 of their people who spent the day working on it. They gave us lots of help but they can't actually do the work themslves due to the contracts that we have with the company that actually furnished the database.


  • Registered Users Posts: 15,443 ✭✭✭✭bonkey


    Ponster wrote:
    Out of those 6, I can only see 1 that's actually ever used. I have no idea why the others exist.

    There are two main purposes for indexes, and several less obvious ones. The main purposes are :

    - Query Performance
    - Data Integrity (wrt. uniqueness)

    Primary Keys and Unique Key Constraints are enforced by creating UNIQUE INDEXes on the relevant fields. Even if these indexes are never used in query-optimisation, they are still vital.
    It is possible that as newer versions of the solution were delivered (we're on version 3) that new procedures were written and new indexes created without someone checking to see if there wasn't one already made that would have served the purpose.
    Its also possible that you had developers who simply said "Yeah...thats an index which is likely to be used" without checking that the optimiser can and will actually use it.

    If you have Profiler output, read up on the Index Tuning Wizard, which will let you take a saved Profiler session and analyses your system to check for indexes to drop/add based on what was captured in Profiler. For teh amount of data and so on that you have, odds are this will be slow to run, so ideally run it on a backup/spare machine....ideally you need one with the same number of processors and a similar/identical SQL configuration.

    jc


  • Moderators, Recreation & Hobbies Moderators Posts: 10,912 Mod ✭✭✭✭Ponster


    0k. Two Microsoft MCP exams in SQL 2000 and several weeks later I'd just like to say thanks for all the help you people offered.
    The database is down to a nice 17Gb, everything that should be indexed is and 15,000 users aer happy. :)


  • Registered Users Posts: 4,003 ✭✭✭rsynnott


    sjones wrote:
    Oh buddy, I feel the exact same as you do. I've had to deal with an un-indexed, un related database for the last 3 weeks with over 70 databases containing roughly 40 tables each. Everyone was wondering why queries were taking an entire weekend on a p4 machine. :mad:

    It's not just badly written SQL, it's giving the job of db designer to someone who is not right for the job.

    An un-indexed database? AND UN-INDEXED DATABASE??! Fire them immediately.


  • Closed Accounts Posts: 7,230 ✭✭✭scojones


    Yeah they were sacked a few weeks ago.


Advertisement