What is the optimal amount of data for a table?
I would suggest that you consider optimizing your database design:
- Consider what you want to accomplish with the database. Will you be doing many inserts into one table at very high rates? Or will you run reports and analytics on data?
- Once you have identified the purpose of the database, determine what data you need to store to support those functions.
- Normalize until it hurts. If you are doing transaction processing (the most common use of a database), you will want a highly normalized structure. If you are doing analytical work, you will want a more denormalized structure that does not have to rely on joins to generate report results.
- Typically, once you have normalized the structure until it hurts, you will need to step back one or two levels of normalization to end up with a structure that is both well normalized and practical to work with.
- A normalized database is largely meaningless if you don't use keys. Make sure each table has a primary key, and don't reach for a surrogate key just because that is what you always see: first consider what natural keys exist in the table. Once you are sure you have the correct primary key for each table, define the foreign key references. Establishing relationships explicitly with foreign keys, rather than relying on implicit ones, gives you performance improvements, ensures the integrity of your data, and self-documents the database structure.
- Look for other indexes your tables need. Do you have a column or set of columns that you will look up often, such as a username and password pair? Indexes can cover a single column or several, so think about how you will query the data and create indexes on the values you query by (see the sketch after this list).
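As a rough sketch of those last points (MySQL syntax assumed; the table and column names are hypothetical, not from the question), here is a pair of tables with a natural primary key, an explicit foreign key, and a composite index on columns queried together:

```sql
-- Hypothetical example: a natural primary key, an explicit foreign key,
-- and a composite index on columns that are queried together.
CREATE TABLE customers (
    customer_email VARCHAR(100) NOT NULL,  -- natural key instead of an automatic surrogate
    full_name      VARCHAR(100) NOT NULL,
    PRIMARY KEY (customer_email)
);

CREATE TABLE orders (
    order_number   INT          NOT NULL,
    customer_email VARCHAR(100) NOT NULL,
    ordered_at     DATETIME     NOT NULL,
    PRIMARY KEY (order_number),
    -- explicit relationship: enforces integrity and self-documents the design
    FOREIGN KEY (customer_email) REFERENCES customers (customer_email)
);

-- Composite index for a lookup you expect to run often
CREATE INDEX idx_orders_customer_date ON orders (customer_email, ordered_at);
```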
Agreed that you must make sure your data is indexed correctly.
Also, if you are worried about the size of the table, you can always implement a strategy further down the line for archiving older data (a sketch follows below).
Don't worry too much about it until you actually see problems occurring; don't optimize prematurely.
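For example, a minimal archiving sketch, assuming a hypothetical orders table with an ordered_at column and an identically structured orders_archive table:

```sql
-- Move rows older than one year into an archive table, then delete them
-- from the live table (run inside a transaction on an engine that supports it).
START TRANSACTION;

INSERT INTO orders_archive
SELECT * FROM orders
WHERE ordered_at < NOW() - INTERVAL 1 YEAR;

DELETE FROM orders
WHERE ordered_at < NOW() - INTERVAL 1 YEAR;

COMMIT;
```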
There is clearly no single answer to this question. An indexed table with 100,000 records is faster than a non-indexed table with 1,000.
What are your requirements? How much data do you have? Once you know the answers to those questions, you can make indexing and/or partitioning decisions.
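If the answers point towards partitioning, MySQL's range partitioning is one option; a sketch with a hypothetical events table:

```sql
-- Hypothetical example: partition a table by year so that queries restricted
-- to a date range only have to touch the relevant partitions.
CREATE TABLE events (
    event_id   INT          NOT NULL,
    created_at DATETIME     NOT NULL,
    payload    VARCHAR(255),
    -- the partitioning column must be part of every unique key, including the primary key
    PRIMARY KEY (event_id, created_at)
)
PARTITION BY RANGE (YEAR(created_at)) (
    PARTITION p2007 VALUES LESS THAN (2008),
    PARTITION p2008 VALUES LESS THAN (2009),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```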
This is a very difficult question, so a very loose answer :-)
All in all, if you get the basics right (sensible normalization, a sensible primary key, and run-of-the-mill queries), then on today's hardware you will get away with most things in a small to medium sized database, i.e. one whose largest table holds fewer than about 50,000 records.
However, once you get past 50k to 100k rows, which roughly corresponds to the point where the rdbms is likely to become memory constrained, then unless your access paths (i.e. indexes) are properly set up, performance will start to fall off catastrophically. That is catastrophic in the mathematical sense: in such scenarios it is not unusual to see an order of magnitude or two of performance degradation from a doubling of table size.
Obviously, the critical table size you should watch out for depends on row size, machine memory, activity, and other environmental concerns, so there is no single answer, but it is worth knowing that performance usually does not degrade gracefully with table size, and planning accordingly.
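A quick way to check whether your access paths are actually being used is EXPLAIN; a hypothetical example against an orders table with an index on customer_email:

```sql
-- Hypothetical check: confirm the optimizer uses an index rather than a full scan.
EXPLAIN
SELECT order_number, ordered_at
FROM   orders
WHERE  customer_email = 'someone@example.com';
-- "type: ref" with a key name means an index lookup;
-- "type: ALL" means a full table scan, which is what degrades badly as the table grows.
```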
I have to disagree with Cruachan about "50k - 100k rows ... roughly corresponds to the point where the rdbms is likely to be memory constrained". That blanket statement is simply misleading without two additional details: approximate row size and available memory. I am currently developing a database to find the longest common subsequence (a la bioinformatics) of lines in source files, and I was reaching millions of rows in a single table, even with a VARCHAR field close to 1000 characters, before it became memory constrained. So with proper indexing and enough RAM (a gig or two), with regard to the original question, with rows of no more than 75 bytes there is no reason why the proposed table could not hold tens of millions of records.
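One detail worth adding for wide VARCHAR columns like that: the index does not have to cover the whole value. A prefix index (MySQL syntax; the table and column names below are hypothetical) keeps the index small enough to stay in memory:

```sql
-- Hypothetical table of long strings, as in the answer above.
CREATE TABLE sequences (
    seq_id        INT           NOT NULL,
    sequence_text VARCHAR(1000) NOT NULL,
    PRIMARY KEY (seq_id)
);

-- Index only the first 64 characters: usually enough selectivity for lookups,
-- and far less memory than indexing the full column.
CREATE INDEX idx_seq_prefix ON sequences (sequence_text(64));
```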
The right amount of data is a function of your application, not of the database. There are very few cases where a MySQL problem is solved by splitting a table into multiple sub-tables, if that is the intent of your question.
If you have a specific situation where queries are slow, it might be more helpful to discuss how to improve this situation by changing the query or the table design.
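As a hypothetical example of a query-side fix (assuming an orders table with an index on ordered_at): wrapping an indexed column in a function stops the index from being used, and the same filter can usually be rewritten as a range.

```sql
-- Assumes an index on orders (ordered_at). The first form forces a full scan
-- because the indexed column is wrapped in a function; the second can use the index.
SELECT * FROM orders WHERE YEAR(ordered_at) = 2008;          -- slow

SELECT * FROM orders                                         -- same result, index-friendly
WHERE  ordered_at >= '2008-01-01'
  AND  ordered_at <  '2009-01-01';
```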