Hard limit on the number of tables in a BQ project

I have some heavily sharded data that I would like to store in BigQuery, where each shard will get its own table. My question is whether BQ will support the number of tables I will need.

With my dataset, I would create about 2000 new tables daily. All tables will have an expiration of 390 days (13 months), so this project will settle at a roughly constant ~2000 tables * 390 days = ~780,000 tables.

I would test this myself, but BQ only supports a maximum of 10,000 load jobs per project per day.

Does anyone have experience with table counts this high? Is there any official limit published by Google?

+3




2 answers


There are projects today with that many distinct tables. There is currently no hard limit on the number of distinct tables.

Some related considerations that come to mind when you think about queries and views that use this set of tables:



  • A query (including any views it references) can currently reference at most 1000 tables.

  • Datasets with many tables can behave problematically when you use table wildcard functions.

  • You may be over-sharding. Instead of a large number of individual tables, you can simply use a wider schema and fewer tables.

  • If you depend heavily on time as a sharding criterion, you might also look at table decorators as a way to limit the scope of a data scan.

  • You can also collapse data over time into fewer, larger tables as it ages and is accessed less frequently. For example, copy jobs can append multiple source tables to one destination table.
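
The last point can be sketched roughly as follows. This is only an illustration, not a tested production script: it assumes the daily tables follow a hypothetical `<prefix>_<YYYYMMDD>` naming scheme, and the helper names `plan_monthly_rollups` and `append_tables` are made up for this example. The copy step uses the `google-cloud-bigquery` Python client and needs credentials, so it is kept separate from the pure planning step.

```python
from collections import defaultdict


def plan_monthly_rollups(table_ids):
    """Group daily tables named <prefix>_<YYYYMMDD> under <prefix>_<YYYYMM>.

    The naming scheme is an assumption made for this sketch; tables that
    do not match it are ignored.
    """
    plan = defaultdict(list)
    for table_id in sorted(table_ids):
        prefix, _, day = table_id.rpartition("_")
        if not prefix or len(day) != 8 or not day.isdigit():
            continue  # not a daily shard under the assumed scheme
        plan[f"{prefix}_{day[:6]}"].append(table_id)
    return dict(plan)


def append_tables(client, dataset, sources, destination):
    """Append several source tables to one destination with a copy job.

    `client` is a google.cloud.bigquery.Client and `dataset` is a
    "project.dataset" string. Requires the google-cloud-bigquery package.
    """
    from google.cloud import bigquery  # imported lazily; only needed here

    job_config = bigquery.CopyJobConfig(
        # Append to the destination instead of overwriting it.
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    job = client.copy_table(
        [f"{dataset}.{name}" for name in sources],
        f"{dataset}.{destination}",
        job_config=job_config,
    )
    job.result()  # block until the copy job finishes
```

Each entry of the plan maps one monthly destination to the daily tables that would be appended to it; the source tables can then be left to expire on their own.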

+6




Most limits in BigQuery can be raised if you are using BigQuery properly; the limits are there to prevent abuse and misuse.

The critical question here is how much data each table will hold. Having 780,000 tables with 10 rows each is not a good idea.

How many tables do you want to read in each query? There's a hard limit of 1000 tables per query.
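
To get a feel for what that limit means at this scale, here is a small sketch that splits a list of table names into batches of at most 1000, one batch per query. The function name `batch_tables` and the constant are hypothetical; the value 1000 is just the per-query limit mentioned above and may be different for your project.

```python
MAX_TABLES_PER_QUERY = 1000  # per-query limit quoted above; may change


def batch_tables(table_ids, limit=MAX_TABLES_PER_QUERY):
    """Split table names into consecutive batches of at most `limit`."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    ids = list(table_ids)
    return [ids[i:i + limit] for i in range(0, len(ids), limit)]
```

With ~780,000 tables, touching every table once would take 780 separate queries, which is another reason to consider fewer, larger tables.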



If you have an interesting use case that requires higher limits, getting a support contract and asking for their advice is the best way to have the default limits raised.

https://cloud.google.com/support/

+2

