A Simple Query — Or So We Thought
```sql
EXPORT DATA
OPTIONS (
  uri = 'gs://xxxxx/*.json',
  format = 'JSON',
  overwrite = true
) AS (
  SELECT *
  FROM `bigquery-public-data.crypto_solana_xxxxx.Instructions`
  LIMIT 1000000
);
```
This query exports 1,000,000 rows from the Instructions table in the crypto_solana dataset (hosted in BigQuery’s public datasets) to a Google Cloud Storage bucket in JSON format.
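A dry run would have surfaced the damage before any money was spent: BigQuery can report the bytes a query would bill without actually executing it. A minimal sketch, assuming the google-cloud-bigquery Python client and configured credentials (`dry_run_bytes` and `bytes_to_tib` are hypothetical helper names, not anything from the original):

```python
def dry_run_bytes(sql: str) -> int:
    """Return the bytes BigQuery would bill for `sql`, without running it."""
    # Imported inside the function so the sketch stays loadable
    # even where google-cloud-bigquery is not installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed


def bytes_to_tib(num_bytes: int) -> float:
    """Convert bytes to TiB, the unit BigQuery on-demand pricing is quoted in."""
    return num_bytes / 2**40


# Not executed here -- requires credentials. The dataset suffix is elided
# as in the original query.
# tib = bytes_to_tib(dry_run_bytes(
#     "SELECT * FROM `bigquery-public-data.crypto_solana_xxxxx.Instructions` LIMIT 1000000"))
```

A dry run costs nothing, so it is cheap insurance before any query that touches an unfamiliar table.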
Three queries. 1,576.56 TB of data “scanned.” The invoice: $9,847.24 for those three queries.
The cost breakdown was even crazier:
- Total “scanned” data: 1,576.56 TB across three queries
- Each query, despite using LIMIT, was billed for 509.89 TB of scanned data
- Each query ran in 22 seconds — which implies a 23 TB per second scan rate
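Those numbers are easy to sanity-check with a few lines of arithmetic (the variable names are mine):

```python
tb_billed_total = 1576.56   # TB "scanned" across the three queries
usd_total = 9847.24         # total charge on the invoice
tb_per_query = 509.89       # TB billed per query
seconds_per_query = 22      # wall-clock runtime of each query

usd_per_tb = usd_total / tb_billed_total
tb_per_second = tb_per_query / seconds_per_query

print(f"${usd_per_tb:.2f} per TB")  # close to BigQuery's $6.25/TiB on-demand list price
print(f"{tb_per_second:.1f} TB/s")  # the implausible "scan rate"
```

The per-TB rate lines up with on-demand pricing, and the 23 TB/s “scan rate” is the tell: nothing close to that much data was physically read — the bill reflects bytes referenced, not bytes scanned.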
Here’s the gotcha in its simplest form:

```sql
SELECT * FROM huge_table LIMIT 100;
```
- Even if only 100 rows are returned, you’re charged as if you scanned the entire table — if the table is 1 PB, you’re billed for 1 PB of data scanned.
- Charges are based on the total data your query references, not the data actually returned: LIMIT does not reduce the amount billed.
- Filtering doesn’t reliably help either — partition pruning is unpredictable, and queries may still scan and bill for the full table.
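If the goal is a bounded sample rather than a full export, BigQuery’s TABLESAMPLE clause — unlike LIMIT — does reduce the bytes billed, because it reads only a fraction of the table’s blocks. A hedged sketch (the dataset suffix is elided as in the original, the 1 percent is arbitrary, and `max_cost_usd` is a hypothetical helper using the $6.25/TiB US on-demand list price):

```python
# Two queries that return similar row counts but bill very differently.

FULL_SCAN = """
SELECT * FROM `bigquery-public-data.crypto_solana_xxxxx.Instructions`
LIMIT 1000000
"""  # billed for the whole table: LIMIT does not prune bytes

SAMPLED = """
SELECT * FROM `bigquery-public-data.crypto_solana_xxxxx.Instructions`
TABLESAMPLE SYSTEM (1 PERCENT)
LIMIT 1000000
"""  # billed only for the ~1% of blocks actually read


def max_cost_usd(tib_billed: float, usd_per_tib: float = 6.25) -> float:
    """Worst-case on-demand cost for a query at the US list price."""
    return tib_billed * usd_per_tib


# At ~510 TiB per full-table query (the article's number), the gap is stark:
print(max_cost_usd(509.89))         # 3186.8125 -- full scan
print(max_cost_usd(509.89 * 0.01))  # 31.868125 -- ~1% sample
```

Note that SYSTEM sampling is block-based, so the sample is not uniformly random — fine for eyeballing data, less so for statistics.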