Rows Queried In-Depth

Overview

bit.io charges you for how much you use rather than how much you provision -- nobody wants to pay for compute and storage that the don't use! Instead, bit.io measures usage by the number of data rows that your query instructs bit.io to read read or scan -- we call this measurement "rows queried" because it's how many rows your query accesses.

📘

Rows Queried vs. Rows Returned

It is important to distinguish rows queried from rows returned. Rows queried is the number of rows that are read from bit.io's database over the course of a query execution, whereas rows returned is the number of rows you receive after executing a query. bit.io charges for rows queried.

Understanding rows queried

The number of rows accessed by a query is a way to capture how much work was required to execute the query, and by extension, how much any given user is using bit.io.

A quick example: Suppose a table my_table contains 1000 rows. Executing SELECT count(*) will only return 1 row — whose value is, the number of rows stored in my_table:

my_db=# SELECT count(*) FROM my_table;
 count
--------
 1000
(1 row)

However, in order to execute this query, PostgreSQL must scan through the entire table and count up the total number of rows it visits. Thus, the total number of rows queried in the course of executing the query is 1000.

Using EXPLAIN to estimate rows queried

We can see this more clearly by showing the query plan using the EXPLAIN statement:

my_db=# EXPLAIN SELECT count(*) FROM my_table;
                          QUERY PLAN
--------------------------------------------------------------
 Aggregate  (cost=17.50..17.51 rows=1 width=8)
   ->  Seq Scan on test  (cost=0.00..15.00 rows=1000 width=0)
(2 rows)

The output shows that in order to execute the query, PostgreSQL will do a sequential scan through the whole table, estimating 1000 rows, and then aggregate the result to output the number of rows in the table.

The output of EXPLAIN can be a good way to get an estimate of the total number of rows that will be accessed by a query execution, as long as your tables' statistics are up to date. bit.io keeps table statistics up to date automatically by making use of Autovacuum periodically and after major changes are made to a table.

A word of caution: EXPLAIN can be misleading in certain situations. The rows entry describes the number of rows that query plan node will return. In the case of seq scan, that will be the estimated rows in the table but that's not always the case. bit.io is working to extend postgres' built in EXPLAIN to provide better estimates.

See https://stackoverflow.com/questions/7943233/fast-way-to-discover-the-row-count-of-a-table-in-postgresql for other approaches to estimating the number of rows in a table without querying it.

Making indexes can help reduce rows queried

Executing lots of full table scans can be a quick way to incur high usage because all rows will be accessed. An easy way to reduce the number of rows queried for a given query execution is to employ a good indexing strategy. Database indexes are designed to provide a shortcut to the rows of interest.

As an example, suppose our table, my_table has two columns: name (text) and age (integer). If we run the following query without modifying the table:

SELECT * FROM my_table WHERE age > 20 AND age < 50;

The query executor will simply scan through all the rows of the table in order to find matching rows and return them, resulting in 1000 rows queried just like before. However, if we create an index:

CREATE INDEX ages_idx ON my_table(age);

The number of rows accessed while executing the above query will be equal to the number of rows returned, since the query executor can make use of the index to filter out non-matching rows without needing to read them from the table.

This is a very simplified explanation of how indexes can help reduce the overall usage of your queries. To gain a deeper understanding, read the PostgreSQL documentation on indexes. More generally, the more optimized your queries are, the less usage cost they will incur.

Checking your usage

You can always check the total number of rows queried in your account by opening the user dropdown menu in the top-right corner of the page. The number of rows queried in the current billing cycle should appear at the top of the menu:

273273

Note that this number may lag by a few minutes from actual usage

Setting Limits

As a user of bit.io Pro, you have the option of limiting the number of rows queried by your account within a given billing cycle. This can help control your usage if you're worried about it growing out of control. It is important to note that once you hit this limit, your queries will begin failing. To set a limit, open the user dropdown menu in the top-right corner of the page and click "set limit":

283283

The menu should change into a dialog that allows you to set the maximum number of rows queried per month for your account:

283283

Once you click "save", you will be prompted with a confirmation dialog. After confirming, the limit will be set.