This guide will walk you through the basic functionality and concepts of bit.io and will point you toward more detailed documentation on specific topics. After reading this guide, you will have a good understanding of the core features of bit.io, knowledge of what bit.io is and what differentiates it from other database tools, and ideas for where to find more detailed and specific guidance for particular use cases.
In short, bit.io is the fastest and easiest way to set up a PostgreSQL database. You can load data to a PostgreSQL database by dragging and dropping files, entering a data file's URL, using the bit.io command line interface tool, sending data from R or Python applications or analyses, or any other Postgres or HTTP client. You can then work with the data via the in-browser SQL editor or any of your favorite data analysis tools: SQL clients, R, Python, Jupyter notebooks, the command line, and more. You can even visualize your data right in the browser.
Once you've signed up and uploaded your data to a bit.io repository, you have a secure, private database schema where you can upload more data, query and join tables, create views, and add documentation. Want to share your data? You can make the repository public or share with any other user.
A bit.io repository is equivalent to a PostgreSQL schema. A schema can contain multiple tables. A single bit.io user can have multiple repositories (schemas), each containing multiple tables.
bit.io offers a full-featured PostgreSQL database that can be used in seconds with practically no configuration required and integrates with an ever-growing number of popular data tools. Any tool that works with Postgres will work with bit.io.
The following sections of this guide will walk you through the core functionality of bit.io.
While you don't need an account to upload and query your first data file, having an account unlocks additional features, such as higher storage and query limits; data that persists longer than 72 hours; and the ability to access data via the API or some of bit.io's many integrations.
To create an account, select "sign up" on the upper right-hand side of the bit.io home page. Signing up is free!
If you have a CSV file on your computer, you can use it. Otherwise, we recommend trying it out with the U.S. Federal Housing Finance Agency (FHFA) state-level House Price Index data. The URL is: https://www.fhfa.gov/DataTools/Downloads/Documents/HPI/HPI_AT_state.csv. To upload, simply click "create repo," name the repository, and paste the URL into the "Paste a URL" field. If you already have a CSV, JSON, or Excel file on your computer that you'd like to turn into a database table, you can drag and drop the file anywhere on bit.io to upload the data to an existing repository or to a new repository.
You can launch a file picker by clicking "choose file" at the top of the page.
When uploading data, you can select whether the presence of a table header should be auto-detected, the first row should be used as a table header, or there is no table header. In this case, auto-detection correctly concluded that there was no table header and assigned generic column names.
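How bit.io detects headers is not described in this guide. As a rough illustration of what header auto-detection involves, the sketch below uses the heuristic in Python's standard `csv.Sniffer`, which compares the first row against the types and lengths of the rows that follow. The sample rows are made-up placeholders, not real FHFA values, and this is not a description of bit.io's own detector.

```python
import csv

# Made-up rows shaped like a headerless state/year/quarter/index file.
NO_HEADER_SAMPLE = (
    "AK,1975,1,62.03\n"
    "AK,1975,2,62.50\n"
    "AL,1975,1,60.10\n"
    "AL,1975,2,61.20\n"
)

# The same rows with a hypothetical header line prepended.
WITH_HEADER_SAMPLE = "state,year,quarter,index\n" + NO_HEADER_SAMPLE


def looks_like_header(sample: str) -> bool:
    """Return csv.Sniffer's guess about whether the first row is a header."""
    return csv.Sniffer().has_header(sample)


no_header_guess = looks_like_header(NO_HEADER_SAMPLE)
with_header_guess = looks_like_header(WITH_HEADER_SAMPLE)
print("headerless sample detected as having a header:", no_header_guess)
print("headed sample detected as having a header:", with_header_guess)
```

When a heuristic like this guesses wrong, the explicit upload options (use the first row as a header, or no header) are the way to override it.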
The upload creates a new repository called my-new-repo with a single table named after the CSV: HPI_AT_state. The repository is private by default, meaning other people won't be able to access the data you've uploaded. You can change the repository to public by clicking the lock icon on the upper left and then clicking "public."
With that done, we can start querying the data right away.
Let's start simple and use the in-browser SQL query editor to figure out how many years of data are included in this dataset. We'll use the following SQL query:
SELECT count(DISTINCT column_1) FROM "<username>/my-new-repo"."HPI_AT_state.csv";
Change <username> to your username to try this yourself. Here's what that looks like in the bit.io editor:
Fully-Qualified Repo Names
As noted in the comments when you access the query editor, table references must be fully qualified. The full syntax is: "owner/repo"."table". Note that the double quotes around "owner/repo" and around "table" are required whenever the names contain special characters or a mix of upper- and lowercase letters.
Your query will not execute successfully if you do not use this fully-qualified table reference form.
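If you build queries programmatically, a small helper can keep the quoting consistent. The following is a minimal sketch that assumes standard PostgreSQL identifier quoting (wrap in double quotes, double any embedded double quote). The function names and the `my-username` owner are hypothetical placeholders.

```python
def quote_ident(name: str) -> str:
    """Wrap a PostgreSQL identifier in double quotes, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def qualified_table(owner: str, repo: str, table: str) -> str:
    """Build the "owner/repo"."table" reference form described above."""
    return quote_ident(owner + "/" + repo) + "." + quote_ident(table)


# "my-username" is a placeholder; substitute your own bit.io username.
ref = qualified_table("my-username", "my-new-repo", "HPI_AT_state.csv")
query = f"SELECT count(DISTINCT column_1) FROM {ref};"
print(query)
```

Quoting every identifier, even ones that would not strictly need it, avoids having to decide case by case whether a name contains special characters or mixed case.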
Just about anything you can do with PostgreSQL is possible in bit.io, including more sophisticated queries. You're certainly not limited to counting rows!
It sure would be nice if our table had column names. Let's add them. We'll use the following query:
ALTER TABLE "<username>/my-new-repo"."HPI_AT_state.csv" RENAME COLUMN column_0 TO "state";
ALTER TABLE "<username>/my-new-repo"."HPI_AT_state.csv" RENAME COLUMN column_1 TO "year";
ALTER TABLE "<username>/my-new-repo"."HPI_AT_state.csv" RENAME COLUMN column_2 TO "quarter";
Note the double quotes around the new column names. Again, to run this yourself, replace <username> with your bit.io username. Here's what our table looks like now:
You might notice that we haven't renamed the last column. That's because you can rename columns without using SQL at all if you want. Click the little gear icon at the top right of the table (not the one at the top right of the screen, which includes repository options; we're interested in table options for now). From there, you can rename columns, rename the table, or delete the table.
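If you prefer to stay in SQL and have many columns to rename, the statements can be generated from a mapping of old to new names. This is a minimal sketch; `rename_statements` is a hypothetical helper, not part of any bit.io tool, and it only builds the SQL text for you to paste into the editor or run from a client.

```python
from typing import Dict, List


def quote_ident(name: str) -> str:
    """Wrap a PostgreSQL identifier in double quotes, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def rename_statements(qualified_table: str, renames: Dict[str, str]) -> List[str]:
    """Return one ALTER TABLE ... RENAME COLUMN statement per mapping entry."""
    return [
        f"ALTER TABLE {qualified_table} "
        f"RENAME COLUMN {quote_ident(old)} TO {quote_ident(new)};"
        for old, new in renames.items()
    ]


# Replace <username> with your bit.io username before running the output.
table = '"<username>/my-new-repo"."HPI_AT_state.csv"'
statements = rename_statements(
    table,
    {"column_0": "state", "column_1": "year", "column_2": "quarter"},
)
print("\n".join(statements))
```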
Deletions are Permanent
Be careful! Deleting a repo, table, or column is permanent. Make sure you really want to delete it and that you have a backup if necessary.
We can use bit.io's REST API to upload another table of data to our repository. One way to do this is:
curl -i -X POST \
  --header "Authorization: Bearer $TOKEN" \
  --header "Content-Disposition: attachment;filename='name'" \
  --data-binary @"$PATH_TO_FILE" \
  https://import.bit.io/$USERNAME/$REPO_NAME/$TABLE_NAME/
$TOKEN: an authentication token, which you can get by clicking "Connect" after you've logged in.
$PATH_TO_FILE: the path to the file you want to upload on your local filesystem.
$USERNAME: the username of the owner of the repo.
$REPO_NAME: the name of the repo.
$TABLE_NAME: the name of the table to upload the data to. If this table does not exist, it will be created.
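The same request can be assembled from Python using only the standard library. The sketch below mirrors the curl command above (same URL pattern and headers) and builds the request without sending it. `build_import_request` and all argument values are hypothetical placeholders, and the response format is not covered here.

```python
import urllib.request


def build_import_request(token: str, username: str, repo_name: str,
                         table_name: str, filename: str,
                         body: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request matching the curl example."""
    url = f"https://import.bit.io/{username}/{repo_name}/{table_name}/"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Disposition": f"attachment;filename='{filename}'",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values; in practice, read the bytes from your file with
# open(path, "rb").read() and use the token from the "Connect" menu.
req = build_import_request(
    token="MY_TOKEN",
    username="my-username",
    repo_name="my-new-repo",
    table_name="my_table",
    filename="data.csv",
    body=b"placeholder,bytes\n",
)
print(req.get_method(), req.full_url)

# To actually send the request:
# with urllib.request.urlopen(req) as response:
#     print(response.status)
```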
We'll upload another table from the FHFA: https://www.fhfa.gov/DataTools/Downloads/Documents/HPI/HPI_AT_us_and_census.csv. After downloading the file to our machine, we can push it to the repo with the API as follows:
curl -i -X POST \
  --header "Authorization: Bearer $TOKEN" \
  --header "Content-Disposition: attachment;filename='hpi2.csv'" \
  --data-binary @"/Users/bitdotio/downloads/hpi2.csv" \
  https://import.bit.io/bitdotio/my-new-repo/hpi_census_div
See here for more details on uploading data with the API.
API Keys and PostgreSQL Connection Details
On any repository page, when you're signed in, you can click on the "connect" button on the upper left to access API keys and PostgreSQL connection details.
You can click the "Download Repo" button to download one or more tables from your repository as CSV, JSON, or Excel files. You can find much more information on downloading tables—and query results—here.
bit.io lets you document your repositories and tables with an in-browser markdown editor. You can document specific columns, provide citations and license information, and embed images: anything that will help you keep track of what your data means or will provide value to others accessing your data (if public).
The neiss repository provides a good example of the capabilities of bit.io's documentation editor:
Thus far, we've only accessed our data through the browser interface. But we can do so much more. Do you have a preferred SQL client? You can connect it to bit.io in the same way as you'd connect to any other PostgreSQL database. The necessary credentials are under the "POSTGRES" section of the "Connect" menu referenced above. Here are some details on connecting via SQL clients: Connecting via SQL Clients.
For example, here's how the bit.io credentials map onto the PostgreSQL connection in PopSQL.
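Many Postgres clients also accept a single connection URI instead of separate fields. The following is a minimal sketch of assembling one from the values shown in the "Connect" menu, assuming the standard `postgresql://user:password@host:port/database` form. Every value below is a placeholder, not a real bit.io setting.

```python
from urllib.parse import quote


def postgres_uri(user: str, password: str, host: str, port: int,
                 database: str) -> str:
    """Build a postgresql:// URI, percent-encoding user, password, and database."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


# Placeholder credentials; copy the real ones from the "Connect" menu.
uri = postgres_uri("me", "p@ss/word", "db.example.com", 5432, "mydb")
print(uri)
```

Percent-encoding matters because characters such as `@` or `/` in a password would otherwise be read as part of the URI structure.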
After reading this guide, you can upload data to bit.io via the browser or the API; query data using the in-browser query editor or your favorite SQL client; document your data; and modify your tables with the table settings. While this is enough to get started, there's much more you can do with bit.io. You can use a bit.io repository in your data analyses with R or Python, upload and access your data with the command line tool, and set up automated pipelines to keep your data up to date.
Our documentation also includes a wealth of example projects which serve as great jumping-off points for your own projects. Two great examples are:
Have you run into an issue not covered by these docs? Have an error message you can't decode? Want to give us some feedback? Help is always one click away!
This has been a whirlwind tour of the core features of bit.io. There's so much more to explore. Browse through the docs for topics relevant to your use case, or check out some of these articles on popular integrations with bit.io.
Connecting via R with RPostgres
The Python SDK
Using bit, the command line tool
Connecting via PowerBI