Use jupyter notebooks for dbt development

dbt (data build tool) is revolutionizing the way we do the process of extracting value from our data.

This package allows Jupyter notebooks to be used for developing dbt models and analyses in complement with other dbt command line tools and VS Code extensions.

Install

pip install git+https://github.com/butchland/nbdbt.git

How to use

The %%dbt cell magic allows you to create models and analyses in your dbt project.

To use the %%dbt cellmagic in your notebook, you have to load the dbt cellmagic module first via %load_ext or %reload_ext line magics

%reload_ext nbdbt.dbt_cellmagic

The %dbtconfig line magic configures a default project (and optionally the dbt profiles directory with -d flag as well as the notebook path with the -n flag).

%dbtconfig -p ../my_dbt_project -n notebooks/index.ipynb

The next cell uses the %%dbt cell magic which will create a new model my_third_model and compile it as well.

%%dbt -a my_fourth_model models/my_fourth_model.sql
select *
from {{ ref('my_second_dbt_model') }}

We then assigned the result of the compilation to the my_third_model variable, which is a Dbt (cell) magic object

my_fourth_model
<nbdbt.dbt_cellmagic.DbtMagicObject at 0x7f869b4b7890>

The ref method on DbtMagicObject allows us to run the query and save the results into a dataframe.

results = my_fourth_model.ref()
results  # dataframe
id
0 1

The dbt magic object also has access to other useful properties (like the compiled sql used to create the results)

print(my_fourth_model._compiled_sql)
-- AUTOGENERATED! DO NOT EDIT! File to edit: notebooks/index.ipynb (unless otherwise specified).
select *
from `sample-dbt-learn-project`.`jaffle_shop`.`my_second_dbt_model`

We can then run the usual dbt commands to generate the model

%cd ../my_dbt_project
! dbt run --select my_fourth_model
%cd ../nbs
/home/butch2/play/experiments/nbdbt/nbs
10:20:23  Running with dbt=1.1.1
10:20:23  Found 3 models, 4 tests, 0 snapshots, 3 analyses, 191 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
10:20:23  
10:20:25  Concurrency: 1 threads (target='dev')
10:20:25  
10:20:25  1 of 1 START view model jaffle_shop.my_fourth_model ............................ [RUN]
10:20:26  1 of 1 OK created view model jaffle_shop.my_fourth_model ....................... [OK in 1.28s]
10:20:26  
10:20:26  Finished running 1 view model in 2.97s.
10:20:26  
10:20:26  Completed successfully
10:20:26  
10:20:26  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
/home/butch2/play/experiments/nbdbt/nbs
import nbdbt.dbt_cellmagic as nbc

nbc.clear_cache()  # clears nbdtcache