You know how some days you just bang your head against the wall for hours over something that turns out to have an embarrassingly simple fix? Yeah, that was me. I spent a whole day wrestling with a `databricks.yml` file. The mission, should I choose to accept it (and I did, albeit unknowingly at the start of the day), was to get our Databricks Asset Bundle – which was all nicely configured to build, run, and test our DBT models – to play nice with a GitHub Actions pipeline.
We’d already done our homework, or so I thought. The bundle was validated, we did a test deployment using the local Databricks CLI, and everything looked golden. The goal was CI/CD automation: push code, and let GitHub Actions handle the deployment to Databricks. Standard MLOps, data engineering goodness. You can find more about Databricks Asset Bundles here – they’re Databricks’ way of packaging up all your project files. And DBT (Data Build Tool), of course, is awesome for transforming data in your warehouse.
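For reference, the local checks that had us feeling so confident were roughly these (the target name is illustrative, not our actual config):

```bash
# Validate the bundle definition locally (assumes the Databricks CLI is already authenticated)
databricks bundle validate -t dev

# Deploy to a dev target straight from the laptop; this worked without a hitch
databricks bundle deploy -t dev
```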
The Error That Sent Me Down a Rabbit Hole
Alright, so let’s get to the villain of our story. Every time the GitHub Action kicked off the deployment, we’d get smacked with this error from the Databricks CLI:
```
Starting resource deployment
Error: terraform apply: exit status 1

Error: cannot create job: An environment is required for serverless task dbt_marts. Please define one using `environments` and `environment_key`. For more details, please refer to the API documentation at https://docs.databricks.com/api/workspace/jobs/create

  with databricks_job.dbt_marts_job,
  on bundle.tf.json line 39, in resource.databricks_job.dbt_marts_job:
  39: }
```
Now, if you’ve dabbled with Databricks Bundles, you’ll know they often compile down to Terraform under the hood for the deployment part. So, seeing `terraform apply: exit status 1` and the reference to `bundle.tf.json` immediately makes you think, “Ah, a Terraform issue!” The message itself, “An environment is required for serverless task dbt_marts. Please define one using `environments` and `environment_key`,” seems pretty darn explicit, doesn’t it? It’s pointing directly at the job definition for a task named `dbt_marts`.
My brain went straight to the `databricks.yml` file. This is where you define your resources, jobs, and, importantly, `environments` for things like serverless compute.
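For context, here’s a minimal sketch of roughly what that part of a `databricks.yml` looks like for a serverless DBT task. The job, task, and environment names are illustrative, and the exact spec fields you need will depend on your workspace and CLI version:

```yaml
# databricks.yml (sketch): a serverless dbt task wired up to an environment
resources:
  jobs:
    dbt_marts_job:
      name: dbt_marts_job
      tasks:
        - task_key: dbt_marts
          environment_key: dbt_serverless_env   # must match an entry under environments
          dbt_task:
            commands:
              - "dbt deps"
              - "dbt run"
      environments:
        - environment_key: dbt_serverless_env
          spec:
            client: "1"
            dependencies:
              - dbt-databricks
```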
The Wild Goose Chase (aka Debugging Steps)
Based on that error, we dove headfirst into troubleshooting the `databricks.yml` and the Terraform configuration:
- Checked `environments`: We meticulously ensured the `environments` block in our `databricks.yml` was correctly populated. We double-checked, triple-checked. I practically had lunch with environments and environment keys that day.
- Name Game: You know how sometimes special characters or reserved words can throw a wrench in the works? We started renaming things – jobs, environments – just in case some obscure naming convention was tripping us up.
- CLI Update Dance: “Maybe it’s an outdated Databricks CLI version?” I mused. So, we updated the CLI on the GitHub Actions runner. Always a good sanity check, though it didn’t solve this particular mystery. You can always find the latest CLI info here.
- Inlining for Simplicity: Our bundle was structured with resources defined in separate files, referenced from the main `databricks.yml` (see the sketch after this list). To rule out any issues with how the bundle was stitching these files together during `databricks bundle deploy` (which internally runs `terraform apply`), we even tried moving everything directly into the `databricks.yml`. Desperate times, right?
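For the curious, the split-file layout we were ruling out is wired up in the top-level `databricks.yml` with an `include` block, roughly like this (paths are illustrative):

```yaml
# databricks.yml (top level): pull in resource definitions from separate files
include:
  - resources/*.yml
```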
Let me tell you, after all that, the error message remained stubbornly the same. We were barking up the completely wrong tree, led astray by what seemed like a very specific error message from the Databricks CLI. It felt like the CLI was gaslighting me!
The “Aha!” Moment (Or More Like “Are You Kidding Me?!”)
So, what was the actual culprit after this day-long debugging marathon? Drumroll, please…
We needed to explicitly `pip install` `dbt-databricks`, `dbt-core`, and (for our setup) `dbt-coverage` in the GitHub Actions workflow before running the bundle deployment.
```yaml
# Example snippet for your GitHub Actions workflow file
# .github/workflows/your-workflow.yml
# ...
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # Or your preferred version
      - name: Set up Python
        uses: actions/setup-python@v5 # Or your preferred version
        with:
          python-version: '3.9' # Or your dbt-compatible version
      - name: Install DBT dependencies
        run: |
          pip install dbt-databricks dbt-core dbt-coverage
          # Add any other dbt adapters or Python dependencies your project needs
      - name: Deploy Databricks Bundle
        env: # Make sure to set your Databricks host and token
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
        run: |
          databricks bundle deploy -t your_target_workspace # Replace with your target
# ...
```
Yes. That. Was. It.
The GitHub Actions runner, being a fresh environment each time, didn’t have the necessary DBT Python packages installed. The Databricks CLI, when it tried to parse or validate the DBT tasks within the bundle, presumably couldn’t find the `dbt` command or the necessary adapter. But instead of saying something helpful like, “Hey, I can’t find `dbt` or `dbt-databricks`, is it installed and on your PATH?”, it threw that misleading error about missing `environments` and `environment_key` for the serverless job.
It seems the underlying tooling assumed DBT was present and, when it wasn’t, the error cascaded into something that looked like a configuration problem within the bundle definition itself.
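In hindsight, a cheap sanity-check step in the workflow, right before the deploy, would have surfaced the real problem in seconds. Something along these lines (a sketch, not something we had at the time):

```yaml
      - name: Sanity-check dbt installation
        run: |
          # Fails fast with an obvious error if the dbt packages aren't installed
          which dbt
          dbt --version
```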
The Moral of the Story
Honestly, the only reason we burned an entire day on what should have been a quick fix was that the error message sent us on a wild goose chase. The CLI was essentially saying, “I can’t create this job because your `environments` and `environment_key` are missing,” when the real issue was, “I don’t even know what a dbt is, because the Python packages aren’t here!”
This whole ordeal really underscores a point: when you’re working with tools that are rapidly evolving or are wrappers around other tools (like the Databricks CLI orchestrating Terraform and interacting with DBT), sometimes the error messages aren’t as mature or direct as they could be. A production-ready tool, ideally, should give you error stack traces that point you closer to the actual root cause.
So, if you find yourself in a similar boat, scratching your head over a Databricks bundle deployment failing in CI/CD with weird Terraform-esque errors related to job definitions, especially when DBT is involved: double-check that your DBT dependencies are explicitly installed in your pipeline environment!
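One way to make that harder to forget is to install the DBT packages from the same pinned requirements file your project uses locally, so CI and developer machines can’t drift apart. A sketch, using a hypothetical `requirements-dbt.txt`:

```yaml
      # requirements-dbt.txt (hypothetical) would pin dbt-core, dbt-databricks, dbt-coverage
      - name: Install DBT dependencies
        run: pip install -r requirements-dbt.txt
```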
Hopefully, sharing this saves someone else the headache.