IPython/Jupyter notebooks have built-in "pretty" formatting of dictionary (and related) constructs. For example, take this messy, nested dictionary construct:
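A minimal sketch of the idea, using the standard library's `pprint` module as a stand-in for the notebook's own pretty printer. The dictionary below is hypothetical and only invented for illustration.

```python
from pprint import pformat

# Hypothetical nested dictionary, made up for this example.
messy = {"name": "example", "tags": ["a", "b", "c"], "settings": {"retries": 3, "timeouts": {"connect": 5, "read": 30}, "hosts": ["alpha.example", "beta.example"]}}

# pformat returns the pretty-printed text; a narrow width forces line wrapping.
formatted = pformat(messy, width=40)
print(formatted)
```

In a notebook, simply evaluating `messy` as the last expression of a cell should give a similarly wrapped display without calling `pprint` explicitly.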
Programmatically creating clients and users in Keycloak
How to create clients and users programmatically in Keycloak, using Python.
git-annex and Homebrew version woes (or how to spend your evenings with open source software)
The other night I wanted to upload some photos to my git-annex remote on our home NAS (a Synology DS416, let's call it mallorca), but reality decided otherwise:
$ git annex copy --to mallorca path/to/nice-picture-01.jpg
fatal: Run with no arguments or with -c cmd
git-annex-shell: git-shell failed
(unable …
Bunch o' cheat sheets
You're probably very familiar with the tools you use daily and operate them from muscle memory. But there are also these setup or maintenance tasks you only do every X months and their practical details are a bit hazy.
This is a random, work-in-progress collection of cheat sheets for these …
Disable pytest's log/print capturing
Yet another note to self.
You're working on some unit tests in pytest and its default log/print capturing is getting a bit in the way. You want to see print or logging calls immediately when they happen and not in some captured/delayed fashion.
Add these command line options …
Yet another solution to dig you out of a circular import hole in Python
The circular import problem in Python.
Some module foo imports module bar, but bar also imports foo.
In itself, that's not necessarily a problem. Python allows it.
Depending on how both modules interact,
you might not even notice there is a cycle in the dependency chain.
However, when you have a …
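As a sketch of the harmless case described above: two modules that import each other, but only touch each other's attributes inside functions, at call time. The module names `cycle_foo` and `cycle_bar` are hypothetical (picked to avoid clashing with anything installed), and writing them to a temporary folder is just a trick to keep the example self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of two hypothetical modules that import each other.
FOO_SOURCE = '''
import cycle_bar

def name():
    return "foo"

def greet():
    # cycle_bar is only used when greet() is called, not at import time.
    return "foo calls " + cycle_bar.name()
'''

BAR_SOURCE = '''
import cycle_foo

def name():
    return "bar"

def greet():
    return "bar calls " + cycle_foo.name()
'''


def load_cycle():
    """Write both modules to a temporary folder and import them."""
    tmp = tempfile.mkdtemp()
    Path(tmp, "cycle_foo.py").write_text(FOO_SOURCE)
    Path(tmp, "cycle_bar.py").write_text(BAR_SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    foo = importlib.import_module("cycle_foo")
    bar = importlib.import_module("cycle_bar")
    return foo, bar


foo, bar = load_cycle()
print(foo.greet())
print(bar.greet())
```

If the second module used `from cycle_foo import name` at the top instead, importing `cycle_foo` first would typically fail with an `ImportError`, because `name` is not defined yet at that point of the import.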
Step by Step OAuth 2.0 Authorization Code Flow with PKCE
In this notebook, I will walk through the OAuth 2.0 Authorization Code flow with PKCE step by step in Python, using a local Keycloak setup as the authorization provider. The focus lies on practical, low-level HTTP operations. We won't even use an actual browser or need an actual HTTP server for the redirect URL.
Header duplication in Spark partitioned CSV files
You are writing a Spark DataFrame to a CSV file with a header line on HDFS.
df.write.csv('output_folder', header=True)
Because your DataFrame is partitioned, you get multiple CSV files in your output folder. Each file will get a header with column names.
So far so good, but …
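One way repeated headers can become a nuisance is when the part files are naively concatenated into a single file. Here is a rough sketch of a merge that keeps only the first header. It assumes the part files have been copied to a local folder and follow Spark's `part-*.csv` naming; the helper `merge_csv_parts` is hypothetical, and this is not necessarily the approach the post takes.

```python
import tempfile
from pathlib import Path


def merge_csv_parts(folder, target):
    """Concatenate part files, writing the header line only once."""
    parts = sorted(Path(folder).glob("part-*.csv"))
    with open(target, "w", newline="") as out:
        for index, part in enumerate(parts):
            with open(part, newline="") as src:
                header = src.readline()
                if index == 0:
                    out.write(header)
                # Copy the remaining (data) lines as they are.
                out.writelines(src)


# Small demo with two fake part files (made-up data).
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp, "output_folder")
    folder.mkdir()
    (folder / "part-00000.csv").write_text("id,name\n1,ann\n")
    (folder / "part-00001.csv").write_text("id,name\n2,bob\n")
    target = Path(tmp, "merged.csv")
    merge_csv_parts(folder, target)
    merged = target.read_text()

print(merged)
```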
HiveServer2 User Impersonation Issues
While setting up Apache Hive, HiveServer2 and Beeline (using vanilla packages instead of some kind of prepackaged Hadoop distribution), I struggled with some permission/user related problems. The error message I got stuck with was something like this:
org.apache.hadoop.security.authorize.AuthorizationException
User: hive is not allowed to …
"Native-Hadoop" Library Load Issues with Spark
While setting up a new cluster with Hadoop (3.1.1) and Spark (2.4.0), I encountered these warnings when running Spark:
19/02/05 13:06:43 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
To debug this issue, I used …