Morning Edition


Supercharge Your Bash History

Words: bbatsov - - 15:27 07-07-2020

For some weird reason I’ve decided to abandon my insane ZShell

setup for a while and switch to a vanilla Bash. Guess I needed a bit of simplicity in my life.

One thing about my initial Bash experience that drove me nuts is the way it handles the

history out-of-the-box:

Obviously there’s no right way to do shell history, and what you consider right or optimal depends on how exactly you are using your shell.

I know many people who hate sharing data between shell sessions, as they want to keep them isolated for various reasons.

On the other hand - I don’t know a single person who likes their shell history to be constantly overwritten.

Let’s teach Bash to append to the history file instead

of overwriting it! Just add the following to your .bashrc:

# append to the history file, don't overwrite it

shopt -s histappend

Note that some Linux distros (e.g. Ubuntu) might be enabling this shell option

in the default .bashrc they copy to each user’s home folder.

Now, let’s increase the history size and teach Bash to ignore duplicate entries in the history:

# don't put duplicate lines or lines starting with space in the history.

# See bash(1) for more options

HISTCONTROL=ignoreboth

# for setting history length see HISTSIZE and HISTFILESIZE in bash(1)

HISTSIZE=100000


You can obviously go really big here, but unless you have a very fast SSD I would not recommend

it, as reading the history can add a bit of latency to new shell sessions.

Now, we’re moving to the crux of it - let’s teach Bash to update the history after each

command we invoke and to reload it. The reloading is what ensures that different shell

sessions are synced in terms of history.

# append and reload the history after each command

PROMPT_COMMAND="history -a; history -n"

history -a writes to the history file and history -n reloads the history from the file, but with a twist - it loads only the new entries that were added there. This makes it way more efficient

than another similar (and quite popular) approach - namely using history -a; history -c; history -r. Full

reloads of a huge history file would show up as slight delays after each command you run. The solution I’ve

suggested should largely avoid them.

You might also want to remove the use of certain commands from your history, whether for privacy or readability reasons. This can be done with the $HISTIGNORE variable. It’s common to use this to exclude ls (and similar) calls, job control built-ins like bg and fg, and calls to history itself:

HISTIGNORE='ls:ll:cd:pwd:bg:fg:history'

Feel free to add here any other commands that you don’t want to store in the history.

Note that here you’re specifying exact matches for exclusion - the above config would

exclude ls, but it won’t exclude ls projects. Most of the time you’d be using this

with commands invoked without arguments.
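That said, HISTIGNORE entries are shell glob patterns, so if you do want to exclude a command together with its arguments, a pattern like `ls *` works. A sketch (my own variation, not part of the article's recipe):

```shell
# each colon-separated entry is a pattern matched against the whole line;
# "ls *" matches ls followed by a space and any arguments,
# "history*" matches history with or without arguments
HISTIGNORE='ls:ls *:bg:fg:history*'
```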

So, putting it all together, that’s my magic recipe to supercharge Bash’s history:

# place this in your .bashrc

# don't put duplicate lines or lines starting with space in the history.
# See bash(1) for more options
HISTCONTROL=ignoreboth

# append to the history file, don't overwrite it
shopt -s histappend

# append and reload the history after each command
PROMPT_COMMAND="history -a; history -n"

# ignore certain commands from the history
HISTIGNORE="ls:ll:cd:pwd:bg:fg:history"

# for setting history length see HISTSIZE and HISTFILESIZE in bash(1)
HISTSIZE=100000


That’s all I have for you today! Keep hacking!

(read more)

Text Encoding: The Thing You Were Trying to Avoid

Words: calvin - - 02:45 09-07-2020

Programmers tend to treat text encoding like the office bore. You weren’t

planning on a 2-hour conversation about your co-worker’s lawnmower today?

Well, too bad, because it’s happening now. Text encoding is much the same: it’s

always around, but we’d rather avoid it. From time to time it pops over anyway

to mess up your day and fill you in about the latest nuance between a code

point and a character, or something equally dull. I used to pay about as much

attention as that thrilling tale of a recent blade height adjustment.

But I’ve been tinkering with text encoding lately, and it’s not really so

awful. In fact, text encoding can be interesting. Not quite fun, let’s be

honest, but definitely interesting. UTF-8, in particular, is elegant and well

designed. If nothing else, text encoding is everywhere and you can’t avoid it

forever, so we might as well try to understand it.

ASCII is an obsolete 1960s-era character set. Or, at least, it would be

obsolete if it weren’t extended multiple times. ASCII only has 128 characters.

Just enough for English, and some teletype control characters. ASCII characters

fit in one byte, so the “encoding” is simple: store the character code in a

single byte. This is terribly convenient. Bytes are characters, characters are

bytes, and no one talks about code points.

As an added bonus, since computer memory comes in 8-bit bytes, not 7, there’s a

leftover bit. It could be used to extend ASCII with another 128 characters.

After all, there are people in the world who don’t speak English and would like

to see their own language on the screen.

These extended ASCII character sets were eventually standardized as ISO-8859.

ISO-8859-1 covers English and most Western European languages. ISO-8859-5

covers Cyrillic. ISO-8859-6 is Arabic. You get the idea.

This system at least allowed languages other than English to be represented.

But you had to pick a character set and stick with it. You couldn’t just throw

a Cyrillic or Greek letter into your English document. And this system

would never allow enough characters for Chinese, Japanese, or Korean.

Unicode is an effort to define one character set for all the world’s

characters. When Unicode was first being planned it was clear that it was going

to be huge. One byte was obviously not enough, so a second byte was added to

allow for 65,536 code points. That probably seemed like enough at the time.

That two-byte encoding is UCS-2.

It’s a simple scheme: to encode, you write the Unicode value in a 16-bit integer.


Run that, and you’ll probably see 48 00. Intel architectures use a little-endian byte order. If you run it on an IBM mainframe or an old Motorola processor or any other big-endian machine, the bytes would be reversed: 00 48.

If text always stays on the same computer this is fine, but if you want to send

text to another computer it has to know that you meant U+203D (‽) and not

U+3D20 (㴠). The convention is to write U+FEFF at the beginning of a document.

This is the “byte order mark”. A decoder knows if it sees FF FE to use

little-endian, and if it sees FE FF to use big-endian.

Of course, if you want to make a general UCS-2 encoder or decoder you have to

write all your code twice:
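In Go you can fold the two variants into a single decoder by switching on the byte order mark; a sketch (my own illustration) using the two binary.ByteOrder implementations:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeUCS2 inspects the byte order mark, then decodes each 16-bit unit.
// "Writing the code twice" collapses into choosing between the two
// binary.ByteOrder implementations.
func decodeUCS2(b []byte) []rune {
	var order binary.ByteOrder = binary.BigEndian
	if len(b) >= 2 {
		if b[0] == 0xFF && b[1] == 0xFE { // FF FE: little-endian
			order = binary.LittleEndian
			b = b[2:]
		} else if b[0] == 0xFE && b[1] == 0xFF { // FE FF: big-endian
			b = b[2:]
		}
	}
	var out []rune
	for ; len(b) >= 2; b = b[2:] {
		out = append(out, rune(order.Uint16(b)))
	}
	return out
}

func main() {
	le := []byte{0xFF, 0xFE, 0x48, 0x00, 0x69, 0x00} // "Hi", little-endian
	be := []byte{0xFE, 0xFF, 0x00, 0x48, 0x00, 0x69} // "Hi", big-endian
	fmt.Println(string(decodeUCS2(le)), string(decodeUCS2(be)))
}
```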

Unfortunately for UCS-2, Unicode outgrew two bytes. Sure, Unicode characters

through U+FFFF (the “Basic Multilingual Plane”) can be encoded in UCS-2, and

that’s enough sometimes. But if you want more Chinese characters, or Klingon, or fancy emojis, you can’t use UCS-2.

In UTF-16 each code point takes either two or four bytes. The two-byte version

is the same as UCS-2. The four-byte version contains a “high surrogate” in the

first two bytes and a “low surrogate” in the last two bytes. The surrogates can be

combined into a code point value. In case you’re wondering, the high and low

surrogate ranges are defined in Unicode so they don’t conflict with any other characters.


Let’s look at the UTF-16BE encoding for U+1F407 (🐇): d8 3d dc 07, the surrogate pair 0xD83D 0xDC07.

The code point value is in the lower 10 bits of each surrogate, so we can apply a 0x3FF bit mask.

The decoder takes that result, shifts and ORs and adds 0x10000 to get the

code point value.
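Worked through in Go for U+1F407 (my own example):

```go
package main

import "fmt"

func main() {
	// U+1F407 is the surrogate pair 0xD83D 0xDC07. Each surrogate
	// carries 10 bits of (code point - 0x10000): mask, shift, OR,
	// then add 0x10000 back.
	hi, lo := uint32(0xD83D), uint32(0xDC07)
	cp := ((hi&0x3FF)<<10 | (lo & 0x3FF)) + 0x10000
	fmt.Printf("U+%X\n", cp) // U+1F407
}
```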

That’s a basic UTF-16 decoder. Naturally, it inherited the big-endian and

little-endian variants from UCS-2, along with the necessary byte order mark.

UTF-16 does the job. But it feels like what you get when you chip away every

objection until you find a compromise everyone can barely tolerate.

ASCII and UCS-2 are fixed width, which is easy to work with. But if you want to

hold the whole range of Unicode with a fixed width you need four bytes, and

that is UTF-32. Every code point, 4 bytes.

UTF-32 will be faster for some operations, so as a trade-off of space for time

it has its place. But as a general-purpose encoding, it’s wasteful. For

instance, ASCII characters are common in the real world, but each one wastes 25

bits in UTF-32.

It’s even worse than that. The largest possible Unicode code point is U+10FFFF,

which requires 21 bits. Consequently, there are at least 11 unused bits in every

code point. That’s right, there’s always at least one completely unused byte in

every UTF-32 encoded code point.

Just like the other multi-byte encodings, UTF-32 comes in big-endian and

little-endian versions as well. One nice thing about UTF-32’s wasted space is

that you don’t usually need a byte order mark. Let’s look at U+1F407 (🐇): in UTF-32BE it’s 00 01 f4 07, and in UTF-32LE it’s 07 f4 01 00.


There’s a zero byte on one side or the other, so the decoder can find the byte

order for any code point.

UTF-8 is another variable-length encoding. Each code point takes one to four

bytes. ASCII characters (that is, Unicode code points below U+80) take just one

byte. Every other byte in UTF-8 will have its high bit set (b & 0x80 != 0).

For a multi-byte code point, the total number of bytes is encoded in the first

byte, as the number of 1 bits before the first zero. Convert it to binary and it’s easy to see: 0xF0, for example, is 0b11110000, so it begins a four-byte sequence.

The bits after the length bits make up the beginning of the code point. Subsequent

bytes always begin with a 1 and a 0, followed by six bits of the value.

Here’s the full scheme:

0xxxxxxx
110xxxxx 10xxxxxx
1110xxxx 10xxxxxx 10xxxxxx
11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

If you count the xs on the four-byte version you’ll find 21. Exactly enough to store the highest Unicode code point, 0x10FFFF. Here are some examples: "A" (U+0041) is 41, "é" (U+00E9) is c3 a9, and 🐇 (U+1F407) is f0 9f 90 87.

After the first byte, a decoder has to mask the low six bits of each byte, then

shift it onto the code point value.

Here’s a Go decoder.
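A minimal decoder, sketched from the scheme above (my reconstruction, with the error handling a real decoder needs omitted):

```go
package main

import "fmt"

// decodeRune decodes the first UTF-8 sequence in b, returning the code
// point and how many bytes it used.
func decodeRune(b []byte) (r rune, size int) {
	switch {
	case b[0]&0x80 == 0x00: // 0xxxxxxx: ASCII, one byte
		return rune(b[0]), 1
	case b[0]&0xE0 == 0xC0: // 110xxxxx: two bytes
		r, size = rune(b[0]&0x1F), 2
	case b[0]&0xF0 == 0xE0: // 1110xxxx: three bytes
		r, size = rune(b[0]&0x0F), 3
	default: // 11110xxx: four bytes
		r, size = rune(b[0]&0x07), 4
	}
	// Each continuation byte contributes six more bits.
	for _, c := range b[1:size] {
		r = r<<6 | rune(c&0x3F)
	}
	return r, size
}

func main() {
	r, n := decodeRune([]byte("🐇"))
	fmt.Printf("U+%X (%d bytes)\n", r, n) // U+1F407 (4 bytes)
}
```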

That’s silly, of course, since Go has excellent support for UTF-8. Its built-in strings are already UTF-8. Which shouldn’t be any wonder, since Ken Thompson designed UTF-8 with Rob Pike, and then they both worked on Go.

Everyone has mostly settled on UTF-8 in the past few years. But I think it’s

worth examining this decision. Let’s recap: UCS-2 can’t hold all of Unicode, and the fixed-width encoding that can, UTF-32, wastes space. So fixed-width encodings are out. The only options left are either UTF-8 or

UTF-16. Let’s compare.

Did you notice how every ASCII character takes only one byte? That means that

ASCII documents are already UTF-8. That actually covers a lot of real-world text.

On the English front page of Wikipedia, for instance, 99% of the characters are ASCII.

The French version is still 98% ASCII, and even the Japanese version is 91% ASCII.

UTF-8 also never has a NULL byte. Which means this actually works on my computer:

If that doesn’t work, this definitely will:

Try the same trick with UTF-16 and it won’t get past the first character:

It prints an “H” then quits. The \x00 is interpreted as the end of the string.

In fact, much of the C standard library (strcmp, strlen) works fine with UTF-8.

Not so with UTF-16: you can’t embed the encoded bytes in 8-bit numbers. Your best bet is probably to convert it to wide chars and use the wide versions of those functions.


UTF-16’s byte order handling complicates everything. Here’s a simple C program that decodes UTF-16 and prints the code points. The only thing complicated

about that program is handling the byte order mark. Everything you want to do

with a UTF-16 string has to consider the byte order.

But UTF-8 is read one byte at a time. In fact, there’s no other way to do it.

Consequently, there is no byte order to worry about. Here’s the UTF-8 version of the earlier program.

A decoder can always tell where a code point starts in UTF-8. This is not the

case for UTF-16.

Let’s say you want to fill your home with classic literature and decide to

start with the WiFi:

iconv converts the text to UTF-16 and nc sends it via UDP multicast to all

hosts on your home network (presumably over WiFi, because otherwise what’s the

point?). On some other host on your network you can read it:

Or just grab a sample:

Anna Karenina uses the Cyrillic alphabet, and the Art of War is in ancient

Chinese. There’s no telling what you’ll get. Here’s one sample:

Looks like we got Alice in Wonderland that time, since to is more likely than

琀漀. But we didn’t tell iconv explicitly what byte order to use and

there’s nothing in the data to tell us.

This begins with two new lines and Д, so we’re probably in Anna Karenina.

UTF-16 over UDP works better than I thought it would. I suspect that even-sized

packet sizes keep the characters lined up. If it lost a byte everything would

shift and we wouldn’t be able to tell where a code point begins.

Contrast this with a sampling of the UTF-8 version of the same stream:

The first byte is 0xb0, which is 0b10110000. Since it starts with 0b10,

we know it’s not the first byte in a sequence and we can skip it. Same with the

next two bytes, which also begin with 0b10.

The fourth byte, however, is 0xd1, or 0b11010001 which is the first byte of

a two-byte sequence for с U+0441. We did miss a code point, but there was no

ambiguity and after that, we’re in sync.

Thanks for taking the time to read this. If you’re interested I put some text

encoding code up on github:

There are better tools than these, probably with fewer bugs. But since I was

just tinkering, these are unconcerned with the real world. Which makes them

fairly simple projects and hopefully easy to read.

(read more)

Barebones WebGL in 75 lines of code

Words: calvin - - 02:41 09-07-2020

Jul 8, 2020 • Avik Das

Modern OpenGL, and by extension WebGL, is very different from the legacy OpenGL I learned in the past. I understand how rasterization works, so I’m comfortable with the concepts. However, every tutorial I’ve read introduced abstractions and helper functions that make it harder for me to understand which parts are truly core to the OpenGL APIs.

To be clear, abstractions like separating positional data and rendering functionality into separate classes are important in a real-world application. But these abstractions spread code across multiple areas, and introduce overhead due to boilerplate and passing around data between logical units. The way I learn best is a linear flow of code where every line is core to the subject at hand.

First, credit goes to the tutorial I used. Starting from this base, I stripped down all the abstractions until I had a “minimal viable program”. Hopefully, this will help you get off the ground with modern OpenGL. Here’s what we’re making:

A slightly more colorful version of the black triangle

With WebGL, we need a canvas to paint on. You’ll definitely want to include all the usual HTML boilerplate, some styling, etc., but the canvas is the most crucial. Once the DOM has loaded, we’ll be able to access the canvas using Javascript.

id= "container" width= "500" height= "500" >


document . addEventListener ( ' DOMContentLoaded ' , () => {

// All the Javascript code below goes here


With the canvas accessible, we can get the WebGL rendering context, and initialize its clear color. Colors in the OpenGL world are RGBA, with each component between 0 and 1. The clear color is the one used to paint the canvas at the beginning of any frame that redraws the scene.

const canvas = document.getElementById('container');
const gl = canvas.getContext('webgl');

gl.clearColor(1, 1, 1, 1);

There’s more initialization that can, and in real programs should, be done. Of particular note is enabling the depth buffer, which would allow sorting geometry based on the Z coordinates. We’ll avoid that for this basic program consisting of only one triangle.

OpenGL is at its core a rasterization framework, where we get to decide how to implement everything but the rasterization. This entails running at minimum two pieces of code on the GPU:

A vertex shader that runs for each piece of input, outputting one 3D (really, 4D in homogeneous coordinates) position per input.

A fragment shader that runs for each pixel on the screen, outputting what color that pixel should be.

In between these two steps, OpenGL takes the geometry from the vertex shader and determines which pixels on the screen are actually covered by that geometry. This is the rasterization part.

Both shaders are typically written in GLSL (OpenGL Shading Language), which is then compiled down to machine code for the GPU. The machine code is then sent to the GPU, so it can be run during the rendering process. I won’t spend much time on GLSL, as I’m only trying to show the basics, but the language is sufficiently close to C to be familiar to most programmers.

First, we compile and send a vertex shader to the GPU. Here, the source code for the shader is stored in a string, but it can be loaded from other places. Ultimately, the string is sent to the WebGL APIs.

const sourceV = `
  attribute vec3 position;
  varying vec4 color;

  void main() {
    gl_Position = vec4(position, 1);
    color = gl_Position * 0.5 + 0.5;
  }
`;

const shaderV = gl.createShader(gl.VERTEX_SHADER);
gl.shaderSource(shaderV, sourceV);
gl.compileShader(shaderV);

if (!gl.getShaderParameter(shaderV, gl.COMPILE_STATUS)) {
  console.error(gl.getShaderInfoLog(shaderV));
  throw new Error('Failed to compile vertex shader');
}


Here, there are a few variables in the GLSL code worth calling out:

An attribute called position. An attribute is essentially an input, and the shader is called for each such input.

A varying called color. This is an output from the vertex shader (one per input), and an input to the fragment shader. By the time the value is passed to the fragment shader, the value will be interpolated based on the properties of the rasterization.

The gl_Position value. Essentially an output from the vertex shader, like any varying value. This one is special because it’s used to determine which pixels need to be drawn at all.

There’s also a variable type called uniform, which will be constant across multiple invocations of the vertex shader. These uniforms are used for properties like the transformation matrix, which will be constant for all vertices on a single piece of geometry.

Next, we do the same thing with the fragment shader, compiling and sending it to the GPU. Notice the color variable from the vertex shader is now read by the fragment shader.

const sourceF = `
  precision mediump float;
  varying vec4 color;

  void main() {
    gl_FragColor = color;
  }
`;

const shaderF = gl.createShader(gl.FRAGMENT_SHADER);
gl.shaderSource(shaderF, sourceF);
gl.compileShader(shaderF);

if (!gl.getShaderParameter(shaderF, gl.COMPILE_STATUS)) {
  console.error(gl.getShaderInfoLog(shaderF));
  throw new Error('Failed to compile fragment shader');
}


Finally, both the vertex and fragment shader are linked into a single OpenGL program.

const program = gl.createProgram();
gl.attachShader(program, shaderV);
gl.attachShader(program, shaderF);
gl.linkProgram(program);

if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
  console.error(gl.getProgramInfoLog(program));
  throw new Error('Failed to link program');
}

gl.useProgram(program);

We tell the GPU that the shaders we defined above are the ones we want to run. So, now what’s left is to create the inputs and let the GPU loose on those inputs.

The input data will be stored in the GPU’s memory and processed from there. Instead of making separate draw calls for each piece of input, which would transfer the relevant data one piece at a time, the entire input is transferred to the GPU and read from there. (Legacy OpenGL would transfer data one piece at a time, leading to worse performance.)

OpenGL provides an abstraction known as a Vertex Buffer Object (VBO). I’m still figuring out how all of this works, but ultimately, we’ll do the following using the abstraction:

Store a sequence of bytes in the CPU’s memory.

Transfer the bytes to the GPU’s memory using a unique buffer created using gl.createBuffer() and a binding point of gl.ARRAY_BUFFER.

We’ll have one VBO per input variable (attribute) in the vertex shader, though it’s possible to use a single VBO for multiple inputs.

const positionsData = new Float32Array([
  -0.75, -0.65, -1,
   0.75, -0.65, -1,
   0,     0.65, -1,
]);

const buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, positionsData, gl.STATIC_DRAW);

Typically, you’ll specify your geometry with whatever coordinates are meaningful to your application, then use a series of transformations in the vertex shader to get them into OpenGL’s clip space. I won’t go into the details of clip space (they have to do with homogeneous coordinates), but for now, X and Y vary from -1 to +1. Because our vertex shader just passes along the input data as is, we can specify our coordinates directly in clip space.

Next, we’ll also associate the buffer with one of the variables in the vertex shader. Here, we:

Get a handle to the position variable from the program we created above.

Tell OpenGL to read data from the gl.ARRAY_BUFFER binding point, in batches of 3, with particular parameters like an offset and stride of zero.

const attribute = gl.getAttribLocation(program, 'position');
gl.enableVertexAttribArray(attribute);
gl.vertexAttribPointer(attribute, 3, gl.FLOAT, false, 0, 0);

Note that we can create the VBO and associate it with the vertex shader attribute this way because we do both one after another. If we separated these two functions (for example creating all the VBOs in one go, then associating them to individual attributes), we would need to call gl.bindBuffer(...) before associating each VBO with its corresponding attribute.

Finally, with all the data in the GPU’s memory set up the way we want, we can tell OpenGL to clear the screen and run the program on the arrays we set up. As part of the rasterization (determining which pixels are covered by the vertices), we tell OpenGL to treat the vertices in groups of 3 as triangles.

gl.clear(gl.COLOR_BUFFER_BIT);
gl.drawArrays(gl.TRIANGLES, 0, 3);

The way we’ve set this up in a linear fashion does mean the program runs in one shot. In any practical application, we’d store the data in a structured way, send it to the GPU whenever it changes, and perform the drawing every frame.

Putting everything together, the diagram below shows the minimal set of concepts that go into showing your first triangle on the screen. Even then, the diagram is heavily simplified, so your best bet is to put together the 75 lines of code presented in this article and study that.

The hard part of learning OpenGL for me has been the sheer amount of boilerplate needed to get the most basic image on the screen. Because the rasterization framework requires us to provide 3D rendering functionality, and communicating with the GPU is verbose, there are many concepts to learn right up front. I hope this article shows the basics are simpler than other tutorials make them out to be!

(read more)

"Interface smuggling", a Go design pattern for expanding APIs

Words: calvin - - 05:04 09-07-2020

"Interface smuggling", a Go design pattern for expanding APIs

July 8, 2020

Interfaces are one of the big ways of creating and defining APIs

in Go. Go famously encourages these

interfaces to be very minimal; the widely used and implemented

io.Reader and io.Writer are each one method. Minimal APIs

such as this have the advantage that almost anything can implement them,

which means that Go code that accepts an io.Reader or io.Writer can

work transparently with a huge range of data sources and destinations.

However, this very simplicity and generality means that these APIs

are not necessarily the most efficient way to perform operations.

For example, if you want to copy from an io.Reader to an io.Writer,

such as io.Copy() does, using

only the basic API means that you have to perform intermediate data

shuffling when in many cases either the source could directly write

to the destination or the destination could directly read from the

source. Go's solution to this is what I will call interface smuggling.


In interface smuggling, the actual implementation is augmented with

additional well-known APIs, such as io.ReaderFrom and io.WriterTo. Functions that want to work more efficiently when possible, such as io.Copy(), attempt to convert the io.Reader or io.Writer they obtained to the relevant API and then use it if the conversion succeeded:

if wt, ok := src.(WriterTo); ok {
	return wt.WriteTo(dst)
}
if rt, ok := dst.(ReaderFrom); ok {
	return rt.ReadFrom(src)
}
[... do copy ourselves ...]

I call this interface smuggling because we are effectively smuggling

a different, more powerful, and broader API through a limited one.

In the case of types supporting io.WriterTo and io.ReaderFrom,

io.Copy completely bypasses the nominal API; the .Read() and .Write()

methods are never actually used, at least directly by io.Copy (they may

be used by the specific implementations of .WriteTo() or .ReadFrom(), or

more interface smuggling may take place).

(Go code also sometimes peeks at the concrete types of interface

API arguments. This is how under the right circumstances, io.Copy

will wind up using the Linux splice(2) or sendfile(2) system calls.)


There is also interface smuggling that expands the API, as seen in

things like io.ReadCloser and io.ReadWriteSeeker.

If you have a basic io.Reader, you can try to convert it to see if it

actually supports these expanded APIs, and then use them if it does.

PS: There's probably a canonical Go term for doing this as part

of your API design, either to begin with or as part of expanding

it while retaining backward compatibility. If so, feel free

to let me know in the comments.

(read more)

Turn any website into a live wireframe

Words: mooreds - - 04:28 09-07-2020




Turn any website into a live wireframe.

Chrome Extension

npm Package

You can use it in your site by importing the CSS from unpkg and adding the placeholdify class somewhere:

<html>
  <head>
    <link rel="stylesheet" href="…" />
  </head>
  <body>
    <h1>Hello World</h1>
    <h1 class="placeholdify">Hello World</h1>
  </body>
</html>

Try it in JS Bin


MIT License

(read more)

git commit accepts several message flags (-m) to allow multiline commits

Words: calvin - - 02:43 09-07-2020

Published at Jul 07 2020 · Updated at Jul 08 2020 · Reading time 2 min

This post is part of my Today I learned series, in which I share all my learnings regarding web development.

When you use git on the command line you might have used the message flag (-m). It allows developers to define commit messages inline when calling git commit.

git commit -m "my commit message" I'm not the biggest fan of this approach because I prefer to edit the commit message in vim (which I only use for writing commit messages). It gives me the opportunity to double-check the files I'm committing.

Today I learned that the git commit command accepts multiple message flags. 😲

It turns out that you can use the -m option multiple times. The git documentation includes the following paragraph:

If multiple -m options are given, their values are concatenated as separate paragraphs

If you run the following command:

git commit -m "commit title" -m "commit description"

it will result in this commit.

Author: stefan judis
Date: Tue Jul 7 21:53:21 2020 +0200

commit title

commit description

test.txt | 0
1 file changed, 0 insertions(+), 0 deletions(-)

You can use multiple -m flags to create "multiline commits", and I have to admit that this can be very handy in some cases.

Edited: Several people pointed out that you can achieve the same commit structure including a title and body (multiple lines) by opening quotes, pressing enter and closing the commit with quotes again.

git commit -m "commit title


> commit description"

[master 2fe1ef8] commit title

1 file changed, 0 insertions(+), 0 deletions(-)

create mode 100644 test-2.txt

If you want to see this command in action, I shared a short terminal session on Twitter with a little video.

And thanks to Stephan Schneider who shared that little git tip in our company slack. 🙇‍♂️


(read more)

Reddit's website uses DRM for fingerprinting

Words: 0x70532007 - - 02:27 09-07-2020

Recently, I was using a page on Reddit (i.e. the main redesign domain) when I saw a yellow bar from Firefox:

Why did Reddit want to use DRM? This pop-up was appearing on all pages, even pages with no audio or video, so I did a bunch of source code analysis to find out.

Reddit’s source code uses bundling and minification, but I was able to infer that in ./src/reddit/index.tsx a script was conditionally loaded into the page: if the show_white_ops A/B test flag was set, it loaded another script, which in turn loads the main script. (That loader appears to test for a browser bug involving running JSON.parse with null bytes, and sometimes loads an alternative file instead, which I haven’t analyzed, though it looks fairly similar. It also does nothing if, due to another browser bug, !("a" == "a"[0]) evaluates to true.)

The purpose of all of this appears to be both fingerprinting and preventing ad fraud. The script appears to belong to White Ops, a “global leader in bot mitigation, bot prevention, and fraud protection”. I inferred this from the name of Reddit’s feature flag and from mentions of White Ops in the code. They appear to do this by collecting tons of data about the browser and analyzing it. I must say, their system is quite impressive.

Back to the DRM issue: it appears that the script is checking which DRM solutions are available, but not actually using them. However, just checking is enough to trigger Firefox into displaying the DRM popup. Specifically, it looks for Widevine, PlayReady, Clearkey, and Adobe Primetime.

main.js does a bunch of other interesting things, but there are so many that I’ve written a whole separate blog post about all of the ones I found. Here are some highlights:

(read more)

jklp: a 36-key ergonomic keyboard

Words: tomb - - 22:01 08-07-2020

jklp (pronounced like "jökulhlaup") is an ergonomic keyboard with just 36 keys. It's designed around the natural resting position of your hands.

Compared to other minimal ergonomic keyboards, its unusual features are:

This repository contains the files you need to make a functioning jklp from scratch, including the case, wiring, and firmware.

The design requires ALPS-style switches and a Pololu A-Star controller. You won't need a PCB or hot glue.

The case is a stack of laser-cut layers fastened together by bolts. The layers are:

The base is a single piece that holds the halves of the keyboard together and sets their angle with respect to one another. It can be of any thickness or material. I used 1/4" plywood.

The base included in case.svg sets an angle of 20°, but you can make another base with a different angle and swap it in anytime, even after wiring.

The frame creates a cavity for the wiring, contacts, and diodes. It can be of any material but should be at least 5 mm thick.

The plates hold the switches firmly in place, avoiding the need for a PCB or hot glue. The plates have two layers:

The crystal is an optional shield over the controller (not pictured in the photo at top). Like the base, this piece needs to be replaced in order to change the split angle, and the included one is for a 20° angle.

The wiring of the key matrix and controller is specified by firmware.json, which can be edited and compiled to corresponding firmware in Keyboard Firmware Builder.

If you don't understand what that means or need help with wiring technique, the QMK Hand Wiring Guide is good.

To play with the layout or redesign the key matrix from scratch, start by opening up layout.json in the Keyboard Layout Editor.

(read more)

Git Credential Manager Core: Building a universal authentication experience

Words: mooreds - - 23:15 08-07-2020

Authentication is a critical component to your daily development. When working in open source, you need to prove that you have rights to update a branch with git push. Additionally when working on proprietary software, you need a way to prove that you even have read permission to access your code during git fetch or git pull.

Git currently supports two authentication mechanisms for accessing remotes. When using HTTP(S), Git sends a username and password, or a personal access token (PAT) via HTTP headers. When using SSH, Git relies on the server knowing your machine’s public SSH key. Though SSH-based authentication is considered most secure, setting it up correctly can often be a challenge. On the other hand, PATs are often much easier to set up, but also far less secure.

To manage all of this, Git relies on tools called credential managers which handle authentication to different hosting services. When first designed, these tools simply stored usernames and passwords in a secure location for later retrieval (e.g., your keychain, in an encrypted file, etc). These days, two-factor authentication (2FA) is commonly required to keep your data secure. This complicates the authentication story significantly since new and existing tools are required to meet the demands of these stricter authentication models.

Even though authentication is so critical, building a new authentication feature is hard. Hard to debug, hard to test, hard to get right. If you’re going to do something, then it is best to do it right. Even better, it is helpful to do it once. We’ve been hard at work laying the foundation for a single tool to unify the Git authentication experience across platforms and hosting services.

I’m pleased to announce a new credential manager is available for Windows and macOS: Git Credential Manager (GCM) Core ! GCM Core is a free, open-source, cross-platform credential manager for Git, and currently supports authentication to GitHub, Bitbucket, and Azure Repos. We built this tool from the ground up with cross-platform and cross-host support in mind. We plan to extend this tool to include support for Linux platforms and authentication with additional hosting services.

But wait, doesn’t this just mean we’ve made yet another credential helper?

xkcd on Standards.

Well yes, but actually no. GCM Core is in beta today, which means that we won’t be retiring GCM for Windows. Also, without Linux support, we won’t be retiring GCM for Mac & Linux just yet.

However, once GCM Core has had some time in the wild, we will move to deprecate and retire both GCM for Windows and GCM for Mac & Linux.

To install GCM Core, follow these instructions for each platform:

GCM Core is distributed as a standalone installer which you can find from the releases page on GitHub. The next version of the official Git for Windows installer will include GCM Core as an experimental option, and eventually it will be installed by default.

GCM Core installs side-by-side with existing Git Credential Manager for Windows installations and will re-use any previously stored credentials. This means that you do not need to re-authenticate! Credentials created by GCM Core are also backwards compatible with GCM for Windows, should you wish to return to the older credential manager.

If you installed GCM Core via the Git for Windows installer, you can run the following in an admin command-prompt to switch back to using GCM for Windows:

If you installed GCM Core via the standalone installer, simply uninstall GCM Core from the Control Panel or Settings app.

GCM Core is available from the custom Microsoft Homebrew Tap and can be installed and configured for the current user easily by running the following commands with Homebrew installed:

We intend for GCM Core to be helpful for all users, on all platforms, using any hosting service. There is room to grow here, especially our plans to make GCM Core available on Linux.

We are pleased our first release has support for authenticating with GitHub, Azure Repos, and Bitbucket. In particular, we would like to thank @mminns for helping us get the Bitbucket authentication mechanism working! We are excited to similarly extend support for other hosting services, including planned support for GitLab.

While authentication is critical to user success, it isn’t something that should take a lot of user attention. We streamlined the authentication flow to ensure that you are prompted for new credentials only when absolutely necessary. This flow includes interactive sessions that allow a variety of 2FA mechanisms.

On Windows, our authentication model uses a graphical user interface (GUI) system. The authentication windows are custom to your Git hosting service, as seen in the figure below.

On macOS, the authentication process uses a combination of terminal inputs and browser windows.

We are working on updating this terminal-based approach with a cross-platform GUI approach. This again will help unify the authentication user experience across platforms.

After completing the GUI steps to create a security token, these credentials are securely stored. On Windows, the tokens are stored in the Windows Credential Manager. This is backwards compatible with any existing GCM for Windows credentials. On macOS, credentials are securely stored in the user’s login Keychain.

I mentioned earlier that we are laying a foundation for a unified authentication experience. It may help to understand the fractured world of Git authentication before GCM Core.

The Git Credential Manager for Windows (GCM for Windows) was created back in 2015 primarily to address the combined problem of a lack of SSH support in Azure Repos, then named Visual Studio Online, and a hard requirement for 2FA for many Azure Active Directory or Microsoft Account users – the authentication providers supported by Azure Repos. Over time GCM for Windows also gained support for GitHub and Bitbucket authentication through open-source contributions. Git for Windows initially shipped only with a C-based credential helper named wincred which just persisted a username/password, and did nothing regarding 2FA.

At the same time, Git Credential Manager for Mac and Linux (GCM for Mac & Linux) was created, focused on non-traditional Microsoft developers. That is, those not on Windows and those using non-Microsoft languages, runtimes, or toolchains.

These two codebases are completely separate, with GCM for Windows being written in C# and GCM for Mac & Linux being written in Java. GCM for Mac & Linux is also limited to Azure Repos and never got any support for GitHub or Bitbucket. Both projects have had their fair share of issues (remember: auth is hard). With the number of different authentication topologies typically present in enterprises, a number of dirty hacks have been added over the years to work around problems quickly.

After seeing the success of moving the Windows OS monorepo to Git, the Microsoft Office team approached our team with a desire to do the same with their monorepo. The catch: they have developers using macOS to build macOS and iOS clients. This means that they need cross-platform tools. As part of that, you can read about our journey to transition from the Windows-only VFS for Git to Scalar as a cross-platform solution for monorepo performance.

Scalar and VFS for Git are extensions to Git that make it easier to work with very large monorepos. Both of these technologies rely in part on the GVFS Protocol to receive information such as file sizes and individual objects on-demand from a remote repository. This mechanism only uses HTTP REST endpoints, and is not available via SSH. This means that it is even more important to have a proper credential manager on macOS.

We examined this landscape of credential managers and decided that users needed something better, and more sustainable. Thus, the idea of GCM Core was born.

With the release and introduction of .NET Core and .NET Standard, creating applications that work across Windows, macOS, and Linux is easy. The ability to bundle the .NET runtime with your application when publishing means you can distribute without worrying about runtime dependencies or mismatched versions.

Today is just the beginning. This first launch is a small, but important step toward unifying the authentication experience. Come along with us on this journey, and contribute to the open-source project by creating issues when you have a problem, or contributing a pull request if you can.

We are working on getting GCM Core to Linux users of various distributions. The groundwork is already in place, and we’re just evaluating options for persisting credentials in a safe place. Consult this issue for the latest updates on Linux support.

Currently only Windows has GUIs for all the current Git host providers. macOS has a GUI only for Azure Repos. We are evaluating options such as Avalonia or native helper apps for this, and would happily welcome any contributions in this space. Consult this issue for the latest updates on cross-platform UI.

(read more)

Why Go’s Error Handling is Awesome

Words: enxio - - 19:56 08-07-2020

Published on 06 Jul 2020

Go’s infamous error handling has caught quite the attention of outsiders to the programming language, and is often touted as one of the language’s most questionable design decisions. If you look into any project on GitHub written in Go, it’s almost a guarantee you’ll see these lines more frequently than anything else in the codebase:

if err != nil {
    return err
}


Although it may seem redundant and unnecessary for those new to the language, the reason errors in Go are treated as first-class citizens (values) has a deeply-rooted history in programming language theory and the main goal of Go as a language itself. Numerous efforts have been made to change or improve how Go deals with errors, but so far, one proposal is winning above all others:

- Leave if err != nil alone!

Go’s philosophy regarding error handling forces developers to incorporate errors as first class citizens of most functions they write. Even if you ignore an error using something like:

func getUserFromDB() (*User, error) { ... }

func main() {
    user, _ := getUserFromDB()
    ...
}


Most linters or IDEs will catch that you’re ignoring an error, and it will certainly be visible to your teammates during code review. However, in other languages, it may not be clear at all that your code fails to handle a potential exception somewhere inside a try catch block, leaving your control flow completely opaque.

If you handle errors in Go the standard way, you get the benefits of:

Not only is the syntax of func f() (value, error) easy to teach to a newcomer, but also a standard in any Go project which ensures consistency.

It’s important to note Go’s error syntax does not force you to handle every error your program may throw. Go simply provides a pattern to ensure you think of errors as critical to your program flow, but not much else. At the end of your program, if an error occurs, and you find it using err != nil, and your application doesn’t do something actionable about it, you’re in trouble either way - Go can’t save you. Let’s take a look at an example:

if err := criticalDatabaseOperation(); err != nil {
    log.Errorf("Something went wrong in the DB: %v", err)
}

if err := saveUser(user); err != nil {
    log.Errorf("Could not save user: %v", err)
}


If something goes wrong and err != nil in calling criticalDatabaseOperation(), we’re not doing anything with the error aside from logging it! We might have data corruption or an otherwise unexpected issue that we are not handling intelligently, either via retrying the function call, canceling further program flow, or in worst-case scenario, shutting down the program. Go isn’t magical and can’t save you from these situations. Go only provides a standard approach for returning and using errors as values, but you still have to figure out how to handle the errors yourself.

In something like the Javascript Node.js runtime, you can structure your programs as follows, known as throwing exceptions:

try {
    criticalOperation1();
    criticalOperation2();
    criticalOperation3();
} catch (e) {
    console.error(e);
}


If an error occurs in any of these functions, the stack trace for the error will pop up at runtime and will be logged to the console, but there is no explicit, programmatic handling of what went wrong.

Your criticalOperation functions don’t need to explicitly handle error flow, as any exception that occurs within that try block will be raised at runtime along with a stack trace of what went wrong.

A benefit to exception-based languages is that, compared to Go, even an unhandled exception will still be raised via a stack trace at runtime if it occurs. In Go, it is possible to not handle a critical error at all, which can arguably be much worse. Go offers you full control of error handling, but also full responsibility.

EDIT: Exceptions are definitely not the only way other languages deal with errors. Rust, for example, has a good compromise of using option types and pattern matching to find error conditions, leveraging some nice syntactic sugar to achieve similar results.

The Zen of Go mentions two important proverbs:

Applying the simple if err != nil snippet to all functions which return (value, error) helps ensure failure in your programs is thought of first and foremost. You don’t need to wrangle with complicated, nested try catch blocks which appropriately handle all possible exceptions being raised.

With exception-based code, however, you’re forced to be aware of every situation in which your code could have exceptions without actually handling them, as they’ll be caught by your try catch blocks. That is, it encourages programmers to never check errors, knowing that at the very least, some exception will be handled automatically at runtime if it occurs.

A function written in an exception-based programming language may often look like this:

item = getFromDB()
item.Value = 400
saveToDB(item)
item.Text = 'price changed'

This code does nothing to ensure exceptions are properly handled. Perhaps making the code above exception-aware is just a matter of swapping the order of saveToDB(item) and item.Text = 'price changed', which is opaque, hard to reason about, and can encourage some lazy programming habits. In functional programming jargon, this is known by the fancy term: violating referential transparency. This blog post from Microsoft’s engineering blog in 2005 still holds true today, namely:

My point isn’t that exceptions are bad.

My point is that exceptions are too hard and I’m not smart

enough to handle them.

A superpower of the pattern if err != nil is how it allows for easy error-chains to traverse a program’s hierarchy all the way to where they need to be handled. For example, a common Go error handled by a program’s main function might read as follows:

[2020-07-05-9:00] ERROR: Could not create user: could not check if user already exists in DB: could not establish database connection: no internet

The error above is (a) clear, (b) actionable, and (c) has sufficient context about which layers of the application went wrong. Instead of blowing up with an unreadable, cryptic stack trace, errors like these, which result from factors we can add human-readable context to, should be handled via clear error chains as shown above.

Moreover, this type of error chain arises naturally as part of a standard Go program’s structure, likely looking like this:

// In controllers/user.go

if err := database.CreateUser(); err != nil {
    log.Errorf("Could not create user: %v", err)
}


// In database/user.go

func CreateUser() error {
    if err := db.SQLQuery(userExistsQuery); err != nil {
        return fmt.Errorf("could not check if user already exists in db: %v", err)
    }
    ...
}


// In database/sql.go

func SQLQuery() error {
    if err := sql.Connected(); err != nil {
        return fmt.Errorf("could not establish db connection: %v", err)
    }
    ...
}


// in sql/sql.go

func Connected() error {
    if noInternet {
        return errors.New("no internet connection")
    }
    ...
}


The beauty of the code above is that each of these errors is completely namespaced by its respective function, is informative, and only handles responsibility for what it is aware of. This sort of error chaining using fmt.Errorf("something went wrong: %v", err) makes it trivial to build awesome error messages that can tell you exactly what went wrong based on how you defined it.

On top of this, if you want to also attach a stack trace to your functions, you can utilize the fantastic pkg/errors library, giving you functions such as:

errors.Wrapf(err, "could not save user with email %s", email)

which print out a stack trace along with the human-readable error chain you created through your code. If I could summarize the most important pieces of advice I’ve received regarding writing idiomatic error handling in Go:

Add stack traces when your errors are actionable to developers

Do something with your returned errors, don’t just bubble them up to main, log them, and forget them

Keep your error chains unambiguous

When I write Go code, error handling is the one thing I never worry about, because errors themselves are a central aspect of every function I write, giving me full control in how I handle them safely, in a readable manner, and responsibly.

“if …; err != nil” is something you’ll probably type if you write go. I don’t think it’s a plus or a negative. It gets the job done, it’s easy to understand, and it empowers the programmer to do the right thing when the program fails. The rest is up to you.

- From Hacker News

(read more)

GNU: A Heuristic for Bad Cryptography

Words: asymptotically - - 18:41 08-07-2020

If you see the letters GNU in a systems design, and that system intersects with cryptography, I can almost guarantee that it will be badly designed to an alarming degree.

This is as true of GnuPG (and PGP in general) as it is of designs like the proposed GNU Name System (IETF draft) and cryptographic libraries like GnuTLS and libgcrypt. In fact, I cannot recall a single GNU-branded cryptography project that isn’t a roaring dumpster fire.

The GNS (GNU Name System) uses an unconventional construction for zones:

A zone in GNS is defined by a public/private ECDSA key pair (d,zk), where d is the private key and zk the corresponding public key. GNS employs the curve parameters of the twisted edwards representation of Curve25519 [RFC7748] (a.k.a. edwards25519) with the ECDSA scheme ([RFC6979]).

This is beyond weird: Going out of your way to use the edwards25519 curve from RFC 7748, but not use the Ed25519 signature algorithm, but still choosing to use deterministic ECDSA (RFC 6979). If you’re lost, I wrote about digital signature algorithms in a previous blog post.

The authors acknowledge the unconventional nature of their design choice in section 9.1 of the RFC draft:

GNS uses ECDSA over Curve25519. This is an unconventional choice, as ECDSA is usually used with other curves. However, traditional ECDSA curves are problematic for a range of reasons described in the Curve25519 and EdDSA papers. Using EdDSA directly is also not possible, as a hash function is used on the private key which destroys the linearity that the GNU Name System depends upon. We are not aware of anyone suggesting that using Curve25519 instead of another common curve of similar size would lower the security of ECDSA. GNS uses 256-bit curves because that way the encoded (public) keys fit into a single DNS label, which is good for usability.

The bold statement (my emphasis) is nonsense: In any design that uses digital signature algorithms, your system should map a private key (some opaque byte string) to a public key (some other opaque byte string) and signatures should also be opaque byte strings. The inclusion of a hash function under the hood of the signature algorithm is a moot point, especially since RFC 6979 also uses HMAC-SHA2 to generate deterministic nonces, thereby rendering their choice of RFC 6979 a contradiction of their stated goal.

Using Ed25519 with a 32-byte private key (instead of a 64-byte private key) is also trivial. To wit: Libsodium offers crypto_sign_seed_keypair() for this purpose.

But even worse: ECDSA is less secure and slower than EdDSA, even when you use the same curves, due to how the algorithms are implemented. The authors of the RFC do not defend this design choice beyond this hash function non sequitur.

I can’t be the only one feeling this way right now. Art by Khia.

After I initially posted this, Redditor Steve132 informed me that I overlooked the reason they made this design decision.

Take a look at Section 6.1

From here, the following steps are recursively executed, in order: Extract the right-most label from the name to look up. Calculate q using the label and zk as defined in Section 4.1.

So then if you go to section 4.1, they do h=H(<address string>), (r,R) is some root keypair, then they do C (a child public key), C=hR, then q=H(C).

the idea behind the calculation of q is to use the root public key to derive a child public key from ONLY the root public key, exploiting the linearity property that in elliptic curves, if bG=B, then (b+s)G=(sG+B)

This allows a third party to derive child public keys without any knowledge of the private keys for the root. This technique is also used in Bitcoin’s BIP32 for its ‘unhardened’ derivation scheme.

I fully admit, I didn’t absorb this detail in my first pass of the RFC draft. It wasn’t clearly spelled out in Section 9 (which aims to justify their cryptography decisions), and I didn’t read the other sections as carefully. This was my mistake.

However, my general point that this design choice is both unconventional and unnecessary still stands, because BIP32-Ed25519 already exists (albeit, it still needs a carefully designed implementation to be secure against active attackers). The GNU Name System developers didn’t need to roll their own.

Furthermore, trying to push through an implementation of ECDSA over edwards25519 isn’t just unnecessary and weird, it’s also dangerous:

While I don't agree that ECDSA is worse than Ed25519 – both have pros and cons — it takes courage to implement ECDSA over Edward25519. Do you know if they published any code? This unfortunate marriage may introduce fun and unique bugs

— thaidn (@XorNinja) July 9, 2020

Thai Duong–author of the BEAST attack against SSL/TLS, among other things

Of course, all cryptography development can be said to be dangerous, but there are other problems fundamental to their design.

The GNU Name System project doesn’t stop there. It further throws IND-CCA2 security out the window and specifies encrypting with AES and TwoFish in a cipher cascade, using Cipher Feedback (CFB) mode.

The authors do not even attempt to defend this decision. I sincerely doubt they’ve heard the words “adaptive chosen-ciphertext attack” in the course of their self-study.

Because, y’know, attackers will surely never be able to replay UDP traffic if a runtime exception occurs because of corrupted data.

Cipher cascades are usually the result of “we want to defend against a backdoored or broken cipher”. Bear in mind, the cipher itself is rarely the first part of a cryptosystem to be broken.

TwoFish isn’t the worst choice of a cascade partner for AES, but I’d prefer a design that employed a different paradigm (since AES is a SPN permutation block cipher, an ARX-based stream cipher like Salsa20 or ChaCha seems reasonable).

AES is a boring choice, because it’s the industry standard. I’m not particularly fond of AES (due to it not being fast and constant-time in pure software implementations), but if you use it in an authenticated mode (AES-GCM, AES-CCM, AES-EAX, AES-OCB3, … I dunno, Poly1305-AES? Just use an AEAD mode!), it’s fine.

Cipher Feedback (CFB) mode is not an authenticated mode.

If you’re publishing a cryptography design in 2020 that fails the Cryptographic Doom Principle, you need to go back to the drawing board.


Art by Swizz.

If you want to learn about why GnuPG (and the PGP ecosystem in general) is terrible, I recommend Latacora’s takedown.

GnuTLS is an SSL/TLS library created by the same people who created (and then abandoned) libmcrypt, which was the scourge of bad cryptography in the PHP ecosystem for many years (until it was finally excised in PHP 7.2). Consequently, the project’s CVE history should be no surprise.

Quick story: Many years ago, a few timing attacks were discovered in libgcrypt by regular chatters in Freenode’s ##crypto channel. This led a lot of us to look at libgcrypt for more bugs.

The general consensus of the ensuing IRC discussion was, roughly, “We probably shouldn’t try to fix them all, because a) that’s way too much effort because there’s too much badness and b) this library will be a ripe target for upcoming cryptanalysis researchers to get their first papers published for many years”. And, indeed, the attack papers that have come out over the years that affect libgcrypt haven’t disappointed.

To be clear, at the time this happened, I was garbage at writing C (and somehow even less confident than capable) and barely making ends meet, so “drop everything and volunteer to fix all the libgcrypt badness” wasn’t a tenable option for me. And since the world is largely moving away from GnuPG and libgcrypt, it honestly isn’t worth the effort trying to fix all the bad when an easier fix is “use something good instead”.

If you see the letters GNU anywhere in a project that intersects with cryptography–except for its public license–it’s almost certainly an error-prone cryptographic design.

Or, as my friend Kye calls it:

The Dunning-GNUger Effect.

— Kye Fox (@KyeFox) July 8, 2020

To replace GPG, you want age and minisign.

To replace GnuTLS or libgcrypt, depending on what you’re using it for, you want one of the following: s2n, OpenSSL/LibreSSL, or Libsodium.

For embedded systems, BearSSL and libhydrogen are both good options.

Header image, like the GnuNet logo found here, is available under the Creative Commons Attribution-Share Alike 4.0 International license.


(read more)