The Citus: Distributed PostgreSQL for Data-Intensive Applications paper can be downloaded here. Recently, our team got a request to provide a solution to shard Postgres. One of the solutions that we discussed was Citus. I had heard about the product and seen their blogs related to Postgres in the past but never used it. I thought it would be fun to read about its internal workings. If you find something wrong in the notes, please send a pull request. Before digging into their white paper, let’s take a step back and ask: what is sharding, and why do we need it? Certainly, I haven’t used the word “shard” in my day-to-day life. There are two ways one can scale a system:

Vertical Scaling - Acquiring more resources (e.g. CPU, memory, disk) on the same hardware
Horizontal Scaling - Acquiring more resources by adding additional hardware

Sharding comes from the concept of horizontal scaling. Say the maximum disk space of the servers you have is 11 TB. What if we want to store more than that in a single table/database? The traditional approach is vertical scaling, i.e. adding more disks to the server, but at some point that hits the ceiling. Then there is nothing one can do other than grow horizontally. A good introduction to database sharding from Digital Ocean. What is Citus? It is a PostgreSQL extension to store and query data (including transactions) across a cluster of PostgreSQL servers. Open sourced [1]. Postgres core itself doesn’t come with features for horizontal scaling. Postgres’ wiki on sharding and Gitlab’s experiment using FDW are good resources. Alternate approaches are:

Build the database engine from scratch and write a layer to provide over-the-wire SQL compatibility - YugaByte, Cockroachdb etc.
Fork an open source database system and build new features on top of it - Orioledb, Neondatabase
Provide new features through a layer that sits between the application and database, as middleware - ShardingSphere

I couldn’t find a lot of options for horizontally scaling Postgres. Looks like many agree. MySQL has Vitess. The types of applications that require distributed Postgres are broadly divided into four categories: Mul
(read more)
Introduction In this two-part tutorial we will learn how to build a speech-controlled robot using the Tensil open source machine learning (ML) acceleration framework and the Digilent Arty A7-100T FPGA board. At the heart of this robot is an ML model for speech recognition. We will learn how the Tensil framework enables ML inference to be tightly integrated with digital signal processing in the resource-constrained environment of a mid-range Xilinx Artix-7 FPGA. Part I will focus on recognizing speech commands through a microphone. Part II will focus on translating commands into robot behavior and integrating with the mechanical platform. Let’s start by specifying what commands we want the robot to understand. To keep the mechanical platform simple (and inexpensive) we will build on a wheeled chassis with two engines. The robot will recognize directives to move forward in a straight line (go!), turn in place clockwise (right!) and counterclockwise (left!), and turn the engines off (stop!).
(read more)
I’ve been thinking a lot recently about Single-Page Apps (SPAs) and Multi-Page Apps (MPAs). I’ve been thinking about how MPAs have improved over the years, and where SPAs still have an edge. I’ve been thinking about how complexity creeps into software, and why a developer may choose a more complex but powerful technology at the expense of a simpler but less capable technology. I think this core dilemma – complexity vs simplicity, capability vs maintainability – is at the heart of a lot of the debates about web app architecture. Unfortunately, these debates are so often tied up in other factors (a kind of web dev culture war, Twitter-stoked conflicts, maybe even a generational gap) that it can be hard to see clearly what the debate is even about. At the risk of grossly oversimplifying things, I propose that the core of the debate can be summed up by these truisms: The best SPA is better than the best MPA. The average SPA is worse than the average MPA. The first statement should be clear to most seasoned web developers. Show me an MPA, and I can show you how to make it better with JavaScript. Added too much JavaScript? I can show you some clever ways to minimize, defer, and multi-thread that JavaScript. Ran into some bugs, because now you’ve deviated from the browser’s built-in behavior? There are always ways to fix it! You’ve got JavaScript. Whereas with an MPA, you are delegating some responsibility to the browser. Want to animate navigations between pages? You can’t (yet). Want to avoid the flash of white? You can’t, until Chrome fixes it (and it’s not perfect yet). Want to avoid re-rendering the whole page, when there’s only a small subset that actually needs to change? You can’t; it’s a “full page refresh.” My second truism may be more controversial than the first. But I think time and experience have shown that, whatever the promises of SPAs, the reality has been less convincing. It’s not hard to find examples of poorly-built SPAs that score badly on a variety of metrics (performance, accessibility, reliability), and which could have been built better and more cheaply as a bog-standard MPA. Example: subsequent navigations To illustrate, let’s consider one of the main value propositions of an SPA: making subsequent navigations faster. Rich Harris recently offered an example of using the SvelteKit website (SPA) compared to the Astro website (MPA), showing that page navigations on the Svelte site were faster. Now, to be clear, this is a bit of an unfair comparison: the Svelte site is preloading content when you hover over links, so there’s no network call by the time you click. (Nice optimization!) Whereas the Astro site is not using a Service Worker or other offlining – if you throttle to 3G, it’s even slower relative to the Svelte site. But I totally believe Rich is right! Even with a Service Worker, Astro would have a hard time beating SvelteKit. The amount of DOM being updated here is small and static, and doing the minimal updates in JavaScript should be faster than asking the browser to re-render the full HTML. It’s hard to beat element.innerHTML = '...'. However, in many ways this site represents the ideal conditions for an SPA navigation: it’s small, it’s lightweight, it’s built by the kind of experts who build their own JavaScript framework, and those experts are also keen to get performance right – since this website is, in part, a showcase for the framework they’re offering. What about real-world websites that aren’t built by JavaScript framework authors? 
Anthony Ricaud recently gave a talk (in French – apologies to non-Francophones) where he analyzed the performance of real-world SPAs. In the talk, he asks: What if these sites used standard MPA navigations? To answer this, he built a proxy that strips the site of its first-party JavaScript (leaving the kinds of ads and trackers that, sadly, many teams are not allowed to forgo), as well as another version of the proxy that doesn’t strip any JavaScript. Then, he scripted WebPageTest to click an internal link, measuring the load times for both versions (on throttled 4G). So which was faster? Well, out of the three sites he tested, on both mobile (Moto G4) and desktop, the MPA was either just as fast or faster, every time. In some cases, the WebPageTest filmstrips even showed that the MPA version was faster by several seconds. (Note again: these are subsequent navigations.) On top of that, the MPA sites gave immediate feedback to the user when clicking – showing a loading indicator in the browser chrome. Whereas some of the SPAs didn’t even manage to show a “skeleton” screen before the MPA had already finished loading. Screenshot from Anthony Ricaud’s talk. The SPA version is on top (5.5s), and the MPA version is on bottom (2.5s). Now, I don’t think this experiment is perfect. As Anthony admits, removing inline scripts removes some third-party JavaScript as well (the kind that injects itself into the DOM). Also, removing first-party JavaScript removes some non-SPA-related JavaScript that you’d need to make the site interactive, and removing any render-blocking inline scripts would inherently improve the visual completeness time. Even with a perfect experiment, there are a lot of variables that could change the outcome for other sites: How fast is the SSR? Is the HTML streamed? How much of the DOM needs to be updated? Is a network request required at all? What JavaScript framework is being used? How fast is the client CPU? Etc. Still, it’s pretty gobsmacking that JavaScript was slowing these sites down, even in the one case (subsequent navigations) where JavaScript should be making things faster. Exhausted developers and clever developers Now, let’s return to my truisms from the start of the post: The best SPA is better than the best MPA. The average SPA is worse than the average MPA. The cause of so much debate, I think, is that two groups of developers may look at this situation, agree on the facts on the ground, but come to two different conclusions: “The average SPA sucks? Well okay, I should stop building SPAs then. Problem solved.” – Exhausted developer “The average SPA sucks? That’s just because people haven’t tried hard enough! I can think of 10 ways to fix it.” – Clever developer Let’s call these two archetypes the exhausted developer and the clever developer. The exhausted developer has had enough with managing the complexity of “modern” web sites and web applications. Too many build tools, too many code paths, too much to think about and maintain. They have JavaScript fatigue. Throw it all away and simplify! The clever developer is similarly frustrated by the state of modern web development. But they also deeply understand how the web works. So when a tool breaks or a framework does something in a sub-optimal way, it irks them, because they can think of a better way. Why can’t a framework or a tool fix this problem? So they set out to find a new tool, or to build it themselves. The thing is, I think both of these perspectives are right. Clever developers can always improve upon the status quo.
Exhausted developers can always save time and effort by simplifying. And one group can even help the other: for instance, maybe Parcel is approachable for those exhausted by Webpack, but a clever developer had to go and build Parcel first. Conclusion The disparity between the best and the average SPA has been around since the birth of SPAs. In the mid-2000s, people wanted to build SPAs because they saw how amazing GMail was. What they didn’t consider is that Google had a crack team of experts monitoring every possible problem with SPAs, right down to esoteric topics like memory leaks. (Do you have a team like that?) Ever since then, JavaScript framework and tooling authors have been trying to democratize SPA tooling, bringing us the kinds of optimizations previously only available to the Googles and the Facebooks of the world. Their intentions have been admirable (I would put my own fuite on that pile), but I think it’s fair to say the results have been mixed. An expert dev
(read more)
As some of you may know, on May 4th Jack Huey opened a PR to stabilize an initial version of generic associated types. The current version is at best an MVP: the compiler support is limited, resulting in unnecessary errors, and the syntax is limited, making code that uses GATs much more verbose than I’d like. Nonetheless, I’m super excited, since GATs unlock a lot of interesting use cases, and we can continue to smooth out the rough edges over time. However, folks on the thread have raised some strong concerns about GAT stabilization, including asking whether GATs are worth including in the language at all. The fear is that they make Rust the language too complex, and that it would be better to just use them as an internal building block for other, more accessible features (like async functions and return position impl trait in traits, RPITIT). In response to this concern, a number of people have posted about how they are using GATs. I recently took some time to deep dive into these comments and to write about some of the patterns that I found there, including a pattern I am calling the “many modes” pattern, which comes from the chumsky parser combinator library. I posted about this pattern on the thread, but I thought I would cross-post my write-up here to the blog as well, because I think it’s of general interest. General thoughts from reading the examples I’ve been going through the (many, many) examples that people have posted where they are relying on GATs, looking at them in a bit more detail. A few interesting things jumped out at me as I read through the examples: Many of the use-cases involve GATs with type parameters. There has been some discussion of stabilizing “lifetime-only” GATs, but I don’t think that makes sense from any angle. It’s more complex for the implementation and, I think, more confusing for the user. But also, given that the “workaround” for not having GATs tends to be higher-ranked trait bounds (HRTB), and given that those only work for lifetimes, it means we’re losing one of the primary benefits of GATs in practice (note that I do expect to get HRTB for types in the near-ish fut
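To ground the discussion, here is a minimal sketch of the shape of that “many modes” pattern (my illustration, not code from the thread): a trait with an associated type that is generic over a type parameter, which is precisely what lifetime-only GATs and HRTB-over-lifetimes cannot express. It should compile on a GAT-enabled compiler.

    // One trait, many "modes": each mode decides what a computation outputs.
    trait Mode {
        type Output<T>;
        fn wrap<T>(value: T) -> Self::Output<T>;
    }

    /// Keep the computed value.
    struct Emit;
    impl Mode for Emit {
        type Output<T> = T;
        fn wrap<T>(value: T) -> T { value }
    }

    /// Only check validity; produce no value.
    struct Check;
    impl Mode for Check {
        type Output<T> = ();
        fn wrap<T>(_value: T) {}
    }

    fn main() {
        let kept: i32 = Emit::wrap(42);    // Output<i32> = i32
        let nothing: () = Check::wrap(42); // Output<i32> = ()
        println!("{kept} {nothing:?}");
    }

The same code can be written once, generic over the mode, and each instantiation pays only for what its mode needs.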
(read more)
The first concerted effort to support accessibility under Linux was undertaken by Sun Microsystems when they decided to use GNOME for Solaris. Sun put together a team focused on building the pieces to make GNOME 2 fully accessible and worked with hardware makers to make sure things like Braille devices worked well. I even heard claims that GNOME and Linux had the best accessibility of any operating system for a while due to this effort. As Sun started struggling and got acquired by Oracle this accessibility effort eventually trailed off with the community trying to pick up the sl
(read more)
09 Mar '19 emacs I’m an Emacs guy, and so if I’ve got some simple tabular data I’d much rather keep it in an org-mode table than have to fire up Excel. Here’s an example:

#+NAME: pap-table
| first name  | last name | yearly-income |
|-------------|-----------|---------------|
| Mr          | Bennett   |          2000 |
| Fitzwilliam | Darcy     |         10000 |
| Charles     | Bingley   |          5000 |

If the prospect of having to keep all those | characters manually aligned is freaking you out, don’t worry—orgtbl-mode does it all for you automa
(read more)
Almost every open source project uses MySQL as its database. It is supported by all hosting providers, is easy to administer, and free. However, MySQL servers often face performance issues, leading many websites to look for alternate high performance databases. Percona Server started gaining popularity in 2013 as a high performance, high availability alternative to MySQL with features comparable to the MySQL Enterprise version. Today, we’ll see how Percona differs from MySQL, and if you should choose Percona for your website. We support several websites and web hosting providers on whose systems we’ve replaced MySQL with Percona. Here are the top reasons why we used Percona on those servers.

MySQL needs a lot of memory – Percona doesn’t
MySQL with the MyISAM engine can be a memory hog. The reason for this lies in the way MySQL stores data. Even for a small data size, MySQL assigns a fixed storage size in memory. As the queries on a database increase, the required memory increases. With high memory usage, more disk-based swap is used, leading to high I/O wait. Percona, on the other hand, uses something called “Dynamic row format”, where data fields are given just the memory they need. This reduces the overall memory usage, which translates to fewer I/O bottlenecks and fewer instances of high server load.

Percona executes queries in parallel – MySQL doesn’t
When MySQL uses the MyISAM storage engine and executes a query, it locks all the tables needed by that query, so that data is not modified by other queries. This leads to other queries waiting in a queue for the lock to be released, and causes significant delays when the query volume is high. Percona avoids this issue by locking only a single row (aka fine-grained locking) when executing queries. Further, it uses a technology called “Binary log group commit” where multiple transactions can be written at the same time. Taken together, these two features allow fast execution of database transactions in multi-user environments. Things are different when MySQL uses InnoDB instead of MyISAM: there, MySQL supports the binary log group commit feature too.

Percona has diagnostics metrics for fast troubleshooting
There are many issues, such as slow queries or unoptimized tables, that can cause MySQL to become slow, or even crash. To troubleshoot these issues, many external tools s
(read more)
The open source Git project just released Git 2.37, with features and bug fixes from over 75 contributors, 20 of them new. We last caught up with you on the latest in Git back when 2.36 was released. To celebrate this most recent release, here’s GitHub’s look at some of the most interesting features and changes introduced since last time. Before we get into the details of Git 2.37.0, we first wanted to let you know that Git Merge is returning this September. The conference features talks, workshops, and more all about Git and the Git ecosystem. There is still time to submit a proposal to speak. We look forward to seeing you there! A new mechanism for pruning unreachable objects In Git, we often talk about classifying objects as either “reachable” or “unreachable”. An object is “reachable” when there is at least one reference (a branch or a tag) from which you can start an object walk (traversing from commits to their parents, from trees into their sub-trees, and so on) and end up at your destination. Similarly, an object is “unreachable” when no such reference exists. A Git repository needs all of its reachable objects to ensure that the repository is intact. But it is free to discard unreachable objects at any time. And it is often desirable to do just that, particularly when many unreachable objects have piled up, you’re running low on disk space, or similar. In fact, Git does this automatically when running garbage collection. But observant readers will notice the gc.pruneExpire configuration. This setting defines a “grace period” during which unreachable objects which are not yet old enough to be removed from the repository completely are left alone. This is done in order to mitigate a race condition where an unreachable object that is about to be deleted becomes reachable by some other process (like an incoming reference update or a push) before then being deleted, leaving the repository in a corrupt state. Setting a small, non-zero grace period makes it much less likely to encounter this race in practice. But it leads us to another problem: how do we keep track of the age of the unreachable objects which didn’t leave the repository? We can’t pack them together into a single packfile; since all objects in a pack share the same modification time, updating any object drags them all forward. Instead, prior to Git 2.37, each surviving unreachable object was written out as a loose object, and the mtime of the individual objects was used to store their age. This can lead to serious problems when there are many unreachable objects which are too new and can’t be pruned. Git 2.37 introduces a new concept, cruft packs, which allow unreachable objects to be stored together in a single packfile by writing the ages of individual objects in an auxiliary table stored in an *.mtimes file alongside the pack. While cruft packs don’t eliminate the data race we described earlier, in practice they can help make it much less likely by allowing repositories to prune with a much longer grace period, without worrying about the potential to create many loose objects. To try it out yourself, you can run: $ git gc --
(read more)
June 26, 2022 nullprogram.com/blog/2022/06/26/ Prompted by a 20 minute video, over the past month I’ve improved my debugger skills. I’d shamefully acquired a bad habit: avoiding a debugger until exhausting dumber, insufficient methods. My first choice should be a debugger, but I had allowed a bit of friction to dissuade me. With some thoughtful practice and deliberate effort clearing the path, my bad habit is finally broken — at least when a good debugger is available. It feels like I’ve leveled up and, like touch typing, this was a skill I’d neglected far too long. One friction point was the less-than-optimal assert feature in basically every programming language implementation. It ought to work better with debuggers. An assertion verifies a program invariant, and so if one fails then there’s undoubtedly a defect in the program. In other words, assertions make programs more sensitive to defects, allowing problems to be caught more quickly and accurately. Counter-intuitively, crashing early and often makes for more robust and reliable software in the long run. For exactly this reason, assertions go especially well with fuzzing.

assert(i >= 0 && i < len);   // bounds check
assert((ssize_t)size >= 0);  // suspicious size_t
assert(cur->next != cur);    // circular reference?

They’re sometimes abused for error handling, which is a reason they’ve also been (wrongfully) discouraged at times. For example, failing to open a file is an error, not a defect, so an assertion is inappropriate. Normal programs have implicit assertions all over, even if we don’t usually think of them as assertions. In some cases they’re checked by the hardware. Examples of implicit assertion failures:

Out-of-bounds indexing
Dereferencing null/nil/None
Dividing by zero
Certain kinds of integer overflow (e.g. -ftrapv)

Programs are generally not intended to recover from these situations because, had they been anticipated, the invalid operation wouldn’t have been attempted in the first place. The program simply crashes because there’s no better alternative. Sanitizers, including Address Sanitizer (ASan) and Undefined Behavior Sanitizer (UBSan), are in essence additional, implicit assertions, checking invariants that aren’t normally checked. Ideally a failing assertion should have these two effects:

Execution should immediately stop. The program is in an unknown state, so it’s neither safe to “clean up” nor attempt to recover. Additional execution will only make debugging more difficult, and may obscure the defect.

When run under a debugger — or visited as a core dump — it should break exactly at the failed assertion, ready for inspection. I should not need to dig around the call stack to figure out where the failure occurred. I certainly shouldn’t need to manually set a breakpoint and restart the program hoping to fail the assertion a second time. The whole reason for using a debugger is to save time, so if it’s wasting my time then it’s failing at its primary job.

I examined standard assert features across various language implementations, and none strictly meet the
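One way to get both properties on GCC and Clang is to make the assertion trap right at the failing expression. This is a minimal sketch in the spirit of the post (my macro, not necessarily the author's final recipe):

    // Debugger-friendly assert sketch for GCC/Clang: on failure, execution
    // stops immediately at the failing line via an illegal-instruction trap,
    // so a debugger or core dump breaks exactly here -- no unwinding, no
    // message formatting, no chance to obscure the defect.
    #ifdef DEBUG
    #  define ASSERT(c) do { if (!(c)) __builtin_trap(); } while (0)
    #else
    #  define ASSERT(c) ((void)0)
    #endif

(On MSVC, the __debugbreak() intrinsic plays the same role.)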
(read more)
OPA (Open Policy Agent) is a policy enforcement engine that can be used for a variety of purposes. OPA's access policies are written in a language called Rego. A CNCF-graduated project, it's been incorporated into a number of different products. You can see the list of adopters here. We chose OPA to enforce database access policies because of the flexibility it gives policy authors in writing policies, and because of its familiarity in the cloud-native ecosystem. OPA gives three options to enforce access policies: a Go library, a REST service, and WASM. The inspektor dataplane is written in Rust, so we cannot use the Go library to enforce policies in inspektor. For simplicity, we decided to use WASM to evaluate access p
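For readers who haven't seen Rego, here is a small hypothetical policy of the sort a database-access dataplane might evaluate (illustrative only; this is not Inspektor's actual policy schema):

    package dataplane.authz

    # Deny by default; allow analysts to read non-protected collections.
    default allow = false

    allow {
        input.user.role == "analyst"
        input.action == "read"
        not protected[input.collection]
    }

    # Set of collections that must never be exposed.
    protected := {"ssn", "credit_cards"}

A policy like this can be compiled to a WASM module (via opa build with the wasm target) that a Rust dataplane can load and evaluate.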
(read more)
Photo: Nick Fewings.Since publishing our post and video on APIs, I’ve talked with a few people on the topic, and one aspect that keeps coming up is the importanc
(read more)
2022-06-26 - Programming - Zig

Heap allocation failure is something that is hard or impossible to account for in every case in most programming languages. There are either hidden memory allocations that can’t be handled, or it’s seen as too inconvenient to handle every possible allocation failure, so the possibility is ignored. For example, when concatenating two strings with the + operator (where there is an implicit allocation that’s needed to store the result of the concatenation): In garbage collected languages like JavaScript, the possible failure of the hidden allocation can’t be handled by the user. In languages with exceptions like C++, it’s possible to catch e.g. std::
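By contrast, Zig makes the allocation explicit, and the possible failure is an error value the caller must do something with. A minimal sketch (mine, not from the post; assumes a recent Zig std, whose APIs do shift between releases):

    const std = @import("std");

    pub fn main() void {
        var gpa = std.heap.GeneralPurposeAllocator(.{}){};
        defer _ = gpa.deinit();
        const allocator = gpa.allocator();

        // Concatenation takes an allocator, so the failure is in the open:
        // `catch` must handle error.OutOfMemory (or `try` propagates it).
        const joined = std.mem.concat(allocator, u8, &.{ "foo", "bar" }) catch {
            std.debug.print("allocation failed\n", .{});
            return;
        };
        defer allocator.free(joined);

        std.debug.print("{s}\n", .{joined});
    }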
(read more)
It’s very common to have an element somewhere whose contents are the output of a markdown file. This can result in a flat structure of elements whose hierarchy has semantic meaning — the
(read more)
When programming in Python, I spend a large amount of time using IPython and its powerful interactive prompt, not just for some one-off calculations, but for significant chunks of actual programming and debugging. I use it especially for exploratory programming where I’m unsure of the APIs available to me, or what the state of the system will be at a particular point in the code. I’m not sure how widespread this method of working is, but I rarely hear other people talk about it, so I thought it would be worth sharing. Setup You normally need IPython installed into your current virtualenv for it to work properly:

pip install ipython

Methods There are basically two ways I open an IPython prompt. The first is by running it directly from a terminal:

$ ipython
Python 3.9.5 (default, Jul 1 2021, 11:45:58)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.3.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]:

In a Django project, ./manage.py shell can also be used if you have IPython installed, with the advantage that it will properly initialise Django for you. This works fine if you want to explore writing some “top level” code — for example, a new bit of functionality where the entry points have not been created yet. However, most code I write is not like that. Most of the time I find myself wanting to write code when I am already 10 levels of function calls down — for example:

I’m writing some view code in a Django application, which has a request object — an object you could not easily recreate if you started from scratch at an IPython prompt.
or, model layer code such as inside a save() method that is itself being called by some other code you have not written, like the Django admin or some signal.
or, inside a test, where the setup code has already created a whole bunch of things that are not available to you when you open IPython.

For these cases, I use the second method: Find the bit of code I want to modify, explore or debug. This will often be my own code, but could equally be a 3rd party library. I’m always working in a virtualenv, so even with 3rd party libraries
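The standard trick for this second method — and, I'd assume from context, where the post is heading — is IPython's embed function: drop a call into the code at the point of interest, run the program or test as usual, and an interactive prompt opens with all the local state in scope. A sketch (the surrounding view code is hypothetical; IPython.embed() itself is the real API):

    # Somewhere deep in a Django view, a save() method, or a test --
    # wherever you want to inspect live state.
    def my_view(request):
        context = build_context(request)  # hypothetical helper

        import IPython
        IPython.embed()  # opens a prompt here; `request` and `context` are in scope

        return render(request, "template.html", context)

When you exit the embedded prompt, execution continues normally from that point.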
(read more)
Abstract

Brainfuck is a very simple but Turing-complete programming language (hereafter referred to as 'BF' for the sake of brevity and propriety), consisting of only eight commands. These posts are a record of my attempt to build a simple processor capable of correctly executing BF programs directly/natively (ie: in hardware, ultimately running on an FPGA).

Why the hell would you do such a thing?

You've never met me or worked with me, have you? ;) Knowledge - this little mini-project ties together a whole bunch of different things that pique my interest, including:

hardware, at the diy-eeng level
digital circuit design (something I knew, and still know, very little about)
FPGAs / ASICs
MyHDL (python library for hardware simulation/design, converting software to FPGA/ASIC designs, etc)
iPython Notebook (ie: this format you're reading/in which all the software work was done)
etc

Fun - yes, my mind is warped enough by long years of programming, and I have more than enough OCD/other neuroses, that this kind of thing drops lots of dopamine into my brain :)

Who are you?

I'm sandbender on github and SandbenderCa on Twitter... Rudy X. Desjardins for those still lost in meatspace ;)

Enough preamble, where's the meat?

Right. Let's get the links out of the way first, do a quick
(read more)
Let’s say you want to check if localStorage is full before inserting an item: how would you do it? Well, there’s only one way browsers tell you if the storage is full: they throw an error (commonly referred to as QuotaExceededError) when you try to store an item that doesn’t fit in localStorage. So, to handle this specific use case, you must wrap localStorage.setItem in a try & catch to detect if there’s enough space in localStorage to store the item:

(function app() {
  try {
    localStorage.setItem(keyName, keyValue);
  } catch (err) {
  }
})();

Although this approach works, you should bear in mind that localStorage doesn’t throw only when there’s no available space. It also throws support errors (e.g., because the localStorage API is not supported in the browser) and security errors (e.g., because the localStorage API is being restricted when browsing in private mode in some browsers). To differentiate between such errors and the errors about quota, you can try to explicitly detect QuotaExceededError and behave accordingly:

function isQuotaExceededError(err: unknown): boolean {
  return (
    err instanceof DOMException &&
    (err.code === 22 ||
      err.code === 1014 ||
      err.name === "QuotaExceededError" ||
      err.name === "NS_ERROR_DOM_QUOTA_REACHED")
  );
}

(function app() {
  try {
    localStorage.setItem(keyName, keyValue);
  } catch (err) {
    if (isQuotaExceededError(err)) {
    } else {
    }
  }
})();

However, there’s an even better approach than checking the error type every time we store something in localStorage. You see, the only case where browsers throw non-quota-related errors is when the localStorage API is not supported. So, instead of accounting for them each time we invoke setItem, we can detect the localStorage availability (once) separately before we start using it. The complete snippet below is a slight variation of MDN’s Feature-detecting localStorage snippet. It can be used to check the support of any API implementing the Web Storage API — so it works on both localStorage and sessionStorage.

function isQuotaExceededError(err: unknown): boole
(read more)
In this post I explore a couple of new (to me) operators in jq's arsenal: JOIN and INDEX, based on an answer to a question that I came across on Stack Overflow. The answer was in response to a question (JQ: How to join arrays by key?) about how to merge two arrays of related information. I found it interesting and it also introduced me to a couple of operators in jq that I'd hitherto not come across. There's a section in the manual titled SQL-Style Operators that describes them. I could have sworn I'd never seen this section before, so had instead looked to see if they were defined in the builtin.jq file, where jq functions, filters and operators are defined ... in jq. I did come across them there, and their definitions helped me understand them too. I thought I'd explore them in this blog post, "out loud", as it were.

Test data

Throughout this post I'm going to use the data described in the Stack Overflow question, which (after a bit of tidying up) looks like this (and which I've put into a file called data.json):

{
  "weights": [
    { "name": "apple", "weight": 200 },
    { "name": "tomato", "weight": 100 }
  ],
  "categories": [
    { "name": "apple", "category": "fruit" },
    { "name": "tomato", "category": "vegetable" }
  ]
}

Starting with the definitions in builtin.jq

I want to start by staring at the definitions of the two operators in builtin.jq. Here's the section of code, with a few empty lines added for readability:

def INDEX(stream; idx_expr): reduce stream as $row ({}; .[$row|idx_expr|tostring] = $row);

def INDEX(idx_expr): INDEX(.[]; idx_expr);

def JOIN($idx; idx_expr): [.[] | [., $idx[idx_expr]]];

def JOIN($idx; stream; idx_expr): stream | [., $idx[idx_expr]];

def JOIN($idx; stream; idx_expr; join_expr): stream | [., $idx[idx_expr]] | join_expr;

The first thing I see is that there are multiple definitions of both INDEX and JOIN, each with different numbers of parameters. In various discussions, I've seen this represented in the way folks refer to them. For example, there are three definitions of JOIN, one with two parameters (JOIN($idx; idx_expr)), one with three (JOIN($idx; stream; idx_expr)) and one with four (
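Staring at that INDEX definition, we can already predict what it does: reduce the stream of rows into one object keyed by the index expression. A quick check against the test data (my example, not from the post):

    $ jq 'INDEX(.weights[]; .name)' data.json
    {
      "apple": {
        "name": "apple",
        "weight": 200
      },
      "tomato": {
        "name": "tomato",
        "weight": 100
      }
    }

This follows directly from the reduce: each $row is stored under .[$row|.name|tostring].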
(read more)
Back in December 2014 I spent a lot of time hanging out in IRC. While spending time in various channels, I found myself wanting to share images with other users. Unlike Discord—which came on the scene the following year—IRC did not have the same support for easily sharing images inline; all images needed to be shared as links to somewhere on the internet. This posed a problem for any image that wasn't sourced directly from a website, such as screenshots or locally-produced images. These would first need to be uploaded to a service, like Imgur, before they could be linked in an IRC channel. Being a college student on winter break, my solution to this problem was to write my own image sharing service. What resulted was a program called drop. Humble beginnings drop (stylized as 滴)
(read more)
Computer programs organize bits and bytes into “data structures”. In software of any import, the data structures are usually more interesting than the code around them. This part of the Quamina Diary takes a close look at a very simple data structure that I have greatly enjoyed using to build finite automata, and which I think has lessons to teach; it’s called smallTable. The problem · As described in a (pretty short) previous episode of this Diary, Quamina matches Patterns to Events, which are flattened into field-name/value pairs for the purpose. We try to match field names and values from Events to those offered in a Pattern. Matching names is easy, they’re immutable strings both in Events and Patterns, and thus the following suffices. tran
(read more)
multi-gitter allows you to make changes in multiple repositories simultaneously. This is achieved by running a script or program in the context of multiple repositories. If any changes are made, a pull request is created that can be merged manually by the set reviewers, or automatically by multi-gitter when CI pipelines have completed successfully. Are you a bash-guru or simply prefer your scripting in Node.js? It doesn't matter, since multi-gitter supports any type of script or program. If you can script it to run in one place, you can run it in all your repositories with one command! Some examples:

Syncing a file (like a PR-template)
Programmatic refactoring
Updating a dependency
Automatically fixing linting issues
Search and replace
Anything else you are able to script!

Demo

Example

Run with file

$ multi-gitter run ./my-script.sh -O my-org -m "Commit message" -B branch-name

Make sure the script has execution permissions before running it (chmod +x ./my-script.sh)

Run code through interpreter

If you are running an interpreted language or similar, it's important to specify the path as an absolute value (since the script will be run in the context of each re
(read more)
I'm overdue to write up some of my vague electric vehicle conversion project, so here's a small part of it. Note: This post is a journey through the awkward process of reverse engineering. If you're here to find out what CAN messages control a BMW F series GWS module, skip this post and read the later parts. Shifting Gears My goal was to find a gear selector to suit a car converted to an EV-style "fixed reduction gear" gearbox. This is basically an "automatic transmission" but without any different gear ratios, only a fixed gear ratio and a motor that drives forward and reverse: (That motor and reduction gear is from a Mitsubishi Outlander PHEV.) Most older auto transmission gear selectors aren't great for this because they depend on a mechanical linkage to the transmission. On newer models the physical linkage may only be to get in and out of Park Lock, but on older models it can be how every "gear" is selected (Neutral, Drive, Reverse, etc.). After looking at a few options that wouldn't fit, I stumbled across these really nice looking gear selectors from BMW: (BMW calls this a "GWS", please comment if you know what this stands for in German.) They are fully electronic, basically a glorified computer joystick with electro-mechanical features to unlock and re-lock manual shift mode, and even to move out of that mode. BMW North America made this video about how to use them: Many of the "F" series model code BMWs (2008 and newer), plus some "G" series, have a version of this gear selector but the specifics vary. The particular one I have is part number "GW 9 296 899-01"/"100999952-00": It's from an F20 model: a 2014 BMW 125i LCI that was crashed. The person parting
(read more)
Over the past year and a half, we have made a number of improvements to MPL. In addition to various bug-fixes and performance improvements, there are three major changes: entanglement detection, CGC-chaining, and a new block allocator. Entanglement detection In v0.3, the MPL runtime system now dynamically monitors reads and writes to check for entanglement. If entanglement is detected, then MPL terminates the program with an error message. Entanglement detection ensures that MPL is safe for all programs: if it type-checks, then it is safe to run. Specifically, when you run a program, MPL v0.3 guarantees that either (a) the program runs to completion in a disentangled manner, or (b) the program terminates with an entanglement error. The entanglement detector is precise: no false alarms! With entanglement detection, you can now safely use in-place updates to improve efficiency. You can even use in-place updates in a non-deterministic manner, which is essential for good performance in some cases. In particular, many well-known parallel programming techniques improve efficiency by interleaving atomic in-place updates and accesses to shared memory. These techniques are inherently non-deterministic, but desirable for performance optimization. A key advantage of disentanglement (in comparison to purely functional programming, where in-place updates are disallowed) is that it allows for in-place updates to be utilized in this manner. For example, with v0.3, you can safely use atomic
(read more)
Take a number, square it; the result is non-negative, because positive * positive = positive, negative * negative is positive, and $0^2 = 0$. But someone wanted to take square roots of negative numbers, so they did, and called it 'i': $\sqrt{-1} = i$. A lot of people were frustrated upon learning this: "You can't do that!", "How do you know that it doesn't lead to contradictions?" The solution, to put imaginary and complex numbers on a solid foundation, is something called a ring quotient. What you do is you start with the ring (meaning number system) of polynomials ov
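Since the excerpt cuts off right at the construction, here is the standard statement it is heading toward (my summary of textbook material): take the polynomial ring over the reals and quotient by the ideal generated by $x^2 + 1$; the class of $x$ then behaves exactly like $i$.

    % The complex numbers as a ring quotient: adjoin a root of x^2 + 1.
    \[
      \mathbb{C} \;\cong\; \mathbb{R}[x] \big/ \left( x^2 + 1 \right),
      \qquad
      i := \overline{x},
      \qquad
      i^2 = \overline{x}^{\,2} = \overline{x^2} = \overline{-1} = -1 .
    \]

And this answers the contradiction worry: the quotient of a commutative ring by an ideal is again a commutative ring, so every ring axiom holds automatically.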
(read more)
defaultable-map: An Applicative wrapper for Maps I’m announcing a small utility Haskell package I created that can wrap arbitrary Map-like types to provide Applicative and Alternative instances. You can find this package on Hackage here: defaultable-map: Applicative maps I can motivate why the Applicative and Alternative instances matter with a small example. Suppose that I define the following three Maps which are sort of like database tables: import Defaultable.Map firstNames :: Defaultable (Map Int) String firstNames = fromList [(0, "Gabriella"), (1,
(read more)
Arrays in several programming languages refer to lists of items. As others have said, those items could be any sorts of values, like names or in this case colors. Not sure of the exact logic going on here and I sure wouldn't be able to do it myself, but the sorting here is probably based on RGB values, which range from zero to two hundred fifty-five. If you have a bunch of these in an array, like so

["0,8,255","0,8,254",...]

you could use a bunch of different methodologies to sort that (or really any array). There's a myriad of sorting algorithms out there that vary by use case and efficiency and
(read more)
I had a bug in my OpenGL program. Here was the original code:

for (Orbitor o : orbitors)
{
    o.calculate_position();
}

and here was the working version:

for (std::vector<Orbitor>::iterator it = orbitors.begin();
     it != orbitors.end();
     ++it)
{
    it->calculate_position();
}

The bug was that the o.calculate_position(); call was supposed to update the internal state of the Orbitor structure, but was called on a copy of the instance in the original structure, and not on the original structure itself. Thus, when a later call tried to show the position, it was working with the version that had not updated the position first, and thus was showing the orbitors in the wrong position. The reason for this bug makes sense to me: the
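A simpler fix (my note, not from the post) keeps the range-based loop but takes the element by reference, so calculate_position mutates the object in the vector rather than a copy:

    // Taking a reference in the range-for avoids copying each Orbitor,
    // so the position update happens on the element itself.
    for (Orbitor& o : orbitors) {
        o.calculate_position();
    }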
(read more)
I consider myself to be pretty experienced in Rust. I’ve been using Rust in personal projects since late 2016 and professionally since early 2021. I’ve prepared internal presentations to teach coworkers about borrow checking and about async programming. However, even I still occasionally run into issues fighting the Rust compiler, and the following is a series of particularly unfortunate issues I ran into at work this week. Background Note: The following code examples are simplified in order to demonstrate the essence of the problem. Obviously, our actual code is a lot more complicated. We have an async trait Foo with a method update_all to process a list of items, which then calls the per-implementation update method on the trait. #[async_trait(?
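Reconstructing the shape described above (the trait and method names come from the prose; the bodies are my guesses, and per the post the real code is far more complicated):

    use async_trait::async_trait;

    #[async_trait]
    trait Foo {
        // Implemented per type.
        async fn update(&mut self, item: String);

        // Default method that drives the per-implementation `update`.
        async fn update_all(&mut self, items: Vec<String>) {
            for item in items {
                self.update(item).await;
            }
        }
    }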
(read more)
Interactive Redis: A CLI for Redis with AutoCompletion and Syntax Highlighting. IRedis is a terminal client for Redis with auto-completion and syntax highlighting. IRedis lets you type Redis commands smoothly, and displays results in a user-friendly format. IRedis is an alternative to redis-cli. In most cases, IRedis behaves exactly the same as redis-cli. Besides, it is safer to use IRedis on production servers than redis-cli: IRedis will prevent you from accidentally running dangerous commands, like KEYS * (see Redis docs / Latency generated by slow commands). Features Advanced code completion. If you run the command KEYS and then run DEL, IRedis will auto-complete your command based on the KEYS result. Command validation. IRedis will validate commands while you are typing, and h
(read more)
Moore’s Law is the famous prognostication by Intel co-founder Gordon Moore that the number of transistors on a microchip would double every year or two. This prediction has mostly been met or exceeded since the 1970s — computing power doubles about every two years, while better and faster microchips become less expensive. This rapid growth in computing power has fueled innovation for decades, yet in the early 21st century researchers began to sound alarm bells that Moore’s Law was slowing down. With standard silicon technology, there are physical limits to how small transistors can get and how many can be squeezed onto an affordable microchip. Neil Thompson, an MIT research scientist at the Computer Science and Artificial Intelligence
(read more)
Preface Threat modelling provides important context to security and privacy advice. Measures necessary to protect against an advanced threat are different from those effective against unsophisticated threats. Moreover, threats don’t always fall along a simple one-dimensional axis from “simple” to “advanced”. I appreciate seeing communities acknowledge this complexity. When qualifying privacy recommendations with context, I think we should go further than describing threat models: we should acknowledge different types of privacy. “Privacy” means different things to different people. Even a single person may use the word “privacy” differently depending on their situation. Understanding a user’s unique situation(s), including their threat models,
(read more)
I recently got the need to split quite a large number of audio files into smaller equal parts. The first thought that came to my mind was that probably a thousand or more people have had a similar problem in the past, so it's already solved – so I went directly to the web search engine. The solutions I found did not seem that great, worked only partially … or did not work like I expected them to. After looking at one of the possible solutions in a bash(1) script I started to modify it … but it turned out that writing my own solution was faster and easier … and simpler. Today I will share with you my solution to automatically split audio files into small equal parts. In my search for existing solutions I did find some tools that would allow me to achieve what I need. I will not try to t
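For reference, one common building block for this kind of script (my example — the excerpt doesn't say which tool the post's script ends up using) is ffmpeg's segment muxer, which cuts a file into fixed-length parts without re-encoding:

    # Split input.mp3 into 5-minute (300 s) parts, copying the streams as-is.
    ffmpeg -i input.mp3 -f segment -segment_time 300 -c copy 'part_%03d.mp3'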
(read more)
Background In the summer of 2022 I had the chance to explain, to my three interns who had never seen Go before, the Go programming patterns that I (and I assume others) use regularly in writing concurrent systems. Although many university courses will talk about threading, they do not necessarily discuss concurrency, rarely if ever cover Go, and rarely go in depth on polymorphism. This state of affairs is unfortunate: it means that although Go and plenty of other ‘current generation’ programming languages have been publicly available for years (Go since at least 2012), the present situation in many computer science programs is not very different from the state of affairs ten years ago. Fortunately, Go comes with a broad set of thorough documentation with plenty of examples, which is great, but wh
(read more)
Receive notifications whenever a new program connects to the network, or when it's modified
Monitors your bandwidth, breaking down traffic by executable, hash, parent, domain, port, or user over time
Can optionally check hashes or executables using VirusTotal
Executable hashes are cached based on device + inode for improved performance, and works with applications running inside containers
Uses BPF for accurate, low overhead bandwidth monitoring and fanotify to watch executables for modification
Since applications can call others to send/receive data for them, the parent executable and hash is also logged for each connection
Pragmatic and minimalist design focussing on accurate detection with clear error reporting when it isn't possible
AUR for Arch and derivatives
(read more)
The Plan 9 CD-ROM needed about 100MB for the full distribution, if that. We hatched a plan to fill up the rest with encoded music and include the software to decode it. (We wanted to get the encoder out too, but lawyers stood in the way. Keep reading.) Using connections I had with folks in the area, and some very helpful friends in the music business, I got permission to distribute several hours of existing recorded stuff from groups like the Residents and Wire. Lou Reed gave a couple of pieces too - he was very interested in Ken and Sean's work (which, it should be noted, was built on groundbreaking work done in the acoustics center at Bell Labs) and visited us to check it out. Debby Harry even recorded an original song for us in the studio. We had permission for all this of course, and
(read more)
From the May 1993 issue of Byte Magazine

The Product

Unix was originally created to run on minicomputers, but eventually moved on to other systems. One of the first companies outside of AT&T to sell Unix was named Interactive Systems Corporation. Interactive Systems Corporation was founded by Peter G. Weiner in 1977. (Weiner was Brian Kernighan’s Ph.D. advisor. Kernighan was a member of the Bell Labs group that created Unix.) ISC produced a number of Unix-based products, including a port of Unix to the IBM PC named PC/IX. PC/IX was followed in 1985 by 386/ix. While PC/IX was based on UNIX System III, 386/ix was based on UNIX System V, Release 3. Later, the operating system was renamed INTERACTIVE UNIX System V/386 and rebased on UNIX System V, Release 3.2. In 1988, the Eastman Kodak Company purchased ISC. Three year
(read more)
Git is hard: messing up is easy, and figuring out how to fix your mistakes is impossible. Git documentation has this chicken and egg problem where you can't search for how to get yourself out of a mess, unless you already know the name of the thing you need to know about in order to fix your problem. So here are some bad situations I've gotten myself into, and how I eventually got myself out of them in plain english.

Dangit, I did something terribly wrong, please tell me git has a magic time machine!?!

git reflog
# you will see a list of every thing you've
# done in git, across all branche
(read more)
Hey, I wrote a thing. Thing being a piece of software. I have a collection of photos & documents that I really care about. I synch them between computers using syncthing and also run backups regularly. What I didn’t have was a way to quickly detect bitrot.

Enter legdur

legdur is a simple CLI program to compute hashes of large sets of files in large directory structures and compare them with a previous snapshot. Think having your photo collection you acquired over time and worrying about bitrot.

Installation

cargo install legdur --force should get you there on a system that has Rust installed already.

Usage

legdur path/to/a/directory/

working:

legdur ~/documents
2022-06-25T06:45:51.000214Z INFO legdur: scanning '/home/cyryl/documents'
2022-06-25T06:45:51.044471Z INFO legdur: list of files acqui
(read more)
Self identifying hashes

Multihash is a protocol for differentiating outputs from various well-established cryptographic hash functions, addressing size + encoding considerations. It is useful to write applications that future-proof their use of hashes, and allow multiple hash functions to coexist. See jbenet/random-ideas#1 for a longer discussion.

Table of Contents

Example
Format
Implementations: Table for Multihash
Other Tables
Disclaimers
Visual Examples
Maintainers
Contribute
License

Example

Outputs of .encode(multihash(, )):

# sha1 - 0x11 - sha1("multihash")
111488c2f11fb2ce392acb5b2986e640211c4690073e  # sha1 in hex
CEKIRQXRD6ZM4OJKZNNSTBXGIAQRYRUQA47A====      # sha1 in base32
5dsgvJGnvAfiR3K6HCBc4hcokSfmjj                # sha1 in base58
ERSIwvEfss45KstbKYbmQCEcRpAHPg==              # sha1 in base64

# sha2-256 - 0x12 - sha2-256("multihash")
12209cbc07c3f991725836a3aa2a581ca2029198aa420b9d99bc0e131d9f3e2cbe47  # sha2-256 in hex
CIQJZPAHYP4ZC4SYG2R2UKSYDSRAFEMYVJBAXHMZXQHBGHM7HYWL4RY=              # sha256 in base32
QmYtUc4iTCbbfVSDNKvtQqrfyezPPnFvE33wFmutw9PBBk                        # sha256 in base58
EiCcvAfD+ZFyWDajqipYHKICkZiqQgudmbwOEx2fPiy+Rw==                      # sha256 in base64

Note: You should consider using multibase to base-encode these hashes instead of base-encoding them directly.

Format

Binary example (only 4 bytes for simplicity):

fn code  dig size  hash digest
-------- --------  -----------------------------------
00010001 00000100  10110110 11111000 01011100 10110101
sha1     4 bytes   4 byte sha1 digest

Why have digest size as a separate number? Because otherwise you end up with a function code really meaning "function-and-digest-size-code". Makes using custom digest sizes annoying, and is less flexible. Why
(read more)
I made it off the DALL-E waiting list a few days ago and I’ve been having an enormous amount of fun experimenting with it. Here are some notes on what I’ve learned so far (and a bunch of example images too). (For those not familiar with it, DALL-E is OpenAI’s advanced text-to-image generator: you feed it a prompt, it generates images. It’s extraordinarily good at it.) First, a warning: DALL-E only allows you to generate up to 50 images a day. I found this out only when I tried to generate image number 51. So there’s a budget to watch out for. I’ve usually run out by lunch time! How to use DALL-E DALL-E is even simpler to use than GPT-3: you get a text box to type in, and that’s it. There are no advanced settings to tweak. It does have one other mode: you can upload your own photo, crop it to a square and then erase portions of it and ask DALL-E to fill them in with a prompt. This feature is clearly still in the early stages—I’ve not had great results with it yet. DALL-E always returns six resulting images, which I believe it has selected as the “best” from hundreds of potential results. Tips on prompts DALL-E’s initial label suggests to “Start with a detailed description”. This is very good advice! The more detail you provide, the more interesting DALL-E gets. If you type “Pelican”, you’ll get an image that is indistinguishable from what you might get from something like Google Image search. But the more details you ask for, the more interesting and fun the result. Fun with pelicans Here’s “A ceramic pelican in a Mexican folk art style with a big cactus growing out of it”: Some of the most fun results you can have come from
(read more)
Why Chameleon?

Human-Centered Methods
We improve each feature in Chameleon based on feedback from the Haskell community. Debugging idioms in Chameleon have been tested individually and in combination.

Multi-location type errors
While many type systems try to pinpoint one exact error location in the code, the accuracy is often hit or miss. Chameleon tries to narrow the error down to a few suspects and asks the programmer to identify the real culprit. While both approaches have pros and cons, we believe Chameleon is more flexible and catches bugs faster.

Unbiased type errors
Instead of assuming one type is "Expected" and one type is "Actual", Chameleon will report two equally possible alternatives where type errors can happen. Many techniques have been proposed to solve this problem (known as left-right bias) at the type-solver level. Chameleon combines a type solver capable of eliminating this bias with smart visual cues to distinguish the evidence for one type from the other.

Deduction step
The deduction step is a tool to peek inside the type-checking engine. It shows step-by-step reasoning that explains, in simple language, why one type cannot reconcile with another. Chameleon's interactive interface allows users to make incremental assumptions and see how they affect the typing of the whole program.

More are coming! Hang tigh
(read more)
The Domain Name System (DNS) is the glue that holds the Internet together by providing the essential mapping from names to IP addresses. However, over time, the DNS has evolved into a complex and intricate protocol, spread across numerous RFCs. This has made it difficult to write efficient, high-throughput, multithreaded implementations that are bug-free and compliant with RFC specifications. To assist, my colleagues and I at the University of California, Los Angeles, and Microsoft developed Ferret: a tool that automatically finds RFC compliance bugs in DNS nameserver implementations. Key points:

Ferret uses the Small-scope Constraint-driven Automated Logical Execution (SCALE) approach to guide the search toward different logical behaviours.
Using Ferret, we identified 30 unique bugs across the DNS implementations we tested, including at least two bugs each in popular implementations like Bind, Knot, NSD, and PowerDNS.
One of the bugs is a critical vulnerability in Bind (high-severity rated CVE-2021-25215) that attackers could easily exploit to crash Bind DNS resolvers and nameservers remotely.
The SCALE approach could be extended to finding RFC compliance issues in other protocols.

Existing approaches are not suitable for RFC complian
(read more)
An STL-style container’s performance can be dramatically affected by minor changes to the underlying data structure’s invariants, which in turn can be dramatically constrained by the container’s chosen API. Since the Boost.Unordered performance project has put std::unordered_foo back in the spotlight, I thought this would be a good week to talk about my favorite little-known STL trivia tidbit: std::unordered_multiset’s decision to support .equal_range dramatically affects its performance! Background First some background on the original std::multiset, which was part of C++98’s STL. Both set and multiset are represented in memory as a sorted binary search tree (specifically, on all implementations I’m aware of, it’s a red-black tree); the only difference is that multiset is allowed to contain duplicates.

std::set s = {33,11,44,11,55,99,33};
std::multiset ms = {33,11,44,11,55,99,33};
assert(std::ranges::equal(s, std::array{11,33,44,55,99}));
assert(std::ranges::equal(ms, std::array{11,11,33,33,44,55,99}));

Because set and multiset are stored in sorted order, they have the “special skills” .lower_bound(key), .upper_bound(key), and .equal_range(key):

auto [lo, hi] = ms.equal_range(33);
assert(std::distance(lo, hi) == 2);
assert(std::count(lo, hi, 33) == 2);
assert(lo == ms.lower_bound(33));
assert(hi == ms.upper_bound(33));

When C++11 added std::unordered_set and std::unordered_multiset, part of the idea was that (API-wise) they should be drop-in replacements for the tree-based containers. The only difference is that each unordered container is represented in memory as a hash table: an array of “buckets,” each bucket being a linked list of elements with the same hash modulo bucket_count(). Since the order of the elements depends on the order of the buckets, and the order of the buckets depends on hashing, not less-than, the elements in an unordered container aren’t intrinsically stored in sorted order. On libc++, for example, I see this:

std::unordered_set us = {33,11,44,11,55,99,33};
std::unordered_multiset ums = {33,11,44,11,55,99,33};
assert(std::ranges::equal(us, std::array{55,99,44,11,33}));
assert(std::ranges::equal(ums, s
(read more)
This episode is all about the Lisp family of programming languages! Ever looked at Lisp and wondered why so many programmers gush about such a weird looking programming language style? What's with all those parentheses? Surely there must be something you get out of them for so many programming nerds to gush about the language! We do a light dive into Lisp's history, talk about what makes Lisp so powerful, and nerd out about the many, many kinds of Lisps out there!
Announcement: Christine is gonna give an intro-to-Scheme tutorial at our next Hack & Craft! Saturday July 2nd, 2022 at 20:00-22:00 ET! Come and learn some Scheme with us!
Links:
Various histories of Lisp: History of Lisp by John McCarthy, The Evolution of Lisp by Guy L. Steele and Richard P. Gabriel, History of LISP by Paul McJones
William Byrd's The Most Beautiful Program Ever Written demonstrates just how easy it is to write lisp in lisp, showing off the kernel of evaluation living at the heart of every modern programming language!
M-expressions (the original math-notation vision for users to operate with) vs S-expressions (the structure Lisp evaluators actually operate on, directly mirroring the typically, but not necessarily, parenthesized representation of the same).
Lisp-1 vs Lisp-2... well, rather than give a simple link and analysis, have a thorough one.
Lisp machines
MIT's CADR was the second iteration of the lisp machine, and the most influential on everything to come. Then everything split when two separate companies implemented it...
Lisp Machines, Incorporated (LMI), founded by famous hacker Richard Greenblatt, who aimed to keep the MIT AI Lab hacker culture alive by only hiring programmers part-time.
Symbolics was the other rival company. Took venture capital money, was a commercial success for quite a while.
These systems were very interesting; there's more to them than just the rivalry. But regarding that, the book Hackers (despite its issues) captures quite a bit about the AI lab before this and then its split, including a ton of Lisp history.
Some interesting things happening over at lisp-machine.org
The GNU manifesto mentions Lisp quite a bit, including that the plan was for the system to be mo
(read more)
Notice: the default Send host is provided by @timvisee (info). Please consider donating to help keep it running. Easily and securely share files from the command line. A Send client. Easily and securely share files and directories from the command line through a safe, private and encrypted link using a single simple command. Files are shared using the Send service and may be up to 1GB. Others are able to download these files with this tool, or through their web browser. No demo visible here? View it on asciinema. All files are always encrypted on the client, and secrets a
(read more)
If you keep an eye on German automotive companies, you will see a pattern. Their software is a huge mess and they have trouble finding and keeping developers, which leads to a high turnover rate that amplifies existing issues. In my opinion their failure is deeply rooted in the way these traditional companies are structured and how they see the role of developers. Actually solving the problem, e.g. through autonomous product teams, would require a massive change in company structure and culture. Note: This post might be a bit of a rant due to personal experience 😅 Traditional comp
(read more)
December 19, 2012 beta code language oop Over the weekend, I was reading one of the shagadelic papers on Self, Parents are Shared Parts of Objects: Inheritance and Encapsulation in SELF. What can I say, I have a weird idea of fun. If you’re interested in prototypes, or you’re a Javascripter—but I repeat myself—you owe it to yourself to read these papers. They are gems. But this post isn’t about prototypes, it’s about something the Self folks mention in passing: In BETA, virtual functions are invoked from least specific to most specific, with the keyword inner being used to invoke the next more specific method. This mechanism is a product of the philosophy in BETA that subclasses should be behavioral extensions to their superclasses and therefore specialize the behavior of their superclasses at well-defined points (i.e. at calls to inner). It took me a while to tease out what this is saying, but once I did, it was like a dim little light bulb flickered on in my head. What’s BETA? Before I get into the lightbulb part, a bit of history. BETA is a language that came out of the “Scandinavian School” in Denmark, the same people that brought you Simula and kicked off the object-oriented revolution. Alan Kay may have coined “object-oriented programming”, but it was Simula that gave him the idea. Chances are, the language you should be coding in right now instead of slacking off reading my blog was directly inspired by these guys. So after Simula, they went off and made BETA. I think this is more or less equivalent to “famous rock band goes into hiding for ten years and emerges with avant garde free jazz album”. BETA was used as a teaching language, I think, and there were some papers about it, but I don’t know if many people seriously used it in anger. (Trivia time! Some of the guys who made V8, the famously-fast JavaScript engine in Chrome, did use BETA. “V8” got its name because it’s the eighth virtual machine that Lars Bak created. His first VM? A BETA one.) Part of the reason BETA didn’t flourish may have to do with terminology. Instead of classes and methods, BETA has patterns which subsume both, somehow, and aren’t related to other uses of the term in other languages. I wrote that sentence, and I don’t even know what the hell that means. The BETA book is a bit… dense. Or maybe it’s
(read more)
One of the main pain points of using SQLite in production deployments or VMs is managing the database. There are lots of database GUIs, but they only work with local SQLite databases. Managing an SQLite database remotely requires:
Adding a new service to the deployment (like Adminer, sqlite-web or postlite)
Giving the new service permissions to access the volume with the database
Exposing a port to access the service
The alternative is usually SSH’ing to the remote VM and using the sqlite3 CLI to manage or explore the database. With this in mind, I decided to build a remote SQLite management GUI that does not require running any service in the remote VM and only needs an SSH connection between you and the remote machine. Turning a database into an API To access the remote database, we need some communication format between the app’s code and the database data. Luckily for us, the sqlite3 CLI has a -json flag that will turn the output into JSON. We can use that to send CLI commands over the SSH connection and read the JSON output as the response. One thing to note is that some pre-installed sqlite3 CLIs have not been compiled with the -json flag enabled. You may need to install/scp/compile another sqlite3 binary on the remote machine to do this.

ssh user@$SSH_HOST "sqlite3 -json chinook.sqlite3 'select * from Artist limit 10'"

The output looks like:

[{"ArtistId":1,"Name":"AC/DC"},
{"ArtistId":2,"Name":"Accept"},
{"ArtistId":3,"Name":"Aerosmith"},
{"ArtistId":4,"Name":"Alanis Morissette"},
{"ArtistId":5,"Name":"Alice In Chains"},
{"ArtistId":6,"Name":"Antônio Carlos Jobim"},
{"ArtistId":7,"Name":"Apocalyptica"},
{"ArtistId":8,"Name":"Audioslave"},
{"ArtistId":9,"Name"
(read more)
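Following up on the item above: the same trick is easy to drive from code. Here is a minimal sketch (my own illustration, not the tool described in the post) that runs sqlite3 -json over SSH from Python and parses the result; the user, host, and database path are placeholders.

import json
import subprocess

def remote_query(host, db_path, sql):
    # Run sqlite3 with -json on the remote machine over SSH and capture stdout.
    # Assumes the remote sqlite3 binary was compiled with JSON output support.
    result = subprocess.run(
        ["ssh", host, f'sqlite3 -json {db_path} "{sql}"'],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout) if result.stdout.strip() else []

rows = remote_query("user@example.com", "chinook.sqlite3",
                    "select * from Artist limit 10")
print(rows[0]["Name"])  # "AC/DC" with the Chinook sample database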
Hay lets you use the syntax of the Oil shell to declare data and interleaved code. It allows the shell to better serve its role as essential glue. For example, these systems all combine Unix processes in various ways:
local build systems (Ninja, CMake, Debian package builds, Docker/OCI builds)
remote build services (VM-based continuous integration like sourcehut, Github Actions)
local process supervisors (SysV init, systemd)
remote process supervisors / cluster managers (Slurm, Kubernetes)
Slogans: Hay Ain't YAML. It evaluates to JSON + Shell Scripts. We need a better control plane language for the cloud. Oil adds the missing declarative part to shell. This doc describes how to use Hay, with motivating examples. As of 2022, this is a new feature of Oil, and it needs user feedback. Nothing is set in stone, so you can influence the language and its features!
Example
Hay could be used to configure a hypothetical Linux package manager:

hay define Package/TASK

Package cpython {
  version = '3.9'
  url = 'https://python.org'

  TASK build {
    ./configure
    make
  }
}

This program evaluates to a JSON tree, which you can consume from programs in any language, including Oil:

{
  "type": "Package",
  "args": [ "cpython" ],
  "attrs": {
    "version": "3.9",
    "url": "https://python.org"
  },
  "children": [
    {
      "type": "TASK",
      "a
(read more)
This is a little story about standards, technology, civilisation, and the modern world. I know it is tempting to only talk about the various ways technology disappoints us, but sometimes it can be quite magical living in the future. A few weeks ago, I took a trip to a foreign country... I waved a rectangle of black-and-white squares in the vicinity of an optical scanner. The tiny computer's eye caught a fleeting glimpse of the barcode, de-skewed, rotated, and deciphered it - then checked its contents against a database. After a few milliseconds of deep thought, it opened the gates for me. I handed over my passport to the border guard. They verified the cryptographic signature embedded in a chip, nestled deep within the document. Seeing it was valid, they waved me through the border. I jumped on a train which sped 160 km/h underneath the sea! Although there's no WiFi 45m under the seabed, the rest of the time my various devices happily slurped up the bits from the æther. I emerged blinking into a new city. My phone immediately latched on to a dozen satellites 20,000 km above my head. Within a few seconds it had pinpointed me to within 10 metres. I knew where I was. I received a text on my phone - my wife had tracked my journey and knew I'd arrived safe and sound. My phone complies with all modern standards and frequencies; it spotted a 4G signal straight away. My SIM card did the usual authentication and negotiation dance with the local networks and they quickly granted me access. I have an IP address, therefore I am. An army of volunteers had already mapped the city - down to the last restaurant, bench, and fire-hydrant. I knew where I wanted to go, but not the quickest way. Luckily, several decades of route-finding algorithm research kicked in and presented me with walking options. The city's public transport timetables were all in a standardised format, so I was also able to see which buses and trams I could catch. The live display showed me the next bus was snagged in traffic a couple of streets away and wouldn't arrive for a while. It was late, and I didn't fancy walking through an unfamiliar city. I know my locked phone is useless to a thief - unless they force me to unlock it - and can easily be tracked if stolen. But I could do without the inconvenience. So I opened up my taxi app - the same one I use at home - and a car arrived within a few minutes to take me to my hotel. While waiting, I noticed a warning sign affixed to a lamppost. I don't speak the language, sadly. But I held my phone's camera up to it, and an instant translation appeared. It warned me that pickpockets operated in that area so I should keep my valuables hidden. I quickly put my phone in my pocket. The conversation with the taxi driver was a little stilted. English was his 4th language, and none of the other 3 were ones I was conversant in. But voice-to-text-to-foreign-language worked well enough on the phone to have a pleasant conversation. I'd like to say checking-in to the hotel was a magical experience where they recognised my iris prints and whisked me off to my room. But hotels are so 20th century! At least they gave me an RFID token to unlock my door - no magnetic st
(read more)
As I was getting ready to write this post I spent some time thinking about some of the coding tools that I have used over the course of my career. This includes the line-oriented editor that was an intrinsic part of the BASIC interpreter that I used in junior high school, the IBM keypunch that I used when I started college, various flavors of Emacs, and Visual Studio. The earliest editors were quite utilitarian, and grew in sophistication as CPU power became more plentiful. At first this increasing sophistication took the form of lexical assistance, such as dynamic completion of partially-entered variable and function names. Later editors were able to parse source code, and to offer assistance based on syntax and data types — Visual Studio‘s IntelliSense, for example. Each of these features broke new ground at the time, and each one had the same basic goal: to help developers to write better code while reducing routine and repetitive work. Announcing CodeWhisperer Today I would like to tell you about Amazon CodeWhisperer. Trained on billions of lines of code and powered by machine learning, CodeWhisperer has the same goal. Whether you are a student, a new developer, or an experienced professional, CodeWhisperer will help you to be more productive. We are launching in preview form with support for multiple IDEs and languages. To get started, you simply install the proper AWS IDE Toolkit, enable the CodeWhisperer feature, enter your preview access code, and start typing: CodeWhisperer will continually examine your code and your comments, and present you with syntactically correct recommendations. The recommendations are synthesized based on your coding style and variable names, and are not simply snippets. CodeWhisperer uses multiple contextual clues to drive recommendations, including the cursor location in the source code, code that precedes the cursor, comments, and code in other files in the same project. You can use the recommendations as-is, or you can enhance and customize them as needed. As I mentioned earlier, we trained (and continue to train) CodeWhisperer on billions of lines of code drawn from open source repositories, internal Amazon repositories, API documentation, and forums. CodeWhisperer in Action I installed the CodeWhisperer preview in PyCharm and put it through its paces. Here are a few examples to show you what it can do. I want to build a list of prime numbers. I type # See if a number is pr. CodeWhisperer offers to complete this, and I press TAB (the actual key is specific to each IDE) to accept the recommendation: On the next line, I press Alt-C (again, IDE-specific), and I can choose between a pair of function definitions. I accept the first one, and CodeWhisperer recommends the function body, and here's what I have: I write a for statement, and CodeWhisperer recommends the entire body of the loop: CodeWhisperer can also help me to write code that accesses various AWS services. I start with # create S3 bucket and TAB-complete the re
(read more)
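The screenshots from the CodeWhisperer walkthrough above don't reproduce in text, but the flavor of the prime-number example is roughly this (my own reconstruction of what such a completion looks like, not CodeWhisperer's verbatim output):

# See if a number is prime
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True

# Build a list of prime numbers
primes = [n for n in range(100) if is_prime(n)]
print(primes)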
mold 1.3.0 is a new release of the high-speed linker. This release contains a few new features and general stability/compatibility improvements as shown below. Note for those who create mold binary packages: if you are building mold for binary distribution, please link the bundled libtbb statically (which is the default) or rebuild your distro's libtbb package with my patch so that mold's Link-Time Optimization (LTO) works reliably under a heavy load. The --icf=safe option is now supported. This option enables a feature to find and deduplicate identical code that can be merged safely. For C++ programs, it typically reduces the output binary size by a few percent. --icf=safe needs to be used
(read more)
RFCs - requests for comment - or Design Docs are a common tool that engineering teams use to build software faster, by clarifying assumptions and circulating plan
(read more)
Modern language environments make it easy to discover and incorporate externally written libraries into a program. These same mechanisms can also make it easy to inadvertently incorporate security vulnerabilities or overtly malicious code, which is rather less gratifying. The stream of resulting vulnerabilities seems like it will never end, and it afflicts r
(read more)
I’m happy to announce that a new version of size, the PrettySize.NET port for rust, has been released and includes a number of highly requested features and improvements. The last major release o
(read more)
A talk on the new APIs and tooling around CoreML, Apple's Machine Learning Toolkit.
Table of Contents
Introduction
Overview of CoreML
What is a (ML)model and how does it work?
CoreML Usage
What's new in CoreML
Conclusion for Now
Introduction
Hello, my name is Quin'darius Lyles-Woods and I am going to talk to you all about what's new in CoreML, Apple's Machine Learning Toolkit. I am employed at Rekor, a company building intelligent infrastructure for cities and municipalities, working on their mobile application. Typically, outside of work, I am building things around the house, learning to play the piano, loving on the 12 cats and 3 dogs in my house, or simply caught with a beer or a book. Otherwise in my spare time I am building APIs and taking Domains, only half joking. My history with machine learning did not start with Apple's CoreML framework, but I wish it did. My first time building intelligent agents was in my Artificial Intelligence class where, hold your breath, I built most of my agents in JavaScript as web applications. Okay, maybe that wasn't as ground-shattering a thought as I thought it was. I would build applications that would do semi-intelligent things and run in the browser, and I thought that was kinda neat. After this course I took NLP, Natural Language Processing, where I learned tons more about machine learning, eventually making a pretty great model for classification of bodies of political text. Later on my partner and I were invited to present the process of data collection and model creation at the National Council on Undergraduate Research. Fast forward to now: I am taking what I have learned and am learning into a practical arena at my job. In a machine vision application (a subset of machine learning), there is always something for you to improve on because, technically, the model result is never 100% accurate, so the work never ends. Overview of CoreML CoreML is a framework that helps you use models to make predictions. Okay great, talk's over. Just kidding; while that is a simple answer, there are many things to explain even within that one sentence. Like what is a model, how does it make predictions, or how does it help me do these things? What is a (ML)model and how does it work? A (ML)model has all the functions for doing predictions, and can give you its description and configuration. I have not told you what a model is yet per se; I have told you what a model does and some of its properties. That's not really helpful, is it? Okay, enough being coy. When I was creating this talk, I thought I was going to give everyone a basic walkthrough of implementing machine learning in an application. Alas, like all good startups I pivoted to a way better idea, and the scraps are going to be useful. My groundbreaking idea for the previous t
(read more)
Got some building work scheduled on the house in the near future, so I need to empty the garage entirely. Might as well get a head start on it this weekend and get some stuff shifted. I foresee a tip run or two in my future as well. Also have all the pieces to overhaul the Z4's cooling system, so provided the rain holds off I'll crack on with that. Also got the parts to do an oil service on the Fiesta, so might as well do that whilst I've got the car tools out.
(read more)
Some network speeds and network related speeds we see in mid 2022 June 21, 2022 We are not anywhere near a bleeding edge environment. We still mostly use 1G networking, with 10G to a minority of machines, and a mixture of mostly SATA SSDs with some HDDs for storage. A few very recent machines have NVMe disks as their system disks. So here are the speeds that we see and that I think of as 'normal', here in mid 2022 in our environment, with somewhat of a focus on where the limiting factors are. On 1G connections, anything can get wire bandwidth for streaming TCP traffic (or should be able to;
(read more)
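As a back-of-the-envelope check on the "wire bandwidth" figures above (my arithmetic, not the author's): a 1G link moves at most 125 MB/s of raw bits, and Ethernet framing plus IP/TCP headers eat roughly 5% of that.

# Rough TCP goodput ceiling for common link speeds, assuming a standard
# 1500-byte MTU. Per packet: 40 bytes of IP+TCP headers leave 1460 bytes
# of payload; Ethernet adds 38 bytes of on-wire overhead (header, FCS,
# preamble, inter-frame gap).
payload, on_wire = 1460, 1500 + 38

for name, gbps in [("1G", 1), ("10G", 10)]:
    raw = gbps * 1e9 / 8 / 1e6          # raw link rate in MB/s
    goodput = raw * payload / on_wire   # achievable TCP payload in MB/s
    print(f"{name}: {raw:.0f} MB/s raw, ~{goodput:.0f} MB/s TCP payload")
# Prints: 1G: 125 MB/s raw, ~119 MB/s TCP payload
#         10G: 1250 MB/s raw, ~1187 MB/s TCP payload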
If you need to ensure that a particular piece of data is only ever modified by one thread at once, you need a mutex. If you need more than one mutex, you need to be wary of deadlocks. But what if I told you that there's a trick to avoid ever reaching a deadlock at all? Just acquire them in the right order! First of all, let's define a deadlock. With two mutexes, foo and bar, and two workers, alice and bob, we can have the following scenario:
alice acquires foo
bob acquires bar
alice tries to acquire bar but can't because bob already has it, so they must wait
bob tries to acquire foo but can't because alice already has it, so they must wait
Now both workers are waiting for each other and neither can complete. Let's see now how to fix this. The key is that we want to always acquire all of our mutexes in a consistent order. In this case, if, whenever we need both foo and bar, we make sure to always acquire foo before bar, we can never deadlock. Of course, as a code base and its number of mutexes grows, this becomes harder and we need to formalize and automate this. We can model the acquisition of mutexes as a graph problem, where all mutexes are nodes and we add an edge between them when we acquire a mutex while holding another. The problem then becomes: is this graph free of cycles?
Directed acyclic graphs
A directed acyclic graph (or DAG) is a graph (V, E) of nodes V with edges E between them, that is directed, meaning the edges can only be followed one way, and that does not contain cycles, meaning you cannot start walking from a certain node and end up where you started. It can look something like this: [figure omitted] You might think that this graph does have cycles, and it would if it were an undirected graph. However, due to the directed edges it's only possible to move from the left to the right. To check whether a directed graph is actually a DAG, you can try starting somewhere in the graph and recursively walk all edges until you find a place you've been before. When you run out of edges to walk recursively, start somewhere else that you haven't visited and repeat until you've either found a cycle or walked all edges. In pseudocode:

visited = set()
visiting = set()

def visit(node):
    if node in visiting:
        raise CycleFound()
    elif node in visited:
        return
    visiting.add(node)
    for neighbour in edges(node):
        visit(neighbour)
    visiting.remove(node)
    visited.add(node)

for node in nodes():
    visit(node)

This simple algorithm is trivially O(|E|), or linear in the number of edges in the graph, as each edge is taken at most once. That's good enough if you just need to check a graph once, but we want to ensure it stays a DAG when we add new edges. This is called the…
Dynamic DAG problem
We know from the previous section that we can relatively cheaply check whether a directed graph is a DAG by finding cycles, but if we know more about the current state of the graph, we can do something more clever. In this section we'll be looking at an algorithm by David J. Pearce and Paul H.J. Kelly. It uses the existing topolog
(read more)
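To make the graph formulation above concrete, here is a toy lock-order registry (my own sketch; the incremental Pearce-Kelly algorithm the post goes on to describe is much smarter than re-running the full check): record an edge whenever a mutex is acquired while another is held, and reject any edge that would close a cycle.

from collections import defaultdict

class LockOrderRegistry:
    """Records 'acquired wanted while holding held' edges, rejects cycles."""
    def __init__(self):
        self.edges = defaultdict(set)

    def acquire(self, held, wanted):
        self.edges[held].add(wanted)
        if self._has_cycle():
            raise RuntimeError(f"potential deadlock: {held} -> {wanted}")

    def _has_cycle(self):
        visited, visiting = set(), set()
        def visit(node):
            if node in visiting:
                return True
            if node in visited:
                return False
            visiting.add(node)
            if any(visit(n) for n in self.edges[node]):
                return True
            visiting.remove(node)
            visited.add(node)
            return False
        return any(visit(node) for node in list(self.edges))

registry = LockOrderRegistry()
registry.acquire("foo", "bar")   # fine: foo is ordered before bar
registry.acquire("bar", "foo")   # raises: this edge would close a cycle

In a real system you would enable this only in debug builds, since re-running the full check is O(|E|) per new edge.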
April 17, 2013 code go language magpie This is kinda like part three of the Iteration Inside and Out posts, so you may want to check out part one and part two too. But you don't have to. Most programming languages have one or two singleton values floating around. These are special built-in objects that only have a single instance. null or nil are common. Some languages put true and false in the same bucket. JavaScript, ever the hipster, ironically defines one called undefined. I wanted to talk a little bit about one I just added to my little language Magpie: a sentinel value called done. I'm still not certain it's a great idea, but I'll walk you through my thoughts. If nothing else, maybe it will serve as a cautionary tale for other, smarter language designers.
Channels
I ran into the problem that ultimately led to done when I started working on Magpie's concurrency story. Its model is based on fibers and channels, much like Go or other CSP-inspired languages. You can spin up a new fiber using the async keyword:

async
    print("I'm in another fiber!")
end

Fibers are cooperatively scheduled so they only yield to other fibers when they do something that requires waiting. Usually that means IO. So if you do the above, you won't see that message get printed until something causes the main fiber to yield. For example:

async
    print("I'm in another fiber!")
end
print("Before")
print("After")

This program will create a second fiber (but not switch to it). Then it gets to print("Before") in the main fiber. Since printing is IO, that switches to the second fiber. That in turn queues up its print and suspends again, so the main fiber resumes. Ultimately, it prints:

Before
I'm in another fiber!
After

You can spin up lots and lots of fibers because they don't use OS threads, just some memory in the interpreter. This is swell for decoupling stuff and running things concurrently. But sometimes you do need to coordinate fibers with each other. For that, we've got channels. A channel is a simple object that you can send objects into in one fiber and receive them from in another. You can create one like:

var channel = Channel new

To send a value along it, just do:

channel send("a value")

And you can receive it like:

var result = channel receive

The fun bit is that when a fiber sends a value along a channel, it causes that fiber to suspend until some other fiber shows up to receive the value. A send doesn't complete until the object has been received. So channels let you not just communicate but also synchronize.
Show's over, folks
There's one other thing you can do with a channel: you can close it:

channel close

That puts the channel out of commission and tells everyone that they will receive no more values from it. The question I ran into was, "What happens to fibers that are waiting to receive on a channel when it gets closed?" For example:

var channel = Channel new

// Wait for a value and print it.
async
    print(channel receive)
end

// From the main fiber, close the channel.
channel close

Here, the second fiber is sitting there, relaxing, maybe having a cocktail while it waits for the channel to spew something forth. But it doesn't,
(read more)
We trained a neural network to play Minecraft by Video PreTraining (VPT) on a massive unlabeled video dataset of human Minecraft play, while using only a small amount of labeled contractor data. With fine-tuning, our model can learn to craft diamond tools, a task that usually takes proficient humans over 20 minutes (24,000 actions). Our model uses the native human interface of keypresses and mouse movements, making it quite general, and represents a step towards general computer-using agents. Read Paper View Code and model weights MineRL Competition The internet contains an enormous amount of publicly available videos that we can learn from. You can watch a person make a gorgeous presentation, a digital artist draw a beautiful sunset, and a Minecraft player build an intricate house. However, these videos only provide a record of what happened but not precisely how it was achieved, i.e. you will not know the exact sequence of mouse movements and keys pressed. If we would like to build large-scale foundation models in these domains as we’ve done in language with GPT, this lack of action labels poses a new challenge not present in the language domain, where “action labels” are simply the next words in a sentence. In order to utilize the wealth of unlabeled video data available on the internet, we introduce a novel, yet simple, semi-supervised imitation learning method: Video PreTraining (VPT). We start by gathering a small dataset from contractors where we record not only their video, but also the actions they took, which in our case are keypresses and mouse movements. With this data we train an inverse dynamics model (IDM), which predicts the action being taken at each step in the video. Importantly, the IDM can use past and future information to guess the action at each step. This task is much easier and thus requires far less data than the behavioral cloning task of predicting actions given past video frames only, which requires inferring what the person wants to do and how to accomplish it. We can then use the trained IDM to label a much larger dataset of online videos and learn to act via behavioral cloning. VPT method overview VPT Zero-Shot Results We chose to validate our method in Minecraft because it (1) is one of the most actively played video games in the world and thus has a wealth of freely available video data and (2) is open-ended with a wide variety of things to do, similar to real-world applications such as computer usage. Unlike prior works in Minecraft that use simplified action spaces aimed at easing exploration, our AI uses the much more generally applicable, though also much more difficult, native human interface: 20Hz framerate with the mouse and keyboard. Trained on 70,000 hours of IDM-labeled online video, our behavioral cloning model (the “VPT foundation model”) accomplishes tasks in Minecraft that are nearly impossible to achieve with reinforcement learning from scratch. It learns to chop down trees to collect logs, craft those logs into planks, and then craft those planks into a crafting table; this sequence takes a human proficient in Mine
(read more)
This is part 2 of a series of articles on rust-minidump. For part 1, see here. So to recap, we rewrote breakpad's minidump processor in Rust, wrote a ton of tests, and deployed to production without any issues. We killed it, perfect job. And we still got massively dunked on by the fuzzer. Just absolutely destroyed. I was starting to pivot off of rust-minidump work because I needed a bit of a palate cleanser before tackling round 2 (handling native debuginfo, filling in features for other groups who were interested in rust-minidump, adding extra analyses that we'd always wanted but were too much work to do in Breakpad, etc etc etc). I was still getting some PRs from people filling in
(read more)
Speaking this morning at The Linux Foundation's Open-Source Summit, Linus Torvalds talked up the possibilities of Rust within the Linux kernel and that it could be landing quite soon -- possibly eve
(read more)
BIGNUM BAKEOFF contest recap
------ ------- ------- -----

The aim of this contest was to write a C program of 512 characters or less (excluding whitespace) that returned as large a number as possible from main(), assuming C to have integral types that can hold arbitrarily large integers. The winner is Ralph Loader.

Entries received and accepted
------- -------- --- --------

Entrant            Program name
Scott Carnahan     {carnahan.c}
Tak-Shing Chan     {chan.c}
Tak-Shing Chan     {chan-2.c}
Tak-Shing Chan     {chan-3.c}
Morris Dovey       {dovey.c}
Elc
(read more)
Introduction
The modeling and model testing language
I want to run the examples without looking at the learning material!
What does modeling get us? (A Simple Example)
The Model
Checking the Mode
(read more)
I have been using rust as embedded language for cortex M MCU for a while. While I like rust, embedded dev in rust has some friction, and a few things are a bit hard to do, especially C interop and direct memory manipulation. I decided to give zig a try, and while it is still in a very early stage compared to rust, I was able to get a working hello world program quite quickly. And I was able to get RTT working with the SEGGER C library. In this post, I'll share what I learned and a few gotchas. TLDR: The repo is here: https://github.com/kuon/zig-stm32test
Hardware
I have a STM32L011F3Px chip connected with a SEGGER jlink to my computer. The chip is barebone except for an external 32kHz crystal and decoupling capacitors. Because everything is still work in progress, at the time of writing, master zig is needed. Fortunately, under Arch, it is easy with the AUR package, but your mileage may vary. Ensure you have the following tools:
zig master
regz, a small SVD to zig converter
openocd
make (optional, but I like to have a makefile as a wrapper in all my projects)
multitail, a nice helper to have multiple processes run in the same terminal window
arm toolchain
Directory structure
Create an empty directory, and inside, you will have the following structure:

.
├── build.zig              # Build file
├── libs                   # Empty lib dir
│   └── microzig           # Clone the microzig repository in it
├── Makefile               # Makefile wrapper
├── ocd                    # OCD configuration directory
│   ├── debug.gdb          # GDB configuration
│   ├── ocd.cfg            # OCD basic configuration
│   └── ocd_rtt.cfg        # OCD RTT configuration
└── src
    ├── main.zig           # Main code file
    ├── rtt.zig            # Segger RTT wrapper
    ├── SEGGER_RTT.c       # Segger RTT file from official SEGGER site
    ├── SEGGER_RTT_Conf.h
    ├── SEGGER_RTT.h
    └── STM32L0x1          # This is my chipset support libraries
        ├── registers.svd
(read more)
What follows is an embloggified version of an introduction to the rr debugger that I gave at a local Linux Users Group meet up on October 6th, 2015. Hello, everyone! My name is Nick Fitzgerald and I'm here to talk about a (relatively) new kid on the block: the rr debugger. rr is built by some super folks working at Mozilla, and I work at Mozilla too, but not with them and I haven't contributed to rr (yet!) — I'm just a very happy customer! At Mozilla, I am a member of the Firefox Developer Tools team. I've done a good amount of thinking about bugs, where they come from, the process of debugging, and the tools we use in the process. I like to think that I am a fairly savvy toolsmith. These days, I'm doing less of building the devtools directly and more of baking APIs into Gecko (Firefox's browser engine) and SpiderMonkey (Firefox's JavaScript engine) that sit underneath and support the devtools. That means I'm writing a lot of C++. And rr has quickly become the number one tool I reach for when debugging complicated C++ code. rr only runs on Linux and I don't even use Linux as my day-to-day operating system! But rr provides such a great debugging experience, and gives me such a huge productivity boost, that I will reboot into Fedora just to
(read more)
Apple recently made a booboo, unlike any other booboo in the history of programming. Even though Apple’s bug is unprecedented, here’s a brief overview of some predecessor bugs.

X
Back in 2006, the X server checked to make sure the user was root, but forgot to actually call the function.

--- hw/xfree86/common/xf86Init.c
+++ hw/xfree86/common/xf86Init.c
@@ -1677,7 +1677,7 @@
     }
     if (!strcmp(argv[i], "-configure")) {
-        if (getuid() != 0 && geteuid == 0) {
+        if (getuid() != 0 && geteuid() == 0) {
             ErrorF("The '-configure' option can only be used by root.\n");
             exit(1);
         }

How is this possible? Does nobody use a compiler that warns about comparisons always being false?

Debian OpenSSL
Remember that time back in 2008 when Debian shipped a special limited edition OpenSSL? “As a result, cryptographic key material may be guessable.”

--- openssl-a/md_rand.c
+++ openssl-b/md_rand.c
@@ -271,10 +271,7 @@
     else
         MD_Update(&m,&(state[st_idx]),j);
-/*
- * Don't add uninitialised data.
     MD_Update(&m,buf,j);
-*/
     MD_Update(&m,(unsigned char *)&(md_c[0]),sizeof(md_c));
     MD_Final(&m,local_md);
     md_c[1]++;

OK, I’m cheating here, it’s a three line fix. How is this possible? Does nobody read the OpenSSL mailing list or the Debian bug tracker? Whatever happened to code review?

Regular OpenSSL
Also in OpenSSL and also from 2008, “OpenSSL 0.9.8i and earlier does not properly check the return value from the EVP_VerifyFinal function, which allows remote attackers to bypass validation of the certificate chain via a malformed SSL/TLS signature for DSA and ECDSA keys.”

--- lib/libssl/src/ssl/s3_srvr.c
+++ lib/libssl/src/ssl/s3_srvr.c
@@ -2009,7 +2009,7 @@ static int ssl3_get_client_certificate(S
     else
     {
         i=ssl_verify_cert_chain(s,sk);
-        if (!i)
+        if (i <= 0)
         {
             al=ssl_verify_alarm_type(s->verify_result);
             SSLerr(SSL_F_SSL3_GET_CLIENT_CERTIFICATE,SSL_R_NO_CERTIFICATE_RETURNED);

Bypass validation of the certificate chain? That’s bad, right? Like “worst security bug you could possibly imagine” bad, right?

Android
Let’s look at
(read more)
Hurl is a command line tool that runs HTTP requests defined in a simple plain text format. It can perform requests, capture values and evaluate queries on headers and body response. Hurl is very versatile: it can be used for both fetching data and testing HTTP sessions.

GET https://example.org
HTTP/1.1 200
csrf_token: xpath "string(//meta[@name='_csrf_token']/@content)"

POST https://example.org/login?user=toto&password=1234
X-CSRF-TOKEN: {{csrf_token}}
HTTP/1.1 302

Chaining multiple requests is easy:

GET https://example.org/api/health
GET https://example.org/api/step1
GET https://example.org/api/step2
GET https://example.org/api/step3

Hurl can run HTTP requests but can also be used to test HTTP responses. Different types of queries and predicates are supported, from XPath and JSONPath on body response, to asserts on status code and response headers. It is well adapted for REST / JSON APIs:

POST https://example.org/api/tests
{ "id": "4568", "evaluate": true }
HTTP/1.1 200
header "X-Frame-Options" == "SAMEORIGIN"
jsonpath "$.status" == "RUNNING"
jsonpath "$.tests" count == 25
jsonpath "$.id" matches /\d{4}/

HTML content:

GET https://example.org
HTTP/1.1 200
xpath "normalize-space(//head/title)" == "Hello world!"

and even SOAP APIs:

POST https://example.org/InStock
Content-Type: application/soap+xml; charset=utf-8
SOAPAction: "http://www.w3.org/2003/05/soap-envelope"
<?xml version="1.0" encoding="UTF-8"?>
GOOG
HTTP/1.1 200

Hurl can also be used to test HTTP endpoint performance:

GET https://example.org/api/v1/pets
HTTP/1.0 200
duration < 1000

And response bytes content:

GET https://example.org/data.tar.gz
HTTP/1.0 200
sha256 == hex,039058c6f2c0cb492c533b0a4d14ef77cc0f78abccced5287d84a1a2011cfb81;

Text Format
For both devops and developers
Fast CLI
A command line for local dev and continuous integration
Single Binary
Easy to install, with no runtime required
Hurl is a lightweight binary written in Rust. Under the hood, Hurl's HTTP engine is powered by libcurl, one of the most powerful and reliable file transfer libraries. With
(read more)
So I decided to finally give GitHub Copilot a spin. Let's see how our new robot overlords are doing in terms of programming! Turns out there is no way to just go to a website, start typing and see the magic in action. There is a plugin for Neovim. I don't like to install stuff that is not in the Debian repos though. So Docker to the rescue! As with anything I do manually, I put the steps into a script so I don't have to do it again. And so I can share them. So here it is, the script to easily try Copilot in Vim in Docker. Unfortunately it is a "script" that you have to execute manually line by line, since it switches in and out of the container and needs you to manually register the token shown by the plugin during the setup. If the plugin setup was non-interactive, it
(read more)
Download PDF Abstract: We present the first class of mathematically rigorous, general, fully self-referential, self-improving, optimally efficient problem solvers. Inspired by Kurt Goedel's celebrated self-referential formulas (1931), such a problem solver rewrites any part of its own code as soon as it has found a proof that the rewrite is useful, where the problem-dependent utility function and the hardware and the entire initial code are described by axioms encoded in an initial proof searcher which is also part of the initial code. The searcher systematically and efficiently tests computable proof techniques (programs whose outputs are proofs) until it finds a provably useful, computable self-rewrite. We show that such a self-rewrite is glob
(read more)
[This essay in Spanish] [This essay in French] [This essay in Chinese] In an old joke, two noblemen vie to name the bigger number. The first, after ruminating for hours, triumphantly announces "Eighty-three!" The second, mightily impressed, replies "You win." A biggest number contest is clearly pointless when the contestants take turns. But what if the contestants write down their numbers simultaneously, neither aware of the other's? To introduce a talk on "Big Numbers," I invite two audience volunteers to try exactly this. I tell them the rules: You have fifteen seconds. Using standard math notation, English words, or both, name a single whole number (not an infinity) on a blank index card. Be precise enough for any reasonable modern mathematician to determine exactly what number you've named, by consulting only your card and, if necessary, the published literature. So contestants can't say "the number of sand grains in the Sahara," because sand drifts in and out of the Sahara regularly. Nor can they say "my opponent's number plus one," or "the biggest number anyone's ever thought of plus one" - again, these are ill-defined, given what our reasonable mathematician
(read more)
data-diff is in shape to be run in production, but is also under active development. If you run into issues, please file an issue and we'll help you out ASAP! You can also find us in #tools-data-diff in the Locally Optimistic Slack. data-diff is a command-line tool and Python library to efficiently diff rows across two different databases.
⇄ Verifies across many different databases (e.g. PostgreSQL -> Snowflake)
🔍 Outputs diff of rows in detail
🚨 Simple CLI/API to create monitoring and alerts
🔁 Bridges column types of different formats and levels of precision (e.g. Double ⇆ Float ⇆ Decimal)
🔥 Verifies 25M+ rows in <10s, and 1B+ rows in ~5min.
♾️ Works for tables with 10s of billions of rows
data-diff splits the table into smaller segments, then checksums each segment in both databases. When the checksums for a segment aren't equal, it will further divide that segment into yet smaller segments, checksumming those until it gets to the differing row(s). See Technical Explanation for more details. This approach has performance within an order of magnitude of count(*) when there are few/no changes, but is able to output each differing row! By pushing the compute into the databases, it's much faster than querying for and comparing every row.
†: The implementation for downloading all rows (against which data-diff and count(*) are compared) is not optimal. It is a single Python multi-threaded process. The performance is fairly driver-specific; e.g. PostgreSQL's performs 10x better than MySQL's.
Table of Contents
Common use-cases
Example command and output
Supported Databases
How to install
How to use
Technical Explanation
Performance Considerations
Developmen
(read more)
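The bisection idea above is easy to sketch in memory (a toy illustration only; the real tool pushes the checksumming into SQL queries so rows never leave the database):

import hashlib

def checksum(rows):
    # Hash a segment of rows into a single comparable digest.
    h = hashlib.md5()
    for row in rows:
        h.update(repr(row).encode())
    return h.hexdigest()

def diff_segments(a, b, lo, hi):
    """Recursively narrow [lo, hi) down to the differing rows."""
    if checksum(a[lo:hi]) == checksum(b[lo:hi]):
        return []                       # identical segment: stop early
    if hi - lo == 1:
        return [(lo, a[lo], b[lo])]     # found a differing row
    mid = (lo + hi) // 2
    return diff_segments(a, b, lo, mid) + diff_segments(a, b, mid, hi)

a = [(i, f"row-{i}") for i in range(1000)]
b = list(a)
b[617] = (617, "row-617-changed")
print(diff_segments(a, b, 0, len(a)))
# [(617, (617, 'row-617'), (617, 'row-617-changed'))]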
In “Difficult Problems and Hard Work”, David MacIver writes: An idiosyncratic distinction I find useful (though don’t reliably stick to) is that there is hard work and difficult problems, and these are not all that closely related. The distinction is roughly that something is hard work if you have to put a lot of time and effort into it and a difficult [problem] if you have to put a lot of skill or thinking into it. You can generally always succeed at something that is “merely” hard work if you can put in the time and effort, while your ability to solve a difficult problem is at least somewhat unpredictable. (emphasis mine) I think about this distinction regularly in the context of software engineering, though I think it probably applies to most “knowledge work”. At an intuitive level, I think we’ve all encountered this: there are problems that are solvable by throwing a lot of human-hours at them (“Hard Work”), and problems that are not a function of raw work hours, but rather require dealing with ambiguity (“Difficult Problems”). The more unpredictable the task is as a function of allocated effort to task completion, the more likely it is to be a Difficult Problem. In software engineering, Hard Work can look like:
Writing glue code.
Write a CRUD API.
“Increase unit test coverage”.
Clone feature X from system A into system B.
Whereas Difficult Problems can look like:
Designing an architecture for a new, ambiguously scoped system.
(read more)
A tiny spawn wrapper for Node.js.

const {ls, curl} = require('tinysh')

const list = ls('-la').trim().split('\n')
const resp = curl('https://medv.io')

Usage
Import any binary you would like to call. Use it like a function.

const {cat} = require('tinysh')

const content = cat('README.md')

To get exit code or stderr, use .status or .stderr.

const {git} = require('tinysh')

console.log(git('pull').status)

To pass options to the spawn, bind to an options object.

const {tee} = require('tinysh')

tee.call({input: 'Hello, world!'}, 'file.txt')

License
MIT
(read more)
1.0
date released: 2022/06/22
archive: qbe-1.0.tar.xz
sha256: 257ef3727c462795f8e599771f18272b772beb854aacab97e0fda70c13745e0c
git commit: cd778b44ba11925d65ee10ff29fe22d4a45809dd
8 years after the first commit in the git repo, at the ask of some package managers and users, I release QBE 1.0. All backends are at parity thanks to great work from contributors, and QBE is used somewhat seriously by a couple of people. Have fun!
(read more)
Hello! Yes, this blog is still alive. In this post, I want to share a small little pattern that I’ve found to have a surprisingly high quality-of-life improvement, and I call it the list of monoids pattern. The idea is that whenever we have a monoidal value - a type that is an instance of the Monoid type class - we can sometimes produce a more ergonomic API if we change our functions to instead take a list of these monoidal values. I recently proposed an instance of this pattern to lucid, and it was well received and ultimately merged as part of the new lucid2 package. To motivate this p
(read more)
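Transposing the idea above out of Haskell (my own illustration in Python terms, not the lucid API): let a function accept a list of monoidal fragments and fold them itself, so call sites never combine values by hand and the empty list acts as the identity.

from functools import reduce

def render_div(classes):
    # Fold the list with the monoid's combining operation; here the
    # "monoid" is strings under space-separated concatenation, with the
    # empty string (via the empty list) as the identity element.
    merged = reduce(lambda a, b: f"{a} {b}".strip(), classes, "")
    return f'<div class="{merged}"></div>'

highlighted = True
extra = ["highlighted"] if highlighted else []
print(render_div(["card"] + extra))  # <div class="card highlighted"></div>
print(render_div([]))                # <div class=""></div>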
On 22 June 2022, the 122nd Ecma General Assembly approved the ECMAScript 2022 language specification, which means that it’s officially a standard now. This blog post explains what’s new.
Table of contents:
The editors of ECMAScript 2022
What’s new in ECMAScript 2022?
FAQ
What is the difference between JavaScript and ECMAScript?
Who designs ECMAScript? TC39 – Ecma Technical Committee 39
How are features added to ECMAScript? They go through the stages of the TC39 process
How important are ECMAScript versions?
How is [my favorite feature proposal] doing?
Where can I look up which features were added in a given ECMAScript version?
Free books on JavaScript
The editors of ECMAScript 2022
The editors of this release are: Shu-yu Guo, Michael Ficarra, Ke
(read more)
Like most people who are extremely online, Brazilian screenwriter Fernando Marés has been fascinated by the images generated by the artificial intelligence (AI) model DALL·E mini. Over the last few weeks, the AI system has become a viral sensation by creating images based on seemingly random and whimsical queries from users — such as “Lady Gaga as the Joker,” “Elon Musk being sued by a capybara,” and more.  Marés, a veteran hacktivist, began using DALL·E mini in early June. But instead of inputting text for a specific request, he tried something different: he left the field blank. Fascinated by the seemingly random results, Marés ran the blank search over and over. That’s when Marés noticed something odd: almost every time he ran a blank requ
(read more)
But what would trigger the most dramatic consequence of the native String table is that it is part of the GC roots! That means it has to be scanned/updated specially by the garbage collector. In OpenJDK, that means doing hard work during the pause. Indeed, for Shenandoah, where pauses depend mostly on GC root set size, having just 1M records in the String table yields this:

$ ... StringIntern -p size=1000000 --jvmArgs "-XX:+UseShenandoahGC -Xlog:gc+stats -Xmx1g -Xms1g"
...
Initial Mark Pauses (G) = 0.03 s (a = 15667 us) (n = 2) (lvls, us = 15039, 15039, 15039, 15039, 16260)
Initial Mark Pauses (N) = 0.03 s (a = 15516 us) (n = 2) (lvls, us = 14844, 14844, 14844, 14844, 16088)
Scan Roots = 0.03 s (a = 15448 us) (n = 2) (lvls, us = 14844, 14844, 14844, 14844, 16018)
S: T
(read more)
Vale 0.2 is out, and it includes the beginnings of a feature we like to call Fearless FFI. This is part of Vale's goal to be the safest native language. Most languages compromise memory safety in some way, which can lead to difficult bugs and security vulnerabilities. Vale takes a big step forward here, by isolating unsafe and untrusted code and keeping it from undermining the safe code around it. This page describes the proof-of-concept we have so far, plus the next steps. It involves some borderline-insane acrobatics with inline assembly, bitwise xor and rotate, and two simultaneous stacks. Buckle up! If you're impressed with our track record so far and believe in the direction we're heading, please consider sponsoring us on GitHub! We can't
(read more)
At Modos, our mission is to help you live a healthy life by creating technology that respects your time, attention, and well-being. We are an open-hardware and open-source company, and are building an ecosystem of devices to re-imagine personal computing and build a collective vision of calm, intentional, and humane technology. Today, we'd like to introduce the Modos Paper Monitor: an open-hardware standalone portable monitor made for reading and writing, especially for people who need to stare at the display for a long time.
Specifications
13.3” 1600x1200 Eink panel without front lighting
DisplayPort 1.2 input, up to 224MP/s with a potential to support a 2200x1650 resolution panel at 60Hz
MicroUSB power input (consumes 1.5W~2W continuously)*
*In time, we will offer a single USB Type-C connectio
(read more)
The power of components in your template-based Python web app. Reusable, encapsulated, and testable. Write server-side components as single Jinja template files. Use them as HTML tags without doing any importing. Say goodbye to spaghetti templates. We want our Python code to be easy to understand and test. Template code, however, often fails even basic code standards: long methods, deep conditional nesting, and mystery variables everywhere. But when it's built with components, you see where everything is, understand what are the
(read more)
Last year, I asked the question: Does shadow DOM improve style performance? I didn’t give a clear answer, so perhaps it’s no surprise that some folks weren’t sure what conclusion to draw. In this post, I’d like to present a new benchmark that hopefully provides a more solid answer. Shadow DOM and style performance To recap: shadow DOM has some theoretical benefits to style calculation, because it allows the browser to work with a smaller DOM size and smaller CSS rule set. Rather than needing to compare every CSS rule against every DOM node on the page, the browser can work with smaller “sub-DOMs” when calculating style. However, browsers have a lot of clever optimizations in this area, and userland “style scoping” solutions have emerged (e.g. Vue, Svelte, and CSS Modu
(read more)
Together with Lætitia Avrot and Nikolay Samokhvalov I was invited to participate in a Community Panel (YouTube video) about PostgreSQL Upgradability at Postgres Vision 2022. The panel was moderated by Bruce Momjian and initiated and organized by Jimmy Angelakos. Bruce talked with each of us beforehand, which helped a lot to guide the discussion in the right direction. The recording of the panel discussion is available on the Postgres Vision website. During this panel each of us provided examples of how easy or complicated PostgreSQL upgrades still are.
Minor version upgrades
One result of our discussion is that minor upgrades (for example, v14.0 to v14.1) are relatively easy to do, but might hold some surprises for anyone who does not pay att
(read more)
The some and any keywords are not new in Swift. The some keyword was introduced in Swift 5.1, whereas the any keyword was introduced in Swift 5.6. In Swift 5.7, Apple makes another great improvement to both of these keywords. We can now use both of them in the function parameter position!

func doSomething(with a: some MyProtocol) {
    // Do something
}

func doSomething(with a: any MyProtocol) {
    // Do something
}

This improvement not only makes generic functions look a lot cleaner, but also unlocks some exciting new ways to write generic code in Swift. Spoiler alert - we can now say goodbye to the following error message: protocol can only be used as a generic constraint because it has Self or associated type requirements. Want to know more? Read
(read more)
July 8, 2021 2-minute read
The problem
I’ve been tinkering a bit with Elm lately. The super-enforced functional and minimal paradigm is very refreshing, and serves as a sort of detox after spending one too many hours stuck in Android’s not-so-lovely XML + mutating class world. Setting up a new bare minimum Elm app should be quite simple, but it turns out that there are a few more steps required than one would expect. My first instinct – being a React guy – was to try yarn create elm-app (or npx create-elm-app), hoping it would do the Elm-equivalent of what create-react-app does. Turns out, to my disappointment, that the end result leaves something to be desired. No proper live-reload out of the box, and a lot of the webpack stuff I was hoping to avoid completely with Elm. Yuck. The
(read more)
A VMSS is, put as plainly as possible, a cluster of VMs that are scaled on-demand or manually.
Context
The Vipps App – until recently – took about 45 minutes to build on Azure. With a weekly release schedule, that might not sound like a huge dealbreaker, but since we’re using a “feature branch” approach to how we do git (i.e. no develop branch; master should always be deployable-ish), we don’t allow merging anything to master without a successful cloud build. Waiting for code review is hard to avoid, but additionally waiting for slow builds is just wrong. It’s not easy for our internal testers to check out what we’re doing either; they too are kept waiting. Suffice it to say, the pain has been real. Personally, coming from a Golang background (where builds happen at the spe
(read more)
A fun decryption story! In 1914, The Netherlands sent a peace mission to Albania (I did not know this either). The mission commander, Major Lodewijk Thomson, was killed in battle under circumstances that are still unclear. And we'd love to know! en.wikipedia.org/wiki/Lodewi… Recently (2009), an encrypted Albanian telegram from that time was found in Dutch military archives. Could this perhaps shed some light on the situation? Intriguingly, no one had ever been able to decrypt the message. Dutch researcher Florentijn van Kampen, affiliated with Radboud University's iHub, decided to give it a try using modern cryptographic techniques. I mean, 1914 encryption, how hard could it be?! ecp.ep.liu.se/index.php/hist… First of all, some basic facts: The telegram consists of 736 numbers between 19 and 119. There are 49 different numbers. The frequency per number ranges from 1 to 74. Since there appear to be a maximum of 100 different numbers, we can put this on a 10*10 matrix:
(read more)
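The first steps of the analysis above are easy to reproduce (a sketch with synthetic stand-in numbers, since the telegram itself isn't included in the thread):

from collections import Counter
import random

# Stand-in for the real telegram: 736 numbers in the range 19..119.
random.seed(1914)
telegram = [random.randint(19, 119) for _ in range(736)]

freq = Counter(telegram)
print(f"{len(freq)} distinct numbers, frequencies from "
      f"{min(freq.values())} to {max(freq.values())}")

# The thread notes there can be at most ~100 distinct numbers, so lay the
# codes out on a 10x10 grid (here by value mod 100 -- an assumption) to
# eyeball which cells the cipher actually uses.
used = {n % 100 for n in freq}
for tens in range(10):
    print("".join("#" if tens * 10 + units in used else "."
                  for units in range(10)))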
this a collection of thoughts on software development by grug brain developer grug brain developer not very smart, but grug brain developer program many long year and learn some things although mostly still confused grug brain developer try collect learns into small, easily digestible and funny page, not only for you, the young grug, but also for him because as grug brain developer get older he forget important things, like what had for breakfast or if put pants on big brained developers are many, and many not expected to like this THINK they are big brained developers many, many more even, and many more even definitely hate this is fine! is free country sort of and end of day not really matter too much, but grug hope you have fun reading and maybe learn from many, many mistakes grug make over long program life number one predator of grug is complexity complexity bad say again: complexity very bad you say now: complexity very, very bad given choice between complexity and one on one with t-rex, grug take t-rex: at least grug can see t-rex complexity is spirit demon that enter codebase through well-meaning but ultimately very clubbable non grug-brain developers and project managers who not fear complexity spirit demon or even know about one day code base understandable and grug can get work done, everything good. next day impossible: complexity spirit deamon has entered the code and very dangerous situation! grug not able see complexity demon, but grug feel its presence in code base, mocking him make change here break unrelated thing there mock mock mock ha ha so funny grug love programming and not becoming shiney rock speculator. club not work on comp
(read more)
Because I’ve obviously gone all “Slurp Juice” on this MEGA attack thing, an attempt at a decoder ring.

First: the client doesn’t trust the server. That’s the whole point of the design. The attacker is Mega; the target is the client.

The client wants to store encrypted files on the server. The client generates a mess of keys: an RSA key (we’ll spend a lot of time with this), a Curve25519 key, a key for every file they store, and then a master key, one key to rule them all, k_m.

The client is going to forget all of these keys. Mega wants you to be able to install a client somewhere else and log in and get all your files, and you can’t be schlepping the keys around on like a USB fob or something. So instead: as the client generates these keys, it encrypts them under k_m and uploads them to the server (along with the associated public keys, not encrypted, fine). The server doesn’t have k_m and so can’t decrypt them.

What the client remembers from machine to machine is their password. From the password they’ll derive everything else they need.

Decoder ring:

PBKDF2: A hashy algorithm that takes a password (a “low entropy secret”) and spits out a crypto key (a “high entropy secret” suitable for plugging into AES or whatever).

RSA-CRT: RSA where you precompute some of the values to make it go faster, notably qInv, which is 1/q mod p — p and q are the factors of your RSA modulus. You can sort of glaze over this.

ECB: Block ciphers encrypt 16-byte chunks, not more, not less. If you want to encrypt an arbitrary string, you need a “mode” that ties together multiple block cipher invocations. ECB is the default mode, and the dumbest mode: just feed every 16 bytes of the plaintext through the cipher and cat together the resulting ciphertexts. ECB is notoriously bad. You can see penguins through it.

Authenticated Ciphers: By itself, a cipher provides confidentiality: you can’t recover the plaintext from the ciphertext without a key. It doesn’t provide integrity: an attacker can flip bits in the ciphertext, and you’ll decrypt something different from the original plaintext (usually: with big random blocks of garbage). Mega do
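As a rough illustration of that key hierarchy (not Mega's actual KDF parameters, key sizes, or wire formats; everything below is made up for the sketch), here is a Python version using pycryptodome:

import os, hashlib
from Crypto.Cipher import AES  # pycryptodome

# Hypothetical parameters, illustrative only.
password = b"correct horse battery staple"
salt = os.urandom(16)

# PBKDF2: stretch the low-entropy password into a high-entropy AES key.
k_pw = hashlib.pbkdf2_hmac("sha512", password, salt, 100_000, dklen=16)

# The master key k_m, plus a per-file key, both generated client-side.
k_m = os.urandom(16)
k_file = os.urandom(16)

# Wrap k_m under the password-derived key, and k_file under k_m, with
# AES-ECB as the text describes (fine for single random blocks; dangerous
# for structured plaintext, hence the penguins).
wrapped_k_m = AES.new(k_pw, AES.MODE_ECB).encrypt(k_m)
wrapped_k_file = AES.new(k_m, AES.MODE_ECB).encrypt(k_file)

# Only the wrapped blobs (and the salt) go to the server; without the
# password it cannot recover k_m, and without k_m it cannot recover k_file.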
(read more)
Wednesday, June 15th, 2022

Type checking in Whiley is a curious thing, as it goes both forwards and backwards. This is sometimes referred to as bidirectional type checking (see e.g. here and here). This is surprisingly useful in Whiley (perhaps because the language has a reasonably unusual feature set).

Backwards Typing

Type checkers normally work in a backwards (or bottom-up) direction. That is, they start from the leaves of the abstract syntax tree and work upwards. Typing a statement like xs[i] = ys[i] + 1 (when xs and ys have type int[]) might look something like this: [figure: bottom-up typing derivation]. The key here is that types have to agree (modulo subtyping), otherwise we have a type error.

Limitations

As a general approach, backwards typing works well in most cases. But there are some limitations when applying it to Whiley:

(Sizing). Variables of type int in Whiley can hold arbitrarily sized integers and, because of this, backwards typing can lead to inefficiency. Consider assigning the constant 124 to a variable of type u8. Under the backwards typing scheme, the constant 124 is given type int. That means, under the hood, when the constant is created we’ll allocate space for an arbitrarily sized integer and give it the value 124. Then we’ll immediately coerce it to a u8, causing a deallocation. It would be much better if we automatically determined the type of
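To see the backwards direction in miniature, here is a toy Python sketch (a hypothetical AST encoding, nothing like Whiley's real checker) that types xs[i] = ys[i] + 1 bottom-up:

# A toy bottom-up pass over a hypothetical AST for: xs[i] = ys[i] + 1

env = {"xs": "int[]", "ys": "int[]", "i": "int"}

def type_of(node):
    kind = node[0]
    if kind == "var":
        return env[node[1]]
    if kind == "index":                      # e.g. ys[i]
        arr = type_of(node[1])
        assert type_of(node[2]) == "int", "index must be int"
        assert arr.endswith("[]"), "indexing a non-array"
        return arr[:-2]                      # int[] -> int
    if kind == "add":                        # e.g. ys[i] + 1
        l, r = type_of(node[1]), type_of(node[2])
        assert l == r == "int"
        return "int"
    if kind == "const":
        return "int"                         # backwards typing: 124 is just int
    raise ValueError(kind)

lhs = ("index", ("var", "xs"), ("var", "i"))
rhs = ("add", ("index", ("var", "ys"), ("var", "i")), ("const", 1))
assert type_of(lhs) == type_of(rhs)          # types agree (modulo subtyping)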
(read more)
[This piece is co-authored by Ryan Berger and Stefan Mada (both Utah CS undergrads), by Nader Boushehri, and by John Regehr.]

An optimizing compiler traditionally has three main parts: a frontend that translates a source language into an intermediate representation (IR), a “middle end” that rewrites IR into better IR, and then a backend that translates IR into assembly language. My group and our collaborators have spent a lot of time finding and reporting defects in the LLVM compiler’s middle end optimizers using Alive2, a formal methods tool that we created. This piece is about extending that work to cover one of LLVM’s backends.

Why is it useful to try to prove that an LLVM backend did the right thing? It turns out that (despite the name) LLVM IR isn’t all that low-level — the backends need to do a whole lot of work in order to create efficient object code. In fact, there’s quite a bit of code in the backends, such as peephole optimizations that clean up local inefficiencies, that ends up duplicating analogous code in the LLVM middle end optimizers. And where there’s a lot of code, there’s a lot of potential for bugs.

The basic job performed by Alive2 is to prove that a function in LLVM IR refines another one, or else to provide a counterexample showing a violation of refinement. To understand “refinement” we need to know that due to undefined behaviors, an LLVM function can effectively act non-deterministically. For example, a function that returns a value that it loaded from uninitialized storage could return “1” one time we execute it and “2” the next time. An LLVM optimization pass is free to turn this function into one that always returns “1”. The refinement relation, then, holds when — for every possible circumstance (values of arguments, values stored in memory, etc.) in which that function could end up being called — the optimized function exhibits a subset of the behaviors of the original function. Non-trivial refinement, where the behaviors after optimization are a proper subset of the original behaviors, is ubiquitous in practice. If the optimized code contains any behavior not in the unoptimized code
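A toy Python model of that refinement relation, for intuition only (Alive2 actually reasons symbolically over LLVM IR with an SMT solver; this just enumerates behavior sets for a made-up example):

# Toy model: a function's "behaviors" = the set of results it may produce
# for a given input. Refinement holds when, for every input, the optimized
# function's behaviors are a subset of the original's.

def original(x):
    # Loading from uninitialized storage: may observe 1 or 2.
    return {1, 2}

def optimized(x):
    # The optimizer is free to pin the non-determinism down to one value.
    return {1}

def refines(f, g, inputs):
    """True if g refines f on the given inputs (g's behaviors are a subset of f's)."""
    return all(g(x) <= f(x) for x in inputs)

print(refines(original, optimized, range(10)))   # True: valid optimization
print(refines(optimized, original, range(10)))   # False: would add behavior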
(read more)
Fear and loathing in FreeBSD, or qorg’s experiences with FreeBSD

Introduction

Not so long ago I wrote about my experiences with OpenBSD. That post was about my experiences with OpenBSD as a server, not as a desktop. Using an operating system as a desktop is completely different from using it as a server. One day I thought “damn, Linux sucks! But I have to use this because the developer of the browser that I use is an asshole!”, and had to stick to Linux for a while. But then another day I thought “Hmm, FreeBSD claims to run Linux binaries better than Linux, let’s give it a try”. Good operating systems have to sell themselves some way, and that claim worked for me. So I went to FreeBSD.org, clicked the big yellow button that says “Download FreeBSD” and downloaded the memstick image for amd64, because that’s what my computer runs. I will be updating this site as I have more experiences with FreeBSD. So add it to your bookmarks! Last update: 2022-06-21

Installation

The installation was pretty straightforward. I don’t think many people can get lost in it. I just selected ZFS as my file system (more on that later). I don’t remember much else about the installation, and since I forgot it, I don’t think it’s worth mentioning.

Networking

Sadly I no longer have the router in my room, so I can’t use an ethernet cable. So I have to use the dreaded wireless card. I was very surprised when I found out this Atheros card is supported by FreeBSD, so I don’t have to open the computer and put an Intel one in. For the network card to work, I only had to modify the kernel booting process. Sounds very hard, but it is just editing /boot/loader.conf. I added the following lines to use the ath driver:

if_ath_load="YES"
if_ath_pci_load="YES"

Then, in /etc/rc.conf (we will talk about it later):

wlans_ath0="wlan0"
ifconfig_wlan0="up"

And then I can just create a wpa_supplicant config file that works for my router, and run # wpa_supplicant -iwlan0 -c /etc/wpa_supplicant.conf -B, and then call dhclient so I get an IP address. But I guess I should use a static IP addre
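For completeness, a minimal wpa_supplicant.conf of the kind referred to above, with a made-up SSID and passphrase:

# /etc/wpa_supplicant.conf (hypothetical SSID and passphrase)
network={
        ssid="home-router"
        psk="my secret passphrase"
}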
(read more)
Free weekly newsletter with the latest from the web. Keep up to date about what you care about: the newsletter is sent every Monday with a recap of stories, projects and tutorials from the previous week.

Frequently asked questions

Will I receive spam? Emails are sent only once a week, every Monday. They are always on topic and contain the latest articles, tutorials and projects. You can unsubscribe at any time, by replying "unsubscribe" or following t
(read more)
Published on June 20, 2022

I start a lot of projects. A lot! Django is my go-to framework for spinning up a quick personal project, and while it's a fantastic framework, a big part of the reason I love Django is that it feels familiar. I have a lot of muscle memory for starting a new project. Here are six things that I do after I run django-admin startproject.

Move the SECRET_KEY into an environment variable

While hard-coding it is something you can get away with if you're keeping your source private, I reconfigure the SECRET_KEY as an environm
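A minimal sketch of that first step, assuming the environment variable is named DJANGO_SECRET_KEY (the name is arbitrary):

# settings.py: read the secret from the environment instead of hard-coding it.
import os

SECRET_KEY = os.environ["DJANGO_SECRET_KEY"]  # raises KeyError if unset

Then set it in the shell before running the dev server, e.g. export DJANGO_SECRET_KEY=changeme (or load it from a .env file with a helper like django-environ).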
(read more)
As far as static analyzers are concerned, one of the most important points to consider is filtering out false positives as much as possible, in order for the reports to be actionable. This is an area where Coverity did an excellent job, and likely a major reason why it got so popular within the open source community, despite being a closed-source product.

LLVM has the LLVM_ENABLE_Z3_SOLVER build option, which allows building LLVM against the Z3 constraint solver. It is documented as follows:

LLVM_ENABLE_Z3_SOLVER:BOOL
If enabled, the Z3 constraint solver is activated for the Clang static analyzer. A recent version of the z3 library needs to be available on the system.

The option is enabled in the Debian 11 package (clang-tools-11), but not in the Fedora 36 or Ubuntu 22.04 ones. I added a build option (not enabled by default) to the llvm and clang packages in Pkgsrc, and successfully built Z3-enabled packages on NetBSD. For Pkgsrc users, add the following in mk.conf, and build lang/clang:

PKG_OPTIONS.llvm= z3
PKG_OPTIONS.clang= z3

There are two ways of using Z3 with the Clang Static Analyzer, and to demonstrate them, let’s reuse the small demo snippet from the SMT-Based Refutation of Spurious Bug Reports in the Clang Static Analyzer paper.

unsigned int func(unsigned int a) {
    unsigned int *z = 0;
    if ((a & 1) && ((a & 1) ^ 1))
        return *z; // unreachable
    return 0;
}

For each method, we can use Clang directly on a given translation unit or use scan-build. The first way is using Z3 as an external constraint solver:

$ clang --analyze -Xanalyzer -analyzer-constraints=z3 main.c

$ scan-build -constraints z3 clang -c main.c
scan-build: Using '/usr/lib/llvm-11/bin/clang' for static analysis
scan-build: Analysis run complete.
scan-build: Removing directory '/tmp/scan-build-2022-06-21-171854-18215-1' because it contains no reports.
scan-build: No bugs found.

This is a lot slower than the default, and the commit which documented the feature mentions a ~15x slowdown over the built-in constraint solver.

The second way is using the default range-based solver but having Z3 do refutation to filter out false positives, which is a lot faster:

$ clang --analyze -Xanalyzer -analyzer-config -Xanalyzer crosscheck-with-z3=true main.c

$ scan-build -analyzer-config crosscheck-with-z3=true clang -c main.c
scan-build: Using '/usr/lib/llvm-11/bin/clang' for static
(read more)
Wow, what a mouthful! Although this architecture has featured in a number of my other writings, I haven't really described it in detail by itself. Which is a shame, because I think it works really well and is quite simple, a case of Sophisticated Simplicity.

Why a reference architecture?

The motivation for creating and now presenting this reference architecture is that the way we build connected mobile apps is broken, and none of the proposed solutions appear to help. How are they broken? They are overly complex, require way too much code, perform poorly and are unreliable.

Very broadly speaking, these problems can be traced to the misuse of procedural abstraction for a problem space that is broadly state-based, and can be solved by adapting a state-based architectural style such as in-process REST and combining it with well-known styles such as MVC. More specifically, MVC has been misapplied by combining UI updates with the model updates, a practice that becomes especially egregious with asynchronous callbacks. In addition, data is pushed to the UI, rather than having the UI pull data when and as needed. Asynchronous code is modelled using call/return and callbacks, leading to callback hell, and the needless and arduous transformation of any dependent code into asynchronous code (see "what color is your function") that is also much harder to read, discouraging appropriate abstractions. Backend communication is also an issue, with newer async/await implementations not really being much of an improvement over callback-based ones, and arguably worse in terms of actual readability. (They seem readabl
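As a crude illustration of the push-vs-pull point only (these Python classes are made up for the sketch; the article's actual proposal is an in-process REST style, not this code):

# Toy contrast: instead of the model pushing updates into the UI via
# callbacks, the UI pulls current state when and as it renders.

class Model:
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)   # just update state; no UI callbacks here

class PullView:
    def __init__(self, model):
        self.model = model

    def render(self):
        # The view reads fresh state on demand, when it actually needs it.
        return f"{len(self.model.items)} item(s)"

m = Model()
v = PullView(m)
m.add("hello")      # model changes carry no knowledge of the view
print(v.render())   # "1 item(s)"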
(read more)
By Ben Johnson, June 04, 2018 — 14 min read · pdf

Go’s paradox is that error handling is core to the language, yet the language doesn’t prescribe how to handle errors. Community efforts have been made to improve and standardize error handling, but many miss the centrality of errors within our application’s domain. That is, your errors are as important as your Customer and Order types. An error must also serve the different goals of each of its consumer roles—the application, the end user, and the operator. This post explores the purpose of errors for each of these consumers within our application and how we can implement a simple but effective strategy that satisfies each role’s needs.

This post expands on many ideas about application domain & project design from Standard Package Layout, so it is helpful to read that first.

Why We Err

Errors, at their core, are simply a way of explaining why things didn’t go how you wanted them to. Go splits errors into two groups—panic and error. A panic occurs when you don’t expect something to go wrong, such as accessing invalid memory. Typically, a panic is unrecoverable, so our application fails catastrophically and we simply notify an operator to fix the bug. An error, on the other hand, is when we expect something could go wrong. That’s the focus of this post.

Types of errors

We can divide errors into two categories—well-defined errors & undefined errors. A well-defined erro
(read more)
ABSTRACT

Large-scale pretrained AI models have shown state-of-the-art accuracy in a series of important applications. As the size of pretrained AI models grows dramatically each year in an effort to achieve higher accuracy, training such models requires massive computing and memory capabilities, which accelerates the convergence of AI and HPC. However, there are still gaps in deploying AI applications on HPC systems, which need application and system co-design based on specific hardware features. To this end, this paper proposes BaGuaLu, the first work targeting the training of brain-scale models on an entire exascale supercomputer, the New Generation Sunway Supercomputer. By combining hardware-specific intra-node optimization and hybrid parallel strategies, BaGuaLu enables decent performance and scalability on unprecedentedly large models. The evaluation shows that BaGuaLu can train 14.5-trillion-parameter models with a performance of over 1 EFLOPS using mixed precision, and has the capability to train 174-trillion-parameter models, which rivals the number of synapses in a human brain.
(read more)
June 21st, 2022

Today we’re announcing our beta release of TypeScript 4.8! To get started using the beta, you can get it through NuGet, or use npm with the following command:

npm install -D typescript@beta

You can also get editor support by downloading for Visual Studio 2022/2019, or by following directions for Visual Studio Code.

Here’s a quick list of what’s new in TypeScript 4.8!

Improved Intersection Reduction, Union Compatibility, and Narrowing
Improved Inference for infer Types in Template String Types
--build, --watch, and --incremental Performance Improvements
Errors When Comparing Object and Array Literals
Improved Inference from Binding Patterns
File-Watching Fixes (Especially Across git checkouts)
Find-All-References Performance Improvements
Breaking Changes

Improved Intersection Reduction, Union Compatibility, and Narrowing

TypeScript 4.8 brings a series of correctness and consistency improvements under --strictNullChecks. These changes affect how intersection and union types work, and are leveraged in how TypeScript narrows types.

For example, unknown is close in spirit to the union type {} | null | undefined because it accepts null, undefined, and any other type. TypeScript now recognizes this, and allows assignments from unknown to {} | null | undefined.

function f(x: unknown, y: {} | null | undefined) {
    x = y; // always worked
    y = x; // used to error, now works
}

Another change is that {} intersected with any other object type simplifies right down to that object type. That meant that we were able to rewrite NonNullable to just use an intersection with {}, because {} & null and {} & undefined just get tossed away.

- type NonNullable<T> = T extends null | undefined ? never : T;
+ type NonNullable<T> = T & {};

This is an improvement because intersection types like this can be reduced and assigned to, while conditional types currently cannot. So NonNullable<NonNullable<T>> now simplifies at least to NonNullable<T>, whereas it didn’t before.

function foo<T>(x: NonNullable<T>, y: NonNullable<NonNullable<T>>) {
    x = y; // always worked
    y = x; // used to error, now works
}

These changes also allowed us to bring in sensible improvements in control flow analysis and type narrowing. For example, unknown is now narrowed just like {} | null | undefined in truthy branches.

function narrowUnknownishUnion(x: {} | null | undefined) {
    if (x) {
        x; // {}
    } else {
        x; // {} | null | undefined
    }
}

function narrowUnknown(x: unknown) {
    if (x) {
        x; // used to be 'unknown', now '{}'
    } else {
        x; // unknown
    }
}

Generic values also get narrowed similarly. When checking that a value isn’t null or undefined, TypeScript now just intersects it with {} – which again, is the same as saying it’s NonNullable. Putting many of the changes here together, we can now define the following function without any type assertions.

function throwIfNullable<T>(value: T): NonNullable<T> {
    if (value === undefined || value === null) {
        throw Error("Nullable value!");
    }
    // Used to fail because 'T' was not assi
(read more)
The annual developer survey of the GraphQL ecosystem (2022 edition, currently open). When GraphQL was first introduced it offered a radically new way to build APIs, with more control, more granularity, and more flexibility. But that flexibility came at a price in the form of extra complexity, and a crop of frameworks, libraries, and services quickly appeared to help define better patterns and workflows. Now, for the first time ever, we're surveying the GraphQL community to figure out which of these many tools are the most popular, and which features are actually being used. With your help, let's see wh
(read more)
The C++20 standard added constraints and concepts to the language. This addition introduced two new keywords, concept and requires. The former is used to declare a concep
(read more)
This book is for anyone who is interested in computing and wants to learn more about the exciting, but sometimes daunting, world of The Shell. The shell is a simple interface for working with computers and programs, and learning some of its features can enormously increase your productivity as a computer user – whether you are a home user or hobbyist, programmer, data scientist, writer, administrator or other professional.

For the newcomer, you'll learn what a shell is, how to use it on your system, and then how to become more effective every day by integrating the shell into your work. For the experienced professional, there is a wealth of detailed tips and tricks in each chapter that go into advanced
(read more)
[Image: a wall of lava lamps at the offices of Cloudflare]

Lavarand was a hardware random number generator designed by Silicon Graphics that worked by taking pictures of the patterns made by the floating mate
(read more)