This project is a proof of concept that any website can identify and track you, even if you are using private browsing or incognito mode in your web browser. Many people think that they can hide their identity if they are using private browsing or incognito mode. This project will prove that they are wrong. How to use the website: Visit http://www.nothingprivate.ml and enter your name. Click the "See the magic" button. Visit the same website in Private browsing / Incognito mode. See the magic ⭐ Don't scroll down and ruin the fun... Just follow the steps above... 😄 Hey! How? Hope you are surprised! 😄 Yes, the website can remember your name even if you visited it via private browsing or incognito mode. Yes, nothing is private in this world anymore! This is what the big companies are doing with your identity. You think that going into private mode will wipe out all the traces? Absolutely not! In reality, using private browsing or incognito mode will just help you clear your browsing history. Your internet service provider, search engines, and your favorite websites can still track you. They know your likes and dislikes. They use your data to earn money. The video below explains everything: Yes, nothing is free... How to stay safe? One way to reduce the likelihood of browser fingerprinting is to use one of the browsers listed in the community-curated list of browsers implementing countermeasures. Browser fingerprinting is just one of several ways that can be used to track your identity. For some others, visit the freeCodeCamp blog. Here's a picture from the blog that explains the current situation: References: https://privatebrowsingmyths.com/ https://panopticlick.eff.org/ https://amiunique.org/ https://www.pcworld.com/article/192648/browser_fingerprints.html https://en.wikipedia.org/wiki/Device_fingerprint https://nakedsecurity.sophos.com/2014/12/01/browser-fingerprints-the-invisible-cookies-you-cant-delete/ https://spreadprivacy.com/browser-fingerprinting/ https://time.com/4673602/terms-service-privacy-security/ https://snapsearch.online/tips/androids-best-private-browsers-privacy-test/ News articles: Google faces $5 bil
(read more)
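The tracking the project demonstrates boils down to browser fingerprinting: hashing a collection of browser and device attributes into an identifier that stays stable across normal and private windows. The sketch below illustrates the general idea in Python as a stand-in for the JavaScript a real page would run; the attribute names and values are hypothetical and are not taken from the Nothing Private source.

```python
import hashlib

def fingerprint(attributes: dict) -> str:
    """Hash a set of browser/device attributes into a stable identifier."""
    # Sort the keys so the same attributes always hash to the same value.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical values a page could read through ordinary JavaScript APIs.
visitor = {
    "userAgent": "Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/93.0",
    "screen": "1920x1080x24",
    "timezone": "Europe/Berlin",
    "languages": "en-US,en",
    "canvasHash": "8c21d7",  # rendering quirks of the GPU/driver/font stack
}

print(fingerprint(visitor))  # identical in a normal and a private window
```

None of these attributes are cleared when you open a private window, which is why the identifier, and therefore "your name" in the demo, survives the switch.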
Learn the hack - Stop the attack WebGoat is a deliberately insecure application that allows interested developers just like you to test vulnerabilities commonly found in Java-based applications that use common and popular open source components. Description Web application security is difficult to learn and practice. Not many people have full blown web applications like online book stores or online banks that can be used to scan for vulnerabilities. In addition, security professionals frequently need to test tools against a platform known to be vulnerable to ensure that they perform as advertised. All of this needs to happen in a safe and legal environment. Even if your intentions are good, we believe you should never attempt to find vulnerabilities without permission. The primary goal of the WebGoat project is simple: create a de-facto interactive teaching environment for web application security. In the future, the project team hopes to extend WebGoat into becoming a security ben
(read more)
GIMP 2.99.8 is our new development version, once again coming with a huge set of improvements. “Work in Progress 2” (follow-up of 2.99.6 image) by Aryeom, Creative Commons by-sa 4.0 - GIMP 2.99.8 To get a more complete list of changes, you should refer to the NEWS file or look at the commit history. The Clone, Heal and Perspective Clone tools now work when multiple layers are selected. There are 2 new modes in particular: When sourcing from multiple selected drawables then cloning into a single drawable, the pixel source is the composited render of source layers. This is similar to “Sample Merged”, except that it is limited to a list of drawables and you don’t have to hide the layers that you don’t want to source from. When cloning while multiple drawables are selected, each drawable clones from itself to itself, i.e. every drawable is both its source and target (the layers selected when sourcing do not matter in this case). This can be very useful in particular when you need to heal several layers exactly the same way, for instance when working on textures and various texture mappings. Development of this feature was proposed and financially supported by Creative Shrimp: Gleb Alexandrov and Aidy Burrows, well-known Blender educators. Here’s an excerpt from a new course where multi-layer cloning is already used: Your browser does not support the video tag. Extract of a video course by Creative Shrimp (Gleb Alexandrov and Aidy Burrows) Selection cue fixed on Wayland and macOS¶ Windows drawing logics evolved in recent compositing window managers. In particular, the drawing of image selection (marching ants 🐜 representing your selection boundary) broke on Wayland, as well as on macOS since Big Sur release. The selection tools were still perfectly working but the outlines were simply not visible on the canvas anymore. We fixed this by reimplementing part of how selections were being drawn over the image. We aimed to only fix this for Wayland, but our recent macOS contributor (see below in macOS package section) confirmed it also fixes the issue for Big Sur. Now the next step is to backport this fix to the stable branch (only for the sake of macOS, since the stable GTK2 version uses XWayland and thus doesn’t exhibit the bug). There have been two more Wayland-specific changes. For our Flatpak builds, we will now use the new fallback-x11 permission instead of x11 to prevent unnecessary X11 access while in Wayland, hence improving security step by step. Finally, some people reported huge memory leaks under Wayland only (it was fine on X11). We didn’t do much so we can’t take any credit for this, but this seems to have been fixed, probably in a dependency with Wayland-specific code. Wider coverage of input devices thanks to Windows Ink support¶ Windows Pointer Input Stack (Windows Ink) support was recently added to GTK3 by Luca Bacci, who also made it available in GIMP and added a new option in the Preferences dialog to switch between Wintab (older API) and Windows Ink. You can find this option on the Input Devices page. Pointer input API selection — GIMP 2.99.8 This is a huge milestone for artists using Windows since more graphics tablets or touch devices come with Ink support as a default whereas the legacy Wintab interface requires specific drivers. This is even more the case with Windows 8 and newer, for which most tablets should work out-of-the-box with Windows Ink. Clicking anywhere on the toolbox or on Wilber’s drop area now returns the focus to the canvas (similarly to the Esc shortcut). 
This allows you to work on canvas with shortcuts more efficiently. For instance, you could pan directly with the Space bar without having to click on canvas (hence activating a tool) when your keyboard focus was previously on some text input widget, by clicking anywhere on the toolbox (buttons and dead area alike) first. Dropping thumbnail icon¶ After years of discussions and bug reports, we dropped the thumbnail icon feature. Previously, when images were opened, the application icon in the taskbar would combine a preview of the active image and the actual application icon (Wilber). The icon would then change whenever the active image changed. For many people, this complicated locating GIMP’s window among windows of other running applications. Moreover, due to recent changes in desktop environments’ behavior, this feature was actually working on fewer and fewer platforms. So depending on your OS and desktop environment, it either didn’t work at all or actively worked against you. This is why we decided to do away with it. Improved file formats support: JPEG-XL, PSD/PSB, and more¶ JPEG-XL is now optionally supported thanks to Daniel Novomeskỳ, who also previously contributed to HEIC/AVIF support. GIMP can load and export JPEG-XL files (.jxl) in grayscale and RGB, with color profile support. Our exporting code also provides a “lossless” option and several “Effort/Speed” encoding values. JPEG-XL exporting options — GIMP 2.99.8 This plug-in is different from the third-party plug-in that is part of the libjxl library that we use too. It supports the GIMP 3 plug-in API, reads grayscale images as grayscale, is written in C rather than C++, and exposes several presets for the speed/quality tradeoff. We also do not yet expose features that could be considered experimental. If you are interested in JPEG-XL support for GIMP 2.10.x, please use the plug-in from libjxl. We also improved support for Adobe Photoshop project files. GIMP now supports larger-than-4GiB PSD files and loading up to 99 channels (the specs say that 56 is the max, but some sample PSD files have more channels). Additionally, you can now also load PSB files, which are essentially PSD files with support for width and height of up to 300,000 pixels. There have been even more changes to file format support and plug-ins: 16-bit SGI images are now supported (until now, they were loaded as 8-bit). The WebP plug-in was ported to the GimpSaveProcedureDialog API. Script-Fu now handles GFile and GimpObjectArray types. … Plug-in development¶ Our API for plug-in developers got the following improvements: New gimp_display_present() function to present a specific display at the top of the image display stack. New gimp_export_thumbnail() function to query the user settings (added in the “Image Import & Export” page of Preferences in this version) on whether or not a file plug-in should export the image thumbnail. New gimp_procedure_dialog_fill_expander() function to create a GtkExpander in procedure dialogs. All widgets within the same container in a GimpProcedureDialog are added to their own GtkSizeGroup for better-aligned generated dialogs, yet only within their own level of widgets. Memory leak fixes¶ Several contributors, including Andrzej Hunt and Massimo Valentini, started chasing small memory leaks with code analyzers, which is a very nice way to spend your downtime. We recommend! 👍 Continuous integration changes¶ Windows¶ Development installer “nightlies”¶ We wrote rules for the continuous integration platform to create installers.
This is very useful for users who want to test new unreleased features and bug fixes. Installers are being created once a week because the full process takes ca. 2 hours and we didn’t want to trigger it too often. If you want to test the latest installer for Windows, here is how you can do it: Go to GIMP’s scheduled pipelines listing and click the “Last Pipeline” ID listed next to the Windows installer item. Select the job named “win-installer-nightly” Click the “Browse” button Navigate to the build/windows/installer/_Output/ directory Finally click the gimp-2.99.*-setup.exe file to download and install it. This procedure or any updated version of it is available in the “Automatic development builds” section of the download page. ⚠️ Be warned that a weekly installer is a purely automated build, there is no human verification. It is happening at a semi-random time during the development
(read more)
As Sylvain Peyronnet already mentioned, logic is an important part of theoretical computer science. However, it is not enough to learn logic from textbooks tailored for pure mathematicians. In other words, it's also important to learn logic from a more "computer science" perspective. Finite Model Theory We want to learn techniques that deal with finite structures. It is well known that many traditional tools from model theory, e.g., compactness and the Löwenheim-Skolem theorem, are not applicable to finite models. This leads us to the study of Finite Model Theory. For this area, I recommend the following excellent books: Leonid Libkin, Elements of Finite Model Theory. (textbook) Grädel et al., Finite Model Theory and Its Applications. (survey articles and applications) A sub-area of finite model theory is descriptive complexity, where we want to characterize complexity classes by the type of logic needed to define the languages. The definitive reference for descriptive complexity is: Neil Immerman, Descriptive Complexity. Proof Complexity Another important area of logic in computer science is Proof Complexity, the study of the three-way relationship among complexity classes, weak logical systems, and propositional proof systems. The following two related aspects are considered: (i) the complexity of proofs of propositional formulas, and (ii) the study of weak theories of arithmetic, called bounded arithmetic. Aspect (i) has to do with the following question: "Is there a propositional proof system in which every tautology has a proof of size polynomial in the size of the tautology?" Aspect (ii) studies logical systems which use restricted reasoning based on concepts from computational complexity. In other words, we associate with each complexity class $C$ a logical theory $VC$, where the provably total functions in $VC$ are exactly the functions in the complexity class $C$. One recent development is a new research program called "bounded reverse mathematics" proposed by Stephen Cook and Phuong Nguyen, where the goal is to classify theorems (of interest in computer science) based on the (minimal) computational complexity of concepts need
(read more)
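To give a flavor of what "characterizing complexity classes by the logic needed to define their languages" means, two classical results of descriptive complexity can be stated as follows (a standard illustration, not quoted from the answer above):

```latex
% Fagin's theorem and the Immerman–Vardi theorem.
\begin{align*}
\mathrm{NP} &= \exists\,\mathrm{SO}
  &&\text{(existential second-order logic, over all finite structures)} \\
\mathrm{P}  &= \mathrm{FO(LFP)}
  &&\text{(first-order logic with least fixed point, over ordered finite structures)}
\end{align*}
```

Fagin's theorem needs no ordering on the structures, while the characterization of P is only known to hold when a linear order is available; whether some logic captures P on unordered structures is itself a central open question of the field.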
Umami is a simple, fast website analytics alternative to Google Analytics. Getting started A detailed getting started guide can be found at https://umami.is/docs/ Installing from source Requirements A server with Node.js 12 or newer A database (MySQL or PostgreSQL) Get the source code and install packages git clone https://github.com/mikecao/umami.git cd umami npm install Create database tables Umami supports MySQL and PostgreSQL. Create a database for your Umami installation and install the tables with the included scripts. For MySQL: mysql -u username -p databasename < sql/sc
(read more)
A lightning talk by Gary Bernhardt from CodeMash 2012 This talk does not represent anyone's actual opinion. For a more serious take on software, try Destroy All Software Screencasts: 10 to 15 minutes every other week, dense with information on advanced topics like Unix, TDD, OO Design, Vim, Ruby, and Git. If you liked this, you might also like Execute Program: interactive courses on TypeScript, Modern JavaScript, SQL, regular expressions, and more. Each course is made up of hundreds of
(read more)
ClickHouse is the workhorse of many services at Yandex and several other large Internet firms in Russia. These companies serve an audience of 258 million Russian speakers worldwide and have some of the greatest demands for distributed OLAP systems in Europe. This year has seen good progress in ClickHouse's development and stability. Support has been added for HDFS, ZFS and Btrfs for both reading datasets and storing table data, a T64 codec which can significantly improve ZStandard compression, faster LZ4 performance and tiered storage. Anyone uncomfortable with the number of moving parts in a typical Hadoop setup might find assurances in ClickHouse as being a single piece of software rather than a loose collection of several different projects. For anyone unwilling to pay for Cloud hosting, ClickHouse can run off a laptop running MacOS; paired with Python and Tableau there would be little reason to connect to the outside world for most analytical operations. Being written in C++ means there are no JVM configurations to consider when running in standalone mode. ClickHouse relies heavily on 3rd-party libraries which helps keep the C++ code base at ~300K lines of code. To contrast, PostgreSQL's current master branch has about 800K lines of C code and MySQL has 2M lines of C++. There have been 13 developers that have made at least 100 commits to the project this year. PostgreSQL has only had five developers reach the same target, MySQL has only seen two. ClickHouse's engineers have managed to deliver a new release every ten days on average for the past 2.5 years. In this post I'm going to benchmark several ways of importing data into ClickHouse. ClickHouse Up & Running ClickHouse supports clustering but for the sake of simplicity I'll be using a single machine for these benchmarks. The machine in question has an Intel Core i5 4670K clocked at 3.4 GHz, 8 GB of DDR3 RAM, a SanDisk SDSSDHII960G 960 GB SSD drive which is connected via a SATA interface. The machine is running a fresh installation of Ubuntu 16.04.2 LTS and I'll be running version 19.15.3.6 of ClickHouse. The dataset used in this post is the first 80 million records from my 1.1 Billion Taxi Ride Benchmarks. The steps taken to produce this dataset are described here. The 80 million lines are broken up into 4 files of 20 million lines each. There have been three formats of each file produce
(read more)
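For readers who want to script an import rather than drive clickhouse-client from the shell, one option is ClickHouse's HTTP interface, which listens on port 8123 by default and accepts INSERT ... FORMAT queries with the data in the request body. The Python sketch below is illustrative only: the table name and file name are hypothetical, and it is not one of the import methods benchmarked in the post.

```python
import urllib.parse
import urllib.request

# Stream one CSV chunk into an existing table via ClickHouse's HTTP interface.
query = "INSERT INTO trips FORMAT CSV"        # hypothetical table name
url = "http://localhost:8123/?" + urllib.parse.urlencode({"query": query})

with open("trips_0.csv", "rb") as csv_file:   # hypothetical file name
    request = urllib.request.Request(url, data=csv_file.read(), method="POST")
    with urllib.request.urlopen(request) as response:
        print(response.status)                # 200 on success
```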
Bedrock is a simple, modular, WAN-replicated, Blockchain-based data foundation for global-scale applications. Taking each of those in turn: Bedrock is simple. This means it exposes the fewest knobs necessary, with appropriate defaults at every layer. Bedrock is modular. This means its functionality is packaged into separate “plugins” that are decoupled and independently maintainable. Bedrock is WAN-replicated. This means it is designed to endure the myriad real-world problems that occur across slow, unreliable internet connections. Bedrock is Blockchain-based. This means it uses a private blockchain to synchronize and self-organize. Bedrock is a data foundation. This means it is not just a simple database that responds to queries, but rather a platform on which data-processing applications (like databases, job queues, caches, etc) can be built. Bedrock is for global-scale applications. This means it is built to be deployed in a geo-redundant fashion spanning many datacenters around the world. Bedrock was built by Expensify, and is a networking and distributed transaction layer built atop SQLite, the fastest, most reliable, and most widely distributed database in the world. Why to use it If you’re building a website or other online service, you’ve got to use something. Why use Bedrock rather than the alternatives? We’ve provided a more detailed comparison against MySQL, but in general Bedrock is: Faster. This is true for networked queries using the Bedrock::DB plugin, but especially true for custom plugins you write yourself because SQLite is just a library that operates inside your process’s memory space. That means when your plugin queries SQLite, it isn’t serializing/deserializing over a network: it’s directly accessing the RAM of the database itself. This is great in a single node, but if you still want more (because who doesn’t?) then install any number of nodes and load-balance reads across all of them. This means every CPU of every database server is available for parallel reads, each of which has direct access to the database RAM. Simpler. This is because Bedrock is written for modern hardware with large SSD-backed RAID drives and generous RAM file caches, and thereby doesn’t mess with the zillion hacky tricks the other databases do to eke out high performance on largely obsolete hardware. This results in fewer esoteric knobs, and sane defaults that “just work”. More reliable. This is because Bedrock’s synchronization engine supports active/active distributed transactions with automatic failover, and can be clustered not just inside a single datacenter, but across multiple datacenters spanning the internet. This means Bedrock continues functioning not only if a single node goes down, but even if you lose an entire datacenter. After all, it doesn’t matter
(read more)
SpawnFest is an annual 48-hour online contest where teams try to build the best BEAM-based application, as determined by the judges based on certain criteria. In this blog post I am going to introduce the new tool I created during SpawnFest. eFlambé is a tool for rapidly profiling Erlang and Elixir code. It is intended to be one of the first tools you reach for when debugging a performance issue in your Elixir or Erlang application. With a single command you can visualize your code’s performance as an interactive flame graph in your flame graph viewer of choice. It’s written in Erlang and published to hex.pm. [eFlambé flame graph viewed in speedscope] There are no new ideas behind eFlambé. Brendan Gregg introduced flame graphs nearly a decade ago and there have been several Erlang projects that have made it possible to generate flame graphs of Elixir and Erlang call stacks. By far the most popular has been Vlad Ki’s eflame. While eflame works well it has two disadvantages: generating an SVG flame graph is a multi-step process, and code that needs to be profiled must be invoked by eflame:apply/3 directly. Sometimes you want to profile a function call as it is made by code inside of your running application. With eflame you must wrap the function call in an eflame:apply/3 call and re-compile and restart your application. eFlambé improves upon eflame by making flame graph generation a single-step process, and making it possible to profile any function inside a running application without having to recompile code or run more than one command. Using eFlambé is easy. Simply add it as a dependency to your project. For Elixir projects, add the following line to your mix.exs file’s dependency section and then run mix deps.get: {:eflambe, "~> 0.2.1"} For Erlang rebar3 projects add the following line to the dependency section of your rebar.config file and then run rebar3 get-deps: {eflambe, "0.2.1"} Then start up your application with a remote shell attached. For Elixir projects this will typically be iex -S mix, and for Erlang rebar3 projects this will be rebar3 shell. There are two different ways to generate flame graphs with eFlambé. Profiling a single call: generating a flame graph of a single function call can be done with the eflambe:apply/2 function. For example, suppose I want to generate a flame graph of my Fibonacci algorithm: iex(1)> :eflambe.apply({MyFibonacci, generate, [10]}, [output_format: :brendan_gregg, open: :speedscope]) The first argument is a tuple containing the module name, function name, and the list of arguments to invoke the function with. The second argument is a list of options for eFlambé. In this example we are specifying the output format as brendan_gregg (one of the formats that speedscope can read) and instructing eFlambé to open the generated flame graph in speedscope (you’ll need to have speedscope already installed for this to work). See the readme for the complete list of options. This will execute MyFibonacci.generate/1 with an argument of 10. Because I specified open: :speedscope, eFlambé will immediately open the flame graph data in speedscope when the function returns. Flame graph of my Fibonacci function viewed in spe
(read more)
During our Apport research we exploited Ubuntu’s crash handler, and following that, we decided to once again audit the coredump creation code. But this time, we chose to focus on a more general target, rather than a specific crash handler. In this post, we will explore how the Linux kernel itself behaves when a process crashes. We will show bugs we found in the Linux kernel that allow unprivileged users to create root-owned core files, and how we were able to use them to get an LPE through the sudo program on machines that have been configured by administrators to allow running a single innocent command. On Linux, a coredump will be generated for a process upon receiving
(read more)
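For orientation while reading the rest of the article: on Linux, whether a core file is produced at all is governed by the process's RLIMIT_CORE resource limit, and what happens to the dump is decided by the kernel's core_pattern setting, which can either name a file path template or pipe the dump to a user-space helper such as a distribution's crash handler. A small Python sketch (not taken from the article) for inspecting both:

```python
import resource

# Core files are suppressed when the soft RLIMIT_CORE limit is 0;
# raise it up to the hard limit for this process and its children.
_, hard = resource.getrlimit(resource.RLIMIT_CORE)
resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

# core_pattern is either a filename template or, when it starts with "|",
# a helper program the kernel pipes the dump to (e.g. a crash handler).
with open("/proc/sys/kernel/core_pattern") as pattern_file:
    print(pattern_file.read().strip())
```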
Looking at our AWS bills, there was one particular line that stood out like a sore thumb. Data transfer. It seemed way out of proportion. “Doh!” I hear you bemoan, “everyone knows tha
(read more)
Author: Chloé Lourseyre. Editor: Peter Fordham. Context Header guards Every C++ developer has been taught header guards. Header guards are a way to prevent a header from being included multiple times, which would be problematic because it would mean that the variables, functions and classes in that header would be defined several times, leading to a compilation error. Example of a header guard: #ifndef HEADER_FOOBAR #define HEADER_FOOBAR class FooBar { // ... }; #endif // HEADER_FOOBAR For those who are not familiar with it, here is how it works: the first time the file is included, the macro HEADER_FOOBAR is not defined. Thus, we enter into the #ifndef control directi
(read more)
I’ve been wanting to publicly comment on Lenovo’s statement on Linux support for a while, as there’s much to say about it, and my failing attempt at finding a suitable replacement for my venerab
(read more)
VCS history versus large open source development October 19, 2021 I recently read Fossil's Rebase Considered Harmful (via), which is another rerun of the great rebase versus everything else debate. This time around, one of the things that occurred to me is that rebasing and an array of similar things allow maintainers of large, public open source repositories to draw a clean line between how people develop changes in private and what appears in the immutable public history of the project. Any open source project can benefit from clean public history, partly because clean history makes it easy to use bisection to locate bugs, but a large project especially benefits because it has so many contributors of varying skill levels and practices. (In addition, consumers of public open source repositories often already see a linear view of the project's code history.) Another aspect of using rebasing and other things that erase history (such as emailed patch series) is that they free people to develop changes in whatever style of VCS usage they find most comfortable and useful. You can set your editor to make a commit every time you save a file, and no one else has to care in the way they very much would if you proposed to merge the entire sequence intact into a large, public open source repository. The more contributors you have (and the more disparate they are), the more potentially useful this is. Of course, there's a continuum, both between projects and in general. It's undeniably sometimes useful to know how a change was developed over time, for various reasons. It can also be useful to know how a change has flowed through various public versions of the code. The Linux kernel famously has a whole collection of trees that changes can wind up in before they get pulled into the mainline, and when this is done the changes often continue to carry their history of trees. Presumably this is useful to Linus Torvalds and other kernel developers. One way to put this is that as an open source project grows larger and larger, I think that it makes less and less sense to try to represent almost everything that happens to the project in its VCS history. VCS history is only o
(read more)
Latest specification is a work in progress. Leading browser vendors are putting the finishing touches to a set of APIs that make it easier for developers to protect their web applications against cross-site scripting (XSS) attacks. Many websites rely on dynamically generated content in the browser. Often, the generated markup includes content provided by outside sources, such as user-provided input, which can include malicious JavaScript code. Sanitizing dynamic markup and making sure it does not contain harmful code is one of the most serious challenges of web security. Native sanitization support Currently, web developers rely on third-party libraries such as DOMPurify to sanitize HTML content and prevent XSS attacks. The Sanitizer API, which was first proposed earlier this year in a draft specification, will give browsers native support to remove harmful code from markup that is dynamically added to web pages. The API is being jointly developed by Google, Mozilla, and Cure53, the maintainer of the DOMPurify library. “The new Sanitizer API proposal is the first step towards a standardized API for a common task that many frontend libraries (e.g., React, Vue) or sanitizers (e.g., DOMPurify, sanitize-html
(read more)
Sendable and @Sendable are part of the concurrency changes that arrived in Swift 5.5 and address a challenging problem of type checking values passed between structured concurrency constructs and actor messages. Before diving into the topic of sendables, I encourage you to read up on my articles around async/await, actors, and actor isolation. These articles cover the basics of the new concurrency changes, which directly connect to the techniques explained in this article. When should I use Sendable? The Sendable protocol and @Sendable closure attribute indicate to the compiler whether the public API of the values being passed around is thread-safe. A public API is safe to use across concurrency domains when there are no public mutators, an internal locking system is in place, or mutators implement copy-on-write like with value types. Many types of the standard library already support the Sendable protocol, taking away the requirement to add conformance to many types. As a result of the standard library support, the compiler can implicitly create support for your custom types. For example, integers support the protocol: extension Int: Sendable {} Once we create a value type struct with a single property of type Int, we implicitly get support for the Sendable protocol: // Implicitly conforms to Sendable struct Article { var views: Int } At the same time, the following class example of the same article would not have implicit conformance: // Does not implicitly conform to Sendable class Article { var views: Int } The class does not conform because it is a reference type and therefore mutable from other concurrent domains. In other words, the Article class is not thread-safe to pass around, and the compiler can’t implicitly mark it as Sendable. Implicit conformance when using generics and enums It’s good to understand that the compiler does not add implicit conformance to generic types if the generic type does not conform to Sendable. // No implicit confor
(read more)
2021-10-19 As you know, Rust does not support optional function arguments, keyword arguments, or function overloading. To overcome this limitation, Rust developers frequently apply the builder pattern. It requires some extra coding, but from the API ergonomics perspective, it gives a similar effect to keyword arguments and optional arguments. Introduction to the problem Consider the following Rust structure: struct User { email: Option<String>, first_name: Option<String>, last_name: Option<String> } In Ruby, a class that holds the same data can be defined as: class User attr_reader :email, :first_name, :last_name def initialize(email: nil, first_name: nil, last_name: nil) @email = email @first_name = first_name @last_name = last_name end end Don't worry much about Ruby, I just want to show you how easily a user can be created by explicitly specifying the relevant fields: greyblake = User.new( email: "[email protected]", first_name: "Sergey", ) last_name is not there, so it gets the default value nil automatically. Initializing a structure in Rust Since we do not have default arguments in Rust, in order to initialize such a structure we would have to list all fields: let greyblake = User { email: Some("ex[email protected]".to_string()), first_name: Some("Sergey".to_string()), last_name: None, } This is quite similar to Ruby's keyword arguments, but we have to set all fields although last_name is None. It works well, but for big complex structures it can be verbose and annoying. Alternatively we can implement a new() constructor: impl User { fn new( email: Option<String>, first_name: Option<String>, last_name: Option<String> ) -> Self { Self { email, first_name, last_name } } } Which will be used in the following way: let greyblake = User::new( Some("[email protected]".to_string()), Some("Sergey".to_string()), None ) But it became even worse: we still have to list values for all the fields, but now it's much easier to screw up by passing values in the wrong order (yeah, the newtype technique could help us here, but this article is not about that
(read more)
View Release Notes for G4.0.0. Today marks the soft release of the 4th Generation of Waterfox. After enough time has elapsed, the automatic update will be seeded out to all users. New Website: You may have noticed a new website - much more information, with a more practical structure. This website also allows us to add documentation, support documents and better ways to convey information. We will be optimising the website over the next few weeks as well as setting up redirects for any old pages that have been missed. New Browser: Waterfox has returned to its roots with performance at the forefront. We have aggressively optimised Waterfox for as much performance as possible. Unfortunately this means we have to leave older systems behind - but any computer from the last decade should work. ARM build
(read more)
It surprises me that when people think of "software that brings about the singularity" they think of text models, or of RL agents. But they sneer at decision tree boosting and the like as boring algorithms for boring problems. To me, this seems counter-intuitive, and the fact that most people researching ML are interested in subjects like vision and language is flabbergasting. For one, because getting anywhere productive in these fields is really hard, for another, because their usefulness seems relatively minimal. I've said it before and I'll say it again, human brains are very good at the stuff they've been doing for a long time. This ranges from things like controlling a human-like body to things like writing prose and poetry. Seneca was as good of a philosophy writer as any m
(read more)
Similar to tomnomnom/gron but in Awk. Features true JSON parser in pure Awk. Reasonably fast with Gawk/Mawk even on large-ish files. Slow with BWK on big JSON files (100K+). Developed in xonixx/intellij-awk. Incubated from xonixx/awk_lab. Usage Gron: $ echo '{"a":[1,2,3]}' | ./gron.awk json={} json.a=[] json.a[0]=1 json.a[1]=2 json.a[2]=3 Un-Gron: $ echo '{"a":[1,2,3]}' | ./gron.awk | ./gron.awk - -u { "a": [ 1, 2, 3 ] } Filter part of JSON: $ curl -s "https://api.github.com/repos/xonixx/gron.awk/commits?per_page=1" \ | ./gron.awk | grep "commit.author" | ./gron.awk - -u [ { "commit": { "author": { "date": "2021-10-19T11:31:34Z", "email": "[email protected]", "name": "xonix" } } } ] JSON structure: $ curl -s 'https://ip-ranges.amazonaws.com/ip-ranges.json' | ./gron.awk - -s .syncToken = "1634535194" .createDate = "2021-10-18-05-33-14" .prefixes[].ip_prefix = "3.5.140.0/22" .prefixes[].region = "ap-northeast-2" .prefixes[].service = "AMAZON" .prefixes[].network_border_group = "ap-northeast-2" .ipv6_prefixes[].ipv6_prefix = "2a05:d07a:a000::/40" .ipv6_prefixes[].region = "eu-south-1" .ipv6_prefixes[].
(read more)
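gron.awk implements the same transformation as the original gron: flatten a JSON document into a list of greppable path assignments, and reassemble JSON from such a list with -u. A rough Python re-implementation of the flattening direction (the idea only, not a port of the Awk code) looks like this:

```python
import json

def gron(value, path="json"):
    """Print a JSON value as greppable path=value assignments, gron-style."""
    if isinstance(value, dict):
        print(f"{path}={{}}")
        for key, item in value.items():
            gron(item, f"{path}.{key}")
    elif isinstance(value, list):
        print(f"{path}=[]")
        for index, item in enumerate(value):
            gron(item, f"{path}[{index}]")
    else:
        print(f"{path}={json.dumps(value)}")

gron(json.loads('{"a":[1,2,3]}'))
# json={}
# json.a=[]
# json.a[0]=1
# json.a[1]=2
# json.a[2]=3
```

Because every line is a self-contained assignment, ordinary line-oriented tools like grep can filter the structure, which is exactly the property the examples above rely on.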
LocalizationThe technology that powers Continuous Localization at CanvaBy Minh Cung and Simon HammondAt Canva, part of our commitment to inclusivity is building a global design product that’s accessible to everyone in the world. Our vision is to empower the world to design, so one of our crazy big goals is to be available in every language. With over 7,000 languages in the world, that’s a work in progress, but at the time of this post, Canva is available in 104 languages across the globe.Localization matters to us, not just because it helps fulfil our mission to be truly accessible to everyone in the world, but also because it fuels our growth. The majority of our users today work in a language other than English, and as we continue to grow, a larger and larger proportion of our users will be working in non-English languages!Canva also changes fast: since we release new features every day, we like all of them to be translated and ready to go for as many of our international users as possible. Our engineers and designers need to be aware of the needs of our users around the world, without being slowed down by manual checks and wait times.In this blog post, we’ll show you how we built a localization system that scales to as many languages as we like, without having to wait weeks for the features to roll out. We’ll look at the tooling our engineers interact with to get their features translated and merged without sacrificing quality or user experience (UX) for our users.Localization isn’t just a matter of taking the English language content from each page and passing it through machine translation. Localization requires that we maintain a sensitivity to the culture
(read more)
Hello everybody! I bring good news! GCC with Ada support has been updated in NetBSD! Now versions 10 and 11 should work on x86 and x86_64 NetBSD machines! You can find them in pkgsrc-wip (gcc10-aux) [1] and Ravenports (gcc11) [http://www.ravenports.com/]! First things first, the acknowledgements: a big thank you goes to J. Marino who did the original gcc-aux packages and who provided most if not all the work when it came to fixing the threads and symbols. Another big thank you goes to tobiasu who correctly picked up that the pthread structure wrappers were not correct and had to be remade. Another big thank you goes to Jay Patelani for his help with pkgsrc. So, long story short. Most of the work that had been done up until a few weeks ago was done correctly, but the failing tests (most related to tasking) were failing in very strange ways. It happened that the pthread structure memory that the Ada wrapper was using was incorrect, so we were getting completely erratic behaviour. Once that got fixed, pretty much all tests passed. J. Marino also took the time and effort to create __gnat_* function wrappers to all the symbols that the NetBSD people have renamed. This is a much cleaner fix and allows for the renamed functions to generate the correct symbols since now they are getting preprocessed. It should also be more "upstream friendly". The issue, however, remains if NetBSD decides to rename more functions that are still being linked directly. There are
(read more)
Andrew Huth October 20, 2021 Accessibility isn't fixing a giant backlog of audit bugs. Being accessible is a design and engineering process that identifies and fixes issues in a tight feedback loop. And ideally involves testing with real people.
(read more)
author Baptiste Daroussin 2021-10-19 06:46:12 +0000; committer Baptiste Daroussin 2021-10-20 07:34:05 +0000; commit d410b585b6f00a26c2de7724d6576a3ea7d548b7. sh(1): make it the default shell for the root user. In recent history sh(1) has gained the missing features for it to become a usable interactive shell: - command completion - persistent history support - improvements on the default
(read more)
I grew up with the Release Early, Release Often [0] mantra, first as a way to take part in the FLOSS community, but also to try to gather feedback to grow my skills. The latter is mostly unsuccessful: I used to mostly receive indirect feedback, e.g. other projects picking up the same ideas, and more recently direct feedback in the form of private or public e-mails, conversations on IRC, mailing lists or forums (: e.g. https://lobste.rs/ :) [0] https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar#Lessons_for_creating_good_open_source_software There are many ways to release a project,
(read more)
This blog post describes the optimizations enabled by -ffast-math when compiling C or C++ code with GCC 11 for x86_64 Linux (other languages/operating systems/CPU architectures may enable slightly different optimizations). Most of the “fast math” optimizations can be enabled/disabled individually, and -ffast-math enables all of them:1 -ffinite-math-only -fno-signed-zeros -fno-trapping-math -fassociative-math -fno-math-errno -freciprocal-math -funsafe-math-optimizations -fcx-limited-range When compiling standard C (that is, when using -std=c99 etc. instead of the default “GNU C” dialect), -ffast-math also enables -ffp-contract=fast, allowing the compiler to combine multiplication and addition instructions with an FMA instruction (godbolt). C++ and GNU C a
(read more)
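The reason a flag like -fassociative-math is not on by default is that IEEE-754 floating-point addition and multiplication are not associative, so reassociating expressions (which the optimizer wants to do for vectorization and instruction-level parallelism) can change the computed result. Python performs the same double-precision arithmetic with no fast-math mode, so the effect is easy to demonstrate outside of C:

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c    # 0.6000000000000001
right = a + (b + c)   # 0.6

print(left == right)  # False: reassociation changed the rounding
```

-ffast-math is therefore a statement that you accept such differences, along with the weakened NaN/infinity and signed-zero guarantees implied by the flags listed above, in exchange for faster code.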
Xorshift is a simple, fast pseudorandom number generator developed by George Marsaglia. The generator combines three xorshift operations where a number is exclusive-ored with a shifted copy of itself: /* 16-bit xorshift PRNG */ unsigned xs = 1; unsigned xorshift() { xs ^= xs << 7; xs ^= xs >> 9; xs ^= xs << 8; return xs; } There are 60 shift triplets with the maximum period 2¹⁶−1. Four triplets pass a series of lightweight randomness tests including randomly plotting various n × n matrices using the high bits, low bits, reversed bits, etc. These are: 6, 7, 13; 7, 9, 8; 7, 9, 13; 9, 7, 13. 7, 9, 8 is the most efficient when implemented in Z80, generating a number in 86 cycles. For comparison, the example in C takes approximately 1,200 cycles when compiled with HiSoft C v1.3
(read more)
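For readers who want to experiment without a Z80 or a C compiler, here is a direct Python port of the 16-bit generator above using the (7, 9, 8) triplet. Python integers are unbounded, so the left shifts are masked to 16 bits to mimic the C version's unsigned arithmetic:

```python
def make_xorshift16(seed=1):
    """16-bit xorshift PRNG with the (7, 9, 8) shift triplet from the post."""
    xs = seed & 0xFFFF
    def next_value():
        nonlocal xs
        xs ^= (xs << 7) & 0xFFFF
        xs ^= xs >> 9
        xs ^= (xs << 8) & 0xFFFF
        return xs
    return next_value

rng = make_xorshift16()
print([rng() for _ in range(5)])

# A maximum-period triplet cycles through every non-zero 16-bit state
# exactly once before repeating, i.e. 65535 distinct values.
rng, seen = make_xorshift16(), set()
value = rng()
while value not in seen:
    seen.add(value)
    value = rng()
print(len(seen))
```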
On September 30, 2021, Google released version 94.0.4606.71 of Chrome. The release note specified that two of the fixed vulnerabilities, CVE-2021-37975 and CVE-2021-37976, are being exploited in the wild. In this post, I’ll analyse CVE-2021-37975 (reported by an anonymous researcher), which is a logic bug in the implementation of the garbage collector (GC) in v8 (the JavaScript interpreter of Chrome). This bug allows reachable JavaScript objects to be collected by the garbage collector, which then leads to a use-after-free (UAF) vulnerability for arbitrary objects in v8. In this post, I’ll go through the root cause analysis and exploit development for the bug. As usual, the disclaimer here is that, as I don’t have access to any information about the bug that isn’
(read more)
The source and binary releases v2.9 are available for clone and download. The official v2.9 of AsmBB has been released. Here is a link to the release commits in the repository: https://asm32.info/fossil/asmbb/timeline?t=v2.9&n=200 The important changes One new responsive theme has been created, named "Urban Sunrise". This is an attempt to really improve the forum appearance. ( feedback is welcome ). Also, this theme contains really improved post editors with embedded extended help for the post formatting. In addition it supports Unicode Emoji characters in really native way, both in the post editor and the real-time chat: 😃 🤖 🏆 🥇 "Urban Sunrise" supports source code syntax highlighting (through the JS library). The real-time chat now accepts m
(read more)
Some time ago, I wrote a Twitter thread about one of the unseen hard problems in software development—access to the common knowledge. Since then, a few things happened to me, one of the most important being the inception of the new project trying to attack those problems, named WikipediaQL (that had already attracted some positive attention even in the early stages it is). I am still working on that project and plan a series of articles on the problems of common sense knowledge extraction and practical approaches to it. As a prelude for this series and linkable justification of various aspects of my work, the current article is the (more orderly) republishing of the Twitter thread above. Here goes. The problem Some of the hardest problems to bring in the software development ec
(read more)
An analysis of current and potential kernel security mitigations Posted by Jann Horn, Project Zero This blog post describes a straightforward Linux kernel locking bug and how I exploited it against Debian Buster's 4.19.0-13-amd64 kernel. Based on that, it explores options for security mitigations that could prevent or hinder exploitation of issues similar to this one. I hope that stepping through such an exploit and sharing this compiled knowledge with the wider security community can help with reasoning about the relative utility of various mitigation approaches. A lot of the individual exploitation techniques and mitigation options that I am describing here aren't novel. However, I believe that there is value in writing them up together to show how various mitigations interact
(read more)
Author Name Kurt Mackey Twitter @mrkurt Fly.io runs apps close to users, by transmuting Docker containers into micro-VMs that run on our own hardware around the world. This is a post about one of the major costs of running a service like ours, but if you're more interested in how Fly.io works, the easiest way to learn more is to try it out; you can be up and running in just a couple minutes.Two obvious costs of running Internet apps for users on your own hardware: hardware and bandwidth. We buy big servers and run them in racks at network providers that charge us to route large volumes of traffic using BGP4 Anycast. You probably have at least a hazy intuition for what those costs look like. But there's a non-obvious cost to what we do: we need routable IPv4 addresses. To
(read more)
Making a good Pull Request involves more than writing good code. The Pull Request model has turned out to be a great way to build software in teams - particularly for distributed teams; not only for open source development, but also in enterprises. Since some time around 2010, I've been reviewing Pull Requests both for my open source projects, but also as a team member for some of my customers, doing closed-source software, but still using the Pull Request work flow internally. During all of that time, I've seen many great Pull Requests, and some that needed some work. A good Pull Request involves more than just some code. In most cases, there's one or more reviewer(s) involved, who will have to review your Pull Request in order to evaluate whether it'
(read more)
I frequently run up against an issue when submitting stories from Ferrous Systems here: I think there’s an ambiguity around the use of the author field, and I have the gut feeling that there are different information needs here. As an example, this story is written by Jorge, not me: https://lobste.rs/s/f2uvgf/structuring_testing_debugging Strictly speaking, I am not the author - so if the author field is just there to mark the author in the replies, I should not check it. But I am a managing employee/owner of that company. So I’d like to disclose that - even though I think the content is relevant for this platform and interesting - I have an interest in it being posted here. I am not ashamed of doing so, but it feels bad not to be open about this - it enables readers to give feed
(read more)
In this blog post we'll explore how to structure a procedural macro, AKA proc-macro, crate to make it easier to test. We'll show different testing approaches: unit tests, for code coverage; compile-fail tests, to ensure good error messages; and end-to-end tests, to validate behavior We'll also share some techniques for debugging the seemingly inscrutable error messages generated by buggy procedural macros. ⚠️ This post assumes you know the basics of creating a procedural macro crate and are familiar with the libraries syn, a Rust code parser, and quote, a Rust code generator – a refresher section on those concepts is included below. Procedural macros operate at the syntax level, as source code transformations, before any type checking happens. The output of procedural
(read more)
Look, I get it. You don’t like the Perl programming language or have otherwise disregarded it as “dead.” (Or perhaps you haven’t, in which case please check out my other blog posts!) It has weird noisy syntax, mixing regular expressions, sigils on variable names, various braces and brackets for data structures, and a menagerie of cryptic special variables. It’s old: 34 years in December, with a history of (sometimes amateur) developers that have used and abused that syntax to ship code of questionable quality. Maybe you grudgingly accept its utility but think it should die gracefully, maintained only to run legacy applicatio
(read more)
TL;DR: The setuptools team no longer wants to be in the business of providing a command line interface and is actively working to become just a library for building packages. This does not mean that setuptools itself is deprecated, or that using setup.py to configure your package builds is going to be removed. The only thing you must stop doing is directly executing the setup.py file — instead delegate that to purpose-built or standards-based tools, preferably those that work with any build backend. For a long time, setuptools and distutils were the only game in town when it came to creating Python packages, and both of these provided a simple enough interface: you write a setup.py file that invokes the setup() method, you get a Makefile-like interface exposed by invoki
(read more)
If you have ever looked at fuzzing in any depth you will quickly realize it’s not as trivial as it first appears. There are many different types of fuzzers, but here we are focused on network fuzzers.  These fuzzers are of particular interest as they are most suited to fuzzing telecoms products/protocols, where the application and source code are generally not available.  There are very few fuzzer options when the input to these applications is via network sockets instead of the more traditional command line interface. In this blog we will cover some basic background of fuzzing, before diving into the specifics of trying to fuzz telecoms 5G protocols using both proprietary and open source fuzzers.  We will aim to assess these fuzzers for their suitability to fuzz 5G network protocols.  We will end this post with a comparison of the fuzzers, some of the findings, and a general conclusion regarding the fuzzing of 5G telecoms protocols. The focus of this research is on the use of the different fuzzers with regard to 5G telecoms protocols and not the specific vulnerabilities that were found, although one example vulnerability found is cited within. Background So, what is fuzzing?  Fuzzing is simply an automated process of sending invalid or random inputs to a program/system under test in an attempt to cause a crash or malfunction. Fuzzing is not a new technology, however it is becoming more prominent in today’s software development life cycle. It is often used to find vulnerabilities in software that might otherwise be missed by normal unit/system tests.  While the high-level concept of fuzzing is easy to grasp, the actual implementation
(read more)
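At its core, the kind of network fuzzer discussed here takes a known-valid message, randomly corrupts it, fires it at the target over a socket, and watches for crashes or hangs. The toy sketch below shows that skeleton in Python against a hypothetical UDP service (the address, port, and seed bytes are made up); a real 5G fuzzer additionally has to speak SCTP, track protocol state, and instrument the target, which is what the tools compared in the post provide.

```python
import random
import socket

def mutate(payload: bytes, flips: int = 4) -> bytes:
    """Randomly corrupt a few bytes of a valid seed message."""
    data = bytearray(payload)
    for _ in range(flips):
        data[random.randrange(len(data))] = random.randrange(256)
    return bytes(data)

def fuzz(host: str, port: int, seed: bytes, iterations: int = 1000) -> None:
    """Send mutated messages and flag inputs after which the target stops answering."""
    for i in range(iterations):
        case = mutate(seed)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(1.0)
            sock.sendto(case, (host, port))
            try:
                sock.recv(65535)
            except socket.timeout:
                # No reply: the service may have crashed or hung; keep the input.
                print(f"iteration {i}: no response after sending {case.hex()}")

# Hypothetical target and seed message; replace with a captured valid packet.
fuzz("192.0.2.10", 9999, seed=bytes.fromhex("0016002a0101"))
```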
A new IHP release with new features and many bug fixes. This release also includes the Stripe Integration and Docker support for IHP Pro and IHP Business users 🚀 💰 Payments with Stripe: This version finally ships one of the most requested features, the Stripe Integration. With the new Stripe Integration you can easily deal with payments and subscriptions in your projects. If you ever wanted to build as SaaS with Haskell, today is the best day to start! 💻 module Web.Controller.CheckoutSessions where import Web.Controller.Prelude import qualified IHP.Stripe.Types as Stripe import qualified IHP.Stripe.Actions as Stripe instance Controller CheckoutSessionsController where beforeAction = ensureIsUser action CreateCheckoutSessionAction = do plan <- query @Plan |> fetchOne stripeCheckoutSession <- Stripe.send Stripe.CreateCheckoutSession { successUrl = urlTo CheckoutSuccessAction , cancelUrl = urlTo CheckoutCancelAction , mode = "subscription" , paymentMethodTypes = ["card"] , customer = get #stripeCustomerId currentUser , lineItem = Stripe.LineItem { price = get #stripePriceId plan , quantity = 1 , taxRate = Nothing , adjustableQuantity = Nothing } , metadata = [ ("userId", tshow currentUserId) , ("planId", tshow planId) ] } redirectToUrl (get #url stripeCheckoutSession) action CheckoutSuccessAction = do plan <- fetchOne (get #planId c
(read more)
int auth_call(auth_session_t *as, char *path, ...) { char *line; struct authdata *data; struct authopts *opt; pid_t pid; int status; int okay; int pfd[2]; int argc; char *argv[64]; /* 64 args should be more than enough */ #define Nargc (sizeof(argv)/sizeof(argv[0])) va_start(as->ap0, path); argc = 0; if ((argv[argc] = _auth_next_arg(as)) != NULL) ++argc; if (as->fd != -1) { argv[argc++] = "-v"; argv[argc++] = "fd=4"; /* AUTH_FD, see below */ } /* XXX - fail if out of space in argv */ for (opt = as->optlist; opt != NULL; opt = opt->next) { if (argc < Nargc - 2) { argv[argc++] = "-v"; argv[argc++] = opt->opt; } else { syslog(LOG_ERR, "too many authentication options"); goto fail; } } while (argc < Nargc - 1 && (argv[argc] = _auth_next_arg(as))) ++argc; if (argc >= Nargc - 1 && _auth_next_arg(as)) { if (memcmp(&nilap, &(as->ap0), sizeof(nilap)) != 0) { va_end(as->ap0); explicit_bzero(&(as->ap0), sizeof(as->ap0)); } if (memcmp(&nilap, &(as->ap), sizeof(nilap)) != 0) { va_end(as->ap); explicit_bzero(&(as->ap), sizeof(as->ap)); } syslog(LOG_ERR, "too many arguments"); goto fail; } argv[argc] = NULL; if (socketpair(PF_LOCAL, SOCK_STREAM, 0, pfd) == -1) { syslog(LOG_ERR, "unable to create backchannel %m"); warnx("internal resource failure"); goto fail; } switch (pid = fork()) { case -1: syslog(LOG_ERR, "%s: %m", path); warnx("internal resource failure"); close(pfd[0]); close(pfd[1]); goto fail; case 0: #define COMM_FD 3 #define AUTH_FD 4 if (dup2(pfd[1], COMM_FD) == -1) err(1, "dup of backchannel"); if (as->fd != -1) { if (dup2(as->fd, AUTH_FD) == -1) err(1, "dup of auth fd"); closefrom(AUTH_FD + 1); } else closefrom(COMM_FD + 1); execve(path, argv, auth_environ); syslog(LOG_ERR, "%s: %m", p
(read more)
October 18, 2021 I think a lot of people on the internet are wrong. In particular (well, this time, anyway), about what is and isn't valid to use in a "hostname". Yes, yes, xkcd 386, but still, humor me, will ya? Just about everywhere you look, people will tell you that a valid hostname consists only of the characters a-z, numbers, and hyphens ("-"), but that they can't start or end with a hyphen. So you start putting together a regular expression along the lines of: ^[a-z0-9]+(([a-z0-9-])*[a-z0-9]+)*$ Now, as the famous saying goes, you have (at least) two problems: DNS names are case-insensitive, so www.netmeister.org and wWw.NeTmEiStEr.OrG are equivalent: $ host www.netmeister.org www.netmeister.org is an alias for panix.netmeister.org. panix.netmeister.org has address 166.84.7.99 panix.netmeister.org has IPv6 address 2001:470:30:84:e276:63ff:fe72:3900 $ host wWw.NeTmEiStEr.OrG www.netmeister.org is an alias for panix.netmeister.org. panix.netmeister.org has address 166.84.7.99 panix.netmeister.org has IPv6 address 2001:470:30:84:e276:63ff:fe72:3900 $ And in my zone file, I can create labels using upper or lower case characters, and a lookup for either will yield all matching results: $ grep ^[bB] valid.dns.netmeister.org b IN A 203
(read more)
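The first of those problems is easy to patch: lowercase the input before matching, or compile the pattern case-insensitively, and apply it per dot-separated label rather than to the whole name. Here is a Python adaptation of the post's regex along those lines; as the article goes on to argue, this is still far from the whole story of what a valid DNS name can contain.

```python
import re

# The naive per-label pattern from the post, compiled case-insensitively
# because DNS names are case-insensitive.
LABEL = re.compile(r"^[a-z0-9]+(([a-z0-9-])*[a-z0-9]+)*$", re.IGNORECASE)

def looks_like_hostname(name: str) -> bool:
    """Check each dot-separated label against the (still incomplete) pattern."""
    labels = name.rstrip(".").split(".")
    return all(LABEL.match(label) for label in labels)

print(looks_like_hostname("www.netmeister.org"))   # True
print(looks_like_hostname("wWw.NeTmEiStEr.OrG"))   # True, thanks to IGNORECASE
print(looks_like_hostname("-bad-.example"))        # False: leading/trailing hyphen
```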
Normally, when you hear the phrase “trusted computing,” you think about schemes designed to create roots of trust for companies, rather than the end user. For example, Microsoft’s Palladium project during the Longhorn development cycle of Windows is a classically cited example of trusted computing used as a basis to enforce Digital Restrictions Management against the end user. However, for companies and software maintainers, or really anybody who is processing sensitive data, maintaining a secure chain of trust is paramount, and that root of trust is always the hardware. In the past, this was not so difficult: we had very simple computers, usually with some sort of x86 CPU and a BIOS, which was designed to be just enough to get DOS up and running on a system. This combination resulted in something trivial to audit and for the most part everything was fine. More advanced systems of the day, like the Macintosh and UNIX workstations such as those sold by Sun and IBM used implementations of IEEE-1275, also known as Open Firmware. Unlike the BIOS used in the PC, Open Firmware was written atop a small Forth interpreter, which allowed for a lot more flexibility in handling system boot. Intel, noting the features that were enabled by Open Firmware, ultimately decided to create their own competitor called the Extensible Firmware Interface, which was launched with the Itanium. Intel’s EFI evolved into an architecture-neutral variant known as the Unified Extensible Firmware Interface, frequently referred to as UEFI. For the most part, UEFI won against Open Firmware: the only vendor still supporting it being IBM, and only as a legacy compatibility option for their POWER machines. Arguably the demise of Open Firmware was more related to industry standardization on x86 instead of the technical quality of UEFI however. So these days the most common architecture is x86 with UEFI firmware. Although many firmwares out there are complex, this in and of itself isn’t impossible to audit: most firmware is built on top of TianoCore. However, it isn’t ideal, and is not even the largest problem with modern hardware. Low-level hardware initi
(read more)
These notes provide a more detailed discussion of major new features, including the motivation for implementing them and their usage examples. For the complete list of changes, refer to the Release Announcement or the NEWS files in the individual packages. See also the discussion of this release on r/cpp/ and r/programming/. The main focus of this release is support for build-time dependencies and the host/target configuration split that it necessitates. This support required a large amount of ground work which produced functionality useful in its own right, such as hermetic builds and configuration linking. Another notable new feature is ad hoc regex pattern rules. The following sections discuss these and other new features in detail. A note on backwards compatibility: this release cannot be upgraded to from 0.13.0 and has to be installed from scratch.
1 Infrastructure
1.1 New CI configurations (15 new, 58 in total)
2 Toolchain
2.1 Standard pre-installed build system modules
2.2 Performance optimizations
3 Build System
3.1 Hermetic build configurations
3.2 Ad hoc regex pattern rules
3.3 Complete C++20 modules support with GCC
3.4 Warning suppression from external C/C++ libraries
3.5 Pre-defined config..develop variable
3.6 Automatic DLL symbol exporting
3.7 Kconfig configuration support
4 Project Dependency Manager
4.1 Build-time dependencies
4.2 Build configuration linking
4.3 Configuration preservation during synchronization
5 Package Dependency Manager
5.1 Build-time dependencies and configuration linking
1.1 New CI configurations (15 new, 58 in total)
The following new build configurations have been added to the CI service:
freebsd_13.0-clang_11.0
linux_debian_10-gcc_10.2
linux_debian_10-gcc_11.2
linux_debian_10-clang_11.0[_libc++]
linux_debian_10-clang_12.0[_libc++]
linux_debian_11-clang_13.0[_libc++]
macos_11-clang_12.0 (Xcode 12.5.1 Clang 12.0.5)
macos_11-clang_13.0 (Xcode 13
(read more)
The first FreeBSD 12.3-PRERELEASE snapshots are finally available. This means we can try them in a new ZFS Boot Environment without touching our currently running 13.0-RELEASE system. We cannot take the usual path of creating a new BE from our current one and upgrading it to the newer version, because 12.3 has an older major version than 13.0. It is kind of a paradox in the FreeBSD release process that, when released, 12.3-RELEASE will have some newer commits and features than the older 13.0-RELEASE, which came out earlier this year. Of course not all things that have been committed to HEA
(read more)
Outline
1. Introduction
2. How does MT19937 PRNG work?
3. Using Neural Networks to model the MT19937 PRNG
3.1 Using NN for State Twisting
3.1.1 Data Preparation
3.1.2 Neural Network Model Design
3.1.3 Optimizing the NN Inputs
3.1.4 Model Results
3.1.5 Model Deep Dive
3.1.5.1 Model First Layer Connections
3.1.5.2 The Logic Closed-Form from the State Twisting Model Output
3.2 Using NN for State Tempering
3.2.1 Data Preparation
3.2.2 Neural Network Model Design
3.2.3 Model Results
3.2.4 Model Deep Dive
3.2.4.1 Model Layers Connections
3.2.4.2 The Logic Closed-Form from the State Tempering Model Output
3.2.
(read more)
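For readers who want a concrete handle on what the "tempering" half of that outline refers to, this is the standard MT19937 output tempering transform written out in Python (the shift/mask constants are the usual 32-bit MT19937 parameters; this is background material, not code from the article itself):

```python
def temper(y: int) -> int:
    """MT19937 output tempering: four invertible shift/XOR/mask steps applied
    to a 32-bit word of internal state before it is returned to the user."""
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y & 0xFFFFFFFF

print(hex(temper(0x12345678)))
```

Each step is invertible, which is why it is plausible for a small model (or a closed-form expression) to recover the pre-tempering state from observed outputs.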
2021-10-18 19:00:00 One of the tasks given to me by the Python Software Foundation as part of the Developer in Residence job was to look at the state of CPython as an active software development project. What are people working on? Which standard libraries require the most work? Who are the active experts behind which libraries? Those were just some of the questions asked by the Foundation. In this post I’m looking into our Git repository history and our Github PR data to find answers. All statistics below are based on public data gathered from the python/cpython Git repository and its pull requests. To make the data easy to analyze, they were converted into Python objects with a bunch of scripts that are also open source. The data is stored as a shelf, which is a persistent, dictionary-like object. I used that since it was the simplest possible thing I could do and very flexible. It was easy to control consistency that way, which was important for doing incremental updates: the project we’re analyzing changes every hour! Downloading Github PR data from scratch using its REST API is a time-consuming process due to the rate limits it imposes on clients. It will take you multiple hours to do it. Fortunately, since this is based on the immutable history of the Git repository and historical pull requests, you can speed things up significantly if you download an existing shelve.db file and start from there. Before we begin: The work here is based on a snapshot of data in time; it deliberately merges some information, skips over other information, and might otherwise be incomplete or inaccurate due to it essentially being preliminary work. Please avoid drawing far-reaching conclusions from this post alone. Who is who? Even though the entire dataset comes from public sources, e-mail addresses are considered personally identifiable information, so I avoid collecting them by using Github usernames instead. This is mostly fine but is a tricky proposition when data from the Git repository needs to be linked as well. Commit authors and co-authors are listed in commit metadata and in the commit message (using the Authored-by and Co-authored-by headers), using the traditional NAME notation. To link them, I used a handy user search endpoint in Github’s REST API. Again, due to rate limits, I cache the results (including misses) to avoid wasting queries on address
(read more)
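The shelf mentioned in the excerpt is Python's standard-library shelve module; a minimal sketch of the persistent, dictionary-like behaviour it provides (the file name and keys here are made up for illustration):

```python
import shelve

# Open (or create) a persistent, dict-like store backed by a file on disk.
with shelve.open("prs.db") as db:
    db["pr/12345"] = {"author": "octocat", "state": "merged"}  # any picklable value

# Reopening later finds the same data, which is what makes cheap
# incremental updates possible.
with shelve.open("prs.db") as db:
    print(db.get("pr/12345"))
```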
The papers over the next few weeks will be from (or related to) research from VLDB 2021 - on the horizon is one of my favorite systems conferences, SOSP. As always, feel free to reach out on Twitter with feedback or suggestions about papers to read! These paper reviews can be delivered weekly to your inbox, or you can subscribe to the Atom feed. TAO: Facebook’s Distributed Data Store for the Social Graph This is the first in a two-part series on TAO (TAO stands for “The Associations and Objects” - associations are the edges in the graph, and objects are the nodes), Facebook’s read-optimized, eventually-consistent graph database. Unlike other graph databases, TAO focuses exclusively on serving and caching a constrained set of application requests at scale (in contrast to systems focused on data analysis). Furthermore, the system builds on expertise scaling MySQL and memcache, as discussed in a previous paper review. The first paper in the series focuses on the original TAO paper, describing the motivation for building the system, its architecture, and engineering lessons learned along the way. The second part focuses on TAO-related research published at this year’s VLDB - RAMP-TAO: Layering Atomic Transactions on Facebook’s Online TAO Data Store. This new paper describes the design and implementation of transactions on top of the existing large-scale distributed system - a task made more difficult by the requirement that applications should gradually migrate to the new functionality and that the work to support transactions should have limited impact on the performance of existing applications. What are the paper’s contributions? The original TAO pa
(read more)
How do you know whether the code you wrote is readable? In a recent Twitter thread about pair and mob programming, Dan North observes: "That’s the tricky problem I was referring to. If you think you can write code that other humans can understand, without collaborating or calibrating with other humans, assuming that an after-the-fact check will always be affirmative, then you are a better programmer than me." I neither think that I'm a better programmer than Dan nor that, without collaboration, I can write code that other humans can understand. That's why I'd like someone else to review my code. Not write it together with me, but read it after I've written it. Advantages of pair and ensemble programming # Pair programming and ensemble (AKA mob) programming is an efficient way to develop software. It works for lots of people. I'm not insisting otherwise. By working together, you can pool skills. Imagine working on a feature for a typical web application. This involves user interface, business logic, data access, and possibly other things as well. Few people are experts in all those areas. Personally, I'm comfortable around business logic and data access, but know little about web front-end development. It's great to have someone else's expertise to draw on. By working together in real time, you avoid hand-offs. If I had to help implementing a feature in an asynchronous manner, I'd typically implement domain logic and data access in a REST API, then tell a front-end expert that the API is ready. This way of working introduces wait times into the process, and may also cause rework if it turns out that the way I designed the API doesn't meet the requirements of the front end. Real-time collaboration addresses some of these concerns. It also improves code ownership. In Code That Fits in Your Head, I quote Birgitta Böckeler and Nina Siessegger: "Consistent pairing makes sure that every line of code was touched or seen by at least 2 people. This increases the chances that anyone on the team feels comfortable changing the code almost anywhere. It also makes the codebase more consistent than it would be with single coders only. "Pair programming alone does not guarantee you achieve collective code ownership. You need to make sure that you also rotate people through different pairs and areas of the code, to prevent knowledge silos." With mob programming, you take many of these advantages to the next level. If you include a domain expert in the group, you can learn about what the organisation actually needs as you're developing a feature. If you include specialised testers, they may see edge cases or error modes you didn't think of. If you include UX experts, you'll have a chance to develop software that users can actually figure out how to use. There are lots of benefits to be had from pair and ensemble programming. In Code That Fits in Your Head I recommend that you try it. I've recommended it to my customers. I've had good experiences with it myself: "I’ve used [mob programming] with great success as a programming coach. In one engagemen
(read more)
Incremental backup with strong cryptographic confidentiality baked into the data model. In a small package, with no dependencies. This project is still experimental! Things may break or change. See below on status. Features Designed around public key cryptography such that the decryption key can be kept offline, air-gapped. Backup to local or remote storage with arbitrary transport. Incremental update built on inode identity and hashed block contents, compatible with moving and reorganizing entire trees. Data deduplication. Low local storage requirements for change tracking -- roughly 56-120 bytes per file plus 0.1-5% of total data size. Live-streamable to storage. Compatible with append-only media. No local storage required for staging a backup that will be stored remotely. Optional support for blinded garbage-collection of blobs on the storage host side. Written entirely in C with no library dependencies. Requires no installation. Built on modern cryptographic primitives: Curve25519 ECDH, ChaCha20, and SHA-3. Status Bakelite is presently experimental and is a work in progress. The above-described features are all present, but have not been subjected to third-party review or extensive testing. Moreover, many advanced features normally expected in backup software, like controls over inclusion/exclusion of files, are not yet available. The codebase is also in transition from a rapidly developed proof of concept to something more mature and well-factored. Data formats may be subject to change. If attempting to use Bakelite as part of a real backup workflow, you should keep note of the particular version used in case it's needed for restore. Note that the actual backup format is more mature and stable than the configuration and local index format, so the more likely mode of breakage when upgrading is needing to start a new full (non-incremental) backup, not inability to read old backups. Why another backup program? Backups are inherently an attack surface on the privacy/confidentiality of one's data. For decades I've looked for a backup system with the right cryptographic properties to minimize this risk, and turned up nothing, leaving me reliant on redundant copies of important things (the "Linus backup strategy") rather than universal system-wide backups. After some moderately serious data loss, I decided it was time to finally write what I had in mind. Among the near-solutions out there, some required large library or language runtime dependencies, some only worked with a particular vendor's cloud storage service, and some had poor change tracking that failed to account for whole trees being moved or renamed. But most importantly, none with incremental capability addressed the catastrophic loss of secrecy of all past and current data in the event that the encryption key was exposed. Data model A backup image is a Merkle tree of nodes representing directories, files, and file content blocks, with each node identified by a SHA-3 hash of its encrypted contents, and the root of the tree referenced by a signed summary record. For readers familiar with the git data model, this is very much like a git tree (not commit) bu
(read more)
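The core of the data model described above is that a node is named by the SHA-3 hash of its encrypted contents, so the storage side only ever sees ciphertext and opaque identifiers. A rough Python illustration of that content-addressing idea (this is not Bakelite's actual format; the ciphertext below is just a placeholder for output of some cipher such as ChaCha20):

```python
import hashlib

def node_id(encrypted_blob: bytes) -> str:
    """Identify a node by the SHA-3 hash of its *encrypted* contents."""
    return hashlib.sha3_256(encrypted_blob).hexdigest()

# A directory node would then reference its children purely by these ids,
# forming a Merkle tree whose root is covered by a signed summary record.
ciphertext_block = b"\x8f\x02..."  # placeholder for an encrypted file block
print(node_id(ciphertext_block))
```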
Swarm: preview and call for collaboration For about a month now I have been working on building a game1, tentatively titled Swarm. It’s nowhere near finished, but it has at least reached a point where I’m not embarrassed to show it off. I would love to hear feedback, and I would especially love to have others contribute! Read on for more details. Swarm is a 2D tile-based resource gathering game, but with a twist: the only way you can interact with the world is by building and programming robots. And there’s another twist: the kinds of commands your robots can execute, and the kinds of programming language features they can interpret, depends on wha
(read more)
In legacy code, you can often spot explicit new and delete lurking in various places and waiting to produce pointer-related issues. This blog post shows six patterns t
(read more)
We all know and have at least once used the top(1) command to track information about our CPU and processes, but how many of you know what each field means? Today we will guide you through each of these fields. By default, top(1) displays the ‘top’ processes on each system and periodically updates this information every 2.0 seconds, using the raw CPU use percentage to rank the processes in the list.
Default top(1) Output
This is how the default top(1) command output looks. We will use it as a base to describe each field and column.
last pid: 95139;  load averages: 0.21, 0.25, 0.25    up 17+06:17:12  21:01:34
22 processes: 1 running, 21 sleeping
CPU: 0.1% us
(read more)
One of the hardest things for me to grokk in enterprise programming, was dependency injection (DI). Namely because that word already had a meaning to me, which didn’t require a lot of book
(read more)
✍︎ A production search system. Want to build or improve a search experience? Start here.
Ask a software engineer: “How would you add search functionality to your product?” or “How do I build a search engine?” You’ll probably immediately hear back something like: “Oh, we’d just launch an ElasticSearch cluster. Search is easy these days.” But is it? Numerous current products still have suboptimal search experiences. Any true search expert will tell you that few engineers have a very deep understanding of how search engines work, knowledge that’s often needed to improve search quality. Even though many open source software packages exist, and the research is vast, the knowledge around building solid search experiences is limited to a select few. Ironically, searching online for search-related expertise doesn’t yield any recent, thoughtful overviews.
Emoji Legend: ❗ “Serious” gotcha: consequences of ignorance can be deadly · 🔷 Especially notable idea or piece of technology · ☁️ Cloud/SaaS · 🍺 Open source / free software · 🦏 JavaScript · 🐍 Python · ☕ Java · 🇨 C/C++
Why read this? Think of this post as a collection of insights and resources that could help you to build search experiences. It can’t be a complete reference, of course, but hopefully we can improve it based on feedback (please comment or reach out!). I’ll point at some of the most popular approaches, algorithms, techniques, and tools, based on my work on general purpose and niche search experiences of varying sizes at Google, Airbnb and several startups. ❗️Not appreciating or understanding the scope and complexity of search problems can lead to bad user experiences, wasted engineering effort, and product failure. If you’re impatient or already know a lot of this, you might find it useful to jump ahead to the tools and services sections.
Some philosophy: This is a long read. But most of what we cover has four underlying principles:
🔷 Search is an inherently messy problem: Queries are highly variable, and the search problems themselves vary widely based on product needs. Think about how different these are: Facebook search (searching a graph of people); YouTube search (searching individual videos); or how different both of those are from Kayak (air travel planning is a really hairy problem), Google Maps (making sense of geo-spatial data), and Pinterest (pictures of a brunch you might cook one day).
Quality, metrics, and processes matter a lot: There is no magic bullet (like PageRank) nor a magic ranking formula that makes for a good approach. What works is an always-evolving collection of techniques and processes that solve aspects of the problem and improve the overall experience, usually gradually and continuously. ❗️In other words, search is not just about building software that does ranking or retrieval (which we will discuss below) for a specific domain. Search systems are usually an evolving pipeline of components that are tuned and evolve over time and that build up to a cohesive experience. In particular, the key to success in search is building processes for evaluation and tuning into the product and development cycles. A search system architect should think
(read more)
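As a tiny illustration of the "retrieval" building block the essay keeps referring to (deliberately toy-sized, with made-up documents and the crudest possible ranking):

```python
from collections import defaultdict

docs = {
    1: "cheap flights to tokyo",
    2: "tokyo ramen brunch pictures",
    3: "flights and hotels in osaka",
}

# Inverted index: term -> ids of the documents containing that term.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(query):
    """Boolean-AND retrieval, ranked only by raw query-term counts."""
    terms = query.split()
    hits = set.intersection(*(index[t] for t in terms)) if terms else set()
    return sorted(hits,
                  key=lambda d: sum(docs[d].split().count(t) for t in terms),
                  reverse=True)

print(search("tokyo flights"))  # [1]
```

Real systems replace every piece of this with something far more elaborate (analysis, scoring models, evaluation loops), which is exactly the essay's point.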
What is DataStation?DataStation is an open-source data IDE for developers. It allows you to easily build graphs and tables with data pulled from SQL databases, logging databases, metrics databases, HTTP servers, and all kinds of text and binary files. Need to join or munge data? Write embedded scripts as needed in Python, JavaScript, Ruby, R, or Julia. All in one application. DataStation stores intermediate results as a JSON-encoded array of objects (e.g. [{ "a": 1, "b": "y" }, { "a": 2, "b": "z" }]). It uses JSON since DataStation supports scripting with intermediate res
(read more)
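Given the JSON-encoded array-of-objects format quoted above, a Python scripting step in this style amounts to turning one list of dicts into another; a minimal sketch, independent of DataStation's own panel API:

```python
import json

previous_result = '[{ "a": 1, "b": "y" }, { "a": 2, "b": "z" }]'

rows = json.loads(previous_result)        # list of dicts, one per row
rows = [r for r in rows if r["a"] > 1]    # filter
for r in rows:
    r["b"] = r["b"].upper()               # transform a column

print(json.dumps(rows))                   # pass the munged rows along
```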
Ensuring timely information is available for searchers is critical. Yet historically, one of the biggest pain points for website owners has been getting search engines to quickly discover and consider their latest website changes. It can take days or even weeks for new URLs to be discovered and indexed in search engines, resulting in loss of potential traffic, customers, and even sales. IndexNow is a new protocol created by Microsoft Bing and Yandex, allowing websites to easily notify search engines whenever their website content is created, updated, or deleted. Once search engines are notified of updates via the API, they quickly crawl and reflect website changes in their index and search results. IndexNow is an initiative for a more efficient Internet: By telling search engines that a URL has been changed, website owners provide a clear signal helping search engines to prioritize crawl for these URLs, thereby limiting the need for exploratory crawl to test if the content has changed. In the future, search engines intend to limit crawling of websites adopting IndexNow. IndexNow is also an initiative for a more open Internet: By notifying one search engine you will notify all search engines that have adopted IndexNow. How to adopt the IndexNow API? For developers If you are a developer, good news: IndexNow is very easy to adopt. Generate a key supported by the protocol using our online key generation tool. Host the key in a text file named with the value of the key at the root of your website. Start submitting URLs when your URLs are added, updated, or deleted. You can submit one URL or a set of URLs per API call. Submitting one URL is as easy as sending a simple HTTP request containing the URL changed and your key. https://www.bing.com/IndexNow?url=url-changed&key=your-key See detailed instructions at the Microsoft Bing IndexNow site or the IndexNow protocol website. For non-developers Good news, many popular platforms have adopted or are planning to adopt IndexNow. If you are using one of these platforms, there is nothing that you will have to do once they have adopted IndexNow. Websites Many large websites (such as eBay, LinkedIn, MSN, GitHub, Bizapedia and more) have adopted the Microsoft Bing Webmaster URL submission API and are planning migration to IndexNow. If you publish content or products on these sites, your content will automatically receive the benefits of faster publishing. Content Management Systems (CMS) We are encouraging all Web Content Management Systems to adopt IndexNow to help their users get their latest website content immediately indexed and minimize crawl load on their websites. WordPress: Microsoft Bing provided the open-source code to support IndexNow to help WordPress and other CMSs adopt IndexNow. Wix plans to integrate IndexNow. Duda will support IndexNow in a few weeks. Content Delivery Networks (CDN) Cloudflare is a global network platform. Cloudflare’s network exists between customers and servers that originate content. Cloudflare sees trends in the way crawlers and bots access website resources, allowing them to proactively send us signals on which crawls are likely to yield fresh
(read more)
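Going by the endpoint format quoted in the announcement (https://www.bing.com/IndexNow?url=url-changed&key=your-key), a single-URL submission is just an HTTP GET with two query parameters. A minimal Python sketch; the URL and key below are placeholders, and the exact success codes should be checked against the IndexNow documentation:

```python
import urllib.parse
import urllib.request

def submit_url(changed_url: str, key: str) -> int:
    """Tell Bing's IndexNow endpoint that one URL was added, updated, or deleted."""
    query = urllib.parse.urlencode({"url": changed_url, "key": key})
    with urllib.request.urlopen(f"https://www.bing.com/IndexNow?{query}") as resp:
        return resp.status  # a 2xx status generally means the submission was accepted

print(submit_url("https://example.com/new-page", "your-key"))  # placeholder values
```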
Thank you. Before I begin the talk, I will put forth a little idea I thought of in the last day or so. It's a programming problem having to do with Graph Theory: you have a graph. The nodes contain a record with a language and a person, and, just to make the example concrete: the nodes might be (C, Ritchie), (ADA, Ichbiah), (Pascal, Wirth), or Brinch-Hansen perhaps. (Lisp, Steele), (C++, Stroustrup) might also be part of the population. There is an edge from X to Y, whenever X.Person will throw a barb in public at Y.Language. And the questions are: is this a complete graph? Does it have self-edges? If it's not complete, what cliques exist? There are all sorts of questions you can ask. I guess if it were a finite state machine, you could ask about diagnosability, too. Can you push at the people and get them to throw these barbs? [Slide 1] The paper itself tells the history of C, so I don't want to do it again. Instead, I want to do a small comparative language study, although it's not really that either. I'm going to talk about a bunch of twenty-year old languages. Other people can discuss what the languages contain. These were things that were around at the time, and I'm going to draw some comparisons between them just to show the way we were thinking and perhaps explain some things about C. Indirectly, I want to explain why C is as it is. So the actual title of the talk, as opposed to the paper, is `Five Little Languages and How They Grew.' [Slide 2] Here are the five languages: Bliss, Pascal, Algol 68, BCPL, C. All these were developed in more or less the same period. I'm going to argue that they're very much similar in a lot of ways. And each succeeded in various ways, either by use or by influence. C succeeded really without politics in a sense that we didn't do any marketing, so there must have been a need for it. What about the rest of these? Why are these languages the same? [Slide 3] In the first place, the things that they're manipulating, their atomic types, their ground-level objects, are essentially identical. They are simply machine words interpreted in various ways. The operations that they allow on these are actually very similar. This is contrast to SNOBOL, for example, which has strings, or Lisp, which has lists. The languages I'm talking about are just cleverly-designed ways of shuffling bits around; everybody knows about the operations once they've learned a bit about machine architecture. That's what I mean by concretely grounded. They're procedural, which is, to say, imperative. They don't have very fancy control structures, and they perform assignments; they're based on this old, old model of machines that pick up things, do operations, and put them someplace else. They are very much influenced by Algol 60 and FORTRAN and the other languages of that were discussed in the first HOPL conference. Mostly they were designed (speaking broadly) for `systems programming.' Certainly some of them, like BCPL and C and Bliss, are explicitly system programming languages, and Pascal has been used for that. Algol 68 didn't really have that in mind, but it really can be used for the purpose;
(read more)
Last week, I onboarded onto a new code review system. In learning how to review code in this new tool, I got to thinking about best features of previous code review systems I’ve used. What follows is an opinionated (and very incomplete) list of what I think are “table stakes” for code review in 2021. For the purposes of this post, “code review tool” refers to a web UI for reviewing code – think Github, Gitlab, BitBucket, Gerrit, Phabricator’s Differential, and Azure Devops. This list is agnostic to the choice of the underlying SCM (e.g. git, mercurial, perforce, etc.), so I refer to diffs, PRs, and CLs interchangeably.1 Requesting and Writing Reviews Table Stakes: [As author] Ability to view diffs in a work-in-progress state before sending them out for review. [As reviewer] Ability to comment on specific lines. [As reviewer] Ability to mark a comment as “actionable” (or “unresolved”), distinguishable from a “no action needed” comment. Reviewers are able to signal whether or not they think the diff is mergeable. Nice to Have: [As reviewer] Ability to comment on a specific substring of a line. [As reviewer] Ability to make in-line suggestions. [As author] Ability to accept in-line suggestions within the tool. [As reviewer] Ability to jump from files in the diff view into a full copy of the existing file. Alternatively, strong integration with a separate “code search” system, such as SourceGraph or CodeSearch. [As author] Automated “Round robin” reviewer assignment within a team. Why? “Round robin” reviewer assignment load balances code review responsibilities across a team, increases everyone’s exposure to the team’s code, and reduces some of the power differential in code review (since everyone reviews each other’s code). This works best in small- to mid-sized teams of roughly equally leveled engineers. This approach may not work in larger teams, or teams that have a large variety of experience levels. [As author] The system can automatically find reviewers based on changed files. The system can enforce that a particular reviewer (or team) signs off on a diff based on file(s) changed. This is especially important in monorepo projects, where each team only “owns” a subset of paths in the repo. Really Nice to Have: Reviewers can signal the degree of their confidence in the code (e.g. “rubber stamp” < “looks good to me” < “full approval”) [As author] Ability to view and respond to review comments inline within your IDE. Pull Requests (PRs) / Change Lists (CLs) / Diffs Table Stakes: Ability to attach issues/bugs/tasks to diffs. Implicitly, this means that the code review system needs to be somewhat aware of the system you use to keep track of issues. Code review history (including review comments) is saved somewhere, preferably indefinitely. Each PR/CL/Diff gets a unique identifier that can be referred to during review & after submission. (e.g. #123, cl/123) Ability to view the difference between versions of the same PR. For example: Author sends out version 1. Reviewer makes comments on version 1. Author edits the PR, and sends out version 2. Reviewer
(read more)
module beam(r1, r2, shr, msr){
    /* The walking beam acts as a class I lever transferring the
     * movement from the pitmans arms to the horse head. */

    H = 12; // Height
    W = 10; // Width
    e = 10; // Total added extension

    difference(){
        union(){
            translate([(r2-r1)/2,0,H/2]) // Walking beam body
                cube([r1+r2+e, W, H], center = true);

            rotate([90, 0, 0]) // Fulcrum or pivoting point
                cylinder(r = 2*msr, h = W, center = true);
        }

        rotate([90,0,0]) // Pivoting point hole
            cylinder(r = msr, h = W+1, center = true);

        translate([r2,0,H/2]) // Equalizer mounting screw hole
            cylinder(r = shr, h
(read more)
Arthur-PC - a HD64180 based computer That was my machine, the machine. It was built between about 1985 and 1992. Something was changing on it all the time. This is
(read more)
TL;DR: CharsetDecoders got several times faster in JDK 17, leaving CharsetEncoders behind. After a few false starts and some help from the community I found a trick to speed up CharsetEncoders similarly. This may or may not speed up your apps in the future. This is a technical read, but also a story about the process of failing and trying again, with no graphs and lots of distracting links to source code. Sorry. But there will be cake. Decoding / Encoding I’ve previously blogged about some JDK 17 improvements to charset decoding, where I used intrinsics originally added to optimize JEP 254 in a few places to realize localized speed-ups of 10x or more. But d
(read more)
Pyinstrument is a call stack sampling profiler with low overhead to find out time spent in your Django application. QueryCount is a simplistic ORM query count middleware that counts the nu
(read more)
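Outside of Django middleware, pyinstrument can also be pointed at any block of code; a small sketch of the basic start/stop usage (API as I recall it from pyinstrument's documentation, so verify against the version you install):

```python
from pyinstrument import Profiler

def slow_path():
    # stand-in for a view, task, or any code you want to inspect
    return sum(i * i for i in range(1_000_000))

profiler = Profiler()   # statistical/sampling profiler, so overhead stays low
profiler.start()
slow_path()
profiler.stop()
profiler.print()        # prints a call tree annotated with where time was spent
```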
Sculpt OS version 21.10 introduces GPU-accelerated graphics on Intel, media playback in the web browser, VirtualBox 6, and USB webcam support. At first glance, the just-released Sculpt 21.10 looks and feels nearly identical to the time-tested previous version 21.03. However, a look at the installable packages reveals a firework of exciting new features. First and technically most exciting, the new version enables the use of hardware-accelerated graphics on Intel GPUs, paving the way for graphics-intensive applications and games. The GPU support is based on the combination of the Mesa library stack with our custom GPU multiplexer as featured in Genode 21.08. Note that this fresh new feature should best be regarded as experimental and be used with caution. Second, our port of the Chromium-based Falkon web browser can now present media content like video and sound. Look out for the browser in the tools menu of cproc's depot. It is accompanied by a ready-to-use audio driver and a mixer component. In cases where audio output is not desired, the browser - or any other component that requests audio output - can be connected to a new component called black hole, which merely mimics an audio driver without any audible effect. Third, with the addition of the new file-vault component, Sculpt now provides an easy way to set up and use an encrypted file store, using our custom CBE block encrypter as the underlying crypto container. The file vault is especially useful in combination with the recall-fs component that provides each client with a distinct storage compartment. Finally, the support for USB webcams as introduced with Genode 21.05 has entered Sculpt OS in the form of a new webcam package. The webcam support can best be combined with our new port of VirtualBox 6 that is available in addition to VirtualBox version 5. With Sculpt 21.10, both VirtualBox versions can be used in parallel. Sculpt OS 21.10 is available as a ready-to-use system image at the Sculpt download page and is accompanied with updated d
(read more)
In this tutorial, we will create an Ethereum token on the Polygon Network from scratch. To create our token we will use Python and Python-like programming languages (Brownie and Vyper; we will learn more about them later). By the end of this tutorial, you will have a personal token on a real Polygon network and hopefully a better understanding of how everything works on the Ethereum network. One thing to keep in mind is that the Python library we will be using today is meant for development and testing only. This means that the code we will write today is not meant to be used for production and users should not be interacting with it. However, that doesn't mean your token isn't "real". It is very real and can be used like any other token. Mainly you can just transfer it to someone else. At this stage, there isn't a ton of utility behind it, since it will be created in isolation. I have decided to call my token razzle-dazzle for unimportant reasons. It doesn't matter too much, but I encourage you to come up with a fun, short name that will be somewhat personal to you. If at any point you are lost or having trouble following, you can also use my Github repo where I host this code. If you have any questions, feel free to create an issue in that repo.
1. Prerequisites
In this tut
(read more)
What to learn It's common to see people advocate for learning skills that they have or using processes that they use. For example, Steve Yegge has a set of blog posts where he recommends reading compiler books and learning about compilers. His reasoning is basically that, if you understand compilers, you'll see compiler problems everywhere and will recognize all of the cases where people are solving a compiler problem without using compiler knowledge. Instead of hacking together some half-baked solution that will never work, you can apply a bit of computer science knowledge to solve the problem in a better way with less effort. That's not untrue, but it's also not a reason to study compilers in particular because you can say that about many different areas of computer science and math. Queuing theory, computer architecture, mathematical optimization, operations research, etc. One response to that kind of objection is to say that one should study everything. While being an extremely broad generalist can work, it's gotten much harder to "know a bit of everything" and be effective because there's more of everything over time (in terms of both breadth and depth). And even if that weren't the case, I think saying “should” is too strong; whether or not someone enjoys having that kind of breadth is a matter of taste. Another approach that can also work, one that's more to my taste, is to, as Gian Carlo Rota put it, learn a few tricks: A long time ago an older and well known number theorist made some disparaging remarks about Paul Erdos' work. You admire contributions to mathematics as much as I do, and I felt annoyed when the older mathematician flatly and definitively stated that all of Erdos' work could be reduced to a few tricks which Erdos repeatedly relied on in his proofs. What the number theorist did not realize is that other mathematicians, even the very best, also rely on a few tricks which they use over and over. Take Hilbert. The second volume of Hilbert's collected papers contains Hilbert's papers in invariant theory. I have made a point of reading some of these papers with care. It is sad to note that some of Hilbert's beautiful results have been com
(read more)
I have a small Vue 2 project (an admin UI for dictmaker) that I created with vue cli six months ago. Today, I picked it up again to finish it, and started out by doing a yarn upgrade. Of course, blindly upgrading all dependencies is never a good idea, but this is a tiny WIP project with just one dependency that I added, and there is a constant stream of GitHub dependabot alerts every month forcing me to upgrade some dependency or other, so what is the worst that could happen? At least that is what I thought. After the upgrade, the project refused to build with the error Syntax Error: TypeError: eslint.CLIEngine is not a constructor. Really? A syntax error in a tiny project that was building just fine before the upgrade, and that too, not in the little code I wrote, but in the tooling? Has the Javascript syntax changed completely to the point of breaking in six months despite the layers and layers of transpilation black magic that the build tools have? Do developers have to take the overhead of worrying about the numerous dependencies in the toolchain as much as their own business logic? Turns out, eslint got upgraded to v8 and it has significant breaking changes that are incompatible with projects created with versions of vue cli until whenever. I just want to work on my project and really don’t care about eslint or babel or their million dependencies, but this is a showstopper, so I start Googling. Thanks to the pages and pages of cryptic stack traces on GitHub issues and Stackoverflow and no clear answers, it does not look like it is worth trying to debug and fix the broken build system. I figure I might as well manually port it to Vue 3 (which is out now after many release candidates over many years), and it is the future of Vue. I decide to read up on Vue 3, its new composition API and differences with Vue 2. Of course, all of this because an upgrade has introduced a syntax error on a tiny project that was created just six months ago. So, vite is the new cool tool with which you create a Vue 3 project. It is fast because it uses esbuild, a tool that is not written in Javascript, ironically. So, I create a new Vue 3 project with vite and try to add Buefy, the UI l
(read more)
I’ve been intermittently working on a process-based compartmentalisation runtime for Verona that should let Verona programs load C++ libraries in a safely sandboxed region. It’s now past the ‘make it work’ stage, where all of the obvious security vulnerabilities are fixed, and is now at the ‘make it correct’ step, which has involved handing it over to a colleague in the Microsoft Security Response Center to do some red teaming and tell me what less-obvious things I missed. He’ll almost certainly find some things - the current implementation is very much ‘research-quality code’ - but this is the first step in making it something I’d be happy to depend on. We can then move onto ‘make it fast’. For testing there’s also a C++ API that makes it fairly easy
(read more)
arXiv:2102.06595 (math) [Submitted on 12 Feb 2021] Download PDF Abstract: I offer a revisionist interpretation of Galileo's role in the history of science. My overarching thesis is that Galileo lacked technical ability in mathematics, and that this can be seen as directly explaining numerous aspects of his life's work. I suggest that it is precisely because he was bad at mathematics that Galileo was keen on experiment and empiricism, and eagerly adopted the telescope. His reliance on these hands-on modes of research was not a pioneering contribution to scientific method, but a last resort of a mind ill equipped to make a contribution on mathematical
(read more)
8 June 2020 • Last updated 16 June 2020 We received a pair of questions that prompted this Q&A article. The first is straightforward: What type of documentation do you create for your code? Do you use UML? Subset of UML? Something else? Can you provide samples? The second question arose during a discussion on Timeless Laws of Software Development, by Jerry Fitzpatrick: The author admonishes developers to create and record a software architecture for their project before they start coding, and then says “Beware: activity diagrams, flowcharts, and sequence diagrams describe operation, not architecture.” In “Better Embedded Systems Software”, Phillip Koopman says that “an architecture is some figure that had boxes and arrows representing components and connections” and provides examples like “call graph, class diagram, data flow diagram, hardware allocation diagram, control hierarchy diagram” as well as a few exceptions to the rule (“message dictionary, real time schedule, memory map”).These definitions seem to contradict. What do you consider to be a valid architecture diagram? What are some that you have discovered you prefer over others?
(read more)
Down the rabbit hole: my brief odyssey into the esoteric world of the tight-knit time zone data maintenance community who quietly keep the world’s computers from DST-related meltdowns. The next time your Linux or MacOS-based computer boots into the perfect time zone, say a mental thank you to Paul Eggert and the team responsible for maintaining the world time zone computer database. (Photo by Pixabay from Pexels.) I run a small YouTube channel. And now and again, I record short videos documenting how to “do” certain things using Linux. I make them as much for myself as for my 300-odd subscribers. Because Linux, or rather doing things with it, tends to … you know … be quite complicated. And I can’t always remember how I got X to work three months later. It’s nice to create documentation I can refer back to, and it’s even better if others find it interesting, as they occasionally tell me they do. But, for now at least, that’s about all there is to it. Yesterday evening, I recorded a short video describing how to look up the time zone database (tzdb) to find the right way to denote the time zone in a certain world clock program (gworldclock). I wasn’t expecting that the video would give Netflix a run for its money. It hasn’t. But it has brought me into contact with a world so wonderfully weird that it could well be the stuff of fiction. Thankfully it isn’t. As most techies know, time zone setting is a fairly elementary feature of computing which most operating systems bake into their graphical user interfaces (GUI). Time zones are attached to locales. Setting a locale is often done on the basis of rough geolocation, which users can manually override. Once set,
(read more)
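Incidentally, the same tzdb identifiers the video looks up are available directly from Python's standard library (3.9+) via zoneinfo; a quick sketch:

```python
from datetime import datetime
from zoneinfo import ZoneInfo, available_timezones  # tzdb-backed, Python 3.9+

# The identifiers are the familiar Area/Location names from the tz database.
print(sorted(available_timezones())[:3])

# Render "now" in a couple of zones, DST rules included, courtesy of tzdb.
for name in ("Europe/Berlin", "America/New_York"):
    print(name, datetime.now(ZoneInfo(name)).isoformat())
```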
"Include what you use" means this: for every symbol (type, function variable, or macro) that you use in foo.cc, either foo.cc or foo.h should #include a .h file that exports the declaration of that symbol. The include-what-you-use tool is a program that can be built with the clang libraries in order to analyze #includes of source files to find include-what-you-use violations, and suggest fixes for them. The main goal of include-what-you-use is to remove superfluous #includes. It does this both by figuring out what #includes are not actually needed for this file (for both .cc and .h files), and replacing #includes with forward-declares when possible. 26 May 2021 iwyu 0.16 compatible with llvm+clang 12 is released. Major changes: [iwyu_tool] Accept --load/-l argument for load limiting [iwyu_tool] Signal success/failure with exit code [mappings] Harmonize mapping generators [mappings] Add mapping generator for CPython [mappings] Improve mappings for libstdc++ and Boost [cmake] Add explicit C++14 compiler flag ... and many internal improvements For the full list of closed issues see the iwyu 0.16 milestone. Contributions in this release by Alexey Storozhev, Florian Schmaus, Kim Grasman, Omer Anson, saki7. Sorr
(read more)
I've recently changed DNS provider for this blog, and that forced me to look into how DNS works a bit closer. I did manage a DNS server for a couple years circa 2006, but I have to say I'd forgotten most of it. In case I forget it again, I'm recording my notes here.DNS zoneA DNS "zone" is a set of records. There are various types of records; in general, a record is essentially a line of text of the form:domain ttl class type data The class is usually IN (standing for "internet"). There are many types; some common ones are A, MX, NS, and TXT. There is only one record of a given t
(read more)
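To make the "domain ttl class type data" shape above concrete, here is a tiny Python sketch that splits a zone-file-style line into those five fields (real zone files also allow omitted fields, comments, and multi-line records, which this deliberately ignores; the example record uses documentation values, not the blog's actual zone):

```python
from typing import NamedTuple

class Record(NamedTuple):
    domain: str
    ttl: int
    rclass: str  # usually "IN"
    rtype: str   # e.g. A, MX, NS, TXT
    data: str

def parse_record(line: str) -> Record:
    domain, ttl, rclass, rtype, data = line.split(maxsplit=4)
    return Record(domain, int(ttl), rclass, rtype, data)

print(parse_record("example.com. 3600 IN A 192.0.2.1"))
```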
Monday, 16 October 2006, in categories: Programming, Linux, Emacs On some hot, boring afternoon I got an _Idea_. With the help of publicly accessible e-mail addresses I asked 10 questions to a bunch of programmers that I consider very interesting people and whom I respect for the various things they created. Coming up with the questions was a 5-minute job for me - these are things I would ask about if I could speak with them personally for, let’s say, 10 minutes, and I didn’t have time for thinking too much. The last two questions don’t have anything to do with programming, this is sim
(read more)
Half a century ago, MIT played a critical role in the development of the flight software for NASA’s Apollo program, which landed humans on the moon for the first time in 1969. One of the many contributors to this effort was Margaret Hamilton, a computer scientist who led the Software Engineering Division of the MIT Instrumentation Laboratory, which in 1961 contracted with NASA to develop the Apollo program’s guidance system. For her work during this period, Hamilton has been credited with popularizing the concept of software engineering. In recen
(read more)
Inlining is one of the most important compiler optimizations. We can often write abstractions and thin wrapper functions without incurring any performance penalty, because the compiler will expand the method for us at call site. If a function is not inlined, conventional wisdom says that the compiler has to assume that the method can modify any global state and change the memory behind any pointer or reference that might have “escaped”. In this short post, I’ll demonstrate exactly this effect. Furthermore, we will see that even if a function is not inlined, as long as the implementation is visible, some optimizations are still performed and sometimes
(read more)
I’m blown away by how far Nintendo Switch emulation has come in just the past few years. To me, it’s just as mind-blowing as when Valve rolled their first Proton release to the public. Linux gamers can not only play most of their favorite Windows games through Proton, but they can also play their favorite Switch games with higher frame rates and resolutions, thanks to emulation. In both cases, the experience is nearly flawless, thanks to Valve/CodeWeaver’s contributions to Wine, and the (mostly) voluntary, rigorous work programmers put in to their emulation projects to ensure a smooth, painless experience. I enjoy the Ryujinx emulator in particular, so I wanted to sit down and chat with gdkchan, the primary heart and soul behind the project (not to discredit the several other
(read more)
The time is finally here: 12.0 is officially released! 12.0 is our multiplayer update, where we made playing together as easy as possible. No more port-forwarding or other stuff: start a server and ask your friend(s) to join. We take care of the rest. For more details, please read our blog: New Multiplayer Experience Besides this major update, 12.0 also comes with some other nice features: Display icon/text whether vehicle is lost. Moving camera on title screen background. Hide block signals in GUI by default (you can toggle this in the settings). Raise the maximum NewGRF limit to 255. To name just a few. And as always, we made sure to include tons (over 85!) of bug fixes in this release. A special thanks goes out to our translators: they translated the game
(read more)
Some thoughts on the std::execution proposal and my understanding of the underlying theory. What’s proposed From the paper’s Introduction This paper proposes a self-contained design for a Standard C++ framework for managing asynchronous execution on generic execution contexts. It is based on the ideas in [P0443R14] and its companion papers. Which doesn’t tell you much. It proposes a framework where the principle abstractions are Senders, Receivers, and Schedulers. Sender A composable unit of work. Receiver Delimits work, handling completion, exceptions, or cancellation. Schedulers Arranges for the context work is done in. The primary user facing concept is the sender. Values and functions can be lifted directly into senders. Senders can be stacked together, with a
(read more)
Published on 17 Oct 2021 by Susam Pal Time for Programming Puzzles In the beginning of this month, we concluded our previous reading sessions on analytic number theory. It took about 79 hours spread across 120 meetings and 7 months to complete reading a 300-page textbook on analytic number theory. After a short break of two weeks, we are now going to resume our club activities. We have chosen the CSES Problem Set and the associated book written by Antti Laaksonen for this new series of reading sessions. Puzzles, Programming, and Mathematics I have been fond of puzzles (mathematical, programming, or otherwise) since my childhood days. Despite being a considerable time sink and having arguably very little utility in life, the activity of solving puzzles
(read more)
C++ is infamous for long compilation times. Part of the problem is that editing private declarations of a class causes recompilation for all of its users. There are several ways to work around this problem and reduce incremental compile times: Use interfaces. A lot of code using a concrete class could use an interface instead. This comes with a run-time price of using virtual functions. The Pimpl idiom (private pointer to implementation), achieves the same compile-time benefit as interfaces without the virtual calls, but comes at a run-time price of allocating and using additional heap objects. The Priv idiom, introduced below, does not sacrifice runtime performance at all, but it also only brings part of the compilation time benefits of the other approaches. Below is a short introd
(read more)
Contents: physics, mathematical physics, philosophy of physics · Surveys, textbooks and lecture notes · (higher) category theory and physics · geometry of physics · books and reviews, physics resources · theory (physics), model (physics) · experiment, measurement, computable physics · mechanics · mass, charge, momentum, angular momentum, moment of inertia · dynamics on Lie groups · rigid body dynamics · field (physics) · Lagrangian mechanics · configuration space, state · action functional, Lagrangian · covariant phase space, Euler-Lagrange equations · Hamiltonian mechanics · phase space · symplectic geometry · Poisson manifold · symplectic manifold · symplectic groupoid · multisymplectic geometry · n-symplectic manifold · spacetime
(read more)
/*
 * tclHash.c --
 *
 *    Implementation of in-memory hash tables for Tcl and Tcl-based
 *    applications.
 *
 * Copyright © 1991-1993 The Regents of the University of California.
 * Copyright © 1994 Sun Microsystems, Inc.
 *
 * See the file "license.terms" for information on usage and redistribution of
 * this file, and for a DISCLAIMER OF ALL WARRANTIES.
 */

#include "tclInt.h"

/*
 * When there are this many entries per bucket, on average, rebuild the hash
 * table to make it larger.
 */

#define REBUILD_MULTIPLIER 3

/*
 * The following macro takes a preliminary integer hash value and produces an
 * index into a hash tables bucket list. The idea is to make it so that
 * preliminary values that are arbitrarily similar will end up in different
 * buckets. The hash function was taken fro
(read more)
I write code 100 hours/week. I’ve done so for the last 2 years and, excluding a life-altering event (illness?) I probably won’t stop. The average week I typically spend around: 48h/week @ day job, Sourcegraph, building developer tools 55h/week coding in Zig 7h/day sleeping 9h/week caring for self, chores, etc. 5h/week games or chatting with friends Sometimes coding drops to around 80h/week if I feel I need more sleep, time for something else, or if I’m just not feeling it. My calendar My days are pretty fluid, I don’t maintain a strict calendar - but an average week does pretty much look identical to the following. (click to expand) Green is open-source coding in Zig Red is time spent at day job Yellow is sleep Blue is bein
(read more)
Sculpt is a component-based desktop operating system that puts the user in the position of full control. It is empowered by the Genode OS Framework, which provides a comprehensive set of building blocks, out of which custom system scenarios can be created. The name Sculpt hints at the underlying idea of crafting, molding, and tweaking the system interactively. Starting from a fairly minimalistic and generic base system, this tour through the Sculpt system will cover the following topics: A boot image that is a live system, rescue system, and bootstrap system all in one, Connecting to a wired or wireless network, Installing and deploying software, Ways to tweak and int
(read more)
You ever have one of those days? Where everything’s falling apart and nothing makes sense? Where you’re left wondering “how did any of this ever work?” Yeah. That was yesterday for me. First, a little backstory.⌗ So there’s this little MLP:FIM community I’m a part of called Cider and Saddle. It’s a small bunch, only about 6 people all-told, and we’re all good friends. In that group are a couple of nerds like myself who enjoy tinkering with computers and programming. Shortly after I joined CaS, us nerds got together and discussed the idea of building up a set of linked services for the community to use, backed by some kind of internal account system. The specifics of this system will definitely be the subject of a future blog post, but I’ll offer some highlights here:
(read more)
Okay, so we just spent two posts talking about weird macro trivia. We’ve completely lost sight of why we were writing a test framework in the first place, so let’s review the bug we’re trying to fix: This isn’t a particularly difficult or complicated or scary bug. You can actually see what the problem is just by looking at that image: the rays are not sorted properly. That green one in the bottom right is out of order, and that’s messing up the triangle fan. So you can sort of guess where I’m going with all of this – you know I’m going to end up writing a test case for the “sort the points around the origin” function. Something like this: (test "triangle fan points are sorted in the right order" (def points [[14.2 103.2] [123.442 132.44] ...]) (expect (sort-points
(read more)
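The project under test here is not written in Python, but the "sort the points around the origin" function the post wants to pin down with a test is easy to sketch in isolation; one common approach is to sort by polar angle with atan2 (an illustration of the idea, not the post's actual implementation):

```python
import math

def sort_points_around_origin(points):
    """Order 2D points counter-clockwise by angle around the origin,
    which is exactly what a triangle fan needs to avoid out-of-order rays."""
    return sorted(points, key=lambda p: math.atan2(p[1], p[0]))

points = [(14.2, 103.2), (123.442, 132.44), (-5.0, -1.0)]
print(sort_points_around_origin(points))
```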
The Recursive InterNetwork Architecture (RINA) is a new computer network architecture proposed as an alternative to the architecture of the currently mainstream Internet protocol suite. RINA's fundamental principles are that computer networking is just Inter-Process Communication or IPC, and that layering should be done based on scope/scale, with a single recurring set of protocols, rather than based on function, with specialized protocols. The protocol instances in one layer interface with the protocol instances on higher and lower layers via new concepts and entities that effectively reify networking functions currently specific to protocols like BGP, OSPF and ARP. In this way, RINA claims to support features like mobility, multihoming and quality of service without the need for additional specialized protocols like RTP and UDP, as well as to allow simplified network administration without the need for concepts like autonomous systems and NAT. Background[edit] The principles behind RINA were first presented by John Day in his 2008 book Patterns in Network Architecture: A return to Fundamentals.[1] This work is a start afresh, taking into account lessons learned in the 35 years of TCP/IP’s existence, as well as the lessons of OSI’s failure and the lessons of other network technologies of the past few decades, such as CYCLADES, DECnet, and Xerox Network Systems. The starting point for a radically new and different network architecture like RINA is an attempt to solve or a response to the following problems which do not appear to have practical or compromise-free solutions with current network architectures, especially the Internet protocol suite and its functional
(read more)
We make a lot of bad decisions in organization and system design because of the way our heads are shaped. It's not our fault, but to fix it we have to understand it. I have been using the phrase "feedback loop" for my entire career. An immature way to look at the world, simplifying as much as humanly possible, is in terms of cause and effect: we do A, and doing A makes B happen. It's so simple that it's built into all lifeforms. Plants grow one way, and they get more sun. Every now and then, herd creatures like deer or meerkats stop what they're doing, pop their heads up, and look for predators. There's no programming language or master plan required for this. Thing A happens. It makes thing B happen. Over time, lifeforms that embrace A->B in certain environments breed better than those that don't. When we observe, make notes, talk about, and study things around us, we think in the same format. It's not surprising. Various things in life are just strings of causality: D->E->G->H. Our job as natural philosophers and scientists is simply categorizing those things and events, then describing the various chains that we've categorized. As soon as people started talking about one thing causing another, we asked "Why?" If there's always cause and effect, what is the ultimate cause of everything? The study of ultimate causes, reasoning backwards from cause and effect, is called teleology. This works very well for many people and is our default way of talking about the world. It worked for mankind as a whole for tens of thousands of years, until relatively recently. But over time, as various species went extinct, the weather was found to be different from place to place, ri
(read more)
In the last few months, I migrated both my workstation and my servers (a DigitalOcean VPS and a Raspberry Pi 3) to NixOS. To best summarize the benefits, let's just say that it's like having a "dotfiles" repo, but for your entire system (or multiple!), including custom software, service configuration, drivers, kernel tweaks, etc. While a similar result could be achieved with more mainstream configuration management tools, such as Puppet or Ansible, they do not integrate deeply into your OS. As such, over a longer period your system is likely to accumulate all sorts of manual tweaks made outside of your configuration management framework. The system is then no longer fully described by your declarative configuration, and you will likely not reproduce it exactly if you have to recreate it from scratch. NixOS treats the system as mostly immutable and makes it much harder to mess something up: you can't just edit files under /etc or upgrade globally installed packages by hand. Most meaningful changes can only be made by editing configuration.nix and running nixos-rebuild switch, which replaces the entire system with a new generation. This also makes it easy and effective to roll back everything, an ability which conventional tools often lack or implement incompletely. Despite the situation getting progressively better, while learning Nix and NixOS I still felt that some parts of the stack are under-documented. While there are extensive resources such as the NixOS manual, Nix Pills, or nix.dev, what I lacked most were how-tos or recipes showing how to connect the tools to achieve a desired goal. Time and time again, I had to resort to reading others' configurations and the source code in the nixpkgs repository. This series of articles intends to fill that niche somewhat. Most of these methods are the same ones I use to run my own "private cloud". My own infrastructure is quite pedestrian - I only run a few services, including hosting my websites and email - so it's not exactly a guide for "enterprise battle-tested infrastructure". However, once I nailed the process down, I have definitely had
(read more)
Introduction: I recently purchased a thermal receipt printer off of AliExpress for a project. It features both WiFi and USB connectivity, which I thought was really cool for the price. To my dismay, I realized after purchasing that the drivers and configuration application only run on Windows. This wasn't a huge deal, as thermal printers generally use a loosely standardized command set called ESC/POS. Unfortunately, while many of the formatting commands are shared between printers, the commands to set up the WiFi connection don't seem to be documented anywhere, and I suspect they are device-specific. Since booting into Windows every time I want to manage the printer's network settings isn't ideal, I decided to reverse engineer the WiFi configuration commands. Initially I tried to run the configuration tool in Wine, but it couldn't communicate with the printer over USB, which wasn't too surprising. Running in Windows: I booted my spare laptop into Windows and launched the config tool there. I then set up the WiFi through Advanced -> Set Net. At this point I noticed that the application supported configuring the printer over the network, meaning I might be able to change the settings under Wine again, as network
(read more)
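(The vendor-specific WiFi setup commands the author is reverse engineering are not documented, so none are reproduced here. As a hedged sketch of the general idea of driving such a printer once it is already on the network: many ESC/POS printers accept raw bytes over TCP, with port 9100 being a common convention for raw printing, though the port and the supported commands vary by device. The IP address below is hypothetical.)

```rust
use std::io::Write;
use std::net::TcpStream;

fn main() -> std::io::Result<()> {
    // Hypothetical printer address; port 9100 is the usual raw-printing convention.
    let mut printer = TcpStream::connect("192.168.1.50:9100")?;

    printer.write_all(&[0x1B, 0x40])?;         // ESC @  -> initialize the printer
    printer.write_all(b"Hello, receipt!\n")?;  // plain text followed by a line feed
    printer.write_all(&[0x1D, 0x56, 0x00])?;   // GS V 0 -> full paper cut, if supported
    Ok(())
}
```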
Performance is one of the top reasons developers choose Rust for their applications. In fact, it's the first reason listed under the "Why Rust?" section on the rust-lang.org homepage, even before memory safety. This is for good reason too--many benchmarks show that software written in Rust is fast, sometimes even the fastest. This doesn't mean that everything written in Rust is guaranteed to be fast, though. In fact, it's surprisingly easy to write slow Rust code, especially when attempting to appease the borrow checker by cloning or Arc-ing instead of borrowing, a strategy which is generally recommended to new Rust users. That's why it's important to profile and benchmark Rust code to see where any bottlenecks are and to fix them, just like you would in any other language. In this post, I'll demonstrate some basic tools and techniques for doing so, based on my recent experience working to improve the performance of the mongodb crate. Note: all the example code used in this post can be found here. Index Profiling Benchmarking Using perf and cargo flamegraph to generate flamegraphs Identifying bottlenecks in a flamegraph Attack of the Clone Speeding up deserialization Analyzing results Viewing Criterion's HTML report Performing a realistic benchmark using wrk Next Steps Conclusion Shameless plug References Acknowledgments Profiling Whenever doing any kind of performance tuning work, it is absolutely essential to profile the code before attempting to fix anything, since bottlenecks can often reside in unexpected places, and suspected bottlenecks are often not as impactful as assumed. Not adhering to this principle can lead to premature optimization, which may unnecessarily complicate the code and waste development time. This is also why newcomers are advised to liberally clone things when getting started--the clones can help with readability and probably won't have a serious impact on performance, but if they do, later profiling will reveal that, so there's no need to worry about them until then (this is the "full version" of the advice). Benchmarking The first step in profiling is to establish a set of consistent benchmarks t
(read more)
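(As a contrived Rust sketch of the "clone instead of borrow" cost the article describes, not code from the mongodb crate: the cloned version copies the whole vector on every call, which is exactly the kind of hotspot a flamegraph tends to surface.)

```rust
// Contrived example for illustration only.
fn sum_owned(data: Vec<u64>) -> u64 {
    data.iter().sum()
}

fn sum_borrowed(data: &[u64]) -> u64 {
    data.iter().sum()
}

fn main() {
    let data: Vec<u64> = (0..1_000_000).collect();

    // Satisfies the borrow checker, but copies a million elements on every call.
    let a = sum_owned(data.clone());

    // Same result with no copy; usually the fix once a profile points at the clone.
    let b = sum_borrowed(&data);

    assert_eq!(a, b);
    println!("{a}");
}
```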
Let's face it: programming books aren't usually much fun. Informative? Yes. Engaging? Sure. Some authors liven up their books with funny examples or witty asides, but the fun part is usually applying the knowledge found within a book, not its content. why's (poignant) Guide to Ruby is different. It's chock-full of comic strips, strange digressions, and seemingly off-topic sidebars. Cartoon foxes offer peanut-gallery-style commentary on the text. A strip about an elf and his pet ham provides increasingly strange problems for code examples. As the guide unfolds, the book's prose plays off the strips, and vice versa, until the author writes himself in, disguised as a rabbit, and bounds off the page. Needless to say, it's an unusual way to write a book about computer programming. But it's anything but boring. The (poignant) Guide was written by the pseudonymous programmer, artist, and musician known as “_why the lucky stiff,” a prolific member of the Ruby community during the programming language's explosive growth in the '00s. _why brought his eccentric humor and sense of adventure to everything he did, mixing his art into programming and his programming into his art. For example, his keynote at RailsConf 2006 featured musical interludes with his band The Thirsty Cups and a guest interview about pudding. “_why is our Picasso, or perhaps Dali,” says Software Architect Eleanor McHugh. “He would write beautiful code, code written just because it could be done, rather than to solve a particular problem. He wrote code because it was fun, and the code was justified by its own existence.” For a few years, it seemed like _why was everywhere: blogs, mailing lists, Twitter, GitHub, and the conference circuit. Then on August 19, 2009, he was gone. He deleted his website, his social media profiles, and his GitHub account along with all the projects he had created. Apart from a brief return in 2013 to publish a series of characteristically strange documents, he has stayed gone and made it clear he do
(read more)
The dust is now settling and we should finally have all proposals for new C23 features. Below you will find links to some of the newer ones that I co-authored. Many previous proposals are still open because WG14 only voted in favor of them, and they now have to be revisited before they can be decided. Only reserve names of optional functions if necessary. This mitigates the naming explosion in standard headers. http://open-std.org/jtc1/sc22/wg14/www/docs/n2839.htm Avoid evaluation of sizeof for VLA expressions for which the size is already known. http://open-std.org/jtc1/sc22/wg14/www/docs/n2
(read more)
A mini assembler for x86_64, written for fun and learning. Minias can assemble itself and many/most things compiled with the cproc C compiler, i.e. large amounts of real-world software. Project goals: a simple, tiny, fast implementation (in that order); assemble the output of cproc/qbe and chibicc; relocatable ELF output. Non-goals: assemble every assembly instruction; assemble other architectures; work as a library. To build, install the peg/leg parser generator, make, and a C compiler, then run make, or build manually: leg asm.peg > asm.peg.inc && cc -O2 *.c -o minias. Essential features: Self host with cpr
(read more)
The 1.18 release of the Go language is likely to include by far the biggest change to the language since its creation: parametric polymorphism, colloquially called generics. There has been much discussion about how the core libraries will adapt, and how to make that adaptation. See #45955 and #48594 for example, and there are others already and sure to be more soon. How to use these ideas in the standard library requires great thought and planning. Putting them in the library now also adds a significant burden to rolling out the release. I propose that we do not update the libraries in 1.18. The reason is simple and compelling: It's too much to do all at once, and we might get it wrong. The language changes have been worked on in some form for over a decade, but the library changes are very new, and we have no experience with the use of the new types in Go on which to base a strong case for their design. Yes, we can reason about them at length and much has been done. Experience with other languages helps, but one thing Go has taught us is that it grows its own ways of doing things. For generics, we don't know what those new ways are yet. Also, the compatibility promise makes the cost of getting any detail wrong quite high. We should wait, watch, and learn. Instead, I propose we still design, build, test, and use new libraries for slices, maps, channels, and so on, but start by putting them in the golang/x/exp repository. That way, these new libraries - which are truly experimental at this stage - can be tested in production, but can be changed, adapted, and grown for a cycle or two, letting the whole community try them out, if they are interested and willing to accept a little instability, without requiring every detail of every component to be ready from day one. Once they have soaked a bit, and updated through experience, we move them into the main repo as we have done with other externally-grown packages, but with the confidence that they work well in practice and are deserving of our compatibility promise. I realize everyone wants to get their hands on the fun of the new language feature, and is looking forward to fixing some of the issues in the core libraries that will be less clumsy once it arrives, but I strongly believe it is best to take it slow for now. Use, learn, study, and move cautiously.
(read more)
Austin Z. Henley, 10/15/2021: Imagine if someone summoned a magical genie and wished for a perfect code editor. Since it is perfect, does that mean it provides you everything you ever need to code the optimal solution? Or, since it is perfect, does it enable you to accomplish the coding aspect instantly? Thus, the paradox: does the perfect code editor mean that you spend nearly 100% of your work time using the editor, or does it mean you spend nearly 0% of your work time using the editor? What metric can we even use to measure the perfect code editor? How will we know if and when we have it? Are we close to reaching that point? The case for 100% of time in the editor: The trend for mainstream code editors seems to involve bringing more information into the editor. Version control info is displayed in the file listing. Feedback from linters, analyzers, and compilers is annotated on top of the editor. Issue tracking. Author info. Code reviews. Test status. All in the editor! Bringing this information closer together should mean a better feedback loop. The user doesn't need to bounce between information sources. This results in fewer clicks, less context switching, and lower cognitive load. In particular, I spend a lot of time rummaging through 20 browser tabs containing Stack Overflow, GitHub, and documentation pages. Even worse is searching on Slack. It is all annoying! But we are discussing the perfect code editor. All possible information is now inside the editor. Imagine GitHub's Copilot, Stack Overflow, all your customer feedback,
(read more)
31 May 2008 One of the most impressive hacks I've ever read about has to be the Black Sunday kill. Since the original 2001 Slashdot article I read on this is 99.9% quote, I'm going to do the same. I can see why they quoted so extensively; it'd be difficult to improve on the unusually succinct, well written summary provided by Pat from Belch: One of the original smart cards, entitled 'H' cards for Hughes, had design flaws which were discovered by the hacking community. These flaws enabled the extremely bright hacking community to reverse engineer their design, and to create smart card writers. The writers enabled the hackers to read and write to the smart card, and allowed them to change their subscription model to receive all the channels. Since the technology of satellite television is broadcast only, meaning you cannot send information TO the satellite, the system requires a phone line to communicate with DirecTV. The hackers could re-write their smart cards and receive all the channels, and unplug their phone lines leaving no way for DirecTV to track the abuse. DirecTV had built a mechanism into their system that allowed the updating of these smart cards through the satellite stream. Every receiver was designed to 'apply' these updates when it received them to the cards. DirecTV applied updates that looked for hacked cards, and then attempted to destroy the cards by writing updates that
(read more)
Outline: 1. Introduction 2. How does xorshift128 PRNG work? 3. Neural Networks and XOR gates 4. Using Neural Networks to model the xorshift128 PRNG 4.1 Neural Network Model Design 4.2 Model Results 4.3 Model Deep Dive 5. Creating a machine-learning-resistant version of xorshift128 6. Conclusion 1. Introduction This blog post proposes an approach to cracking Pseudo-Random Number Generators (PRNGs) using machine learning. By cracking, we mean that we can predict the sequence of random numbers using previously generated numbers, without knowledge of the seed. We started by breaking a simple PRNG, namely XORShift, following the lead of the post published in [1]. We simplified the structure of the neural network model proposed in that post and achieved higher accuracy. This post aims to show how to train a machine learning model that reaches 100% accuracy in reproducing the generated numbers without knowing the seed, and then takes a deep dive into the trained model to show how it works and to extract useful information from it. In the mentioned blog post [1], the author replicated the xorshift128 PRNG sequence with high accuracy, without having the PRNG seed, using a deep learning model. After training, the model can use any four consecutive generated numbers to replicate the same sequence of the PRNG with bitwise accuracy greater than 95%. The details of this experiment's implementation and the best-trained model can be found in [2]. At first glance, this seemed a bit counter-intuitive, as the whole idea behind machine learning algorithms is to learn from patterns in the data to perform a specific task, whether by supervised, unsupervised, or reinforcement learning. Pseudo-random number generators, on the other hand, are meant to generate random sequences, so these sequences should not follow any pattern. It therefore did not make much sense (at the beginning) to train an ML model, which learns from patterns in data, on the output of a PRNG that should not follow any pattern. Not only did the model learn, it reached 95% bitwise accuracy, meaning it generates the PRNG's exact output and gets, on average, only two bits wrong. So, how is this possible? Why can machine learning crack the PRNG? Can we even do better than 95% bitwise accuracy? That is what we are going to discuss in the rest of this article. Let's start by examining the xorshift128 algorithm. ** Editor's Note: How does this relate to security? While this research looks at a non-cryptographic PRNG, we are interested, generically, in understanding how deep learning-based approaches to finding latent patterns within functions presumed to be generating random output could work, as a prerequisite to attempting to use deep learning to find previously unknown patterns in cryptographic (P)RNGs, as this could potentially serve as an interesting supplementary method for cryptanalysis of these functions. Here, we show our work in beginning to explore this space. ** 2. How does xorshift128 PRNG work? To understand whether machine learning (ML) could crack the xorshift128 PRNG, we need to comprehend how it works and chec
(read more)
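(For reference, a minimal Rust sketch of the classic xorshift128 generator the article analyzes, following Marsaglia's published recurrence; the seed values below are arbitrary and chosen only for illustration.)

```rust
// Minimal sketch of Marsaglia's xorshift128 recurrence; seeds are arbitrary.
struct XorShift128 {
    x: u32,
    y: u32,
    z: u32,
    w: u32,
}

impl XorShift128 {
    fn next(&mut self) -> u32 {
        let t = self.x ^ (self.x << 11);
        self.x = self.y;
        self.y = self.z;
        self.z = self.w;
        self.w = (self.w ^ (self.w >> 19)) ^ (t ^ (t >> 8));
        self.w
    }
}

fn main() {
    let mut rng = XorShift128 { x: 123_456_789, y: 362_436_069, z: 521_288_629, w: 88_675_123 };
    for _ in 0..8 {
        print!("{} ", rng.next());
    }
    println!();
}
```

Note that each call shifts the state words along and emits the new w, so after four calls the internal state (x, y, z, w) consists exactly of the last four outputs; that is why four consecutive outputs are enough input for the prediction model described in the article.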
If you're using Python in the world of data science or scientific computing, you will soon discover that Python has two different packaging systems: pip and Conda. Which raises some questions: How are they different? What are the tradeoffs between the two? Which should you use? While it's not possible to answer this for every situation, in this article you will learn the basic differences, constrained to: Python only (Conda has support for other languages, but I won't go into that); Linux, including running on Docker, though with some mention of macOS and Windows; and the Conda-Forge package repository (Conda has multiple package repositories, or "channels"). By the end you should understand why Conda exists, when you might want to use it, and the tradeoffs involved in choosing each one. The starting point: which kind of dependencies? The fundamental difference between pip and Conda packaging is what they put in packages. Pip packages are Python libraries like NumPy or matplotlib. Conda packages include Python libraries (NumPy or matplotlib), C libraries (libjpeg), and executables (like C compilers, and even the Python interpreter itself). Pip: Python libraries only. For example, let's say you want to install Python 3.9 with NumPy, Pandas, and the gnuplot rendering tool, a tool that is unrelated to Python. Here's what the pip requirements.txt would look like (just the two Python libraries, numpy and pandas). Installing Python and gnuplot is out of scope for pip. You as a user must deal with this yourself. You might, for example, do so with a Docker image: FROM ubuntu:20.04 RUN apt-get update && apt-get install -y gnuplot python3.9 COPY requirements.txt . RUN pip install -r requirements.txt Both the Python interpreter and gnuplot need to come from system packages, in this case Ubuntu's packages. Conda: Any dependency can be a Conda package (almost). With Conda, Python and gnuplot are just more Conda packages, no different than NumPy or Pandas. The environment.yml that corresponds (somewhat) to the requirements.txt we saw above will include all of these packages: name: myenv channels: - conda-forge dependencies: - python=3.9 - numpy - pandas - gnuplot Conda only relies on the operating system for basic facilities, like the standard C library. Everything above that is Conda packages, not system packages. We can see the difference in the corresponding Dockerfile; there is no need to install any system packages: FROM continuumio/miniconda3 COPY environment.yml . RUN conda env create This base image ships with Conda pre-installed, but we're not relying on any existing Python install; we're installing a new one into the new environment. Note: Outside the very specific topic under discussion, the Dockerfiles in this article are not examples of best practices, since the added complexity would obscure the main point of the article. To ensure you're following all the best practices needed for secure, correct, fast Dockerfiles, check out the Python on Docker Production Handbook. Why Conda packages everything: Why did Conda make the decision to package everything, Python interpreter included? Ho
(read more)
Open access under a Creative Commons license. Abstract: The construction of various categories of "timed sets" is described, in which the timing of maps is considered modulo a "complexity order". The properties of these categories are developed: under appropriate conditions they form discrete, distributive restriction categories with an iteration. They provide a categorical basis for modeling functional complexity classes and allow the development of computability within these settings. Indeed, by considering "program objects" and the functions they compute, one can obtain models of computability – i.e. Turing categories – in which the total maps belong to specific complexity class
(read more)
Soft Vendor TAKERU was the world's first PC software vending machine, developed by Brother Industries, Ltd. in 1986. It has earned a page in the Japanese PC history books. It also made a big contributi
(read more)
If you’re knowledgeable in a technical field, writing a book to teach others a few things can be a rewarding experience on many different levels. With the many avenues available for self-publishing these days, an important question to ask yourself is “Do I need a publisher?”. The answer depends on your particular situation and your particular set of skills. In the twenty years that I have been writing books, I have taken both routes, and this post is a collection of the many things I have learned along the way. Using a Traditional Publisher A traditional publisher generally commissions you to write a manuscript, and then they do all the work necessary to turn that manu
(read more)
Microsoft delivers the latest Windows security and user experience updates monthly. Updates are modular, meaning that regardless of which update you currently have
(read more)