May 23, 2018


Jonathan Dowland


I'm experimenting with Mastodon, an alternative to Twitter. My account is I'm happy for recommendations on interesting people to follow!

Inspired by Iustin, I also started taking a look at Hakyll as a possible replacement for IkiWiki. (That's at, although there's nothing to see yet.)

23 May, 2018 04:18PM


Benjamin Mako Hill

Natural experiment showing how “wide walls” can support engagement and learning

Seymour Papert is credited as saying that tools to support learning should have “high ceilings” and “low floors.” The phrase is meant to suggest that tools should allow learners to do complex and intellectually sophisticated things but should also be easy to begin using quickly. Mitchel Resnick extended the metaphor to argue that learning toolkits should also have “wide walls” in that they should appeal to diverse groups of learners and allow for a broad variety of creative outcomes. In a new paper, Sayamindu Dasgupta and I attempted to provide an empirical test of Resnick’s wide walls theory. Using a natural experiment in the Scratch online community, we found causal evidence that “widening walls” can, as Resnick suggested, increase both engagement and learning.

Over the last ten years, the “wide walls” design principle has been widely cited in the design of new systems. For example, Resnick and his collaborators relied heavily on the principle in the design of the Scratch programming language. Scratch allows young learners to produce not only games, but also interactive art, music videos, greeting cards, stories, and much more. As part of that team, Sayamindu was guided by the “wide walls” principle when he designed and implemented the Scratch cloud variables system in 2011-2012.

While designing the system, Sayamindu hoped to “widen walls” by supporting a broader range of ways to use variables and data structures in Scratch. Scratch cloud variables extend the affordances of the normal Scratch variable by adding persistence and shared-ness. A simple example of something possible with cloud variables, but not without them, is a global high-score leaderboard in a game (example code is below). After the system was launched, we saw many young Scratch users using the system to engage with data structures in new and incredibly creative ways.

Example of Scratch code that uses a cloud variable to keep track of high-scores among all players of a game.

Although these examples reflected powerful anecdotal evidence, we were also interested in using quantitative data to reflect the causal effect of the system. Understanding the causal effect of a new design in real world settings is a major challenge. To do so, we took advantage of a “natural experiment” and some clever techniques from econometrics to measure how learners’ behavior changed when they were given access to a wider design space.

Understanding the design of our study requires understanding a little bit about how access to the Scratch cloud variable system is granted. Although the system has been accessible to Scratch users since 2013, new Scratch users do not get access immediately. They are granted access only after a certain amount of time and activity on the website (the specific criteria are not public). Our “experiment” involved a sudden change in policy that altered the criteria for who gets access to the cloud variable feature. Through no act of their own, more than 14,000 users were given access to the feature, literally overnight. We looked at these Scratch users immediately before and after the policy change to estimate the effect of access to the broader design space that cloud variables afforded.

We found that use of data-related features was, as predicted, increased by both access to and use of cloud variables. We also found that this increase was not only an effect of projects that use cloud variables themselves. In other words, learners with access to cloud variables—and especially those who had used it—were more likely to use “plain-old” data-structures in their projects as well.

The graph below visualizes the results of one of the statistical models in our paper. It suggests that we would expect 33% of projects by a prototypical “average” Scratch user to use data structures if the user in question had never used cloud variables, but 60% of projects by a similar user to do so if they had used the system.

Model-predicted probability that a project made by a prototypical Scratch user will contain data structures (w/o counting projects with cloud variables)

It is important to note that the estimated effect above is a “local average effect” among people who used the system because they were granted access by the sudden change in policy (this is a subtle but important point that we explain in some depth in the paper). Although we urge care and skepticism in interpreting our numbers, we believe our results are encouraging evidence in support of the “wide walls” design principle.

Of course, our work is not without important limitations. Critically, we also found that the rate of adoption of cloud variables was very low. Although it is hard to pinpoint the exact reason for this from the data we observed, it has been suggested that widening walls may have a potential negative side-effect of making it harder for learners to imagine what the new creative possibilities might be in the absence of targeted support and scaffolding. Also important to remember is that our study measures “wide walls” in a specific way in a specific context and that it is hard to know how well our findings will generalize to other contexts and communities. We discuss these caveats, as well as our methods, models, and theoretical background, in detail in our paper, which is now available for download as an open-access piece from the ACM digital library.

This blog post, and the open access paper that it describes, is a collaborative project with Sayamindu Dasgupta. Financial support came from the eScience Institute and the Department of Communication at the University of Washington. Quantitative analyses for this project were completed using the Hyak high performance computing cluster at the University of Washington.

23 May, 2018 04:17PM by Benjamin Mako Hill


Vincent Bernat

Multi-tier load-balancing with Linux

A common solution to provide a highly-available and scalable service is to insert a load-balancing layer to spread requests from users to backend servers.1 We usually have several expectations for such a layer:

  • It allows a service to scale by pushing traffic to newly provisioned backend servers. It should also be able to scale itself when it becomes the bottleneck.
  • It provides high availability to the service. If one server becomes unavailable, the traffic should be quickly steered to another server. The load-balancing layer itself should also be highly available.
  • It handles both short and long connections. It is flexible enough to offer all the features backends generally expect from a load-balancer like TLS or HTTP routing.
  • With some cooperation, any expected change should be seamless: rolling out new software on the backends, adding or removing backends, or scaling up or down the load-balancing layer itself.

The problem and its solutions are well known. Among recently published articles on the topic, “Introduction to modern network load-balancing and proxying” provides an overview of the state of the art. Google released “Maglev: A Fast and Reliable Software Network Load Balancer” describing their in-house solution in detail.2 However, the associated software is not available. Basically, building a load-balancing solution with commodity servers consists of assembling three components:

  • ECMP routing
  • stateless L4 load-balancing
  • stateful L7 load-balancing

In this article, I describe and support a multi-tier solution using Linux and only open-source components. It should offer you the basis to build a production-ready load-balancing layer.

Update (2018.05)

Facebook just released Katran, an L4 load-balancer implemented with XDP and eBPF and using consistent hashing. It could be inserted in the configuration described below.

Last tier: L7 load-balancing

Let’s start with the last tier. Its role is to provide high availability, by forwarding requests to only healthy backends, and scalability, by spreading requests fairly between them. Working in the highest layers of the OSI model, it can also offer additional services, like TLS-termination, HTTP routing, header rewriting, rate-limiting of unauthenticated users, and so on. Being stateful, it can leverage complex load-balancing algorithms. Being the first point of contact with backend servers, it should ease maintenance and minimize impact during daily changes.

L7 load-balancers
The last tier of the load-balancing solution is a set of L7 load-balancers receiving user connections and forwarding them to the backends.

It also terminates client TCP connections. This introduces some loose coupling between the load-balancing components and the backend servers with the following benefits:

  • connections to servers can be kept open for lower resource use and latency,
  • requests can be retried transparently in case of failure,
  • clients can use a different IP protocol than servers, and
  • servers do not have to care about path MTU discovery, TCP congestion control algorithms, avoidance of the TIME-WAIT state and various other low-level details.

Many pieces of software would fit in this layer and an ample literature exists on how to configure them. You could look at HAProxy, Envoy or Træfik. Here is a configuration example for HAProxy:

# L7 load-balancer endpoint
frontend l7lb
  # Listen on both IPv4 and IPv6
  bind :80 v4v6
  # Redirect everything to a default backend
  default_backend servers
  # Healthchecking
  acl dead nbsrv(servers) lt 1
  acl disabled nbsrv(enabler) lt 1
  monitor-uri /healthcheck
  monitor fail if dead || disabled

# IPv6-only servers with HTTP healthchecking and remote agent checks
backend servers
  balance roundrobin
  option httpchk
  server web1 [2001:db8:1:0:2::1]:80 send-proxy check agent-check agent-port 5555
  server web2 [2001:db8:1:0:2::2]:80 send-proxy check agent-check agent-port 5555
  server web3 [2001:db8:1:0:2::3]:80 send-proxy check agent-check agent-port 5555
  server web4 [2001:db8:1:0:2::4]:80 send-proxy check agent-check agent-port 5555

# Fake backend: if the local agent check fails, we assume we are dead
backend enabler
  server enabler [::1]:0 agent-check agent-port 5555

This configuration is the most incomplete piece of this guide. However, it illustrates two key concepts for operability:

  1. Healthchecking of the web servers is done both at HTTP-level (with check and option httpchk) and using an auxiliary agent check (with agent-check). The latter makes it easy to put a server into maintenance or to orchestrate a progressive rollout. On each backend, you need a process listening on port 5555 and reporting the status of the service (UP, DOWN, MAINT). A simple socat process can do the trick:3

    socat -ly \
      TCP6-LISTEN:5555,ipv6only=0,reuseaddr,fork \
      OPEN:/etc/lb/agent-check,rdonly

    Put UP in /etc/lb/agent-check when the service is in nominal mode. If the regular healthcheck is also positive, HAProxy will send requests to this node. When you need to put it in maintenance, write MAINT and wait for the existing connections to terminate. Use READY to cancel this mode.

  2. The load-balancer itself should provide a healthcheck endpoint (/healthcheck) for the upper tier. It will return a 503 error if either there are no backend servers available or the enabler backend has been put down through the agent check. The same mechanism as for regular backends can be used to signal the unavailability of this load-balancer.
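
If socat feels too bare, the agent described in the first point could be sketched in Python. This is only a minimal sketch, not the agent used in this setup: it assumes the state file is /etc/lb/agent-check as above, that a missing or empty file should be reported as DOWN, and the names read_state, AgentHandler, AgentServer and main are made up for the illustration.

```python
import socket
import socketserver

STATE_FILE = "/etc/lb/agent-check"  # same file as in the socat example


def read_state(path=STATE_FILE):
    """Return the line sent to HAProxy, for example b"UP\n" or b"MAINT\n"."""
    try:
        with open(path, encoding="ascii") as f:
            words = f.read().split()
    except (OSError, UnicodeError):
        words = []
    # Assumption of this sketch: no usable state means the node is DOWN.
    state = words[0] if words else "DOWN"
    return state.encode("ascii") + b"\n"


class AgentHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # One status line per connection, then the connection is closed.
        self.request.sendall(read_state())


class AgentServer(socketserver.ThreadingTCPServer):
    address_family = socket.AF_INET6
    allow_reuse_address = True


def main(port=5555):
    """Serve agent checks forever on the given port."""
    with AgentServer(("::", port), AgentHandler) as server:
        server.serve_forever()
```

Calling main() starts the listener. As with socat, the file is read again on every check, so writing MAINT to it takes effect at the next agent check.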

Additionally, the send-proxy directive enables the proxy protocol to transmit the real clients’ IP addresses. This protocol also works for non-HTTP connections and is supported by a variety of servers, including nginx:

http {
  server {
    listen [::]:80 default ipv6only=off proxy_protocol;
    root /var/www;
    set_real_ip_from ::/0;
    real_ip_header proxy_protocol;
  }
}
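
To make the proxy protocol less abstract, here is a minimal sketch of its human-readable version 1 header, the one sent by send-proxy before any payload. The function names build_proxy_v1 and parse_proxy_v1 are hypothetical, and only the TCP4 and TCP6 cases are handled.

```python
import ipaddress


def build_proxy_v1(src_ip, dst_ip, src_port, dst_port):
    """Build the single text line that precedes the proxied payload."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must share an address family")
    family = "TCP4" if src.version == 4 else "TCP6"
    return f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n".encode("ascii")


def parse_proxy_v1(line):
    """Return (client_ip, client_port) from a version 1 header line."""
    fields = line.decode("ascii").rstrip("\r\n").split(" ")
    if len(fields) != 6 or fields[0] != "PROXY" or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a PROXY v1 TCP header")
    return ipaddress.ip_address(fields[2]), int(fields[4])


header = build_proxy_v1("2001:db8::1", "2001:db8::2", 40000, 80)
print(header)
print(parse_proxy_v1(header))
```

A server that accepts this header, as nginx does with the proxy_protocol option, reads the line first and then treats the rest of the stream as the regular connection.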

As is, this solution is not complete. We have just moved the availability and scalability problem somewhere else. How do we load-balance the requests between the load-balancers?

First tier: ECMP routing

On most modern routed IP networks, redundant paths exist between clients and servers. For each packet, routers have to choose a path. When the cost associated to each path is equal, incoming flows4 are load-balanced among the available destinations. This characteristic can be used to balance connections among available load-balancers:

ECMP routing
ECMP routing is used as a first tier. Flows are spread among available L7 load-balancers. Routing is stateless and asymmetric. Backend servers are not represented.

There is little control over the load-balancing but ECMP routing brings the ability to scale both tiers horizontally. A common way to implement such a solution is to use BGP, a routing protocol to exchange routes between network equipment. Each load-balancer announces to its connected routers the IP addresses it is serving.

If we assume you already have BGP-enabled routers available, ExaBGP is a flexible solution to let the load-balancers advertise their availability. Here is a configuration for one of the load-balancers:

# Healthcheck for IPv6
process service-v6 {
  run python -m exabgp healthcheck -s --interval 10 --increase 0 --cmd "test -f /etc/lb/v6-ready -a ! -f /etc/lb/disable";
  encoder text;
}

template {
  # Template for IPv6 neighbors
  neighbor v6 {
    local-address 2001:db8::;
    local-as 65000;
    peer-as 65000;
    hold-time 6;
    family {
      ipv6 unicast;
    }
    api services-v6 {
      processes [ service-v6 ];
    }
  }
}

# First router
neighbor 2001:db8:: {
  inherit v6;
}

# Second router
neighbor 2001:db8:: {
  inherit v6;
}

If /etc/lb/v6-ready is present and /etc/lb/disable is absent, all the IP addresses configured on the lo interface will be announced to both routers. If the other load-balancers use a similar configuration, the routers will distribute incoming flows between them. Some external process should manage the existence of the /etc/lb/v6-ready file by checking the health of the load-balancer (using the /healthcheck endpoint for example). An operator can remove a load-balancer from the rotation by creating the /etc/lb/disable file.
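
The external process mentioned above could look like the following. This is a minimal sketch under assumptions: the healthcheck URL http://[::1]/healthcheck, the polling interval and the function names are all illustrative, and only the /etc/lb/v6-ready path comes from the configuration above.

```python
import http.client
import os
import time
import urllib.request


def is_healthy(url, timeout=2):
    """True if the healthcheck URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def sync_ready_file(healthy, path="/etc/lb/v6-ready"):
    """Create the flag file when healthy, remove it otherwise."""
    if healthy:
        with open(path, "a"):
            pass
    else:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass


def main(url="http://[::1]/healthcheck", interval=5):
    """Poll the (assumed) local healthcheck URL forever."""
    while True:
        sync_ready_file(is_healthy(url))
        time.sleep(interval)
```

ExaBGP's healthcheck only tests for the presence of the file, so this loop is all that is needed to withdraw the routes when the local load-balancer stops answering.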

To get more details on this part, have a look at “High availability with ExaBGP.” If you are in the cloud, this tier is usually implemented by your cloud provider, either using an anycast IP address or a basic L4 load-balancer.

Unfortunately, this solution is not resilient when an expected or unexpected change happens. Notably, when adding or removing a load-balancer, the number of available routes for a destination changes. The hashing algorithm used by routers is not consistent and flows are reshuffled among the available load-balancers, breaking existing connections:

Stability of ECMP routing 1/2
ECMP routing is unstable when a change happens. An additional load-balancer is added to the pool and the flows are routed to different load-balancers, which do not have the appropriate entries in their connection tables.
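
The reshuffling can be reproduced with a toy model. The sketch below assumes the router picks a next hop with a plain "hash modulo number of paths" rule, which is a simplification (real routers use their own hash functions and bucket schemes), and the flow tuples and load-balancer names are made up.

```python
import hashlib


def flow_hash(flow):
    """Stable hash of a flow tuple (a stand-in for the router's hash)."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_next_hop(flow, next_hops):
    """Non-consistent selection: hash modulo the number of next hops."""
    return next_hops[flow_hash(flow) % len(next_hops)]


def moved_fraction(flows, before, after):
    """Share of flows whose next hop differs between two sets of next hops."""
    moved = sum(pick_next_hop(f, before) != pick_next_hop(f, after) for f in flows)
    return moved / len(flows)


# Hypothetical flows: (source IP, source port, destination IP, destination port)
flows = [("2001:db8:f00::%x" % i, 40000 + i % 1000, "2001:db8::1", 80)
         for i in range(10000)]
four = ["lb1", "lb2", "lb3", "lb4"]
five = four + ["lb5"]
print(f"{moved_fraction(flows, four, five):.0%} of flows change load-balancer")
```

With this rule, a flow keeps its load-balancer only when its hash gives the same remainder modulo 4 and modulo 5, which happens for 4 values out of every 20. On average, four flows out of five are therefore moved when going from four to five load-balancers, although only one fifth of them would need to move to rebalance the load.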

Moreover, each router may choose its own routes. When a router becomes unavailable, the second one may route the same flows differently:

Stability of ECMP routing 2/2
A router becomes unavailable and the remaining router load-balances its flows differently. One of them is routed to a different load-balancer, which does not have the appropriate entry in its connection table.

If you think this is not an acceptable outcome, notably if you need to handle long connections like file downloads, video streaming or websocket connections, you need an additional tier. Keep reading!

Second tier: L4 load-balancing

The second tier is the glue between the stateless world of IP routers and the stateful land of L7 load-balancing. It is implemented with L4 load-balancing. The terminology can be a bit confusing here: this tier routes IP datagrams (no TCP termination) but the scheduler uses both destination IP and port to choose an available L7 load-balancer. The purpose of this tier is to ensure all members take the same scheduling decision for an incoming packet.

There are two options:

  • stateful L4 load-balancing with state synchronization across the members, or
  • stateless L4 load-balancing with consistent hashing.

The first option increases complexity and limits scalability. We won’t use it.5 The second option is less resilient during some changes but can be enhanced with a hybrid approach using a local state.

We use IPVS, a performant L4 load-balancer running inside the Linux kernel, with Keepalived, a frontend to IPVS with a set of healthcheckers to kick out an unhealthy component. IPVS is configured to use the Maglev scheduler, a consistent hashing algorithm from Google. Among its family, this is a great algorithm because it spreads connections fairly, minimizes disruptions during changes and is quite fast at building its lookup table. Finally, to improve performance, we let the last tier (the L7 load-balancers) send answers back directly to the clients without involving the second tier (the L4 load-balancers). This is referred to as direct server return (DSR) or direct routing (DR).

Second tier: L4 load-balancing
L4 load-balancing with IPVS and consistent hashing as a glue between the first tier and the third tier. Backend servers have been omitted. Dotted lines represent the path for the return packets.
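
To give an idea of how the Maglev scheduler spreads connections, here is a minimal sketch of the lookup table population described in the Maglev paper. It is not the IPVS implementation: SHA-256 is used as a stand-in for the two hash functions, the backend names are made up, and the table size 65537 is simply a prime number.

```python
import hashlib
from collections import Counter


def _h(label, name):
    digest = hashlib.sha256(f"{label}:{name}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def build_table(backends, size=65537):
    """Populate a Maglev lookup table; size must be a prime number."""
    if not backends:
        raise ValueError("at least one backend is required")
    # Each backend walks the table in its own order: offset, offset + skip, ...
    offsets = [_h("offset", b) % size for b in backends]
    skips = [_h("skip", b) % (size - 1) + 1 for b in backends]
    position = [0] * len(backends)  # progress in each backend's permutation
    table = [None] * size
    filled = 0
    while True:
        # Backends take turns claiming their next preferred free slot.
        for i, backend in enumerate(backends):
            slot = (offsets[i] + position[i] * skips[i]) % size
            while table[slot] is not None:
                position[i] += 1
                slot = (offsets[i] + position[i] * skips[i]) % size
            table[slot] = backend
            position[i] += 1
            filled += 1
            if filled == size:
                return table


def schedule(table, flow):
    """Pick the backend for a flow, e.g. a (source IP, source port) tuple."""
    return table[_h("flow", repr(flow)) % len(table)]


l7 = ["l7lb1", "l7lb2", "l7lb3", "l7lb4", "l7lb5"]
full = build_table(l7)
degraded = build_table(l7[:-1])  # the same pool without l7lb5
changed = sum(a != b for a, b in zip(full, degraded)) / len(full)
print(Counter(full))
print(f"{changed:.1%} of the table entries changed")
```

Because the backends take turns, each of them ends up with the same number of slots, give or take one. When a backend is removed, its slots have to be reassigned, and the point of the algorithm is that few other entries move along with them.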

With such a setup, we expect packets from a flow to be able to move freely between the components of the first two tiers while sticking to the same L7 load-balancer.


Assuming ExaBGP has already been configured as described in the previous section, let’s start with the configuration of Keepalived:

virtual_server_group VS_GROUP_MH_IPv6 {
  2001:db8:: 80
}
virtual_server group VS_GROUP_MH_IPv6 {
  lvs_method TUN  # Tunnel mode for DSR
  lvs_sched mh    # Scheduler: Maglev
  sh-port         # Use port information for scheduling
  protocol TCP
  delay_loop 5
  alpha           # All servers are down on start
  omega           # Execute quorum_down on shutdown
  quorum_up   "/bin/touch /etc/lb/v6-ready"
  quorum_down "/bin/rm -f /etc/lb/v6-ready"

  # First L7 load-balancer
  real_server 2001:db8:: 80 {
    weight 1
    HTTP_GET {
      url {
        path /healthcheck
        status_code 200
      }
      connect_timeout 2
    }
  }

  # Many others...
}

The quorum_up and quorum_down statements define the commands to be executed when the service becomes available and unavailable respectively. The /etc/lb/v6-ready file is used as a signal to ExaBGP to advertise the service IP address to the neighbor routers.

Additionally, IPVS needs to be configured to continue routing packets from a flow moved from another L4 load-balancer. It should also continue routing packets to unavailable destinations to ensure we can properly drain an L7 load-balancer.

# Schedule non-SYN packets
sysctl -qw net.ipv4.vs.sloppy_tcp=1
# Do NOT reschedule a connection when destination
# doesn't exist anymore
sysctl -qw net.ipv4.vs.expire_nodest_conn=0
sysctl -qw net.ipv4.vs.expire_quiescent_template=0

The Maglev scheduling algorithm will be available with Linux 4.18, thanks to Inju Song. For older kernels, I have prepared a backport.6 Use of source hashing as a scheduling algorithm will hurt the resilience of the setup.

DSR is implemented using the tunnel mode. This method is compatible with routed datacenters and cloud environments. Requests are tunneled to the scheduled peer using IPIP encapsulation. It adds a small overhead and may lead to MTU issues. If possible, ensure you are using a larger MTU for communication between the second and the third tier.7 Otherwise, it is better to explicitly allow fragmentation of IP packets:

sysctl -qw net.ipv4.vs.pmtu_disc=0
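
The "larger MTU" figures in footnote 7 follow from the size of the outer header added by the tunnel. As a small worked example, assuming 1500-byte inner packets and outer IP headers without any options or extension headers:

```python
# Bytes added by the outer IP header (assumption: no options, no extension headers)
OUTER_HEADER = {"ipip": 20, "ip6ip6": 40}


def tunnel_mtu(inner_mtu, mode):
    """MTU needed between the tiers to carry inner_mtu-sized packets unfragmented."""
    return inner_mtu + OUTER_HEADER[mode]


print(tunnel_mtu(1500, "ipip"))    # IPv4 in IPv4
print(tunnel_mtu(1500, "ip6ip6"))  # IPv6 in IPv6
```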

You also need to configure the L7 load-balancers to handle encapsulated traffic:8

# Setup IPIP tunnel to accept packets from any source
ip tunnel add tunlv6 mode ip6ip6 local 2001:db8::
ip link set up dev tunlv6
ip addr add 2001:db8:: dev tunlv6

Evaluation of the resilience

As configured, the second tier increases the resilience of this setup for two reasons:

  1. The scheduling algorithm is using a consistent hash to choose its destination. Such an algorithm reduces the negative impact of expected or unexpected changes by minimizing the number of flows moving to a new destination. “Consistent Hashing: Algorithmic Tradeoffs” offers more details on this subject.

  2. IPVS keeps a local connection table for known flows. When a change impacts only the third tier, existing flows will be correctly directed according to the connection table.

If we add or remove an L4 load-balancer, existing flows are not impacted because each load-balancer makes the same decision, as long as they see the same set of L7 load-balancers:

L4 load-balancing instability 1/3
Losing an L4 load-balancer has no impact on existing flows. Each arrow is an example of a flow. The dots are flow endpoints bound to the associated load-balancer. If they had moved to another load-balancer, the connection would have been lost.

If we add an L7 load-balancer, existing flows are not impacted either because only new connections will be scheduled to it. For existing connections, IPVS will look at its local connection table and continue to forward packets to the original destination. Similarly, if we remove an L7 load-balancer, only existing flows terminating at this load-balancer are impacted. Other existing connections will be forwarded correctly:

L4 load-balancing instability 2/3
Losing an L7 load-balancer only impacts the flows bound to it.
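
The interplay between the consistent hash and the local connection table can be sketched as follows. This is only an illustration, not how IPVS is implemented: rendezvous hashing is swapped in for Maglev to keep the code short, and the class, the flow tuples and the load-balancer names are hypothetical.

```python
import hashlib


def _score(flow, backend):
    digest = hashlib.sha256(f"{flow!r}|{backend}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


class L4Balancer:
    def __init__(self, backends):
        self.backends = list(backends)
        self.connections = {}  # local connection table: flow -> L7 load-balancer

    def route(self, flow):
        backend = self.connections.get(flow)
        if backend is None:
            # Unknown flow: consistent hashing (here, the highest score wins)
            backend = max(self.backends, key=lambda b: _score(flow, b))
            self.connections[flow] = backend
        return backend


flows = [("2001:db8:f00::%x" % i, 40000 + i) for i in range(1000)]
l7 = ["l7lb1", "l7lb2", "l7lb3"]

first = L4Balancer(l7)
before = {f: first.route(f) for f in flows}

# Another L4 load-balancer without any state makes the same decisions.
second = L4Balancer(l7)
print(all(second.route(f) == before[f] for f in flows))

# Adding an L7 load-balancer does not move flows already in the table.
first.backends.append("l7lb4")
print(all(first.route(f) == before[f] for f in flows))

# Without state, the only flows that move are the ones scheduled to the newcomer.
third = L4Balancer(l7 + ["l7lb4"])
print(all(third.route(f) in (before[f], "l7lb4") for f in flows))
```

The last case is the one where connections can break: a flow lands on an L4 load-balancer that has no entry for it and the hash now points to the new L7 load-balancer.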

We need to have simultaneous changes on both levels to get a noticeable impact. For example, when adding both an L4 load-balancer and an L7 load-balancer, only connections moved to an L4 load-balancer without state and scheduled to the new L7 load-balancer will be broken. Thanks to the consistent hashing algorithm, other connections will stay bound to the right L7 load-balancer. During a planned change, this disruption can be minimized by adding the new L4 load-balancers first, waiting a few minutes, then adding the new L7 load-balancers.

L4 load-balancing instability 3/3
Both an L4 load-balancer and an L7 load-balancer come back to life. The consistent hash algorithm ensures that only one fifth of the existing connections would be moved to the incoming L7 load-balancer. Some of them continue to be routed through their original L4 load-balancer, which mitigates the impact.

Additionally, IPVS correctly routes ICMP messages to the same L7 load-balancers as the associated connections. Notably, this ensures path MTU discovery works and there is no need for smart workarounds.

Tier 0: DNS load-balancing

Optionally, you can add DNS load-balancing to the mix. This is useful either if your setup spans multiple datacenters or multiple cloud regions, or if you want to break a large load-balancing cluster into smaller ones. It is not intended to replace the first tier as it doesn’t share the same characteristics: load-balancing is unfair (it is not flow-based) and recovery from a failure is slow.

Complete load-balancing solution
A complete load-balancing solution spanning two datacenters.

gdnsd is an authoritative-only DNS server with integrated healthchecking. It can serve zones from master files using the RFC 1035 zone format:

@ SOA ns1 1 7200 1800 259200 900
@ NS
@ NS
@ MX 10 smtp

@     60 DYNA multifo!web
www   60 DYNA multifo!web
smtp     A

The special RR type DYNA will return A and AAAA records after querying the specified plugin. Here, the multifo plugin implements an all-active failover of monitored addresses:

service_types => {
  web => {
    plugin => http_status
    url_path => /healthcheck
    down_thresh => 5
    interval => 5
  }
  ext => {
    plugin => extfile
    file => /etc/lb/ext
    def_down => false
  }
}

plugins => {
  multifo => {
    web => {
      service_types => [ ext, web ]
      addrs_v4 => [, ]
      addrs_v6 => [ 2001:db8::, 2001:db8:: ]
    }
  }
}

In nominal state, an A request will be answered with both and A healthcheck failure will update the returned set accordingly. It is also possible to administratively remove an entry by modifying the /etc/lb/ext file. For example, with the following content, will not be advertised anymore:

 => UP
 => DOWN
2001:db8::c633:6401 => UP
2001:db8::c633:6402 => UP
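
The all-active behaviour can be summarized in a few lines. The sketch below is a simplified model and not gdnsd's code: it assumes the "address => STATE" syntax shown above, treats an address missing from a monitor as up (as def_down => false does for the ext file), and ignores any threshold handling gdnsd may apply.

```python
def parse_extfile(text):
    """Parse 'address => STATE' lines into {address: is_up}."""
    states = {}
    for line in text.splitlines():
        address, arrow, state = line.partition("=>")
        if arrow and address.strip():
            states[address.strip()] = state.strip().upper() == "UP"
    return states


def answer(addresses, *monitors):
    """Keep the addresses that every monitor considers up."""
    return [a for a in addresses if all(m.get(a, True) for m in monitors)]


addrs_v6 = ["2001:db8::c633:6401", "2001:db8::c633:6402"]
ext = parse_extfile("2001:db8::c633:6401 => UP\n2001:db8::c633:6402 => DOWN\n")
web = {"2001:db8::c633:6401": True, "2001:db8::c633:6402": True}  # HTTP checks
print(answer(addrs_v6, ext, web))
```

Here the second address passes its HTTP healthcheck but is administratively marked DOWN, so only the first one is returned.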

You can find all the configuration files and the setup of each tier in the GitHub repository. If you want to replicate this setup at a smaller scale, it is possible to collapse the second and the third tiers by using either localnode or network namespaces. Even if you don’t need its fancy load-balancing services, you should keep the last tier: while backend servers come and go, the L7 load-balancers bring stability, which translates to resiliency.

  1. In this article, “backend servers” are the servers behind the load-balancing layer. To avoid confusion, we will not use the term “frontend.” ↩︎

  2. A good summary of the paper is available from Adrian Colyer. From the same author, you may also have a look at the summary for “Stateless datacenter load-balancing with Beamer.” ↩︎

  3. If you feel this solution is fragile, feel free to develop your own agent. It could coordinate with a key-value store to determine the wanted state of the server. It is possible to centralize the agent in a single location, but you may get a chicken-and-egg problem to ensure its availability. ↩︎

  4. A flow is usually determined by the source and destination IP and the L4 protocol. Alternatively, the source and destination port can also be used. The router hashes this information to choose the destination. For Linux, you may find more information on this topic in “Celebrating ECMP in Linux.” ↩︎

  5. On Linux, it can be implemented by using Netfilter for load-balancing and conntrackd to synchronize state. IPVS only provides active/backup synchronization. ↩︎

  6. The backport is not strictly equivalent to its original version. Be sure to check the README file to understand the differences. Briefly, in Keepalived configuration, you should:

    • not use inhibit_on_failure
    • use sh-port
    • not use sh-fallback


  7. At least 1520 for IPv4 and 1540 for IPv6. ↩︎

  8. As is, this configuration is insecure. You need to ensure only the L4 load-balancers will be able to send IPIP traffic. ↩︎

23 May, 2018 07:45AM by Vincent Bernat


Joachim Breitner

The diameter of German+English

Languages never map directly onto each other. The English word fresh can mean frisch or frech, but frisch can also be cool. Jumping from one word to another like this yields entertaining sequences that take you to completely different things. Here is one I came up with:

frech – fresh – frisch – cool – abweisend – dismissive – wegwerfend – trashing – verhauend – banging – Geklopfe – knocking – …

And I could go on … but how far? So here is a little experiment I ran:

  1. I obtained a German-English dictionary. Conveniently, after registration, you can get’s translation file, which is simply a text file with three columns: German, English, Word form.

  2. I wrote a program that takes these words and first canonicalizes them a bit: removing attributes like [ugs.], [regional], {f}, the “to” in front of verbs and other embellishments.

  3. I created the undirected, bipartite graph of all these words. This is a pretty big graph – ~750k words in each language, a million edges. A path in this graph is precisely a sequence like the one above.

  4. In this graph, I tried to find a diameter. The diameter of a graph is the longest of all shortest paths, that is, the longest path between two nodes that you cannot connect with a shorter path.
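
The steps above can be sketched in a few lines of Python. This is not the program I ran, only an illustration: the canonicalization rules are a guess at "a bit", the word pairs are taken from the chain at the top of this post, and running a breadth-first search from every node is only reasonable for toy data.

```python
import re
from collections import defaultdict, deque


def canonicalize(term, english=False):
    """Strip [..] and {..} annotations and, for English, a leading 'to '."""
    term = re.sub(r"\[[^\]]*\]|\{[^}]*\}", " ", term)
    term = " ".join(term.split())
    return re.sub(r"^to ", "", term) if english else term


def build_graph(pairs):
    """Undirected bipartite graph; nodes are ('de', word) or ('en', word)."""
    graph = defaultdict(set)
    for german, english in pairs:
        de = ("de", canonicalize(german))
        en = ("en", canonicalize(english, english=True))
        graph[de].add(en)
        graph[en].add(de)
    return graph


def longest_shortest_path(graph, start):
    """Breadth-first search: path from start to one of the farthest nodes."""
    parent = {start: None}
    queue = deque([start])
    node = start
    while queue:
        node = queue.popleft()  # the last node dequeued is a farthest one
        for neighbour in sorted(graph[node]):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def diameter(graph):
    """Longest of all shortest paths, trying every node as a start."""
    return max((longest_shortest_path(graph, n) for n in sorted(graph)), key=len)


pairs = [("frech", "fresh"), ("frisch", "fresh"), ("frisch", "cool [coll.]"),
         ("abweisend", "cool"), ("abweisend", "dismissive")]
path = diameter(build_graph(pairs))
print(" – ".join(word for _, word in path))
```

Nodes carry their language so that the graph stays bipartite, and unreachable pairs of words are simply never visited by the search, so they do not count towards the diameter.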

Because the graph is big (and my code maybe not fully optimized), it ran a few hours, but here it is: The English expression be annoyed by sb. and the German noun Icterus are related by 55 translations. Here is the full list:

  • be annoyed by sb.
  • durch jdn. verärgert sein
  • be vexed with sb.
  • auf jdn. böse sein
  • be angry with sb.
  • jdm. böse sein
  • have a grudge against sb.
  • jdm. grollen
  • bear sb. a grudge
  • jdm. etw. nachtragen
  • hold sth. against sb.
  • jdm. etw. anlasten
  • charge sb. with sth.
  • jdn. mit etw. [Dat.] betrauen
  • entrust sb. with sth.
  • jdm. etw. anvertrauen
  • entrust sth. to sb.
  • jdm. etw. befehlen
  • tell sb. to do sth.
  • jdn. etw. heißen
  • call sb. names
  • jdn. beschimpfen
  • abuse sb.
  • jdn. traktieren
  • pester sb.
  • jdn. belästigen
  • accost sb.
  • jdn. ansprechen
  • address oneself to sb.
  • sich an jdn. wenden
  • approach
  • erreichen
  • hit
  • Treffer
  • direct hit
  • Volltreffer
  • bullseye
  • Hahnenfuß-ähnlicher Wassernabel
  • pennywort
  • Mauer-Zimbelkraut
  • Aaron's beard
  • Großkelchiges Johanniskraut
  • Jerusalem star
  • Austernpflanze
  • goatsbeard
  • Geißbart
  • goatee
  • Ziegenbart
  • buckhorn plantain
  • Breitwegerich / Breit-Wegerich
  • birdseed
  • Acker-Senf / Ackersenf
  • yellows
  • Gelbsucht
  • icterus
  • Icterus

Pretty neat!

So what next?

I could try to obtain an even longer chain by forgetting whether a word is English or German (and lower-casing everything), thus allowing wild jumps like hat – hut – hütte – lodge.

Or write a tool where you can enter two arbitrary words and it finds such a path between them, if one exists. Unfortunately, it seems that the terms of the data dump would not allow me to create such a tool as a web site (but maybe I can ask).

Or I could throw in additional languages!

What would you do?

23 May, 2018 06:30AM by Joachim Breitner

May 22, 2018


Jonathan McDowell

Home Automation: Graphing MQTT sensor data

So I’ve set up an MQTT broker and I’m feeding it temperature data. How do I actually make use of this data? Turns out collectd has an MQTT plugin, so I went about setting it up to record temperature over time.

First problem was that although the plugin supports MQTT/TLS it doesn’t support it for subscriptions until 5.8, so I had to backport the fix to the 5.7.1 packages my main collectd host is running.

The other problem is that collectd is picky about the format it accepts for incoming data. The topic name should be of the format <host>/<plugin>-<plugin_instance>/<type>-<type_instance> and the data is <unixtime>:<value>. I modified my MQTT temperature reporter to publish to collectd/mqtt-host/mqtt/temperature-study, changed the publish line to include the timestamp:

publish.single(pub_topic, str(time.time()) + ':' + str(temp),
            hostname=Broker, port=8883,
            auth=auth, tls={})

and added a new collectd user to the Mosquitto configuration:

mosquitto_passwd -b /etc/mosquitto/mosquitto.users collectd collectdpass

And granted it read-only access to the collectd/ prefix via /etc/mosquitto/mosquitto.acl:

user collectd
topic read collectd/#

(I also created an mqtt-temp user with write access to that prefix for the Python script to connect to.)
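
For reference, the topic and payload format collectd expects can be wrapped in a couple of helpers. This is just a sketch: the function names are made up, and the leading collectd/ prefix is the one I chose for my setup rather than something the format requires.

```python
import time


def collectd_topic(host, plugin, type_, plugin_instance=None,
                   type_instance=None, prefix="collectd"):
    """Build <prefix>/<host>/<plugin>[-<instance>]/<type>[-<instance>]."""
    plugin_part = plugin if plugin_instance is None else f"{plugin}-{plugin_instance}"
    type_part = type_ if type_instance is None else f"{type_}-{type_instance}"
    return f"{prefix}/{host}/{plugin_part}/{type_part}"


def collectd_payload(value, timestamp=None):
    """Build the <unixtime>:<value> payload, defaulting to the current time."""
    timestamp = time.time() if timestamp is None else timestamp
    return f"{timestamp}:{value}"


print(collectd_topic("mqtt-host", "mqtt", "temperature", type_instance="study"))
print(collectd_payload(21.5))
```

The first line printed is the topic used above, and the payload is the same string as the one built inline in the publish.single() call.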

Then, on the collectd host, I created /etc/collectd/collectd.conf.d/mqtt.conf containing:

LoadPlugin mqtt

<Plugin "mqtt">
        <Subscribe "ha">
                Host "mqtt-host"
                Port "8883"
                User "collectd"
                Password "collectdpass"
                CACert "/etc/ssl/certs/ca-certificates.crt"
                Topic "collectd/#"
        </Subscribe>
</Plugin>

I had some initial problems when I tried setting CACert to the Let’s Encrypt certificate; it actually wants to point to the “DST Root CA X3” certificate that signs that. Or using the full set of installed root certificates as I’ve done works too. Of course the errors you get back are just of the form:

collectd[8853]: mqtt plugin: mosquitto_loop failed: A TLS error occurred.

which is far from helpful. Once that was sorted collectd started happily receiving data via MQTT and producing graphs for me:

Study temperature

This is a pretty long-winded way of ending up with some temperature graphs - I could have just graphed the temperature sensor using collectd on the Pi to send it to the monitoring host, but it has allowed a simple MQTT broker, publisher + subscriber setup with TLS and authentication to be constructed and confirmed as working.

22 May, 2018 09:28PM

hackergotchi for Eddy Petrișor

Eddy Petrișor

rust for cortex-m7 baremetal

This is a reminder for myself: if you want to install Rust for a baremetal Cortex-M7 target, note that this seems to be a tier 3 platform:

Highlighting the relevant part:

Target std rustc cargo notes
msp430-none-elf * 16-bit MSP430 microcontrollers
sparc64-unknown-netbsd NetBSD/sparc64
thumbv6m-none-eabi * Bare Cortex-M0, M0+, M1
thumbv7em-none-eabi * Bare Cortex-M4, M7
thumbv7em-none-eabihf * Bare Cortex-M4F, M7F, FPU, hardfloat
thumbv7m-none-eabi * Bare Cortex-M3
x86_64-unknown-openbsd 64-bit OpenBSD

In order to enable the relevant support, use the nightly build and add the relevant target:
eddy@feodora:~/usr/src/rust-uc$ rustup show
Default host: x86_64-unknown-linux-gnu

installed toolchains

nightly-x86_64-unknown-linux-gnu (default)

active toolchain

nightly-x86_64-unknown-linux-gnu (default)
rustc 1.28.0-nightly (cb20f68d0 2018-05-21)

If not using nightly, switch to that:

eddy@feodora:~/usr/src/rust-uc$ rustup default nightly-x86_64-unknown-linux-gnu
info: using existing install for 'nightly-x86_64-unknown-linux-gnu'
info: default toolchain set to 'nightly-x86_64-unknown-linux-gnu'

  nightly-x86_64-unknown-linux-gnu unchanged - rustc 1.28.0-nightly (cb20f68d0 2018-05-21)

Add the needed target:

eddy@feodora:~/usr/src/rust-uc$ rustup target add thumbv7em-none-eabi
info: downloading component 'rust-std' for 'thumbv7em-none-eabi'
  5.4 MiB /   5.4 MiB (100 %)   5.1 MiB/s ETA:   0 s               
info: installing component 'rust-std' for 'thumbv7em-none-eabi'
eddy@feodora:~/usr/src/rust-uc$ rustup show
Default host: x86_64-unknown-linux-gnu

installed toolchains

nightly-x86_64-unknown-linux-gnu (default)

installed targets for active toolchain


active toolchain

nightly-x86_64-unknown-linux-gnu (default)
rustc 1.28.0-nightly (cb20f68d0 2018-05-21)

Then compile with --target.
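
To make the table lookup concrete, here is a small sketch that maps a core name to the matching target triple and builds a build command for it. The triples and notes are copied from the table above; the helper names are hypothetical, and using cargo (rather than calling rustc directly) is an assumption.

```python
# Target triples and notes copied from the platform table above.
BARE_METAL_TARGETS = {
    "thumbv6m-none-eabi": "Bare Cortex-M0, M0+, M1",
    "thumbv7em-none-eabi": "Bare Cortex-M4, M7",
    "thumbv7em-none-eabihf": "Bare Cortex-M4F, M7F, FPU, hardfloat",
    "thumbv7m-none-eabi": "Bare Cortex-M3",
}


def targets_for(core):
    """Return the target triples whose note lists the given core (e.g. 'M7')."""
    found = []
    for triple, note in BARE_METAL_TARGETS.items():
        cores = [c.strip() for c in note.replace("Bare Cortex-", "").split(",")]
        if core in cores:
            found.append(triple)
    return found


def cargo_build_command(triple):
    """Argument list for building with an explicit --target."""
    return ["cargo", "build", "--target", triple]


if __name__ == "__main__":
    for triple in targets_for("M7"):
        print(" ".join(cargo_build_command(triple)))
```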

22 May, 2018 08:35PM by eddyp

Reproducible builds folks

Reproducible Builds: Weekly report #160

Here’s what happened in the Reproducible Builds effort between Sunday May 13 and Saturday May 19 2018:

Packages reviewed and fixed, and bugs filed

In addition, build failure bugs were reported by Adrian Bunk (2) and Gilles Filippini (1).

diffoscope development

diffoscope is our in-depth “diff-on-steroids” utility which helps us diagnose reproducibility issues in packages.

reprotest development

reprotest is our tool to build software and check it for reproducibility.

  • kpcyrd:
  • Chris Lamb:
    • Update references to Alioth now that the repository has migrated to Salsa. (1, 2, 3)

development

A number of changes were made to our Jenkins-based testing framework, including:


This week’s edition was written by Bernhard M. Wiedemann, Chris Lamb, Levente Polyak and Mattia Rizzolo & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

22 May, 2018 06:30AM

May 21, 2018

Dima Kogan

More Vnlog demos

More demos of vnlog and feedgnuplot usage! This is pretty pointless, but should be a decent demo of the tools at least. This is a demo, not documentation; so for usage details consult the normal docs.

Each Wednesday night I join a group bike ride. This is an organized affair, and each week an email precedes the ride, very roughly describing the route. The two organizers alternate leading the ride each week, and consequently the emails alternate also. I was getting the feeling that some of the announcements show up in my mailbox more punctually than others, and after a recent 20-minutes-before-the-ride email, I decided this just had to be quantified.

The emails all go to a google-group email. The google-groups people are a wheel-reinventing bunch, so talking to the archive can't be done with normal tools (NNTP? mbox files? No?). A brief search revealed somebody's home-grown tool to programmatically grab the archive:

The docs look funny, but are actually correct: you really do run the script to download stuff and generate another script; and then run that script to download the rest of the stuff.

Anyway, I used that tool to grab all the emails that are available. Then I wrote a quick/dirty script to parse out the data I care about and dump everything into a vnlog:

use strict;
use warnings;

use feature ':5.10';

# Day-of-week offsets, counted from Monday
my %daysofweek = ('Mon' => 0,
                  'Tue' => 1,
                  'Wed' => 2,
                  'Thu' => 3,
                  'Fri' => 4,
                  'Sat' => 5,
                  'Sun' => 6);
my %months = ('Jan' => 1,
              'Feb' => 2,
              'Mar' => 3,
              'Apr' => 4,
              'May' => 5,
              'Jun' => 6,
              'Jul' => 7,
              'Aug' => 8,
              'Sep' => 9,
              'Oct' => 10,
              'Nov' => 11,
              'Dec' => 12);

# The vnlog legend
say '# path ridenum who whenwedh date wordcount subject';

for my $path (<mbox/m.*>)
{
    my ($ridenum,$who,$date,$whenwedh,$subject);

    my $wordcount = 0;
    my $inbody    = undef;

    open FD, '<', $path or die "Couldn't open '$path': $!";
    while(<FD>)
    {
        # Sender: reduce to one of the two ride leaders, or 'other'
        if( !$inbody && /^From: *(.*?)\s*$/ )
        {
            $who = $1;
            if(   $who =~ /sean/i)   { $who = 'sean'; }
            elsif($who =~ /nathan/i) { $who = 'nathan'; }
            else                     { $who = 'other'; }
        }
        if( !$inbody &&
            /^Subject: \s*
             (.*?) \s* $/x )
        {
            $subject = $1;
            # The ride number follows a leading '#' (possibly MIME-encoded as =23).
            # The (\d+) capture is an assumption: the line is cut off in the source.
            ($ridenum) = $subject =~ /^(?: \# | (?:=\?ISO-8859-1\?Q\?=23) )
                                       (\d+)/x;
            $subject =~ s/[\s#]//g;
        }
        if( !$inbody && /^Date: *(.*?)\s*$/ )
        {
            $date = $1;

            my ($zone) = $date =~ / (\(.+\) | -0700 | -0800) /x;
            if( !defined $zone)
            {
                die "No timezone in: '$date'";
            }
            if( $zone !~ /PST|PDT|-0700|-0800/)
            {
                die "Unexpected timezone: '$zone'";
            }

            my ($Dayofweek,$D,$M,$Y,$h,$m,$s) = $date =~ /^(...),? +(\d+) +([a-zA-Z]+) +(20\d\d) +(\d\d):(\d\d):(\d\d)/;
            if( !(defined $Dayofweek && defined $h && defined $m && defined $s) )
            {
                die "Unparseable date '$date'";
            }
            my $dayofweek = $daysofweek{$Dayofweek} // die "Unparseable day-of-week '$Dayofweek'";

            # Hours since the start of the week, then relative to 00:00 on Wed
            my $t     = $dayofweek*24 + $h + ($m + $s/60)/60;
            my $twed0 = 2*24; # start of wed
            $M = $months{$M} // die "Unknown month '$M'. Line: '$_'";
            $date = sprintf('%04d%02d%02d', $Y,$M,$D);

            $whenwedh = $t - $twed0;
        }

        # An empty line ends the headers and starts the body
        if( !$inbody && /^[\r\n]*$/ )
        {
            $inbody = 1;
        }
        if( $inbody )
        {
            if( /------=_Part/ || /Content-Type:/)
            {
                # Stop after the first MIME part that contained any words
                last if $wordcount > 0;
                $inbody = undef;
            }
            my @words = /(\w+)/g;
            $wordcount += @words;
        }
    }
    close FD;

    # Missing fields are written as '-', the vnlog null value
    $who      //= '-';
    $subject  //= '-';
    $ridenum  //= '-';
    $date     //= '-';
    $whenwedh //= '-';

    say "$path $ridenum $who $whenwedh $date $wordcount $subject";
}
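
The whenwedh column is the only computed value in the script: the number of hours between the email's timestamp and 00:00 on Wednesday of the same week. A minimal sketch of that arithmetic follows. The function name is hypothetical, and the example timestamps are assumptions chosen to be consistent with rows of the log, not values read from the emails.

```python
# Day-of-week offsets counted from Monday, as in the Perl script above.
DAYS_OF_WEEK = {"Mon": 0, "Tue": 1, "Wed": 2, "Thu": 3, "Fri": 4, "Sat": 5, "Sun": 6}


def hours_relative_to_wed(day_name, hour, minute, second):
    """Hours since 00:00 on Wednesday; negative values fall before Wednesday."""
    t = DAYS_OF_WEEK[day_name] * 24 + hour + (minute + second / 60) / 60
    return t - 2 * 24


if __name__ == "__main__":
    # An email sent early on Wednesday morning gives a small positive value...
    print(hours_relative_to_wed("Wed", 1, 24, 26))
    # ...and one sent early on Tuesday gives a value near -23.
    print(hours_relative_to_wed("Tue", 0, 51, 35))
```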

The script isn't important, and the resulting data is here. Now that I have a log on disk, I can do stuff with it. The first few lines of the log look like this:

dima@scrawny:~/projects/passagemining/google-group-crawler/the-passage-announcements$ < rides.vnl head

# path ridenum who whenwedh date wordcount subject
mbox/m.-EF1u5bbw5A.SywitKQ3y1sJ 265 sean 1.40722222222222 20140903 190 265-Coasting
mbox/m.-JdiiTIvyYs.Jgy_rCiwAGAJ 151 sean 18.6441666666667 20120606 199 151-FinalsWeek
mbox/m.-l6z9-1WC78.SgP3ytLsDAAJ 312 nathan 19.5394444444444 20150812 189 312-SpaceFilling
mbox/m.-vfVuoUxJ0w.FwpRRWC7EgAJ 367 nathan 18.1766666666667 20160831 164 367-Dislocation
mbox/m.-YHTEvmbIyU.HHWjbs_xpesJ 110 sean 10.9108333333333 20110810 407 110-SouslesParcs,laPoubelle
mbox/m.0__GMaUD_O8.Pjupq0AwBAAJ 404 sean 13.5255555555556 20170524 560 404-Bumped
mbox/m.0CT9ybx3uIU.sdZGwo8rSQUJ 53 sean -23.1402777777778 20100629 223 53WeInventedtheRemix
mbox/m.0FtQxCkxVHA.AjhGJ7mgAwAJ 413 nathan 20.4155555555556 20170726 178 413-GradientAssent
mbox/m.0haCNC_N2fY.bJ-93LQSFQAJ 337 nathan 57.3708333333333 20160205 479 337-TheCronutRide
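
For readers who haven't seen the format before: a vnlog is whitespace-separated text whose first "#" line names the fields. A minimal reader can be sketched as below. This is not the vnlog library's own parser; the function name is hypothetical, and skipping "##" and "#!" lines as comments is my understanding of the format rather than something stated here.

```python
def parse_vnlog(text):
    """Parse vnlog text into a list of dicts keyed by the legend's field names."""
    fields = None
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("##") or line.startswith("#!"):
            continue  # treated as comments (assumption about the format)
        if line.startswith("#"):
            if fields is None:
                fields = line[1:].split()  # the first '#' line is the legend
            continue
        if fields is None:
            raise ValueError("data line seen before the legend")
        records.append(dict(zip(fields, line.split())))
    return records


if __name__ == "__main__":
    sample = (
        "# path ridenum who whenwedh date wordcount subject\n"
        "mbox/m.-EF1u5bbw5A.SywitKQ3y1sJ 265 sean 1.40722222222222 20140903 190 265-Coasting\n"
    )
    for row in parse_vnlog(sample):
        print(row["ridenum"], row["who"], row["wordcount"])
```

All values come back as strings, with "-" standing in for a missing field, so numeric columns need converting before use.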

I can align the columns to make it more human-readable:

dima@scrawny:~/projects/passagemining/google-group-crawler/the-passage-announcements$ < rides.vnl head | vnl-align

#             path              ridenum   who       whenwedh        date   wordcount           subject          
mbox/m.-EF1u5bbw5A.SywitKQ3y1sJ 265     sean     1.40722222222222 20140903 190       265-Coasting               
mbox/m.-JdiiTIvyYs.Jgy_rCiwAGAJ 151     sean    18.6441666666667  20120606 199       151-FinalsWeek             
mbox/m.-l6z9-1WC78.SgP3ytLsDAAJ 312     nathan  19.5394444444444  20150812 189       312-SpaceFilling           
mbox/m.-vfVuoUxJ0w.FwpRRWC7EgAJ 367     nathan  18.1766666666667  20160831 164       367-Dislocation            
mbox/m.-YHTEvmbIyU.HHWjbs_xpesJ 110     sean    10.9108333333333  20110810 407       110-SouslesParcs,laPoubelle
mbox/m.0__GMaUD_O8.Pjupq0AwBAAJ 404     sean    13.5255555555556  20170524 560       404-Bumped                 
mbox/m.0CT9ybx3uIU.sdZGwo8rSQUJ  53     sean   -23.1402777777778  20100629 223       53WeInventedtheRemix       
mbox/m.0FtQxCkxVHA.AjhGJ7mgAwAJ 413     nathan  20.4155555555556  20170726 178       413-GradientAssent         
mbox/m.0haCNC_N2fY.bJ-93LQSFQAJ 337     nathan  57.3708333333333  20160205 479       337-TheCronutRide          

If memory serves, we're at around ride 450 right now. Is that right?

$ < rides.vnl vnl-sort -nr -k ridenum | head -n2 | vnl-filter -p ridenum

# ridenum

Cool. This command was longer than it needed to be in order to produce nicer output. If I was exploring the dataset, I'd save keystrokes and do this instead:

$ < rides.vnl vnl-sort -nrk ridenum | head

# path ridenum who whenwedh date wordcount subject
mbox/m.7TnUbcShAz8.67KgwBGhAAAJ 452 nathan 20.7694444444444 20180502 175 452-CastingtoType
mbox/m.ej7Oz6sDzgc.bEnN04VEAQAJ 451 sean 0.780833333333334 20180425 258 451-Recovery
mbox/m.LWfydBtpd_s.35SgEJEqAgAJ 450 nathan 67.9608333333333 20180420 659 450-AnotherGreenWorld
mbox/m.3mv-Cm0EzkM.oAm3MkNYCAAJ 449 sean 17.5875 20180411 290 449-DoYouHaveRockNRoll?
mbox/m.AEV4ukSjO5U.IPlUabfEBgAJ 448 nathan 20.6138888888889 20180404 175 448-TheThirdString
mbox/m.bYTM6kgxtJs.5iHcVQKPBAAJ 447 sean 15.8355555555556 20180328 196 447-PassParticiple
mbox/m.tHMqRWp9o_Y.FQ8hFvnqCQAJ 446 nathan 20.5213888888889 20180321 139 446-Chiaroscuro
mbox/m.jr0SBsDBzgk.UHrbCv4VBQAJ 445 sean 15.3280555555556 20180314 111 445-85%
mbox/m.K2Yg_FRXuAo.SyViTwXXAQAJ 444 nathan 19.6180555555556 20180307 171 444-BackintheLoop

OK, how far back does the archive go? I do the same thing as before, but sort in the opposite order to find the earliest rides:

$ < rides.vnl vnl-sort -n -k ridenum | head -n2 | vnl-filter -p ridenum

# ridenum

Nothing. That's odd. Let me look at whole records, and at more than just the first two lines

$ < rides.vnl vnl-sort -n -k ridenum | head | vnl-align

#             path              ridenum   who       whenwedh       date   wordcount                       subject                      
mbox/m.2gywN9pxMI4.40UBrDjnAwAJ -       nathan  17.6572222222222 20171206  95       Noridetonight;daytimeridethisSaturday!             
mbox/m.49fZsvZac_U.a0CazPinCAAJ -       sean   -34.495           20170320 463       Extraridethisweekend+Passage400save-the-date       
mbox/m.5gJd21W24vo.ICDEHrnQJvcJ -       nathan  12.1063888888889 20130619 172       NoPassageRideTonight;GalleryOpeningTomorrowNight   
mbox/m.7qEbhBWSN1U.Cx6cxYTECgAJ -       nathan  17.7891666666667 20180418 134       Noridetonight;Passage450onSaturday!                
mbox/m.DVssP4Th__4.jXzzu9clZLQJ -       sean    20.9138888888889 20101222 209       TheWrathofTlaloc                                   
mbox/m.E6etBSqEQIc.C35-SkBllHoJ -       sean    50.7575          20131220 292       Noridenextweek;seeyounextyear                      
mbox/m.GyJ16HiK8Ds.z6yNC4W5SeUJ -       sean   -11.5666666666667 20120529 228       NoRideThisWeek!...AIDS/Lifecycle...ThirdAnniversary
mbox/m.H3QGBvjeTfM.CS-xRn1WDQAJ -       sean    17.0180555555555 20171227 257       Noridetonight;nextride1/6                          
mbox/m.K2P6D_BGfYU.ve6a_8l6AAAJ -       sean    37.8166666666667 20170223 150       RemainingPassageRouteMapShirtsAvailableforPurchase

Aha. A bunch of emails aren't announcing a ride, but are announcing that there's no ride that week. Let's ignore those:

$ < rides.vnl vnl-filter -p +ridenum | vnl-sort -n -k ridenum | head -n2

# ridenum

Bam. So we have emails going back to ride 52. Good enough. All right. I'm aiming to create a time histogram for Sean's emails and another for Nathan's emails. What about emails that came from neither one? In theory there shouldn't be any of those, but there could be a parsing error, or who knows what.

$ < rides.vnl vnl-filter 'who == "other"'

# path ridenum who whenwedh date wordcount subject
mbox/m.A-I0_i9-YOs.QRX1P99_uiUJ 65 other 65.1413888888889 20100917 330 65-LosAngelesRidesItself+specialscreening
mbox/m.pHpzsjH7H68.O7CP_v6bcEoJ 67 other 16.5663888888889 20101006 50 67Sortition,NotSaturation

OK. Exactly 2 emails out of hundreds. That's not bad, and I'll just ignore those. Out of curiosity, what happened? Is this a parsing error?

$ grep From: $(< rides.vnl vnl-filter 'who == "other"' --eval '{print path}')

mbox/m.A-I0_i9-YOs.QRX1P99_uiUJ:From: The Passage Announcements <>
mbox/m.pHpzsjH7H68.O7CP_v6bcEoJ:From: The Passage Announcements <>

So on rides 65 and 67 "The Passage Announcements" emailed themselves. Oops. Since the ride leaders alternate, I can infer who actually sent these by looking at the few rides around this one:

$ < rides.vnl vnl-filter 'ridenum > 60 && ridenum < 70' -p ridenum,who | vnl-sort -n -k ridenum

# ridenum who
61 sean
62 nathan
63 sean
64 nathan
65 other
66 nathan
67 other
68 nathan
69 sean

That's pretty conclusive: clearly these emails came from Sean. I'm still going to ignore them, though.
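
The same inference can be written down mechanically: if the rides on both sides of an "other" email were led by the same person, the alternation implies the other leader sent it. A small sketch, with a hypothetical function name and the rows copied from the output above:

```python
def infer_unknown_leaders(rides, leaders=("sean", "nathan")):
    """rides: list of (ridenum, who) sorted by ridenum. Returns a corrected copy."""
    fixed = []
    for i, (num, who) in enumerate(rides):
        if who == "other" and 0 < i < len(rides) - 1:
            prev_who, next_who = rides[i - 1][1], rides[i + 1][1]
            # Both neighbours led by the same person: the other leader sent this one
            if prev_who == next_who and prev_who in leaders:
                who = leaders[1] if prev_who == leaders[0] else leaders[0]
        fixed.append((num, who))
    return fixed


if __name__ == "__main__":
    rides = [(61, "sean"), (62, "nathan"), (63, "sean"), (64, "nathan"),
             (65, "other"), (66, "nathan"), (67, "other"), (68, "nathan"),
             (69, "sean")]
    for num, who in infer_unknown_leaders(rides):
        print(num, who)
```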

The ride is on Wed evening, and the emails generally come in the day or two before then. Does my data set contain any data outside this reasonable range? Hopefully very little, just like the "other" author emails.

$ < rides.vnl vnl-filter --has ridenum -p whenwedh | feedgnuplot --histo 0 --binwidth 1 --xlabel 'Hour (on Wed)' --ylabel 'Email frequency'


The ride starts at 21:00 on Wed, and we see a nice spike immediately before. The smaller cluster prior to that is the emails that go out the night before. There's a tiny number of stragglers going out the previous day (that I'm simply going to ignore). And there're a number of emails going out after Wed. These likely announce an occasional weekend ride that I will also ignore. But let's do check. How many are there?

$ < rides.vnl vnl-filter --has ridenum 'whenwedh > 22' | wc -l


Looking at these manually, most are indeed weekend rides, with a small number of actual extra-early announcements for Wed. I can parse the email text more fancily to pull those out, but that's really not worth my time.

OK. I'm now ready for the main thing.

$ < rides.vnl vnl-filter --has ridenum 'who != "other"' -p who,whenwedh |
    feedgnuplot --dataid --autolegend \
                --histo sean,nathan --binwidth 0.5 \
                --style sean   'with boxes fill transparent solid 0.3 border lt -1' \
                --style nathan 'with boxes fill transparent pattern 1 border lt -1' \
                --xmin -12 --xmax 24 \
                --xlabel "Time (hour)" --ylabel 'Email frequency' \
                --set 'xtics ("12\n(Tue)" -12,"16\n(Tue)" -8,"20\n(Tue)" -4,"0\n(Wed)" 0,"4\n(Wed)" 4,"8\n(Wed)" 8,"12\n(Wed)" 12,"16\n(Wed)" 16,"21\n(Wed)" 21,"0\n(Thu)" 24)' \
                --set 'arrow from 21, graph 0 to 21, graph 1 nohead lw 3 lc "red"' \
                --title "Passage email timing distribution"


This looks verbose, but most of the plotting command is there to make things look nice. When analyzing stuff, I'd omit most of that. Anyway, I can now see what I suspected: Nathan is a procrastinator! His emails almost always come in on Wed, usually an hour or two before the deadline. Sean's emails are bimodal: one set comes in on Wed afternoon, and another in the extreme early morning on Wed. Presumably he sleeps in-between.

We have more data, so we can make more pointless plots. For instance, what does the verbosity of the emails look like? Is one sender more verbose than another?

$ < rides.vnl vnl-sort -n -k ridenum |
  vnl-filter 'who != "other"' -p +ridenum,who,wordcount |
  feedgnuplot --lines --domain --dataid --autolegend \
              --xlabel 'Ride number' --ylabel 'Words per email'


$ < rides.vnl vnl-filter 'who != "other"' --has ridenum -p who,wordcount |
  feedgnuplot --dataid --autolegend \
              --histo sean,nathan --binwidth 20 \
              --style sean   'with boxes fill transparent solid 0.3 border lt -1' \
              --style nathan 'with boxes fill transparent pattern 1 border lt -1' \
              --xlabel "Words per email" --ylabel 'frequency' \
              --title "Passage verbosity distribution"


The time series doesn't obviously say anything, but from the histogram, it looks like Sean is a bit more verbose, maybe? What's the average?

$ < rides.vnl vnl-filter --eval 'ridenum != "-" { if(who == "sean")   { Ns++; Ws+=wordcount; }
                                                  if(who == "nathan") { Nn++; Wn+=wordcount; } }
                                 END { print "Mean verbosity sean,nathan: "Ws/Ns, Wn/Nn }'

Mean verbosity sean,nathan: 304.955 250.425

Indeed. Is the verbosity time-dependent? Is anybody getting more or less verbose over the years? The time-series plot above is pretty noisy, so it's not clear. Let's filter it to reduce the noise. We're getting into an area that's too complicated for these tools, and moving to something more substantial at this point would be warranted. But I'll do one more thing with these tools, and then stop. I can implement a half-assed filter by time-shifting the verbosity series, re-joining the shifted series, and computing the mean. I do this separately for the two email authors, and then re-combine the series. I could join these two, but simply catting the two data sets together is sufficient here.

$ < rides.vnl vnl-sort -n -k ridenum |
    vnl-filter 'who == "nathan"' --has ridenum |
    vnl-filter -p ridenum,idx=NR,wordcount > nathanrp0

$ < rides.vnl vnl-sort -n -k ridenum |
    vnl-filter 'who == "nathan"' --has ridenum |
    vnl-filter -p ridenum,idx=NR-1,wordcount > nathanrp-1

$ < rides.vnl vnl-sort -n -k ridenum |
    vnl-filter 'who == "nathan"' --has ridenum |
    vnl-filter -p ridenum,idx=NR+1,wordcount > nathanrp+1

$ ... same for Sean ...

$ cat <(vnl-join --vnl-suffix2 after --vnl-sort n -j idx \
                 <(vnl-join --vnl-suffix2 before --vnl-sort n -j idx \
                            nathanrp0 nathanrp-1) \
                 nathanrp+1 |
        vnl-filter -p ridenum,who='"nathan"','wordcountfiltered=(wordcount+wordcountbefore+wordcountafter)/3') \
      <(vnl-join --vnl-suffix2 after --vnl-sort n -j idx \
                 <(vnl-join --vnl-suffix2 before --vnl-sort n -j idx \
                            seanrp0 seanrp-1) \
                 seanrp+1 |
        vnl-filter -p ridenum,who='"sean"','wordcountfiltered=(wordcount+wordcountbefore+wordcountafter)/3') |
  feedgnuplot --lines --domain --dataid --autolegend \
              --xlabel 'Ride number' --ylabel 'Words per email'


Whew. Clearly this was doable, but that's a one-liner that has clearly gotten out of hand, and pushing it further would be unwise. Looking at the data there isn't any obvious time dependence. But what you can clearly see is the extra verbiage around the round-number rides 100, 200, 300, 350, 400, etc. These were often a special weekend ride, with the email containing lots of extra instructions and such.
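
For comparison, the three-point mean that the shifted joins compute takes only a few lines in a general-purpose language. The sketch below assumes a single sender's rides are already available as a list of (ridenum, wordcount) pairs sorted by ride number; the function name is hypothetical. As with the join, the first and last rides are dropped because they lack a neighbour.

```python
def three_point_mean(series):
    """series: list of (ridenum, wordcount) pairs sorted by ridenum.

    Returns (ridenum, mean) pairs where mean averages each word count with
    its immediate predecessor and successor; the two endpoints are dropped.
    """
    smoothed = []
    for i in range(1, len(series) - 1):
        ridenum = series[i][0]
        mean = (series[i - 1][1] + series[i][1] + series[i + 1][1]) / 3
        smoothed.append((ridenum, mean))
    return smoothed


if __name__ == "__main__":
    # Word counts taken from a few of Nathan's rows shown earlier
    nathan = [(444, 171), (446, 139), (448, 175), (450, 659), (452, 175)]
    for ridenum, mean in three_point_mean(nathan):
        print(ridenum, mean)
```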

This was all clearly a waste of time, but as a demo of vnlog workflows, this was ok.

21 May, 2018 09:43PM by Dima Kogan

hackergotchi for Daniel Pocock

Daniel Pocock

OSCAL'18 Debian, Ham, SDR and GSoC activities

Over the weekend I've been in Tirana, Albania for OSCAL 2018.

Crowdfunding report

The crowdfunding campaign to buy hardware for the radio demo was successful. The gross sum received was GBP 110.00; there were PayPal fees of GBP 6.48, and the net amount after currency conversion was EUR 118.29. Here is a complete list of transaction IDs for transparency so you can see that if you donated, your contribution was included in the total I have reported in this blog. Thank you to everybody who made this a success.

The funds were used to purchase an Ultracell UCG45-12 sealed lead-acid battery from Tashi in Tirana; here is the receipt. After OSCAL, the battery is being used at a joint meeting of the Prishtina hackerspace and SHRAK, the amateur radio club of Kosovo, on 24 May. The battery will remain in the region to support any members of the ham community who want to visit the hackerspaces and events.

Debian and Ham radio booth

Local volunteers from Albania and Kosovo helped run a Debian and ham radio/SDR booth on Saturday, 19 May.

The antenna was erected as a folded dipole with one end joined to the Tirana Pyramid and the other end attached to the marquee sheltering the booths. We operated on the twenty meter band using an RTL-SDR dongle and upconverter for reception and a Yaesu FT-857D for transmission. An MFJ-1708 RF Sense Switch was used for automatically switching between the SDR and transceiver on PTT and an MFJ-971 ATU for tuning the antenna.

I successfully made contact with 9A1D, a station in Croatia. Enkelena Haxhiu, one of our GSoC students, made contact with Z68AA in her own country, Kosovo.

Anybody hoping that Albania was a suitably remote place to hide from media coverage of the British royal wedding would have been disappointed as we tuned in to GR9RW from London and tried unsuccessfully to make contact with them. Communism and royalty mix like oil and water: if a deceased dictator was already feeling bruised about an antenna on his pyramid, he would probably enjoy water torture more than a radio transmission celebrating one of the world's most successful hereditary monarchies.

A versatile venue and the dictator's revenge

It isn't hard to imagine communist dictator Enver Hoxha turning in his grave at the thought of his pyramid being used for an antenna for communication that would have attracted severe punishment under his totalitarian regime. Perhaps Hoxha had imagined the possibility that people may gather freely in the streets: as the sun moved overhead, the glass facade above the entrance to the pyramid reflected the sun under the shelter of the marquees, giving everybody a tan, a low-key version of a solar death ray from a sci-fi movie. Must remember to wear sunscreen for my next showdown with a dictator.

The security guard stationed at the pyramid for the day was kept busy chasing away children and more than a few adults who kept arriving to climb the pyramid and slide down the side.

Meeting with Debian's Google Summer of Code students

Debian has three Google Summer of Code students in Kosovo this year. Two of them, Enkelena and Diellza, were able to attend OSCAL. Albania is one of the few countries they can visit easily and OSCAL deserves special commendation for the fact that it brings otherwise isolated citizens of Kosovo into contact with an increasingly large delegation of foreign visitors who come back year after year.

We had some brief discussions about how their projects are starting and things we can do together during my visit to Kosovo.

Workshops and talks

On Sunday, 20 May, I ran a workshop Introduction to Debian and a workshop on Free and open source accounting. At the end of the day Enkelena Haxhiu and I presented the final talk in the Pyramid, Death by a thousand chats, looking at how free software gives us a unique opportunity to disable a lot of unhealthy notifications by default.

21 May, 2018 08:44PM by Daniel.Pocock

hackergotchi for Daniel Silverstone

Daniel Silverstone

Runtime typing

I have been wrestling with a problem for a little while now and thought I might send this out into the ether for others to comment upon. (Or, in other words, Dear Lazyweb…)

I am writing a system which collects data from embedded computers in my car (ECUs) over the CAN bus, using the on-board diagnostics port in the vehicle. This requires me to generate packets on the CAN bus, listen to responses, including managing flow control, and then interpret the resulting byte arrays.

I have sorted everything but the last little bit of that particular data pipeline. I have a prototype which can convert the byte arrays into "raw" values by interpreting them either as bitfields and producing booleans, or as anything from an unsigned 8 bit integer to a signed 32 bit integer in either endianness. Fortunately none of the fields I'd need to interpret are floats.

This is, however, pretty clunky and nasty. Since I asked around and a majority of people would prefer that I keep the software configurable at runtime rather than doing meta-programming to describe these fields, I need to develop a way to have the data produced by reading these byte arrays (or by processing results already interpreted out of the arrays) type-checked.

As an example, one field might be the voltage of the main breaker in the car. It's represented as a 16 bit big-endian unsigned field, in tenths of a volt. So the field must be divided by ten and then given the type "volts". Another field is the current passing through that main breaker. This is a 16 bit big-endian signed value measured in tenths of an amp, so must be interpreted as such, divided by ten, and then given the type "amps". I intend for all values handled beyond the raw byte arrays themselves to simply be floats, so there'll be signedness available regardless.
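
To illustrate the raw interpretation step, here is a minimal sketch of decoding those two kinds of field. It is in Python purely for brevity, not the Rust of the actual project; the function name and the example bytes are made up for the illustration.

```python
import struct


def decode_tenths(raw, offset, signed):
    """Read a 16 bit big-endian field at offset and scale it from tenths."""
    fmt = ">h" if signed else ">H"  # '>' = big-endian, h/H = signed/unsigned 16 bit
    (value,) = struct.unpack_from(fmt, raw, offset)
    return value / 10


if __name__ == "__main__":
    raw = bytes([0xFF, 0x9C, 0x0F, 0xA0])  # hypothetical byte array
    print(decode_tenths(raw, 2, signed=False), "volts")
    print(decode_tenths(raw, 0, signed=True), "amps")
```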

What I'd like is to later have a "computed" value, let's call it "power flow", which is the voltage multiplied by the current. Naturally this would need to be given the type 'watts'. What I'd dearly love is to build into my program the understanding that volts times amps equals watts, and then have the reader of the runtime configuration type-check the function for "power flow".

I'm working on this in Rust, though for now the language is less important than the algorithms involved in doing this (unless you know of a Rust library which will help me along). I'd dearly love it if someone out there could help me to understand the right way to handle such expression type checking without having to build up a massively complex type system.

Currently I am considering things (expressed for now in yaml) along the lines of:

- name: main_voltage
  type: volts
  expr: u16_be(raw_bmc, 14) / 10
- name: main_current
  type: amps
  expr: i16_be(raw_bmc, 12) / 10
- name: power_flow
  type: watts
  expr: main_voltage * main_current

What I'd like is for each expression to be type-checked. I'm happy for untyped scalars to end up auto-labelled (so the u16_be() function would return an untyped number which then ends up marked as volts since 10 is also untyped). However when power_flow is typechecked, it should be able to work out that the type of the expression is volts * amps which should then typecheck against watts and be accepted. Since there's also consideration needed for times, distances, booleans, etc. this is not a completely trivial thing to manage. I will know the set of valid types up-front though, so there's that at least.
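
One possible shape for such a checker, sketched in Python rather than Rust to keep it short: infer a unit for each expression bottom-up, treat literals and raw reader calls as untyped, and look products up in a small table. This is only a sketch under assumptions. The rule table, the function names, and the choice to reuse Python's own expression parser are all made up for the illustration, and it ignores division between typed values, times, distances and booleans.

```python
import ast

# Assumed product rules; a real table would list every valid combination.
PRODUCTS = {frozenset(["volts", "amps"]): "watts"}


def infer_type(expr, env):
    """Return the unit of expr, or None if it is an untyped scalar."""
    return _infer(ast.parse(expr, mode="eval").body, env)


def _infer(node, env):
    if isinstance(node, (ast.Constant, ast.Call)):
        return None  # literals and raw readers such as u16_be() are untyped
    if isinstance(node, ast.Name):
        return env[node.id]  # a previously checked field
    if isinstance(node, ast.BinOp):
        left, right = _infer(node.left, env), _infer(node.right, env)
        if left is None and right is None:
            return None
        if isinstance(node.op, ast.Div):
            if right is None:
                return left  # dividing by a plain number keeps the unit
            raise TypeError("no rule for dividing by " + right)
        if left is None or right is None:
            return left if right is None else right  # scalar adopts the unit
        if isinstance(node.op, ast.Mult):
            product = PRODUCTS.get(frozenset([left, right]))
            if product is not None:
                return product
        if isinstance(node.op, (ast.Add, ast.Sub)) and left == right:
            return left
        raise TypeError("no rule for combining %s and %s" % (left, right))
    raise TypeError("unsupported expression")


def check_fields(fields):
    """Check each field's expression against its declared type, in order."""
    env = {}
    for field in fields:
        inferred = infer_type(field["expr"], env)
        if inferred is not None and inferred != field["type"]:
            raise TypeError("%s: expected %s, got %s"
                            % (field["name"], field["type"], inferred))
        env[field["name"]] = field["type"]  # untyped results are auto-labelled
    return env


if __name__ == "__main__":
    fields = [
        {"name": "main_voltage", "type": "volts", "expr": "u16_be(raw_bmc, 14) / 10"},
        {"name": "main_current", "type": "amps", "expr": "i16_be(raw_bmc, 12) / 10"},
        {"name": "power_flow", "type": "watts", "expr": "main_voltage * main_current"},
    ]
    print(check_fields(fields))
```

Since the set of valid types is known up-front, a flat rule table like this may be enough; representing each unit as a vector of base-dimension exponents would be the usual next step if the table grows unwieldy.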

If you have any ideas, ping me on IRC or perhaps blog a response and then drop me an email to let me know about it.

Thanks in advance.

21 May, 2018 02:53PM by Daniel Silverstone

hackergotchi for Sune Vuorela

Sune Vuorela

Managing cooking recipes

I like to cook. And sometimes store my recipes. Over the years I have tried KRecipes, kept my recipes in BasKet notes, in KJots notes, in more or less random word processor documents.

I liked the free-form entry of recipes in various notes applications and word processor documents, but I lacked some way of indexing them. What I wanted was free-ish text for writing recipes, and something that could help me find them by the tags I give them. By title. By how I organize them. And maybe by ingredient, if I don’t know how to use up the soon-to-be-bad food in my refrigerator.

Given I’m a software developer, maybe I should try to scratch my own itch. And I did, in the last month and a half, during some evenings. This is also where my latest Qt and modern C++ blog posts come from.

The central bit is basically a markdown viewer, and the file format is some semi-structured markdown in one file per recipe, structured in the file system however you like it.

There is a recipes index which simply is a file system view with pretty titles on top.

There is a way to insert tags into recipes.

I can find them by title.

And I can find recipes by ingredients.

Given it is plain text, it can easily be synced using Git or NextCloud or whatever solution you want for that.

You can give it a spin if you want. It lives here. There is a blueprint for a Windows installer here:

There is a markdown file describing the specifics of the file format. It is not declared 100% stable yet, but I need good reasons to break stuff.

My recipe collection is in my native language Danish, so I’m not sure sharing it for demo purposes makes too much sense.

21 May, 2018 11:08AM by Sune Vuorela


First GSoC Report

To whom it may concern, this is my report on the first few weeks of GSoC under the umbrella of the Debian project. I’m writing this on my way back from the MiniDebConf in Hamburg, which was a nice experience; maybe there will be another post about that ;) So, the goal of my GSoC project is to design and implement a new SSO solution for Debian. But that only touches one part of the project’s deliverables.

21 May, 2018 08:28AM

hackergotchi for Steve Kemp

Steve Kemp

This month has been mostly golang-based

This month has mostly been about golang. I've continued work on the protocol-tester that I recently introduced:

This has turned into a fun project, and now all my monitoring is done with it. I've simplified the operation, such that everything uses Redis for storage, and there are now new protocol-testers for finger, nntp, and more.

Sample tests are as basic as this:

must run smtp
must run smtp with port 587
must run imaps
must run http with content 'Prayer Webmail service'

Results are stored in a redis-queue, where they can be picked off and announced to humans via a small daemon. In my case alerts are routed to a central host, via HTTP-POSTS, and eventually reach me via pushover.

Beyond the basic network testing though I've also reworked a bunch of code - so the markdown sharing site is now golang powered, rather than running on the previous perl-based code.

As a result of this rewrite, and a little more care, I now score 99/100 + 100/100 on Google's pagespeed testing service. A few more of my sites do the same now, thanks to inline-CSS, inline-JS, etc. Nothing I couldn't have done before, but this was a good moment to attack it.

Finally my "silly" Linux security module, for letting user-space decide if binaries should be executed, can-exec has been forward-ported to v4.16.17. No significant changes.

Over the coming weeks I'll be trying to move more stuff into the cloud, rather than self-hosting. I'm doing a lot of trial-and-error at the moment with Lambdas, containers, and dynamic-routing to that end.

Interesting times.

21 May, 2018 07:00AM

hackergotchi for Martin Pitt

Martin Pitt

De-Googling my phone, reloaded

Three weeks ago I blogged about getting rid of non-free Google services and moving to free software on my Android phone. I’ve got a lot of feedback via email, lwn, and Google+; many thanks to all of you for the helpful hints! As this is obviously important to many people, I want to tie up some loose ends and publish the results of these discussions.

Alternative apps and stores

  • Yalp is a free app that is able to search, install, and update installed apps from the Google Play Store. It doesn’t even need you to have a Google account, although you can use it to install already paid apps (however, you can’t buy apps within Yalp). I actually prefer that over uptodown now.

  • I moved from FreeOTP to AndOTP. The latter offers backing up your accounts with password or GPG encryption, which is certainly much more convenient than what I’ve previously been doing with noting down the accounts and TOTP secrets in an encrypted file on my laptop.

  • We often listen to internet radio at home. I replaced the non-free ad-ware TuneIn with Transistor, a simple and free app that even has convenient launcher links for a chosen station, so it’s exactly what we want. It does not have a builtin radio station list/search, but if you care about that, take a look at RadioDroid (but that doesn’t have the convenient quick starters).


In this area the situation is now much happier than my first post indicated. As promised I used for booking some tickets (both for Deutsche Bahn and also on Thalys), and indeed this does a fine job. Same price, European rebate cards like BahnCard 50 are supported, and being able to book with a lot of European train services with just one provider is really neat. However, I’m missing a lot of DB navigator’s great features: realtime information and alternatives, seat selection, car position indicator, regional tariffs, or things like “Länderticket”.

Fortunately it turns out that DB Navigator works just great with a trick: Disable the “Karte anzeigen” option in the menu, and it will immediately stop complaining about missing Play Services after each action. Also, logging in with your DB account never finishes, but after terminating and restarting the app you are logged in and everything works fine. That might be a “regular” bug or just a side effect without Play Services.

Wrt. rental bikes: is an awesome project and freely available API that shows available bikes on a map all over Europe. The OpenBikeSharing uses that on Android. That plus the ordinary Nextbike app works well enough.


A lot of people pointed out microG as a free implementation of Google Play Service APIs. Indeed I did try this even before my first blog post; but I didn’t mention it as I wanted to find out which apps actually need this API.

Also, this really appears to be something for the dauntless: On my rooted Nexus 4 with LineageOS I didn’t get it to work, even after installing the handful of hacks that you need for signature spoofing; and I daresay that on a standard vendorized installation without root/replaced bootloader it’s outright impossible.

Fortunately there are LineageOS builds with microG included, which gets you much further. But even with that, e.g. location still does not work out of the box; one needs to hunt down and install various providers. I’ve heard from several people that they use this successfully, but as this wasn’t the point of my exercise I just gave up after that.

A really useful piece of functionality of Play Services is tracking and remote-controlling (lock, warn tone, erase) lost or stolen phones. With having backup, encryption and proper locking, a stolen phone is not the end of the world, but it’s still relatively important for me (even though I never had to actually use it yet). The only alternative that I found is Cerberus which looks quite comprehensive. It’s not free though (neither as in beer nor in speech), so unless you particularly distrust Google and are not a big company, it might just be better to keep using Play Services for this functionality.

Calendar and Contacts

I’m really happy with DAVDroid and radicale after using them for over a month. But most people don’t have a personal server to run these. etesync looks like an interesting alternative which provides the hosting for you for five coffees a year, and also offers (free) self-hosting for those who can and want to.

21 May, 2018 12:00AM

May 20, 2018

Andrej Shadura

Porting inputplug to XCB

5 years ago I wrote inputplug, a tiny daemon which connects to your X server and monitors its input devices, running an external command each time a device is connected or disconnected.

I have used a custom keyboard layout and fairly non-standard settings for my pointing devices since 2012. It always annoyed me that those settings would be reset every time the device was disconnected and reconnected again, for example, when the laptop was brought back up from suspend mode. I usually solved that by putting commands to reconfigure my input settings into the resume hook scripts, but that obviously didn’t solve the case of connecting external keyboards and mice. At some point those hook scripts stopped working because they would run too early, when the keyboard and mice were not there yet, so I decided to write inputplug.

Inputplug was the first program I ever wrote which used X at a low level, and I had to use Xlib to access the low-level features I needed. More specifically, inputplug uses the XInput X extension and listens to XIHierarchyChanged events. In June 2014, Vincent Bernat contributed a patch to rely on XInput2 only.

During the MiniDebCamp, I had a typical case of yak shaving despite not having any yaks around: I wanted to migrate inputplug’s packaging from Alioth to Salsa, and I had an idea to update the package itself as well. I had an idea of adding optional systemd user session integration, and the easiest way to do that would be to have inputplug register a D-Bus service. However, if I just registered the service, introspecting it would cause annoying delays since it wouldn’t respond to any of the messages the clients would send to it. Handling messages would require me to integrate polling into the event loop, and it turned out it’s not easy to do while sticking to Xlib, so I decided to try and port inputplug to XCB.
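To illustrate what integrating polling into the event loop means, here is a minimal sketch in C. It is not inputplug's actual code: the names wait_for_input, drain_one, READY_X and READY_BUS are made up for this example, and plain pipes can stand in for the two connections. The point is only that a single poll() call waits on both file descriptors at once. With XCB, the descriptor for the X connection would come from xcb_get_file_descriptor().

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Bits reported by wait_for_input(); the names are hypothetical. */
enum { READY_NONE = 0, READY_X = 1, READY_BUS = 2 };

/* Wait up to timeout_ms for input on either descriptor and return a
 * bitmask saying which of them are readable. A negative timeout blocks
 * until something arrives. */
static int wait_for_input(int x_fd, int bus_fd, int timeout_ms)
{
    assert(x_fd >= 0 && bus_fd >= 0);

    struct pollfd fds[2] = {
        { .fd = x_fd,   .events = POLLIN, .revents = 0 },
        { .fd = bus_fd, .events = POLLIN, .revents = 0 },
    };
    int ready = READY_NONE;

    if (poll(fds, 2, timeout_ms) <= 0)
        return READY_NONE;          /* timeout or error */
    if (fds[0].revents & POLLIN)
        ready |= READY_X;
    if (fds[1].revents & POLLIN)
        ready |= READY_BUS;
    return ready;
}

/* Consume one byte from fd; returns 1 on success, 0 otherwise. A real
 * loop would call xcb_poll_for_event() or the D-Bus library's dispatch
 * function here instead of a raw read(). */
static int drain_one(int fd)
{
    char c;
    return read(fd, &c, 1) == 1;
}
```

Calling wait_for_input(x_fd, bus_fd, -1) in a loop and dispatching on the returned bits gives a single-threaded loop that serves both connections.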

For those unfamiliar with XCB, here’s a bit of background: XCB is a library which implements the X11 protocol and operates on a slightly lower level than Xlib. Unlike Xlib, it only works with structures which map directly to the wire protocol. The functions XCB provides are really atomic: in Xlib, it is not unusual for a function to perform multiple X transactions or to juggle the elements of the structures a bit. In XCB, most of the functions are relatively thin wrappers to enable packing and unpacking of the data. Let me give you an example.

In Xlib, if you wanted to check whether the X server supports a specific extension, you would write something like this:

XQueryExtension(display, "XInputExtension", &xi_opcode, &event, &error)

Internally, XQueryExtension would send a QueryExtension request to the X server, wait for a reply, parse the reply and return the major opcode, the first event code and the first error code.

With XCB, you need to separately send the request, receive the reply and fetch the data you need from the structure you get:

const char ext[] = "XInputExtension";

xcb_query_extension_cookie_t qe_cookie;
qe_cookie = xcb_query_extension(conn, strlen(ext), ext);

xcb_query_extension_reply_t *rep;
rep = xcb_query_extension_reply(conn, qe_cookie, NULL);

At this point, rep has its field present set to true if the extension is present. The rest of the data is in the structure as well, which you have to free yourself after use.

Things get a bit more tricky with requests returning arrays, like XIQueryDevice. Since the xcb_input_xi_query_device_reply_t structure is difficult to parse manually, XCB provides an iterator, xcb_input_xi_device_info_iterator_t which you can use to iterate over the structure: xcb_input_xi_device_info_next does the necessary parsing and moves the pointer so that each time it is run the iterator points to the next element.

Since replies in the X protocol can have variable-length elements, e.g. device names, XCB also provides wrappers to make accessing them easier, like xcb_input_xi_device_info_name.
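Why an iterator is needed can be shown with a toy example. The following sketch is not XCB code: it invents a tiny wire format (a 4-byte header followed by a name padded to a multiple of four bytes), and all toy_* names are hypothetical. It only illustrates the idea that the position of the next element depends on the length of the current one, so a "next" function has to do the parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy record header; the name bytes follow it directly on the wire. */
typedef struct {
    uint16_t id;
    uint16_t name_len;
} toy_header_t;

/* Iterator: start of the current record plus the number of records left. */
typedef struct {
    const unsigned char *pos;
    int rem;
} toy_iterator_t;

/* Round n up to a multiple of four. */
static size_t pad4(size_t n) { return (n + 3u) & ~(size_t)3u; }

/* Copy the header out of the buffer (avoids alignment assumptions). */
static toy_header_t toy_header(const toy_iterator_t *it)
{
    toy_header_t h;
    memcpy(&h, it->pos, sizeof h);
    return h;
}

/* Accessor for the variable-length part; the name is not NUL-terminated. */
static const char *toy_name(const toy_iterator_t *it)
{
    return (const char *)it->pos + sizeof(toy_header_t);
}

/* Advance past the current record, whatever its length. */
static void toy_next(toy_iterator_t *it)
{
    assert(it != NULL && it->rem > 0);
    toy_header_t h = toy_header(it);
    it->pos += sizeof h + pad4(h.name_len);
    it->rem--;
}

/* Append one record at offset off and return the offset after it. */
static size_t toy_append(unsigned char *buf, size_t off,
                         uint16_t id, const char *name)
{
    toy_header_t h = { id, (uint16_t)strlen(name) };
    memcpy(buf + off, &h, sizeof h);
    memcpy(buf + off + sizeof h, name, h.name_len);
    return off + sizeof h + pad4(h.name_len);
}

/* Walk count records and return the id of the one called name, or -1. */
static int toy_find_id(const unsigned char *buf, int count, const char *name)
{
    toy_iterator_t it = { buf, count };
    size_t want = strlen(name);

    for (; it.rem > 0; toy_next(&it)) {
        toy_header_t h = toy_header(&it);
        if (h.name_len == want && memcmp(toy_name(&it), name, want) == 0)
            return h.id;
    }
    return -1;
}
```

The loop in toy_find_id has the same shape as a loop over an XCB iterator: test the remaining count, use the current element, then call the next function.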

Most of the code of XCB is generated: there is an XML description of the X protocol which is used in the build process, and the C code to parse and generate the X protocol packets is generated each time the library is built. This means, unfortunately, that the documentation is quite useless, and there aren’t many examples online, especially if you’re going to use rarely used functions like XInput hierarchy change events.

I decided to do the porting the hard way, changing Xlib calls to XCB calls one by one, but there’s an easier way: since Xlib is now actually based on XCB, you can #include <X11/Xlib-xcb.h> and use XGetXCBConnection to get an XCB connection object corresponding to the Xlib’s Display object. Doing that means there will still be a single X connection, and you will be able to mix Xlib and XCB calls.

When porting, it often is useful to have a look at the sources of Xlib: it becomes obvious what XCB functions to use when you know what Xlib does internally (thanks to Mike Gabriel for pointing this out!).

Another thing to remember is that the constants and enums Xlib and XCB define usually have the same values (mandated by the X protocol) despite having slightly different names, so you can mix them too. For example, since inputplug passes the XInput event names to the command it runs, I decided to keep the names as Xlib defines them, and since I’m creating the corresponding strings by using a C preprocessor macro, it was easier for me to keep using XInput2.h instead of defining those strings by hand.
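The preprocessor trick mentioned above might look roughly like the following. This is a sketch rather than inputplug's actual source: the FLAG_ENTRY macro and the flag_name function are invented for the example, and the XI* values are copied in as local stand-ins so that it compiles without the X headers. A real program would include <X11/extensions/XInput2.h> instead of defining them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins for the hierarchy flags from the XInput2 headers
 * (an assumption made so the sketch is self-contained). */
#define XIMasterAdded    (1 << 0)
#define XIMasterRemoved  (1 << 1)
#define XISlaveAdded     (1 << 2)
#define XISlaveRemoved   (1 << 3)
#define XISlaveAttached  (1 << 4)
#define XISlaveDetached  (1 << 5)
#define XIDeviceEnabled  (1 << 6)
#define XIDeviceDisabled (1 << 7)

/* The bare argument expands to the numeric value, while the # operator
 * turns the unexpanded argument into a string literal, so every name is
 * typed only once. */
#define FLAG_ENTRY(flag) { flag, #flag }

static const struct {
    int value;
    const char *name;
} flag_names[] = {
    FLAG_ENTRY(XIMasterAdded),
    FLAG_ENTRY(XIMasterRemoved),
    FLAG_ENTRY(XISlaveAdded),
    FLAG_ENTRY(XISlaveRemoved),
    FLAG_ENTRY(XISlaveAttached),
    FLAG_ENTRY(XISlaveDetached),
    FLAG_ENTRY(XIDeviceEnabled),
    FLAG_ENTRY(XIDeviceDisabled),
};

static_assert(sizeof flag_names / sizeof flag_names[0] == 8,
              "one table entry per hierarchy flag");

/* Return the Xlib-style name of a single flag, or NULL if unknown. */
static const char *flag_name(int flag)
{
    for (size_t i = 0; i < sizeof flag_names / sizeof flag_names[0]; i++)
        if (flag_names[i].value == flag)
            return flag_names[i].name;
    return NULL;
}
```

Because the strings are derived from the identifiers, the names handed to the external command cannot drift away from the constants they describe.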

If you’re interested in the result of this porting effort, have a look at the code in the Mercurial repo. Unfortunately, it cannot be packaged for Debian yet since the Debian package for XCB doesn’t ship the module for XInput (see bug #733227).

P.S. Thanks again to Mike Gabriel for providing me important help — and explaining where to look for more of it ;)

20 May, 2018 07:50PM by Andrej Shadura

hackergotchi for Sune Vuorela

Sune Vuorela

Where KDEInstallDirs points to

The other day, some user of Extra CMake Modules (A collection of utilities and find modules created by KDE), asked if there was an easy way to query cmake for wherever the KDEInstallDirs points to (KDEInstallDirs is a set of default paths that mostly is good for your system, iirc based upon GNUInstallDirs but with some extensions for various Qt, KDE and XDG common paths, as well as some cross platform additions). I couldn’t find an easy way of doing it without writing a couple of lines of CMake code.

Getting the KDE_INSTALL_(full_)APPDIR with default options is:

$ cmake -DTYPE=APPDIR ..

and various other options can be set as well.

$ cmake -DCMAKE_INSTALL_PREFIX=/opt/mystuff -DTYPE=BINDIR ..
KDE_INSTALL_FULL_BINDIR: /opt/mystuff/bin

This is kind of simple, but let’s just share it with the world:

cmake_minimum_required(VERSION 3.0)
find_package(ECM REQUIRED)



I don’t think it is complex enough to claim any sort of copyright, but if you insist, you can use it under one of the following licenses: CC0, Public Domain (if that’s in your jurisdiction), MIT/X11, WTFPL (any version), 3-clause BSD, GPL (any version), LGPL (any version) and .. erm. whatever.

I was trying to get it to work as a cmake -P script, but some of the find_package calls require a working CMakeCache. Comments welcome.

20 May, 2018 05:28PM by Sune Vuorela

hackergotchi for Holger Levsen

Holger Levsen


So, the MiniDebConf Hamburg 2018 is about to end, it's sunny, no clouds are visible and people seem to be happy.

And, I have time to write this blog post! So, just as a teaser for now, I'll present to you the content of some slides of our "Reproducible Buster" talk today. Watch the video!

Debian is wrong

93% is a lie. We need infrastructure, processes and policies. (And testing. Currently we only have testing and a vague goal.)

With the upcoming list of bugs (skipped here) we don't want to point fingers at individual teams; instead I think we can only solve this if we as Debian decide we want to solve it for buster.

I think this is not happening because people believe things have been sorted out and we take care of them. But we are not, we can't do this alone.

Debian stretch

the 'reproducible in theory but not in practice' release

Debian buster

the 'we should be reproducible but we are not' release?

Debian bullseye

the 'we are almost there but still haven't sorted out...' release???

I rather hope for:

Debian buster

the release is still far away and we haven't frozen yet! ;-)

20 May, 2018 04:00PM

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

Rcpp 0.12.17: More small updates

Another bi-monthly update and the seventeenth release in the 0.12.* series of Rcpp landed on CRAN late on Friday following nine (!!) days in gestation in the incoming/ directory of CRAN. And no complaints: we just wish CRAN were a little more forthcoming with what is happening when, and/or would let us help by supplying additional test information. I do run a fairly insane amount of backtests prior to releases; to then have to wait another week or more is ... not ideal. But again, we all owe CRAN an immense amount of gratitude for all they do, and do so well.

So once more, this release follows the 0.12.0 release from July 2015, the 0.12.1 release in September 2015, the 0.12.2 release in November 2015, the 0.12.3 release in January 2016, the 0.12.4 release in March 2016, the 0.12.5 release in May 2016, the 0.12.6 release in July 2016, the 0.12.7 release in September 2016, the 0.12.8 release in November 2016, the 0.12.9 release in January 2017, the 0.12.10 release in March 2017, the 0.12.11 release in May 2017, the 0.12.12 release in July 2017, the 0.12.13 release in late September 2017, the 0.12.14 release in November 2017, the 0.12.15 release in January 2018 and the 0.12.16 release in March 2018, making it the twenty-first release at the steady and predictable bi-monthly release frequency.

Rcpp has become the most popular way of enhancing GNU R with C or C++ code. As of today, 1362 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with another 138 in the current BioConductor release 3.7.

Compared to other releases, this release again contains a relatively small change set, but between them Kevin and Romain cleaned a few things up. Full details are below.

Changes in Rcpp version 0.12.17 (2018-05-09)

  • Changes in Rcpp API:

    • The random number Generator class no longer inherits from RNGScope (Kevin in #837 fixing #836).

    • A spurious parenthesis was removed to please gcc8 (Dirk fixing #841).

    • The optional Timer class header now undefines FALSE which was seen to have side-effects on some platforms (Romain in #847 fixing #846).

    • Optional StoragePolicy attributes now also work for string vectors (Romain in #850 fixing #849).

Thanks to CRANberries, you can also look at a diff to the previous release. As always, details are on the Rcpp Changelog page and the Rcpp page which also leads to the downloads page, the browseable doxygen docs and zip files of doxygen output for the standard formats. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

20 May, 2018 02:26PM

Joerg Jaspert

Mini DebConf Hamburg

Since Friday around noon, my 6-year-old son and I have been at the Mini DebConf in Hamburg. Attending together with my son is quite a different experience from attending alone or with my wife around as well. Though he is doing pretty well, it mostly means the day ends for me around 21:00 when he needs to go to sleep.


Friday we had a nice train trip up here; with a change to the schedule, we needed to switch to local trains to actually get where we wanted. Still, we arrived in time for lunch, which is always good. Afterwards we first went to buy drinks for the days and discovered a nice playground just around the corner.

The evening, besides dinner, consisted of chatting, hacking and getting Nils busy with something - for the times he came to me. He easily found others around and is fast in socialising with people, so free hacking time for me.


The day started with a little bit of a hurry, as Nils suddenly got the offer to attend a concert in the Elbphilharmonie and I had to get him over there fast. He says he liked it, even though it didn’t make much sense. Met him later for lunch again, followed by a visit to the playground, and then finally hacking time again.

While Nils was off looking after other conference attendees (and apparently getting ice cream too), after attending the Salsa talk, I could hack on stuff, and that meant dozens of merge requests for dak got processed (waldi and lamby are on a campaign against flake8 errors, it appears).

Apropos Salsa: The gitlab instance is the best thing that has happened to Debian in terms of collaboration for a long time. It allows so much better handling of any git related stuff; it’s worlds apart from how things were before.

Holger showed Nils and me the venue, including climbing up one of the towers, quite an adventure for Nils, but a real nice view from up there.

In the evening the dak master branch was ready to get merged into our deploy branch - and as such automagically deployed on all machines where we run. It consisted of 64 commits and apparently a bug, for which I thankfully found a merge request with a fix from waldi in the morning.

Oh, and the most important thing: THERE HAVE BEEN PANCAKES!


Started the morning, after breakfast, with merging the fixup for the bug, and getting it into the deploy branch. Also asked DSA to adjust group rights for the ftpteam; today we got one promotion from ftptrainee to ftpteam, everybody offer your condolences to waldi. Also added more ftptrainees as we got more volunteers, and removed inactive ones.

Soon we have to start our way back home, but I am sure to come back for another Mini Conf, if it happens again here.

20 May, 2018 12:15PM

hackergotchi for Ben Hutchings

Ben Hutchings

Help the Debian kernel team to help you

I gave the first talk this morning at Mini-DebConf Hamburg, titled "Help the kernel team to help you". I briefly described several ways that Debian users and developers can make it easier (or harder) for us to deal with their requests. The slides are up on my talks page, and video should be available soon.

20 May, 2018 09:48AM

Russ Allbery

California state election

Hm, I haven't done one of these in a while. Well, time to alienate future employers and make awkward mistakes in public that I have to explain if I ever run for office! (Spoiler: I'm highly unlikely to ever run for office.)

This is only of direct interest to California residents. To everyone else, RIP your feed reader, and I'm sorry for the length. (My hand-rolled blog software doesn't do cut tags.) I'll spare you all the drill-down into the Bay Area regional offices. (Apparently we elect our coroner, which makes no sense to me.)


I'm not explaining these because this is already much too long; those who aren't in California and want to follow along can see the voter guide.

Proposition 68: YES. Still a good time to borrow money, and what we're borrowing money for here seems pretty reasonable. State finances are in reasonable shape; we have the largest debt of any state for the obvious reason that we have the most people and the most money.

Proposition 69: YES. My instinct is to vote no because I have a general objection to putting restrictions on how the state manages its budget. I don't like dividing tax money into locked pools for the same reason that I stopped partitioning hard drives. That said, this includes public transit in the spending pool from gasoline taxes (good), the opposition is incoherent, and there are wide-ranging endorsements. That pushed me to yes on the grounds that maybe all these people understand something about budget allocations that I don't.

Proposition 70: NO. This is some sort of compromise with Republicans because they don't like what cap-and-trade money is being spent on (like high-speed rail) and want a say. If I wanted them to have a say, I'd vote for them. There's a reason why they have to resort to backroom tricks to try to get leverage over laws in this state, and it's not because they have good ideas.

Proposition 71: YES. Entirely reasonable change to say that propositions only go into effect after the election results are final. (There was a real proposition where this almost caused a ton of confusion, and prompted this amendment.)

Proposition 72: YES. I'm grumbling about this because I think we should get rid of all this special-case bullshit in property taxes and just readjust them regularly. Unfortunately, in our current property tax regime, you have to add more exemptions like this because otherwise the property tax hit (that would otherwise not be incurred) is so large that it kills the market for these improvements. Rainwater capture is to the public benefit in multiple ways, so I'll hold my nose and vote for another special exception.

Federal Offices

US Senator: Kevin de León. I'll vote for Feinstein in the general, and she's way up on de León in the polls, but there's no risk in voting for the more progressive candidate here since there's no chance Feinstein won't get the most votes in the primary. De León is a more solidly progressive candidate than Feinstein. I'd love to see a general election between the two of them.

State Offices

I'm omitting all the unopposed ones, and all the ones where there's only one Democrat running in the primary. (I'm not going to vote for any Republican except for one exception noted below, and third parties in the US are unbelievably dysfunctional and not ready to govern.) For those outside the state, California has a jungle primary where the top two vote-getters regardless of party go to the general election, so this is more partisan and more important than other state primaries.

Governor: Delaine Eastin. One always has to ask, in our bullshit voting system, whether one has to vote tactically instead of for the best candidate. But, looking at polling, I think there's no chance Gavin Newsom (the second-best candidate and the front-runner) won't advance to the general election, so I get to vote for the candidate I actually want to win, even though she's probably not going to. Eastin is by far the most progressive candidate running who actually has the experience required to be governor. (Spoiler: Newsom is going to win, and I'll definitely vote for him in the general against Villaraigosa.)

Lieutenant Governor: Eleni Kounalakis. She and Bleich are the strongest candidates. I don't see a ton of separation between them, but Kounalakis's endorsements are a bit stronger for me. She's also the one candidate who has a specific statement about what she plans to do with the lieutenant governor role of oversight over the university system, which is almost its only actual power. (This political office is stupid and we should abolish it.)

Secretary of State: Alex Padilla. I agree more with Ruben Major's platform (100% paper ballots is the correct security position), but he's an oddball outsider and I don't think he can accomplish as much. Padilla has an excellent track record as the incumbent and is doing basically the right things, just less dramatically.

Treasurer: Fiona Ma. I like Vivek Viswanathan and support his platform, but Fiona Ma has a lot more political expertise and I think will be more effective. I look forward to voting for Viswanathan for something else someday.

Attorney General: Dave Jones. Xavier Becerra hasn't been doing a bad job fighting off bad federal policy, but that seems to be all that he's interested in, and he's playing partisan games with the office. He has an air of amateurishness and political hackery. Dave Jones holds the same positions in a more effective way, is more professional, and has done a good job as Insurance Commissioner.

Insurance Commissioner: Steve Poizner. I'm going to vote for the (former) Republican here. Poizner expressed some really bullshit views on immigration when he ran for governor (which he's apologized for). I wouldn't support him for a more political office. But he was an excellent insurance commissioner (see, for instance, the response to Blue Cross's rate increase request). I'm closer to Ricardo Lara politically, but in his statements to the press he comes across as a clown: self-driving car insurance problems, cannabis insurance, climate change insurance, and a bunch of other nonsense that makes me think he doesn't understand the job. The other Democrat, Mahmood, seems like less of a partisan hack, but he's a virtual unknown. If this were an important partisan office, I'd hold my nose and vote for Lara, but the job of insurance commissioner is more to be an auditor and negotiator, and Poizner was really good at it.

Superintendent of Public Instruction: Tony Thurmond. The other front-runner is Marshall Tuck, who is a charter school advocate. I hate charter schools with the passion of a burning sun.

Local Measures

Regional Measure 3: YES. Even more hyper-local than the rest of this post, but mentioning it because it was a narrow call. Bridge tolls are regressive, and I'm not a big fan of raising them as opposed to, say, increasing property taxes (yes please) or income taxes. That said, taxing cars to pay for (largely) public transit is the direction the money should flow. It was thinly balanced for me, but the thrust of the projects won out over the distaste at the regressive tax.

20 May, 2018 05:07AM

May 19, 2018

Free software log (April 2018)

This is rather late since I got distracted by various other things including, ironically, releasing a bunch of software. This is for April, so doesn't include the releases from this month.

The main release I worked on was remctl 3.14, which fixed a security bug introduced in 3.12 with the sudo configuration option. This has since been replaced by 3.15, which has more thorough maintainer testing infrastructure to hopefully prevent this from happening again.

I also did the final steps of the release process for INN 2.6.2, although as usual Julien ÉLIE did all of the hard work.

On the Debian side, I uploaded a new rssh package for the migration to GitLab. I have more work to do on that front, but haven't yet had the time. I've been prioritizing some of my own packages over doing more general Debian work.

Finally, I looked at my Perl modules on CPANTS (the CPAN testing service) and made note of a few things I need to fix, plus filed a couple of bugs for display issues (one of which turned out to be my fault and fixed in Git). I also did a bit of research on the badges that people in the Rust community use in their documentation and started adding support to DocKnot, some of which made it into the subsequent release I did this month.

19 May, 2018 11:15PM

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

RcppGSL 0.3.5

A maintenance update of RcppGSL just brought version 0.3.5 to CRAN, a mere twelve days after the RcppGSL 0.3.4 release. Just like yesterday's upload of inline 0.3.15 it was prompted by a CRAN request to update the per-package manual page; see the inline post for details.

The RcppGSL package provides an interface from R to the GNU GSL using the Rcpp package.

No user-facing new code or features were added. The NEWS file entries follow below:

Changes in version 0.3.5 (2018-05-19)

  • Update package manual page using references to DESCRIPTION file [CRAN request].

Courtesy of CRANberries, a summary of changes to the most recent release is available.

More information is on the RcppGSL page. Questions, comments etc should go to the issue tickets at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

19 May, 2018 08:11PM

hackergotchi for Martín Ferrari

Martín Ferrari

MiniDebConf Hamburg - Friday/Saturday

MiniDebCamp Hamburg - Friday 18/5, Saturday 19/5

Friday and Saturday have been very productive days; I love events where there is time to hack!

I had more chats about contributors.d.o with Ganneff and Formorer, and if all goes according to plan, soon salsa will start streaming commit information to contributors and populate information about different teams: not only about normal packaging repos, but also about websites, tools, native packages, etc.

Note that the latter require special configuration, and the same goes if you want to have separate stats for your team (like for the Go team or the Perl team). So if you want to offer proper attribution to members of your team, please get in touch!

I spent loads of time working on Prometheus packages, and finally today (after almost a year) I uploaded a new version of prometheus-alertmanager to experimental. I decided to just drop the web interface entirely, as packaging all of the Elm framework would take me months of work. If anybody feels like writing a basic HTML/JS interface, I would be happy to include it in the package!

While doing that, I found bugs in the CI pipeline for Go packages in Salsa. Solving these will hopefully make the automatic testing more reliable, as API breakage is sadly a big problem in the Go ecosystem.

I am loving the venue here. Apart from hosting some companies and associations, there is an art gallery which currently has a photo exhibition called Echo park; there were parties happening last night, and tonight apparently there will be more. This place is amazing!


19 May, 2018 05:40PM

hackergotchi for Thorsten Glaser

Thorsten Glaser

Progress report from the Movim packaging sprint at MiniDebconf

Nik wishes you to know that the Movim packaging sprint (sponsored by the DPL, thank you!) is handled under the umbrella of the Debian Edu sprint (similarly sponsored) since this package is handled by the Teckids Debian Task Force, personnel from Teckids e.V.

After arriving, I’ve started collecting knowledge first. I reviewed upstream’s composer.json file and Wiki page about dependencies and, after it quickly became apparent that we need much more information (e.g. which versions are in sid, what the package names are, and, most importantly, recursive dependencies), a Wiki page of our own grew. Then I made a hunt for information about how to package stuff that uses PHP Composer upstream, and found the, ahem, wonderfully abundant, structured, plentiful and clear documentation from the Debian PHP/PEAR Packaging team. (Some time and reverse-engineering later I figured out that we just ignore composer and read its control file in pkg-php-tools converting dependency information to Debian package relationships. Much time later I also figured out it mangles package names in a specific way and had to rename one of the packages I created in the meantime… thankfully before having uploaded it.) Quickly, the Wiki page grew listing the package names we’re supposed to use. I created a package which I could use as template for all others later.

The upstream Movim developer arrived as well — we have quite an amount of upstream developers of various projects attending MiniDebConf, to the joy of the attendees actually directly involved in Debian, and this makes things much easier, as he immediately started removing dependencies (to make our job easier) and fixing bugs and helping us understand how some of those dependencies work. (I also contributed code upstream that replaces some Unicode codepoints or sequences thereof, such as 3⃣ or ‼ or 👱🏻‍♀️, with <img…/> tags pointing to the SVG images shipped with Movim, with a description (generated from their Unicode names) in the alt attribute.)
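
The emoji replacement mentioned in the parenthesis can be sketched roughly as follows. This is a minimal illustration in Python and not the code contributed upstream: the SVG_BASE path, the file naming scheme based on hex code points, and both function names are assumptions made for the example. Only the general idea (an img tag whose alt text comes from the Unicode names) is taken from the text above.

```python
import html
import unicodedata
from typing import Iterable

# Hypothetical location of the bundled SVG images; the real layout in
# Movim may differ.
SVG_BASE = "/theme/img/emojis/svg"


def emoji_to_img(sequence: str) -> str:
    """Return an <img> tag for one emoji (a code point or a sequence)."""
    # File name built from the hex code points, e.g. "203c.svg"
    # (an assumed naming scheme).
    filename = "-".join(f"{ord(ch):x}" for ch in sequence) + ".svg"
    # Description generated from the Unicode names of the code points.
    names = [unicodedata.name(ch, "") for ch in sequence]
    alt = " ".join(name.lower() for name in names if name)
    return f'<img src="{SVG_BASE}/{filename}" alt="{html.escape(alt)}"/>'


def replace_emoji(text: str, known: Iterable[str]) -> str:
    """Replace every known emoji in text, longest sequences first."""
    for sequence in sorted(known, key=len, reverse=True):
        text = text.replace(sequence, emoji_to_img(sequence))
    return text


if __name__ == "__main__":
    print(replace_emoji("wow \u203c", ["\u203c"]))
```

Handling the longest sequences first matters because a multi-code-point emoji contains shorter emoji as substrings; replacing those first would break the sequence apart.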

Now, Saturday, all dependencies are packaged so far, although we’re still waiting for maintainer feedback for those two we’d need to NMU (or have them upload or us take the packages over); most are in NEW of course, but that’s no problem. Now we can tackle packaging Movim itself — I guess we’ll see whether those other packages actually work then ☺

We also had a chance to fix bugs in other packages, like guacamole-client and musescore.

In the meantime we’ve also had the chance to socialise, discuss, meet, etc. other Debian Developers and associates and enjoy the wonderful food and superb coffee of the “Cantina” at the venue; let me hereby express heartfelt thanks to the MiniDebConf organisation for this good location pick!

Update, later this night: we took over the remaining two packages with permission from their previous team and uploader, and have already started with actually packaging Movim, discovering untold gruesome things in the upstream of the two webfonts it bundles.

19 May, 2018 04:45PM

Mike Hommey

Announcing git-cinnabar 0.5.0 beta 3

Git-cinnabar is a git remote helper to interact with mercurial repositories. It allows you to clone, pull and push from/to mercurial remote repositories, using git.

Get it on github.

These release notes are also available on the git-cinnabar wiki.

What’s new since 0.5.0 beta 2?

  • Fixed incompatibilities with Mercurial >= 4.4.
  • Miscellaneous metadata format changes.
  • Move more operations to the helper, hopefully making things faster.
  • Updated git to 2.17.0 for the helper.
  • Properly handle clones with bundles when the repository doesn’t contain anything newer than the bundle.
  • Fixed tag cache, which could lead to missing tags.

19 May, 2018 05:26AM by glandium

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

inline 0.3.15

A maintenance release of the inline package arrived on CRAN today. inline facilitates writing code in-line in simple string expressions or short files. The package is mature and in maintenance mode: Rcpp used it greatly for several years but then moved on to Rcpp Attributes so we have a much more limited need for extensions to inline. But a number of other packages have a hard dependence on it, so we do of course look after it as part of the open source social contract (which is a name I just made up, but you get the idea...)

This release was triggered by a (as usual very reasonable) CRAN request to update the per-package manual page which had become stale. We now use Rd macros, you can see the diff for just that file at GitHub; I also include it below. My pkgKitten package-creation helper uses the same scheme, I wholeheartedly recommend it -- as the diff shows, it makes things a lot simpler.

Some other changes reflect two user-contributed pull requests, as well as standard minor package update issues. See below for a detailed list of changes extracted from the NEWS file.

Changes in inline version 0.3.15 (2018-05-18)

  • Correct requireNamespace() call thanks (Alexander Grueneberg in #5).

  • Small simplification to .travis.yml; also switch to https.

  • Use seq_along instead of seq(along=...) (Watal M. Iwasaki in #6).

  • Update package manual page using references to DESCRIPTION file [CRAN request].

  • Minor packaging updates.

Courtesy of CRANberries, there is a comparison to the previous release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

19 May, 2018 01:04AM

May 18, 2018

hackergotchi for Joey Hess

Joey Hess

fridge 0.1

Imagine something really cool, like a fridge connected to a powerwall, powered entirely by solar panels. What could be cooler than that?

How about a fridge powered entirely by solar panels without the powerwall? Zero battery use, and yet it still preserves your food.

That's much cooler, because batteries, even hyped ones like the powerwall, are expensive and inefficient and have limited cycles. Solar panels are cheap and efficient now. With enough solar panels that the fridge has power to cool down most days (even cloudy days), and a smart enough control system, the fridge itself becomes the battery -- a cold battery.
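
To make the "cold battery" idea a bit more concrete, here is a minimal sketch in Python of the kind of decision such a control system might make. It is purely illustrative and is not the controller described in this post (which is written in Haskell): the function name, the target temperature, and the compressor power draw are all assumptions.

```python
def should_run_compressor(fridge_temp_c: float,
                          solar_power_w: float,
                          target_temp_c: float = 1.0,
                          compressor_draw_w: float = 150.0) -> bool:
    """Decide whether to run the fridge compressor right now.

    Hypothetical rule: run only when the panels alone can carry the
    compressor (no battery involved), and keep cooling down to a target
    below what a normal fridge would aim for, so that the thermal mass
    stores "cold" for the night.
    """
    has_solar_headroom = solar_power_w >= compressor_draw_w
    not_too_cold = fridge_temp_c > target_temp_c
    return has_solar_headroom and not_too_cold


if __name__ == "__main__":
    # Sunny midday, fridge at 6 C: cool while the power is there.
    print(should_run_compressor(6.0, 600.0))
    # Night, fridge at 6 C: no power, so coast on the stored cold.
    print(should_run_compressor(6.0, 0.0))
```

The point of the sketch is only that the decision depends on available solar power rather than on a thermostat alone; a real controller would also need to deal with things like compressor start-up and changing weather.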

I'm live coding my fridge, with that goal in mind. You can follow along in this design thread on secure scuttlebutt, and my git commits, and you can watch real-time data from my fridge.

Over the past two days, which were not especially sunny, my 1 kilowatt of solar panels has managed to cool the fridge down close to standard fridge temperatures. The temperature remains steady overnight thanks to added thermal mass in the fridge. My food seems safe in it, despite it being powered off for 14 hours each night.

graph of fridge temperature, starting at 13C and trending downwards to 5C over 24 hours

(Numbers in this graph are running higher than the actual temps of food in the fridge, for reasons explained in the scuttlebutt thread.)

Of course, the longterm viability of a fridge that never draws from a battery is TBD; I'll know within a year if it works for me.

bunch of bananas resting on top of chest freezer fridge conversion

I've written about the coding side of this project before, in my haskell controlled offgrid fridge. The reactive-banana-automation library is working well in this application. My AIMS inverter control board and easy-peasy-devicetree-squeezy were other groundwork for this project.

18 May, 2018 11:26PM

Andrej Shadura

Goodbye Octopress, hello Pelican

Hi from MiniDebConf in Hamburg!

As you may have noticed, I don’t update this blog often. One of the reasons why this was happening was that until now it was incredibly difficult to write posts. The software I used, Octopress (based on Jekyll) was based on Ruby, and it required quite specific versions of its dependencies. I had the workspace deployed on one of my old laptops, but when I attempted to reproduce it on the laptop I currently use, I failed to. Some dependencies could not be installed, others failed, and my Ruby skills weren’t enough to fix that mess. (I have to admit my Ruby skills improved insignificantly since the time I installed Octopress, but that wasn’t enough to help in this case.)

I’ve spent some time during this DebCamp migrating to Pelican, which is written in Python and packaged in Debian, and whose dependencies are quite straightforward to install. I had to install (and write) a few plugins to make the migration easier, and port my custom Octopress Bootstrap theme to Pelican.

I no longer include any scripts from Twitter or Facebook (I made Tweet and Share button static links), and the Disqus comments are loaded only on demand, so reading this blog will respect your privacy better than before.

See you at MiniDebConf tomorrow!

18 May, 2018 08:05PM by Andrej Shadura

hackergotchi for Joachim Breitner

Joachim Breitner

Proof reuse in Coq using existential variables

This is another technical post that is of interest only to Coq users.

TL;DR: Using existential variables for hypotheses allows you to easily refactor a complicated proof into an induction schema and the actual proofs.


As a running example, I will use a small theory of “bags”, which you can think of as lists represented as trees, to allow an O(1) append operation:

Require Import Coq.Arith.Arith.
Require Import Psatz.
Require FunInd.

(* The data type *)
Inductive Bag a : Type :=
  | Empty : Bag a
  | Unit  : a -> Bag a
  | Two   : Bag a -> Bag a -> Bag a.

Arguments Empty {_}.
Arguments Unit {_}.
Arguments Two {_}.

Fixpoint length {a} (b : Bag a) : nat :=
  match b with
  | Empty     => 0
  | Unit _    => 1
  | Two b1 b2 => length b1 + length b2
  end.

(* A smart constructor that ensures that a [Two] never
   has [Empty] as subtrees. *)
Definition two {a} (b1 b2 : Bag a) : Bag a := match b1 with
  | Empty => b2
  | _ => match b2 with | Empty => b1
                       | _ => Two b1 b2 end end.

Lemma length_two {a} (b1 b2 : Bag a) :
  length (two b1 b2) = length b1 + length b2.
Proof. destruct b1, b2; simpl; lia. Qed.

(* A first non-trivial function *)
Function take {a : Type} (n : nat) (b : Bag a) : Bag a :=
  if n =? 0
  then Empty
  else match b with
       | Empty     => b
       | Unit x    => b
       | Two b1 b2 => two (take n b1) (take (n - length b1) b2)
       end.

The theorem

The theorem that I will be looking at in this proof describes how length and take interact:

Theorem length_take''':
  forall {a} n (b : Bag a),
  length (take n b) = min n (length b).

Before I dive into it, let me point out that this example itself is too simple to warrant the techniques that I will present in this post. I have to rely on your imagination to scale this up to appreciate the effect on significantly bigger proofs.
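
For readers who do not have Coq at hand, the statement can at least be checked on small inputs. The following is a minimal Python model of the definitions above together with a brute-force check of the theorem. It is only a sanity check under my own assumptions (the class names mirror the Coq constructors, and the enumeration helper bags is invented for the example), not a proof.

```python
from dataclasses import dataclass
from itertools import product
from typing import Union


@dataclass(frozen=True)
class Empty:
    pass


@dataclass(frozen=True)
class Unit:
    value: object


@dataclass(frozen=True)
class Two:
    left: "Bag"
    right: "Bag"


Bag = Union[Empty, Unit, Two]


def length(b: Bag) -> int:
    if isinstance(b, Two):
        return length(b.left) + length(b.right)
    return 1 if isinstance(b, Unit) else 0


def two(b1: Bag, b2: Bag) -> Bag:
    """Smart constructor: never builds a Two with an Empty subtree."""
    if isinstance(b1, Empty):
        return b2
    if isinstance(b2, Empty):
        return b1
    return Two(b1, b2)


def take(n: int, b: Bag) -> Bag:
    if n == 0:
        return Empty()
    if isinstance(b, Two):
        # Subtraction on nat in Coq truncates at 0, so do the same here.
        rest = max(n - length(b.left), 0)
        return two(take(n, b.left), take(rest, b.right))
    return b


def bags(depth: int):
    """Enumerate all bags (with a dummy element) up to the given depth."""
    yield Empty()
    yield Unit("x")
    if depth > 0:
        smaller = list(bags(depth - 1))
        for b1, b2 in product(smaller, smaller):
            yield Two(b1, b2)


def check_length_take(depth: int = 2, max_n: int = 6) -> bool:
    """Check length (take n b) = min n (length b) on all small inputs."""
    return all(length(take(n, b)) == min(n, length(b))
               for b in bags(depth) for n in range(max_n + 1))


if __name__ == "__main__":
    print(check_length_take())
```

Note that take checks n first and only then looks at the shape of the bag, which is exactly the order of case distinctions that the proofs below have to follow.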

Naive induction

How would we go about proving this lemma? Surely, induction is the way to go! And indeed, this is provable using induction (on the Bag) just fine:

Proof.
  intros.
  revert n.
  induction b; intros n.
  * simpl.
    destruct (Nat.eqb_spec n 0).
    + subst. rewrite Nat.min_0_l. reflexivity.
    + rewrite Nat.min_0_r. reflexivity.
  * simpl.
    destruct (Nat.eqb_spec n 0).
    + subst. rewrite Nat.min_0_l. reflexivity.
    + simpl. lia.
  * simpl.
    destruct (Nat.eqb_spec n 0).
    + subst. rewrite Nat.min_0_l. reflexivity.
    + simpl. rewrite length_two, IHb1, IHb2. lia.
Qed.

But there is a problem: A proof by induction on the Bag argument immediately creates three subgoals, one for each constructor. But that is not how take is defined, which first checks the value of n, independent of the constructor. This means that we have to do the case-split and the proof for the case n = 0 three times, although they are identical. It’s a one-line proof here, but imagine something bigger...

Proof by fixpoint

Can we refactor the proof to handle the case n = 0 first? Yes, but not with a simple invocation of the induction tactic. We could do well-founded induction on the length of the argument, or we can do the proof using the more primitive fix tactic. The latter is a bit hairy, you won’t know if your proof is accepted until you do Qed (or check with Guarded), but when it works it can yield some nice proofs.

Proof.
  intros a.
  fix IH 2.
  intros n b.
  rewrite take_equation.
  destruct (Nat.eqb_spec n 0).
  + subst n. rewrite Nat.min_0_l. reflexivity.
  + destruct b.
    * rewrite Nat.min_0_r. reflexivity.
    * simpl. lia.
    * simpl. rewrite length_two, !IH. lia.
Qed.

Nice: we eliminated the duplication of proofs!

A functional induction lemma

Again, imagine that we jumped through more hoops here ... maybe some well-founded recursion with a tricky size measure and complex proofs that the measure decreases ... or maybe you need to carry around an invariant about your arguments and you have to work hard to satisfy the assumption of the induction hypothesis.

As long as you do only one proof about take, that is fine. As soon as you do a second proof, you will notice that you have to repeat all of that, and it can easily make up most of your proof...

Wouldn’t it be nice if you can do the common parts of the proofs only once, obtain a generic proof scheme that you can use for (most) proofs about take, and then just fill in the blanks?

Incidentally, the Function command provides precisely that:

take_ind
     : forall (a : Type) (P : nat -> Bag a -> Bag a -> Prop),
       (forall (n : nat) (b : Bag a), (n =? 0) = true -> P n b Empty) ->
       (forall (n : nat) (b : Bag a), (n =? 0) = false -> b = Empty -> P n Empty b) ->
       (forall (n : nat) (b : Bag a), (n =? 0) = false -> forall x : a, b = Unit x -> P n (Unit x) b) ->
       (forall (n : nat) (b : Bag a),
        (n =? 0) = false ->
        forall b1 b2 : Bag a,
        b = Two b1 b2 ->
        P n b1 (take n b1) ->
        P (n - length b1) b2 (take (n - length b1) b2) ->
        P n (Two b1 b2) (two (take n b1) (take (n - length b1) b2))) ->
       forall (n : nat) (b : Bag a), P n b (take n b)

which is great if you can use Function (although not perfect – we’d rather see n = 0 instead of (n =? 0) = true), but often Function is not powerful enough to define the function you care about.

Extracting the scheme from a proof

We could define our own take_ind' by hand, but that is a lot of work, and we may not get it right easily, and when we change our functions, there is now this big proof statement to update.

Instead, let us use existentials, which are variables where Coq infers their type from how we use them, so we don’t have to declare them. Unfortunately, Coq does not support writing just

Lemma take_ind':
  forall (a : Type) (P : nat -> Bag a -> Bag a -> Prop),
  forall (IH1 : ?) (IH2 : ?) (IH3 : ?) (IH4 : ?),
  forall n b, P n b (take n b).

where we just leave out the type of the assumptions (Isabelle does...), but we can fake it using some generic technique.

We begin with stating an auxiliary lemma using a sigma type to say “there exist some assumptions that are sufficient to show the conclusion”:

Lemma take_ind_aux:
  forall a (P : _ -> _ -> _ -> Prop),
  { Hs : Prop |
    Hs -> forall n (b : Bag a), P n b (take n b)
  }.

We use the eexists tactic (existential exists) to construct the sigma type without committing to the type of Hs yet.

Proof.
  intros a P.
  eexists.
  intros Hs.

This gives us an assumption Hs : ?Hs – note the existential type. We need four of those, which we can achieve by writing

  pose proof Hs as H1. eapply proj1 in H1. eapply proj2 in Hs.
  pose proof Hs as H2. eapply proj1 in H2. eapply proj2 in Hs.
  pose proof Hs as H3. eapply proj1 in H3. eapply proj2 in Hs.
  rename Hs into H4.

We now have this goal state:

1 subgoal
a : Type
P : nat -> Bag a -> Bag a -> Prop
H4 : ?Goal2
H1 : ?Goal
H2 : ?Goal0
H3 : ?Goal1
============================
forall (n : nat) (b : Bag a), P n b (take n b)

At this point, we start reproducing the proof of length_take: The same approach to induction, the same case splits:

  fix IH 2.
  intros n b.
  rewrite take_equation.
  destruct (Nat.eqb_spec n 0).
  + subst n.
    revert b.
    refine H1.
  + rename n0 into Hnot_null.
    destruct b.
    * revert n Hnot_null.
      refine H2.
    * rename a0 into x.
      revert x n Hnot_null.
      refine H3.
    * assert (IHb1 : P n b1 (take n b1)) by apply IH.
      assert (IHb2 : P (n - length b1) b2 (take (n - length b1) b2)) by apply IH.
      revert n b1 b2 Hnot_null IHb1 IHb2.
      refine H4.
Defined. (* Important *)

Inside each case, we move all relevant hypotheses into the goal using revert and refine with the corresponding assumption, thus instantiating it. In the recursive case (Two), we assert that P holds for the subterms, by induction.

It is important to end this proof with Defined, and not Qed, as we will see later.

In a next step, we can remove the sigma type:

Definition take_ind' a P := proj2_sig (take_ind_aux a P).

The type of take_ind' is as follows:

     : forall (a : Type) (P : nat -> Bag a -> Bag a -> Prop),
       proj1_sig (take_ind_aux a P) ->
       forall n b, P n b (take n b)

This looks almost like an induction lemma. The assumptions of this lemma have the not very helpful type proj1_sig (take_ind_aux a P), but we can already use this to prove length_take:

Theorem length_take:
  forall {a} n (b : Bag a),
  length (take n b) = min n (length b).
Proof.
  intros a.
  apply take_ind' with (P := fun n b r => length r = min n (length b)).
  repeat apply conj; intros.
  * rewrite Nat.min_0_l. reflexivity.
  * rewrite Nat.min_0_r. reflexivity.
  * simpl. lia.
  * simpl. rewrite length_two, IHb1, IHb2. lia.
Qed.

In this case I have to explicitly state P where I invoke take_ind', because Coq cannot figure out this instantiation on its own (it requires higher-order unification, which is undecidable and unpredictable). In other cases I had more luck.

After I apply take_ind', I have this proof goal:

proj1_sig (take_ind_aux a (fun n b r => length r = min n (length b)))

which is the type that Coq inferred for Hs above. We know that this is a conjunction of a bunch of assumptions, and we can split it as such, using repeat apply conj. At this point, Coq needs to look inside take_ind_aux; this would fail if we used Qed to conclude the proof of take_ind_aux.

This gives me four goals, one for each case of take, and the remaining proof really only deals with the specifics of length_take – no more worrying about getting the induction right and doing the case-splitting the right way.

Also note that, very conveniently, Coq uses the same name for the induction hypotheses IHb1 and IHb2 that we used in take_ind_aux!

Making it prettier

It may be a bit confusing to have this proj1_sig in the type, especially when working in a team where others will use your induction lemma without knowing its internals. But we can resolve that, and also turn the conjunctions into normal arrows, using a bit of tactic support. This is completely generic, so if you follow this procedure, you can just copy most of that:

Lemma uncurry_and: forall {A B C}, (A /\ B -> C) -> (A -> B -> C).
Proof. intros. intuition. Qed.
Lemma under_imp:   forall {A B C}, (B -> C) -> (A -> B) -> (A -> C).
Proof. intros. intuition. Qed.
Ltac iterate n f x := lazymatch n with
  | 0 => x
  | S ?n => iterate n f uconstr:(f x)
  end.
Ltac uncurryN n x :=
  let n' := eval compute in n in
  lazymatch n' with
  | 0 => x
  | S ?n => let uc := iterate n uconstr:(under_imp) uconstr:(uncurry_and) in
            let x' := uncurryN n x in
            uconstr:(uc x')
  end.

With this in place, we can define our final proof scheme lemma:

Definition take_ind'' a P
  := ltac:(let x := uncurryN 3 (proj2_sig (take_ind_aux a P)) in exact x).
Opaque take_ind''.

The type of take_ind'' is now exactly what we’d wish for: All assumptions spelled out, and the n =? 0 already taken care of (compare this to the take_ind provided by the Function command above):

     : forall (a : Type) (P : nat -> Bag a -> Bag a -> Prop),
       (forall b : Bag a, P 0 b Empty) ->
       (forall n : nat, n <> 0 -> P n Empty Empty) ->
       (forall (x : a) (n : nat), n <> 0 -> P n (Unit x) (Unit x)) ->
       (forall (n : nat) (b1 b2 : Bag a),
        n <> 0 ->
        P n b1 (take n b1) ->
        P (n - length b1) b2 (take (n - length b1) b2) ->
        P n (Two b1 b2) (two (take n b1) (take (n - length b1) b2))) ->
       forall (n : nat) (b : Bag a), P n b (take n b)

At this point we can mark take_ind'' as Opaque, to hide how we obtained this lemma.

Our proof does not change a lot; we merely no longer have to use repeat apply conj:

Theorem length_take''':
  forall {a} n (b : Bag a),
  length (take n b) = min n (length b).
Proof.
  intros a.
  apply take_ind'' with (P := fun n b r => length r = min n (length b)); intros.
  * rewrite Nat.min_0_l. reflexivity.
  * rewrite Nat.min_0_r. reflexivity.
  * simpl. lia.
  * simpl. rewrite length_two, IHb1, IHb2. lia.
Qed.

Is it worth it?

It was in my case: Applying this trick in our ongoing work of verifying parts of the Haskell compiler GHC separated a somewhat complicated proof into a re-usable proof scheme (go_ind), making the actual proofs (go_all_WellScopedFloats, go_res_WellScoped) much neater and to the point. It saved “only” 60 lines (if I don’t count the 20 “generic” lines above), but the pay-off will increase as I do even more proofs about this function.

18 May, 2018 12:51PM by Joachim Breitner

hackergotchi for Martín Ferrari

Martín Ferrari

MiniDebConf Hamburg - Thursday

MiniDebCamp Hamburg - Thursday 17/5

I missed my flight on Wednesday, and for a moment I thought I would have to cancel my attendance, but luckily I was able to buy a ticket for Thursday for a good price.

I arrived at the venue just in time for a "stand-up" meeting, where people introduced themselves and shared what they are working on / planning to work on. That gave me a great feeling, having an idea of what other people are doing, and gave me motivation to work on my projects.

The venue seems to be some kind of cooperative, with office space for different associations, there is also a small guest house (where I am sleeping), and a "cantina". The building seems very pretty, but is going through some renovations, so the scaffolding does not let you see it much. It also has a big outdoors area, which is always welcomed.

I had a good chat about mapping support in IkiWiki, so my rewrite of the OSM plugin might get some users even before it is completely merged!

I also worked for a while on Prometheus packages, I am hoping to finally get a new version of prometheus-alertmanager packaged soon.

I realised I still had some repos in my home directory in alioth, so I moved these away to salsa. In the same vein, I started discussions about migrating my data-collection scripts for contributors.d.o to salsa; this is quite important if we want to keep contributors relevant and useful.


18 May, 2018 09:43AM

hackergotchi for Olivier Berger

Olivier Berger

Virtualized lab demonstration using a tweaked Labtainers running in a container

I’ve recorded a screencast: Labtainers in docker demonstration (embedded below) demonstrating how I’ve tweaked Labtainers so as to run it inside its own Docker container.

I’m currently very much excited by the Labtainers framework for delivering virtual labs, for instance in the context of MOOCs.

Labtainers is quite interesting as it allows isolating a lab in several containers running in their own dedicated virtual network, which helps with distributing a lab without needing to install anything locally.

My tweak allows running what I called the “master” container, which contains the labtainers scripts, instead of having to install labtainers on a Linux host. This should help installation and distribution of labtainers, as well as deploying it on cloud platforms, some day soon. In the meantime Labtainer containers of the labs run with privileges so it’s advised to be careful, and running the whole of these containers in a VM may be safer. Maybe Labtainers will evolve in the future to integrate a containerization of its scripts. My patches are pending, but the upstream authors are currently focused on some other priorities.

Another interesting property of labtainers that is shown in the demo is the auto-grading feature that uses traces of what was performed inside the lab environment by the student, to evaluate the activities. Here, the telnetlab that I’ve shown is evaluated by looking at text input on the command line or messages appearing on the stdout or in logs: the student launched both telnet and ssh, some failed login appeared, etc.

However, the demo is a bit confusing, in that I recorded a second lab execution whereas I had previously attempted a first try at the same telnetlab. In labtainers, traces of execution can accumulate: the student will make a first attempt, and restart later, before sending it all to the professor (unless a is issued). This explains why the grading appears to give a different result than what I performed in the screencast.

Stay tuned for more news about my Labtainers adventures.

P.S. thanks to labtainers authors, and obs-studio folks for the screencast recording tool 🙂

18 May, 2018 08:39AM by Olivier Berger

hackergotchi for Louis-Philippe Véronneau

Louis-Philippe Véronneau

Running systemd in the Gitlab docker runner CI

At the DebConf videoteam, we use ansible to manage our machines. Last fall in Cambridge, we migrated our repositories on and I started playing with the Gitlab CI. It's pretty powerful and helped us catch a bunch of errors we had missed.

As it was my first time playing with continuous integration and docker, I had trouble when our playbooks used systemd in one way or another and I couldn't figure out a way to have systemd run in the Gitlab docker runner.

Fast forward a few months and I lost another day and a half working on this issue. I haven't been able to make it work (my conclusion is that it's not currently possible), but I thought I would share what I learned in the process with others. Who knows, maybe someone will have a solution!

10 steps to failure

I first started by creating a privileged Gitlab docker runner on a machine that is dedicated to running Gitlab CI runners. To run systemd in docker you either need to run privileged docker instances or to run them with the --cap-add=SYS_ADMIN permission.

If you were trying to run a docker container that runs with systemd directly, you would do something like:

$ docker run -it --cap-add SYS_ADMIN -v /sys/fs/cgroup:/sys/fs/cgroup:ro debian-systemd

I tried replicating this behavior with the Gitlab runner by mounting the right volumes in the runner and giving it the right cap permissions.

The thing is, normally your docker container runs an entrypoint command such as CMD ["/lib/systemd/systemd"]. To run its CI scripts, the Gitlab runner takes that container but replaces the entrypoint command by:

sh -c 'if [ -x /usr/local/bin/bash ]; then\n\texec /usr/local/bin/bash \nelif [ -x /usr/bin/bash ]; then\n\texec /usr/bin/bash \nelif [ -x /bin/bash ]; then\n\texec /bin/bash \nelif [ -x /usr/local/bin/sh ]; then\n\texec /usr/local/bin/sh \nelif [ -x /usr/bin/sh ]; then\n\texec /usr/bin/sh \nelif [ -x /bin/sh ]; then\n\texec /bin/sh \nelse\n\techo shell not found\n\texit 1\nfi\n\n'

That is to say, it tries to run bash.

If you try to run commands that require systemd such as systemctl status, you'll end up with this error message since systemd is not running:

Failed to get D-Bus connection: Operation not permitted

Trying to run systemd manually once the container has been started won't work either, since systemd needs to be PID 1 in order to work (and PID 1 is bash). You end up with this error:

Trying to run as user instance, but the system has not been booted with systemd.
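
Both error messages come down to the same question: is systemd actually PID 1 in this container? Here is a minimal Python sketch for checking that from inside a CI job. It assumes a Linux /proc filesystem, and it relies on the check documented for sd_booted(3), namely that the directory /run/systemd/system exists. The function names and the proc_root and run_root parameters are my own, added only to make the sketch easy to test.

```python
from pathlib import Path


def pid1_name(proc_root: str = "/proc") -> str:
    """Return the command name of PID 1, or "" if it cannot be read."""
    try:
        return (Path(proc_root) / "1" / "comm").read_text().strip()
    except OSError:
        return ""


def booted_with_systemd(run_root: str = "/run") -> bool:
    """Mirror the check documented for sd_booted(3): systemd is the
    running init system if /run/systemd/system exists as a directory."""
    return (Path(run_root) / "systemd" / "system").is_dir()


if __name__ == "__main__":
    print("PID 1 is:", pid1_name() or "(unknown)")
    print("booted with systemd:", booted_with_systemd())
```

In a Gitlab docker runner job as described above, I would expect the first line to report a shell rather than systemd, which matches the errors shown.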

At this point, I came up with a bunch of creative solutions to try to bypass Gitlab's entrypoint takeover. Turns out you can tell the Gitlab runner to override the container's entrypoint with your own. Sadly, the runner then appends its long bash command right after.

For example, if you run a job with this gitlab-ci entry:

image:
  name: debian-systemd
  entrypoint: "/lib/systemd/systemd"
script:
- /usr/local/bin/my-super-script

You will get this entrypoint:

/lib/systemd/systemd sh -c 'if [ -x /usr/local/bin/bash ]; then\n\texec /usr/local/bin/bash \nelif [ -x /usr/bin/bash ]; then\n\texec /usr/bin/bash \nelif [ -x /bin/bash ]; then\n\texec /bin/bash \nelif [ -x /usr/local/bin/sh ]; then\n\texec /usr/local/bin/sh \nelif [ -x /usr/bin/sh ]; then\n\texec /usr/bin/sh \nelif [ -x /bin/sh ]; then\n\texec /bin/sh \nelse\n\techo shell not found\n\texit 1\nfi\n\n'

This obviously fails. I then tried to be clever and use this entrypoint: ["/lib/systemd/systemd", "&&"]. This does not work either, since docker requires the entrypoint to be only one command.
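To see why this fails, it may help to recall how Docker assembles the process it starts: in exec form, the entrypoint and the command are concatenated into one argument vector, so the runner's shell detection script ends up as arguments to systemd. Here is a minimal Python sketch of that composition. The function name effective_argv is made up for illustration, and the runner's script is abbreviated.

```python
def effective_argv(entrypoint, command):
    """Return the argv a container would start with.

    Assumes exec-form semantics: entrypoint and command are lists and are
    simply concatenated, entrypoint first.
    """
    return list(entrypoint) + list(command)


# Abbreviated stand-in for the runner's long shell detection script.
runner_command = ["sh", "-c", "if [ -x /bin/bash ]; then exec /bin/bash; fi"]

argv = effective_argv(["/lib/systemd/systemd"], runner_command)
print(argv[0])   # the binary that becomes PID 1
print(argv[1:])  # everything else is handed to it as arguments
```

The same reasoning applies to the "&&" attempt: with no shell involved, "&&" is just one more literal argument.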

Someone pointed out to me that you could try to use exec /lib/systemd/systemd to replace bash with systemd as PID 1, but that also fails with an error telling you the system has not been booted with systemd.

One more level down

Since it seems you can't run systemd in the Gitlab docker runner directly, why not try to run systemd in docker in docker (dind)? dind is used quite a lot in the Gitlab CI to build containers, so we thought it might work.

Sadly, we haven't been able to make this work either. You need to mount volumes in docker to run systemd properly and it seems docker doesn't like to mount volumes from a docker container that have already been mounted from the docker host... Ouf.

If you have been able to run systemd in the Gitlab docker runner, please contact me!

Paths to explore

The only Gitlab runner executor I've used at the moment is the docker one, since it's what most Gitlab instances run. I have no experience with it, but since there is also an LXC executor, it might be possible to run Gitlab CI tests with systemd this way.

18 May, 2018 04:00AM by Louis-Philippe Véronneau

Join us in Hamburg for the Hamburg Mini-DebConf!

Thanks to Debian, I have the chance to be able to attend the Hamburg Mini-DebConf, taking place in Hamburg from May 16th to May 20th. We are hosted by Dock Europe in the amazing Viktoria Kaserne building.

Viktoria Kaserne

As always, the DebConf videoteam has been hard at work! Our setup is working pretty well and we only have minor fixes to implement before the conference starts.

For those of you who couldn't attend the mini-conf, you can watch the live stream here. Videos will be uploaded shortly after to the DebConf video archive.

Olasd resting on our makeshift cubes podium

18 May, 2018 04:00AM by Louis-Philippe Véronneau

hackergotchi for Norbert Preining

Norbert Preining

Docker, cron, environment variables, and Kubernetes

I recently mentioned that I am running cron in some of the docker containers I need for a new service. Now that we moved to Kubernetes and Rancher for deployment, I moved most of the configuration into Kubernetes ConfigMaps, and expose the key/value pairs there as environment variables. Sounded like a good idea, but …

but well, reading the documentation would have helped. Cron scripts do not see the normal environment of the surrounding process (cron, init, whatever), but get a cleaned up environment. As a consequence, none of the configuration keys available in the environment showed up in the cron jobs, and so they failed badly, of course 😉

After some thinking and reading, I came up with two solutions, one “Dirty Harry^WNorby” solution and one clean and nice, but annoying solution.

Dirty Harry^WNorby solution

What is available in the environment of the cron jobs is minimal, and in fact more or less what is defined in /etc/environment and stuff set by the shell (if it is a shell script). So the solution was adding the necessary variable definitions to /etc/environment in a way that they are properly set. For that, I added the following code in the start-syslog-cron script that is the entry point of the container:

# prepare for export of variables to cron jobs
if [ -r /env-vars-to-be-exported ]
then
  for i in `cat /env-vars-to-be-exported`
  do
    # ${!i} is bash indirect expansion: the value of the variable named by $i
    echo "$i=${!i}" >> /etc/environment
  done
fi

Meaning, if the container contains a file /env-vars-to-be-exported then the lines of it are considered variable names and are set in the environment with the respective values at the time of invocation.

Using this quick and dirty trick it is now dead-easy to get the ConfigMap variables into the cron job’s environment by adding the necessary variable names to the file /env-vars-to-be-exported. Thus, no adaptation of the original source code was necessary – a big plus!

Be warned, there is no error checking etc, so one can mess up the container quite easily 😉
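For comparison, here is a rough Python sketch of the same idea with a bit of checking added. It is not what runs in my container. The function name environment_lines and the rules for what gets skipped (invalid names, unset variables, values containing newlines) are assumptions made for illustration.

```python
import os
import re

# Variable names: letters, digits and underscore, not starting with a digit.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def environment_lines(names, environ=None):
    """Build NAME=value lines for the names that are valid and set.

    Returns (lines, skipped) so the caller can log what was ignored.
    """
    environ = os.environ if environ is None else environ
    lines, skipped = [], []
    for name in names:
        name = name.strip()
        if not name:
            continue
        value = environ.get(name)
        if not _NAME_RE.fullmatch(name) or value is None or "\n" in value:
            skipped.append(name)
            continue
        lines.append(f"{name}={value}")
    return lines, skipped


lines, skipped = environment_lines(
    ["DB_HOST", "9BAD", "NOT_SET"],
    {"DB_HOST": "db.example.org"},
)
print(lines)
print(skipped)
```

The returned lines could then be appended to /etc/environment by the entry point script, and the skipped names written to the log.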

Standards solution

The more standard and clean solution is mounting the ConfigMap and reading the values from the exported files. This is possible, has the big advantage that one can change the values without restarting the containers (mounted ConfigMaps are updated when the ConfigMaps are changed – besides a few corner cases), and no nasty trickery in the initialization.

The disadvantage is that the code of the cron jobs needs to be changed to read the variables from the config files instead of the environment.
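As a rough idea of what that change could look like, here is a small Python sketch that reads a mounted ConfigMap directory into a dictionary. It assumes the usual volume layout with one file per key, and it skips entries starting with a dot, which Kubernetes uses for its own bookkeeping. The function name read_mounted_config is made up for illustration.

```python
import os


def read_mounted_config(directory):
    """Read a directory with one file per key into a dict.

    Assumes the layout of a ConfigMap mounted as a volume: each key is a
    file whose content is the value. Entries starting with a dot are skipped.
    """
    config = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.startswith(".") or not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8") as f:
            config[name] = f.read().rstrip("\n")
    return config
```

A cron job would call this with whatever mount path the deployment uses and look up its settings in the returned dictionary. Reading the files on every run means changed values are picked up without a restart.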

18 May, 2018 01:04AM by Norbert Preining

May 16, 2018

hackergotchi for Jonathan McDowell

Jonathan McDowell

Home Automation: Raspberry Pi as MQTT temperature sensor

After setting up an MQTT broker I needed some data to feed it. It made sense to start basic and gradually build up bits and pieces that would form a bigger home automation setup. As it happened I have an old Raspberry Pi B (original rev 1 [2 if you look at /proc/cpuinfo] with 256MB RAM) and some DS18B20 1-Wire temperature sensors lying around, so I decided to make a heavyweight temperature sensor (long term I’m hoping to do something with some ESP8266s).

There are plenty of guides out there about hooking up the DS18B20 to the Pi; Adafruit has a reasonable one. The short version is that GPIO4 can be easily configured to be a 1-Wire bus and you hook the DS18B20 up with a 4k7Ω resistor across the data + 3v3 power pins. An initial check can be performed by enabling the DT overlay on the fly:

sudo dtoverlay w1-gpio

Detection of 1-Wire devices is automatic so you should see an entry in dmesg looking like:

w1_master_driver w1_bus_master1: Attaching one wire slave 28.012345678abcd crc ef

You can then do

$ cat /sys/bus/w1/devices/28-*/w1_slave
1e 01 4b 46 7f ff 0c 10 18 : crc=18 YES
1e 01 4b 46 7f ff 0c 10 18 t=17875

Which shows a current temperature of 17.875°C in my study. Once that’s working (and you haven’t swapped GND and DATA like I did on the first go) you can make the Pi boot up with 1-Wire enabled by adding a dtoverlay=w1-gpio line to /boot/config.txt. The next step is to get that fed into the MQTT broker. A simple Python client seemed like the right approach. Debian has paho-mqtt but sadly not in a stable release. Thankfully the python3-paho-mqtt 1.3.1-1 package in testing installed just fine on the Raspbian stretch image my Pi is running. I dropped the following in /usr/local/bin/mqtt-temp:


#!/usr/bin/python3

import glob
import time
import paho.mqtt.publish as publish

Broker = 'mqtt-host'
auth = {
    'username': 'user2',
    'password': 'bar',
}

pub_topic = 'test/temperature'

base_dir = '/sys/bus/w1/devices/'
device_folder = glob.glob(base_dir + '28-*')[0]
device_file = device_folder + '/w1_slave'

def read_temp():
    valid = False
    temp = 0
    with open(device_file, 'r') as f:
        for line in f:
            if line.strip()[-3:] == 'YES':
                valid = True
            temp_pos = line.find(' t=')
            if temp_pos != -1:
                temp = float(line[temp_pos + 3:]) / 1000.0

    if valid:
        return temp
    else:
        return None

while True:
    temp = read_temp()
    if temp is not None:
        publish.single(pub_topic, str(temp),
                hostname=Broker, port=8883,
                auth=auth, tls={})
    # Wait before taking the next reading (interval chosen arbitrarily)
    time.sleep(60)

And finished it off with a systemd unit file - I know a lot of people complain about systemd, but it really does make it easy to just spin up a minimal service as a unique non-privileged user. The following went in /etc/systemd/system/mqtt-temp.service:

Description=MQTT Temperature sensor

# Hack because Python can't cope with a DynamicUser with no HOME

RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX


Start it up and enable for subsequent reboots:

systemctl start mqtt-temp
systemctl enable mqtt-temp

And then watch on my Debian test box as before:

$ mosquitto_sub -h mqtt-host -p 8883 --capath /etc/ssl/certs/ -v -t '#' -u user1 -P foo
test/temperature 17.875
test/temperature 17.937

16 May, 2018 08:16PM

hackergotchi for Jonathan Carter

Jonathan Carter

Video Channel Updates

Last month, I started doing something that I’ve been meaning to do for years, and that’s to start a video channel and make some free software related videos.

I started out uploading to my YouTube channel which has been dormant for a really long time, and then last week, I also uploaded my videos to my own site. It’s a MediaDrop instance, a video hosting platform written in Python.

I’ll still keep uploading to YouTube, but ultimately I’d like to make my self-hosted site the primary source for my content. Not sure if I’ll stay with MediaDrop, but it does tick a lot of boxes, and if it’s easy enough to extend, I’ll probably stick with it. MediaDrop might also be a good platform for viewing Debian meeting videos like the DebConf videos.

My current topics are very much Debian related, but that doesn’t exclude any other types of content from being included in the future. Here’s what I have so far:

  • Video Logs: Almost like a blog, in video format.
  • Howto: Howto videos.
  • Debian Package of the Day: Exploring packages in the Debian archive.
  • Debian Package Management: Howto series on Debian package management, a precursor to a series that I’ll do on Debian packaging.
  • What’s the Difference: Just comparing 2 or more things.
  • Let’s Internet: Read stuff from Reddit, Slashdot, Quora, blogs and other media.

It’s still early days and there’s a bunch of ideas that I still want to implement, so the content will hopefully get a lot better as time goes on.

I also quit Facebook last month, so I dusted off my old Mastodon account and started posting there again:

You can also subscribe to my videos via RSS:

Other than that I’m open to ideas, thanks for reading :)

16 May, 2018 06:19PM by jonathan

hackergotchi for Shirish Agarwal

Shirish Agarwal

FOSS game community slump and question about getting images in palapeli

There is a thread which I have been following for the past few weeks.

In the back-and-forth argument there, I believe most of the arguments shared were somewhat wrong.

While we have AAA projects like 0ad and others, the mainstay of our games should be ones which don’t need any high-quality textures and still do the work.

I have been looking at a Let’s play playlist of an indie gem called ‘Dead in Vinland’

Youtube playlist link –

If you look at the game, it doesn’t have much in terms of animation apart from bits of role-playing in encounters, but is more oriented towards the roll of the dice.

The characters in the game are sort of cut-out characters, very much like the cut-out cardboard/paperboard characters that we used to play with as children.

Where the game innovates is more in terms of an expansive dialog-tree and at most 100-200 images of the characters. It basically has a group of 4 permanent characters; if any one of them dies, the gamer is defeated.

If anybody has played rogue or any of the games in the new Debian games-rogue task, they would know that FOSS shows a variety of stories even in the RPG world.

$ aptitude show games-rogue
Package: games-rogue
Version: 2.3
State: installed
Automatically installed: no
Priority: optional
Section: metapackages
Maintainer: Debian Games Team
Architecture: all
Uncompressed Size: 24.6 k
Depends: games-tasks (= 2.3)
Recommends: allure, angband, crawl, gearhead, gearhead2, hearse, hyperrogue, lambdahack, meritous, moria, nethack-x11, omega-rpg, slashem
Description: Debian's roguelike games
This metapackage will install dungeon crawling games in the spirit of Rogue.

I took RPGs as the example since those are the kind of games I have always liked, and within those the turn-based RPGs, although I do hate the fact that I can’t save-scum as in most traditional RPGs.

Variety and Palapeli

I now turn my attention to a package called variety.

While looking at it, I also filed a wishlist bug for packaging the new version, which would fix a bug I reported about a month back. It was in the same conversation that I came to know that the upstream releaser and the downstream maintainer are one and the same.

I have also been talking with upstream about various features or what could be done to make variety better.

Now while that’s all well and good and variety does a good job of being a wallpaper changer, I need and want the wallpapers which are the output of variety to be the input of palapeli, BUT without the lots of manual intervention it currently requires. Variety does a pretty good job of providing good quality wallpapers and even goes beyond that by giving plenty of info on the images, as it saves the metadata of the images, unlike many online image services. To make things easier for myself, I made a directory called Variety in Pictures and copied everything from ~/.config/variety/Favorites by doing –

~/.config/variety/Favorites$ cp -r --preserve . /home/shirish/Pictures/Variety/

And this method works out fine enough. The --preserve option is essential, as can be seen from the cp manpage –

preserve the specified attributes (default: mode,ownership,timestamps), if possible additional
attributes: context, links, xattr, all

Even the metadata of the image is pretty good enough as can be seen from any random picture –

~/Pictures/Variety$ exiftool 7527881664_024e44f8bf_o.jpg
ExifTool Version Number : 10.96
File Name : 7527881664_024e44f8bf_o.jpg
Directory : .
File Size : 1175 kB
File Modification Date/Time : 2018:04:09 03:56:15+05:30
File Access Date/Time : 2018:05:14 22:59:22+05:30
File Inode Change Date/Time : 2018:05:15 08:01:42+05:30
File Permissions : rw-r--r--
File Type : JPEG
File Type Extension : jpg
MIME Type : image/jpeg
Exif Byte Order : Little-endian (Intel, II)
Image Description :
Make : Canon
Camera Model Name : Canon EOS 5D
X Resolution : 240
Y Resolution : 240
Resolution Unit : inches
Software : Adobe Photoshop Lightroom 3.6 (Windows)
Modify Date : 2012:07:08 17:58:13
Artist : Peter Levi
Copyright : Peter Levi
Exposure Time : 1/50
F Number : 7.1
Exposure Program : Aperture-priority AE
ISO : 160
Exif Version : 0230
Date/Time Original : 2011:05:03 12:17:50
Create Date : 2011:05:03 12:17:50
Shutter Speed Value : 1/50
Aperture Value : 7.1
Exposure Compensation : 0
Max Aperture Value : 4.0
Metering Mode : Multi-segment
Flash : Off, Did not fire
Focal Length : 45.0 mm
User Comment :
Focal Plane X Resolution : 3086.925795
Focal Plane Y Resolution : 3091.295117
Focal Plane Resolution Unit : inches
Custom Rendered : Normal
Exposure Mode : Auto
White Balance : Auto
Scene Capture Type : Standard
Owner Name : Tsvetan ROUSTCHEV
Serial Number : 1020707385
Lens Info : 24-105mm f/?
Lens Model : EF24-105mm f/4L IS USM
XP Comment :
Compression : JPEG (old-style)
Thumbnail Offset : 908
Thumbnail Length : 18291
XMP Toolkit : XMP Core 4.4.0-Exiv2
Creator Tool : Adobe Photoshop Lightroom 3.6 (Windows)
Metadata Date : 2012:07:08 17:58:13+03:00
Lens : EF24-105mm f/4L IS USM
Image Number : 1
Flash Compensation : 0
Firmware : 1.1.1
Format : image/jpeg
Version : 6.6
Process Version : 5.7
Color Temperature : 4250
Tint : +8
Exposure : 0.00
Shadows : 2
Brightness : +65
Contrast : +28
Sharpness : 25
Luminance Smoothing : 0
Color Noise Reduction : 25
Chromatic Aberration R : 0
Chromatic Aberration B : 0
Vignette Amount : 0
Shadow Tint : 0
Red Hue : 0
Red Saturation : 0
Green Hue : 0
Green Saturation : 0
Blue Hue : 0
Blue Saturation : 0
Fill Light : 0
Highlight Recovery : 23
Clarity : 0
Defringe : 0
Gray Mixer Red : -8
Gray Mixer Orange : -17
Gray Mixer Yellow : -21
Gray Mixer Green : -25
Gray Mixer Aqua : -19
Gray Mixer Blue : +8
Gray Mixer Purple : +15
Gray Mixer Magenta : +4
Split Toning Shadow Hue : 0
Split Toning Shadow Saturation : 0
Split Toning Highlight Hue : 0
Split Toning Highlight Saturation: 0
Split Toning Balance : 0
Parametric Shadows : 0
Parametric Darks : 0
Parametric Lights : 0
Parametric Highlights : 0
Parametric Shadow Split : 25
Parametric Midtone Split : 50
Parametric Highlight Split : 75
Sharpen Radius : +1.0
Sharpen Detail : 25
Sharpen Edge Masking : 0
Post Crop Vignette Amount : 0
Grain Amount : 0
Color Noise Reduction Detail : 50
Lens Profile Enable : 1
Lens Manual Distortion Amount : 0
Perspective Vertical : 0
Perspective Horizontal : 0
Perspective Rotate : 0.0
Perspective Scale : 100
Convert To Grayscale : True
Tone Curve Name : Medium Contrast
Camera Profile : Adobe Standard
Camera Profile Digest : 9C14C254921581D1141CA0E5A77A9D11
Lens Profile Setup : LensDefaults
Lens Profile Name : Adobe (Canon EF 24-105mm f/4 L IS USM)
Lens Profile Filename : Canon EOS-1Ds Mark III (Canon EF 24-105mm f4 L IS USM) - RAW.lcp
Lens Profile Digest : 0387279C5E7139287596C051056DCFAF
Lens Profile Distortion Scale : 100
Lens Profile Chromatic Aberration Scale: 100
Lens Profile Vignetting Scale : 100
Has Settings : True
Has Crop : False
Already Applied : True
Document ID : xmp.did:8DBF8F430DC9E11185FD94228EB274CE
Instance ID : xmp.iid:8DBF8F430DC9E11185FD94228EB274CE
Original Document ID : xmp.did:8DBF8F430DC9E11185FD94228EB274CE
Source URL :
Source Type : flickr
Author : peter-levi
Source Name : Flickr
Image URL :
Author URL :
Source Location :;user_id:93647178@N00;
Creator : Peter Levi
Rights : Peter Levi
Subject : paris, france, lyon, lyonne
Tone Curve : 0, 0, 32, 22, 64, 56, 128, 128, 192, 196, 255, 255
History Action : derived, saved
History Parameters : converted from image/x-canon-cr2 to image/jpeg, saved to new location
History Instance ID : xmp.iid:8DBF8F430DC9E11185FD94228EB274CE
History When : 2012:07:08 17:58:13+03:00
History Software Agent : Adobe Photoshop Lightroom 3.6 (Windows)
History Changed : /
Derived From :
Displayed Units X : inches
Displayed Units Y : inches
Current IPTC Digest : 044f85a540ebb92b9514cab691a3992d
Coded Character Set : UTF8
Application Record Version : 4
Date Created : 2011:05:03
Time Created : 12:17:50+03:00
Digital Creation Date : 2011:05:03
Digital Creation Time : 12:17:50+03:00
By-line : Peter Levi
Copyright Notice : Peter Levi
Headline : IMG_0779
Keywords : paris, france, lyon, lyonne
Caption-Abstract :
Photoshop Thumbnail : (Binary data 18291 bytes, use -b option to extract)
IPTC Digest : d8a6ddc0b5eacb05874d4b676f4cb439
Profile CMM Type : Linotronic
Profile Version : 2.1.0
Profile Class : Display Device Profile
Color Space Data : RGB
Profile Connection Space : XYZ
Profile Date Time : 1998:02:09 06:49:00
Profile File Signature : acsp
Primary Platform : Microsoft Corporation
CMM Flags : Not Embedded, Independent
Device Manufacturer : Hewlett-Packard
Device Model : sRGB
Device Attributes : Reflective, Glossy, Positive, Color
Rendering Intent : Perceptual
Connection Space Illuminant : 0.9642 1 0.82491
Profile Creator : Hewlett-Packard
Profile ID : 0
Profile Copyright : Copyright (c) 1998 Hewlett-Packard Company
Profile Description : sRGB IEC61966-2.1
Media White Point : 0.95045 1 1.08905
Media Black Point : 0 0 0
Red Matrix Column : 0.43607 0.22249 0.01392
Green Matrix Column : 0.38515 0.71687 0.09708
Blue Matrix Column : 0.14307 0.06061 0.7141
Device Mfg Desc : IEC
Device Model Desc : IEC 61966-2.1 Default RGB colour space - sRGB
Viewing Cond Desc : Reference Viewing Condition in IEC61966-2.1
Viewing Cond Illuminant : 19.6445 20.3718 16.8089
Viewing Cond Surround : 3.92889 4.07439 3.36179
Viewing Cond Illuminant Type : D50
Luminance : 76.03647 80 87.12462
Measurement Observer : CIE 1931
Measurement Backing : 0 0 0
Measurement Geometry : Unknown
Measurement Flare : 0.999%
Measurement Illuminant : D65
Technology : Cathode Ray Tube Display
Red Tone Reproduction Curve : (Binary data 2060 bytes, use -b option to extract)
Green Tone Reproduction Curve : (Binary data 2060 bytes, use -b option to extract)
Blue Tone Reproduction Curve : (Binary data 2060 bytes, use -b option to extract)
DCT Encode Version : 100
APP14 Flags 0 : [14]
APP14 Flags 1 : (none)
Color Transform : YCbCr
Image Width : 1920
Image Height : 1280
Encoding Process : Baseline DCT, Huffman coding
Bits Per Sample : 8
Color Components : 3
Y Cb Cr Sub Sampling : YCbCr4:4:4 (1 1)
Aperture : 7.1
Date/Time Created : 2011:05:03 12:17:50+03:00
Digital Creation Date/Time : 2011:05:03 12:17:50+03:00
Image Size : 1920x1280
Megapixels : 2.5
Scale Factor To 35 mm Equivalent: 1.0
Shutter Speed : 1/50
Thumbnail Image : (Binary data 18291 bytes, use -b option to extract)
Circle Of Confusion : 0.030 mm
Field Of View : 43.5 deg
Focal Length : 45.0 mm (35 mm equivalent: 45.1 mm)
Hyperfocal Distance : 9.51 m
Lens ID : Canon EF 24-105mm f/4L IS USM
Light Value : 10.6

One of the things I found wanting is a field for the license, such as CC-SA 3.0 (Creative Commons – Share Alike 3.0); I dunno whether such a field exists. It does have everything else needed to know how a particular picture was taken.

I use the images I found as input in palapeli.

$ aptitude show palapeli
Package: palapeli
Version: 4:17.12.2-1
State: installed
Automatically installed: no
Priority: optional
Section: games
Maintainer: Debian/Kubuntu Qt/KDE Maintainers
Architecture: amd64
Uncompressed Size: 1,138 k
Depends: palapeli-data (>= 4:17.12.2-1), kio, libc6 (>= 2.14), libkf5archive5 (>= 4.96.0), libkf5completion5 (>= 4.97.0), libkf5configcore5 (>= 4.98.0), libkf5configgui5 (>= 4.97.0), libkf5configwidgets5 (>= 4.96.0), libkf5coreaddons5(>= 4.100.0), libkf5crash5 (>= 5.15.0), libkf5i18n5 (>= 4.97.0), libkf5itemviews5 (>= 4.96.0), libkf5kiowidgets5(>= 4.99.0), libkf5notifications5 (>= 4.96.0), libkf5service-bin, libkf5service5 (>= 4.99.0), libkf5widgetsaddons5 (>= 4.96.0), libkf5xmlgui5 (>= 4.98.0), libqt5core5a (>= 5.9.0~beta), libqt5gui5 (>=5.8.0), libqt5svg5 (>= 5.6.0~beta), libqt5widgets5 (>= 5.7.0~), libstdc++6 (>= 4.4.0)
Recommends: khelpcenter, qhull-bin
Description: jigsaw puzzle game
Palapeli is a jigsaw puzzle game. Unlike other games in that genre, you are not limited to aligning pieces on imaginary
grids. The pieces are freely moveable.

Palapeli is the Finnish word for jigsaw puzzle.

This package is part of the KDE games module.

Now to make a new puzzle for palapeli, it needs three pieces of information at the very least:

a. The name of the image. For many images you need to make one up, as many a time photographers are either not imaginative or do not have the time to give a meaningful descriptive name. I have had an interesting discussion on a similar topic with the authors of palapeli. There is a lot to do in getting descriptions right. At least in free software, I dunno of any way to get images processed so that descriptions can be found out.

b. A comment – This is optional, but if you have captured a memorable image, this is the best way to describe it. I hate to bring up Bing wallpapers, but I have seen that they have some of the best descriptions, which show what I’m trying to say –

shirish@debian:~/Pictures/Variety$ exiftool ManateeMom_EN-US9983570199_1920x1080.jpg | grep Caption-Abstract
Caption-Abstract : West Indian manatee mom and baby at Three Sisters Springs, Florida (© James R.D. Scott/Getty Images)
shirish@debian:~/Pictures/Variety$ exiftool ManateeMom_EN-US9983570199_1920x1080.jpg | grep Comment
User Comment : West Indian manatee mom and baby at Three Sisters Springs, Florida (© James R.D. Scott/Getty Images)
XP Comment :
Comment : West Indian manatee mom and baby at Three Sisters Springs, Florida (© James R.D. Scott/Getty Images)

c. Name of the Author – Many a time I just write either ‘nameless’ or ‘unknown’ as the author hasn’t taken any pains to share who they are.

The reason I have talked about palapeli is that they are the only ones I know of who have done extensive work on getting unique slicers, so that the shapes of the puzzle pieces all come out different. But it’s still a trial and error method as it doesn’t have a preview mode.

While there’s a lot that could be improved in both the above projects, I would be happy if, at the moment, there were just a script which could take the images as input, add or fib details (although best would be to have actual details) and do the work.

I am sure if I put my mind to it, I’ll probably find ways in which even exiftool can be improved but can’t thank enough the people who have made such a wonderful tool.

For instance, while looking at the metadata of quite a few pictures, I found that many people use arbitrary fields to tell where the picture was shot: some use Headline, some use Comment or Description, while others use Subject. If somebody wanted to write a script, it would be difficult, as there may be images which have all of these fields with different content in each, (possibly) in a context known only to the author.
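To make the problem concrete, here is a rough Python sketch of the field-picking part of such a script. It assumes the metadata has already been loaded into a dictionary shaped like one entry of exiftool -json output, with tag names as keys. The exact key names and their order of preference are my guesses and should be checked against real output; the function names are made up. It does not talk to palapeli at all.

```python
def first_non_empty(metadata, candidates):
    """Return the first candidate tag with a non-blank value, else None."""
    for tag in candidates:
        value = metadata.get(tag)
        if value is not None and str(value).strip():
            return str(value).strip()
    return None


# Candidate tags in order of preference (an assumption, adjust to taste).
NAME_TAGS = ["Headline", "Title"]
COMMENT_TAGS = ["Caption-Abstract", "ImageDescription", "UserComment", "Comment"]
AUTHOR_TAGS = ["Artist", "By-line", "Creator"]


def puzzle_details(metadata, fallback_name):
    """Collect the three pieces of information a new puzzle needs."""
    return {
        "name": first_non_empty(metadata, NAME_TAGS) or fallback_name,
        "comment": first_non_empty(metadata, COMMENT_TAGS) or "",
        "author": first_non_empty(metadata, AUTHOR_TAGS) or "unknown",
    }


sample = {
    "Headline": "IMG_0779",
    "Caption-Abstract": "",
    "By-line": "Peter Levi",
    "Artist": "Peter Levi",
}
print(puzzle_details(sample, "7527881664_024e44f8bf_o"))
```

A fixed preference order like this is exactly where the trouble starts: when several fields are filled with different content, the script has no way of knowing which one the author meant.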

The good thing is we have got the latest version on Debian testing –

$ exiftool -ver
10.96

Peace out.

16 May, 2018 03:16PM by shirishag75

hackergotchi for Jonathan Dowland

Jonathan Dowland

Imaging DVD-Rs, Step 2: Initial Import

This is part 2 in a series about a project to read/import a large collection of home-made optical media. Part 1 was Imaging DVD-Rs: Overview and Step 1; the summary page for the whole project is imaging discs.

Last time we prepared for the import by gathering all our discs together and organising storage for them in two senses: real-world (i.e. spindles) and a more future-proof digital storage system for the data, in my case, a NAS. This time we're actually going to read some discs. I suggest doing a quick first pass over your collection to image all the trouble-free discs (and identify the ones that are going to be harder to read). We will return to the troublesome ones in a later part.

For reading home-made optical discs, you could simply use cp:

cp /dev/sr0 disc-image.iso

This has the attraction of being a very simple solution but I don't recommend it, because of a lack of options for error handling. Instead I recommend using GNU ddrescue. It is designed to be fault tolerant and retries bad sectors in various ways to try and coax every last byte out of the medium. Crucially, a partially imported disc image can be further added to by subsequent runs of ddrescue, even on a separate computer.

For the first import, I recommend the suggested options from the ddrescue manual:

ddrescue -n -b2048 /dev/cdrom cdimage.iso cdimage.log

This will create a cdimage.iso file, hopefully containing your data, and a map file cdimage.log, describing what ddrescue managed to achieve. You should archive both!

This will either complete reasonably quickly (within one to two minutes), or will run potentially indefinitely. Once you've got a feel for how long a successful extraction takes, I'd recommend terminating any attempt that lasts much longer than that, and putting those discs to one side in a "needs attention" pile, to be re-attempted later. If ddrescue does finish, it will tell you if it couldn't read any of the disc. If so, put that disc in the "needs attention" pile too.
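If you end up with many map files, sorting discs into the two piles can be partly automated. Below is a small Python sketch that sums block sizes per status character from the text of a map file. It assumes the layout I understand the map file to have: comment lines starting with #, one status line, then data lines of the form "pos size status" with hexadecimal numbers, where + marks finished blocks. The function names are my own, and the sample is hand-written rather than real output.

```python
def summarise_mapfile(text):
    """Sum block sizes per status character from ddrescue map file text.

    Assumed layout: '#' comment lines, then one status line, then data
    lines of the form 'pos size status' with hexadecimal pos and size.
    """
    totals = {}
    seen_status_line = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not seen_status_line:
            seen_status_line = True  # skip the current position/status line
            continue
        fields = line.split()
        if len(fields) != 3:
            continue
        _pos, size, status = fields
        totals[status] = totals.get(status, 0) + int(size, 16)
    return totals


def needs_attention(totals):
    """True if any part of the disc is not marked as finished ('+')."""
    return any(size > 0 for status, size in totals.items() if status != "+")


sample = """\
# Mapfile. Created by GNU ddrescue
# current_pos  current_status  current_pass
0x00020000     +               1
#      pos        size  status
0x00000000  0x00010000  +
0x00010000  0x00000800  -
0x00010800  0x0000F800  +
"""
totals = summarise_mapfile(sample)
print(totals, needs_attention(totals))
```

Treat this as a convenience only; the summary ddrescue prints at the end of a run remains the authoritative answer.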

commercially-made discs

Above, I wrote that I recommend this approach for home-made data discs. Broadly, I am assuming that such discs use a limited set of options and features available to disc authors: they'll either be single session, or multisession but you aren't interested in any files that are masked by later sessions; they won't be mixed mode (no Audio tracks); there won't be anything unusual or important stored in the disc metadata, title, or subcodes; etcetera.

This is not always the case for commercial discs, or audio CDs or video DVDs. For those, you may wish to recover more information than is available to you via ddrescue. These aren't my focus right now, so I don't have much advice on how to handle them, although I might in the future.

labelling and storing images

If your discs are labelled as poorly or inconsistently as mine, it might not be obvious what filename to give each disc image. For my project I decided to append a new label to all imported discs, something like "blahX", where X is an incrementing number. So, for a fifth disc being imported with the label "my files", the image name would be my_files.blah5.iso. If you are keeping the physical discs after importing them, you could also mark the disc with "blah5".
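If you want to generate these names rather than type them, a helper along these lines would do. It is only a sketch: the function name image_name is made up, and the clean-up rules for the label (whitespace becomes underscores, unusual characters are dropped) are my own choice rather than part of the scheme.

```python
import re


def image_name(label, number, tag="blah"):
    """Build an image filename such as 'my_files.blah5.iso'.

    Whitespace in the label becomes underscores; characters other than
    letters, digits, '_', '-' and '.' are dropped.
    """
    cleaned = re.sub(r"\s+", "_", label.strip())
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "", cleaned)
    stem = f"{cleaned}." if cleaned else ""
    return f"{stem}{tag}{number}.iso"


print(image_name("my files", 5))
```

Discs with no usable label at all simply get the tag and number on their own.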

where are we now

You should now have a pile of discs that you have successfully imported, a corresponding collection of disc image files/ddrescue log file pairs, and possibly a pile of "needs attention" discs.

In future parts, we will look at how to explore what's actually on the discs we have imaged: how to handle partially read or corrupted disc images; how to map the files on a disc to the sectors you have read, to identify which files are corrupted; and how to try to coax successful reads out of troublesome discs.

16 May, 2018 01:19PM

May 15, 2018

hackergotchi for Raphaël Hertzog

Raphaël Hertzog

Freexian’s report about Debian Long Term Support, April 2018

Like each month, here comes a report about the work of paid contributors to Debian LTS.

Individual reports

In April, about 183 work hours have been dispatched among 13 paid contributors. Their reports are available:

  • Abhijith PA did 5 hours (out of 10 hours allocated, thus keeping 5 extra hours for May).
  • Antoine Beaupré did 12h.
  • Ben Hutchings did 17 hours (out of 15h allocated + 2 remaining hours).
  • Brian May did 10 hours.
  • Chris Lamb did 16.25 hours.
  • Emilio Pozuelo Monfort did 11.5 hours (out of 16.25 hours allocated + 5 remaining hours, thus keeping 9.75 extra hours for May).
  • Holger Levsen did nothing (out of 16.25 hours allocated + 16.5 hours remaining, thus keeping 32.75 extra hours for May). He did not get hours allocated for May and is expected to catch up.
  • Hugo Lefeuvre did 20.5 hours (out of 16.25 hours allocated + 4.25 remaining hours).
  • Markus Koschany did 16.25 hours.
  • Ola Lundqvist did 11 hours (out of 14 hours allocated + 9.5 remaining hours, thus keeping 12.5 extra hours for May).
  • Roberto C. Sanchez did 7 hours (out of 16.25 hours allocated + 15.75 hours remaining, but immediately gave back the 25 remaining hours).
  • Santiago Ruano Rincón did 8 hours.
  • Thorsten Alteholz did 16.25 hours.

Evolution of the situation

The number of sponsored hours did not change. But a few sponsors interested in having more than 5 years of support should join LTS next month since this was a pre-requisite to benefit from extended LTS support. I did update Freexian’s website to show this as a benefit offered to LTS sponsors.

The security tracker currently lists 20 packages with a known CVE and the dla-needed.txt file 16. Two weeks from Wheezy’s end-of-life, the number of open issues is close to a historical low.

Thanks to our sponsors

New sponsors are in bold.


15 May, 2018 03:32PM by Raphaël Hertzog

hackergotchi for Norbert Preining

Norbert Preining

Specification and Verification of Software with CafeOBJ – Part 3 – First steps with CafeOBJ

This blog continues Part 1 and Part 2 of our series on software specification and verification with CafeOBJ.

We will go through basic operations like starting and stopping the CafeOBJ interpreter, getting help, doing basic computations.

Starting and leaving the interpreter

If CafeOBJ is properly installed, a call to cafeobj will greet you with information about the current version of CafeOBJ, as well as build dates and which build system has been used. The following is what is shown on my Debian system with the latest version of CafeOBJ installed:

$ cafeobj
-- loading standard prelude

            -- CafeOBJ system Version 1.5.7(PigNose0.99) --
                   built: 2018 Feb 26 Mon 6:01:31 GMT
                         prelude file: std.bin
                      2018 Apr 19 Thu 2:20:40 GMT
                            Type ? for help
                  -- Containing PigNose Extensions --
                             built on SBCL

After the initial information there is the prompt CafeOBJ> indicating that the interpreter is ready to process your input. By default several files (the prelude, as it is called above) are loaded, which define certain basic sorts and operations.

If you have had enough of playing around, simply press Ctrl-D (the Control key and d at the same time), or type in quit:

CafeOBJ> quit

Getting help

Besides the extensive documentation available at the website (reference manual, user manual, tutorials, etc), the reference manual is also always at your fingertips within the CafeOBJ interpreter using the ? group of commands:

  • ? – gives general help
  • ?com class – shows available commands classified by ‘class’
  • ? name – gives the reference manual entry for name
  • ?ex name – gives examples (if available) for name
  • ?ap name – (apropos) searches the reference manual for appearances of name

To give an example of the usage, let us search for the term operator and then look at the documentation concerning one of the matches:

CafeOBJ> ?ap op
Found the following matches:
 . `:theory  :  ->  { assoc | comm | id:  }`
 . `op  :  ->  {  }`
 . on-the-fly declaration

CafeOBJ> ? op
`op  :  ->  {  }`
Defines an operator by its domain, co-domain, and the term construct.
`` is a space separated list of sort names, `` is a
single sort name. 

I have shortened the output a bit indicated by ....

Simple computations

By default, CafeOBJ is just a barren landscape, meaning that there are no rules or axioms active. Everything is encapsulated into so-called modules (which in mathematical terms are definitions of order-sorted algebras). One of these modules is NAT, which allows computations in the natural numbers. To activate a module we use open:

CafeOBJ> open NAT .

The ... again indicate quite some output of the CafeOBJ interpreter loading additional files.

There are two things to note in the above:

  • One finishes a command with a literal dot . – this is necessary due to the completely free syntax of the CafeOBJ language and indicates the end of a statement, similar to semicolons in other programming languages.
  • The prompt has changed to %NAT> to indicate that the playground (context) in which we are currently working is the natural numbers.

To actually carry out computations we use the command red or reduce. Recall from the previous post that the computational model of CafeOBJ is rewriting, and in this setting reduce means kicking off the rewrite process. Let us do this for a simple computation:

%NAT> red 2 + 3 * 4 .
-- reduce in %NAT : (2 + (3 * 4)):NzNat
(14):NzNat
(0.0000 sec for parse, 0.0000 sec for 2 rewrites + 2 matches)


Things to note in the above output:

  • Correct operator precedence: CafeOBJ correctly computes 14 due to the proper use of operator precedence. If you want to override the parsing you can use additional parentheses.
  • CafeOBJ even gives sort (or type) information for the return value: (14):NzNat, indicating that the return value of 14 is of sort NzNat, which refers to non-zero natural numbers.
  • The interpreter tells you how much time it spent in parsing and rewriting.

If we have had enough of this playground, we close the opened module with close, which returns us to the original prompt:

%NAT> close .

Now if you think this is not so interesting, let us do some more fun things, like computation with rational numbers, which are provided by CafeOBJ in the RAT module. Rational numbers can be written as slashed expressions: a / b. If we don’t want to actually reduce a given expression, we can use parse to tell CafeOBJ to parse the next expression and give us the parsed expression together with a sort:

CafeOBJ> open RAT .
%RAT> parse 3/4 .

Again, CafeOBJ correctly determined that the given value is a non-zero rational number. More complex expressions can be parsed the same way, as well as reduced to a minimal representation:

%RAT> parse 2 / ( 4 * 3 ) .
(2 / (4 * 3)):NzRat

%RAT> red 2 / ( 4 * 3 ) .
-- reduce in %RAT : (2 / (4 * 3)):NzRat
(0.0000 sec for parse, 0.0000 sec for 2 rewrites + 2 matches)
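
As a cross-check outside CafeOBJ, the same reduction to lowest terms can be reproduced with Python's standard fractions module. This is only an illustration of what "minimal representation" means here; it says nothing about how CafeOBJ rewrites the term internally.

```python
from fractions import Fraction

# Fraction normalises to lowest terms on construction, which mirrors
# reducing 2 / (4 * 3) to its minimal representation.
value = Fraction(2, 4 * 3)

print(value)  # 1/6
```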


NAT and RAT are not the only built-in sorts; there are several more, and others can be defined easily (see the next blog post). The currently available data types, together with their sort order (recall that we are in order-sorted algebras, so one sort can contain others), are:
NzNat < Nat < NzInt < Int < NzRat < Rat
which refer to non-zero natural numbers, natural numbers, non-zero integers, integers, non-zero rational numbers, rational numbers, respectively.

Then there are other data types unrelated (not ordered) to any other:
Triv, Bool, Float, Char, String, 2Tuple, 3Tuple, 4Tuple.


CafeOBJ does not have functions in the usual sense, but operators defined via their arity and a set of (rewriting) equations. Let us take a look at two simple functions in the natural numbers: square, which takes one argument and returns the square of it, and a function sos, which takes two arguments and returns the sum of squares of the arguments. In mathematical writing: square(a) = a * a and sos(a,b) = a*a + b*b.

This can be translated into CafeOBJ code as follows (from now on I will be leaving out the prompts):

open NAT .
vars A B : Nat
op square : Nat -> Nat .
eq square(A) = A * A .
op sos : Nat Nat -> Nat .
eq sos(A, B) = A * A + B * B .

This first declares two variables A and B to be of sort Nat (note that the module names and sort names are not the same, but the module names are usually the uppercase of the sort names). Then the operator square is introduced by providing its arity. In general an operator can have several input variables, and for each of them as well as the return value we have to provide the sorts:

  op  NAME : Sort1 Sort2 ... -> Sort

defines an operator NAME with several input parameters of the given sorts, and the return sort Sort.

The next line gives one (the only necessary) equation governing the computation rules of square. Equations are introduced by eq (and some variants of it), followed by an expression, an equal sign, and another expression. This indicates that CafeOBJ may rewrite the left expression to the right expression.

In our case we told CafeOBJ that it may rewrite an expression of the form square(A) to A * A, where A can be anything of sort Nat (for now we don't go into details how order-sorted rewriting works in general).

The next two lines do the same for the operator sos.

Having this code in place, we can easily do computations with it by using the already introduced reduce command:

red square(10) .
-- reduce in %NAT : (square(10)):Nat

red sos(10,20) .
-- reduce in %NAT : (sos(10,20)):Nat
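
For readers more at home in a conventional language, the two operators correspond roughly to the following Python functions. This is a plain translation of the mathematical definitions above, not of CafeOBJ's rewriting machinery; the check for non-negative input is my stand-in for the sort Nat.

```python
def square(a: int) -> int:
    """square(a) = a * a, for a natural number a."""
    if a < 0:
        raise ValueError("Nat only covers non-negative integers")
    return a * a


def sos(a: int, b: int) -> int:
    """sos(a, b) = a*a + b*b, the sum of squares."""
    return square(a) + square(b)


print(square(10))   # 100
print(sos(10, 20))  # 500
```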

What to do if one equation does not suffice? Let us look at a typical recursive definition of the sum of natural numbers: sum(0) = 0 and for a > 0 we have sum(a) = a + sum(a-1). This can be easily translated into CafeOBJ as follows:

open NAT .
op sum : Nat -> Nat .
eq sum(0) = 0 .
eq sum(A:NzNat) = A + sum(p A) .
red sum(10) .

where p (for predecessor) indicates the next smaller natural number. This operator is only defined on non-zero natural numbers, though.

In the above fragment we also see a new style of declaring variables, on the fly: The first occurrence of a variable in an equation can carry a sort declaration, which extends all through the equation.

Running the above code we get, not surprisingly, 55:

-- reduce in %NAT : (sum(10)):Nat
(0.0000 sec for parse, 0.0000 sec for 31 rewrites + 41 matches)
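
The choice between the two equations can be imitated in Python with an explicit case split: the zero branch plays the role of sum(0) = 0, and the remaining branch that of the equation restricted to A:NzNat. The name nat_sum and the use of a - 1 for the predecessor p are choices made for this sketch only, and it makes no attempt to reproduce the rewrite count CafeOBJ reports.

```python
def nat_sum(a: int) -> int:
    """Sum of the natural numbers from 0 up to and including a."""
    if a < 0:
        raise ValueError("sum is only defined on natural numbers")
    if a == 0:
        # corresponds to: eq sum(0) = 0 .
        return 0
    # corresponds to: eq sum(A:NzNat) = A + sum(p A) .
    return a + nat_sum(a - 1)


print(nat_sum(10))  # 55
```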

As a challenge, the reader might try to give definitions of the factorial function and the Fibonacci function; the next blog post will present solutions for them.

This concludes the third part. In the next part we will look at defining modules (aka algebras aka theories) and use them to define lists.

15 May, 2018 02:38PM by Norbert Preining

Enrico Zini

Starting user software in X

There are currently many ways of starting software when a user session starts.

This is an attempt to collect a list of pointers to piece the big picture together. It's partial and some parts might be imprecise or incorrect, but it's a start, and I'm happy to keep it updated if I receive corrections.


man xsession

  • Started by the display manager (for example, via /usr/share/lightdm/lightdm.conf.d/01_debian.conf or /etc/gdm3/Xsession)
  • Debian specific
  • Runs scripts in /etc/X11/Xsession.d/
  • /etc/X11/Xsession.d/40x11-common_xsessionrc sources ~/.xsessionrc which can do little more than set env vars, because it is run at the beginning of X session startup
  • At the end, it starts the session manager (gnome-session, xfce4-session, and so on)

systemd --user

  • Started by pam_systemd, so it might not have a DISPLAY variable set in the environment yet
  • Manages units in:
    • /usr/lib/systemd/user/ where units provided by installed packages belong.
    • ~/.local/share/systemd/user/ where units of packages that have been installed in the home directory belong.
    • /etc/systemd/user/ where system-wide user units are placed by the system administrator.
    • ~/.config/systemd/user/ where the users put their own units.
  • A trick to start a systemd user unit when the X session has been set up and the DISPLAY variable is available, is to call systemctl start from a .desktop autostart file.
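
A minimal sketch of the autostart trick, assuming a hypothetical user unit called my-app.service and the usual per-user autostart directory ~/.config/autostart. The Python helper only builds the text of the .desktop entry and writes it where asked; Type, Name and Exec are standard Desktop Entry keys, and --user is what makes systemctl talk to the user instance.

```python
from pathlib import Path


def autostart_entry(unit: str) -> str:
    """Return a .desktop entry that starts a systemd user unit at session start."""
    return "\n".join([
        "[Desktop Entry]",
        "Type=Application",
        f"Name=Start {unit}",
        f"Exec=systemctl --user start {unit}",
        "",
    ])


def install(unit: str, autostart_dir: Path) -> Path:
    """Write the entry into the given autostart directory and return its path."""
    autostart_dir.mkdir(parents=True, exist_ok=True)
    target = autostart_dir / f"start-{unit}.desktop"
    target.write_text(autostart_entry(unit))
    return target


# "my-app.service" is a made-up unit name, used for illustration only.
print(autostart_entry("my-app.service"))
```

Calling install("my-app.service", Path.home() / ".config" / "autostart") would then drop the file where the session's autostart mechanism looks for it.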

dbus activation

X session manager

xdg autostart

Other startup notes


To connect to an X server, a client needs to send a token from ~/.Xauthority, which proves that they can read the user's private data.

~/.Xauthority contains a token generated by the display manager and communicated to X at startup.

To view its contents, use xauth -i -f ~/.Xauthority list

15 May, 2018 12:06PM

hackergotchi for Wouter Verhelst

Wouter Verhelst

Digitizing my DVDs

I have a rather sizeable DVD collection. The database that I created of them a few years back after I'd had a few episodes where I accidentally bought the same movie more than once claims there's over 300 movies in the cabinet. Additionally, I own a number of TV shows on DVD, which, if you count individual disks, will probably end up being about the same number.

A few years ago, I decided that I was tired of walking to the DVD cabinet, taking out a disc, and placing it in the reader. Instead, I wanted to digitize them and use kodi to be able to watch a movie whenever I felt like it. So I made some calculations, and came up with a system with enough storage (on ZFS, of course) to store all the DVDs without needing to re-encode them.

I got started on ripping most of the DVDs using dvdbackup, but it quickly became apparent that I'd made a miscalculation; where I thought that most of the DVDs would be 4.7G ones, it turns out that most commercial DVDs are actually of the 9G type. Come to think of it, that does make a lot of sense. Additionally, now that I had a home server that had some significant redundant storage, I found that I had some additional uses for such things. The storage that I had, vast enough though it may be, wouldn't suffice.

So, I gave this some more thought, but then life interfered and nothing happened for a few years.

Recently however, I've picked it up again, changing my workflow. I started using handbrake to re-encode the DVDs so they wouldn't take up quite so much space; having chosen VP9 as my preferred codec, I end up storing the DVDs as about 1 to 2 G per main feature, rather than the 8 to 9 that it used to be -- a significant gain. However, my first workflow wasn't very efficient; I would run the handbrake GUI from my laptop on ssh -X sessions to multiple machines, encoding the videos directly from DVD that way. That worked, but it meant I couldn't shut down my laptop to take it to work without interrupting work that was happening; also, it meant that if a DVD finished encoding in the middle of the night, I wouldn't be there to replace it, so the system would be sitting idle for several hours. Clearly some form of improvement was necessary if I was going to do this in any reasonable amount of time.

So after fooling around a bit, I came up with the following:

  • First, I use dvdbackup -M -r a to read the DVD without re-encoding anything. This can be done at the speed of the optical medium, and can therefore be done much more efficiently than to use handbrake directly from the DVD. The -M option tells dvdbackup to read everything from the DVD (to make a mirror of it, in effect). The -r a option tells dvdbackup to abort if it encounters a read error; I found that DVDs sometimes can be read successfully if I eject the drive and immediately reinsert it, or if I give the disk another clean, or even just try again in a different DVD reader. Sometimes the disk is just damaged, and then using dvdbackup's default mode of skipping the unreadable blocks makes sense, but not in a first attempt.
  • Then, I run a small little perl script that I wrote. It basically does two things:

    1. Run HandBrakeCLI -i <dvdbackup output> --previews 1 -t 0, parse its stderr output, and figure out what the first and the last titles on the DVD are.
    2. Run qsub -N <movie name> -v FILM=<dvdbackup output> -t <first title>-<last title> convert-film
  • The convert-film script is a bash script, which (in its first version) did this:

    mkdir -p "$OUTPUTDIR/$FILM/tmp"
    HandBrakeCLI -x "threads=1" --no-dvdnav -i "$INPUTDIR/$FILM" -e vp9 -E copy -T -t $SGE_TASK_ID --all-audio --all-subtitles -o "$OUTPUTDIR/$FILM/tmp/T${SGE_TASK_ID}.mkv"

    Essentially, that converts a single title to a VP9-encoded matroska file, with all the subtitles and audio streams intact, and forcing it to use only one thread -- having it use multiple threads is useful if you care about a single DVD converting as fast as possible, but I don't, and having four DVDs on a four-core system all convert at 100% CPU seems more efficient than having two convert at about 180% each. I did consider using HandBrakeCLI's options to only extract the "interesting" audio and subtitle tracks, but I prefer to not have dubbed audio (to have subtitled audio instead); since some of my DVDs are originally in non-English languages, doing so gets rather complex. The audio and subtitle tracks don't take up that much space, so I decided not to bother with that in the end.
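
The two steps of the small script above can be sketched as follows. This is a Python approximation, not the author's perl script: it assumes that the HandBrakeCLI scan log announces each title on a line of the form "+ title N:", and the sample log, movie name and directory name are invented for the example. The qsub arguments are the ones given in step 2.

```python
import re
from typing import List, Tuple

# Assumed shape of a title line in the scan log: "+ title 3:"
TITLE_RE = re.compile(r"^\+ title (\d+):", re.MULTILINE)


def title_range(scan_output: str) -> Tuple[int, int]:
    """Return (first, last) title numbers found in a HandBrakeCLI scan log."""
    titles = [int(n) for n in TITLE_RE.findall(scan_output)]
    if not titles:
        raise ValueError("no titles found in scan output")
    return min(titles), max(titles)


def qsub_command(movie: str, film_dir: str, first: int, last: int) -> List[str]:
    """Build the array-job submission: one task per title."""
    return [
        "qsub", "-N", movie,
        "-v", f"FILM={film_dir}",
        "-t", f"{first}-{last}",
        "convert-film",
    ]


# Invented sample log, for illustration only.
sample = """\
+ title 1:
  + duration: 01:32:10
+ title 2:
  + duration: 00:03:05
+ title 3:
  + duration: 00:01:12
"""

first, last = title_range(sample)
print(qsub_command("Some_Movie", "SOME_MOVIE", first, last))
```

Returning the command as a list rather than a string means it can be handed to subprocess.run without any shell quoting concerns.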

The use of qsub, which submits the script into gridengine, allows me to hook up several encoder nodes (read: the server plus a few old laptops) to the same queue.

That went pretty well, until I wanted to figure out how far along something was going. HandBrakeCLI provides progress information on stderr, and I can just do a tail -f of the stderr output logs, but that really works well only for one DVD at a time, not if you're trying to follow along with about a dozen of them.

So I made a database, and wrote another perl script. This latter will parse the stderr output of HandBrakeCLI, fish out the progress information, and put the completion percentage as well as the ETA time into a database. Then it became interesting:
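
Fishing out the progress information could look roughly like this. The sketch is in Python rather than perl, and it assumes progress lines shaped like "Encoding: task 1 of 1, 45.23 % (120.50 fps, avg 118.20 fps, ETA 00h05m12s)"; the Progress type and the function name are made up for the example, and writing the values to the database is left out.

```python
import re
from typing import NamedTuple, Optional

# Assumed progress line format; the ETA part is absent early in an encode.
PROGRESS_RE = re.compile(
    r"Encoding: task \d+ of \d+, (?P<pct>\d+(?:\.\d+)?) %"
    r"(?:.*?ETA (?P<h>\d+)h(?P<m>\d+)m(?P<s>\d+)s)?"
)


class Progress(NamedTuple):
    percent: float
    eta_seconds: Optional[int]


def parse_progress(line: str) -> Optional[Progress]:
    """Extract completion percentage and ETA from one log line, if present."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    eta = None
    if match.group("h") is not None:
        eta = (
            int(match.group("h")) * 3600
            + int(match.group("m")) * 60
            + int(match.group("s"))
        )
    return Progress(float(match.group("pct")), eta)


line = "Encoding: task 1 of 1, 45.23 % (120.50 fps, avg 118.20 fps, ETA 00h05m12s)"
print(parse_progress(line))
```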

  IF (TG_OP = 'INSERT') OR (TG_OP = 'UPDATE' AND (NEW.progress != OLD.progress) OR NEW.finished = TRUE) THEN
    PERFORM pg_notify('transjob', row_to_json(NEW)::varchar);
$$ LANGUAGE plpgsql;
CREATE TRIGGER transjob_tcn_trigger

This uses PostgreSQL's asynchronous notification feature to send out a notification whenever an interesting change has happened to the table.
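
To make explicit when the notification fires: AND binds more tightly than OR, so the condition in the trigger reads as "insert, or update with changed progress, or finished". The Python predicate below restates that logic using the column names progress and finished from the trigger; the function name and the dictionaries standing in for OLD and NEW are illustrative, and SQL's NULL handling is not modelled.

```python
from typing import Mapping, Optional


def should_notify(op: str, old: Optional[Mapping], new: Mapping) -> bool:
    """Mirror the trigger condition: True when a notification would be sent."""
    return bool(
        op == "INSERT"
        or (op == "UPDATE" and new["progress"] != old["progress"])
        or new["finished"]
    )


print(should_notify("INSERT", None, {"progress": 0.0, "finished": False}))
print(should_notify(
    "UPDATE",
    {"progress": 10.0, "finished": False},
    {"progress": 10.0, "finished": False},
))
```

An update that leaves progress untouched and does not mark the row as finished is the only case here that stays silent.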

#!/usr/bin/perl -w

use strict;
use warnings;

use Mojolicious::Lite;
use Mojo::Pg;


helper dbh => sub { state $pg = Mojo::Pg->new->dsn("dbi:Pg:dbname=transcode"); };

websocket '/updates' => sub {
    my $c = shift;
    my $cb = $c->dbh->pubsub->listen(transjob => sub { $c->send(pop) });
    $c->on(finish => sub { shift->dbh->pubsub->unlisten(transjob => $cb) });
};

app->start;

This uses the Mojolicious framework and Mojo::Pg to send out the payload of the "transjob" notification (which we created with the FOR EACH ROW trigger inside PostgreSQL earlier, and which contains the JSON version of the table row) over a WebSocket. Then it's just a small matter of programming to write some javascript which dynamically updates the webpage whenever that happens, and Tadaa! I have an online overview of the videos that are transcoding, and how far along they are.

That only requires me to keep the queue non-empty, which I can easily do by running dvdbackup a few times in parallel every so often. That's a nice Saturday afternoon project...

15 May, 2018 08:44AM

Reproducible builds folks

Reproducible Builds: Weekly report #159

Here’s what happened in the Reproducible Builds effort between Sunday May 6 and Saturday May 12 2018:

Packages reviewed and fixed, and bugs filed

diffoscope development

diffoscope is our in-depth “diff-on-steroids” utility which helps us diagnose reproducibility issues in packages. This week, version 94 was uploaded to Debian unstable and PyPI by Chris Lamb. It included contributions already covered by posts in previous weeks as well as new ones from:

Mattia Rizzolo subsequently backported this version to stretch.

After the release of version 94, the development continued with the following contributions from Mattia Rizzolo:

disorderfs development

Version 0.5.3-1 of disorderfs (our FUSE-based filesystem that introduces non-determinism) was uploaded to unstable by Chris Lamb. It included contributions already covered by posts in previous weeks as well as new ones from:

development

Mattia Rizzolo made the following changes to our Jenkins-based testing framework, including:


This week’s edition was written by Bernhard M. Wiedemann, Chris Lamb, Mattia Rizzolo & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

15 May, 2018 07:20AM

Russ Allbery

Review: Thanks for the Feedback

Review: Thanks for the Feedback, by Douglas Stone & Sheila Heen

Publisher: Penguin
Copyright: 2014
Printing: 2015
ISBN: 1-101-61427-7
Format: Kindle
Pages: 322

Another book read for the work book club.

I was disappointed when this book was picked. I already read two excellent advice columns (Captain Awkward and Ask a Manager) and have read a lot on this general topic. Many workplace-oriented self-help books also seem to be full of a style of pop psychology that irritates me rather than informs. But the point of a book club is that you read the book anyway, so I dove in. And was quite pleasantly surprised.

This book is about receiving feedback, not about giving feedback. There are tons of great books out there about how to give feedback, but, as the authors say in the introduction, almost no one giving you feedback is going to read any of them. It would be nice if we all got better at giving feedback, but it's not going to happen, and you can't control other people's feedback styles. You can control how you receive feedback, though, and there's quite a lot one can do on the receiving end. The footnoted subtitle summarizes the tone of the book: The Science and Art of Receiving Feedback Well (even when it is off base, unfair, poorly delivered, and, frankly, you're not in the mood).

The measure of a book like this for me is what I remember from it several weeks after reading it. Here, it was the separation of feedback into three distinct types: appreciation, coaching, and evaluation. Appreciation is gratitude and recognition for what one has accomplished, independent of any comparison against other people or an ideal for that person. Coaching is feedback aimed at improving one's performance. And evaluation, of course, is feedback that measures one against a standard, and usually comes with consequences (a raise, a positive review, a relationship break-up). We all need all three but different people need different mixes, sometimes quite dramatically so. And one of the major obstacles in the way of receiving feedback well is that they tend to come mixed or confused.

That framework makes it easier to see where one's reaction to feedback often goes off the rails. If you come into a conversation needing appreciation ("I've been working long hours to get this finished on time, and a little thanks would be nice"), but the other person is focused on an opportunity for coaching ("I can point out a few tricks and improvements that will let you not work as hard next time"), the resulting conversation rarely goes well. The person giving the coaching is baffled at the resistance to some simple advice on how to improve, and may even form a negative opinion of the other person's willingness to learn. And the person receiving the feedback comes away feeling unappreciated and used, and possibly fearful that their hard work is merely a sign of inadequate skills. There are numerous examples of similar mismatches.

I found this framing immediately useful, particularly in the confusion between coaching and evaluation. It's very easy to read any constructive advice as negative evaluation, particularly if one is already emotionally low. Having words to put to these types of feedback makes it easier to evaluate the situation intellectually rather than emotionally, and to explicitly ask for clarifying evaluation if coaching is raising those sorts of worries.

The other memorable concept I took away from this book is switchtracking. This is when the two people in a conversation are having separate arguments simultaneously, usually because each person has a different understanding of what the conversation is "really" about. Often this happens when the initial feedback sets off a trigger, particularly a relationship or identity trigger (other concepts from this book), in the person receiving it. The feedback giver may be trying to give constructive feedback on how to lay out a board presentation, but the receiver is hearing that they can't be trusted to talk to the board on their own. The receiver will tend to switch the conversation away to whether or not they can be trusted, quite likely confusing the initial feedback giver, or possibly even prompting another switchtrack into a third topic of whether they can receive criticism well.

Once you become aware of this tendency, you start to see it all over the place. It's sadly common. The advice in the book, which is accompanied with a lot of concrete examples, is to call this out explicitly, clearly separate and describe the topics, and then pick one to talk about first based on how urgent the topics are to both parties. Some of those conversations may still be difficult, but at least both parties are having the same conversation, rather than talking past each other.

Thanks for the Feedback fleshes out these ideas and a few others (such as individual emotional reaction patterns to criticism and triggers that interfere with one's ability to accept feedback) with a lot of specific scenarios. The examples are refreshingly short and to the point, avoiding a common trap of books like this to get bogged down into extended artificial dialogue. There's a bit of a work focus, since we get a lot of feedback at work, but there's nothing exclusively work-related about the advice here. Many of the examples are from personal relationships of other kinds. (I found an example of a father teaching his daughters to play baseball particularly memorable. One daughter takes this as coaching and the other as evaluation, resulting in drastically different reactions.) The authors combine matter-of-fact structured information with a gentle sense of humor and great pacing, making this surprisingly enjoyable to read.

I was feeling oversaturated with information on conversation styles and approaches and still came away from this book with some useful additional structure. If you're struggling with absorbing feedback or finding the right structure to use it constructively instead of getting angry, scared, or depressed, give this a try. It's much better than I had expected.

Rating: 7 out of 10

15 May, 2018 04:35AM

May 14, 2018

hackergotchi for Daniel Pocock

Daniel Pocock

A closer look at power and PowerPole

The crowdfunding campaign has so far raised enough money to buy a small lead-acid battery but hopefully with another four days to go before OSCAL we can reach the target of an AGM battery. In the interest of transparency, I will shortly publish a summary of the donations.

The campaign has been a great opportunity to publish some information that will hopefully help other people too. In particular, a lot of what I've written about power sources isn't just applicable for ham radio, it can be used for any demo or exhibit involving electronics or electrical parts like motors.

People have also asked various questions and so I've prepared some more details about PowerPoles today to help answer them.

OSCAL organizer urgently looking for an Apple MacBook PSU

In an unfortunate twist of fate while I've been blogging about power sources, one of the OSCAL organizers has a MacBook and the Apple-patented PSU conveniently failed just a few days before OSCAL. It is the 85W MagSafe 2 PSU and it is not easily found in Albania. If anybody can get one to me while I'm in Berlin at Kamailio World then I can take it to Tirana on Wednesday night. If you live near one of the other OSCAL speakers you could also send it with them.

If only Apple used PowerPole...

Why batteries?

The first question many people asked is why use batteries and not a power supply. There are two answers for this: portability and availability. Many hams like to operate their radios away from their home sometimes. At an event, you don't always know in advance whether you will be close to a mains power socket. Taking a battery eliminates that worry. Batteries also provide better availability in times of crisis: whenever there is a natural disaster, ham radio is often the first mode of communication to be re-established. Radio hams can operate their stations independently of the power grid.

Note that while the battery looks a lot like a car battery, it is actually a deep cycle battery, sometimes referred to as a leisure battery. This type of battery is often promoted for use in caravans and boats.

Why PowerPole?

Many amateur radio groups have already standardized on the use of PowerPole in recent years. The reason for having a standard is that people can share power sources or swap equipment around easily, especially in emergencies. The same logic applies when setting up a demo at an event where multiple volunteers might mix and match equipment at a booth.

WICEN, ARES / RACES and RAYNET-UK are some of the well known groups in the world of emergency communications and they all recommend PowerPole.

Sites like eBay and Amazon have many bulk packs of PowerPoles. Some are genuine, some are copies. In the UK, I've previously purchased PowerPole packs and accessories from sites like Torberry and Sotabeams.

The pen is mightier than the sword, but what about the crimper?

The PowerPole plugs for 15A, 30A and 45A are all interchangeable and they can all be crimped with a single tool. The official tool is quite expensive but there are many after-market alternatives like this one. It takes less than a minute to insert the terminal, insert the wire, crimp and make a secure connection.

Here are some packets of PowerPoles in every size:

Example cables

It is easy to make your own cables or to take any existing cables, cut the plugs off one end and put PowerPoles on them.

Here is a cable with banana plugs on one end and PowerPole on the other end. You can buy cables like this or if you already have cables with banana plugs on both ends, you can cut them in half and put PowerPoles on them. This can be a useful patch cable for connecting a desktop power supply to a PowerPole PDU:

Here is the Yaesu E-DC-20 cable used to power many mobile radios. It is designed for about 25A. The exposed copper section simply needs to be trimmed and then inserted into a PowerPole 30:

Many small devices have these round 2.1mm coaxial power sockets. It is easy to find a packet of the pigtails on eBay and attach PowerPoles to them (tip: buy the pack that includes both male and female connections for more versatility). It is essential to check that the devices are all rated for the same voltage: if your battery is 12V and you connect a 5V device, the device will probably be destroyed.

Distributing power between multiple devices

There is a wide range of power distribution units (PDUs) for PowerPole users. Notice that PowerPoles are interchangeable and in some of these devices you can insert power through any of the inputs. Most of these devices have a fuse on every connection for extra security and isolation. Some of the more interesting devices also have a USB charging outlet. The West Mountain Radio RigRunner range includes many permutations. You can find a variety of PDUs from different vendors through an Amazon search or eBay.

In the photo from last week's blog, I have the Fuser-6 distributed by Sotabeams in the UK (below, right). I bought it pre-assembled but you can also make it yourself. I also have a Windcamp 8-port PDU purchased from Amazon (left):

Despite all those fuses on the PDU, it is also highly recommended to insert a fuse in the section of wire coming off the battery terminals or PSU. It is easy to find maxi blade fuse holders on eBay and in some electrical retailers:

Need help crimping your cables?

If you don't want to buy a crimper or you would like somebody to help you, you can bring some of your cables to a hackerspace or ask if anybody from the Debian hams team will bring one to an event to help you.

I'm bringing my own crimper and some PowerPoles to OSCAL this weekend. If you would like to help us power up the demo there, please consider contributing to the crowdfunding campaign.

14 May, 2018 07:25PM by Daniel.Pocock

hackergotchi for Olivier Berger

Olivier Berger

Implementing an example Todo-Backend REST API with Symfony 4 and api-platform

Todo-Backend lists many implementations of the same REST API with different backend-oriented Web development frameworks.

I’ve proposed my own version using Symfony 4 in PHP, and the api-platform project which helps implementing REST APIs.

I’ve documented the way I did it in detail in the project’s documentation, for those curious about Symfony development of a very simple (JSON-based) REST API. See its README file (written, of course, with the mandatory org-mode ;).

You can find the rest of the code here:

AFAICS api-platform offers a great set of features for Linked-Data/REST development with Symfony in general. However, some tweaks were necessary to conform to the TodoBackend specs, mainly because TodoBackend is JSON only and doesn’t support JSON-LD…

Oh, and the hardest part was deploying on Heroku, making sure that the CORS headers would work as expected :-/



14 May, 2018 02:20PM by Olivier Berger

Russ Allbery

Review: Twitter and Tear Gas

Review: Twitter and Tear Gas, by Zeynep Tufekci

Publisher: Yale University Press
Copyright: 2017
ISBN: 0-300-21512-6
Format: Kindle
Pages: 312

Subtitled The Power and Fragility of Networked Protest, Twitter and Tear Gas is a close look at the effect of social media (particularly, but not exclusively, Twitter and Facebook) on protest movements around the world. Tufekci pays significant attention to the Tahrir Square protests in Egypt, the Gezi Park protests in Turkey, Occupy Wall Street and the Tea Party in the United States, Black Lives Matter also in the United States, and the Zapatista uprising in Mexico early in the Internet era, as well as more glancing attention to multiple other protest movements since the advent of the Internet. She avoids both extremes of dismissal of largely on-line movements and the hailing of social media as a new era of mass power, instead taking a detailed political and sociological look at how protest movements organized and fueled via social media differ in both strengths and weaknesses from the movements that came before.

This is the kind of book that could be dense and technical but isn't. Tufekci's approach is analytical but not dry or disengaged. She wants to know why some protests work and others fail, what the governance and communication mechanisms of protest movements say about their robustness and capabilities, and how social media has changed the tools and landscape used by protest movements. She's also been directly involved: she's visited the Zapatistas, grew up in Istanbul and is directly familiar with the politics of the Gezi Park protests, and includes in this book a memorable story of being caught in the Antalya airport in Turkey during the 2016 attempted coup. There are some drier and more technical chapters where she's laying the foundations of terminology and analysis, but they just add rigor to an engaging, thoughtful examination of what a protest is and why it works or doesn't work.

My favorite part of this book, by far, was the intellectual structure it gave me for understanding the effectiveness of a protest. That's something about which media coverage tends to be murky, at least in situations short of a full-blown revolutionary uprising (which are incredibly rare). The goal of a protest is to force a change, and clearly sometimes this works. (The US Civil Rights movement and the Indian independence movement are obvious examples. The Arab Spring is a more recent if more mixed example.) However, sometimes it doesn't; Tufekci's example is the protests against the Iraq War. Why?

A key concept of this book is that protests signal capacity, particularly in democracies. That can be capacity to shape a social narrative and spread a point of view, capacity to disrupt the regular operations of a system of authority, or capacity to force institutional change through the ballot box or other political process. Often, protests succeed to the degree that they signal capacity sufficient to scare those currently in power into compromising or acquiescing to the demands of the protest movement. Large numbers of people in the streets matter, but not usually as a show of force. Violent uprisings are rare and generally undesirable for everyone. Rather, they matter because they demand and hold media attention (allowing them to spread a point of view), can shut down normal business and force an institutional response, and because they represent people who can exert political power or be tapped by political rivals.

This highlights one of the key differences between protest in the modern age and protest in a pre-Internet age. The March on Washington at the height of the Civil Rights movement was an impressive demonstration of capacity largely because of the underlying organization required to pull off a large and successful protest in that era. Behind the scenes were impressive logistical and governance capabilities. The same organizational structure that created the March could register people to vote, hold politicians accountable, demand media attention, and take significant and effective economic action. And the government knew it.

One thing that social media does is make organizing large protests far easier. It allows self-organizing, with viral scale, which can create numerically large movements far more easily than the dedicated organizational work required prior to the Internet. This makes protest movements more dynamic and more responsive to events, but it also calls into question how much sustained capacity the movement has. The government non-reaction to the anti-war protests in the run-up to the Iraq War was an arguably correct estimation of the signaled capacity: a bet that the anti-war sentiment would not turn into sustained institutional pressure because large-scale street protests no longer indicated the same underlying strength.

Signaling capacity is not, of course, the only purpose of protests. Tufekci also spends a good deal of time discussing the sense of empowerment that protests can create. There is a real sense in which protests are for the protesters, entirely apart from whether the protest itself forces changes to government policies. One of the strongest tools of institutional powers is to make each individual dissenter feel isolated and unimportant, to feel powerless. Meeting, particularly in person, with hundreds of other people who share the same views can break that illusion of isolation and give people the enthusiasm and sense of power to do something about their beliefs. This, however, only becomes successful if the protesters then take further actions, and successful movements have to provide some mechanism to guide and unify that action and retain that momentum.

Tufekci also provides a fascinating analysis of the evolution of government responses to mass protests. The first reaction was media blackouts and repression, often by violence. Although we still see some of that, particularly against out groups, it's a risky and ham-handed strategy that dramatically backfired for both the US Civil Rights movement (due to an independent press that became willing to publish pictures of the violence) and the Arab Spring (due to social media providing easy bypass of government censorship attempts). Governments do learn, however, and have become increasingly adept at taking advantage of the structural flaws of social media. Censorship doesn't work; there are too many ways to get a message out. But social media has very little natural defense against information glut, and the people who benefit from the status quo have caught on.

Flooding social media forums with government propaganda or even just random conspiratorial nonsense is startlingly effective. The same lack of institutional gatekeepers that destroys the effectiveness of central censorship also means there are few trusted ways to determine what is true and what is fake on social media. Governments and other institutional powers don't need to convince people of their point of view. All they need to do is create enough chaos and disinformation that people give up on the concept of objective truth, until they become too demoralized to try to weed through the nonsense and find verifiable and actionable information. Existing power structures by definition benefit from apathy, disengagement, delay, and confusion, since they continue to rule by default.

Tufekci's approach throughout is to look at social media as a change and a new tool, which is neither inherently good nor bad but which significantly changes the landscape of political discourse. In her presentation (and she largely convinced me in this book), the social media companies, despite controlling the algorithms and platform, don't particularly understand or control the effects of their creation except in some very narrow and profit-focused ways. The battlegrounds of "fake news," political censorship, abuse, and terrorist content are murky swamps less out of deliberate intent and more because companies have built a platform they have no idea how to manage. They've largely supplanted more traditional political spheres and locally-run social media with huge international platforms, are now faced with policing the use of those platforms, and are way out of their depth.

One specific example vividly illustrates this and will stick with me. Facebook is now one of the centers of political conversation in Turkey, as it is in many parts of the world. Turkey has a long history of sharp political divisions, occasional coups, and a long-standing, simmering conflict between the Turkish government and the Kurds, a political and ethnic minority in southeastern Turkey. The Turkish government classifies various Kurdish groups as terrorist organizations. Those groups unsurprisingly disagree. The arguments over this inside Turkey are vast and multifaceted.

Facebook has gotten deeply involved in this conflict by providing a platform for political arguments, and is now in the position of having to enforce their terms of service against alleged terrorist content (or even simple abuse), in a language that Facebook engineers largely don't speak and in a political context that they largely know nothing about. They of course hire Turkish speakers to try to understand that content to process abuse reports. But, as Tufekci (a Turkish native) argues, a Turkish speaker who has the money, education, and family background to be working in an EU Facebook office in a location like Dublin is not randomly chosen from the spectrum of Turkish politics. They are more likely to have connections to or at least sympathies for the Turkish government or business elites than to be related to a family of poor and politically ostracized Kurds. It's therefore inevitable that bias will be seen in Facebook's abuse report handling, even if Facebook management intends to stay neutral.

For Turkey, you can substitute just about any other country about which US engineers tend to know little. (Speaking as a US native, that's a very long list.) You may even be able to substitute the US for Turkey in some situations, given that social media companies tend to outsource the bulk of the work to countries that can provide low-paid workers willing to do the awful job of wading through the worst of humanity and attempting to apply confusing and vague terms of service. Much of Facebook's content moderation is done in the Philippines, by people who may or may not understand the cultural nuances of US political fights (and, regardless, are rarely given enough time to do more than cursorily glance at each report).

Despite the length of this review, there are yet more topics in this book I haven't mentioned, such as movement governance. (As both an advocate for and critic of consensus-based decision-making, Tufekci's example of governance in Occupy Wall Street had me both fascinated and cringing.) This is excellent stuff, full of personal anecdotes and entertaining story-telling backed by thoughtful and structured analysis. If you have felt mystified by the role that protests play in modern politics, I highly recommend reading this.

Rating: 9 out of 10

14 May, 2018 03:45AM

May 13, 2018

hackergotchi for Norbert Preining

Norbert Preining

Gaming: Lara Croft – Rise of the Tomb Raider: 20 Year Celebration

I have to admit, this is the first time that I am playing something like this. Somehow, Lara Croft – Rise of the Tomb Raider was on sale, and some of the trailers were so well done that I was tempted into getting this game. And to my surprise, it actually works pretty well on Linux, too – yeah!

So I am a first time player in this kind of league, and had a hard time getting used to controlling lovely Lara, but it turned out easier than I thought – although I guess a controller instead of mouse and kbd would be better. One starts out somewhere in the mountains (I probably bought the game because there is so much mountaineering in the parts I have seen till now 😉), trying to evade breaking crevices, jumping from ledge to ledge, getting washed away by avalanches, the full program.

But my favorite till now in the game is that Lara always carries an ice ax. Completely understandable in the mountain trips, where she climbs frozen ice falls, hanging cliffs, everything, like a super-pro. Wow, I would like to be such an ice climber! But even in the next mission in Syria, she still has her ice ax with her, very conveniently dangling from her side. How suuuper-cool!

After being washed away by an avalanche we find Lara back on a trip in Syria, followed and nearly killed by the mysterious Trinity organization. During the Syria operation she needs to learn quite a lot of Greek; unfortunately the player doesn’t have to learn with her – I could use some polishing of my Ancient Greek.

The game is a first-of-its-kind for me, with long cinematic parts between the playing actions. The switch between cinematic and play is so well done that I sometimes have the feeling I need to control Lara during these times, too. The graphics are also very stunning to my eyes, impressive.

I have never played any Lara game or seen any Lara movie, but my first association was with the Die Hard movie series – always these dirty clothes, scratches and a dirt-covered body. Lara is no exception here. Last but not least, the deaths of Lara (one – at least I – often dies in these games) are often quite funny and entertaining: spiked in some tombs, smashed to pieces by a falling stone column, etc. I really have to learn it the hard way.

I have only finished two expeditions, and have no idea how many more there are to come. But it seems like I will continue. The good thing is that there are lots of restart points and permanent saves, so if one dies, or the computer dies, one doesn’t have to redo the whole bunch. Well done.

13 May, 2018 11:29PM by Norbert Preining

Renata D'Avila

Debian Women in Curitiba

This post is long overdue, but I have been so busy lately that I didn't have the time to sit down and write it in the past few weeks. What have I been busy with? Let's start with this event, that happened back in March:

Debian Women meeting in Curitiba (March 10th, 2018)

The eight women who attended the meeting gathered together in front of a tv with the Debian Women logo

At MiniDebConf Curitiba last year, few women attended. And, as I mentioned in a previous post, there was not even a single woman speaking at MiniDebConf last year.

I didn't want MiniDebConf Curitiba 2018 to be a repeat of last year. Why? In part, because I have been involved in other tech communities and I know it doesn't have to be like that (unless, of course, the community insists on being misogynistic...).

So I came up with the idea of having a meeting for women in Curitiba one month before MiniDebConf. The main goal was to create a good environment for women to talk about Debian, whether they had used GNU/Linux before or not, whether they were programmers or not.

Miriam and Kira, two other women from the state of Parana interested in Debian, came along and helped out with planning. We used a collaborative pad to organize the tasks and activities and to create the text for the folder about Debian we had printed (based on Debian's documentation).

For the final version of the folder, it's important to acknowledge the help Luciana gave us, all the way from Minas Gerais. She collaborated with the translations, reviewed the texts and fixed the layout.

A pile with folded Debian Women folders. The writings are in Portuguese and it's possible to see a QR code.

The final odg file, in Portuguese, can be downloaded here: folder_debian_30cm.odg

Very quickly, because we had so little time (we settled on a date and a place a little over one month before the meeting), I created a web page and put it online the only way I could at that moment, using Github Pages.

We used Mate Hackers' instance of to register for the meeting, simply because we had to plan accordingly. This was the address for registration:

Through the Training Center, a Brazilian tech community, we got to Lucio, who works at Pipefy and offered us the space so we could hold the meeting. Thank you, Lucio, Training Center and Pipefy!

Pipefy logo

Because Miriam and I weren't in Curitiba, we had to focus the promotion of this meeting online. Not ideal when someone wants to be truly inclusive, but we worked with the resources we had. We reached out to TechLadies and invited them - as we did with many other groups.

This was our schedule:


09:00 - Welcome coffee

10:00 - What is Free Software? Copyright, licenses, sharing

10:30 - What is Debian?

12:00 - Lunch Break


14:30 - Internships with Debian - Outreachy and Google Summer of Code

15:00 - Install fest / helping with users issues

16:00 - Editing the Debian wiki to register this meeting

17:30 - Wrap up

Takeaways from the meeting:

  • Because we knew more or less how many people would attend, we were able to buy the food accordingly right before the meeting - and ended up spending much less than if we had ordered some kind of catering.

  • Sadly, it would be almost as expensive to print a dozen folders as it would be to print a hundred of them. So we ended up printing 100 folders (which was expensive enough). The good part is that we would end up handing them out during MiniDebConf Curitiba.

  • We attempted a live stream of the meeting using Jitsi, but I don't think we were very successful, because we didn't have a microphone for the speakers.

  • Most of our public ended up being women who, in fact, already knew and/or used Debian, but weren't actively involved with the community.

  • It was during this meeting that the need for a mailing list in Portuguese for women interested in Debian came up. Because, yes, in a country where English is taught so poorly in the schools, the language can still be a barrier. We also wanted to keep in touch and share information about the Brazilian community and what we are doing. We want next year's DebConf to have a lot of women, especially Brazilian women who are interested and/or who are users and/or contribute to Debian. The request for this mailing list would be put through by Helen during MiniDebConf, using the bug report system. If you can, please support us:

Pictures from the meeting:

The breakfast table with food

Our breakfast table!

Miriam telling the women about Free Software, six women listening

Miriam's talk: What is Free Software? Copyright, licenses, sharing

Renata and Miriam talking about What is Debian; a TV between them shows the title of the talk

Miriam and Renata's talk: What is Debian?

Renata talking about internships with Debian

Renata talking about internships with Debian

Thank you to all the women who participated!

The participants with the two men who helped with the meeting.

And to our lovely staff. Thank you, Lucio, for getting us the space and thank you, Pipefy!

This has been partly documented at Debian Wiki (DebianWomen/History) because the very next day after this meeting, Debian Wiki completely blocked ProtonVPN from even accessing the Wiki. Awesome. If anyone is able to, feel free to copy/paste any of this text there.

13 May, 2018 08:49PM by Renata

Russ Allbery

Review: Deep Work

Review: Deep Work, by Cal Newport

Publisher: Grand Central
Copyright: January 2016
ISBN: 1-4555-8666-8
Format: Kindle
Pages: 287

If you follow popular psychology at all, you are probably aware of the ongoing debate over multitasking, social media, smartphones, and distraction. Usually, and unfortunately, this comes tainted by generational stereotyping: the kids these days who spend too much time with their phones and not enough time getting off their elders' lawns, thus explaining their inability to get steady, high-paying jobs in an economy designed to avoid steady, high-paying jobs. However, there is some real science under the endless anti-millennial think-pieces. Human brains are remarkably bad at multitasking, and it causes significant degradation of performance. Worse, that performance degradation goes unnoticed by the people affected, who continue to think they're performing tasks at their normal proficiency. This comes into harsh conflict with modern workplaces heavy on email and chat systems, and even harsher conflict with open plan offices.

Cal Newport is an associate professor of computer science at Georgetown University with a long-standing side profession of writing self-help books, initially focused on study habits. In this book, he argues that the ability to do deep work — focused, concentrated work that pushes the boundaries of what one understands and is capable of — is a valuable but diminishing skill. If one can develop both the habit and the capability for it (more on that in a moment), it can be extremely rewarding and a way of differentiating oneself from others in the same field.

Deep Work is divided into two halves. The first half is Newport's argument that deep work is something you should consider trying. The second, somewhat longer half is his techniques for getting into and sustaining the focus required.

In making his case for this approach, Newport puts a lot of effort into avoiding broader societal prescriptions, political stances, or even general recommendations and tries to keep his point narrow and focused: the ability to do deep, focused work is valuable and becoming rarer. If you develop that ability, you will have an edge. There's nothing exactly wrong with this, but much of it is obvious and he belabors it longer than he needed to. (That said, I'm probably more familiar with research on concentration and multitasking than some.)

That said, I did like his analysis of busyness as a proxy for productivity in many workplaces. The metrics and communication methods most commonly used in office jobs are great at measuring responsiveness and regular work on shallow tasks in the moment, and bad at measuring progress towards deeper, long-term goals, particularly ones requiring research or innovation. The latter is recognized and rewarded once it finally pays off, but often treated as a mysterious capability some people have and others don't. Meanwhile, the day-to-day working environment is set up to make it nearly impossible, in Newport's analysis, to develop and sustain the habits required to achieve those long-term goals. It's hard to read this passage and not be painfully aware of how much time one spends shallowly processing email, and how that's rewarded in the workplace even though it rarely leads to significant accomplishments.

The heart of this book is the second half, which is where Deep Work starts looking more like a traditional time management book. Newport lays out four large areas of focus to increase one's capacity for deep work: create space to work deeply on a regular basis, embrace boredom, quit social media, and cut shallow work out of your life. Inside those areas, he provides a rich array of techniques, some rather counter-intuitive, that have worked for him. This is in line with traditional time management guidance: focus on a few important things at a time, get better at saying no, put some effort into planning your day and reviewing that plan, and measure what you're trying to improve. But Newport has less of a focus on any specific system and more of a focus on what one should try to cut out of one's life as much as possible to create space for thinking deeply about problems.

Newport's guidance is constructed around the premise (which seems to have some grounding in psychological research) that focused, concentrated work is less a habit that one needs to maintain than a muscle that one needs to develop. His contention is that multitasking and interrupt-driven work isn't just a distraction that can be independently indulged or avoided each day, but instead degrades one's ability to concentrate over time. People who regularly jump between tasks lose the ability to not jump between tasks. If they want to shift to more focused work, they have to regain that ability with regular, mindful practice. So, when Newport says to embrace boredom, it's not just due to the value of quiet and unstructured moments. He argues that reaching for one's phone to scroll through social media in each moment of threatened boredom undermines one's ability to focus in other areas of life.

I'm not sure I'm as convinced as Newport is, but I've been watching my own behavior closely since I read this book and I think there's some truth here. I picked this book up because I've been feeling vaguely dissatisfied with my ability to apply concentrated attention to larger projects, and because I have a tendency to return to a comfort zone of unchallenging tasks that I already know how to do. Newport would connect that to a job with an open plan office, a very interrupt-driven communications culture, and my personal habits, outside of work hours, of multitasking between TV, on-line chat, and some project I'm working on.

I'm not particularly happy about that diagnosis. I don't like being bored, I greatly appreciate the ability to pull out my phone and occupy my mind while I'm waiting in line, and I have several very enjoyable hobbies that only take "half a brain," which I neither want to devote time to exclusively nor want to stop doing entirely. But it's hard to argue with the feeling that my brain skitters away from concentrating on one thing for extended periods of time, and it does feel like an underexercised muscle.

Some of Newport's approach seems clearly correct: block out time in your schedule for uninterrupted work, find places to work that minimize distractions, and batch things like email and work chat instead of letting yourself be constantly interrupted by them. I've already noticed how dramatically more productive I am when working from home than working in an open plan office, even though the office doesn't bother me in the moment. The problems with an open plan office are real, and the benefits seem largely imaginary. (Newport dismantles the myth of open office creativity and contrasts it with famously creative workplaces like MIT and Bell Labs that used a hub and spoke model, where people would encounter each other to exchange ideas and then retreat into quiet and isolated spaces to do actual work.) And Newport's critique of social media seemed on point to me: it's not that it offers no benefits, but it is carefully designed to attract time and attention entirely out of proportion to the benefits that it offers, because that's the business model of social media companies.

Like any time management book, some of his other advice is less convincing. He makes a strong enough argument for blocking out every hour of your day (and then revising the schedule repeatedly through the day as needed) that I want to try it again, but I've attempted that in the past and it didn't go well at all. I'm similarly dubious of my ability to think through a problem while walking, since most of the problems I work on rely on the ability to do research, take notes, or start writing code while I work through the problem. But Newport presents all of this as examples drawn from his personal habits, and cares less about presenting a system than about convincing the reader that it's both valuable and possible to carve out thinking space for oneself and improve one's capacity for sustained concentration.

This book is explicitly focused on people with office jobs who are rewarded for tackling somewhat open-ended problems and finding creative solutions. It may not resonate with people in other lines of work, particularly people whose jobs are the interrupts (customer service jobs, for example). But its target profile fits me and a lot of others in the tech industry. If you're in that group, I think you'll find this thought-provoking.

Recommended, particularly if you're feeling harried, have the itch to do something deeper or more interesting, and feel like you're being constantly pulled away by minutiae.

You can get a sample of Newport's writing in his Study Habits blog, although be warned that some of the current moral panic about excessive smartphone and social media use creeps into his writing there. (He's currently working on a book on digital minimalism, so if you're allergic to people who have caught the minimalism bug, his blog will be more irritating than this book.) I appreciated him keeping the moral panic out of this book and instead focusing on more concrete and measurable benefits.

Rating: 8 out of 10

13 May, 2018 04:32AM

May 12, 2018

hackergotchi for Norbert Preining

Norbert Preining

MySql DateTime/Timestamp fields and Scala

In one of my work projects we use Play Framework on Scala to provide an API (how surprising ;-). For quite some time I was hunting for lots of lost milliseconds: the API answered terribly late compared to hammering directly at the MySql server. It turned out to be a problem of interaction between the MySql DateTime format and Scala.

It sounded like too nice an idea to save our traffic data in a MySql database with the timestamp saved in a DateTime or Timestamp column. Display in the MySql Workbench looks nice and easily readable. But somehow our API server’s response was horribly slow, especially when there were several requests. Hours and hours of tuning SQL code, trying to turn off sorting, adding extra indices: none of that was to any avail.

The solution was rather trivial: the actual time is lost not in the SQL part, nor in the processing in our Scala code, but in the conversion from MySql DateTime/Timestamp objects to Scala/Java Timestamps. We are using ActiveRecord for Scala, a very nice and convenient library, which converts MySql DateTime/Timestamps to Scala/Java Timestamps. But this conversion seems, especially for a large number of entries, to become rather slow. With months of traffic data and hundreds of thousands of timestamps to convert, the API collapsed to unacceptable response times.

Converting the whole pipeline (from data producer to database and API) to use plain simple Long values boosted the API performance considerably. Lesson learned: don’t use MySql DateTime/Timestamp if you need lots of conversions.

12 May, 2018 01:21AM by Norbert Preining

May 11, 2018

hackergotchi for Sune Vuorela

Sune Vuorela

Modern C++ and Qt – part 2.

I recently did a short tongue-in-cheek blog post about Qt and modern C++. In the comments, people discovered that several compilers can effectively optimize std::make_unique<>().release() to a simple new statement, which was kind of a surprise to me.

I have recently written a new program from scratch (more about that later), and I tried to force myself to use standard library smart pointers much more than I normally do.

I ended up trying to apply a set of rules for memory handling to my code base to try to see where it could end.

  • No naked deletes
  • No new statements, unless the result is handed directly to a Qt function taking ownership of the pointer. (To avoid silliness like that in the previous post.)
  • Raw pointers in the code are observer pointers. We can do this in new code, but in older code it is hard to argue that.

It resulted in code like

    m_document = std::make_unique<QTextDocument>();
    auto layout = std::make_unique<QHBoxLayout>();
    auto textView = std::make_unique<QTextBrowser>();

By itself, it is quite ok to work with, and we get all ownership transfers documented. So maybe we should start coding this way.

There is still a hole in the ownership hand-off, but given that Qt methods don’t throw, it shouldn’t be much of a problem.
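The rules above can be sketched without Qt. In the following minimal example, FakeWidget is a hypothetical stand-in for a Qt-style parent that takes ownership of raw pointers (it is not a Qt class), and adopt() and buildTree() are made-up names playing the role of "a Qt function taking ownership" and of the setup code. The only claim is about the shape of the code: allocate with std::make_unique, call release() exactly at the hand-off, and treat any remaining raw pointer as an observer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a Qt-style parent object; NOT a Qt class.
// It owns the children handed to it and destroys them with itself.
class FakeWidget {
public:
    FakeWidget() { ++s_live; }
    ~FakeWidget() { --s_live; }
    FakeWidget(const FakeWidget&) = delete;
    FakeWidget& operator=(const FakeWidget&) = delete;

    // Plays the role of a Qt function taking ownership of a raw pointer.
    void adopt(FakeWidget* child) {
        assert(child != nullptr);
        // If this call threw (e.g. on allocation failure), the child would
        // leak: this is the "hole" in the ownership hand-off.
        m_children.emplace_back(child);
    }

    std::size_t childCount() const { return m_children.size(); }

    // Number of FakeWidget objects currently alive (for illustration only).
    static int liveCount() { return s_live; }

private:
    std::vector<std::unique_ptr<FakeWidget>> m_children;
    static int s_live;
};

int FakeWidget::s_live = 0;

// Builds a three-object tree: no naked new, no naked delete.
std::unique_ptr<FakeWidget> buildTree() {
    auto root = std::make_unique<FakeWidget>();
    auto layout = std::make_unique<FakeWidget>();
    auto textView = std::make_unique<FakeWidget>();

    // Observer pointer: never deleted here, valid while the tree owns it.
    FakeWidget* layoutObserver = layout.get();

    // Each release() is the visible, documented ownership transfer.
    layout->adopt(textView.release());
    root->adopt(layout.release());

    assert(layoutObserver->childCount() == 1);
    return root;
}
```

Calling buildTree() returns the root as a std::unique_ptr; destroying it tears down the whole tree. Between release() and the moment the callee stores the pointer, nothing owns the object, which is why the hand-off is only safe when the receiving function cannot throw.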

More about my new fancy / boring application at a later point.

I still haven’t fully embraced the c++17 thingies. My mental baseline is kind of the compiler in Debian Stable.

11 May, 2018 06:18PM by Sune Vuorela

hackergotchi for Daniel Kahn Gillmor

Daniel Kahn Gillmor

E-mail Cryptography

I've been working on cryptographic e-mail software for many years now, and i want to set down some of my observations of what i think some of the challenges are. I'm involved in Autocrypt, which is making great strides in sensible key management (see the last section below, which is short not because i think it's easy, but because i think Autocrypt has covered this area quite well), but there are additional nuances to the mechanics and user experience of e-mail encryption that i need to get off my chest.

Feedback welcome!

Cryptography and E-mail Messages

Cryptographic protection (i.e., digital signatures, encryption) of e-mail messages has a complex history. There are several different ways that various parts of an e-mail message can be protected (or not), and those mechanisms can be combined in a huge number of ways.

In contrast to the technical complexity, users of e-mail tend to expect a fairly straightforward experience. They also have little to no expectation of explicit cryptographic protections for their messages, whether for authenticity, for confidentiality, or for integrity.

If we want to change this -- if we want users to be able to rely on cryptographic protections for some e-mail messages in their existing e-mail accounts -- we need to be able to explain those protections without getting in the user's way.

Why expose cryptographic protections to the user at all?

For a new messaging service, the service itself can simply enumerate the set of properties that all messages exchanged through the service must have, design the system to bundle those properties with message deliverability, and then users don't need to see any of the details for any given message. The presence of the message in that messaging service is enough to communicate its security properties to the extent that the users care about those properties.

However, e-mail is a widely deployed, heterogeneous, legacy system, and even the most sophisticated users will always interact with some messages that lack cryptographic protections.

So if we think those protections are meaningful, and we want users to be able to respond to a protected message at all differently from how they respond to an unprotected message (or if they want to know whether the message they're sending will be protected, so they can decide how much to reveal in it), we're faced with the challenge of explaining those protections to users at some level.


The best level at which to display cryptographic protections for a typical e-mail user is on a per-message basis.

Wider than per-message (e.g., describing protections on a per-correspondent or a per-thread basis) is likely to stumble on mixed statuses, particularly when other users switch e-mail clients that don't provide the same cryptographic protections, or when people are added to or removed from a thread.

Narrower than per-message (e.g., describing protections on a per-MIME-part basis, or even within a MIME part) is too confusing: most users do not understand the structure of an e-mail message at a technical level, and are unlikely to be able to (or want to) spend any time learning about it. And a message with some cryptographic protection and other tamperable user-facing parts is a tempting vector for attack.

So at most, an e-mail should have one cryptographic state that covers the entire message.

At most, the user probably wants to know:

  • Is the content of this message known only to me and the sender (and the other people in Cc)? (Confidentiality)

  • Did this message come from the person I think it came from, as they wrote it? (Integrity and Authenticity)

Any more detail than this is potentially confusing or distracting.


Is it possible to combine the two aspects described above into something even simpler? That would be nice, because it would allow us to categorize a message as either "protected" or "not protected". But there are four possible combinations:

  • unsigned cleartext messages: these are clearly "not protected"

  • signed encrypted messages: these are clearly "protected" (though see further sections below for more troubling caveats)

  • signed cleartext messages: these are useful in cases where confidentiality is irrelevant -- posts to a publicly-archived mailing list, for example, or announcement e-mails about a new version of some piece of software. It's hard to see how we can get away with ignoring this category.

  • unsigned encrypted messages: There are people who send encrypted messages who don't want to sign those messages, for a number of reasons (e.g., concern over the reuse/misuse of their signing key, and wanting to be able to send anonymous messages). Whether you think those reasons are valid or not, some signed messages cannot be validated. For example:

    • the signature was made improperly,
    • the signature was made with an unknown key,
    • the signature was made using an algorithm the message recipient doesn't know how to interpret
    • the signature was made with a key that the recipient believes is broken/bad

    We have to handle receipt of signed+encrypted messages with any of these signature failures, so we should probably deal with unsigned encrypted messages in the same way.

My conclusion is that we need to be able to represent these states separately to the user (or at least to the MUA, so it can plan sensible actions), even though i would prefer a simpler representation.
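
As a rough illustration of keeping the two indicators separate, a MUA could carry a small per-message record like the one below. This is only a sketch: the class name, field names and labels are made up here, and it assumes (as suggested above) that a signature which cannot be validated is handled like a missing one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageCryptoState:
    """Per-message summary; the class and field names are illustrative assumptions."""
    encrypted: bool  # confidentiality: was the whole message encrypted?
    signed: bool     # integrity/authenticity: valid signature over the whole message?

    def label(self) -> str:
        # The four combinations discussed above, kept distinct on purpose.
        if self.encrypted and self.signed:
            return "signed encrypted"
        if self.encrypted:
            return "unsigned encrypted"
        if self.signed:
            return "signed cleartext"
        return "unsigned cleartext"


def state_from_signature_check(encrypted: bool, signature_valid: bool) -> MessageCryptoState:
    # A signature that cannot be validated is treated like no signature at all.
    return MessageCryptoState(encrypted=encrypted, signed=signature_valid)
```

With this shape, a signed+encrypted message whose signature fails validation simply ends up labelled "unsigned encrypted".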

Note that some other message encryption schemes (such as those based on shared symmetric keying material, where message signatures are not used for authenticity) may not actually need these distinctions, and can therefore get away with the simpler "protected/not protected" message state. I am unaware of any such scheme being used for e-mail today.

Partial protections

Sadly, the current encrypted e-mail mechanisms are likely to make even these proposed two indicators blurry if we try to represent them in detail. To avoid adding to user confusion, we need to draw some bright lines.

  • For integrity and authenticity, either the entire message is signed and integrity-checked, or it isn't. We must not report messages as being signed when only a part of the message is signed, or when the signature comes from someone not in the From: field. We should probably also not present "broken signature" status any differently than we present unsigned mail. See discussion on the enigmail mailing list about some of these tradeoffs.

  • For confidentiality, the user likely cares that the entire message was confidential. But there are some circumstances (e.g., when replying to an e-mail, and deciding whether to encrypt or not) when they likely care if any part of the message was confidential (e.g. if an encrypted part is placed next to a cleartext part).

It's interesting (and frustrating!) to note that these are scoped slightly differently -- that we might care about partial confidentiality but not about partial integrity and authenticity.

Note that while we might care about partial confidentiality, actually representing which parts of a message were confidential represents a significant UI challenge in most MUAs.

To the extent that a MUA decides it wants to display details of a partially-protected message, i recommend that MUA strip/remove all non-protected parts of the message, and just show the user the (remaining) protected parts. In the event that a message has partial protections like this, the MUA may need to offer the user a choice of seeing the entire partially-protected message, or the stripped down message that has complete protections.

To the extent that we expect to see partially-protected messages in the real world, further UI/UX exploration would be welcome. It would be great to imagine a world where those messages simply don't exist though :)

Cryptographic Mechanism

There are three major categories of cryptographic protection for e-mail in use today: Inline PGP, PGP/MIME, and S/MIME.

Inline PGP

I've argued elsewhere (and it remains true) that Inline PGP signatures are terrible. Inline PGP encryption is also terrible, but in different ways:

  • it doesn't protect the structure of the message (e.g., the number and size of attachments is visible)

  • it has no way of protecting confidential message headers (see the Protected Headers section below)

  • it is very difficult to safely represent to the user what has been encrypted and what has not, particularly if the message body extends beyond the encrypted block.

No MUA should ever emit messages using inline PGP, either for signatures or for encryption. And no MUA should ever display an inline-PGP-signed block as though it was signed. Don't even bother to validate such a signature.

However, some e-mails will arrive using inline PGP encryption, and responsible MUAs probably need to figure out what to show to the user in that case, because the user wants to know what's there. :/


PGP/MIME and S/MIME are roughly equivalent to one another, with the largest difference being their certificate format. PGP/MIME messages are signed/encrypted with certificates that follow the OpenPGP specification, while S/MIME messages rely on certificates that follow the X.509 specification.

The cryptographic protections of both PGP/MIME and S/MIME work at the MIME layer, providing particular forms of cryptographic protection around a subtree of other MIME parts.

Both standards have very similar existing flaws that must be remedied or worked around in order to have sensible user experience for encrypted mail.

This document has no preference of one message format over the other, but acknowledges that it's likely that both will continue to exist for quite some time. To the extent possible, a sensible MUA that wants to provide the largest coverage will be able to support both message formats and both certificate formats, hopefully with the same fixes to the underlying problems.

Cryptographic Envelope

Given that the plausible standards (PGP/MIME and S/MIME) both work at the MIME layer, it's worth thinking about the MIME structure of a cryptographically-protected e-mail message. I introduce here two terms related to an e-mail message: the "Cryptographic Envelope" and the "Cryptographic Payload".

Consider the MIME structure of a simple cleartext PGP/MIME signed message:

0A └┬╴multipart/signed
0B  ├─╴text/plain
0C  └─╴application/pgp-signature

Consider also the simplest PGP/MIME encrypted message:

1A └┬╴multipart/encrypted
1B  ├─╴application/pgp-encrypted
1C  └─╴application/octet-stream
1D     ╤ <<decryption>>
1E     └─╴text/plain

Or, an S/MIME encrypted message:

2A └─╴application/pkcs7-mime; smime-type=enveloped-data
2B     ╤ <<decryption>>
2C     └─╴text/plain

Note that the PGP/MIME decryption step (denoted "1D" above) may also include a cryptographic signature that can be verified, as a part of that decryption. This is not the case with S/MIME, where the signing layer is always separated from the encryption layer.

Also note that any of these layers of protection may be nested, like so:

3A └┬╴multipart/encrypted
3B  ├─╴application/pgp-encrypted
3C  └─╴application/octet-stream
3D     ╤ <<decryption>>
3E     └┬╴multipart/signed
3F      ├─╴text/plain
3G      └─╴application/pgp-signature

For an e-mail message that has some set of these layers, we define the "Cryptographic Envelope" as the layers of cryptographic protection that start at the root of the message and extend until the first non-cryptographic MIME part is encountered.

Cryptographic Payload

We can call the first non-cryptographic MIME part we encounter (via depth-first search) the "Cryptographic Payload". In the examples above, the Cryptographic Payload parts are labeled 0B, 1E, 2C, and 3F. Note that the Cryptographic Payload itself could be a multipart MIME object, like 4E below:

4A └┬╴multipart/encrypted
4B  ├─╴application/pgp-encrypted
4C  └─╴application/octet-stream
4D     ╤ <<decryption>>
4E     └┬╴multipart/alternative
4F      ├─╴text/plain
4G      └─╴text/html

In this case, the full subtree rooted at 4E is the "Cryptographic Payload".
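
The two definitions can be sketched as a walk down from the root of the message. The code below uses a toy `Part` class rather than a real MIME parser, and assumes decryption has already happened elsewhere, with the cleartext subtree attached as `decrypted`. All names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Part:
    """Toy MIME part; `decrypted` holds the cleartext subtree, if any."""
    content_type: str
    children: List["Part"] = field(default_factory=list)
    decrypted: Optional["Part"] = None


def next_layer(part: Part) -> Optional[Part]:
    """Return the part wrapped by a cryptographic layer, or None if `part` is not one."""
    if part.content_type == "multipart/signed" and part.children:
        return part.children[0]            # e.g. 0A -> 0B
    if part.content_type == "multipart/encrypted" and len(part.children) == 2:
        return part.children[1].decrypted  # e.g. 1A -> 1C -> 1E (None if undecryptable)
    if part.content_type == "application/pkcs7-mime":
        return part.decrypted              # e.g. 2A -> 2C
    return None


def split_envelope(root: Part) -> Tuple[List[str], Part]:
    """Walk down from the root; return (envelope layer types, payload)."""
    envelope: List[str] = []
    part = root
    while (inner := next_layer(part)) is not None:
        envelope.append(part.content_type)
        part = inner
    return envelope, part


# Example 4 above: encryption around a multipart/alternative payload.
payload = Part("multipart/alternative", [Part("text/plain"), Part("text/html")])
msg = Part("multipart/encrypted", [
    Part("application/pgp-encrypted"),
    Part("application/octet-stream", decrypted=payload),
])
envelope, found = split_envelope(msg)
```

For a message whose root is not a cryptographic layer, the same function returns an empty envelope and the root itself as the payload.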

The cryptographic properties of the message should be derived from the layers in the Cryptographic Envelope, and nothing else, in particular:

  • the cryptographic signature associated with the message, and
  • whether the message is "fully" encrypted or not.

Note that if some subpart of the message is protected, but the cryptographic protections don't start at the root of the MIME structure, there is no message-wide cryptographic envelope, and therefore there either is no Cryptographic Payload, or (equivalently) the whole message (5A here) is the Cryptographic Payload, but with a null Cryptographic Envelope:

5A └┬╴multipart/mixed
5B  ├┬╴multipart/signed
5C  │├─╴text/plain
5D  │└─╴application/pgp-signature
5E  └─╴text/plain

Note also that if there are any nested encrypted parts, they do not count toward the Cryptographic Envelope, but may mean that the message is "partially encrypted", albeit with a null Cryptographic Envelope:

6A └┬╴multipart/mixed
6B  ├┬╴multipart/encrypted
6C  │├─╴application/pgp-encrypted
6D  │└─╴application/octet-stream
6E  │   ╤ <<decryption>>
6F  │   └─╴text/plain
6G  └─╴text/plain
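
A MUA that wants to notice such partially-protected messages (for example, to decide whether a reply should be encrypted) could scan below a non-cryptographic root. The sketch below represents a MIME tree as nested `(content_type, children)` tuples, which is an illustrative stand-in and not a real parser. As a simplification it treats every application/pkcs7-mime part as encryption, ignoring smime-type=signed-data.

```python
from typing import List, Tuple

# A MIME tree as (content_type, children); an illustrative stand-in for a real parser.
Tree = Tuple[str, List["Tree"]]

ENCRYPTING = {"multipart/encrypted", "application/pkcs7-mime"}  # simplification, see above
SIGNING = {"multipart/signed"}


def nested_protections(root: Tree) -> set:
    """For a message with a null Cryptographic Envelope, report which kinds of
    protection ("encrypted", "signed") appear somewhere below the root."""
    found = set()
    stack = list(root[1])  # start below the root, which is non-cryptographic here
    while stack:
        ctype, children = stack.pop()
        if ctype in ENCRYPTING:
            found.add("encrypted")
        elif ctype in SIGNING:
            found.add("signed")
        stack.extend(children)
    return found


# Examples 5 and 6 above.
example5 = ("multipart/mixed", [
    ("multipart/signed", [("text/plain", []), ("application/pgp-signature", [])]),
    ("text/plain", []),
])
example6 = ("multipart/mixed", [
    ("multipart/encrypted", [("application/pgp-encrypted", []),
                             ("application/octet-stream", [])]),
    ("text/plain", []),
])
```

Following the bright lines drawn earlier, a result containing "signed" would still be displayed as unsigned, while "encrypted" might reasonably influence how a reply is composed.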

Layering within the Envelope

The order and number of the layers in the Cryptographic Envelope might make a difference in how the message's cryptographic properties should be considered.

signed+encrypted vs encrypted+signed

One difference is whether the signature is made over the encrypted data, or whether the encryption is done over the signature. Encryption around a signature means that the signature was hidden from an adversary. And a signature around the encryption indicates that the sender may not know the actual contents of what was signed.

The common expectation is that the signature will be inside the encryption. This means that the signer likely had access to the cleartext, and it is likely that the existence of the signature is hidden from an adversary, both of which are sensible properties to want.

Multiple layers of signatures or encryption

Some specifications define triple-layering: signatures around encryption around signatures. It's not clear that this is in wide use, or how any particular MUA should present such a message to the user.

In the event that there are multiple layers of protection of a given kind in the Cryptographic Envelope, the message should be marked based on the properties of the inner-most layer of encryption, and the inner-most layer of signing. The main reason for this is simplicity -- it is unclear how to indicate arbitrary (and potentially-interleaved) layers of signatures and encryption.
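
The "inner-most layer" rule is simple enough to write down. In this sketch the Cryptographic Envelope is just a list of layer kinds, outermost first, using the made-up labels "sig" and "enc".

```python
from typing import List, Optional, Tuple


def summarize_layers(envelope: List[str]) -> Tuple[Optional[int], Optional[int]]:
    """`envelope` lists layer kinds from outermost to innermost, each either
    "sig" or "enc" (illustrative labels). Returns the indexes of the innermost
    signing layer and the innermost encryption layer, or None if absent."""
    innermost_sig = innermost_enc = None
    for index, kind in enumerate(envelope):
        if kind == "sig":
            innermost_sig = index  # deeper layers overwrite shallower ones
        elif kind == "enc":
            innermost_enc = index
    return innermost_sig, innermost_enc


def signature_inside_encryption(envelope: List[str]) -> bool:
    """True when the innermost signature sits inside the innermost encryption layer."""
    sig, enc = summarize_layers(envelope)
    return sig is not None and enc is not None and sig > enc
```

For a triple-layered message (`["sig", "enc", "sig"]`) this picks the inner signature at index 2 and the encryption at index 1, so the message would be marked from those two layers only.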

(FIXME: what should be done if the inner-most layer of signing can't be validated for some reason, but one of the outer layers of signing does validate? ugh MIME is too complex…)

Signed messages should indicate the intended recipient

Ideally, all signed messages would indicate their intended recipient as a way of defending against some forms of replay attack. For example, Alice sends a signed message to Bob that says "please perform task X"; Bob reformats and forwards the message to Charlie as though it was directly from Alice. Charlie might now believe that Alice is asking him, rather than Bob, to do task X.

Of course, this concern also includes encrypted messages that are also signed. However, there is no clear standard for how to include this information in either an encrypted message or a signed message.

An e-mail specific mechanism is to ensure that the To: and Cc: headers are signed appropriately (see the "Protected Headers" section below).

See also Vincent Breitmoser's proposal of Intended Recipient Fingerprint for OpenPGP as a possible OpenPGP-specific implementation.

However: what should the MUA do if a message is encrypted but no intended recipients are listed? Or what if a signature clearly indicates the intended recipients, but does not include the current reader? Should the MUA render the message differently somehow?
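
Whatever the right rendering turns out to be, detecting these situations is straightforward once the signed To: and Cc: values are available. The sketch below only classifies; the function name and return labels are invented for illustration, and comparing lowercased addresses is a simplification.

```python
from email.utils import getaddresses
from typing import Dict, List


def recipient_status(signed_headers: Dict[str, List[str]], reader: str) -> str:
    """Classify a signed message by whether its signed To:/Cc: name the reader.

    `signed_headers` maps lowercase header names to the raw values that were
    covered by the signature; the returned labels are made up for this sketch."""
    raw = signed_headers.get("to", []) + signed_headers.get("cc", [])
    # Simplification: compare lowercased addresses, ignoring display names.
    addresses = {addr.lower() for _name, addr in getaddresses(raw) if addr}
    if not addresses:
        return "no-recipients-listed"
    if reader.lower() in addresses:
        return "reader-is-intended"
    return "reader-not-listed"
```

In the Alice/Bob/Charlie example, Charlie's MUA would get "reader-not-listed" for the forwarded message, provided the To: header naming Bob was covered by Alice's signature.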

Protected Headers

Sadly, e-mail cryptographic protections have traditionally only covered the body of the e-mail, and not the headers. Most users do not (and should not have to) understand the difference. There are two not-quite-standards for protecting the headers:

  • message wrapping, which puts an entire e-mail message (message/rfc822 MIME part) "inside" the cryptographic protections. This is also discussed in RFC 5751 §3.1. I don't know of any MUAs that implement this.

  • memory hole, which puts headers on the top-level MIME part directly. This is implemented in Enigmail and K-9 mail.

These two different mechanisms are roughly equivalent, with slight differences in how they behave for clients who can handle cryptographic mail but have not implemented them. If a MUA is capable of interpreting one form successfully, it probably is also capable of interpreting the other.

Note that in particular, the cryptographic headers for a given message ought to be derived directly from the headers present (in one of the above two ways) in the root element of the Cryptographic Payload MIME subtree itself. If headers are stored anywhere else (e.g. in one of the leaf nodes of a complex Payload), they should not propagate to the outside of the message.

If the headers the user sees are not protected, that lack of protection may need to be clearly explained and visible to the user. This is unfortunate because it is potentially extremely complex for the UI.

The types of cryptographic protections can differ per header. For example, it's relatively straightforward to pack all of the headers inside the Cryptographic Payload. For a signed message, this would mean that all headers are signed. This is the recommended approach when generating an encrypted message. In this case, the "outside" headers simply match the protected headers. And in the case that the outside headers differ, they can simply be replaced with their protected versions when displayed to the user. This defeats the replay attack described above.
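
The replacement step on the receiving side might look like the following sketch. It assumes both header sets have already been parsed into dictionaries with lowercased names, which glosses over repeated headers; the function names are hypothetical.

```python
from typing import Dict


def headers_for_display(outside: Dict[str, str], protected: Dict[str, str]) -> Dict[str, str]:
    """Prefer the protected copy of every header; fall back to the outside
    value only for headers that have no protected version."""
    shown = dict(outside)
    shown.update(protected)
    return shown


def unprotected_header_names(outside: Dict[str, str], protected: Dict[str, str]) -> set:
    """Names of displayed headers that come only from the unprotected outside."""
    return set(outside) - set(protected)
```

The second function gives the MUA the list of headers whose lack of protection might need to be made visible to the user.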

But for an encrypted message, some of those protected headers will be stripped from the outside of the message, and others will be placed in the outer header in cleartext for the sake of deliverability. In particular, From: and To: and Date: are placed in the clear on the outside of the message.

So, consider a MUA that receives an encrypted, signed message, with all headers present in the Cryptographic Payload (so all headers are signed), but From: and To: and Date: in the clear on the outside. Assume that the external Subject: reads simply "Encrypted Message", but the internal (protected) Subject: is actually "Thursday's Meeting".

When displaying this message, how should the MUA distinguish between the Subject: and the From: and To: and Date: headers? All headers are signed, but only Subject: has been hidden. Should the MUA assume that the user understands that e-mail metadata like this leaks to the MTA? This is unfortunately true today, but not something we want in the long term.

Message-ID and threading headers

Messages that are part of an e-mail thread should ensure that Message-Id: and References: and In-Reply-To: are signed, because those markers provide contextual considerations for the signed content. (e.g., a signed message saying "I like this plan!" means something different depending on which plan is under discussion).

That said, given the state of the e-mail system, it's not clear what a MUA should do if it receives a cryptographically-signed e-mail message where these threading headers are not signed. That is the default today, and we do not want to incur warning fatigue for the user. Furthermore, unlike Date: and Subject: and From: and To: and Cc:, the threading headers are not usually shown directly to the user, but instead affect the location and display of messages.

Perhaps there is room here for some indicator at the thread level, that all messages in a given thread are contextually well-bound? Ugh, more UI complexity.

Protecting Headers during e-mail generation

When generating a cryptographically-protected e-mail (either signed or encrypted or both), the sending MUA should copy all of the headers it knows about into the Cryptographic Payload using one of the two techniques referenced above. For signed-only messages, that is all that needs doing.

The challenging question is for encrypted messages: what headers on the outside of the message (outside the Cryptographic Envelope) can be stripped (removed completely) or stubbed (replaced with a generic or randomized value)?

Subject: should obviously be stubbed -- for most users, the subject is directly associated with the body of the message (it is not thought of as metadata), and the Subject is not needed for deliverability. Since some MTAs might treat a message without a Subject: poorly, and arbitrary Subject lines are a nuisance, it is recommended to use the exact string below for all external Subjects:

Subject: Encrypted Message

However, stripping or stubbing other headers is more complex.

The date header can likely be stripped from the outside of an encrypted message, or can have its temporal resolution made much more coarse. However, this doesn't protect much information from the MTAs that touch the message, since they are likely to see the message when it is in transit. It may protect the message from some metadata analysis as it sits on disk, though.
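
As one possible reading of "more coarse", the sketch below rounds the send time down to midnight UTC for the outside Date: header, while the protected copy would keep the exact time. The one-day granularity and the function name are arbitrary choices made for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def coarse_date_header(sent: datetime) -> str:
    """Round a timezone-aware send time down to midnight UTC and format it
    as an RFC 5322 date for the *outside* Date: header."""
    utc = sent.astimezone(timezone.utc)
    coarse = utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return format_datetime(coarse)
```

Converting to UTC first also avoids leaking the sender's local timezone offset in the outside header.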

The To: and Cc: headers could be stripped entirely in some cases, though that may make the e-mail more prone to being flagged as spam. However, some e-mail messages sent to Bcc groups are still deliverable, with a header of

To: undisclosed-recipients:;

Note that the Cryptographic Envelope itself may leak metadata about the recipient (or recipients), so stripping this information from the external header may not be useful unless the Cryptographic Envelope is also stripped of metadata appropriately.

The From: header could also be stripped or stubbed. It's not clear whether such a message would be deliverable, particularly given DKIM and DMARC rules for incoming domains. Note that the MTA will still see the SMTP MAIL FROM: verb before the message body is sent, and will use the address there to route bounces or DSNs. However, once the message is delivered, a stripped From: header is an improvement in the metadata available on-disk. Perhaps this is something that a friendly/cooperative MTA could do for the user?

Even worse is the Message-Id: header and the associated In-Reply-To: and References: headers. Some MUAs (like notmuch) rely heavily on the Message-Id:. A message with a stubbed-out Message-Id would effectively change its Message-Id: when it is decrypted. This may not be a straightforward or safe process for MUAs that are Message-ID-centric. That said, a randomized external Message-ID: header could help to avoid leaking the fact that the same message was sent to multiple people, so long as the message encryption to each person was also made distinct.

Stripped In-Reply-To: and References: headers are also a clear metadata win -- the MTA can no longer tell which messages are associated with each other. However, this means that an incoming message cannot be associated with a relevant thread without decrypting it, something that some MUAs may not be in a position to do.

Recommendation for encrypted message generation in 2018: copy all headers during message generation; stub out only the Subject for now.
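
Expressed as code, the recommendation amounts to very little. This sketch works on plain `(name, value)` header lists and leaves out the actual MIME construction and encryption; the function and type names are hypothetical.

```python
from typing import List, Tuple

Headers = List[Tuple[str, str]]
STUB_SUBJECT = "Encrypted Message"  # the exact string recommended above


def split_headers(draft_headers: Headers) -> Tuple[Headers, Headers]:
    """Return (protected, outside) header lists for an encrypted message.

    `protected` is a full copy destined for the Cryptographic Payload;
    `outside` is identical except that Subject: is stubbed."""
    protected = list(draft_headers)
    outside = [
        (name, STUB_SUBJECT if name.lower() == "subject" else value)
        for name, value in draft_headers
    ]
    return protected, outside
```

A bolder MUA would extend the list comprehension to strip or stub further fields, which is where the deliverability questions discussed above come in.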

Bold MUAs may choose to experiment with stripping or stubbing other fields beyond Subject:, possibly in response to some sort of signal from the recipient that they believe that stripping or stubbing some headers is acceptable. Where should such a signal live? Perhaps a notation in the recipient's certificate would be useful.

Key management

Key management bedevils every cryptographic scheme, e-mail or otherwise. The simplest solution for users is to automate key management as much as possible, making reasonable decisions for them. The Autocrypt project outlines a sensible approach here, so i'll leave most of this section short and hope that it's covered by Autocrypt. While fully-automated key management is likely to be susceptible either to MITM attacks or trusted third parties (depending on the design), as a community we need to experiment with ways to provide straightforward (possibly gamified?) user experience that enables and encourages people to do key verification in a fun and simple way. This should probably be done without ever mentioning the word "key", if possible. Serious UI/UX work is needed. I'm hoping future versions of Autocrypt will cover that territory.

But however key management is done, the result for the e-mail user experience is that the MUA will have some sense of the "validity" of a key being used for any particular correspondent. If it is expressed at all, it should be done as simply as possible by default. In particular, MUAs should avoid confusing the user with distinct (nearly orthogonal) notions of "trust" and "validity" while reading messages, and should not necessarily associate the validity of a correspondent's key with the validity of a message cryptographically associated with that correspondent's key. Identity is not the same thing as message integrity, and trustworthiness is not the same thing as identity either.

Key changes over time

Key management is hard enough in the moment. With a store-and-forward system like e-mail, evaluating the validity of a signed message a year after it was received is tough. Your concept of the correspondent's correct key may have changed, for example. I think our understanding of what to do in this context is not currently clear.

11 May, 2018 04:04PM by Daniel Kahn Gillmor

Sven Hoexter

Replacing hp-health on gen10 HPE DL360

A small followup regarding the replacement of hp-health and hpssacli. Turns out a few things have to be replaced; lucky are all of you already running on someone else's computer, where you do not have to take care of the hardware.


According to the super nice and helpful Craig L. at HPE they're planning an update for the MCP ssacli for Ubuntu 18.04. This one will also support the SmartArray firmware 1.34. If you need it now you should be able to use the one released for RHEL and SLES. I did not test it.

replacing hp-health

The master plan is to query the iLO. Basically there are two ways: either locally via hponcfg, or remotely via a Perl script sample provided by HPE along with many helpful RIBCL XML file examples. Neither approach is cool because you have to deal with a lot of XML, so opt for a third way and use the awesome python-hpilo module (part of Debian/stretch), which nicely abstracts all the RIBCL XML stuff away from you.

If you'd like to have a taste of it: I had to reset a few iLO passwords to something sane, without quotes, double quotes and backticks, and did it like this:

pwfile="ilo-pwlist-$(date +%s)"

for x in $(seq -w 004 006); do
  host="host${x}"  # hypothetical naming scheme; adjust to your own hostnames
  pw=$(pwgen -n 24 1)
  echo "${host},${pw}" >> "$pwfile"
  ssh "$host" "echo \"<RIBCL VERSION=\\\"2.0\\\"><LOGIN USER_LOGIN=\\\"adminname\\\" PASSWORD=\\\"password\\\"><USER_INFO MODE=\\\"write\\\"><MOD_USER USER_LOGIN=\\\"Administrator\\\"><PASSWORD value=\\\"$pw\\\"/></MOD_USER></USER_INFO></LOGIN></RIBCL>\" > /tmp/setpw.xml"
  ssh "$host" "sudo hponcfg -f /tmp/setpw.xml && rm /tmp/setpw.xml"
done
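
If the triple-escaped quoting gets too painful, the same RIBCL document can be generated with Python's standard library, which takes care of XML escaping. This is a sketch only: the element and attribute names are copied from the shell snippet, the function name is made up, and whether hponcfg accepts escaped characters such as `&quot;` in a password has not been tested.

```python
import xml.etree.ElementTree as ET


def build_setpw_ribcl(login_user: str, login_password: str,
                      target_user: str, new_password: str) -> str:
    """Build the RIBCL document that changes `target_user`'s password."""
    ribcl = ET.Element("RIBCL", VERSION="2.0")
    login = ET.SubElement(ribcl, "LOGIN", USER_LOGIN=login_user, PASSWORD=login_password)
    user_info = ET.SubElement(login, "USER_INFO", MODE="write")
    mod_user = ET.SubElement(user_info, "MOD_USER", USER_LOGIN=target_user)
    ET.SubElement(mod_user, "PASSWORD", value=new_password)
    return ET.tostring(ribcl, encoding="unicode")
```

The resulting string could be written to a file and fed to `hponcfg -f` just like the hand-written version.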

After I regained access to all iLO devices I used the hpilo_cli helper to add a monitoring user:

while read -r line; do
  host=$(echo $line|cut -d',' -f 1)
  pw=$(echo $line|cut -d',' -f 2)
  hpilo_cli -l Administrator -p $pw $host add_user user_login="monitoring" user_name="monitoring" password="secret" admin_priv=False remote_cons_priv=False reset_server_priv=False virtual_media_priv=False config_ilo_priv=False
done < ${1}

The helper script to actually query the iLO interfaces from our monitoring is, in comparison to those ad-hoc shell hacks, rather nice:

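import argparse

import hpilo

# Credentials of the monitoring user created above.
iloUser = "monitoring"
iloPassword = "secret"

parser = argparse.ArgumentParser()
parser.add_argument("component", help="HW component to query", choices=['battery', 'bios_hardware', 'fans', 'memory', 'network', 'power_supplies', 'processor', 'storage', 'temperature'])
parser.add_argument("host", help="iLO Hostname or IP address to connect to")
args = parser.parse_args()


def askIloHealth(component, host, user, password):
    # Fetch the complete health report from the iLO.
    ilo = hpilo.Ilo(host, user, password)
    health = ilo.get_embedded_health()
    # Assumption: the report is a dict keyed by the component names listed above.
    print(health[component])


askIloHealth(args.component, args.host, iloUser, iloPassword)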

You can also take a look at a more detailed state if you pprint the complete stuff returned by "get_embedded_health". This whole approach of using the iLO should work since iLO 3. I tested version 4 and 5.
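
For monitoring purposes it can be handy to boil the nested result down to the entries that are not healthy. The sketch below makes no claim about the exact layout python-hpilo returns: it walks whatever nested dicts and lists it is given and assumes only that health states live under keys named "status", with "OK" counting as healthy. Both are assumptions to verify against real output.

```python
from typing import Any, Iterator, Tuple


def find_problems(data: Any, healthy=frozenset({"OK"}),
                  path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, str]]:
    """Recursively yield (path, status) for every "status" entry whose value
    is not in `healthy`. The key name "status" is an assumption."""
    if isinstance(data, dict):
        for key, value in data.items():
            if key == "status" and isinstance(value, str):
                if value not in healthy:
                    yield "/".join(path), value
            else:
                yield from find_problems(value, healthy, path + (str(key),))
    elif isinstance(data, (list, tuple)):
        for index, item in enumerate(data):
            yield from find_problems(item, healthy, path + (str(index),))
```

It would be called as `list(find_problems(health))` on the dict returned by "get_embedded_health"; the `healthy` set likely needs extending for states that are fine on a given machine.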

11 May, 2018 03:40PM

May 10, 2018

Shirish Agarwal

Reviewing Agent 6

The city I come from, Pune, has been experiencing something of a heat-wave, so I have been cutting back on a lot of work and getting a lot of back-dated reading done. One of the first books I read was Tom Rob Smith’s Agent 6. Fortunately, I read only the third book and not the first two, which from the synopsis seem to be more gruesome than the one I read, so I guess there is something to be thankful for.

Agent 6 copyright - Tom Rob Smith & Publishers

While I was reading the book, I had thought that the MGB was a fictitious organization invented by the author. But a quick look at Wikipedia told me that it is what the KGB was later based upon.

I found the book to be both an easy read and a layered one. I was lucky to get a large-print version of the book, so I was able to share the experience with my mother as well. The book is somewhat hefty, as it tops out at around 600 pages, although Amazon lists it at 480 pages.

As I had shared previously, I had read Russka and had been disappointed to see how the Russian public were let down time and again in their hopes for democracy. I do understand that the book (Russka) was written by a western author and could have tapped into some unconscious biases, but it seemed accurate as far as I could tell from public resources. That is a story I may return to at a future date; this time it is about Agent 6.

I found the book pretty immersive, and at the same time it led me to think about the many threads the author touches on before moving on. I was left wondering, and many times just had to sleep on it and think deep thoughts, as there was quite a lot to chew on.

I am not going to spoil any surprises except to say there are quite a few twists, and the ending was not what I expected.

In the end, if you appreciate politics, history and a bit of adventure, and have a bit of patience, the book is bound to reward you. It is not meant to be a page-turner, but if you are one who enjoys savoring your drink, you are going to enjoy it thoroughly.

10 May, 2018 08:39PM by shirishag75

Jonathan McDowell

Home Automation: Getting started with MQTT

I’ve been thinking about trying to sort out some home automation bits. I’ve moved from having a 7 day heating timer to a 24 hour timer and I’d forgotten how annoying that is at weekends. I’d like to monitor temperatures in various rooms and use that, along with presence detection, to be a bit more intelligent about turning the heat on. Equally I wouldn’t mind tying my Alexa in to do some voice control of lighting (eventually maybe even using DeepSpeech to keep everything local).

Before all of that I need to get the basics in place. This is the first in a series of posts about putting together the right building blocks to allow some useful level of home automation / central control. The first step is working out how to glue everything together. A few years back someone told me MQTT was the way forward for IoT applications, being more lightweight than a RESTful interface and thus better suited to small devices. At the time I wasn’t convinced, but it seems they were right and MQTT is one of the more popular ways of gluing things together.

I found the HiveMQ series on MQTT Essentials to be a pretty good intro; my main takeaway was that MQTT allows for a single message broker to enable clients to publish data and multiple subscribers to consume that data. TLS is supported for secure data transfer and there’s a whole bunch of different brokers and client libraries available. The use of a broker is potentially helpful in dealing with firewalling; clients and subscribers only need to talk to the broker, rather than requiring any direct connection.

With all that in mind I decided to set up a broker to play with the basics. I made the decision that it should run on my OpenWRT router - all the devices I want to hook up can easily see that device, and if it’s down then none of them are going to be able to get to a broker hosted anywhere else anyway. I’d seen plenty of info about Mosquitto and it’s already in the OpenWRT package repository. So I sorted out a Let’s Encrypt cert, installed Mosquitto and created a couple of test users:

opkg install mosquitto-ssl
mosquitto_passwd -b /etc/mosquitto/mosquitto.users user1 foo
mosquitto_passwd -b /etc/mosquitto/mosquitto.users user2 bar
chown mosquitto /etc/mosquitto/mosquitto.users
chmod 600 /etc/mosquitto/mosquitto.users

I then edited /etc/mosquitto/mosquitto.conf and made sure the following are set. In particular you need cafile set in order to enable TLS:

port 8883
cafile /etc/ssl/lets-encrypt-x3-cross-signed.pem
certfile /etc/ssl/mqtt.crt
keyfile /etc/ssl/mqtt.key

log_dest syslog

allow_anonymous false

password_file /etc/mosquitto/mosquitto.users
acl_file /etc/mosquitto/mosquitto.acl

Finally I created /etc/mosquitto/mosquitto.acl with the following:

user user1
topic readwrite #

user user2
topic read ro/#
topic readwrite test/#

That gives me user1 who has full access to everything, and user2 with readonly access to the ro/ tree and read/write access to the test/ tree.
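
To make the effect of those ACL lines concrete, here is a minimal Python sketch of how MQTT topic filters such as # and test/# select topics, and how the two users above end up with different rights. It is a simplified model, not Mosquitto's actual implementation: the names topic_matches, allowed and ACL are made up for illustration, and special cases such as $SYS topics are ignored.

```python
# Simplified model of the ACL file above; not Mosquitto's real code.
# Access types mirror the config: "read", "write" or "readwrite".
ACL = {
    "user1": [("readwrite", "#")],
    "user2": [("read", "ro/#"), ("readwrite", "test/#")],
}


def topic_matches(pattern: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic."""
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, level in enumerate(p_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and everything below it.
            return True
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            # "+" matches exactly one level; anything else must match literally.
            return False
    return len(p_levels) == len(t_levels)


def allowed(user: str, action: str, topic: str) -> bool:
    """Check whether user may perform action ("read" or "write") on topic."""
    for access, pattern in ACL.get(user, []):
        grants = access == "readwrite" or access == action
        if grants and topic_matches(pattern, topic):
            return True
    return False


if __name__ == "__main__":
    for user, action, topic in [
        ("user2", "write", "test/message"),
        ("user2", "write", "test2/message"),
        ("user2", "read", "ro/temperature"),
        ("user2", "write", "ro/temperature"),
    ]:
        print(user, action, topic, allowed(user, action, topic))
```

Under this model user2 may write to test/message but not to test2/message or anything under ro/, which is the behaviour the ACL file is meant to produce.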

To test everything was working I installed mosquitto-clients on a Debian test box and in one window ran:

mosquitto_sub -h mqtt-host -p 8883 --capath /etc/ssl/certs/ -v -t '#' -u user1 -P foo

and in another:

mosquitto_pub -h mqtt-host -p 8883 --capath /etc/ssl/certs/ -t 'test/message' -m 'Hello World!' -u user2 -P bar

(without the --capath it’ll try a plain TCP connection rather than TLS, and not produce a useful error message) which resulted in the mosquitto_sub instance outputting:

test/message Hello World!


Publishing to a topic outside the test/ tree instead:

mosquitto_pub -h mqtt-host -p 8883 --capath /etc/ssl/certs/ -t 'test2/message' -m 'Hello World!' -u user2 -P bar

resulted in no output, because the ACL prevents user2 from writing there. All good and ready to actually make use of - of which more later.

10 May, 2018 07:53PM

Daniel Stender

AFL in Ubuntu 18.04 is broken

As has been reported on the discussion list for American Fuzzy Lop lately, the fuzzer is unfortunately broken in Ubuntu 18.04 “Bionic Beaver”. Ubuntu Bionic ships AFL 2.52b, which is the current version at the time of writing this blog post. The problem, however, comes from the accompanying gcc-7 package, which is pulled in by afl via the build-essential package. It was noticed through continuous integration in the development branch for the next Debian release (#895618) that the introduction of a triplet-prefixed as in gcc-7 7.3.0-16 (the same change was made for gcc-8, see #895251) affected the -B option in a way that afl-gcc (the gcc wrapper) can’t use the shipped assembler (/usr/lib/afl-as) anymore to install the instrumentation into the target binary (#896057, thanks to Jakub Wilk for spotting the problem). As a result, instrumented fuzzing and other things in afl don’t work:

$ afl-gcc --version
 afl-cc 2.52b by <>
 gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
$ afl-gcc -o test-instr test-instr.c 
 afl-cc 2.52b by <>
$ afl-fuzz -i in -o out -- ./test-instr
 afl-fuzz 2.52b by <>
 [+] You have 2 CPU cores and 1 runnable tasks (utilization: 50%).
 [+] Try parallel jobs - see /usr/share/doc/afl-doc/docs/parallel_fuzzing.txt.
 [*] Creating hard links for all input files...
 [*] Validating target binary...
 [-] Looks like the target binary is not instrumented! The fuzzer depends on
     compile-time instrumentation to isolate interesting test cases while
     mutating the input data. For more information, and for tips on how to
     instrument binaries, please see /usr/share/doc/afl-doc/docs/README.
     When source code is not available, you may be able to leverage QEMU
     mode support. Consult the README for tips on how to enable this.
     (It is also possible to use afl-fuzz as a traditional, "dumb" fuzzer.
     For that, you can use the -n option - but expect much worse results.)
 [-] PROGRAM ABORT : No instrumentation detected
          Location : check_binary(), afl-fuzz.c:6920

The same error message is also printed by e.g. afl-showmap. gcc-7 7.3.0-18 fixes this. Until that version becomes available, afl-clang, which uses the clang compiler, can be used instead to prepare the target binary properly:

$ afl-clang --version
 afl-cc 2.52b by <>
 clang version 4.0.1-10 (tags/RELEASE_401/final)
$ afl-clang -o test-instr test-instr.c 
 afl-cc 2.52b by <>
 afl-as 2.52b by <>
 [+] Instrumented 6 locations (64-bit, non-hardened mode, ratio 100%)
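
To check quickly whether a compiled target actually got instrumented without starting afl-fuzz, something like the following Python sketch could be used. It rests on an assumption: as far as I can tell from afl-fuzz.c, check_binary() in AFL 2.52b reports "No instrumentation detected" when the NUL-terminated string __AFL_SHM_ID is not found in the binary. The function name looks_instrumented is made up for this example.

```python
from pathlib import Path

# Assumption: this is the marker afl-fuzz 2.52b searches for in check_binary();
# it is the name of the environment variable used to pass the shared memory ID.
AFL_MARKER = b"__AFL_SHM_ID\x00"


def looks_instrumented(path) -> bool:
    """Return True if the file at path contains the AFL marker string."""
    return AFL_MARKER in Path(path).read_bytes()
```

Calling looks_instrumented("./test-instr") on a binary built with the broken afl-gcc would be expected to return False, and True for one built with afl-clang. This is only a heuristic and does not replace the check that afl-fuzz itself performs.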

10 May, 2018 04:21PM

Andreas Metzler

balance sheet snowboarding season 2017/18

For a change a winter with snow again, allowing an early start of the season (December 2). Due to an early Easter (lifts closing) the last run was on April 14. The amount of snow would have allowed boarding for at least another two weeks. (Today, on May 10, mountain biking is still curtailed; routes above 1650 meters of altitude are not yet rideable.)

OTOH the weather sucked: extended periods of stable sunny weather were rare, and totally absent in February and the first half of March. I only had 3 days on piste in February. I made many kilometres luging during that time. ;-)

Anyway here it is:

season    number of (partial) days   total meters of altitude   # of runs
2005/06                         25                     124634         309
2006/07                         17                      74096         189
2007/08                         29                     219936         503
2008/09                         37                     226774         551
2009/10                         30                     202089         462
2010/11                         30                     203918         449
2011/12                         25                     228588         516
2012/13                         23                     203562         468
2013/14                         30                     274706         597
2014/15                         24                     224909         530
2015/16                         17                     138037         354
2016/17                         30                     269819         634
2017/18                         29                     266158         616

10 May, 2018 11:33AM by Andreas Metzler