Revive an old Neufbox 6 with OpenWrt

While this article is probably only of interest to French people, you might want to get yourself an inexpensive Neufbox 6 on eBay or the like to play with, hence the English language.

When I lived in France, my last Internet provider was SFR; it was (and still is) a fiber provider, and you got connected thanks to a box called the “Neufbox”. There were a couple of versions of this box, which was pretty hackable, and the provider was cool with it: you could even flash it with “opened” versions of their firmware, a modified OpenWrt.
At that time, there was an amazing website called OpenBox4, now closed and only browsable thanks to the Web Archive (donate to them!), that gave all the information needed to take control of your Neufbox. As it is closed now, I’ll try to put the needed pieces here in order to revive this great router with a brand new OpenWrt 19.07.

First, talking about the hardware: in order to easily flash and / or rescue your Neufbox 6, you’ll need a USB / TTL UART converter so you can plug into and use the box’s serial console. I bought this CP2104 converter on Amazon at 7€ just because I’m the impatient type, but you could get yourself a cheaper one on eBay.

As shown in the following photos, you’ll have to plug the CP2104’s GND, TX and RX to the Neufbox 6 serial port in that order, from left to right, GND being the pin closest to the screw. The last, empty pin is VCC at 3.3V; leave it unconnected.

CP2104 USB / TTL converter

Wiring on the NB6

In order to check the serial console, first verify that your system sees the UART, i.e.:

$ dmesg|grep -i cp21
[912189.989801] usb 1-10.2: Product: CP2104 USB to UART Bridge Controller
[912189.992248] cp210x 1-10.2:1.0: cp210x converter detected
[912189.993780] usb 1-10.2: cp210x converter now attached to ttyUSB0

Once the USB serial converter has been connected to the NB6, launch your favorite serial terminal in 115200 8N1 mode, for example with GNU screen:

$ sudo screen /dev/ttyUSB0 115200

Power on the NB6 and you should see it booting!

HELO
CPUI
L1CI
HELO
CPUI
L1CI
DRAM
----
PHYS
ZQDN
PHYE
DINT
LASY
USYN
MSYN
LMBE
PASS
----
ZBSS
CODE
DATA
L12F
MAIN


CFE version 1.0.37-106.5 for BCM96362 (32bit,SP,BE)
Build Date: mar. sept. 7 17:10:14 CEST 2010 (arno@golgoth13)
Copyright (C) 2000-2009 Broadcom Corporation.

boot...

Good. Now you can shut it down, as we won’t be using the NB6’s original firmware anyway.

The next step is to prepare a tftp server somewhere on your network that the NB6 will be able to reach. My tftp server runs on a FreeBSD machine; I only added:

inetd_enable="YES"

to /etc/rc.conf, uncommented:

tftp    dgram   udp     wait    root    /usr/libexec/tftpd      tftpd -l -s /tftpboot

in /etc/inetd.conf, created a /tftpboot directory, started inetd:

$ sudo /etc/rc.d/inetd start

and placed NEUFBOX6-squashfs-cfe.bin into /tftpboot.

Now it’s time to boot the NB6 in a way that stops at the CFE bootloader. To achieve this, press a key when the following message appears:
*** Press any key to stop auto run (1 second) ***
Here you will probably want to change the parameters; you can do this by typing c (Change booline parameters):

CFE> c
Press: <enter> to use current value
'-' to go previous parameter
'.' to clear the current value
'x' to exit this command
Board IP address : 192.168.7.90:ffffff00
Host IP address : 192.168.7.254
Gateway IP address : 192.168.7.254
Run from flash/host (f/h) : f
Default host run file name : vmlinux
Default host flash file name : bcm963xx_fs_kernel
Boot delay (0-9 seconds) : 5
*** command status = 0

  • Board IP address is the NB6 itself
  • Host IP address is the tftp server

By default the boot delay is set to 1 second; you might want to raise it.

It is now time to fetch and write the desired OpenWrt image, this is done using the following syntax on the CFE:

CFE> f 192.168.7.254:openwrt-19.07.0-brcm63xx-generic-NEUFBOX6-squashfs-cfe.bin

Note that there’s a second, more automated method to trigger the download / flash. It consists of setting up a DHCP server on the tftp server and launching the NB6 in download mode. This is done by pressing the SFR front button while powering on the box, and keeping it pressed for about 8 seconds, until the LED turns red.
For this method to work, the DHCP server must respond using the default network and IP the CFE is configured with, i.e.:

Board IP address                     : 192.168.22.22:ffffff00
Host IP address : 192.168.22.68

The DHCP server configuration would look like this:

host 9box {
    hardware ethernet 00:25:15:fe:46:7f;
    fixed-address 192.168.22.22;
    next-server myserver;
    filename "/openwrt-19.07.0-brcm63xx-generic-NEUFBOX6-squashfs-cfe.bin";
}

Your freshly flashed Neufbox 6 should now boot into OpenWrt 19.07; simply press enter in the screen terminal:

BusyBox v1.30.1 () built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
OpenWrt 19.07.0, r10860-a3ffeb413b
-----------------------------------------------------
=== WARNING! =====================================
There is no root password defined on this device!
Use the "passwd" command to set up a new password
in order to prevent unauthorized SSH logins.
--------------------------------------------------
root@OpenWrt:/#

Note that the embedded b43-based WiFi adapter is not recognized. Nevertheless, you probably noticed there is a USB port available on the box; good news, this port is perfectly functional and a USB WiFi dongle can be plugged in. We only have to fetch the kernel module, and possibly the needed firmware, using OpenWrt’s package manager, opkg.

Let’s first set up the network in /etc/config/network. We will simply use DHCP here:

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0.1'
	option proto 'dhcp'

This interface corresponds to the blue RJ45 ports.

Let’s reload the network:

root@OpenWrt:/# service network reload

And you should now have a br-lan interface up and running:

7: br-lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 00:25:15:ee:56:a0 brd ff:ff:ff:ff:ff:ff
    inet 192.168.7.90/24 brd 192.168.7.255 scope global br-lan
       valid_lft forever preferred_lft forever
    inet6 fe80::225:15ff:feee:56a0/64 scope link
       valid_lft forever preferred_lft forever
8: eth0.1@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-lan state UP qlen 1000
    link/ether 00:25:15:ee:56:a0 brd ff:ff:ff:ff:ff:ff
root@OpenWrt:/# ping -c 1 1.1
PING 1.1 (1.0.0.1): 56 data bytes
64 bytes from 1.0.0.1: seq=0 ttl=56 time=47.902 ms

--- 1.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 47.902/47.902/47.902 ms

You can now fetch the package database using opkg update.

My USB WiFi dongle is an RTL8192EU-based TP-LINK. The associated module package is kmod-rtl8xxxu; let’s install it:

root@OpenWrt:/# opkg install kmod-rtl8xxxu

As we can see, this module needs a firmware that’s not present on the system:

[ 1493.163649] usb 1-2: rtl8xxxu: Loading firmware rtlwifi/rtl8192eu_nic.bin
[ 1493.170800] usb 1-2: Direct firmware load for rtlwifi/rtl8192eu_nic.bin failed with error -2
[ 1493.179469] usb 1-2: Falling back to user helper
[ 1493.386687] firmware rtlwifi!rtl8192eu_nic.bin: firmware_loading_store: map pages failed
[ 1493.398014] usb 1-2: request_firmware(rtlwifi/rtl8192eu_nic.bin) failed
[ 1493.404844] usb 1-2: Fatal - failed to load firmware

Download it:

root@OpenWrt:/# opkg install rtl8192eu-firmware
Installing rtl8192eu-firmware (20190416-1) to root...
Downloading http://downloads.openwrt.org/releases/19.07.0/packages/mips_mips32/base/rtl8192eu-firmware_20190416-1_mips_mips32.ipk
Configuring rtl8192eu-firmware.

And after unplugging / replugging the dongle:

root@OpenWrt:/# ifconfig wlan0
wlan0     Link encap:Ethernet  HWaddr 50:3E:AA:9A:F4:00
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

There we go, a shiny new wireless router running OpenWrt ready to serve!

Here is a list of resources that helped me write this article.

Monitor network health with somebar

I knew about a macOS menu bar app called AnyBar, which basically draws an icon in the menu bar whose state you can change with a simple nc command. Naturally, someone cloned it for our beloved free Unices, and it’s called somebar.

I am sometimes in places with a weak network, and I like to see at a glance how my connection is doing; somebar seemed like the perfect tool for the task.

Somebar waits for a message on a UDP port, 1738 by default, i.e.:

$ echo -n "green" | nc -4u -w0 localhost 1738
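
The same datagram can be sent from any language; for instance, here is a minimal Go equivalent of the nc one-liner (a sketch, assuming somebar listens on its default port):

package main

import (
	"log"
	"net"
)

func main() {
	// somebar listens on UDP port 1738 by default
	conn, err := net.Dial("udp", "localhost:1738")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// send a single datagram carrying the color name
	if _, err := conn.Write([]byte("green")); err != nil {
		log.Fatal(err)
	}
}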

So I came up with this little script:

#!/bin/sh

nccmd="nc -4u -w0 localhost 1738"
pingcmd="ping -c 1 -w 1 -q"

okcolor="green"

while :
do
	for l in $(grep -v ^# $1)
	do
		loss=${l%%@*}
		latc=${l##*@}

		h=${loss%%:*}
		c=${loss##*:}

		# packet loss
		pl=$($pingcmd $h|egrep -o '[0-9\.]+%')
		# average ping latency
		lt=$($pingcmd $h|sed -rn 's,.*/([0-9\.]+)\.[0-9]+/.*,\1,p')

		if [ $# -gt 1 ]; then # for debugging
			echo "$h packet loss $pl"
			echo "$h latency: $lt / $latc"
		fi

		[ -z "$pl" ] || [ -z "$lt" ] && continue

		if [ "$pl" != "0%" ] || [ $lt -gt $latc ]; then
			[ "$c" = "red" ] && break
			[ "$c" != "$okcolor" ] && break
		fi
		c=$okcolor
	done

	echo -n $c|$nccmd

	sleep 5
done

It will read the file given as argv[1], which has the following format:

senate:red@70
discobus:red@10
ddwrt:orange@10

The first field is obviously the host, then the color to report when this host is hard to reach, and finally the maximum average ICMP latency we tolerate. The script checks both packet loss and latency.

Start somebar and then this script; pass an optional extra parameter to see debugging output on the terminal.

Is LevelDB 2 times faster than BadgerDB?

I’m working on a plugin for Goxplorer that will create a database of all Bitcoin addresses present in its blockchain.

That’s an exercise I already did using LevelDB, which is Bitcoin’s choice for some of its own data, and as the task took quite a while, I decided to give a shot to BadgerDB, which is, I quote, “a fast key-value (KV) database written in pure Go”.

Well, I must be doing something very wrong, because I get the following results:

BadgerDB

$ time ./goxplorer -t -b blk01845.dat -a -x -bc mkaddrdb
./goxplorer -t -b blk01845.dat -a -x -bc mkaddrdb 48.59s user 3.98s system 81% cpu 1:04.72 total

LevelDB

$ time ./goxplorer -t -b blk01845.dat -a -x -bc mkaddrdb
./goxplorer -t -b blk01845.dat -a -x -bc mkaddrdb 35.91s user 4.28s system 115% cpu 34.763 total

That’s embarrassing.
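
Before blaming either library, it may be worth timing the insert functions alone (recAddrsToBdg and recAddrsToLvl, shown below), isolated from the surrounding block parsing. A minimal sketch; the timed helper is mine, not part of Goxplorer:

// timed runs fn and prints its wall-clock duration, a quick way to
// isolate database insert time from the rest of the run
// (uses the fmt and time packages)
func timed(name string, fn func()) {
	start := time.Now()
	fn()
	fmt.Printf("%s: %v\n", name, time.Since(start))
}

// usage, e.g. from the block parsing loop:
// timed("badger", func() { recAddrsToBdg(h, addrs) })
// timed("leveldb", func() { recAddrsToLvl(h, addrs) })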

Maybe you’ll spot something terribly wrong in my code:

BadgerDB

func recAddrsToBdg(h []byte, addrs []string) {
	opts := badger.DefaultOptions("badgeraddr")
	opts.Logger = nil
	db, err := badger.Open(opts)
	fatalErr(err)
	defer db.Close()

	var blocks []byte

	err = db.Update(func(txn *badger.Txn) error {
		for _, a := range addrs {
			if len(a) == 0 {
				continue
			}
			item, err := txn.Get([]byte(a))
			// address not found, record it
			if err == badger.ErrKeyNotFound {
				err = txn.Set([]byte(a), h)
				fatalErr(err)
				continue
			}
			fatalErr(err)

			err = item.Value(func(val []byte) error {
				blocks = append([]byte{}, val...)
				return nil
			})
			fatalErr(err)

			// if block hash is not yet recorded, record it
			if !bytes.Contains(blocks, []byte(h)) {
				blocks = append(blocks, h...)
				err = txn.Set([]byte(a), blocks)
			}
		}
		return nil
	})
	fatalErr(err)
}

LevelDB

func recAddrsToLvl(h []byte, addrs []string) {
	db, err := leveldb.OpenFile("./addresses", nil)
	fatalErr(err)
	defer db.Close()

	var blocks []byte

	for _, a := range addrs {
		if len(a) == 0 {
			continue
		}
		blocks, err = db.Get([]byte(a), nil)
		// address not found, record it
		if err == leveldb.ErrNotFound {
			err = db.Put([]byte(a), h, nil)
			fatalErr(err)
			continue
		}
		fatalErr(err)

		// if block hash is not yet recorded, record it
		if !bytes.Contains(blocks, []byte(h)) {
			blocks = append(blocks, h...)
			err = db.Put([]byte(a), blocks, nil)
		}
	}
}

And yes, the number of keys is strictly the same:

BadgerDB

$ ../badger-cli/badger-cli list -d badgeraddr|tail -1
Matched keys: 482582

LevelDB

$ ../go-leveldbctl/leveldbctl --dbdir=addresses k|wc -l
482582

Or is LevelDB just 2 times faster than BadgerDB? ;)

FreeBSD networking issues: TCP offloading and checksum

In the past month, it’s the second time I’ve been bitten by FreeBSD in the networking field.

The first time was with my own gateway: I had this weird behaviour where machines on a different VLAN than the main one would use the Internet at full speed but would struggle to make any transfer from the main VLAN.

Turns out this was a TCP segmentation offload issue, which seems to cause so many problems that it is disabled by default in some appliances.

Simply add net.inet.tcp.tso=0 to /etc/sysctl.conf

or add -tso and -lro in rc.conf‘s ifconfig_<interface>:

ifconfig_em0="DHCP -tso -lro"


Yesterday I had a different, yet somewhat similar, issue on another network where the gateway is also FreeBSD, but this time a virtual (kvm) machine, with LAN and WAN interfaces of the virtio type, bridged and passthrough respectively. The gateway would let ICMP pass, but neither TCP nor UDP. This time, I had to disable more than TSO: rxcsum and txcsum too:

ifconfig_vtnet2="DHCP -lro -tso -rxcsum -txcsum"
ifconfig_vtnet1="inet 192.168.1.254 netmask 255.255.255.0 -lro -tso -rxcsum -txcsum"

From what I have read, it seems I could also have disabled those at the kvm level.

Gitlab CI caching for Go projects

The reference documentation for coupling golang and continuous integration in Gitlab is this one; it’s well put, easy to read and pretty accurate, except for the caching part, at least nowadays with go modules. This is what happens when a commit is pushed with the .gitlab-ci.yml given as an example in that document:

131 Creating cache default...
132 WARNING: /apt-cache: no matching files
133 WARNING: /go/src/github.com: no matching files
134 WARNING: /go/src/gitlab.com: no matching files
135 WARNING: /go/src/golang.org: no matching files
136 WARNING: /go/src/google.golang.org: no matching files
137 WARNING: /go/src/gopkg.in: no matching files

And as a matter of fact, the cache is empty for the next stage.

Maybe I’m missing something, but the article is clear on this point:

Then, you specify some folders of this image to be cached. The goal here is to avoid downloading the same content several times. Once a job is completed, the listed paths will be archived, and next job will use the same archive.

Well it does not.

So there’s this, and there’s another reason why I dug into the topic: I like to test my commits locally before pushing, and it is possible to use gitlab-runner exec for that, which is great. What’s not so great is that gitlab-runner exec doesn’t support the artifacts feature, and thus I can’t bring a build result to a testing stage, which I need in my current project in order to test the program itself.
Meanwhile, the best practices (https://docs.gitlab.com/ee/ci/caching/) say:

cache: Use for temporary storage for project dependencies. Not useful for keeping intermediate build results, like jar or apk files. Cache was designed to be used to speed up invocations of subsequent runs of a given job, by keeping things like dependencies (e.g., npm packages, Go vendor packages, etc.) so they don’t have to be re-fetched from the public internet. While the cache can be abused to pass intermediate build results between stages, there may be cases where artifacts are a better fit.

artifacts: Use for stage results that will be passed between stages. Artifacts were designed to upload some compiled/generated bits of the build, and they can be fetched by any number of concurrent Runners.[…]

We can’t stick to these when using the exec command because of its limitations. But we can use cache!

So, thanks to this blog post, which points out a smart way of declaring GOPATH, and that article, which explains how to actually preserve caching between stages, I came up with this .gitlab-ci.yml, which works and caches both locally with gitlab-runner exec and on Gitlab jobs:

image: golang:1.12

stages:
  - build
  - test

before_script:
  - export GOPATH=${CI_PROJECT_DIR}/.cache
  - apt-get update -qq && apt-get -y -qq install jq xxd

build:
  stage: build
  cache: &build_cache
    key: build
    paths:
      - .cache
      - goxplorer
  script:
    - mkdir -p .cache
    - make

unit_tests:
  stage: test
  cache: *build_cache
  script:
    - make test
    - /bin/sh runtest.sh
  dependencies:
    - build

code_coverage:
  stage: test
  script:
    - make coverage

Invoke the build phase like this:

$ gitlab-runner exec docker --docker-volumes $(pwd)/cache:/cache build

And finally the test phase:

$ gitlab-runner exec docker --docker-volumes $(pwd)/cache:/cache unit_tests

Happy testing ;)

Understanding golang channel range... again

In a previous article, I tried to explain my understanding of Go channels’ interaction with ranges. Turns out my explanation was probably not clear enough, because here I am, nearly a year later, struggling to achieve pretty much the same exercise.

So here we go again, in good old trial-and-error fashion.

The goal here is to retrieve channel messages that are pushed from goroutines created in a for loop.

The most naive attempt would be this piece of (non-working) code:

package main

import "fmt"

func main() {
	c := make(chan string)
	for _, t := range []string{"a", "b", "c"} {
		go func(s string) {
			c <- s
		}(t)
	}

	for s := range c {
		fmt.Println(s)
	}
}

And as a matter of fact, it fails miserably:

$ go run main.go
c
a
b
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan receive]:
main.main()
/home/imil/src/go/test/channels/main.go:13 +0x175
exit status 2

But why is that? Well, because the channel is never closed, the range never ends; the Go runtime detects that all goroutines are blocked and panics, as this program can never terminate.

OK then, let’s close that channel right after the for loop!

package main

import "fmt"

func main() {
	c := make(chan string)
	for _, t := range []string{"a", "b", "c"} {
		go func(s string) {
			c <- s
		}(t)
	}
	close(c)

	for s := range c {
		fmt.Println(s)
	}
}

And try it:

$ go run main.go
$

Nothing. There’s also a very good reason for that: we close the channel in the main function, where nothing blocks, so none of the goroutines started in the for loop had a chance to finish their job before we hit the range loop, which ends immediately because the channel is now closed and there’s nothing to read from it. Note that you might see some output, but probably no more than one message, meaning one goroutine managed to run before we reached the range loop.

OK then. So that’s it: we need to Wait, and that’s a job sync.WaitGroup knows how to handle. This should be easy:

package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	c := make(chan string)
	for _, t := range []string{"a", "b", "c"} {
		wg.Add(1)
		go func(s string) {
			c <- s
			wg.Done()
		}(t)
	}
	wg.Wait()
	close(c)

	for s := range c {
		fmt.Println(s)
	}
}

But then again:

fatal error: all goroutines are asleep - deadlock!

goroutine 1 [semacquire]:
sync.runtime_Semacquire(0xc0000141b8)
/home/imil/pkg/go/src/runtime/sema.go:56 +0x39
sync.(*WaitGroup).Wait(0xc0000141b0)
/home/imil/pkg/go/src/sync/waitgroup.go:130 +0x65
main.main()
/home/imil/src/go/test/channels/main.go:18 +0x143

goroutine 5 [chan send]:
main.main.func1(0xc0000200c0, 0xc0000141b0, 0x4b8cfb, 0x1)
/home/imil/src/go/test/channels/main.go:14 +0x49
created by main.main
/home/imil/src/go/test/channels/main.go:13 +0x123

goroutine 6 [chan send]:
main.main.func1(0xc0000200c0, 0xc0000141b0, 0x4b8cfc, 0x1)
/home/imil/src/go/test/channels/main.go:14 +0x49
created by main.main
/home/imil/src/go/test/channels/main.go:13 +0x123

goroutine 7 [chan send]:
main.main.func1(0xc0000200c0, 0xc0000141b0, 0x4b8cfd, 0x1)
/home/imil/src/go/test/channels/main.go:14 +0x49
created by main.main
/home/imil/src/go/test/channels/main.go:13 +0x123
exit status 2

What’s wrong this time?! Well, blocking happened again. We wg.Wait() for our goroutines to end, but in turn, they are waiting for someone to read and consume c! So basically, we’ll never get past wg.Wait(); Go knows it, and panics.

Let’s add some debugging output to witness this:

package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	c := make(chan string)
	for _, t := range []string{"a", "b", "c"} {
		wg.Add(1)
		go func(s string) {
			fmt.Printf("go %s, before chan\n", s)
			c <- s
			fmt.Printf("go %s, after chan\n", s)
			wg.Done()
		}(t)
	}

	wg.Wait()
	close(c)

	for s := range c {
		fmt.Println(s)
	}
}
$ go run main.go
go c, before chan
go a, before chan
go b, before chan
fatal error: all goroutines are asleep - deadlock!
[..,]
exit status 2

As you can see, we never get to see the second fmt.Printf().

What now? We need to Wait in a non-blocking manner. And this can be done using… another goroutine!

package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	c := make(chan string)
	for _, t := range []string{"a", "b", "c"} {
		wg.Add(1)
		go func(s string) {
			fmt.Printf("go %s, before chan\n", s)
			c <- s
			fmt.Printf("go %s, after chan\n", s)
			wg.Done()
		}(t)
	}

	go func() {
		wg.Wait()
		close(c)
	}()

	for s := range c {
		fmt.Println(s)
	}
}

Fingers crossed:

$ go run main.go
go a, before chan
go a, after chan
a
go b, before chan
go b, after chan
go c, before chan
b
c
go c, after chan

Yay! Much better. In order to witness the behavior of this method more clearly, let us add a timer to the waiting function:

package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup
	c := make(chan string)
	for _, t := range []string{"a", "b", "c"} {
		wg.Add(1)
		go func(s string) {
			fmt.Printf("go %s, before chan\n", s)
			c <- s
			fmt.Printf("go %s, after chan\n", s)
			wg.Done()
		}(t)
	}

	go func() {
		c <- "oooweeeee I'm still heeeere"
		time.Sleep(time.Second * 2) // wait 2 seconds
		fmt.Println("ran now")
		wg.Wait()
		close(c)
	}()

	for s := range c {
		fmt.Println(s)
	}
}
$ go run main.go
go a, before chan
go b, before chan
go c, before chan
oooweeeee I'm still heeeere
a
b
c
go c, after chan
go a, after chan
go b, after chan
ran now

You should see a 2 second wait before ran now is displayed and the channel is closed. Note that in this scenario, wg.Wait() is useless, as the range loop will have consumed all of c way before the 2 seconds have elapsed.
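
As an aside, a buffered channel would also avoid the deadlock without the closing goroutine, since all sends then complete without waiting for a reader; a minimal sketch:

package main

import (
	"fmt"
	"sync"
)

func main() {
	items := []string{"a", "b", "c"}
	var wg sync.WaitGroup
	// buffered channel: sends complete without blocking,
	// so wg.Wait() can return before anything is read
	c := make(chan string, len(items))
	for _, t := range items {
		wg.Add(1)
		go func(s string) {
			defer wg.Done()
			c <- s
		}(t)
	}
	wg.Wait()
	close(c)

	for s := range c {
		fmt.Println(s)
	}
}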

So that’s it! I hope I made this clearer to my own mind… and maybe yours ;)

proof-of-work based blockchain explained with golang

Yet another “blockchain explained” article, I know. I really wondered whether to release it or not, but you know, you only understand what you can explain clearly, so I hope I’ll be able to explain proof of work and blockchain as clearly as they are pictured in my mind.
The originality of this post is that I’ll try to make those concepts clear through extensively explained pieces of code, so it doesn’t feel like a theoretical exposé where you get the idea without the taste.

First things first: as you have probably read a million times, a blockchain is, well, a chain of blocks. Yeah, thank you iMil, that was helpful. From a coding point of view, it looks like an inverted linked list. Remember?

----------------    ----------------    ----------------
| data: first  |    | data: foo    |    | data: bar    |
| addr: 0x1000 |<-. | addr: 0x1001 |<-. | addr: 0x1002 | ...
| prev: 0x0    |   \| prev: 0x1000 |   \| prev: 0x1001 |
----------------    ----------------    ----------------

Those are blocks, and block n+1 holds a reference to its predecessor thanks to the prev element, which points to the addr of the previous block. I present you: a blockchain :)
There are plenty of very well put articles and videos explaining how this helps make an unmodifiable list; this one is probably the best I’ve watched.

Actually, real-world blockchains use hashes as their parent reference, and this is where the fun begins. What is the actual hash used as the reference from one block to its parent? Well, it’s the solution to a puzzle. Or more precisely, the result of a proof of work.

Consider the following structure:

type Block struct {
	Index     int
	Timestamp int64
	Hash      []byte
	Data      string
	PrevHash  []byte
}

Let’s produce some data with it, for example by concatenating two structure members: Data + PrevHash gives a run of bytes which in turn we can use to make a sha256 hash.
This would give us this type of function:

// data is of type Block.Data, and prev is Block.PrevHash
func genhash(data string, prev []byte) []byte {
	// merge data and prev as bytes using bytes.Join
	head := bytes.Join([][]byte{prev, []byte(data)}, []byte{})

	// create a sha256 hash from this merge
	h32 := sha256.Sum256(head)

	fmt.Printf("Header hash: %x\n", h32)

	// sha256.Sum256() returns a [32]byte value, we will use it as a []byte
	// value in the next part of this article, thus the [:] trick
	return h32[:]
}

From this value we will now try to solve a puzzle. There could be many ideas for such puzzles, but the one used in Bitcoin and many more cryptocurrencies is to find a number (called a nonce) which, added to the hash we got from the block’s header, produces a new hash inferior to a determined target.

This target, again in this scenario, is the number obtained by left shifting 1 by 256 - difficulty bits (256 being the hash size in bits). I.e. if difficulty = 5, the target is, in binary form, a 1 followed by 251 zeroes, and any solution, being inferior to it, starts with at least 5 zero bits.
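
To make the target concrete, here is a standalone snippet printing it for difficulty 5, using the same math/big calls as the mine() function shown below:

package main

import (
	"fmt"
	"math/big"
)

func main() {
	const difficulty = 5
	// target = 1 << (256 - difficulty)
	target := new(big.Int).Lsh(big.NewInt(1), 256-difficulty)
	// printed as a 64-digit (256-bit) hex number: 08 followed by 62 zeroes
	fmt.Printf("%064x\n", target)
}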

The process of solving this puzzle is called mining, and once the correct hash has been found, anyone can easily verify it by redoing the cheap computation of adding the header hash and the nonce, thus proving there was work behind finding it (a verification sketch follows the mine() function below). Once done, this process validates the block, which can then be added to the blockchain.

Here’s what this code looks like:

// the received bytes represent the previously generated header hash
func mine(hash []byte) []byte {
	// we use math/big in order to manipulate big numbers
	target := big.NewInt(1)
	// this is the left shift creating the puzzle target
	target = target.Lsh(target, uint(256-difficulty))

	fmt.Printf("target: %x\n", target)

	// this is the value that will be incremented and added to the header hash
	var nonce int64

	// now loop until max int64 size is reached, this is 100% arbitrary
	for nonce = 0; nonce < math.MaxInt64; nonce++ {
		// create a new test number
		testNum := big.NewInt(0)
		// sum header hash with the nonce
		testNum.Add(testNum.SetBytes(hash), big.NewInt(nonce))
		// and create a hash from it
		testHash := sha256.Sum256(testNum.Bytes())

		fmt.Printf("\rproof: %x (nonce: %d)", testHash, nonce)

		// is the target number (0x0800...) bigger than our created hash?
		if target.Cmp(testNum.SetBytes(testHash[:])) > 0 {
			// if yes, we solved the puzzle
			fmt.Println("\nFound!")
			// again, sha256 returns a [32]byte, return type is []byte{}
			return testHash[:]
		}
	}

	return []byte{}
}
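
As said above, verification is cheap compared to mining. Here is a minimal verification sketch, assuming the winning nonce is stored alongside the block (the code in this article only keeps the resulting hash, so the nonce parameter is an assumption of mine):

// verify recomputes the proof from the header hash and the published
// nonce, and checks it matches and satisfies the same target as mine()
func verify(hash []byte, nonce int64, proof []byte) bool {
	target := big.NewInt(1)
	target = target.Lsh(target, uint(256-difficulty))

	testNum := big.NewInt(0)
	testNum.Add(testNum.SetBytes(hash), big.NewInt(nonce))
	testHash := sha256.Sum256(testNum.Bytes())

	// the recomputed hash must equal the proof and be below the target
	return bytes.Equal(testHash[:], proof) &&
		target.Cmp(testNum.SetBytes(testHash[:])) > 0
}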

Now all we need to finish this exercise is to actually create blocks and pile them up; we will use a simple string slice as data:

func main() {
	// here is the string slice
	bdatas := []string{"Genesis", "2d block", "3rd block", "4th block"}

	// we do not have a previous hash
	prev := []byte{}

	for i, d := range bdatas {
		// create the new block
		b := NewBlock(i, d, prev)
		fmt.Printf("Id: %d\nHash: %x\nData: %s\nPrevious: %x\n",
			b.Index,
			b.Hash,
			b.Data,
			b.PrevHash,
		)
		// and record current found hash for future block
		prev = b.Hash
	}
}

The NewBlock function is pretty straightforward; it returns a complete block whose hash has been mined from its header hash:

func NewBlock(id int, data string, prev []byte) *Block {
	return &Block{
		// block Index
		id,
		// current Unix time
		time.Now().Unix(),
		// first compute a hash with block's header, then mine it
		mine(genhash(data, prev)),
		// actual data
		data,
		// reference to previous block
		prev,
	}
}

Fully working code for this example is available here; try increasing the difficulty and watch the time needed to solve the puzzle grow.

This exercise is really the tip of the blockchain iceberg, yet I find it is the building block (pun intended) of a proof-of-work based one. I hope I managed to demystify these concepts as clearly as I picture them today; if you feel something written here is wrong, please leave a comment and I’ll try to fix it the best way I can.

Many thanks to Jeiwan for his fantastic blockchain implementation in golang, which has been a great source of inspiration for this article.

Replacing a (silently) failing disk in a ZFS pool

Maybe I can’t read, but I have the feeling that official documentation explains every single corner case for a given tool, except the one you will actually need. Today’s struggle: replacing a disk within a FreeBSD ZFS pool.

What? there’s a shitton of docs on this topic! Are you stupid?

I don’t know, maybe. Yet none covered the process in a simple, straightforward and complete manner. Here’s the story:

Since yesterday, my personal FreeBSD NAS had felt sluggish, and this morning I saw these horrible messages popping up in my syslog console:

Jul  2 12:49:53 <kern.crit> newcoruscant kernel: ahcich1: Timeout on slot 8 port 0
Jul 2 12:49:53 <kern.crit> newcoruscant kernel: ahcich1: is 00000000 cs 00000000 ss 00000300 rs 00000300 tfd 40 serr 00000000 cmd 0000c917
Jul 2 12:49:53 <kern.crit> newcoruscant kernel: (ada1:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 50 25 e9 40 3b 00 00 00 00 00
Jul 2 12:49:53 <kern.crit> newcoruscant kernel: (ada1:ahcich1:0:0:0): CAM status: Command timeout
Jul 2 12:49:53 <kern.crit> newcoruscant kernel: (ada1:ahcich1:0:0:0): Retrying command
Jul 2 12:51:02 <kern.crit> newcoruscant kernel: cant/memory/memory-inactive: ds[0] = 52350976.000000
Jul 2 12:51:02 <kern.crit> newcoruscant kernel: ahcich1: AHCI reset: device not ready after 31000ms (tfd = 00000080)

Yeah… that bad.

The first thing that struck me is that ZFS seemed perfectly fine with it:

root@newcoruscant:~ # zpool status
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 2h26m with 0 errors on Tue Jun 25 12:08:56 2019
config:

	NAME        STATE     READ WRITE CKSUM
	zroot       ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada0p4  ONLINE       0     0     0
	    ada1p4  ONLINE       0     0     0
	    ada2p4  ONLINE       0     0     0

errors: No known data errors

But the input/output error thrown by smartctl -a /dev/ada1 made things clear: I needed to replace this disk, quickly!
Thanks to past-me, there already was a disk ready for this task at ada3, so, after trustfully reading the zpool administration guide, and in particular Replacing a Functioning Device, I entered:

# zpool replace zroot ada1p4 ada3p4

Except it didn’t run as expected:

cannot open 'ada3p4': no such GEOM provider
must be a full path or shorthand device name

What a fantastic and explicit error message just to say that ada3 doesn’t have a corresponding partition table.
I am no FreeBSD guru and only a very occasional user, so no, I am not used to GEOM, gpart, GELI, etc. Finally, this very well written stackexchange post showed me how to replicate the correct partition table to the new disk:

# gpart backup ada0|gpart restore -F ada3

Now zpool replace zroot ada1p4 ada3p4 would work! I also did not forget to replicate the boot code to the new disk, as instructed by both the documentation and zpool:

# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada3 
partcode written to ada3p1
bootcode written to ada3

And at last the resilvering was taking place:

root@newcoruscant:~ # zpool status
  pool: zroot
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jul  2 11:21:24 2019
	3.91M scanned out of 1.84T at 38.5K/s, (scan is slow, no estimated time)
	1.30M resilvered, 0.00% done
config:

	NAME             STATE     READ WRITE CKSUM
	zroot            ONLINE       0     0     0
	  raidz1-0       ONLINE       0     0     0
	    ada0p4       ONLINE       0     0     0
	    replacing-1  ONLINE       0     0     0
	      ada1p4     ONLINE       0     0     0
	      ada3p4     ONLINE       0     0     0
	    ada2p4       ONLINE       0     0     0

errors: No known data errors

But… at less than 40K/s! Turns out that, very logically, the failing disk and its timeouts were slowing down the resilvering, so I learned that to avoid this kind of situation, you should offline the failing disk from the zpool:

# zpool offline zroot ada1p4

And then:

$ sudo zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jul  2 16:01:22 2019
	514G scanned out of 1.84T at 167M/s, 2h20m to go
	170G resilvered, 27.22% done
config:

	NAME                        STATE     READ WRITE CKSUM
	zroot                       DEGRADED     0     0     0
	  raidz1-0                  DEGRADED     0     0     0
	    ada0p4                  ONLINE       0     0     0
	    replacing-1             DEGRADED     0     0     8
	      15084350875675872541  OFFLINE      0     0     0  was /dev/ada1p4
	      ada3p4                ONLINE       0     0     0
	    ada2p4                  ONLINE       0     0     0

errors: No known data errors

Much better. At the end of the resilvering, everything is now working correctly:

$ sudo zpool status
  pool: zroot
 state: ONLINE
  scan: resilvered 628G in 2h52m with 0 errors on Tue Jul  2 18:53:48 2019
config:

	NAME        STATE     READ WRITE CKSUM
	zroot       ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada0p4  ONLINE       0     0     0
	    ada3p4  ONLINE       0     0     0
	    ada2p4  ONLINE       0     0     0

errors: No known data errors

I read that you should zpool remove the failing disk at the end of this operation, but when trying to do so:

root@newcoruscant:~ # zpool remove zroot ada1p4
cannot remove ada1p4: no such device in pool
root@newcoruscant:~ # zpool remove zroot 15084350875675872541
cannot remove 15084350875675872541: no such device in pool

So I guess zpool did it itself.
Now it’s time to buy and add a new spare for the next disk that fails…

golang reflection tips

Because I’m the kind of person who likes genericity, I often find myself using features of languages that are flagged as “only use it if you know what you’re doing”. Golang reflection is one of those features, powerful yet a bit confusing.

Reflection, as explained in The Laws of Reflection, is the ability of a program to examine its own structure, particularly through types; it’s a form of metaprogramming.

In short, you can introspect variables at run time, making your program exceptionally dynamic. How can this serve any purpose? Well, imagine for example calling functions whose names are derived dynamically from another function’s parameter. Pretty cool, uh?

But first things first, let’s draw the scene: the reflect package exposes two principal functions that are used to dive into our variables, TypeOf and ValueOf. Those functions return, respectively, a reflect.Type and a reflect.Value.

Let’s start an example from there, consider a simple int variable:

i := 3

Here’s a reflect representation of i‘s type:

r := reflect.TypeOf(i)

Without any surprise, if we print it, we get int:

fmt.Printf("%v\n", r)


Using a similar syntax, here we get the value of i:

r := reflect.ValueOf(i)

Which prints i’s value.

Actually, you can retrieve i‘s type using the Value.Type() method:

fmt.Printf("%v\n", r.Type())

In the previous ValueOf example we only wanted to print i’s value, but it turns out we can also modify it; yet we need to make a little change to the call.
When we do ValueOf(i), we actually copy i to the Value part of the reflection, so naturally, changing it would not affect the initial value of i. In fact, this would be so useless that it’s not even permitted.

The error says that the value is unaddressable, and this can be checked with the CanAddr method.
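
A quick illustration (a minimal sketch; the commented-out line would panic if enabled):

i := 3
v := reflect.ValueOf(i)
fmt.Println(v.CanAddr()) // false: v holds a copy of i
// v.SetInt(42) // panics: reflect.Value.SetInt using unaddressable value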

What we need here looks like what we would do in a C program where a function needs to modify a parameter: passing the address of the parameter, and instead of dereferencing it with a *, we’ll use the Elem() method:

fmt.Println(reflect.ValueOf(&i).Elem().CanAddr())


Now we can modify i’s content using the SetInt() method:
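
For example, a minimal snippet:

v := reflect.ValueOf(&i).Elem()
v.SetInt(42)
fmt.Println(i) // 42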

Another nice trick with reflection is the ability to inspect a struct’s content, either by index, by field name, or even by tag if the struct happens to have some.

Let’s first try a simple enumeration of a struct’s fields:

var s struct {
	name string
	id   int64
}

r := reflect.TypeOf(s)

for i := 0; i < r.NumField(); i++ {
	fmt.Printf("%v\n", r.Field(i))
}

As you might have guessed, r.NumField() returns the number of fields in the struct, and r.Field(i) accesses the ith field. This loop will print a StructField, which contains a lot of information about the field; if you only want the field’s name, simply request r.Field(i).Name.

Here’s another example using reflect ability to inspect struct tags:

var s struct {
	name string `daname`
	id   int64  `daid`
}

r := reflect.TypeOf(s)

for i := 0; i < r.NumField(); i++ {
	if r.Field(i).Tag == "daname" {
		fmt.Printf("found tag for %v\n", r.Field(i).Name)
	}
}


One last trick, and possibly the coolest, at least IMHO: reflection permits building a function call 100% dynamically, only by reflecting variables. Why would one do that? Well, imagine a program calling functions based on os.Args, without the need for a gigantic switch or many ifs.

Method lookup follows the same scheme as the Field method, meaning that you can find and build a call either by index (Method()) or by name (MethodByName()). There’s one catch though: don’t forget reflect operates on variables, and we can’t search for a function out of thin air; it has to be bound to some kind of variable. Fortunately, Go gives the possibility to attach methods to types. Here’s an example:

package main

import (
	"fmt"
	"reflect"
)

type RetVal int

func (r *RetVal) Foo() {
	fmt.Println("hello from Foo!")
	*r = 3
}

func main() {
	var r RetVal
	funcname := "Foo"

	va := []reflect.Value{}

	f := reflect.ValueOf(&r).MethodByName(funcname)
	if f.IsValid() {
		f.Call(va)
	}
	fmt.Printf("r is %d", r)
}


In order to “build” the function call, we’ve created a custom type, RetVal, to which we attach a Foo() method.
As you can see, we declare a string variable holding "Foo". It is mandatory to declare the called function’s parameter list, and as we could expect, those parameters are reflect.Value types.
To be able to modify the value of r, we pass it by reference, but we could perfectly have passed it by value if we were not assigning 3 to it in the function.
Before calling a dynamic function, it is always a good idea to verify that it is actually valid using the IsValid method.
Finally, the created Value (f) can be Call()ed with optional parameters.

I found this method useful when writing a Mattermost bot whose features can be called through a parameter passed in the chat.

Cleaner micro Kubernetes on OSX

While my main workstation is a Linux Mint machine, I occasionally use my OSX ${WORK} laptop for traveling and composing. I’m not really fond of the OS, but at least it’s a UNIX-like, and pkgin runs well on it ;)
When I’m “on the go”, I like to try things and play along with technologies I’m currently obsessed with, among them Kubernetes.
On OSX, the natural choice is to go with minikube; it’s kind of integrated and does the job well, but if you have tried it already and also happen to run Docker for OSX, you might have found yourself struggling with versions and consistency between the two. Add to this that I wanted a fully functional Linux virtual machine, preferably Debian GNU/Linux, and there were way too many inconsistencies and too much wasted disk space and CPU to come. So I dug by myself and found a clean and fast solution: spawning my own virtual machine using OSX’s native hypervisor, running Canonical’s microk8s, a nicely done snap package that installs a fully working and modular Kubernetes cluster on a Linux machine.

First thing is obviously to create the virtual machine.

If you search for how to install Debian on OSX’s native hypervisor, you’ll probably find very well put blogs and tutorials about xhyve, a fork of FreeBSD’s bhyve for OSX, explaining how to achieve this.

Unfortunately, while this worked for Debian up to version 8, when trying to install version 9 this way, the Debian installer kernel panics with a missing init message.
In order to fix this, you will need to use HyperKit, an xhyve fork maintained by Moby, the container infrastructure project. Here’s what the modified xhyve script looks like:

# Installer kernel and initrd are here:
# http://ftp.debian.org/debian/dists/stretch/main/installer-amd64/current/images/netboot/debian-installer/amd64/
KERNEL="linux"
INITRD="initrd.gz"
CMDLINE="earlyprintk=serial console=ttyS0"
ISO="debian-9.7.0-amd64-netinst.iso"

MEM="-m 1G"
#SMP="-c 2"
NET="-s 2:0,virtio-net"
IMG_CD="-s 3,ahci-cd,$ISO"
IMG_HDD="-s 4,virtio-blk,debian.img" # raw disk image
PCI_DEV="-s 0:0,hostbridge -s 31,lpc"
LPC_DEV="-l com1,stdio"
ACPI="-A"
UUID="-U 549293DC-EA12-441C-B285-C64BA7A48BF4"

sudo hyperkit $ACPI $MEM $SMP $PCI_DEV $LPC_DEV $NET $IMG_CD $IMG_HDD $IMG_CD $UUID -f kexec,$KERNEL,$INITRD,"$CMDLINE"

As you can see, only the xhyve command was changed to hyperkit, which is 100% backward compatible. Apart from that detail, you can follow the previous procedures to proceed with installation and startup, notably vmlinuz and initrd extraction before reboot.

Once the Debian installation is complete and your usual tools are installed, you may proceed with the microk8s installation, as described on the official website. Do not forget, like I did, to add /snap/bin to the PATH; otherwise the microk8s.* commands won’t be found.

$ microk8s.kubectl get services
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.152.183.1   <none>        443/TCP   167m

If you’d like to manage the cluster directly from OSX, export its configuration using:

$ microk8s.kubectl config view --raw

Copy the exported content to OSX‘s ~/.kube/config, and then:

$ kubectl get all
NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.152.183.1   <none>        443/TCP   5h

And voilà! Here’s a brand new micro Kubernetes cluster installed on your laptop, without the hassle of handling multiple specific virtual machines tailored for every use case.