Tuesday, November 23, 2021

Adding a New ZFS Encrypted Disk on Ubuntu

I recently got a second-hand SSD for my work computer and I wanted to make use of it by moving my home dataset to the new SSD while keeping it encrypted and using Ubuntu's boot time unlock feature. After a day of fiddling with it I got it to work but the solution wasn't particularly obvious so I thought I'd share it!

Hardware: x86_64
OS: Ubuntu 21.10
Kernel: 5.13.0-21-generic


Recent Ubuntu releases have support for installing ZFS on the root filesystem using built-in ZFS encryption, but it works in a surprising way. A volume is created to hold the key file and is available in /dev/zvol. That volume is formatted with LUKS. You can see the format like this:

$ sudo cryptsetup luksDump /dev/zvol/rpool/keystore 
LUKS header information
Version:       	2
Epoch:         	3
Metadata area: 	16384 [bytes]
Keyslots area: 	16744448 [bytes]

During boot the typical process of looking for LUKS volumes takes place and this volume is unlocked and mounted on /run/keystore/rpool. Inside the mount you'll find the key used to unlock the ZFS datasets:

$ ls -l  /run/keystore/rpool
total 20
drwx------ 2 root root 16384 Nov 22 18:38 lost+found
-rw------- 1 root root    32 Nov 22 18:38 system.key

Looking at rpool you can see that it's using the key in that location:

$ zfs get keylocation rpool 
NAME   PROPERTY     VALUE                                  SOURCE
rpool  keylocation  file:///run/keystore/rpool/system.key  local
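
Since this key file gets reused for the new pool later on, it can be worth confirming that it has the size ZFS expects for keyformat=raw, which is 32 bytes (matching the listing above). Here is a minimal sketch. The function name check_raw_key is made up for illustration, and reading the real key file requires root:

```shell
# check_raw_key is a hypothetical helper, not part of ZFS or Ubuntu.
# It succeeds only if the file exists and is exactly 32 bytes long,
# the size of a raw ZFS encryption key.
check_raw_key() {
    key_file=$1
    if [ ! -f "$key_file" ]; then
        echo "missing: $key_file"
        return 1
    fi
    # Arithmetic expansion strips any padding that wc adds on some systems.
    size=$(( $(wc -c < "$key_file") ))
    if [ "$size" -eq 32 ]; then
        echo "ok: $key_file is 32 bytes"
    else
        echo "unexpected size: $size bytes"
        return 1
    fi
}

# Usage (as root): check_raw_key /run/keystore/rpool/system.key
```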

Creating an Encrypted Pool and Dataset

This is the easy part. Since we know where to find the system key we can just create a new pool using the key. These options are taken from this blog post about using ZFS encryption in Ubuntu 20.04:

$ sudo zpool create -f \
                -o ashift=12 \
                -O compression=lz4 \
                -O acltype=posixacl \
                -O xattr=sa \
                -O relatime=on \
                -O normalization=formD \
                -O canmount=off \
                -O dnodesize=auto \
                -O sync=disabled \
                -O recordsize=1M \
                -O encryption=aes-256-gcm \
                -O keylocation=file:///run/keystore/rpool/system.key \
                -O keyformat=raw \
                newpool /path/to/device

If that worked you should be able to get the keystatus of the new pool and see that it's available:

$ zfs get keystatus newpool 
newpool  keystatus  available    -

Now you can create an encrypted dataset on that pool:

$ sudo zfs create newpool/newdata

Now you have an encrypted dataset! Unfortunately, it won't be remounted after a reboot. Figuring out how to do that was the harder part.

Making it Permanent

Mounting encrypted volumes is done using systemd. During early boot a script called zfs-mount-generator creates the systemd units that are necessary to make sure that the key is available to encrypted volumes and that zfs load-key is called for each volume. This is the manual for zfs-mount-generator.

zfs-mount-generator will not see your new pool unless you update its cache, located in /etc/zfs/zfs-list.cache, which is not done automatically. It seems you have to trick it into working. Here's how. Start by creating an empty file with your pool name:

$ sudo touch /etc/zfs/zfs-list.cache/newpool

Now trigger an event that will cause a zedlet to update the cached information in that file:

$ sudo zfs set canmount=on newpool

Verify that the newpool file has contents:

$ cat /etc/zfs/zfs-list.cache/newpool
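
If you'd rather script that check than eyeball the output, something like the following might do. It's only a sketch: cache_is_populated is a name I made up, and the optional second argument exists purely so the function can be tried against a scratch directory:

```shell
# cache_is_populated is a hypothetical helper for illustration.
# It reports whether the zfs-list.cache file for a pool exists and
# is non-empty. The optional second argument overrides the cache directory.
cache_is_populated() {
    pool=$1
    cache_dir=${2:-/etc/zfs/zfs-list.cache}
    if [ -s "$cache_dir/$pool" ]; then
        echo "populated"
    else
        echo "empty or missing"
        return 1
    fi
}

# Usage: cache_is_populated newpool
```

An "empty or missing" result after the zfs set command would suggest the zedlet did not run.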

All done! Your new pool will be automatically decrypted using the same key as your system and be ready after the next reboot.


I think it's a bug in Ubuntu that creation of files in /etc/zfs/zfs-list.cache isn't done automatically. For non-encrypted mounts they're not necessary so perhaps it's an oversight. I hope this helps you!

Saturday, August 6, 2016

Building the Arduino for CS-11M

I'm teaching CS-11M for the second time in the fall and I want to fix some of the things that I thought were "broken" about the last go around. One of the hardest things to manage was the constant rewiring of the Arduino to match the week's project. Debugging circuits is difficult and is not a learning objective for students. Also, my grading program had to detect how students wired their breadboard, which made writing graders a bit of a pain.

This year there's only one circuit. We'll build it in stages over the first few weeks of class. To make it easy on everyone I've created a series of videos to show how the circuit is done. I used YouTube's video editing functions which worked quite well. Huge thanks to YouTube user AgentJayZ whose channel is full of amazing jet engine and gas turbine videos. I learned a lot about jets and a lot about making simple but effective YouTube videos.

Here's my playlist that shows how to build the CS-11M circuit:


Tuesday, December 22, 2015

Arduino Emulator, part 2

In my last post I introduced the Arduino emulator that I wrote for CS-11M. The source code is here:


The emulator was designed for grading and has limitations. The emulator code and the sketch run in one thread so if the sketch goes into an infinite loop the emulator hangs (unless the loop calls into Arduino code which calls the emulator). Sketches often use static initialization (see the Servo library examples) which makes emulator initialization fragile. Most of all it's tedious writing test code in C++.

Before next year I want to rewrite the emulator to make it more flexible and complete. Here are my goals for the next implementation:

  1. Write tests in Python. 
  2. Serial port interaction using Pexpect.
  3. Loadable sketches. Sketches should be loadable at run-time, rather than linked in. Loading a sketch with dlopen() lets me ensure the emulator is initialized before static initialization in the sketch. 
  4. Threading support. It will be necessary for attachInterrupt().
  5. Arduino IDE integration. It would be ideal to have the IDE compile the sketch for the emulator. That way I can integrate the emulator back-end in the same way that different chips and architectures are integrated. 
There's a significant leap in complexity. The emulator will be two processes, the C++ sketch and the Python script, communicating via a socket (and Google's protocol buffers). The emulator state that's now embodied in the Emulator class should live on the Python side, and socket calls from the sketch should be synchronous so test code can be as precise as possible. 

But when will I have the time to build it? 

Monday, December 21, 2015

The Arduino Emulator

I needed a way to grade assignments in CS-11M, which is taught on Arduino. Downloading students' designs would have been extremely tedious. I would have had to be strict about the circuit (exact pins, wiring, values, etc.) which I thought would have negative consequences. So I decided to build myself a simple emulator. You can see the work here:


The emulator works at the Arduino API level. I added to it as needed to make it functional enough for the assignments I gave. Here's the anatomy of the emulator:


Emulator code is here. The Emulator object mediates communication between the sketches and the test code. The implementation of Arduino functions in the main namespace is in emulator.cpp. Defined:

  • micros() and millis()
  • delay() and delayMicroseconds()
  • digitalRead(), digitalWrite() and pinMode()
  • analogRead(), analogWrite() and analogReference()
  • tone() and noTone()
  • random() and randomSeed()
  • map(), etc.
These core functions internally operate on the Emulator object rather than pins or a processor-centric emulator.


The arduino directory contains Arduino code that was ported to the emulator. Porting proved to be easy. The serial port is emulated using stdio, which handles TTYs and pipes properly. There's also a hack that lets test code create input. But it's threaded code that's not synchronized, so it's bound to fail in a complicated test. The arduino directory contains emulation of
  • HardwareSerial (as Serial) including Serial, Stream, Print and WString
  • Servo 
Additional libraries will have to be ported if they're used. 


The tests directory contains tests for class projects. Each <sketch-name>_test.cpp is meant to test an assignment given in class where the student was asked to turn in <sketch-name>.ino. The main directory has shell scripts that glue everything together. 

Building the Emulator

Three parts are built together:
  1. Base emulator code. 
  2. <sketch-name>.ino
  3. The <sketch-name>_test.cpp 
The resulting executable runs the sketch and the test together. The build procedure is: 
  1. Gather <sketch-name>.ino and related sources. 
  2. Use the Arduino command line interface to build the sketch and leave intermediate files. This creates the <sketch-name>.cpp file.  
  3. Compile the <sketch-name>.cpp file and related source with the test code and emulator
Step 2 is important. Arduino creates function declarations that make otherwise illegal C++ legal. Further, a failure at this step tells me that the original sketch didn't compile (i.e. it's the student's fault), whereas most compile failures in step 3 are my fault. 
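
The blame rule for steps 2 and 3 can be sketched as a small wrapper. This is only an illustration: classify_build is a hypothetical name, and because the real commands for the two steps aren't shown in this post they are passed in as strings by the caller:

```shell
# classify_build SKETCH_BUILD_CMD TEST_BUILD_CMD
# Hypothetical helper: runs the two build stages in order and reports
# which one failed. Both commands are placeholders supplied by the caller.
classify_build() {
    sketch_cmd=$1
    test_cmd=$2
    # Stage one stands in for step 2 (building the sketch on its own).
    if ! sh -c "$sketch_cmd" >/dev/null 2>&1; then
        echo "sketch failed to compile (student's problem)"
        return 1
    fi
    # Stage two stands in for step 3 (compiling with the test and emulator).
    if ! sh -c "$test_cmd" >/dev/null 2>&1; then
        echo "test build failed (probably my problem)"
        return 2
    fi
    echo "build ok"
}
```

Stopping at the first failure keeps the two cases from being confused: stage two never runs against a sketch that didn't build by itself.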

The emulator was indispensable but it really needs improvement. In my next post I'll discuss what I would like to do with it. 

Wednesday, November 25, 2015

The Airbnb Android App contains Spyware.

A couple of months ago, while using Wireshark on my home network, I noticed that my phone was talking to China. The conversation between my IP address and the remote server contained this data:
POST /c/ HTTP/1.1
Content-Length: 113
Host: infoc2.duba.net
Connection: Keep-Alive
................X..V....................HTTP/1.1 200 OK
Server: Kingsoft Web Server
Date: Mon, 28 Sep 2015 00:10:05 GMT
Content-Type: text/plain
Content-Length: 36
Connection: keep-alive
Alarming! I wasn't able to find any information about the IP address that seemed useful so, freaked out, I decided to factory reset my phone. I'm so busy during the semester that I thought it would be best to just be done with it.

Fast forward to yesterday, when after getting my IDS back online I saw this:

The RFC-1918 address listed is the internal IP address of my Android phone. So I did some more digging. I started a new Wireshark capture and watched all traffic from my phone overnight. I also installed the very handy Network Connections Android App. I paid for it too so I could capture for a longer time. What did I see? Several connections made to Chinese servers from the Airbnb app. I started to think that I was wrong to worry because it's perfectly reasonable for Airbnb to use international infrastructure (there's only one Internet, right?) Except here's a transcript of the conversation that caused the IDS alert:
POST /v2/report HTTP/1.1
Accept: application/json
Accept-Encoding: gzip
Content-Encoding: gzip
X-App-Key: 9f6627a3f8efaa87b929071c
Authorization: Basic MjgwNzY0MzQ1Mjo4ZTcyZasdfASH234yehereSAJfaA4MDc3ZTVmNDFiNQ==
Content-Length: 427
Host: stats.jpush.cn
Connection: Keep-Alive
K.w..$iJ..B.....3..W........J.....q....D*2l..KL.#.5TV$..eZN......C.Jlrc...5Fk^..c.eH..>;.........q..,.&AQ..C...nK...D.N.....B...'(.4.7.%......m...%..M.. .C.f..L.. 0p....x
>..o&.......HTTP/1.1 200 OK
Server: nginx/1.4.0
Date: Wed, 25 Nov 2015 16:43:28 GMT
Content-Type: text/plain; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Content-Encoding: gzip
So what, you say? Here's what you get when you unzip the data the app sent to the server:
Look closely and you'll notice that the MAC addresses and the signal strength of all the clients on my WiFi as well as my local DNS server were all uploaded to the server along with my location and cellular ID (which I've blanked out with zeros). Here is the result of the Network Connections capture:

49994 7007 com.airbnb.android:10113 11/24/2015 9:27 PM
51390 7004 com.airbnb.android:10113 11/24/2015 9:29 PM
44554 7000 com.airbnb.android:10113 11/24/2015 9:41 PM
34173 7006 com.airbnb.android:10113 11/25/2015 9:01 AM
39308 7004 com.airbnb.android:10113 11/25/2015 8:51 AM
33237 80 com.airbnb.android:10113 11/25/2015 8:51 AM
38221 7004 com.airbnb.android:10113 11/25/2015 9:01 AM
43747 80 com.airbnb.android:10113 11/25/2015 9:01 AM
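
For anyone wanting to repeat the decoding step: the request to stats.jpush.cn carries Content-Encoding: gzip, so once the body bytes have been saved to a file (from Wireshark, for example) they can be decompressed with plain gzip. A minimal sketch, where decode_body and the file name are placeholders of mine:

```shell
# decode_body is a hypothetical helper: it decompresses a gzip-encoded
# HTTP body that has been saved to a file and prints the result.
# Reading from stdin means the file does not need a .gz suffix.
decode_body() {
    gzip -dc < "$1"
}

# Usage: decode_body report-body.bin
```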

Airbnb does this even after you kill its background processes. My conclusion: Airbnb is tracking my every move. I'm uninstalling the app and complaining to Google. 

Friday, May 15, 2015

Snorby on Ubuntu 14.04 LTS

I'm an IT department of one at home. It's difficult to get useful IDS tools working on your network, though it's better than it used to be. Attacks are more sophisticated and easier to execute than ever. I've been experimenting with Suricata IDS and I want to see threats in a maximally useful way. Snorby is a Ruby on Rails based web application that can analyze your IDS logs and give you visibility into your network. Protection is the sum of prevention, detection and response. Log files are not detection. Snorby has a setup guide on its website but I thought I'd make one specific to Ubuntu 14.04. There's a blog with instructions for 12.04 that will break in 14.04, as my students found out.

The key difference between the "vanilla" Snorby installation and this procedure is that I want to use Ubuntu's packaged versions of as many things as possible. I love the way Ruby bundles dependencies and compiles a standalone environment. Any sane admin would sacrifice disk space to reduce system interdependencies. I'm just seeing what I can get away with.

You must have gcc and a supporting build environment installed. I don't show that here. You install the Ruby components with:

$ sudo apt-get install ruby1.9.3 rails bundler rake wkhtmltopdf

Now you need dev packages for the gems that have C/C++ source that needs to be built:

$ sudo apt-get install mysql-server git-core libxml++2.6-dev libxslt1-dev libmysqlclient-dev
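
Before building, a quick check that the tools are actually on your PATH can save a confusing failure halfway through bundle install. This is just a sketch: missing_tools is a made-up helper, and the command names in the usage line are my assumption about what the packages above install:

```shell
# missing_tools is a hypothetical helper: it prints every named command
# that cannot be found on PATH and fails if any are missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return $status
}

# Usage (command names assumed): missing_tools gcc ruby bundle rake wkhtmltopdf git
```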

Now build. This process can be done entirely as a non-root user, therefore it should be. No excuses.

$ sudo mkdir /opt/snorby
$ sudo chown me:me /opt/snorby
$ cd /opt/snorby
$ wget https://github.com/Snorby/snorby/archive/v2.6.2.tar.gz
$ tar -xvf v2.6.2.tar.gz
$ cd snorby-2.6.2
$ bundle install
$ bundle exec rake snorby:setup

During setup I see this warning:
  "Jammit Warning: Asset compression disabled -- Java unavailable."
I'm ignoring it based on reading this thread.

Add a snorby user to the database. Don't fail to change the name and password.

mysql> grant all on snorby.* to 'snorby'@'localhost' identified by 'snorby';
Use the default configurations as a template:

cp config/snorby_config.yml.example config/snorby_config.yml
cp config/database.yml.example config/database.yml

Customize those files to match your needs. Snorby's instructions have the best information. Here's what I changed in my configuration:
  1. Basic configuration: domain, email.
  2. The location of wkhtmltopdf is /usr/bin/wkhtmltopdf
  3. Database username and password.
What's next? Snorby's site has instructions on how to start it. After that you have to integrate the output of Suricata. 

Tuesday, March 24, 2015

Embedded Linux Conference, Day 2: Winners

Here's a postcard from the Embedded Linux Conference. The clear winners of this year are the Internet of Things (IoT) and drones. There was one talk about UAVs last year and this year they have an entire track. That's something of a 20x increase in talks on the subject. Same goes for IoT, which in addition to its own track also scored two of the keynote talks.

One thing that really struck me is how much needs to be done in IoT. Panasonic announced that they are open-sourcing their existing software stack. Intel is headlining their efforts in the Open Interconnect Consortium which includes Cisco as a diamond member (the highest category). Everyone is happy to announce how many people are on board but during the panel discussion the emphasis was on interoperability. Something tells me Apple HomeKit will start with a huge advantage: interoperability guaranteed. Intel presented an interesting talk about IoT security but today that is more concept than implementation.