Python 3 is Awesome!

Today I will tell you about the massive success that is whypy3.com. With hundreds of users a day (on its best day, when it reached page two of Hacker News, and with hundreds actually being 103), it has been a tremendous success in the lucrative Python code snippet market. By presenting small snippets of code showing off cool features of Python 3, I was able to single-handedly convert millions (1e-6 millions, to be exact) of Python 2 users into true Python 3 believers.

It all started when I saw a tweet about a cool Python 3 feature I hadn’t seen before. This amazing feature automatically resolves any exception in your code by suppressing it. Who needs pesky exceptions anyway? Alternatively, you can use it to cleanly ignore expected exceptions instead of the usual except: pass.

from contextlib import suppress

with suppress(MyExc):
    code

# replaces

try:
    code
except MyExc:
    pass
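
For a more concrete (made-up) example, here is how you might delete a file that may or may not exist without the usual try/except dance:

import os
from contextlib import suppress

# Remove a stale temporary file; do nothing if it's already gone.
with suppress(FileNotFoundError):
    os.remove('/tmp/whypy3-demo.lock')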

There are obviously way better and bigger reasons to finally make the move to Python 3. But what if you could be lured in by some cool cheap tricks? And that’s exactly why I created whypy3.com. It’s a tool that we Python 3 lovers can use to slowly wear down an insistent boss or colleague. It’s also a fun way for me to share all my favorite Python 3 features so I don’t forget them.

I was initially going to do the usual static S3 website with CloudFront/CloudFlare. But I also wanted it to be easy for other people to contribute snippets. The obvious choice was GitHub, and since I’m already using GitHub, why not give GitHub Pages a try? Getting it up and running was a breeze. To make it easier to contribute without editing HTML, I decided to use the full-blown Jekyll setup. I had to fight a little bit with Jekyll to get syntax highlighting working, but overall it took no time to get a solid-looking site up and running.

After posting to Hacker News, I even got a few pull requests for more snippets. To this day, I still get some Twitter interactions here and there. I don’t expect this to become a huge project with actual millions of users, but at the end of the day this was pretty fun, I learned some new technologies, and I probably convinced someone to at least start thinking about moving to Python 3.

Do you use Python 3? Please share your favorite feature!

Hypervisor Hunt

After getting burnt by Hyper-V, I decided to go for the tried and true and installed VMware Player on Windows 10. I had to install Ubuntu again on the new virtual machine, but it was a breeze thanks to VMware’s automated installation process. Everything that was missing in Hyper-V was there. I was able to use my laptop’s real resolution, networking over Wi-Fi was done automatically, audio magically started working, and even performance was noticeably better.

After a few weeks of heavy usage, I started noticing some problems with VMware. Every once in a while the guest OS would freeze for about a second. I didn’t pay too much attention to it at first, but it slowly started to wear on me. I eventually realized it always happened when I used tab completion in the shell, and that the real cause was playing sounds (the terminal bell, in this case). It’s still progress over Hyper-V’s inability to play any audio, but it was not exactly a pleasant experience.

The other, far more severe issue was a general lack of performance. It just didn’t feel like I was running Ubuntu on hardware, or even close to it. I experienced constant lag while typing, alt+tab took about a second to show up, compiling code was weirdly slow, video playback was unusable, and everything was just generally sluggish and unresponsive. Overall it was usable, but far from ideal.

Today I finally broke down and decided to give yet another hypervisor a shot. Next up came VirtualBox. I didn’t have high expectations, but VMware was starting to slow me down so I had to try something. Installation was even easier since VirtualBox can just use VMware images. Then came the pleasant surprise. Straight out of the box, performance was noticeably better. Windows moved without lagging, alt+tab reaction was instantaneous, and sound playback just worked. Once I installed the guest additions and enabled video acceleration, video playback started functioning too. I still can’t play 4K videos, but at least my laptop doesn’t grind to a halt on every video ad.

As a cherry on top, VirtualBox was also able to properly set the resolution of the guest OS at boot time. In VMware, I had to leave and re-enter full screen once after login for the real resolution to stick. Switching input between guest and host in VirtualBox is also easier: it requires just one key (right Ctrl) as opposed to two with VMware (left Ctrl+Alt).

I realize these results depend on many things like hardware, drivers, host/guest versions, etc. I bet I could also solve some of these issues if I put some research into it. But for running Ubuntu 17.04 desktop on my Windows 10 Dell XPS 13 with the least hassle, VirtualBox is the clear winner. Let me know if you had a different experience or know how to make it run even smoother.

Things They Don’t Tell You About Hyper-V

I really wanted to like Hyper-V. It’s fully integrated into Windows and runs on bare metal, so I was expecting stellar performance and a smooth experience. I was going to run a Linux box for some projects, get to work with Docker for Windows, and do it all with good power management, smooth transitions, and no sacrifice in performance.

And then reality hit.

  1. Hyper-V doesn’t support resolutions higher than 1920×1080 with Linux guests. And even that is only adjustable by editing the GRUB configuration, which requires a reboot. The viewer allows zooming, but not in full screen mode. With a laptop resolution of 3200×1800, that leaves me with a half-empty screen or a small window on the desktop.
  2. Networking support is mostly manual, especially when Wi-Fi is involved. You have to drop into PowerShell to manually configure a vSwitch with NAT. Need DHCP? Nope, can’t have it. Go install a third-party application.
  3. Audio is not supported for Linux guests. Just like with the resolution issue, you’re forced to use a remote X server or xrdp. Both are a pain to set up and neither provided acceptable performance for me.
  4. To top it all off, you can’t use any other virtualization solution when Hyper-V is enabled. Do you want both Docker for Windows and a normal Linux desktop VM experience? Too bad… VMware allows you to virtualize VT-x/EPT so you can run a hypervisor inside your guest. Hyper-V doesn’t.

It seems like Hyper-V is just not there yet. It might work well for Windows guests or Linux server guests, but for Linux desktop guests it’s just not enough.

False Positive Watch

While debugging any issue that arises on Windows, my go-to trick is blaming the anti-virus or firewall. It almost always works. As important as these security solutions are, they can be so disruptive at times. For developers this usually comes in the form of a false positive. One day, out of the blue, a user emails you and blames you for trying to infect their computer with Virus.Generic.Not.Really.LOL.Sue.Me.1234775. This happened so many times with NSIS that someone created a false positive list on our wiki.

There are a lot of reasons why this happens and a lot of ways to lower the chances of it happening, but at the end of the day, chances are it’s going to happen. It even happened to Chrome and Windows itself.

So I created False Positive Watch. It’s a simple free service that periodically scans your files using VirusTotal and sends you an email if any of your files are erroneously detected as malware. You can then notify the anti-virus vendor so they can fix the false positive before it affects too many of your customers.

I use it to get notifications about NSIS and other projects, but you can use it for your own projects too, for free. All you need to do is supply your email address (for notifications) and upload the file (I delete it from my server after sending it to VirusTotal). In the future I’m going to add an option to supply just the hash instead of the entire file, so you can use it with big files or avoid uploading files that are too private.
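
The core idea is simple enough to sketch in a few lines. This is not the actual code behind the service, just a minimal illustration of checking a file hash against the VirusTotal v2 file report API; the API key and the installer.exe file name are placeholders you would replace with your own.

import hashlib
import requests

VT_API_KEY = 'your-api-key'  # placeholder: your own (free) VirusTotal API key

# Hash the file locally so only the digest needs to be sent.
with open('installer.exe', 'rb') as f:
    sha256 = hashlib.sha256(f.read()).hexdigest()

# Ask VirusTotal for the latest scan report of that file.
report = requests.get(
    'https://www.virustotal.com/vtapi/v2/file/report',
    params={'apikey': VT_API_KEY, 'resource': sha256}).json()

# response_code 1 means VirusTotal already knows the file;
# positives is how many engines flagged it.
if report.get('response_code') == 1 and report.get('positives', 0) > 0:
    print('Possible false positive: %(positives)d/%(total)d engines' % report)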

Docker Combo Images

I’ve been working with Docker a lot for the past year and it’s pretty great. It especially shines when combined with Kubernetes. As the projects grew more and more complex, a common issue I kept encountering was running both Python and JavaScript code in the same container. Certain Django plugins require Node to run, Serverless requires both Python and Node, and sometimes you just need some Python tools on top of Node to build.

I usually ended up creating my own image containing both Python and Node with:

FROM python:3

# The NodeSource setup script adds the Node.js apt repository; python:3 already
# runs as root, so sudo isn't needed (or even installed).
RUN curl -sL https://deb.nodesource.com/setup_8.x | bash -
RUN apt-get install -y nodejs

# ... rest of my stuff

There are two problems with this approach.

  1. It’s slow. Installing Node takes a while, and doing it for every non-cached build is time-consuming.
  2. You lose the Docker way of just pulling a nice prepared image. If Node changes their deployment method, the Dockerfile has to be updated. It’s much simpler to just docker pull node:8.

The obvious solution is going to Docker Hub and looking for an image that already contains both. There are a bunch of those but they all look sketchy and very old. I don’t feel like I can trust them to have the latest security updates, or any updates at all. When a new version of Python comes out, I can’t trust those images to get new tags with the new version which means I’d have to go looking for a new image.

So I did what any sensible person would do. I created my own (obligatory link to XKCD #927 here). But instead of creating and pushing a one-off image, I used Travis CI to update the images daily. This was actually a pretty fun exercise that let me learn more about the Docker Python API, Docker Hub, and Travis CI. I tried to make it as easily extensible as possible so anyone can submit a PR for a new combo like Node and Ruby, Python and Ruby, or Python and Java, etc.
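
The real build script lives in the repository linked below, but the gist of the daily job can be sketched with the Docker Python SDK. The image name is the real one; everything else is simplified and assumes a generated Dockerfile in the current directory and an already logged-in Docker Hub account.

import docker

# Build the combined image from the Dockerfile in the current directory
# and push the resulting tag to Docker Hub.
client = docker.from_env()
client.images.build(path='.', tag='combos/python_node:3_6')
client.images.push('combos/python_node', tag='3_6')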

The end result allows you to use:

docker run --rm combos/python_node:3_6 python3 -c "print('hello world')"
docker run --rm combos/python_node:3_6 node -e "console.log('hello world')"

You can rest assured you will always get the latest version of Python 3 and the latest version of Node 6. The image is updated daily. And since the build process is completely transparent on Travis CI, you should be able to trust that there is no funny business in the image.

Images: https://hub.docker.com/r/combos/
Source code: https://github.com/kichik/docker-combo
Build server: https://travis-ci.org/kichik/docker-combo

Compatible Django Middleware

Django 1.10 added a new style of middleware with a different interface and a new setting called MIDDLEWARE instead of MIDDLEWARE_CLASSES. Creating a class that supports both is easy enough with MiddlewareMixin, but that only works with Django 1.10 and above. What if you want to create middleware that works with all versions of Django so it can be easily shared?

Writing compatible middleware is not too hard. The trick is having a fallback for when the import fails on earlier versions of Django. I couldn’t find a full example anywhere and it took me a few attempts to get it just right, so I thought I’d share my results to save you some time.

import os

from django.core.exceptions import MiddlewareNotUsed
from django.shortcuts import redirect

try:
    # Django 1.10+ provides the mixin that adapts old-style hooks
    from django.utils.deprecation import MiddlewareMixin
except ImportError:
    # Older Django: a plain object base works just as well
    MiddlewareMixin = object

class CompatibleMiddleware(MiddlewareMixin):
    def __init__(self, *args, **kwargs):
        # Raising MiddlewareNotUsed cleanly disables the middleware at startup
        if os.getenv('DISABLE_MIDDLEWARE'):
            raise MiddlewareNotUsed('DISABLE_MIDDLEWARE is set')

        super(CompatibleMiddleware, self).__init__(*args, **kwargs)

    def process_request(self, request):
        if request.path == '/':
            return redirect('/hello')

    def process_response(self, request, response):
        return response

CompatibleMiddleware can now be used in both MIDDLEWARE and MIDDLEWARE_CLASSES. It should also work with any version of Django, so it’s easier to share.
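
For example, registering it in settings.py looks the same under either setting (myapp.middleware is a hypothetical module path; use wherever you actually put the class):

# Django 1.10 and above
MIDDLEWARE = [
    'django.middleware.common.CommonMiddleware',
    'myapp.middleware.CompatibleMiddleware',  # hypothetical import path
]

# Django below 1.10
MIDDLEWARE_CLASSES = [
    'django.middleware.common.CommonMiddleware',
    'myapp.middleware.CompatibleMiddleware',
]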

Dell XPS 13 9350 WiFi

Dysfunctional Broadcom WiFi card

I have had my Dell XPS 13 for almost half a year now. It’s the 2015 model, numbered 9350. Ever since I got it, the WiFi has had issues. It drops packets left and right, especially when the laptop is placed on uneven surfaces like my lap or a carpet. I think it also has something to do with heat, because the network seems to take a while to show up after the computer is initially powered on. The issue worsened with time to the point where I simply can’t use the laptop on my lap anymore.

I am far from alone with this issue. A simple Google search shows many people complaining about some variation of the same problem. Luckily, some of them actually figured out the culprit: the Broadcom WiFi card. Apparently some models come with an Intel WiFi card that works perfectly fine. In the 2016 model, Dell went ahead and completely replaced the faulty Broadcom card with a Killer card across all configurations.

For whatever stupid reason, I decided to call Dell Support before just replacing the card myself. They were predictably useless and wasted two and a half hours of my time. Tier 1 took two hours of blindly following steps from a piece of paper. Tier 3 actually tried some advanced WiFi settings I had never heard of, but that failed too. And then they saw my VirtualBox network adapters. They immediately jumped to the conclusion that I was sharing my connection with other computers (huh?) and that I needed to disable them. So we disabled them, and then, instead of running an actual test like moving the laptop to an uneven surface, they ran pingtest.net once and decided that a single passing test meant the issue was fixed. I hung up.

More angry at myself for wasting my own time than at Dell Support for not doing their job, I went ahead and purchased:

  1. Intel WiFi card 7265NGW
  2. Screwdriver set with Torx T5 because why make things easy to repair?
  3. Plastic pry tools because Torx is not hard enough

Once everything arrived, I followed the service manual and installed the new card in less than 10 minutes. With the right tools in hand, it was really simple and only cost $40. No need to wait 10-14 days for Dell to repair my laptop, and no need to waste my time convincing technical support that it’s a hardware issue and that another Broadcom card won’t help.

Needless to say, the WiFi is finally working perfectly fine on any surface.

Staying Safe Online

I have seen a few “staying safe online” guides lately. I wrote one of my own a while back after some of my friends were threatened online and got worried. This guide should be a good starting point for most casual internet users. It’s important to remember that no matter what you do, if it’s online, it can be hacked.

  • Never reuse passwords
    • Some websites are easier to hack than others
    • Hackers will try the same password on other websites
    • Use LastPass for easier management
  • Don’t use simple passwords
    • Hackers guess passwords all the time
    • There are easy automatic tools that enumerate all password options
    • Don’t use your name, birthday, SSN, or any public information in passwords
  • Keep your computer & phone up-to-date
    • Old software has known and easily exploitable vulnerabilities
  • Never click links in emails
    • Clicking the wrong link can give control of your accounts to hackers
    • Manually browse to the website even if the email looks legit
  • Always logout on public computers
    • Preferably never login on public computers in the first place
    • Data can linger even after logging out
    • Some public computers record your passwords
  • If it was put online, it will stay online
  • Any private information you share can help hackers
    • Your name and birth year can be enough to guess your SSN

Securing Facebook

  • Click the little lock icon on top and follow instructions
    • Set everything to private
    • Hide your birth year
  • Click the little triangle on the top right and choose Settings
    • Enable login alerts to be notified of hacks
    • Enable login approvals
    • Enable trusted contacts in case your account is hacked

Securing Google Account

Stale MapReduce Staging Directories

I had a problem where HDFS would fill up really fast on my small test cluster. Using hdfs dfs -du, I was able to track it down to the MapReduce staging directory under /user/root/.staging. For some reason, it wasn’t always deleting old job directories. I wasn’t sure why this kept happening on multiple clusters, but I had to come up with a quick workaround. I created a small Python script that lists all staging directories and removes any that don’t belong to a currently running job. The script runs from cron, and I can now use my cluster without worrying it’s going to run out of space.

This script is pretty slow and it’s probably possible to make it way faster with Snakebite or even some Java code. That being said, for daily or even hourly clean-up, this script is good enough.

import os
import re
import subprocess

# Ask MapReduce for all known jobs and keep only the ids of jobs that are
# still active (state column 1 or 4).
all_jobs_raw = subprocess.check_output(
  'mapred job -list all'.split())
running_jobs = re.findall(
  r'^(job_\S+)\s+(?:1|4)\s+\d+\s+\w+.*$',
  all_jobs_raw, re.M)

# List the staging directories; each one is named after its job id.
staging_raw = subprocess.check_output(
  'hdfs dfs -ls /user/root/.staging'.split())
staging_dirs = re.findall(
  r'^.*/user/root/\.staging/(\w+)\s*$',
  staging_raw, re.M)

# Anything in .staging that does not belong to an active job is stale.
stale_staging_dirs = set(staging_dirs) - set(running_jobs)

for stale_dir in stale_staging_dirs:
  os.system(
    'hdfs dfs -rm -r -f -skipTrash ' +
    '/user/root/.staging/%s' % stale_dir)

The script requires at least Python 2.7 and was tested with Hadoop 2.0.0-cdh4.5.0.

Download PDB by GUID

Sometimes you get stuck with a broken dump, or no dump at all. You know what you’re looking for, but WinDBG just keeps refusing to load symbols as you beg for mercy from the all-knowing deities of Debugging Tools for Windows. You know exactly which PDB you need, but it just won’t load. The only thing you do know is that you don’t want to go digging for that specific version of your product in the bug report and build a whole setup for it just so you can get the PDB. For those special times, some WinDBG coercion goes a long way.

To download the PDB, create a comma-separated manifest file with three columns per row: the requested PDB name, its GUID plus age (33 characters in total), and the number 1. Finally, call symchk and pass the path to the manifest file with the /im command line switch. Use the /v command line switch to get the download path of the PDB.

To demonstrate I’ll use everyone’s favorite debugging sample process.

C:\>echo calc.pdb,E95BB5E08CE640A09C3DBF3DFA3ABCB42,1 > manifest

C:\>symchk /v /im manifest
[...]
SYMSRV: Get File Path: /download/symbols/calc.pdb/E95BB5E08CE640A09C3DBF3DFA3ABCB42/calc.pdb
[...]
DBGHELP: C:\ProgramData\dbg\sym\calc.pdb\E95BB5E08CE640A09C3DBF3DFA3ABCB42\calc.pdb - OK

SYMCHK: FAILED files = 0
SYMCHK: PASSED + IGNORED files = 1
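
If you only need the file itself, you can also fetch it straight from the public symbol server; the URL format is exactly the path SYMSRV printed above. Here is a rough Python sketch, assuming the server returns the uncompressed PDB (for some files it serves a compressed .pd_ variant instead, which would need expanding):

import requests

# Same PDB name and GUID+age used in the manifest above
name = 'calc.pdb'
guid_age = 'E95BB5E08CE640A09C3DBF3DFA3ABCB42'

# The public symbol server uses the <name>/<GUID+age>/<name> layout
url = 'https://msdl.microsoft.com/download/symbols/%s/%s/%s' % (name, guid_age, name)

# Some symbol servers only answer to debugger-like user agents
resp = requests.get(url, headers={'User-Agent': 'Microsoft-Symbol-Server/10.0.0.0'})
resp.raise_for_status()
with open(name, 'wb') as f:
    f.write(resp.content)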

To force-load the PDB, you need to update the symbol path, turn SYMOPT_LOAD_ANYTHING on, and use the .reload command with /f to force and /i to ignore any so-called mismatches.

kd> .sympath C:\ProgramData\dbg\sym\calc.pdb\E95BB5E08CE640A09C3DBF3DFA3ABCB42
kd> .symopt+0x40
kd> .reload /f /i calc.exe=0x00400000

You should now have access to all the data in the PDB file and stack traces should start making sense.