Kazu's Log

Mostly doing yak shaving.

CircleCI’s AWS CLI is old. That’s why people install the CLI from Python’s pip rather than using apt-get.

Technically speaking, there is no “CircleCI’s AWS CLI”. CircleCI uses Docker, and there are pre-built Docker images, which currently use Debian 8 (Jessie) and Debian 9 (Stretch). Debian 9’s AWS CLI is 1.11.13 and Debian 8’s is 1.4.2. Both are really old; CloudFront was still in “preview” at that time.

It is better to ignore these ancient versions. Just install the latest CLI from pip.

For the next two or three months, I’m going to write a Pomodoro timer app in Swift. I’ve been using JustFocus as my Pomodoro timer. While I am satisfied with about 90% of the app, I sometimes forget to start the timer, mostly after meetings, where I tend to cancel an ongoing pomodoro.

Writing a Pomodoro timer shouldn’t be that hard. I hope that I can “scratch my own itch” by writing my own app.

Having my own project

Another goal of this project is having (and completing) my own software development project.

I’ve been working as a software developer in a relatively large corp for 5+ years. While working in a large corp is fine for me, I recently realized that what I think of as “software development” is becoming “software development within a team, in a large corp”, which is just a small subset of software development.

So, I want to try something different. The repo is here, and I will share my progress on this blog bi-weekly.

Compared to React, Angular’s commit message guidelines are very detailed,

We have very precise rules over how our git commit messages can be formatted. This leads to more readable messages that are easy to follow when looking through the project history. But also, we use the git commit messages to generate the Angular change log.

which are adopted by Vue’s Commit Message Convention.

Because of the guidelines, most of the commit messages on the repos start with “types” that indicate the types of changes.

The chart below shows the types of changes on Angular over time.
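As a rough illustration, the type can be pulled out of a message's first line with a regular expression. This is a minimal sketch, assuming the `type(scope): subject` header shape from Angular's guidelines; the name `commit_type` and the sample messages are made up for illustration, and this is not necessarily how the charts in this post were produced.

```python
import re
from collections import Counter

# Header shape assumed from Angular's guidelines: "type(scope): subject".
# The scope part is optional.
HEADER = re.compile(r'^(?P<type>[a-z]+)(?:\((?P<scope>[^)]*)\))?:\s')

def commit_type(message):
    """Return the leading type of a commit message, or None if absent."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ''
    match = HEADER.match(first_line)
    return match.group('type') if match else None

# Made-up messages, for illustration only.
messages = [
    'fix(compiler): handle empty templates',
    'docs: fix a typo',
    'feat(router): add a guard\n\nLonger description.',
    'Merge remote-tracking branch',
]
type_counts = Counter(commit_type(m) for m in messages)
```

Messages that do not follow the convention, such as merge commits, end up under `None`, which makes it easy to see how consistently a repo applies the guidelines.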

Angular: Types of Changes

Compared to Angular, Vue is much smaller and more sporadic.

Vue: Types of Changes

I recently finished Udacity’s Intro to Machine Learning, although I haven’t finished the final assignment yet. The next step may be learning deep neural networks, but before that, I’d like to see what I can do with what I’ve learned.

The course used the Enron email dataset a lot, and I want to use something similar but not the same to recap things. The dataset should have messages, authors, … How about Git repositories?

But before starting the machine learning part, the first step is loading a Git repository into Python.

GitPython

There are multiple Python libraries that can interact with Git. I’m unsure which would be the best, but GitPython is good enough for me.

First, I convert a Git repository (I used facebook/react) into a JSON file.

import git
import json
...

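def commit_summary(c):
    # Sum the per-file stats (insertions, deletions, lines) over the whole commit.
    result = {}
    files = c.stats.files  # computing stats runs git, so do it only once
    for path, stats in files.items():  # items() on Python 3; iteritems() was Python 2 only
        for k, v in stats.items():
            if isinstance(v, int):  # skip any non-numeric entries
                result[k] = result.get(k, 0) + v
    result['file_count'] = len(files)
    result['committed_date'] = c.committed_date  # seconds since the epoch
    result['hexsha'] = c.hexsha
    result['message'] = c.message
    result['email'] = c.author.email
    return result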

react = git.Repo('../react')

with open('react-commits.json', 'w') as out:
    out.write('[\n')

    commits = react.iter_commits('master')
    index = 0
    for c in commits:
        if index != 0:
            out.write(',\n')
        index += 1

        json.dump(commit_summary(c), out)

    out.write(']\n')
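The comma bookkeeping in the loop above can be factored into a small helper. This is a minimal sketch using only the standard library; `write_json_array` is a hypothetical name, and the records are made up rather than real commits.

```python
import io
import json

def write_json_array(items, out):
    """Write items to out as one JSON array, one element per line."""
    out.write('[\n')
    for index, item in enumerate(items):
        if index != 0:
            out.write(',\n')
        json.dump(item, out)
    out.write('\n]\n')

# Made-up records standing in for commit summaries.
sample = ({'hexsha': h, 'insertions': n} for h, n in [('a1', 3), ('b2', 5)])
buffer = io.StringIO()
write_json_array(sample, buffer)
records = json.loads(buffer.getvalue())
```

Because the helper accepts any iterable, a generator such as `(commit_summary(c) for c in commits)` can be streamed to the file without holding every summary in memory.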

Then load the JSON file into Pandas.

import pandas as pd
...

commits = pd.read_json('react-commits.json')
commits['committed'] = pd.to_datetime(commits['committed_date'], unit = 's')

React Commits over Time

The Y-axis shows insertions and deletions.

Commits on facebook/react
ggplot(aes('committed', 'insertions'), commits) + \
  geom_line(aes(color = 1)) + \
  geom_line(aes('committed', '-deletions', color = 2)) + \
  ylab('Added/Deleted')  + xlab('Committed Date') + \
  guides(color=False) + scale_color_gradient()

There are a few spikes in deletions (newest to oldest):

  1. Delete documentation and website source (#11137)
  2. [site] Load libraries from unpkg (#9499)
  3. New Documentation
  4. Merge remote-tracking branch ‘facebook/master’
  5. remove likebutton from docs for now

While most of them were administrative changes, the last, oldest commit was a bit funny:

it has some facebook-ism in there and it’s probably shouldn’t be on the site.

I would agree so :)
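Spikes like these can also be found without reading the chart. This is a minimal sketch, assuming records shaped like the entries of react-commits.json; the sample data and the name `largest_deletions` are invented for illustration, and this is not necessarily how the list above was compiled.

```python
import heapq

# Made-up records in the same shape as the entries of react-commits.json.
commits = [
    {'hexsha': 'a1', 'deletions': 12, 'message': 'small fix'},
    {'hexsha': 'b2', 'deletions': 9800, 'message': 'remove docs'},
    {'hexsha': 'c3', 'deletions': 430, 'message': 'refactor'},
    {'hexsha': 'd4', 'message': 'empty commit'},  # no stats recorded
]

def largest_deletions(records, n):
    """Return the n records with the most deleted lines, largest first."""
    return heapq.nlargest(n, records, key=lambda r: r.get('deletions', 0))

top = largest_deletions(commits, 2)
```

Commits without any file changes have no `deletions` key in the JSON, so the sketch treats a missing value as zero.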

Just a small tip, which I didn’t know in the beginning.

plotnine’s save method takes dpi as a parameter.

from plotnine import * # import ggplot(), aes(), ...
g = ggplot(...)
g.save('plot.png', width = 10, height = 10, dpi = 100)
g.save('plot-2x.png', width = 10, height = 10, dpi = 200)
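The reason the second file works as a 2x image is simple arithmetic: the pixel size is the size in inches times the dpi. This is a small sketch, assuming width and height are given in inches (plotnine's default unit, as far as I know); `pixel_size` is a hypothetical helper, not part of plotnine.

```python
def pixel_size(width_in, height_in, dpi):
    """Return the (width, height) in pixels of a figure saved at the given dpi."""
    return (round(width_in * dpi), round(height_in * dpi))

normal = pixel_size(10, 10, 100)  # same arguments as plot.png above
retina = pixel_size(10, 10, 200)  # same arguments as plot-2x.png above
```

Doubling the dpi doubles both pixel dimensions while the figure's layout stays the same, which is what a 2x image needs.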

Then you can use srcset to reference both images in HTML.

<img alt="..." src=".../companies.png" srcset=".../companies-2x.png 2x"/>
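If the tags are generated from a script, a tiny helper keeps the two file names in sync. This is a minimal sketch; `img_tag` is a hypothetical function, and the file names are the ones passed to `save` earlier.

```python
from html import escape

def img_tag(alt, src, src_2x):
    """Build an <img> tag whose srcset offers src_2x to 2x displays."""
    return '<img alt="{}" src="{}" srcset="{} 2x"/>'.format(
        escape(alt, quote=True),
        escape(src, quote=True),
        escape(src_2x, quote=True),
    )

tag = img_tag('Commits', 'plot.png', 'plot-2x.png')
```

Browsers that do not understand srcset fall back to the plain `src` image.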

Also during exploration, I occasionally set plotnine.options.figure_size to have relatively large images.

import plotnine
plotnine.options.figure_size = (20, 20)