Spirituality without Religion

Read a Brain Pickings article titled Neuroscientist Sam Harris on Happiness, Spirituality Without Religion, and How to Cultivate the Art of Presence.

The essence of it was the discovery of that feeling in which a person feels extremely comfortable and happy with the self. That feeling is described using the loaded word "spirituality" and is usually associated with religion. For folks who have experienced it, though, it is not related to religion at all.

It exists outside of religion. It is an innate, existential feeling that one can reach when one is honest with one's thoughts and feelings and happy in the present moment.

Open Source Developer - Wes McKinney

The name Wes McKinney rang a bell when I was reading a general news article about a hedge-fund developer talking about the stresses of open source development. Then I realized that the creator of the pandas library and author of the book Python for Data Analysis was now working at Two Sigma. I also came to know that he is a committer on the Apache Parquet project, a columnar data storage format with a C++ implementation. All these noteworthy contributions aside, Wes seems to have a background in statistics from MIT.

There are many interesting people whose work we use, sometimes unaware of the person behind the project.

CPython moved to GitHub

The CPython project moved its source code hosting from a self-hosted Mercurial repository at hg.python.org to the Git version control system hosted on GitHub. The new location of the project is https://github.com/python/cpython

This is the second big version control migration since I got involved. The first one was when we moved from Subversion to Mercurial. Branches were sub-optimal in Subversion and we used svnmerge.py to merge across branches. Mercurial helped there, and everyone got used to a distributed version control system written in Python. It was personally interesting for me to compare Mercurial with the other popular DVCS, Git.

Over the years, GitHub has become a popular place for developers to host their projects, and they have constantly improved their service. Many Python developers got used to the Git version control system and found its utility too.

Two years ago, it was decided that Python would move to Git and GitHub. The effort was led by Brett Cannon, assisted by a number of other developers, and the migration happened on Feb 10, 2017.

I helped with the migration too, providing tooling around converting the hg repository to git using the facilities available from the hg-git Mercurial plugin.

We made use of hg-git and wrote some conversion scripts that could get us to the converted repo we wanted:

  1. https://github.com/orsenthil/cpython-hg-to-git
  2. https://bitbucket.org/orsenthil/hg-git

Now that the migration is done, we are getting familiar with the new workflow.

Yashwant Kanetkar - Let Us C

I came across an article on Yashwant Kanetkar, and it rekindled many of my fond memories.

Let Us C is not merely a book, but a bible for millions of programmers in India.

That is true for me. I read it during the years 2000-2002, and I kept count: I read it, and solved all the problems in that book, more than 100 times.

It has helped me a lot and I am indebted to that book.

The article on Yashwant Kanetkar was published in an Indian magazine. I made a copy of it because, as you can imagine, I will cherish it.

I also watched a short talk by Yashwant Kanetkar in which he explores his journey and offers some words of advice for programmers from India.

Who Slides Wins

A few years ago, I wrote this n-puzzle using pygame for fun. It was inspired by a sliding block puzzle game that I had played as a kid, which had numerals on the front and a picture of the Taj Mahal on the back. The idea was to slide the blocks and fit the photo together.

https://github.com/orsenthil/who-slides-wins

How to play

  1. The human plays first. Use the arrow keys to move the tiles and fit the picture.
  2. Press Enter when done.
  3. The computer then plays and tries to beat you with fewer moves than you took.

It uses A* search with the Manhattan distance heuristic to solve the puzzle.
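
Here is a minimal sketch of that idea for the 3x3 version of the puzzle. It is my own illustration, not the actual game code, so the solver in the repo may differ in its details.

import heapq

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)   # 0 is the blank tile
SIDE = 3

def manhattan(state):
    """Sum of each tile's row and column distance from its goal position."""
    dist = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue
        goal_index = tile - 1
        dist += abs(index // SIDE - goal_index // SIDE)
        dist += abs(index % SIDE - goal_index % SIDE)
    return dist

def neighbors(state):
    """Yield the states reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, SIDE)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        r, c = row + dr, col + dc
        if 0 <= r < SIDE and 0 <= c < SIDE:
            swap = r * SIDE + c
            nxt = list(state)
            nxt[blank], nxt[swap] = nxt[swap], nxt[blank]
            yield tuple(nxt)

def solve(start):
    """A*: always expand the state with the lowest moves-so-far + Manhattan estimate."""
    frontier = [(manhattan(start), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, state, path = heapq.heappop(frontier)
        if state == GOAL:
            return path
        for nxt in neighbors(state):
            if nxt not in best_cost or cost + 1 < best_cost[nxt]:
                best_cost[nxt] = cost + 1
                heapq.heappush(frontier, (cost + 1 + manhattan(nxt), cost + 1, nxt, path + [nxt]))
    return None

print(len(solve((1, 2, 3, 4, 5, 6, 0, 7, 8))) - 1)   # 2 moves for this easy scramble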

If you want to try it on your computer:

  1. Install python2.7
  2. Create a virtualenv.
  3. pip install pygame
  4. clone the git repo.
  5. python run_game.py

Deep Learner Playing Breakout

Let's first watch this video

In this video, I just gave the program a game and it learned to play by itself. No, I did not code the player; that would have been so traditional. Here the player, the computer, the program, actually learns to play by itself just by playing the game! It does not need me.

I recorded this video to experience how a deep learning algorithm actually works. And as you can notice, it works amazingly well! Deep learning is a subset of artificial intelligence that tries to exhibit "intelligent behavior" by using something similar to the human brain's wiring (neural networks). It uses mathematics that, we think, resembles what human brains internally use to exhibit rational thinking.
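
To give a feel for the learning rule underneath, here is a toy sketch of tabular Q-learning, the idea that deep Q-learning builds on. The real player in the video replaces the table with a deep neural network over raw screen pixels, and the action set below is made up, so treat this only as an illustration.

import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1       # learning rate, discount, exploration rate
ACTIONS = ["left", "right", "stay"]          # a made-up action set for illustration

q_table = defaultdict(float)                 # (state, action) -> estimated future reward

def choose_action(state):
    """Mostly pick the best known action, but explore occasionally."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])

def learn(state, action, reward, next_state):
    """Nudge Q(state, action) toward reward plus the discounted best future value."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])

The agent simply plays, observes (state, action, reward, next_state) transitions, and calls learn() on each one; nobody codes the strategy.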

The results of these techniques have been amazing, from beating humans at Go (Thanks For Ruining Another Game Forever, Computers) to making self-driving cars a possibility. The above video should give some idea of how a self-driving car could learn about obstacles and try to navigate by itself.

How to set up your system for a deep learning experiment?

I hope you will be excited to replicate this experiment. If you are interested, here is how I set it up.

  1. Rent a GPU instance from AWS or Azure. Right now, we need GPUs. They are costly, but the deep learning frameworks are not optimized for CPUs. I spent multiple weeks of uptime on a CPU without any results. Go for a GPU; AWS has them.
  2. Set up Ubuntu 14.04 with the proper NVIDIA drivers.
  3. Install X11 and a window manager. It won't be fun otherwise.
sudo apt-get install xubuntu-desktop xfce
  4. Set up viewing of your powerful "cloud desktop" using NoMachine. That's the best way I could find to set up remote graphical viewing.
  5. Clone the DeepMind-Atari-Deep-Q-Learner code.
  6. Install the dependencies.
./install_dependencies.sh
  7. And, as my son would say: here you go!
./run_gpu breakout

You can exit NoMachine with the program still running and keep coming back to monitor your computer as it learns to play the game by itself.

Raft - Visual Explanation Site

I came across this website, http://thesecretlivesofdata.com/raft/, which wonderfully explains the Raft consensus protocol. It's entertaining to follow.

I think, when trying to understand algorithms like these, implementing them is the only way to "get it".

As the article titled Students' Guide to Raft notes:

Inevitably, the first iteration of your Raft implementation will be buggy. So will the second. And third. And fourth. In general, each one will be less buggy than the previous one, and, from experience, most of your bugs will be a result of not faithfully following Figure 2.
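
As a starting point for such an implementation, here is a minimal sketch of the per-server state that Figure 2 of the Raft paper describes. A real implementation adds the RequestVote and AppendEntries RPC handlers, plus the election and heartbeat timers, around this.

class RaftServer:
    def __init__(self, peer_ids):
        # Persistent state on all servers (must survive restarts).
        self.current_term = 0        # latest term this server has seen
        self.voted_for = None        # candidate voted for in the current term, if any
        self.log = []                # list of (term, command) entries

        # Volatile state on all servers.
        self.commit_index = 0        # highest log entry known to be committed
        self.last_applied = 0        # highest log entry applied to the state machine

        # Volatile state on leaders, reinitialized after each election.
        self.next_index = {p: 1 for p in peer_ids}    # next log index to send to each peer
        self.match_index = {p: 0 for p in peer_ids}   # highest index known replicated on each peer

        self.role = "follower"       # follower -> candidate -> leader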

Glyph's post on threads

Glyph Lefkowitz is a great author. In his post titled Unyielding, he makes an excellent case against threads, with many solid references for his arguments. He makes the point that event-driven programs should be the first thing we think of when we think about concurrency.
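
A tiny illustration of that point (mine, not from the post, and using Python 3's asyncio): in an event-driven program, another task can only run at an explicit await, so the check-and-update below needs no lock, whereas preemptive threads could interleave anywhere in between.

import asyncio

balance = 100

async def withdraw(amount):
    global balance
    if balance >= amount:          # no other task can run between this check...
        balance -= amount          # ...and this update, because there is no await here
    await asyncio.sleep(0)         # control is handed over only at this point

async def main():
    await asyncio.gather(*(withdraw(30) for _ in range(5)))
    print(balance)                 # always 10; with threads this pair would need a lock

asyncio.run(main())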

As a developer or project lead, if I have to emulate certain practices for writing high quality software, I think looking up to Glyph and his Twisted project is never a mistake.

SWIM - Group Membership protocol paper

For my computer science paper reading, I picked up a paper called SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol, which I had bookmarked a while ago.

I must have come across it while taking the distributed systems course by Indranil Gupta on Coursera. The paper is interesting and approachable, and it tackles the problem of membership updates in distributed systems.

The primary motivation seems to be that heartbeat-based membership updates do not scale, so the authors provide an alternate mechanism that can be used for membership updates in a group.

The basic protocol uses a random-probing based failure detector and disseminates membership updates via network multicast.

The core innovative concepts include:

  • Epidemic style membership broadcast.
  • Suspicion-based failure detector protocol.

In turn these provide:

  • Constant message load (bandwidth) per member regardless of the number of members in the group
  • Constant time to first-detection of a faulty process regardless of the number of members in the group
  • Low false-positive failure detection rate

This seems to be a popular paper, and there are many implementations available on the web. I think I must have bookmarked it as a suggestion to myself that these would be interesting projects (in Python) to attempt.
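
As a hint of what such a project would involve, here is a rough sketch of one SWIM protocol period. The send_ping and send_ping_req helpers are hypothetical stand-ins for real network calls, and a full implementation adds timeouts, the suspicion timers, and piggybacked dissemination of updates.

import random

K = 3   # number of members asked to probe indirectly on our behalf

def protocol_period(self_id, members, send_ping, send_ping_req):
    """One round: ping a random member, fall back to indirect probes, then suspect."""
    # members: dict mapping member id -> status ("alive", "suspected", ...)
    others = [m for m in members if m != self_id]
    if not others:
        return
    target = random.choice(others)

    if send_ping(target):                    # direct probe acked within the timeout
        return

    # No ack: ask K other members to ping the target on our behalf.
    candidates = [m for m in others if m != target]
    helpers = random.sample(candidates, min(K, len(candidates)))
    if any(send_ping_req(helper, target) for helper in helpers):
        return

    # Still no ack: mark the target suspected rather than failed outright,
    # and let the dissemination component spread this update.
    members[target] = "suspected"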

Dominant Resource Fairness

I was reading the paper on Dominant Resource Fairness and found it approachable, interesting and fairly easy to understand.

Dominant Resource Fairness is a resource allocation strategy used by a system like Mesos.

In general terms, resources are things that a group needs, and the idea is to allocate the resources amongst the members of the group in an efficient way. Examples could be an amount of money (the resource) to be distributed across a group of people in a community, or the cores of a multi-core processor that need to be allocated to the processes running on it.

Dominant Resource Fairness uses a linear programming technique to solve the problem of resource sharing.

In a datacenter with multiple computers, each having CPUs, memory, network cards and many other resources, those resources need to be shared across the processes running in the datacenter. DRF uses the concept of a dominant resource. The dominant share is the maximum share that an entity (process) has been allocated of any resource. For example, if process A has heavy CPU usage and process B has heavy memory usage, the dominant resource for process A is CPU and the dominant resource for process B is memory.

Dominant Resource Fairness seeks to maximize the minimum dominant share across all entities. That is the formulation of the linear programming problem. Doing this in a distributed way, for different tasks with different requirements, is the challenge being solved.

For example, if user A runs CPU-heavy tasks and user B runs memory-heavy tasks, DRF attempts to equalize the CPU share of user A with the memory share of user B. In this case, DRF would allocate more CPU and less memory to the tasks run by user A, and less CPU and more memory to the tasks run by user B. In the single-resource case -- where all jobs request the same resource -- DRF reduces to max-min fairness for that resource.
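
Here is a small sketch of that allocation loop as I understand it from the paper: repeatedly pick the user with the smallest dominant share and launch one of their tasks. The cluster capacity and per-task demands are made-up numbers, and a real scheduler would also handle task completion, weights, and when to try other users' tasks.

capacity = {"cpu": 9.0, "mem": 18.0}                  # total cluster resources (made up)
demands = {"A": {"cpu": 3.0, "mem": 1.0},             # user A: CPU-heavy tasks
           "B": {"cpu": 1.0, "mem": 4.0}}             # user B: memory-heavy tasks

used = {"cpu": 0.0, "mem": 0.0}
allocated = {user: {"cpu": 0.0, "mem": 0.0} for user in demands}
tasks = {user: 0 for user in demands}

def dominant_share(user):
    """The largest fraction of any single resource this user currently holds."""
    return max(allocated[user][r] / capacity[r] for r in capacity)

while True:
    # Pick the user currently furthest behind on their dominant resource.
    user = min(demands, key=dominant_share)
    demand = demands[user]
    if any(used[r] + demand[r] > capacity[r] for r in capacity):
        break                                         # that user's next task no longer fits (a simplification)
    for r in capacity:
        used[r] += demand[r]
        allocated[user][r] += demand[r]
    tasks[user] += 1

print(tasks)                                          # {'A': 2, 'B': 3} with these numbers
print({user: round(dominant_share(user), 2) for user in demands})   # both dominant shares end up ~0.67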

One interesting anecdote I found in the paper shows that enforcing "fairness" in resource sharing is a difficult problem in itself:

A big search company provided dedicated machines for jobs only if the users could guarantee high utilization. The company soon found that users would sprinkle their code with infinite loops to artificially inflate utilization levels.

The paper also quotes economic research on the difficulty of ensuring fairness:

Competitive Equilibrium from Equal Incomes (CEEI), a popular fair allocation policy preferred in the micro-economic domain, is not strategy-proof.