The Pragmatic Scientist's Guide to Causal Inference

My notes from the course taught by Edward Kennedy at CMU. Work in progress: feel free to email me or leave annotations with feedback.

Colors of Spring

A Springy Ride. Dino Pride. Pittsburgh, 2017.

Modeling Disease Trajectories

My next project will (with high probability) involve modeling “disease trajectories” in some form. Though a concrete problem definition is still a few months away, I ran into some interesting recent work in this space that I will be reading soon. This post simply lists and quotes content from these papers that I found interesting. I will append to this post as I come across more.

Indexing Memories Better

I realised I’ve been having memory blanks in the thoughts I have on things I read, secondary research topics, and just about anything else that does not have to do with my primary research. The symptom manifests as this feeling that I know I had some thought(s) about a certain matter, consequential ones that I would like to remember, but being unable to extract that specific memory from within.

So I’ve decided to write things down here before they slip away, which probably means more frequent posts that closely resemble me on Twitter, but without the 140-character chains.

The Elasticity of Consumption

Update: I found a survey that addresses exactly the same questions as outlined in this post (even categorizing shocks and responses similarly!).

I am interested in how individuals respond to changes in their income. The changes may affect periodic income, temporarily or permanently, in the form of income shocks, such as layoffs, or be less egregious periodic income modulations such as delayed pay-days, bonuses and raises. They may also be completely independent from periodic income, but expected, such as receiving your tax refund. Finally, they may be completely unexpected financial windfalls, such as having your security deposit returned by the slumlord you used to rent from in darker times.

Spark Streaming Microbatch Metrics, Programmatically via the REST API

TLDR; metric collection script.

The Spark Streaming web UI shows a number of interesting metrics over time. Tan and I were specifically interested in the (micro)batch start times, processing times and scheduling delays, which we could find no reported way of obtaining programmatically. We were running Spark 2.0.0 on YARN 2.7.2 in cluster-mode.

Coffee on the North Fork

North Fork Roasting Co.

I usually head west from Stony Brook every other weekend to work out of a nice cafe in Manhattan or Brooklyn, but have become increasingly frustrated with the tiny, crowded spaces that I’ve had to put up with. So this weekend, I decided to head east to the North Fork, and was pleasantly surprised.

Online Random Projections

I recently encountered the problem of computing the similarity of pairs of “documents” (which, in my case, were actually graphs), where the documents arrived as a stream of individual words. The incoming stream of words could both initiate new documents and grow existing documents. Interestingly, the words of each document could also arrive out-of-order, but I will ignore this complication for now.
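To make the setting concrete, here is a minimal sketch of one common way to keep a fixed-size random projection per document as words stream in. It is an illustration only and not necessarily the approach the full post takes: the class name `StreamingSketches`, the helper `word_projection`, and the sketch size `DIM = 64` are all assumptions made for this example. Each word is hashed to a pseudo-random ±1 vector, and a document's sketch is the running sum of its words' vectors, so the cosine of two sketches approximates the cosine of the underlying bag-of-words vectors.

```python
import hashlib
import math
import random

DIM = 64  # sketch size; an assumed value, larger gives better estimates


def word_projection(word, dim=DIM):
    """Return a deterministic pseudo-random +/-1 vector for a word."""
    # Seed a private generator from a hash of the word so that the same
    # word always maps to the same vector, without storing a matrix.
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [1 if rng.random() < 0.5 else -1 for _ in range(dim)]


class StreamingSketches:
    """Keeps one fixed-size sketch per document, updated word by word."""

    def __init__(self, dim=DIM):
        self.dim = dim
        self.sketches = {}

    def add(self, doc_id, word):
        # An unseen doc_id starts a new document; a known one grows it.
        sketch = self.sketches.setdefault(doc_id, [0] * self.dim)
        for i, r in enumerate(word_projection(word, self.dim)):
            sketch[i] += r

    def cosine(self, doc_a, doc_b):
        u, v = self.sketches[doc_a], self.sketches[doc_b]
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        return dot / norm if norm else 0.0


# Words for two documents arrive interleaved and in different orders.
stream = [("a", "graph"), ("b", "edge"), ("a", "edge"),
          ("b", "vertex"), ("a", "vertex"), ("b", "graph")]
sketches = StreamingSketches()
for doc_id, word in stream:
    sketches.add(doc_id, word)
print(sketches.cosine("a", "b"))
```

Because a sketch is just a sum, the order in which a document's words arrive does not change it, which is why this particular scheme is indifferent to out-of-order arrival.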

On Profanity in Science

I came across a slightly old paper on Studying the Source Code of Scientific Research. The section on The Sacred and the Profane had me in fits. It reminded me of the less academic but equally hilarious analogue from software development: Commit Logs from Last Night.

On The Pursuit of Beauty

Today I read a profile by the New Yorker on Yitang Zhang, who proved last year that infinitely many pairs of consecutive primes lie within a bounded gap of each other. I found his profile strangely unglamorous for a MacArthur fellow. Many segments of the article made me stop and think about my own academic prejudices and goals. I also liked many of the quotes and the way some concepts were described. So I’ve collected below all the snippets from the article that caught my eye.

Building the Igor REST API

This weekend marks the end of my GSoC with OSUOSL. We now have a full-fledged REST API, a hierarchical CLI and more than a couple of patches to pyipmi implementing IPMI commands. I have already talked about the CLI before the midterm evaluation. In this post, I will focus on the Igor REST API that the CLI delegates all the heavy lifting to.

The Case of Lenovo India

TLDR: Ordered a Z510 from Lenovo India’s The Do Store on May 31; received a damaged laptop on June 7; struggled for a refund/replacement, eventually sending a legal notice; received a refund on August 1.

This post started off as a rant, but has since become an analysis. I hope this proves useful to potential Lenovo buyers, especially since their recent Y and Z series launch in India with 3 years of extended warranty that is useless in practice.

Building the Igor CLI with Click

This summer, I am building an IPMI management console for the OSUOSL. IPMI is the interface implemented by special hardware that lets you command a machine over a network as long as it is plugged in, even if the machine is powered off or refuses to boot an OS. If you have ever had ops engineers rebooting a bricked datacenter server with a magic console, you have seen IPMI.

Micromotives and Macrobehavior

The first chapter of Micromotives and Macrobehavior introduces the phenomenon of people behaving in a self-serving way that results in some regular and predictable pattern of the group’s behavior as a whole.

An interesting question posed is whether there is a relation between the behavioral characteristics of individuals and the characteristics of the aggregate.

Beginning How To Prove It

I just finished working through the introduction of How To Prove It. The book presents a structured approach to reading and writing proofs that is analogous to how we compose elements like if-then statements and do-while loops to write computer programs. I’m personally finding the book very promising, and highly recommend working through the introduction to get a taste of the kind of mental process this provokes.

Building This Blog

This is the third iteration of my blog. This is a meta post on how it was constructed and why certain choices were made.

Prose.io

And here’s a post written in prose.io! No longer will I need to clone and build my Jekyll site locally, and hope that the computer at hand has half-decent Vim syntax highlighting.

Which Markdown?

Which markup processor should I use? My needs for one happen to be simple: it should be under active development, should make embedding LaTeX dead simple, and should be deployable with Markdown on GitHub Pages, since I plan to compose posts in prose.io. On offer, we have Redcarpet, RedCloth, Rdiscount, kramdown and Maruku.

Infinite Loops

My second attempt at Coursera’s Scala course turned out to have a refreshingly twisty plot. The grand plan this time is to blaze through the course using both Scala and Clojure, and then top it off by dishing out code reviews of Storm (Clojure) and Kafka (Scala).

Throwing Darts

Estimating π on a beanstalkd cluster.

Update (November 6, 2012): We further organized the content of this post and conducted a 3-hour tutorial at PyCon India 2012. The source code and tutorial slides are available on our GitHub repository.
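For readers new to the dart-throwing idea, here is a minimal single-machine sketch of the Monte Carlo estimate. It does not use beanstalkd at all; the function names `throw_darts` and `estimate_pi`, the seeds, and the split into four "workers" are assumptions made for illustration, standing in for jobs that a cluster would run independently. Darts land uniformly in the unit square, and the fraction that fall inside the quarter circle of radius 1 approaches π/4.

```python
import random


def throw_darts(num_darts, seed):
    """Throw darts at the unit square; count hits inside the quarter circle."""
    rng = random.Random(seed)  # a per-worker generator keeps runs independent
    hits = 0
    for _ in range(num_darts):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(results):
    """Combine (hits, darts) pairs from independent workers into one estimate."""
    total_hits = sum(hits for hits, _ in results)
    total_darts = sum(darts for _, darts in results)
    # Area of the quarter circle over area of the square is pi / 4.
    return 4.0 * total_hits / total_darts


# Four hypothetical workers, each throwing 50,000 darts with its own seed.
darts_per_worker = 50_000
results = [(throw_darts(darts_per_worker, seed), darts_per_worker)
           for seed in range(4)]
pi_estimate = estimate_pi(results)
print(pi_estimate)
```

Each worker only needs to report two integers, which is what makes the problem easy to spread across a job queue: the combining step is a sum and a division.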

The Finish Line

I stumbled across this gem on Reddit.

I make sure to start every day as a producer, not a consumer.

When you get up, you may start with a good routine like showering and eating, but as soon as you find yourself with some free time you probably get that urge to check Reddit, open that game you were playing, see what you’re missing on Facebook, etc.

Put all of this off until “later”. Start your first free moments of the day with thoughts of what you really want to do; those long-term things you’re working on, or even the basic stuff you need to do today, like cooking, getting ready for exercise, etc.

This keeps you from falling into the needy consumer mindset. That mindset where you find yourself endlessly surfing Reddit, Facebook, etc. trying to fill a void in yourself, trying to find out what you’re missing, but never feeling satisfied.

When you’ve started your day with doing awesome (not necessarily difficult) things for yourself, these distractions start to feel like a waste of time. You check Facebook just to make sure you’re not missing anything important directed at you, but scrolling down and reading random stuff in your feed feels like stepping out into the Disneyland parking lot to listen to what’s playing on the car radio - a complete waste of time compared to what you’re really doing today.

It sounds subtle, but these are the only days where I find myself getting anything done. I either start my day like this and feel normal and productive, or I look up and realize it’s early evening, I haven’t accomplished anything and I can’t bring myself to focus no matter how hard I want to.

Why I'm Now A Pseudo-Buddhist

This might come out as insensitive and immature to those of you who’ve breezed through greater ordeals, but even the smallest triplet of losses in quick succession might have one running for spiritual solace.

Graphy In TBioMed

A tiny but significant moment of joy for Rijul and me: a Graphy creation is published and live in an international journal. It’s nice to see someone finding Graphy useful enough for the tiny but essential task of drawing a complex graph to embed in a research paper.

The Snipping Tool in Gnome

For the uninitiated, it’s a nifty tool I’ve seen being used on Vista that lets you select an area of the screen to take a screenshot of, instead of the entire screen; and it’s been lying around on Gnome all this while!

Quick Fix: Natty, Ubuntu Classic and Compiz

If you’re reading this, you must be a Unity reject: the ones who upgraded to Natty and are now left in the cold after being told their hardware’s just too yesterday. If you’re beginning to miss your wobbly windows and other frills on Ubuntu Classic, give this quick hack a try.

Graphy: In Studio

A screenshot gallery; click on a picture to enlarge it.

Drawing graphs

Figure 1: Constructing a graph: graphs can be directed or undirected, and can be simple or hyper-graphs (with self-loops). Once a graph type has been selected, construction involves simply selecting an element type (vertex or edge) and clicking/dragging to create it.

Graphy: A First Look

Graphy screenshot

Graphy is a teaching and learning tool that lets you visualize graph theoretic algorithms on graphs you can construct yourself. The image above shows one of the graphs we tested Graphy’s step-by-step visualization expertise on; I never thought a depth-first search would look that cool. Get Graphy!

$1 Per Pixel, And Sold Out!

The main motivation for doing this is to pay for my degree studies, because I don’t like the idea of graduating with a huge student debt. I know people who are paying off student loans 15-20 years after they graduated. Not a nice thought!

Alex Tew

Who Owns Your Code?

The company owns the rights to all work produced during the term of employment.

It probably won’t be long before you find yourself looking at something similar in the corporate contract of that big company you’re looking forward to joining.

Graphy: Alpha Testing Begins!

Take Graphy-alpha for a little walk!

And we mean alpha; which is a polite way of telling you that if you’re nice to Graphy, Graphy might be nice to you. Hey, it’s not going to fry your computer, so give it a try! Post in whatever you’d like to say, ask or complain about in the comments to this post; we’ll compile the frequent questions and complaints into our wiki as soon as we get the time.

Sup, Letux 400

The Letux 400, a rebranded Skytone Alpha 400.

The Samsung Galaxy 3, Android, USB and Linux

This is a hack to get the Samsung Galaxy 3 to mount as a USB drive on Linux. For some reason, it doesn’t react to being plugged in to my computer, apart from a meek beep and an indication that it’s slyly sucking power from the USB port.

pyFreeSMS: A Python API

This is a Python API to send free messages via the many online free SMS providers. It currently works only with 160by2; I’ll add support for Way2SMS et al. soon, though I’m hoping people interested in a specific provider will extend this API themselves.

The Metal Ball With Legs

The tremors grew massive. Digging my nails into steel, I dreaded my shell falling apart. I could hear many voices, agitated, excited. Gradually, they grew distant, until all was still but the constant roar from the outside. A force pulled me to the ground, unseen but strong. I stayed down, tongue hanging limp, trying to soothe the bloody muscle pounding against my frame. Silently, I pleaded to the blinking lights, asking to be taken back to my old kennels. They soon stopped blinking.

Kudryavka

I wanted to do something nice for her: She had so little time left to live.

Dr. Vladimir Yazdovsky

Lift-Off From Moscow

I may have come across as pompous, a snob even. Truth be told, Father, great Siberian husky that he was, instilled in me restraint. He often lay by, watching drearily, even as his comrades before him took to each other’s throats over the night’s last morsel. I never knew Mother. I was told she was a dainty terrier, which might give some meaning to my slightness. Living thus phlegmatic, and of meager means and form, I must admit, kept the life in me quite supple. Then I turned three.

M.A.D.: Our Winning Entry

Our winning entry for the event M.A.D., held during Quark 2011 at BITS - Pilani, Goa Campus, and designed by Tanmay Binaykiya and me. The contest was to conceptualize and design an A3-sized print advertisement endorsing the theme Mahindra Scorpio: The Iconic SUV.

Verilog On Linux: Defenestrating Modelsim

Defenestrate /diˈfɛnəˌstreɪt/ (verb) : to throw (a person or thing) out of a window. The word originated from a couple of incidents in Prague, back in the 15th century, when a bunch of guys stormed in and tossed seven town officials out the window (quite literally).

Provoking Excerpt #01: SICP

It’s my first time coming across computer science literature that reads like one of Bertrand Russell’s philosophical pieces. Here’s an excerpt from the foreword I’ve just gotten past; if you’ve taken a course in programming languages this semester, steal this book!

Getting Android Sources Behind A Restrictive Proxy

I’ll have to assume you’re suffocated by both the following bottlenecks in getting the Android Open Source Project code:

  • Blocked git:// protocol and port.
  • A limit on the amount you’re allowed to download.

apt-fast: Accelerate Apt-Get Downloads

apt-fast accelerates your apt-get downloads using axel as a download manager. Axel segments the downloads and fetches them from multiple servers; typical download manager stuff. The developer claims an increase in speed of up to 26 times.

xkcd.pl

Get xkcd.pl here.

A simple Perl script to rip the entire bunch of xkcd webcomics to your hard disk. This script is a Perl rendition of xkcd.get() and a number of similar webcomic-to-local-disk aggregators littered all over the web, with a notable difference being the addition of the tooltip as a caption to the downloaded xkcd webcomic image.

My Social Network: A Simple Connected Graph

The image here is my Facebook social graph; with my friends, my friends of friends and so on as the vertices connected by edges representing their Facebook friendships, limited only by members’ individual privacy options. You can click on it (a better idea would be to right-click and then do a “Save Link As”) to open a 32MB ultra high-res .png file where you can actually zoom in to read the names and stuff.

GraphML With JUNG: Saving Out

This is one of two posts I’m dedicating to saving out to and loading from GraphML using the JUNG library. These are two parts of a really good library that lack sufficient documentation.

GraphML With JUNG: Importing GraphML

This is a follow up to my earlier post on saving to GraphML using the JUNG library. It would make more sense if you browsed through that post before reading this one.

Cell Phone + Bluetooth = Webcam!

While I’m posting the instructions specifically for Linux boxes, a little perseverance should have you running it on any box that’s supported! What you’ll need is:

Easter Eggs!

Here are some cool easter eggs you’ll probably find on your Linux box; I really don’t know whether they’re GNOME specific, but there’s a good chance none of them will wipe out your hard drive, so go ahead and try them out!

Here's Something New | Foreheads!

It rings again. I swat it like that last mosquito I clumsily missed, expecting it to stop buzzing. Silence, momentarily. It starts again, while the voice echoes in my head, “Multivitamins.” Buzz. I can’t take it any more.

September Rain

The campus in September, 2009. When the walkways had to be waded through.

IRC | A Quickie Noobie Guide

Pretty simple; here’s a quick recipe on how to bake this cake on Pidgin.

Hilltop Karting at Verna

To be very frank, karting bores me. Like all repetitive action; running in circles, lifting weights (hypothetically), chewing gum. But this turned out to be like one of those times when you were younger, when you’d spin around until you got dizzy, and do it again and again till you end up with a sick stomach.

ध्वनिक आतंकवाद

Yeah, that’s my desk. Yes, it’s a calm, peaceful, academically inspiring little cranny in the corner of my room. Yes, that is John Mayer blazing out of my speakers; a feeble attempt at self defence. I cower behind the lens, trying to drown out the sonic pain.

Windows Eats Grub

This is a common problem that surfaces when you install a Windows OS after you install Ubuntu; Windows conveniently wipes out the GRUB bootloader (the Linux equivalent of NTLDR, the “Choose Operating System menu”), locking you out from booting into Ubuntu. The fix for this is quick and easy in the typical case.

Tux: The Dark Side (Part I)

It’s pretty routine to see the flustered Linux user feverishly banging on CTRL - ALT - DELETE, trying to get his misbehaving Linux box to do something familiar. And then finally giving up and pulling the plug, probably frying his motherboard or corrupting his hard drive in the process.

TODO: Punish Operating System

89% of all video games contain some form of violence. Let’s break convention here, and come face to face with another dirty piece of reality, and this time we don’t spare the video-game teetotallers. A glimpse into the average computer user’s psyche would bring up a variety of graphic images; disturbing, surreal fantasies of chaining your computer to a wall, flogging its rear, yelling,

“Why don’t you just do what I tell you to!?”

Baby Steps | Beginning My GSoC Research

Most babies take their first stumble a year from the time they’re born. Considering I’ve been programming in some form or the other since the Fourth Grade, I’m a ten year old just crawling out of his diapers; it’s never too late to take that first step!

Eyes Half Closed

Before the bleak rays of dawn trickle through the curtain blinds, the night drowns in its silence, the essence of life made conspicuous by its absence. While the town lies in sweet slumber, somewhere, some place, a bright orange glob begins to stir.