post

Revamped isbn.net.in

New & revamped isbn.net.in


Because of constant badgering from loyal isbn.net.in users (Navdeep, Chinmay, Hari, Sandip, Vidyaraj, Kartik, Vivek, Arjun, Leo, Ravi and others), I finally had to dedicate a weekend to fixing up isbn.net.in. However, instead of just fixing up the old code base, I rewrote it to use Compojure instead of the deprecated Noir library, and along the way, I re-did some of the code design to make it more flexible for editing and debugging.

I can’t believe people still use the site 3 years after I wrote the first version and put it up, especially with so many comparison shopping sites for India announced in these 3 years which are more functional and cover more categories. But, hey, can’t argue with those users :)

Caveats: It’s still a work in progress. The JSON API and a few other features are not yet present in this version, and more ecommerce stores have to be added; I will work on those going forward. I could use the help if anybody has time: it’s open-sourced at https://github.com/swaroopch/isbnnetinclj2.

post

[Tech] Why is “Database Layer as a REST API” not common?

We have “database APIs” such as abstraction layers over multiple SQL databases and ORMs. But why not take it to the next step and make it a REST API like any other network call that we can make?

Database as a REST API


Advantages would be:

Did we just sort-of reinvent Datomic?

Of course, this is not a new idea at all; take restSQL as an example. My question is: why is this not talked about more often?

Do most frameworks support this? If not, why not? If so, why don’t most frameworks talk about such a use case in their documentation? If I use Django, I’ll start writing the models and use South to create migrations, and that’s that. If I have to reuse those models from, say, Java, then I’m on my own. The point is that, by default, Django (or Rails) doesn’t encourage you to do such a thing. If you go for a lighter framework such as Flask, then this becomes easier because the ORM is not part of the framework anyway.
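As a rough illustration of the idea, here is a minimal sketch of one table exposed over HTTP using only Python’s standard library (sqlite3 plus a plain WSGI callable). The books table, the /books route and the call helper are all made up for this example, and a real service would also need authentication, validation, pagination and so on.

```python
import io
import json
import sqlite3

# Hypothetical schema, for illustration only.
DB = sqlite3.connect(":memory:", check_same_thread=False)
DB.row_factory = sqlite3.Row
DB.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")


def list_books():
    rows = DB.execute("SELECT id, title FROM books ORDER BY id").fetchall()
    return [dict(row) for row in rows]


def add_book(title):
    cursor = DB.execute("INSERT INTO books (title) VALUES (?)", (title,))
    DB.commit()
    return {"id": cursor.lastrowid, "title": title}


def app(environ, start_response):
    """WSGI app: GET /books lists rows, POST /books inserts one row."""
    method = environ["REQUEST_METHOD"]
    if environ.get("PATH_INFO", "") != "/books":
        status, body = "404 Not Found", {"error": "not found"}
    elif method == "GET":
        status, body = "200 OK", list_books()
    elif method == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        title = payload.get("title")
        if title:
            status, body = "201 Created", add_book(title)
        else:
            status, body = "400 Bad Request", {"error": "title is required"}
    else:
        status, body = "405 Method Not Allowed", {"error": "method not allowed"}
    data = json.dumps(body).encode("utf-8")
    headers = [("Content-Type", "application/json"),
               ("Content-Length", str(len(data)))]
    start_response(status, headers)
    return [data]


def call(method, path, payload=None):
    """Invoke the WSGI app in-process, without opening a socket."""
    raw = json.dumps(payload).encode("utf-8") if payload is not None else b""
    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    }
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)
```

Any language with an HTTP client could then read or write the table without sharing model code. For local experiments, wsgiref.simple_server.make_server("", 8000, app).serve_forever() should be enough to put it on a port.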

Is this concept needed only in a polyglot case (multiple database systems, multiple programming languages)?

P.S. Also read Stevey’s Google Platforms rant.

Update on [2013-04-28 Sun]: Also see the very useful tech talk Designing a Beautiful REST+JSON API.

post

Wrote an EDN format reader and writer in Python

I was reading about the EDN format over the weekend. EDN (pronounced like in “eden garden”) is a data format in the same league as JSON but is supposed to have some nifty features such as sets, keywords, date-time type, custom types, and also being a proper subset of Clojure.

Having a date-time type as well as custom types seems useful to me, so I took a look at the current Python implementations of the EDN format, and I didn’t find them satisfactory. For example, one of the listed ones consisted entirely of custom parsing code that was difficult to read, and another was not even a real implementation, just boilerplate code.

So I thought why not create a better implementation and I did – it is up on GitHub at https://github.com/swaroopch/edn_format.

It has been a long time since I did lex and yacc, so it was a fun weekend project :)
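To give a feel for the format, here is a toy reader for a small subset of EDN: nil, booleans, numbers, strings, keywords, vectors, lists, sets and maps. It is a hand-rolled recursive-descent sketch and not how edn_format itself works. The names Keyword, loads and read_form are just for this example, string escapes are borrowed from JSON as an approximation, and tagged elements such as #inst, symbols, characters and comments are left out.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    """An EDN keyword such as :name, kept distinct from plain strings."""
    name: str


# Commas count as whitespace in EDN, so they are skipped between tokens.
TOKEN = re.compile(r'[\s,]*(#\{|[\[\]{}()]|"(?:\\.|[^"\\])*"|[^\s,\[\]{}()"]+)')
CLOSERS = {"[": "]", "(": ")", "{": "}", "#{": "}"}


def read_atom(token):
    if token == "nil":
        return None
    if token == "true":
        return True
    if token == "false":
        return False
    if token.startswith('"'):
        return json.loads(token)  # approximation: JSON-style escapes
    if token.startswith(":"):
        return Keyword(token[1:])
    for convert in (int, float):
        try:
            return convert(token)
        except ValueError:
            pass
    raise ValueError("unsupported token: %r" % token)


def read_form(tokens):
    """Consume one complete value from the front of the token list."""
    token = tokens.pop(0)
    if token not in CLOSERS:
        return read_atom(token)
    items = []
    while tokens and tokens[0] != CLOSERS[token]:
        items.append(read_form(tokens))
    if not tokens:
        raise ValueError("unbalanced %r" % token)
    tokens.pop(0)  # drop the closing delimiter
    if token == "[":
        return items                # vector -> list
    if token == "(":
        return tuple(items)         # list -> tuple
    if token == "#{":
        return frozenset(items)     # set -> frozenset
    if len(items) % 2:
        raise ValueError("map needs an even number of forms")
    return dict(zip(items[0::2], items[1::2]))


def loads(text):
    tokens = TOKEN.findall(text)
    value = read_form(tokens)
    if tokens:
        raise ValueError("unexpected trailing data")
    return value
```

For example, loads('{:name "Eden", :tags #{:a :b}}') gives a dict keyed by Keyword objects, with a frozenset for the tags. Because vectors become Python lists, they cannot be nested inside sets or used as map keys in this sketch.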

post

Learning Clojure

I once happened to attend a RubyConfIndia talk by C42’s Steven Deobald who said:

data > functions > macros > compilers

That kind of stuck in my head even though I didn’t know what it meant at that time. I understood it only after learning Clojure and “The Clojure / Lisp way”. I realized it when I was writing Python code for work, and I suddenly noticed I was writing code differently and I had one of those good aha moments that is supposedly the start of a person’s Lisp journey.

I’m now amused at how often I break down my Python or Java code into lots of little functions instead of the 100-liner functions that I used to write before and am still surprised that I never realized I was writing them! The good thing about the “lots of little functions” is the modularity and the ease with which I can write, read, understand and importantly test the code without having to build an object hierarchy first.

For example, my code has now suddenly started looking like this, where data structure is explicitly written down and the processing code is separate from it – this makes the code really reusable. It is a contrast to my earlier programming style where I would’ve probably had the data structure implicit in the parsing code (which makes it less maintainable) or worse, had classes and objects to do the same and it would certainly have not been so reusable! Think of a typical Java programming workflow where I would have had to create a class to represent the data input and passed that to a processor class instance and so on.

# http://www.lexicon.net/sjmachin/xlrd.html
import xlrd

DATA_SHEET_NUMBER = 0
START_ROW = 3 # skip headings

# Explicit structure of the data
COLUMN_MAPPING = {
    'name' : 0,
    'class' : 1,
    'maths' : 2,
    'geography' : 3,
    'english' : 4,
}

def row_to_dict(sheet, row_number):
    assert isinstance(sheet, xlrd.sheet.Sheet)
    assert isinstance(row_number, int) and row_number > 0 and row_number < sheet.nrows
    # Code that will work with changing structure
    return {key: sheet.cell_value(rowx=row_number, colx=column) for key, column in COLUMN_MAPPING.items()}

def import_excel(content):
    book = xlrd.open_workbook(file_contents=content)
    sheet = book.sheet_by_index(DATA_SHEET_NUMBER)
    # Code that will work with different spreadsheet formats
    sheet_data = [row_to_dict(sheet, row_number) for row_number in range(START_ROW, sheet.nrows)]
    sheet_data = [data for data in sheet_data if len(data['name']) > 0] # Ignore empty rows
    return sheet_data

if __name__ == '__main__':
    from pprint import pprint
    pprint( import_excel(open('test.xls', 'rb').read()) )
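To show why keeping the structure explicit helps with reuse, here is a small sketch of the same pattern applied to CSV text using only the standard library. The import_csv function and the sample columns are hypothetical; the point is only that the mapping stays the single place that knows the layout.

```python
import csv
import io

# Explicit structure of the data, same idea as above.
COLUMN_MAPPING = {"name": 0, "class": 1, "maths": 2}


def row_to_dict(row, mapping=COLUMN_MAPPING):
    # Works with any row that can be indexed by column number.
    return {key: row[index] for key, index in mapping.items()}


def import_csv(text, start_row=1, mapping=COLUMN_MAPPING):
    rows = list(csv.reader(io.StringIO(text)))[start_row:]  # skip headings
    records = [row_to_dict(row, mapping) for row in rows if row]
    return [record for record in records if record["name"]]  # ignore empty rows
```

Supporting a new column or a different layout then means editing the mapping, not the processing code.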

To be clear, Python was a good first step; what changed was the mindset after attempting to learn a Lisp language. As Peter Norvig once said:

Basically, Python can be seen as a dialect of Lisp with “traditional” syntax (what Lisp people call “infix” or “m-lisp” syntax). One message on comp.lang.python said “I never understood why LISP was a good idea until I started playing with python.” Python supports all of Lisp’s essential features except macros, and you don’t miss macros all that much because it does have eval, and operator overloading, and regular expression parsing, so some–but not all–of the use cases for macros are covered.

A good friend of mine once said that Python is more popular because it is more approachable by traditional programmers and hence a more “social” programming language, whereas Lisp is a powerful language but not for everyone. That is explained in detail in the Lisp Curse essay.

So the first good thing about Clojure is that it is a Lisp. The second is that it runs on the JVM, which has solid performance, sometimes 20x better if you use it right. The third is solid Java interoperability. This was important to me because, as a consultant, Java is unavoidable and I’ve written more Java code this year than I ever have. Using a good dynamic language on top of the JVM with good Java interoperability is a path to making my work go faster. At least, that was how I got started. After all, your code will end up reflecting your company.

The downside I felt when I was grokking Clojure is that the syntax is not simple, even though simplicity is the claim of traditional Lisps. For example, #"" is a regex, #{} is a set, #_ discards the next form (the reader still has to parse it, but the result is thrown away as if it were commented out), #() is an anonymous function, #' is a var quote (it refers to the var itself rather than its value), and so on.

Here is a quick idea about Clojure’s philosophies that I was pointed to:

clojure three circles

Another interesting point is that functional programming languages are growing and it is probably because the future is DSLs again.

If you’re still not convinced, you should watch The Curious Clojureist. And you should definitely watch all the Rich Hickey talks.

How to learn Clojure

The O’Reilly Clojure book is the best book that I’ve come across yet.

However, equally important, my strong recommendation is that Clojure is good only when combined with Emacs and ghoseb’s emacs setup. After learning Clojure in that environment, writing Python again makes me miss so many goodies (To get up to the same productivity in a few ways, I’m using PyCharm these days and am enjoying that).

To make my learning solid, I rewrote isbn.net.in for the third time in Clojure. The source code is at https://github.com/swaroopch/isbnnetinclj – be prepared to read some amateurish Clojure code.

I got a lot done in ~280 lines of Clojure code compared to 480+ lines of code in Ruby/Rails and a ton more boilerplate code. This difference in number of lines of code repeats often.

One interesting point is that, because of the Clojure way of thinking, I ended up using a simple combination of future and core.cache to fetch prices from book stores in parallel, rather than bringing in a full-fledged background job processor (delayed_job). That vastly simplified the system. You can read that code in stores.clj.
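For readers who don’t know Clojure, here is a rough Python analogue of that combination: a thread pool standing in for future and a small time-based dict standing in for core.cache. This is not a translation of stores.clj. The function names, the ten-minute freshness window and the timeout are assumptions made for the sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor

CACHE_TTL_SECONDS = 600       # assumption: prices may be ten minutes stale
FETCH_TIMEOUT_SECONDS = 10    # assumption: give up on a slow store
_cache = {}                   # isbn -> (fetched_at, prices)
_executor = ThreadPoolExecutor(max_workers=8)


def fetch_prices(isbn, stores):
    """Ask every store at once; a failing store yields None, not an error."""
    futures = {name: _executor.submit(fetch, isbn)
               for name, fetch in stores.items()}
    prices = {}
    for name, future in futures.items():
        try:
            prices[name] = future.result(timeout=FETCH_TIMEOUT_SECONDS)
        except Exception:
            prices[name] = None
    return prices


def get_prices(isbn, stores):
    """Return cached prices while they are fresh, otherwise refetch."""
    entry = _cache.get(isbn)
    if entry is not None and time.monotonic() - entry[0] < CACHE_TTL_SECONDS:
        return entry[1]
    prices = fetch_prices(isbn, stores)
    _cache[isbn] = (time.monotonic(), prices)
    return prices
```

Here stores is a plain dict from a store name to a function that takes an ISBN and returns a price, so adding a store means adding one entry. There is no job queue, worker process or job table to maintain, which is the simplification being described.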

Ending Thoughts

I got started with this journey because of frustrations with Java, and at the same time I was trying not to be narrow-minded with experience in just the Python/Ruby/Perl languages (they are so similar). I kept reminding myself of what Douglas Crockford said:

WHAT WERE THE TRAITS OF THE WEAK PROGRAMMERS YOU’VE SEEN OVER YOUR CAREER?

That’s an easy one—lack of curiosity. They were so satisfied with the work that they were doing was good enough (without an understanding of what ‘good’ was) that they didn’t push themselves.

I’m much more impressed with people that are always learning. The brilliant programmers I’ve been around are always learning.

You see so many people get into one language and spend their entire career in that language, and as a result aren’t that great as programmers.

A programming language becoming popular is almost never about the merits of the language itself; rather, it is a virtuous cycle of availability of programmers or platform requirements. JavaScript and Objective-C are popular because you have no other choice, not only because of the merits of the language. Similarly, Clojure is leveraging the JVM and whatever native platform it runs on, and hence is getting the initial lift needed to make the language appealing, since people don’t want to learn and start on yet another ecosystem.

This is best explained by Alan Kay himself:

Q: What should Java have had in it to be a first-quality language, not just a commercial success?

Alan Kay: Like I said, it’s a pop culture. A commercial hit record for teenagers doesn’t have to have any particular musical merits. I think a lot of the success of various programming languages is expeditious gap-filling. Perl is another example of filling a tiny, short-term need, and then being a real problem in the longer term. Basically, a lot of the problems that computing has had in the last 25 years comes from systems where the designers were trying to fix some short-term thing and didn’t think about whether the idea would scale if it were adopted. There should be a half-life on software so old software just melts away over 10 or 15 years.

It was a different culture in the ’60s and ’70s; the ARPA (Advanced Research Projects Agency) and PARC culture was basically a mathematical/scientific kind of culture and was interested in scaling, and of course, the Internet was an exercise in scaling. There are just two different worlds, and I don’t think it’s even that helpful for people from one world to complain about the other world—like people from a literary culture complaining about the majority of the world that doesn’t read for ideas. It’s futile.

Did you know that Lisp and Smalltalk are not so much in vogue because they were killed by bad hardware!?:

Alan Kay: Yes, actually both Lisp and Smalltalk were done in by the eight-bit microprocessor—it’s not because they’re eight-bit micros, it’s because the processor architectures were bad, and they just killed the dynamic languages. Today these languages run reasonably because even though the architectures are still bad, the level 2 caches are so large that some fraction of the things that need to work, work reasonably well inside the caches; so both Lisp and Smalltalk can do their things and are viable today. But both of them are quite obsolete, of course.

Lastly, I wanted to mention that my Clojure journey would not have been sustained if it weren’t for Baishampayan Ghose (a.k.a. @ghoseb, a.k.a. BG), whose untiring answers to my dumb questions were instrumental in me finally gaining some understanding of Clojure and Lisp in general. Thanks BG!

P.S. Watch this 2011 talk by Alan Kay. As @ghoseb would say, Be prepared to blow your mind.