TL;DR
We're building a microservice platform called Hasura.io (alpha release scheduled for summer 2015), and we are using Haskell as the core programming language to build it.
This is a post for people who aren't sure about using Haskell in production, written to convince them otherwise. We're not proselytising for the conversion of huge enterprise codebases entirely into Haskell. But a little bit of Haskell here and there is probably safe.
The post is bereft of any actual code samples and quantitative benchmarks (we'll keep adding them here as we go along).
Our Background
We're a team of developers with fairly diverse backgrounds. Most of our production experience has been with mainstream languages like C, Java, Python, C++, C# and JavaScript. When we set about the task of building a PaaS/BaaS, we started evaluating what sort of language and toolkit would be best suited for building the core platform.
What Attracted Us To Haskell:
To be really honest, we're a bunch of young people, and seeing something like this gets us really fired up:
Haskell is an advanced purely-functional programming language. An open-source product of more than twenty years of cutting-edge research, it allows rapid development of robust, concise, correct software. With strong support for integration with other languages, built-in concurrency and parallelism, debuggers, profilers, rich libraries and an active community, Haskell makes it easier to produce flexible, maintainable, high-quality software.
Source: wiki.haskell.org
Purely functional (whatever that meant at the time)! Twenty years of cutting-edge research! Robust, concise and correct software! We'd already bumped into some conversations about how tight and helpful the Haskell community was supposed to be.
This was the stuff of engineering dreams. But Haskell was notorious for having a steep learning curve. Was it worth putting business requirements at risk? Was it really as hard as people claimed it to be? What was up with the category theory discussions that come up at the drop of a hat?
And then we came across this: "Beating the Averages".
The promise of something that's hard to learn, but then works really well and increases productivity by a few orders of magnitude, is a pretty solid and sensible promise.
We all do use vim after all.
First steps with Haskell
After a slightly more rigorous evaluation with some sample programs, we started working on projects (webapps in production) with some parts written entirely in Haskell. (Yay, microservices!) Some of the first real-world things we wrote included a ZeroMQ broker, a generator that automatically produces a typesafe CRUD and query API from a database schema, a Redis caching layer and more.
Some of the libraries we used during development were written by stalwarts of the Haskell and programming languages community. Haskell library authors tend to be spectacularly experienced, qualified and/or savvy people. Coupled with the nature of the language, this means the code you write stands on good foundations, both safety-wise and performance-wise. Check out the JSON parsing speed improvements as the library version numbers go up!
After each project, we felt we were getting better superlinearly. Our experiences carried forward elsewhere too. Our Python code became more functional, concise and easier to maintain.
A Real World Problem: Expose a subset of SQL in JSON with fine grained permissions
We needed performant libraries for setting up an HTTP server, parsing JSON and interfacing with Postgres. Easy AST manipulation, the ability to reason about correctness, and the ability to add new features easily were the prime requirements during development. Above all, we needed the final program to scale up well and be efficient with the tons of IO it would have to do.
Requirements: Flexible codebase, Scale up, Handle IO well
Language | Programming ease/desire | Memory footprint | Concurrency |
---|---|---|---|
Java | None | High | Mature |
Node | Ok, but programming with callbacks | Ok | Very Immature |
Haskell | Yes. Program 'synchronously' and GHC handles the rest. | Low | Mature |
Python | Yes | High | Not easy |
Objectively too, Haskell was the choice here.
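To give a feel for what "program 'synchronously' and GHC handles the rest" means in the table above, here is a minimal sketch using only the base library. The names `fakeQuery` and `runAll` are made up for illustration and are not from our codebase; `fakeQuery` merely stands in for a blocking call such as a database query. Each action is written as ordinary sequential code and runs in its own lightweight thread.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Hypothetical stand-in for a blocking call such as a database query.
-- threadDelay parks only this lightweight thread, not the whole program.
fakeQuery :: Int -> IO Int
fakeQuery n = do
  threadDelay 100000 -- 0.1 s, an arbitrary illustrative delay
  return (n * n)

-- Run every action in its own lightweight thread and collect the
-- results in the original order. No exception handling: if an action
-- throws, the corresponding takeMVar would block, so this is a sketch only.
runAll :: [IO a] -> IO [a]
runAll actions = do
  boxes <- forM actions $ \act -> do
    box <- newEmptyMVar
    _ <- forkIO (act >>= putMVar box)
    return box
  mapM takeMVar boxes

main :: IO ()
main = do
  results <- runAll (map fakeQuery [1 .. 100])
  print (sum results)
```

There are no callbacks anywhere in `fakeQuery`: it reads top to bottom, and the hundred calls still overlap instead of running one after another.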
The three performant libraries that we ended up with were:
It was a breeze to rapidly prototype and test individual components before composing them into a whole. Midway through the project, we had to switch to a different Postgres library. The changes that we had to make were so localized and easy to reason about that the entire transition took only a few programmer-days.
Surprisingly, we got done really quickly. When we think back, we realize that the majority of the time we spent on the project went into the problem we had set out to solve. The code in the project relates very closely to the actual JSON to SQL transformation rules we've formulated on paper, and Haskell quickly got everything else out of the way.
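As a rough illustration of how transformation rules on paper can map onto code, here is a heavily simplified sketch. Every type and function name in it (`Select`, `Cond`, `applyPermission`, `renderSelect` and so on) is hypothetical and invented for this post; it is not our actual implementation, and it starts from an already-parsed query rather than from JSON. It assumes a permission can be modelled as an extra condition that is AND-ed into every query.

```haskell
import Data.List (intercalate)

-- Hypothetical, heavily simplified query AST (illustration only).
data Value = IntV Int | TextV String

data Cond
  = Eq String Value -- column = value
  | And Cond Cond
  | TrueC           -- no restriction

data Select = Select
  { table   :: String
  , columns :: [String]
  , filter_ :: Cond
  }

-- Assumption: a permission is an extra condition AND-ed into the query.
applyPermission :: Cond -> Select -> Select
applyPermission perm q = q { filter_ = And perm (filter_ q) }

renderValue :: Value -> String
renderValue (IntV n)  = show n
renderValue (TextV s) = "'" ++ concatMap escape s ++ "'"
  where
    escape '\'' = "''" -- double any single quote inside the literal
    escape c    = [c]

renderCond :: Cond -> String
renderCond TrueC     = "TRUE"
renderCond (Eq c v)  = c ++ " = " ++ renderValue v
renderCond (And a b) = "(" ++ renderCond a ++ " AND " ++ renderCond b ++ ")"

-- Note: a real implementation should quote identifiers and prefer
-- parameterized queries over building SQL strings.
renderSelect :: Select -> String
renderSelect q =
  "SELECT " ++ intercalate ", " (columns q)
    ++ " FROM " ++ table q
    ++ " WHERE " ++ renderCond (filter_ q)

main :: IO ()
main = putStrLn (renderSelect (applyPermission ownRows query))
  where
    ownRows = Eq "author_id" (IntV 42)
    query   = Select "article" ["id", "title"] (Eq "published" (TextV "yes"))
    -- prints: SELECT id, title FROM article WHERE (author_id = 42 AND published = 'yes')
```

Each rule is one equation on the AST, which is roughly what we mean by the code relating closely to the rules on paper.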
A Conclusion to the Ramble
To conclude this slightly rambly post, let's revisit some of the initial claims made by the maintainers of the language, and how they held up for us in practice. There are so many discussions about how and why to use Haskell. In fact, here's a good place to see it being thrashed out: What is Haskell actually useful for?
Purely functional: We could reason about our entire program, and the code reflected our reasoning almost exactly.
Concise code: Write code now, refactor soon. Smaller code lets you refactor larger portions easily (and hence frequently).
Robust code: Static typing with type inference, and a smart compiler (The Glorious Glasgow Haskell Compilation System).
Testing: As everyone waxes poetic about, the type system removes the need for a huge class of unit tests. QuickCheck makes another class of tests quite a breeze, by automatically generating test cases for assertions that you make about your functions. Most of what remains is best tested manually anyway (e.g. IO-related failures).
Concurrency with IO:
GHC's threading abstractions and IO manager make it easy to think about concurrently executing computations and give you all the benefits of event-based asynchronous runtimes like Node.
GHC has non-blocking IO (like Node), supports real threads across multiple cores naturally (unlike Node) and works using events underneath (like Node). It's not even a fair fight.
Rich libraries: Haskell libraries are extremely easy to integrate and very easy to dig into and modify safely for your own use (and to maintain as forked versions).
Deployment: Private cloud deployments of Haskell code become really easy, because all library dependencies can be statically linked. This makes versioning and deployment a one-step process.
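To make the Testing point above concrete, here is a minimal QuickCheck sketch. The function `dedupe` and both properties are hypothetical examples written for this post, not code from our projects. It assumes the QuickCheck package is installed; `quickCheck` generates random inputs and checks each property against them.

```haskell
import Test.QuickCheck (quickCheck)

-- Hypothetical function under test: collapse runs of repeated elements.
dedupe :: Eq a => [a] -> [a]
dedupe (x : y : rest)
  | x == y    = dedupe (y : rest)
  | otherwise = x : dedupe (y : rest)
dedupe xs = xs

-- Property: deduplicating twice is the same as deduplicating once.
prop_idempotent :: [Int] -> Bool
prop_idempotent xs = dedupe (dedupe xs) == dedupe xs

-- Property: the result is never longer than the input.
prop_notLonger :: [Int] -> Bool
prop_notLonger xs = length (dedupe xs) <= length xs

main :: IO ()
main = do
  quickCheck prop_idempotent
  quickCheck prop_notLonger
```

We only state what should hold for every list; the test cases themselves are generated for us.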
Some common roadbumps
Type errors can be bizarre for a beginner, and parse errors are not helpful at all.
Documentation: Types are great for documentation when they are simple and concise. This is mostly the case, but in some libraries (ahem... lens), the type signatures offer little to no documentation. In these cases, it takes significant effort to use the library, but the types eventually make sense and are often what one returns to.
Language extensions: More often than not, when you're looking at a library in Haskell, you'll see use of GHC's language extensions. Some are very intuitive (OverloadedStrings, ScopedTypeVariables). Some are not (RankNTypes). This makes it harder to “learn from the source”.
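For a taste of one of the intuitive extensions, here is a small sketch of OverloadedStrings using only the base library. The `TableName` type and `selectAll` function are hypothetical and exist only for this illustration. With the extension on, a string literal can stand for any type that has an `IsString` instance, not just `String`.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- Hypothetical wrapper type for table names (illustration only).
newtype TableName = TableName String
  deriving (Show, Eq)

-- With OverloadedStrings, a literal can be read as a TableName.
instance IsString TableName where
  fromString = TableName

selectAll :: TableName -> String
selectAll (TableName t) = "SELECT * FROM " ++ t

main :: IO ()
main = putStrLn (selectAll "article")
-- prints: SELECT * FROM article
```

The literal `"article"` is converted through `fromString`, so the call site stays readable while the function still demands a `TableName` rather than an arbitrary string.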
The Clincher
No matter what you work on, there is a joy in programming with Haskell. You feel better after you write progressively harder code. And progressively harder code keeps coming your way. There's so much working Haskell code out there that you can just stare at for days and not really understand (but always use!).
The satisfaction we feel after a good day of Haskell is unparalleled, and at the end of the day, that's what it's really all about, isn't it?
From Zero to HIPster (Haskell In Production)
Original article: http://my.oschina.net/u/236698/blog/401549