Blog Moved

WordPress has some formatting problems when displaying posts published via Gmail, so I’m moving my personal technical blog to Blogger:

http://agentzh.blogspot.com/

Please update your RSS subscriptions and/or bookmarks 🙂

Posted in Uncategorized | 1 Comment

My fork of ngx_eval 2011.01.12 released

People have been reporting issues with my fork of ngx_eval when using it with recent versions of nginx. I’ve updated it to version 2011.01.12 to make it compatible with nginx 0.8.54+:

https://github.com/agentzh/nginx-eval-module/downloads

Now "eval_subrequest_in_memory" is off by default, because the subrequest_in_memory mode still has issues.

Here’s a small example for using ngx_eval with my ngx_redis2, an nginx upstream module that supports almost the full Redis 2.0 unified wire protocol:

location /foo {
    eval_override_content_type 'application/octet-stream';
    eval_subrequest_in_memory off;
    eval $res {
        redis2_literal_raw_query 'set one 5\r\nfirst\r\n';
        redis2_pass 127.0.0.1:6379;
    }
    echo [$res];
}

Accessing /foo with curl yields

[+OK

  ]

on my side.
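For readers unfamiliar with the wire format, here is a minimal Python sketch (not part of ngx_redis2 or ngx_eval) of how a command looks in the Redis unified request format, and of why the echoed $res above ends with a line break. The helper names are made up for illustration.

```python
def encode_unified_request(*args: str) -> bytes:
    """Encode a command as a Redis unified (multi-bulk) request."""
    # "*<number of arguments>" line first, then "$<byte length>" + data per argument.
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def parse_status_reply(raw: bytes) -> str:
    """Return the text of a single-line status reply such as +OK."""
    if not raw.startswith(b"+") or not raw.endswith(b"\r\n"):
        raise ValueError("not a status reply: %r" % raw)
    return raw[1:-2].decode("utf-8")


# Setting key "one" to the 5-byte value "first", as in the example above.
print(encode_unified_request("set", "one", "first"))
print(parse_status_reply(b"+OK\r\n"))
```

The status reply keeps its trailing CR LF when it is captured into $res, which is why the closing bracket shows up on a later line in the curl output.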

Please ensure that you’re using nginx 0.8.54+ and a recent version of my fork of ngx_eval (at least 2010.12.29 is required).

If you find any issues, please let me know.

Special thanks go to Valery Kholodkov for creating ngx_eval in the first place.

P.S. Still, it’s recommended to use the ngx_lua module’s rewrite_by_lua or access_by_lua directives combined with the "ngx.location.capture" API to do what my fork of ngx_eval has been doing.


ngx_lua v0.1.1: access_by_lua and access_by_lua_file landed

After the first public release of ngx_lua, I’m proud to announce the v0.1.1 version, which introduces the access_by_lua and access_by_lua_file directives:

https://github.com/chaoslawful/lua-nginx-module/downloads

Now we can code up our own nginx access-phase handlers directly in pure Lua, with all the capabilities of rewrite_by_lua and content_by_lua, like firing up subrequests to other locations, reading/writing nginx variables, changing response headers, internal redirection, HTTP 301/302 redirection, throwing out 403/500/etc error pages, and so on.

Here’s a small example that emulates ngx_auth_request’s functionality in pure Lua:

location / {
    access_by_lua '
        local res = ngx.location.capture("/auth")
        if res.status == ngx.HTTP_OK then
            return
        end
        if res.status == ngx.HTTP_FORBIDDEN then
            ngx.exit(res.status)
        end
        ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
    ';

    # proxy_pass/fastcgi_pass/postgres_pass/…
}

which is approximately equivalent to

location / {
    auth_request /auth;

    # proxy_pass/fastcgi_pass/postgres_pass/…
}

except that "auth_request" runs at the beginning of the nginx access phase while "access_by_lua" runs at the end of the access phase.
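The decision logic of the Lua handler above can be restated as a small truth table. The following Python sketch is only an illustration of that control flow (the function name and the use of None for "let the request through" are my own choices, not anything from ngx_lua):

```python
HTTP_OK = 200
HTTP_FORBIDDEN = 403
HTTP_INTERNAL_SERVER_ERROR = 500


def access_decision(auth_status: int):
    """Map the /auth subrequest status to the outcome of the access phase.

    Returns None when the request may proceed to the content handler,
    otherwise the HTTP status code the request is aborted with.
    """
    if auth_status == HTTP_OK:
        return None  # corresponds to the bare "return" in the Lua handler
    if auth_status == HTTP_FORBIDDEN:
        return auth_status  # pass the 403 through unchanged
    # Any other status from /auth is treated as a server-side failure.
    return HTTP_INTERNAL_SERVER_ERROR


for status in (200, 403, 401, 502):
    print(status, "->", access_decision(status))
```

Note that in this handler every status other than 200 and 403 (for example a 401 from the /auth location) ends up as a 500 for the client.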

ngx_lua is an nginx C module that embeds the Lua or LuaJIT interpreter into the nginx core. You can find the latest source code and the complete documentation here:

https://github.com/chaoslawful/lua-nginx-module

Enjoy!


ngx_lua v0.1.0: scripting nginx with Lua

I’m happy to announce the first public release of the ngx_lua module, v0.1.0:

https://github.com/chaoslawful/lua-nginx-module/downloads

This module embeds the Lua/LuaJIT interpreter into the nginx core and integrates the powerful Lua threads (aka Lua coroutines) into the nginx event model by means of nginx subrequests.

Unlike Apache’s mod_lua and Lighttpd’s mod_magnet, Lua code written atop this module can be 100% non-blocking on network traffic as long as you use the ngx.location.capture interface to let the nginx core do all your requests to mysql, postgresql, memcached, redis, upstream http web services, and so on (see the ngx_drizzle, ngx_postgres, ngx_memc, ngx_redis2, and ngx_proxy modules for details).

You can find the latest source code as well as the latest documentation here:

http://github.com/chaoslawful/lua-nginx-module

We’ve already run ngx_lua as well as dozens of other nginx C modules in Taobao.com’s web applications for months and this module is considered production ready.

With ngx_lua, it’s now possible to construct true C10K-capable web applications because everything can be trivially made non-blocking: not only the network traffic to and from the clients, but also the traffic to backends like RDBMS and memcached clusters. And we’ve been working hard to make it a much better substitute for traditional web development solutions like PHP, not only in terms of performance, but also in ease of use.

Enjoy!


Video for my ngx_openresty talk on ECUG 2010

I gave a talk (in Chinese) on ngx_openresty (aka nginx.conf scripting) at ECUG 2010 last month. Here’s the video of the talk:

    http://v.ku6.com/show/D00rqtnRwKzJdIsB.html

The slides can be viewed in a web browser from here:

    http://agentzh.org/misc/slides/nginx-state-of-the-art/

Please use the arrow keys on your keyboard to switch pages in the slides.

Have fun!


ngx_memc v0.11 released: small bug fixes and more documentation

I’ve just released the v0.11 version of our ngx_memc module:

  http://github.com/agentzh/memc-nginx-module/tarball/v0.11

This version applies the patch from iframist, fixing the zero size buf alert in error.log when $memc_value is set to empty ("").

The ngx_memc module extends the standard memcached module to support almost the whole memcached ascii protocol. And it can be used with either the standard memcached server or other backends supporting the memcached wire protocol like TokyoTyrant.
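To make "memcached ascii protocol" a bit more concrete, here is a minimal Python sketch of how a storage command such as set is laid out on the wire. It is not code from ngx_memc; the function name and defaults are assumptions for illustration only.

```python
def build_storage_command(cmd: str, key: str, value: bytes,
                          flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached ASCII storage command (set/add/replace/...).

    Layout: "<cmd> <key> <flags> <exptime> <bytes>\\r\\n<data>\\r\\n"
    where <bytes> is the length of the data block.
    """
    header = f"{cmd} {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


print(build_storage_command("set", "foo", b"bar"))
# An empty value is still a valid command with a zero-length data block.
print(build_storage_command("set", "foo", b""))
```

The second call shows the empty-value case: the data block has length zero but the command is otherwise unchanged.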

Maxim Dounin’s excellent ngx_http_upstream_keepalive module can also be used with this module to provide a powerful connection pool for memcached.

See ngx_memc’s wiki page for more details:

  http://wiki.nginx.org/HttpMemcModule

I’ve also updated the wiki page for this release and documented various directives like memc_connect_timeout, memc_read_timeout, memc_send_timeout, and memc_buffer_size. (These directives are inherited directly from the standard memcached module.)

Enjoy!


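Internship opportunities at the Taobao (Beijing) Quantum (量子) team

We are the Quantum team ( http://lz.taobao.com ) in the Data Platform department of Alibaba’s Taobao.com (Beijing), and we are looking for students with ideas and passion to join our team as interns.

The main near-term work for our current internship positions includes

  • Researching and implementing automated testing methods for our hive/hadoop-based data warehouse system (Perl 5, PostgreSQL/mysql, Hive/Java)
  • Managing the submission and asynchronous execution of hive computing jobs in the online applications of our data products
  • Improving our data web service platform based on nginx and ngx_openresty (C, Lua, Erlang, and our self-designed LZSQL language)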

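Interested students can find more details on my personal blog:

    http://agentzh.spaces.live.com

We have no requirements regarding school, degree, or major, because we believe that truly promising people need not come from a prestigious university or have formal training in the field. Nor do we require deep or broad knowledge and technical background; after all, much of what we do here is use or create entirely new knowledge and technology ourselves. What we mainly value is

  • A strong thirst for knowledge and curiosity, a keen interest in new things and new ideas, a willingness to put great effort into them, and a habit of building fun things on your own
  • Perseverance as well: a willingness to settle down and systematically study disciplines, theories, and technical details, with patience and the ability to work through quiet stretches
  • A good command of English, so that you can read large amounts of English documentation and other materials without difficulty
  • Availability for a fairly long and stable internship (6 months or more), because much of the important work needs enough time to reach sufficient depth and breadth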

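Below are the basic technical skills that we consider necessary for this job

  • Regularly working in Linux or other *NIX environments, and being familiar with the shell and the command line
  • Regularly using text editors like vim and emacs
  • A fairly good foundation in ANSI C programming
  • Familiarity with at least one dynamic scripting language, such as Perl or Python
  • Experience with some version control system, such as git or subversion
  • Familiarity with at least one make-like project build and management tool, such as GNU make
  • Familiarity with SQL and some RDBMS, such as mysql or PostgreSQL

What we can offer, and have always advocated, is mainly

  • A fairly relaxed atmosphere for working and learning
  • Fairly detailed technical guidance and support
  • Aggressive, yet appropriately cautious, research into new technologies and their use in production
  • Turning boring manual labor into interesting, creative work by designing our own languages and tools
  • Internship work that is as interesting and as challenging as that of a full-time employee

Our office is located at: 25th floor, Taikang Financial Tower, Building 1, No. 38 Courtyard, Dongsanhuan North Street (东三环北街), Chaoyang District, Beijing.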

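If you are interested in this internship, please email your résumé to me at chunlai at taobao dot com, and make sure that the two characters “应聘” (meaning "job application") appear in the subject line so that I can spot it promptly 🙂

章亦春 (Zhang Yichun, agentzh)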
 

LZSQL compiler hacking and future VoltDB applications

We’ve been working on a compiler for our own language named "lzsql". It’s for our nginx-based web service platform that drives our data product lz.taobao.com. Our "lzsql" compiler can now emit lua code that has passed lots of real world tests.

We can now decide whether to run a sql query at a remote mysql node or at the nginx core, all in the lzsql language.

For "local sql queries", we’ve implemented a full-fledged sql engine in pure lua. It’s damn fast, especially using LuaJIT. 6k q/s for a single nginx worker process is not uncommon in our benchmark.

And we’ve introduced a type system in our language such that it can handle sql quoting rules automatically. The typechecker can ensure that a lzsql variable with a specific type is used correctly in the context of the sql query. The sql language is part of the language anyway. Therefore, sql injection cannot happen.
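The idea of letting a declared type pick the quoting rule can be sketched in a few lines of Python. This is not how the lzsql compiler is implemented; the type names, the function names, and the MySQL-style backslash escaping are all assumptions made for the sake of a small, self-contained illustration (real escaping also depends on the server’s SQL mode and character set).

```python
def quote_text(value: str) -> str:
    # Assumption: MySQL-style escaping of backslashes and single quotes.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def quote_int(value) -> str:
    # int() raises ValueError for anything that is not an integer,
    # so non-numeric input never reaches the query text.
    return str(int(value))


# Hypothetical mapping from a declared variable type to its quoting rule.
QUOTERS = {"text": quote_text, "int": quote_int}


def render_literal(declared_type: str, value) -> str:
    """Turn a variable's value into a SQL literal according to its declared type."""
    return QUOTERS[declared_type](value)


print(render_literal("text", "x' or '1'='1"))
print(render_literal("int", "42"))
```

Because the quoting rule is chosen from the declaration rather than by whoever writes the query, a value can only ever appear in the query as a literal of its declared type.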

We mostly use the "local sql engine" for merging data from completely different data sources, like those from both mysql and a non-relational data source. We do have some non-relational data sources like our real time stats services and other Java-powered web services from other departments of Taobao.com.

Here’s a small example:

   text $pattern;
   location $mysql_node;

   @a := select count(id) as count
         from cats
         where name contains $pattern
         group by park
         at $mysql_node;

   @b := select count(id) as count
         from other_service.some_api($pattern)
         group by park;

   return (@a union all @b);

In this sample, "other_service.some_api" is a non-blocking call to some remote non-relational data source. And the first SQL query runs on a remote mysql node specified by the variable $mysql_node while the last two both run directly in the nginx core by our sql engine written in Lua.
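To give a feel for what the "local" part of the sample has to do, here is a rough Python sketch of a count-per-group step followed by a union all. It only mirrors the semantics of the two local queries on in-memory rows; the function names and the sample data are invented, and our real engine is written in Lua, not Python.

```python
def count_by_park(rows):
    """Emulate: select count(id) as count from <rows> group by park."""
    counts = {}
    for row in rows:
        counts.setdefault(row["park"], 0)
        # count(id) only counts rows whose id is not NULL.
        if row.get("id") is not None:
            counts[row["park"]] += 1
    # One output row per group, carrying only the "count" column.
    return [{"count": n} for n in counts.values()]


def union_all(*result_sets):
    """Emulate: a union all b (concatenate, keeping duplicates)."""
    merged = []
    for result_set in result_sets:
        merged.extend(result_set)
    return merged


# Pretend @a came back from the remote mysql node ...
a = [{"count": 5}]
# ... and these rows came back from the non-relational service.
service_rows = [
    {"id": 1, "park": "north"},
    {"id": 2, "park": "north"},
    {"id": 3, "park": "south"},
]
b = count_by_park(service_rows)
print(union_all(a, b))
```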

The .lzsql source file is compiled down to (very compact) Lua code before deploying to our production servers. Because it is a true compiler that runs ahead of time, we can afford to implement the whole thing in Perl 5, one of the not-so-fast scripting languages (approximately 3k lines of hand-written code). Perl modules like Moose and Parse::RecDescent have made the compiler construction process quite enjoyable 🙂

In the future, the lzsql compiler is also expected to optimize the sql queries automatically for a specific remote sql engine, like mysql’s.

The lzsql compiler will eventually be released under an open source license with the name "RestyScript" once we decouple our specific business logic from the compiler. For now, we hardcode some business logic into the compiler for the sake of convenience. We’re going to move it into compiler plugins or language extensions and make the lzsql toolchain itself more general.

My intern students become very productive when they start using the lzsql language 😉 The old system they’re replacing is written in tons of ugly php code, oh well 😉 we’ve cut off 90% of the codebase size and also got 20 ~ 30 times faster 😀

We’re also wrapping our heads around VoltDB, a really nice in-memory database. And we’re looking forward to rewriting our "real time stats services" mentioned above using VoltDB together with Erlang, Lua, or something similar. An nginx upstream module for the VoltDB binary protocol is also on chaoslawful’s and my TODO list.

The only sad part regarding VoltDB is that it’s written in Java, but it’s not a very big issue for us. It has some ugly limitations regarding its sql and interfaces, but we can work around those details on the level of our lzsql language and just use it (combined with java) as the runtime.

It’s already starting to become more and more interesting 🙂

Stay tuned!


ngx_lua module updated

chaoslawful++ has just added several new features to our ngx_lua module. Check out his blog article for details:

    http://chaoslawful.javaeye.com/blog/755013

ngx_lua embeds the power of the Lua language into the nginx core and can be used to construct high performance web services and web applications:

    http://github.com/chaoslawful/lua-nginx-module

Happy lua hacking! 😀


ngx_set_misc v0.14: extending ngx_rewrite’s “set” directive

I’m happy to announce the first public release of our Taobao.com ngx_set_misc module, v0.14.

ngx_set_misc is an nginx module that extends the standard ngx_rewrite module’s "set" directive to support various advanced functionalities like MD5, SHA1, json/mysql/postgresql string literal quoting, URI escaping/unescaping, default variable value assignment, upstream hashing based on a custom key, base32 encoding/decoding, and more 🙂
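For readers who have not met these operations before, here is a rough Python sketch of what a few of them compute, using only the Python standard library. These are analogues, not the module’s implementation: the function names are invented, and the module’s exact escaping rules and base32 alphabet may differ from Python’s, so do not expect byte-for-byte identical output in every case.

```python
import base64
import hashlib
from urllib.parse import quote, unquote


def md5_hex(s: str) -> str:
    """Hex MD5 digest of a string."""
    return hashlib.md5(s.encode("utf-8")).hexdigest()


def sha1_hex(s: str) -> str:
    """Hex SHA-1 digest of a string."""
    return hashlib.sha1(s.encode("utf-8")).hexdigest()


def escape_uri(s: str) -> str:
    """Percent-encode a string for use as a URI component."""
    return quote(s, safe="")


def unescape_uri(s: str) -> str:
    """Reverse of escape_uri."""
    return unquote(s)


def base32_encode(s: str) -> str:
    """Base32 with the standard RFC 4648 alphabet (an assumption here)."""
    return base64.b32encode(s.encode("utf-8")).decode("ascii")


def if_empty(value: str, default: str) -> str:
    """Default value assignment: fall back when the value is empty."""
    return value if value else default


print(md5_hex("hello"))
print(escape_uri("a b&c"))
print(base32_encode("hello"))
print(if_empty("", "fallback"))
```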

Please see the project homepage for more details:

    http://github.com/agentzh/set-misc-nginx-module

And the release tarball can be downloaded from

    http://github.com/agentzh/set-misc-nginx-module/tarball/v0.14

Various (funny) use cases can be found in my "nginx.conf scripting" talk’s slides:

    http://agentzh.org/misc/slides/nginx-conf-scripting/  (use the arrow keys on your keyboard to switch pages)

I must thank my colleagues shrimp and calio for their work on polishing this module in the last few months.

This module would not have been possible if Marcus Clyne had not published his crazy Nginx Development Kit (NDK) project:

    http://github.com/simpl-it/ngx_devel_kit

And it’s a prerequisite for this module 🙂

I know that this module has a really terrible name, but it’s been there for months already 😛

We’ve been using it extensively in our products at Taobao.com. And Qunar.com is also using it heavily in their production environment.

Enjoy!
