BlogBlogs

June 26th, 2007

In the beginning, BlogBlogs was my way to learn Ruby on Rails, but it quickly became a real website and, a few months later, a brand new Web 2.0 startup company. It is a blog indexing, searching and ranking website with an embedded social network of bloggers and readers for the Brazilian (and Portuguese-speaking) blogosphere.

The site was launched in June 2006 as a prototype, but it was in February 2007, with the release of a new version, that it entered the mainstream. It jumped from a few thousand visits per day to around 150,000 visits per day as of today. In recent months it has ranked in the Top 500 of the Technorati ranking and the Top 10,000 of the Alexa.com ranking, and it is in the Top 250 in traffic in Brazil and the Top 500 in Portugal.

It is a self-funded venture sustained by AdSense ads, and it is 100% developed in Rails.

BlogBlogs also provides several tools for bloggers to enhance their sites, such as the Last Visitors Widget shown on the sidebar at the right of this page. It shows the last BlogBlogs users who visited this site, and it is very popular among Brazilian bloggers.

Hi, I just released a new version of the TextMate PoCSw Bundle. You can find it in the projects area, and it has lots of enhancements over the first version. Now you can open a file from a URL selected in the text, from the clipboard or from user input, and error handling has been improved in several places.

Special thanks to Brett Terpstra from Circle Six Design for his suggestions, code and feedback. Thanks Brett!!!

If you use TextMate you probably love it, but you may find yourself looking for some missing functionality. Today I was trying to solve a support request from my blog indexing site, BlogBlogs, and I had to check a blog feed from the web. I like to read code in TextMate, but it doesn't have an "open file from URL" option in the File menu. Of course I could download the feed using the browser or wget, but I wanted something fast and handy for that task inside TextMate.

A few days ago I started studying how to create bundles for TextMate in order to develop a Capistrano bundle, but that seems to be a little more complex than I had planned. So I decided to write an easier bundle first and create an Open File from URL Bundle as my first publicly available TextMate bundle.

I'm going to pack all my bundles under the PoCSw Bundles name. You can download the first version, which includes the TextMate Open File from URL Bundle.

Download the PoCSw TextMate Bundle!

It is pretty simple to install and use. To install, unzip the file and double-click the bundle icon; it will be installed in your TextMate. To use it, open a file and type the bundle shortcut, or go to the PoCSw Bundles item in the Bundles menu and choose the Open File from URL option. A prompt will ask for the URL. Type the URL and click the OK button, and the content of the URL will be inserted into the current file.
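For readers curious about what a command like this has to do, here is a minimal sketch in Ruby. It is not the bundle's actual code: normalize_url and fetch_body are hypothetical names, and adding "http://" to a URL typed without a scheme is my own assumption about what is convenient. Only the URL handling is exercised here; fetch_body is defined but not called, since it needs network access.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: adds "http://" when the typed text has no scheme,
# then parses the result into a URI object.
def normalize_url(text)
  text = text.strip
  text = "http://#{text}" unless text =~ %r{\A[a-z][a-z0-9+.-]*://}i
  URI.parse(text)
end

# Hypothetical helper: downloads the body of the URL. A TextMate command
# could print this so the editor inserts it into the current document.
def fetch_body(text)
  Net::HTTP.get(normalize_url(text))
end

puts normalize_url("www.blogblogs.com.br/weblog/feed")
```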

If you have any questions, just leave a comment here. I may take some time to respond, but you can be sure I will.

Everybody knows that Rails provides three types of caching mechanisms: page cache, action cache and fragment cache. However, by default, Rails serves cached content as text/html, and that may be an issue for those who want to cache different types of content, such as standard HTML for web pages and XML for feeds or REST responses.

I faced that issue in the last few weeks while trying to improve the performance of my website, BlogBlogs (a blog indexing and search site for Brazilian and Portuguese-speaking blogs; please visit). The problem was to speed up the feed of the Official BlogBlogs Weblog and also the responses of the BlogBlogs public REST API. In both situations I needed to serve cached XML with the proper content type (you know, builder and XML handling are really slow in Rails, and on most other platforms too).

Choosing the right type of cache

The first step was to choose between page, action or fragment cache.

Page cache is the fastest but least scalable Rails caching solution. Rails never touches the cached content, so to serve the right content type I would have to use one of the workarounds available with routes and rewrites (which depend on the web server). So I dismissed page cache in favor of a 100% Rails-based solution.

Fragment cache was not an option for builder templates (.rxml). As far as I know, it can only be used in HTML templates (.rhtml). So fragment cache was also dismissed, since I need to generate XML output.

Action cache was the last option, and it worked for me. With action cache I was able to cache content generated by an XML template (.rxml) and also to control the cache handling in order to achieve the desired result: cached XML content served with the proper content type.

The problem

The first step is to have the action that generates the XML cached, so the controller will look like this:

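class WeblogController < ApplicationController

  # Cache the output of the feed action
  caches_action :feed

  # Loads the items rendered by feed.rxml
  def feed
    @items = Item.find :all
  end

end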

That makes WeblogController#feed cached. The first time the action is called, Rails generates the content using the template feed.rxml, delivers it to the browser with the proper content type and stores a copy in your cache directory. Usually (using the standard Rails cache storage) the cache file will be something like this:

/webapp/tmp/cache/0.0.0.0.3000/weblog/feed.cache

The next time the action is called, Rails checks for the cached fragment, loads it and delivers it to the browser, but with the content type text/html, and that's exactly the behavior we want to avoid. We need the cached content to be delivered to the browser as text/xml or application/xml (or something like that).
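The behavior can be reproduced outside Rails with a toy model. The sketch below is plain Ruby, not Rails code: ToyActionCache and its fetch method are hypothetical names, and the only assumption is that the cache stores the rendered body and nothing else.

```ruby
# Toy model (not Rails code): a cache that stores only the rendered body,
# so the content type of the original response is lost on a cache hit.
class ToyActionCache
  DEFAULT_TYPE = "text/html"

  def initialize
    @store = {}
  end

  # Returns [content_type, body]. The block stands in for the action and
  # its template; it is only called on a cache miss.
  def fetch(name)
    if @store.key?(name)
      [DEFAULT_TYPE, @store[name]]   # hit: body only, default type
    else
      content_type, body = yield
      @store[name] = body            # the content type is not saved
      [content_type, body]
    end
  end
end

cache  = ToyActionCache.new
render = lambda { ["text/xml", "<rss></rss>"] }

first  = cache.fetch("0.0.0.0.3000/weblog/feed", &render)
second = cache.fetch("0.0.0.0.3000/weblog/feed", &render)

puts first.inspect
puts second.inspect
```

The first call reports text/xml because the block ran; the second one comes from the store and falls back to text/html.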

Understanding action cache

Now I'll try to explain how the action cache works, with my limited knowledge of Rails internals (I have not studied the code, so I may get some things wrong). Once we define that the feed action is to be cached, Rails mixes the necessary cache code into the controller. This cache code works as a filter, executed around the action code. It checks for the fragment, either creating it (based on the action code) or loading an existing fragment and delivering it to the browser. The glitch is that existing fragments are delivered with the text/html content type, which may be incorrect in some situations.

That's all we need to know about action cache in order to understand the solution. But we also need to know a little bit about filters. Rails filters are executed in the order they are declared. One characteristic of action cache is that some filters can run before the actual cache code. This allows us to deal with things like authentication, and it is also a key aspect of our solution. For a filter to run before the cache code of a cached action, the filter declaration needs to be placed before the caches_action declaration.
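To make the ordering rule concrete, here is a toy filter chain in plain Ruby. It is not how Rails implements filters; ToyFilterChain is a hypothetical class that assumes only the two behaviors described above: filters run in declaration order, and a filter that returns false stops everything after it.

```ruby
# Toy model (not Rails code) of a filter chain: filters run in the order
# they were declared, and a filter that returns false stops the chain.
class ToyFilterChain
  def initialize
    @filters = []
  end

  def add(name, &block)
    @filters << [name, block]
    self
  end

  # Runs the filters in declaration order and returns the names that ran.
  def run
    ran = []
    @filters.each do |name, block|
      ran << name
      break if block.call == false
    end
    ran
  end
end

chain = ToyFilterChain.new
chain.add(:authenticate) { true }
chain.add(:cache_as_xml) { false }   # halts: the response is already sent
chain.add(:action_cache) { true }    # never reached in this run

puts chain.run.inspect
```

Because :cache_as_xml is declared before :action_cache, it gets the chance to answer first, and the cache code never runs when it returns false.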

The solution

The solution is pretty simple. We'll create a filter that checks for the action cache fragment; if the fragment exists, the filter loads it, sets the header for the proper content type and delivers the fragment content. That's it: we prevent the Rails cache code from running when the fragment exists and run our own code, which does the right thing. Now our controller looks like this:

class WeblogController < ApplicationController

  # Our filter call (declared before caches_action so it runs first)...
  before_filter :cache_as_xml

  # Rails action cache call...
  caches_action :feed

  # Our action
  def feed
    @items = Item.find :all
  end

  private

  # Our filter code...
  def cache_as_xml

    # Building the fragment name: the host (with ":" replaced by ".") plus
    # the request URI, without a trailing slash.
    # I do not know a better way to do that. Please let me know if you know.
    fragment_name = request.env["HTTP_HOST"].gsub(":", ".") + request.env["REQUEST_URI"]
    fragment_name = fragment_name.chomp("/")

    # Reading the fragment cache (nil if it does not exist)...
    fragment = read_fragment(fragment_name)

    unless fragment.nil?
      # Setting the Content-Type header to the proper content type (in this case text/xml, but it can be anything)...
      @headers["Content-Type"] = "text/xml"

      # Delivers the content
      render :text => fragment

      # Halt any further processing (do not forget this)
      return false
    end

  end

end

Now, when you call the action and there is cached content, the fragment will be delivered to the browser using the content type defined in the filter.
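The fragment name is the part of the filter most likely to go wrong, so it helps to pull it out and try it on its own. The sketch below is plain Ruby; fragment_name_for is a hypothetical helper name, and it assumes the request URI carries no query string.

```ruby
# Builds the fragment name from the Host header and the request URI:
# ":" in the host becomes "." and one trailing "/" is dropped.
# Assumption: the request URI has no query string.
def fragment_name_for(host, request_uri)
  (host.gsub(":", ".") + request_uri).chomp("/")
end

puts fragment_name_for("0.0.0.0:3000", "/weblog/feed")
puts fragment_name_for("www.blogblogs.com.br", "/weblog/feed/")
```

The first result matches the cache file path shown earlier, minus the .cache extension.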

That's it.

Is it worth it?

Oh yeah... the following data is not a real benchmark, but it shows how much the cache can speed up your application. These are the Rails log lines for two consecutive calls to the cached action. In the first one the cache was empty, so the action code was executed to render the output (around 10 seconds). The second is the same request served from the cache (around 0.001 second). That's roughly 10,000 times faster!!!

Completed in 10.24998 (0 reqs/sec) | Rendering: 10.18748 (99%) | DB: 0.00025 (0%) | 200 OK [http://www.blogblogs.com.br/weblog/feed]
Completed in 0.00099 (1009 reqs/sec) | Rendering: 0.00006 (5%) | DB: 0.00000 (0%) | 200 OK [http://www.blogblogs.com.br/weblog/feed]
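The ratio can be checked straight from the two log lines. This is a small plain Ruby sketch; completed_seconds is a hypothetical helper, and the log lines are shortened copies of the ones above.

```ruby
# Extracts the total time (in seconds) from a "Completed in ..." log line,
# or returns nil when the line has no such timing.
def completed_seconds(log_line)
  match = log_line.match(/Completed in (\d+\.\d+)/)
  match && match[1].to_f
end

uncached = "Completed in 10.24998 (0 reqs/sec) | Rendering: 10.18748 (99%)"
cached   = "Completed in 0.00099 (1009 reqs/sec) | Rendering: 0.00006 (5%)"

speedup = completed_seconds(uncached) / completed_seconds(cached)
puts speedup.round
```

The result comes out a little above 10,000, in line with the rough figure quoted.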

Now, let's expire the cache

There are two ways to expire the cache. The hardcore way is to delete the cached files: you can prepare a shell script that clears the files and then schedule it as a cron job.
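The same idea works as a small Ruby script instead of a shell script. This is only a sketch: clear_cache_files is a hypothetical name, and the demonstration runs against a throwaway temporary directory so nothing real is deleted.

```ruby
require "fileutils"
require "tmpdir"

# Deletes every *.cache file below cache_root and returns the deleted paths.
def clear_cache_files(cache_root)
  files = Dir.glob(File.join(cache_root, "**", "*.cache"))
  FileUtils.rm_f(files)
  files
end

# Demonstration on a temporary directory that mimics the cache layout.
Dir.mktmpdir do |root|
  dir = File.join(root, "www.blogblogs.com.br", "weblog")
  FileUtils.mkdir_p(dir)
  File.write(File.join(dir, "feed.cache"), "<rss></rss>")

  removed = clear_cache_files(root)
  puts "removed #{removed.length} file(s)"
end
```

For the cron job you would call clear_cache_files with the application's cache directory (/webapp/tmp/cache in the examples here).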

But if you want better control over cache expiration, you should use the Rails solutions for that. You can explicitly expire a cache fragment with the expire_fragment() method. However, be aware that Rails creates the cache fragments based on the request URL, so the same fragment can exist under two different paths, such as /webapp/tmp/cache/blogblogs.com.br/weblog/feed.cache and /webapp/tmp/cache/www.blogblogs.com.br/weblog/feed.cache. That said, you must clear both possibilities in your code (inside a sweeper or some other place in your app). You don't need to add the .cache extension when calling expire_fragment(). Your code should look like this:

expire_fragment("blogblogs.com.br/weblog/feed")
expire_fragment("www.blogblogs.com.br/weblog/feed")
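If the list of host names grows, repeating the call by hand gets error-prone. One possible way to keep it in one place is sketched below in plain Ruby; fragment_names and HOSTS are hypothetical names, and the actual expire_fragment call is shown only as a comment because it needs a Rails controller or sweeper.

```ruby
# Hypothetical list of every host name the site answers to.
HOSTS = ["blogblogs.com.br", "www.blogblogs.com.br"]

# Returns one fragment name per host for the same path.
def fragment_names(hosts, path)
  hosts.map { |host| host + path }
end

puts fragment_names(HOSTS, "/weblog/feed")

# Inside a controller or sweeper each name would be passed on, e.g.:
#   fragment_names(HOSTS, "/weblog/feed").each { |name| expire_fragment(name) }
```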

To infinity and beyond...

Let's think ahead: Rails could add this kind of functionality to its caching mechanisms. I didn't try to patch Rails (I'm not confident enough to do that), but I was thinking of something like this:

  • Extend caches_action with some options, like this:

    caches_action :action, :content_type => 'text/xml'
    
  • Then the cache code will do what we did in our cache_as_xml filter.
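To show what the idea amounts to, here is a toy model in plain Ruby. It is not a Rails patch, and the :content_type option above does not exist in Rails as far as I know; TypedToyCache is a hypothetical class that assumes only that the cache keeps the content type next to the body.

```ruby
# Toy model (not Rails code) of the idea: the cache keeps the content type
# next to the body, so a hit is served with the type it was declared with.
class TypedToyCache
  Entry = Struct.new(:content_type, :body)

  def initialize
    @store = {}
  end

  # The block renders the body and is only called on a cache miss.
  def fetch(name, content_type)
    @store[name] ||= Entry.new(content_type, yield)
  end
end

cache = TypedToyCache.new
renders = 0

2.times do
  entry = cache.fetch("www.blogblogs.com.br/weblog/feed", "text/xml") do
    renders += 1
    "<rss></rss>"
  end
  puts "#{entry.content_type}: #{entry.body}"
end

puts "rendered #{renders} time(s)"
```

Both requests report text/xml, and the body is rendered only once.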

I believe this kind of thing could easily be added to Rails and would solve this issue in an elegant way. I'll forward this post to David Heinemeier Hansson; maybe this can be added to Rails 1.2.

More stuff and references ...

Please add your comments here and let me know if that helped you somehow or if you found other ways to do that. ;-)

About

May 20th, 2007

Piece of Cake Software was born from one of those crazy daydreams in the late hours of the night while I was coding BlogBlogs. I thought, "every website has a powered by or developed by line, and BlogBlogs needs one too". Then I wondered, "what to write? Developed by Manoel Lemos?? argh... Powered by Lemos.Net?? argh... I have done that so many times". Then, suddenly, I had the following thought: "How easy it is to develop software with Ruby on Rails..." Eureka!!! That's it: it's easy, it's fast, it's a piece of cake... and so Piece of Cake Software was born.

Beyond the fact that the expression Piece of Cake has everything to do with Rails, the name also reflects what I believe about software development. Development must be a joyful, rewarding and easy activity. Software must be easy and intuitive. That is the Piece of Cake Software philosophy, the only quarter-of-a-man company in the world (so far). There are plenty of one-man companies, but a quarter-of-a-man company, only Piece of Cake Software. We are open from midnight to 6 AM, a quarter of a day.