Thursday, January 26, 2006

OldPosts: RDBMS - a 'sql' server of my own

RDBMS: Relational Database Management System.
Actually, this idea came to me last summer while I was thinking about what project to do in the compilers course. It was originally about parsing SQL, but it grew into a full idea for a DBMS.
I liked the idea. Over time, ideas about what to implement and how to implement it started to accumulate, until they had to burst out. Knowing how to do something, and wanting to do it, is a powerful force. So I started seriously thinking about implementing it this vacation. I started by reading a book called "Database management systems, 2nd edition", a great introduction which gave me a solid background in the basics of database theory and in relational algebra and calculus ( search for them on http://wikipedia.com ).
There are different kinds of databases; I will mention the names only, without diving into details. Older systems include navigational databases. Then, around the 1970s, came the well-known and commonly used relational databases. Now there are object databases, which support OOP. This is NOT the whole list, it is just what I could remember at the moment.
What really made me think seriously about making one myself is an article I read on Mohamed Meshref's Blog about a student who built a server which can, under some circumstances, execute some queries faster than Oracle's.
With all that information in my head, I opened Visual C++, created an empty header file, and gazed at the screen for about 10 minutes trying to figure out how to start. I knew from experience that all systems start small, so I did that. I made a small program that saves an array of structs in a file. Then I focused my "little grey cells", as Agatha Christie's Poirot would say, on how to generalize that to any user input on the same struct ( username, pk, and age ). I figured that needs a SQL parser, so I left that part and went in the other direction: how to generalize the struct contents first? I found a great discussion of that here http://groups.google.com/group/comp.lang.c/browse_thread/thread/527b451b96ce957a/d612253f370e6aab?q=variable+length&rnum=3#d612253f370e6aab . So I started designing the data structures required.
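To make that first small step concrete, here is a minimal sketch of saving an array of structs to a file and reading it back. It is not the original program: the field names come from the post ( username, pk, age ), but the field sizes, the function names and the raw-bytes file layout are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical fixed-size record. Field names follow the post;
// the sizes (32-byte name, 32-bit integers) are assumptions.
struct UserRecord {
    char username[32];
    std::int32_t pk;
    std::int32_t age;
};

// Build a record; zero-initialization keeps unused bytes deterministic
// and guarantees the username stays null-terminated.
UserRecord make_user(const char* name, std::int32_t pk, std::int32_t age) {
    assert(name != nullptr);
    UserRecord r{};
    std::strncpy(r.username, name, sizeof(r.username) - 1);
    r.pk = pk;
    r.age = age;
    return r;
}

// Write all records back to back as raw bytes, replacing any old file.
bool save_records(const std::string& path, const std::vector<UserRecord>& rows) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(rows.data()),
              static_cast<std::streamsize>(rows.size() * sizeof(UserRecord)));
    return static_cast<bool>(out);
}

// Read fixed-size records until the file runs out.
std::vector<UserRecord> load_records(const std::string& path) {
    std::vector<UserRecord> rows;
    std::ifstream in(path, std::ios::binary);
    UserRecord r;
    while (in.read(reinterpret_cast<char*>(&r), sizeof(r))) {
        rows.push_back(r);
    }
    return rows;
}
```

Because every record has the same size, record number i sits at byte offset i * sizeof(UserRecord). That is what makes this layout easy to start with, and also what makes it hard to generalize to arbitrary columns.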
I forgot to mention that, by coincidence, I read in "Introduction to Algorithms" about B-Trees and how they are used in DBMSs to index tables in order to minimize disk I/O. Imagine: they can cut the disk I/O of a lookup in a billion-record table from about 30 reads ( using binary search ) to only 3 reads ( of course this is traded off against up to 3000 in-memory comparisons, if each node holds about 1000 keys and is scanned linearly ).
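The arithmetic behind those numbers can be checked with a small sketch. It assumes that every disk read narrows the search by a fixed factor: 2 for binary search, and about 1000 for a B-Tree node ( the node size is an assumption, not a figure from a real system ). The function name is made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Smallest number of levels h such that fanout^h >= n, i.e. how many
// reads a search needs when each read narrows the range by `fanout`.
int levels_needed(std::uint64_t n, std::uint64_t fanout) {
    assert(fanout >= 2);
    int levels = 0;
    std::uint64_t reach = 1;  // how many records `levels` reads can tell apart
    while (reach < n) {
        reach *= fanout;
        ++levels;
    }
    return levels;
}
```

For a billion records, levels_needed(1000000000, 2) gives 30 and levels_needed(1000000000, 1000) gives 3. Scanning up to 1000 keys in each of the 3 nodes is where the roughly 3000 comparisons come from; a binary search inside each node would bring that down to about 10 per node.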
The second step after designing the data structures is defining the way of altering a table's contents. After that, I will build a way to handle 'parsed' queries and map them to these APIs. Parsing SQL is not an issue: some hours with Yacc/flex, and BOOM, we have a parser.
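As a rough illustration of what mapping 'parsed' queries to table APIs could look like, here is a very small sketch. Everything in it is hypothetical: the ParsedQuery shape, the Table class and the execute function are invented for this example, and a real parser for SQL-92 would produce a much richer structure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical output of the parser: the kind of statement and the
// values it carried.
enum class QueryKind { Insert, SelectAll };

struct ParsedQuery {
    QueryKind kind;
    std::vector<std::string> values;  // column values for Insert, empty otherwise
};

using Row = std::vector<std::string>;

// Hypothetical table API: rows are kept in memory as vectors of strings.
class Table {
public:
    explicit Table(std::size_t column_count) : column_count_(column_count) {}

    void insert_row(const Row& row) {
        assert(row.size() == column_count_);  // one value per column
        rows_.push_back(row);
    }

    const std::vector<Row>& all_rows() const { return rows_; }

private:
    std::size_t column_count_;
    std::vector<Row> rows_;
};

// Map a parsed query onto the table API and return the rows it
// inserted or selected.
std::vector<Row> execute(Table& table, const ParsedQuery& query) {
    std::vector<Row> result;
    switch (query.kind) {
        case QueryKind::Insert:
            table.insert_row(query.values);
            result.push_back(query.values);
            break;
        case QueryKind::SelectAll:
            result = table.all_rows();
            break;
    }
    return result;
}
```

The point of the split is that the parser only has to produce ParsedQuery values, and the storage side only has to expose a handful of table operations, so each half can be built and tested on its own.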
The only remaining part is defining what "exactly" is required from a DBMS. The basic implementation/engine is defined by relational calculus. The other parts will just comply with an SQL standard; I have chosen SQL-92.
As usual, I feel this article is too short. Maybe I will follow up sometime with a part 2 continuing with the in-depth details of the data structures and how I 'will' -isA- handle the data integrity checks, table altering, and the indexing problems. B-Trees are not easy to implement, you know!
I'll just go complete the data-structure hierarchy now, wish me luck!

Wednesday, January 4, 2006

OldPosts: A new programming language

I've recently discovered a new programming language called Ruby ( in Arabic it means ياقوت ) http://ruby-lang.org. It has a very easy and intuitive syntax. As it follows the POLS ( Principle of Least Surprise ), code usually does what you expect it to do, so it works correctly most of the time and you rarely need to debug.

A RAD ( Rapid Application Development ) web application framework for it, called Ruby on Rails ( ROR ), is comparable to Apache Struts for Java. Using ROR you can be highly productive: a complete weblog system, like this one ( MSN weblog ), can be built entirely from scratch, including testing and an administrative interface, in 15 minutes. I did not believe this at first, until I downloaded the movie at http://rubyonrails.org/screencasts .

ROR, just like Struts, employs the MVC ( Model-View-Controller ) pattern for its web applications. One of its powerful features is that it automates ORM ( object-relational mapping ) for you. For example, after you create the database and generate your web application ( not by typing anything, just by one click ), you'll find that the web pages which display the content and the pages which modify it are ready. If you add a column to a table in the database and just refresh the page in your browser, you'll find that a new text area ( or the control corresponding to the new column's data type ) has been added and is fully working in both the view and the edit pages!

All of the typing you will ever do, if you want, is renaming the page title or reordering how the data is displayed. Validations are declarative, not imperative. You just say something like "validates_presence_of :field_name". The same goes for relations like one-to-many: you only say "has_many :object_names". After that you will never have to write any SQL query, well, any SQL at all.

In the sample weblog, the presenter only wrote 58 lines of code, with 45 of them auto-generated. That reminded me of a system I worked on before. It was a J2EE web application which handled registration for a credit-hours college system. It took 3 months, 37 classes, and more than 2300 lines of code. I had to hand it over with some known bugs because the college term started and I had no time to fix them.

After I saw that ROR sample, I estimated that all the work could be done in 30 minutes. This seems unbelievable, I know. Although I didn't actually rewrite it using ROR, we can count the estimated time. I will not include the time for the database because it is already made. Say 1 minute to "click" to generate the solution, 10 minutes to navigate through the generated solution and change the "order" and "place" of the data, and another 10 minutes to correct the "validations" and "test" them. This is a total of 10 + 10 + 1 = 21 minutes. The other 9 minutes are for "yawning" and grabbing something to eat or a hot drink!!

Now let's compare that to how it was done using J2EE ( not Struts ). First, create a list of the actions in the system ( 2 days ). Write the model, including a loooot of SQL statements and interfacing Java Date with SQL Date ( 3 weeks ). Implement the actions ( 2 weeks ). Implement the view, HTML/JSP pages ( 1 week ). Test ( 2 weeks, there is a lot of data to test ) [Note: ROR can generate tests for you too]. That is a total of 3 + 2 + 1 + 2 = 8 weeks = 2 months ( and 2 days ). The other month was for learning Java, JSP, and MVC. To be fair, I think using Struts it would take less than a week, excluding the time needed to learn Struts.

In ROR I didn't have to learn MVC or JSP, and Ruby is so intuitive that you can learn it in 15 minutes. See http://tryruby.hobix.com/ for a 15-minute tutorial. Conclusion: the system can be made in 45 minutes, INCLUDING learning the language and ROR! ROR is so simple that it can be learnt from a few examples, without even a tutorial.

I would like to write more about the origin of the language and its history, with a detailed list of programming languages' popularity and Ruby's place in that list, but I am really busy with the exams, so I think I will add that later. Thanks for reading.