Thursday, January 28, 2010

A Case for Redundant Arrays of Inexpensive Disks (RAID)

by David A. Patterson, Garth Gibson, Randy H. Katz

Important Points --
* The paper addresses the innovation needed to speed up I/O-intensive applications.
* The fundamentals introduced by the paper are still applicable, and later improvements like RAID-DP (RAID level 6) build on top of the ideas presented here.
* A new paradigm is presented to improve the performance, reliability, power consumption, and scalability of disk usage.
* The metrics are appropriately defined for supercomputers and transaction processing systems, depending on their access patterns.
* The paper is clear: it iteratively explains the advantage of each level, the problem with that level, and how the next level addresses that particular problem.
* The quantitative and qualitative arguments in the paper about the advantages of the taxonomy of five different disk array organizations are spot on.
--
Deficiencies--
* For level 3 RAID, the paper does not describe the ECC code used, and thus how the ECC is calculated using only one check disk per group. Though a reference ([Park86]) has been provided, it would have been good to have a small description in the paper as well. This made it harder to determine why level 3 RAID is better suited to supercomputing applications than to small transaction processing systems. (A small parity sketch of my own follows this list.)
* The paper could have talked more about how the ideas could be implemented in software instead of hardware.
--
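
To make the "one check disk per group" idea concrete, here is a small sketch of my own (not from the paper) showing how XOR parity over a group of data disks can reconstruct any single failed disk; the disk count and block contents are arbitrary:

```python
from functools import reduce

# Four hypothetical data "disks", each holding one stripe of bytes.
data_disks = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b", b"\xff\x00"]

def xor_blocks(blocks):
    """Byte-wise XOR across equal-length blocks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# The single check disk stores the XOR parity of all data disks.
parity_disk = xor_blocks(data_disks)

# If any one disk fails, XOR-ing the survivors with the parity recovers it.
failed = 2
survivors = [d for i, d in enumerate(data_disks) if i != failed]
assert xor_blocks(survivors + [parity_disk]) == data_disks[failed]
```

Level 3 relies on the disk controller to identify which disk has failed, which is why a single parity disk suffices.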
Conclusion--
The paper presents the advantages of using inexpensive PC disks and of five different levels of organizing these arrays. It effectively describes the advantages of the different levels in terms of important quantitative parameters, and the kinds of applications for which each level is suited. I think the paper marked a major change in data storage paradigms, applicable across the computing industry, whose fundamentals are still applicable nowadays.

Eraser: A Dynamic Data Race Detector for Multi-Threaded Programs

by Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, Thomas Anderson

Important Points --
* Provides a tool that detects race conditions more simply, more efficiently, and more thoroughly than previous tools.
* The tool is based on checking that a locking discipline is satisfied, the synchronization style most commonly used in multi-threaded programs of the time.
* It even catches races on dynamically allocated data, which was not easily possible with prior dynamic race detection tools.
* It handles cases that could have caused false alarms in a naive implementation, like memory reuse, private locks, and benign races. (A sketch of the lockset refinement follows this list.)
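
To make the lockset idea concrete, here is a small sketch of my own (a simplification; names are illustrative, not Eraser's actual implementation) of the candidate-set refinement performed on each shared-variable access:

```python
# C(v): for each shared variable, the set of locks held on every access so far.
candidate_locks = {}

def on_access(var, locks_held_by_thread):
    """Refine C(var) by intersecting it with the locks the accessing thread holds."""
    if var not in candidate_locks:
        candidate_locks[var] = set(locks_held_by_thread)   # first access
    else:
        candidate_locks[var] &= set(locks_held_by_thread)
    if not candidate_locks[var]:
        print(f"warning: no lock consistently protects {var!r} -- potential race")

# Example: two threads access 'counter' under different locks.
on_access("counter", {"lock_a"})
on_access("counter", {"lock_b"})   # intersection becomes empty -> warning
```

Eraser's refinements for initialization, read-sharing, and read-write locks relax this basic check to avoid the false alarms mentioned above.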

---
Deficiencies --
* Since it is a testing tool, it cannot guarantee that programs are free from races. The ideal solution is still to be found.
* It cannot be used for systems that rely on other synchronization primitives.
* It slows down the performance of the system (the paper reports slowdowns of roughly a factor of 10 to 30).
* Its implementation is OS-specific; the paper does not provide a standard tool that could be used with different operating systems and different synchronization mechanisms.
--
Conclusion--
I think the paper does provide an effective way to test for dynamic race conditions in programs that follow the lock discipline, and the implementation tries to cover a lot of different use cases. However, the authors have not been able to provide a generic framework that could be used across different operating systems and different synchronization mechanisms.

Friday, January 22, 2010

Handy Distributed System Numbers

Handy Notes from Google's keynote at LADIS 2009

http://perspectives.mvdirona.com/2009/10/17/JeffDeanDesignLessonsAndAdviceFromBuildingLargeScaleDistributedSystems.aspx

Time, Clocks, and the Ordering of Events in a Distributed System.

by Leslie Lamport

My thoughts on the paper -

Important Points ---
* Provides an effective, simple approach, which seems easy to implement, that resolves a very important synchronization problem.

* They effectively present their approach as a pair of conditions that, if satisfied, provide a total ordering of system events. They provide an effective algorithm involving multiple processes that uses message acknowledgement protocols and message timestamps. (A small sketch of the clock rules follows this list.)

* Paves the way for future research in areas like failure handling, i.e., when a process fails to respond.
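
For reference, here is a small sketch of my own of the clock update rules (not the paper's notation): increment before each local event, timestamp outgoing messages, and advance past the received timestamp; (time, process id) pairs then break ties to give a total order:

```python
class LamportClock:
    def __init__(self, pid):
        self.pid = pid
        self.time = 0

    def tick(self):
        # Each process increments its clock between successive local events.
        self.time += 1
        return self.time

    def send(self):
        # Outgoing messages carry the sender's current clock as a timestamp.
        return self.tick()

    def receive(self, msg_time):
        # On receipt, advance the local clock past the message's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time

    def stamp(self):
        # (time, pid) pairs are totally ordered: compare times, break ties by pid.
        return (self.time, self.pid)
```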

Deficiencies ---
* They also argue that this problem applies to any multiprocess system, even if all the processes are on the same machine. I think there could be simpler approaches in that case; it was not clear from the paper what the synchronization problem is when all the processes share one machine (and hence one physical clock).

* The algorithm given to synchronize access to a single resource from multiple processes seems to have a lot of overhead, as for every resource request messages are sent to and received from all other processes.

Conclusion ---
This paved the way for a solution to a very important problem in distributed systems, and is still a commonly used paradigm in distributed systems. Nowadays, network time servers are used quite a lot for synchronization, which extends the same paradigm.

Using threads in interactive systems: A case study

by Carl Hauser, Christian Jacobi, Marvin Theimer, Brent Welch, Mark Weiser

My thoughts on the paper -

Important Points:
* Some threading paradigms described by the paper, like monitors, are easy to use, and thus are useful and convenient for the programmer.

* Some paradigms introduced are still widely used today, like sleepers. Sleepers contribute to programming convenience by allowing background operations like garbage collection to be separated from the primary tasks. Others, like slack processes, remain challenging.

* Serializers were an important paradigm, still commonly used today. I think serializers inspired the concept of thread pools: a fixed or variable number of threads created to pick up events from a queue and perform the tasks concurrently. Thread pools could be considered a new instance of the paradigm the paper calls concurrency exploiters, and are widely used for communication between two different server systems. (A small thread-pool sketch follows this list.)

* The paper effectively describes common mistakes made in thread programming, like the usage of WAIT wherein the predicate is not re-checked after the WAIT returns. (See the condition-variable sketch after this list.)

* It describes various new design issues that exist in threading paradigms, and tries to understand and rectify them.
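
To illustrate, here is a small thread-pool sketch of my own in the serializer spirit, using Python's standard queue and threading modules (not code from the paper):

```python
import queue
import threading

tasks = queue.Queue()

def worker():
    # Each worker pulls events off the shared queue and processes them.
    while True:
        item = tasks.get()
        if item is None:          # sentinel: shut this worker down
            break
        print(f"processed {item}")
        tasks.task_done()

# A fixed-size pool of threads draining one queue concurrently.
pool = [threading.Thread(target=worker) for _ in range(4)]
for t in pool:
    t.start()

for i in range(10):
    tasks.put(i)
tasks.join()                      # wait until all queued work is done

for _ in pool:
    tasks.put(None)               # stop the workers
for t in pool:
    t.join()
```

And here is a small sketch of the correct WAIT pattern with a condition variable, where the predicate is re-tested in a loop after every wakeup (my illustration in Python, not the paper's code):

```python
import threading

cond = threading.Condition()
items = []

def consumer():
    with cond:
        # Correct: re-test the predicate after every wakeup. Checking it
        # once with an `if` before a single wait() is the mistake the paper
        # warns about, since the condition may no longer hold on resumption.
        while not items:
            cond.wait()
        print(f"consumed {items.pop()}")

def producer():
    with cond:
        items.append("work")
        cond.notify()

t = threading.Thread(target=consumer)
t.start()
producer()
t.join()
```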

Deficiencies:
* Although the paper mentions that task rejuvenation is controversial, I would add that the fact that it masks design flaws makes it an unattractive approach. In typical systems there are bound to be many design flaws because of the complexity involved, and there is always a dearth of mechanisms and instrumentation to uncover them. So any approach that masks these flaws is not feasible at all.

* To me, the deadlock avoider paradigm seems to be a very simplistic way of trying to avoid deadlocks, and there does not seem to be any reasoning as to how it would ensure deadlocks are avoided. Though I am biased, because I am already aware of other mechanisms, like wait-for graph analysis, that may not have existed at that point in time.

* I think the paper falls short in providing feasible guidance for the usage of certain paradigms like slack processes. It describes all the challenges involved but fails to provide any solution on how slack processes could be used effectively.

Conclusions :
I think this paper is a useful point of reference for programmers to determine the practical advantages of a particular thread paradigm, the common mistakes made in thread programming, and the flaws existing in current thread paradigms, and also a point of reference for researchers doing future work on paradigms that still require improvement or are not serving their purpose well enough.

I think the most important lesson from the paper is that, due to the complexity of threading systems, new design issues are bound to arise; rather than building systems that try to mask these issues, systems should be built to accurately understand the root of these issues and rectify them.

Implementing Remote Procedure Call

by A. D. Birrell, B. J. Nelson

My thoughts on the above paper -

The paper was written a long time ago and discusses the thinking that went into why an RPC mechanism would be useful for distributed computing applications, and what the advantages of its use are. It also describes how RPC is similar to other mechanisms for communication across machines, like message passing, but notes its advantage in simplicity: it can be used by the same programmers who use procedure calls but have no distributed systems programming experience. It also describes the other goals kept in mind while designing the RPC mechanism, like efficiency, generality, and clean, simple semantics.

It describes in detail the modules involved, which still seem to be used in RPC libraries popular today like ONC RPC (wherein the rpcgen program is used to generate the stubs, similar to what Lupine does as described in the paper).
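
As a rough modern analogue of the user-stub/server-stub structure the paper describes, here is a small sketch using Python's built-in xmlrpc modules (my illustration; the transport and marshalling differ from Birrell and Nelson's design):

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):
    # The remote procedure: to its callers it looks like a local function.
    return a + b

# Server side: register the procedure and dispatch incoming calls
# (playing the role of the server-stub plus the runtime).
server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the proxy plays the user-stub role, marshalling arguments
# over the network and unmarshalling the result.
proxy = ServerProxy("http://localhost:8000")
print(proxy.add(2, 3))   # prints 5; reads like a local call
```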

Thursday, August 9, 2007

Blob

Main Entry: ¹blob
Pronunciation: 'bläb
Function: noun
Etymology: Middle English (Scots)
1 a : a small drop or lump of something viscid or thick b : a daub or spot of color
2 : something ill-defined or amorphous