Wednesday, September 29, 2010

Reading SGI STL source code

Reading source code is fun and very useful for programmer. SGI STL is the original STL implementation, and also it is considered to be a paradigm for C++ templateprogramming. By comparison with Linux source code, SGI STL is very small, and sould be very easy to understand.

However, when I start to do it, I really have not ideal where to start. Fortunately, I got a book that explains how to read its code. After a few hourse reading the book, I know some basic of SGI STL, and found its entry.

Although there are some comments on SGI source code, I thought it’s still a challenge for a computing science student if didn’t have help.

Friday, September 24, 2010

New ideas

In cmpt431, distributed system, professor asked us to discuss “new ideas” about “cloud computing”. Although I thought very hard, there were no NEW ideas appear. Also, others classmates’ ” new ideas” were not actually new.

Recently, when I review the hash table, I really though I got a new idea about collision. Traditionally, we use a linked list, ordered or unordered, to store collided elements. In worst case, its perfermance would degrade to O(n), it is very bad. What I thought is uses a balanced tree, such as red-black tree to store collided elements. In worst case, its perfermance would still keep in O(log(n)).

After I read “hast table” in wikipedia, I found that it is not a NEW idea. In wiki, there is detailed discussion on its advantage and disadvantage. However, I thought it might be a good sign because it means I’m on the right track of thinking about data structure.

Tuesday, September 21, 2010

RTFSC

[ Read The F***ing Source Code – allegedly from a very famous IT guy ]

When I wanted to find out how the “map”, which is an associative container, is implemented in C++ STL, things were not going well. I got some information from Internet that map is implemented by red-black tree, but none of wikipedia, Microsoft or www.cplusplus.com confirms it. They just said that “map” is a sorted associative array.

After I did some research on this topic, I confirmed that all informations are right. C++ STL did not define the implementation of “map”. Therefore, C++ compilers and C++ libraries could have their own implementations. In general, all balanced trees are satisified to implement “map”.

After going through the SGI STL source code (http://www.sgi.com/tech/stl/), one thing is sure: “map” in SGI STL is implemented by redblack tree.

Sunday, September 19, 2010

Confusing of “map” (data structure)

According to wikipedia, a map is an associative array, and “an associative array (also associative container, map, mapping, dictionary, finite map, and in query-processing an index or index file) is an abstract data type composed of a collection of unique keys and a collection of values, where each key is associated with one value (or set of values).” we could simply see a map as a colletion of <key, value > pairs. This definition does not define the detail of its implementation.

However, in C++ STL, map is a sorted associative array, and is implemented by red-black tree. Therefore, more precisely, in C++, a map is a container that implemented by red-black tree, and nodes are pairs.

In java, a map is an interface that “maps keys to values.” The abstractmap class implemented map interface. Under the “abstractmap” class, there are hashmap and treemap. Obviously, a hashmap is implemented by hash table, and a treemap is implemented by red-black tree.

Therefore, when we talk about “map”, we have to know the context. However, saying “a hash table” is a map is always not right.

Saturday, September 18, 2010

Hash table (1)

Half year ago, I got a co-op opportunity in SAP. Once our group, 6 software developers, had lunch together, and the lead talk about a new grad he interviewed that morning, and said that girl did not answer well a question about hash table. Then, he asked me the same question: what is a hash table? I said: simply speaking, hash table is an array, plus listed list. Supprisely, our lead said: no, hash table is a map, and suggest me to get a newer data structure text book about.

After that, I did search new text books and online resource for hash table. And the result is still a little bit confusing. When developers use hash talbe containers, specally in perl, it really looks like a map, which contains [key, value] pairs. However, how is it implemented in low level? It’s still an array + linked list + hash function.

Hash table is not a standard contain in C++ STL now, but gcc and VC++ have already implemented it, and it will be in next version of STL standard (C++ TR1).

To make things more complicated, now people call it different name in different language. In STL, it has been call hash_map or unordered_map. In C#, Microsoft calls it “dicationary” container.

For most software developers, maybe they really don’t need to know the low-level different between hash table and map. But, if developers readlly want to make their program elegant and efficient, know more about the implementation is much helpful.

Thursday, September 16, 2010

Resolution of this semester

It is a restart for my blog. I planed to write blog routinely for recording some interesting things and improving my writing long time ago, but can not insist on it.

Now, it is an assignment for CMPT376W, and I hope that I will be accustomed to write blog regularly.

Except classes I take this semester, I will spend more time on find a job because this term is my final term in SFU, and I alread applied for graduation.

Therefore, my blog will focus on data structure and algorithm, the core part to find a job in computing science field. Plus, I hope I could write 3 blogs per week.