Hardcore C programmers

djberg96 on 2002-11-07T14:52:42

I was trying to figure out the best way to implement a form of split() in pure C last night, so I wandered in the #C chatroom on IRC.

I ended up chatting with a guy named Philip, who has apparently been coding in C for 20 years, has a vast library here, thinks C is the easiest language around, and has general disdain for anything that isn't compiled into bytecode. He especially didn't like Perl.

One of the reasons he didn't like Perl is the confusion he felt it created with regards to the terms "hash" and "associative array". His claim was that they're really two different things, although I could never get him to explain what exactly the difference was.

So, one of my goals for the week is to find out exactly what the difference is. :)


the difference...

gav on 2002-11-07T15:28:20

The difference is that they are exactly the same.

associative array

<programming> An array where the indices are not just integers but may be arbitrary strings.

hash

3. The preferred term for a Perl associative array.

-- FOLDOC

Hash vs AA

jdporter on 2002-11-07T16:40:44

They really are different. If nothing else, the two terms are meaningful at different levels of abstraction.

An associative array is a source language level concept. It means an array which is indexed by arbitrary, unordered values, typically (but not necessarily) strings.

A hash table is a data structure in which keys are crunched through a hashing function in order to get the actual index used in the table. This is to handle both the unordered-ness and the sparseness of keys.

Associative arrays need not be implemented with hash tables, and hash tables need not be manifest in the source language (e.g. perl) as associative arrays. This is just how Perl happens to do it.

It is true that "hash" is therefore not an entirely accurate name for the perl variable type, but it's a lot shorter. Also, if you squint, "%" looks rather like "a/a" (as well as like "H").

Re:Hash vs AA

nicholas on 2002-11-07T17:04:28

An associative array is a source language level concept. It means an array which is indexed by arbitrary, unordered values, typically (but not necessarily) strings.
A hash table is a data structure in which keys are crunched through a hashing function in order to get the actual index used in the table. This is to handle both the unordered-ness and the sparseness of keys.

Correct me if I'm wrong, but to add to this I believe important distinction that a hash value is a number created by a function which effectively acts as a signature for the object (eg string). And hash values may collide - ie two or more different objects may happen to generate the same hash value for some hashing function. If implemented a store by generating 8 bit hash values for strings and used that 8 bit hash value as the index in an array, then you're likely to start finding that things collide (ie two different strings want to be in the same array position). Notice that a hashing function is a one way trip - given an object you can calculate the hash value, but given the has value, you can't get an object back

Whereas an associative array is an abstract concept, which doesn't tell you how it's implemented, merely that it will fulfil some properties. One of which is that you won't get different objects trying to be stored in the same place, another is that you can get your objects back out.

People keep talking about hash methods for parrot, and I get frustrated because I don't think that they really just mean methods for calculating hash values. I think they are usually meaning the whole whack about finding unique ways to serialise objects so that distinct objects have 0 chance of collisions, and so that they can get the objects back